Prime numbers refresher¶

This page is a beginner-friendly refresher for experiments about prime numbers. It is intentionally broad: it covers the prime-related ideas most commonly explored with computation (distribution, sieves, primality testing, gaps, and prime patterns).

Core definitions¶

Primes and composites¶

A positive integer \(p \ge 2\) is prime if its only positive divisors are \(1\) and \(p\). If \(n \ge 2\) is not prime, it is composite.

Equivalently,

\[ p \text{ is prime } \iff \forall\, a,b \in \mathbb{N}: \; p = ab \Rightarrow (a=1 \text{ or } b=1). \]

Greatest common divisor and coprimality¶

The greatest common divisor of integers \(a,b\), written \(\gcd(a,b)\), is the largest positive integer dividing both. We say \(a\) and \(b\) are coprime if \(\gcd(a,b)=1\).

Coprimality appears constantly in modular arithmetic and in many prime-related proofs.

Congruences (modular arithmetic)¶

For integers \(a,b\) and \(m\ge 2\),

\[ a \equiv b \pmod m \]

means \(m\) divides \(a-b\). Many prime experiments (residue classes, sieves, primality tests) are naturally expressed modulo \(m\).

Prime factorization (the “atomic” viewpoint)¶

The Fundamental Theorem of Arithmetic says every integer \(n\ge 2\) factors uniquely (up to ordering) as

\[ n = \prod_{i=1}^k p_i^{\alpha_i}, \]

where the \(p_i\) are distinct primes and the \(\alpha_i\) are positive integers. This makes primes the “building blocks” of the integers. [Hardy and Wright, 2008]

Key quantitative language¶

Infinitely many primes¶

Euclid’s classic argument shows there are infinitely many primes. This matters experimentally because it justifies questions like “how often do primes occur as numbers grow?” [Hardy and Wright, 2008]

The prime counting function and the Prime Number Theorem¶

Let \(\pi(x)\) be the number of primes \(\le x\). The Prime Number Theorem (PNT) states

\[ \pi(x) \sim \frac{x}{\log x}\quad (x\to\infty). \]

Informally: primes thin out, and the “typical” gap size near \(x\) is about \(\log x\). [Apostol, 1976]

For experiments, two very common comparisons are:

\(\pi(x)\) vs. \(x/\log x\),
\(\pi(x)\) vs. \(\mathrm{Li}(x)\) (the logarithmic integral), which is often a better approximation in practice. [Apostol, 1976, Rosser and Schoenfeld, 1962]

Explicit bounds (useful sanity checks)¶

Many sources provide explicit inequalities for \(\pi(x)\), the \(n\)th prime \(p_n\), and related functions. These bounds are useful to validate computations and pick experiment ranges safely. [Axler, 2019, Dusart, 2018, Rosser and Schoenfeld, 1962]

Computational building blocks¶

Generating primes: sieves¶

If you need all primes up to a bound \(N\), a sieve is usually best.

Sieve of Eratosthenes (idea). Start with a boolean array “possibly prime,” then repeatedly cross out multiples of each found prime. Time is roughly \(O(N\log\log N)\), memory is \(O(N)\) booleans.

Segmented sieve. For large \(N\), store only a window \([L,R]\) at a time; this keeps memory bounded while preserving speed. Segmented sieves are often the right choice once \(N\) grows beyond typical RAM-friendly sizes. [Crandall and Pomerance, 2005]

Testing primality (single numbers)¶

If you need to test the primality of individual (possibly large) integers, common strategies are:

Trial division up to \(\sqrt{n}\) (great for small \(n\), too slow for large),
Probabilistic tests such as Miller–Rabin, fast and extremely reliable with good parameters, [Rabin, 1980]
Deterministic polynomial-time primality testing (AKS), important theoretically but rarely used for practical large-number work. [Agrawal et al., 2004]

In an experiment repo, it is common to use Miller–Rabin for speed and then either: (1) accept “probable prime” status (with a stated error bound) or (2) confirm with stronger checks for the sizes you care about. [Crandall and Pomerance, 2005]

Factorization (often the bottleneck)¶

Many arithmetic functions become easy once a number is factored, but factoring large integers is hard. Even “toy” factorization methods (e.g., Pollard’s rho) make excellent experiments because performance varies wildly with input structure. [Crandall and Pomerance, 2005]

Prime gaps and prime patterns¶

Prime gaps¶

Let \(p_n\) be the \(n\)th prime. The prime gap sequence is

\[ g_n = p_{n+1}-p_n. \]

Heuristically, “typical” gaps near \(p\) are size \(\log p\), but gaps vary a lot. Two natural experimental viewpoints are:

local statistics: histogram of \(g_n\) in a range,
record gaps: the largest gap seen up to \(x\) (maximal gaps / first occurrences). [Caldwell, n.d., Cramér, 1936, Nicely, 1999]

A common normalization is \(g_n/\log p_n\), which helps compare different scales.

Twin, cousin, and sexy primes (prime pairs)¶

A prime pair is a pair of primes \((p,p+d)\) with fixed even difference \(d\):

twin primes: \(d=2\),
cousin primes: \(d=4\),
sexy primes: \(d=6\).

These are all instances of “prime constellations.”

Prime \(k\)-tuples (constellations)¶

A prime \(k\)-tuple is a set of offsets \(\{h_1,\dots,h_k\}\) such that \(p+h_i\) are simultaneously prime. Hardy–Littlewood heuristics predict asymptotic counts for many such patterns (including twins). [Hardy and Littlewood, 1923]

Modern sieve breakthroughs show bounded prime gaps exist (there are infinitely many \(g_n\) below some constant), but we still do not know whether the twin prime conjecture is true. [Maynard, 2015, Zhang, 2014, D. H. J. Polymath, 2014]

Primes in structure (residue classes and progressions)¶

Residue classes¶

Primes (except 2) are odd, so they lie in residue classes modulo 2. More generally, you can study how primes distribute among residue classes modulo \(q\). Experiments here include “prime races” and small biases in counts.

Arithmetic progressions of primes¶

A classical result (Dirichlet’s theorem) says that if \(\gcd(a,q)=1\), then the arithmetic progression

\[ a,\; a+q,\; a+2q,\dots \]

contains infinitely many primes. A landmark modern result (Green–Tao) shows primes contain arbitrarily long arithmetic progressions. [Green and Tao, 2008]

For experiments, you can stay at an elementary level (searching for 3-term or 4-term prime APs in ranges) while still getting meaningful patterns.

Other prime families you may include (optional)¶

These are often nice “side quests” that still fit under “Prime Numbers”:

Sophie Germain primes: \(p\) prime and \(2p+1\) prime,
safe primes: \(q\) prime with \((q-1)/2\) prime,
Mersenne primes: \(2^p-1\) prime (you already have a dedicated page), [contributors, 2025]
palindromic / repunit primes: “pattern-based” primes useful for search experiments.

What experiments typically visualize¶

Typical “prime numbers lab” plots and tables include:

Prime density: \(\pi(x)\) vs. \(x/\log x\), and error curves \(\pi(x)-x/\log x\). [Apostol, 1976]
Bounds and approximations: compare \(\pi(x)\) with explicit inequalities (sanity checks). [Dusart, 2018, Rosser and Schoenfeld, 1962]
Sieve scaling: runtime/memory vs. \(N\) for classic vs. segmented sieves. [Crandall and Pomerance, 2005]
Gaps: histograms of \(g_n\), normalized gaps \(g_n/\log p\), record gaps vs. \(\log^2 x\). [Cramér, 1936, Nicely, 1999]
Prime pairs: counts of twin/cousin/sexy primes in \([1,x]\), compared to Hardy–Littlewood style heuristics. [Hardy and Littlewood, 1923]
Bounded gaps story: “what bound was known when?” (Zhang → Maynard → Polymath) plus empirical gap data. [Maynard, 2015, Zhang, 2014, D. H. J. Polymath, 2014]
Primality tests: speed vs. integer size (bits), and false-positive rates for weak tests vs. Miller–Rabin. [Agrawal et al., 2004, Rabin, 1980]
Progressions: frequency of 3-term prime APs in ranges; maximal AP length in a window. [Green and Tao, 2008]

For “ground truth” sequences and curated lists, OEIS and PrimePages are practical reference points. [Caldwell, n.d., OEIS Foundation Inc., 2025, OEIS Foundation Inc., 2025]

Practical numerical caveats¶

Pick the right primitive: if you need many primes up-to \(N\), use a sieve; if you need primality of a few large numbers, use primality testing.
Memory matters for sieves: a naive boolean list of length \(N\) can be huge; segmented sieves avoid this.
State what “prime” means in code: if you use Miller–Rabin, your results are “probable primes” unless you add a deterministic guarantee for your size range. [Rabin, 1980]
Avoid floating-point traps: use natural logs, guard against \(\log(0)\), and remember that \(\log x\) changes slowly.
Be precise with pair-counting: decide whether you count pairs by their smaller prime \(p\) (common), and avoid double-counting.
Separate discovery from verification: for pattern-hunting (e.g., long APs), do fast screening first, then verify candidates carefully.

References¶

See References.

[Agrawal et al., 2004, Apostol, 1976, Cramér, 1936, Crandall and Pomerance, 2005, Dusart, 2018, Green and Tao, 2008, Hardy and Littlewood, 1923, Hardy and Wright, 2008, Maynard, 2015, Rabin, 1980, Rosser and Schoenfeld, 1962, Zhang, 2014, D. H. J. Polymath, 2014]