# E019: Prime counting and a PNT baseline

**Published run: $\text{n}_{\max} = 200\,000$ and $\pi(\text{n}_{\max}) = 17\,984$**
```{figure} ../_static/experiments/e019_hero_200000.png
:width: 80%
:alt: Prime counting vs x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_2_200000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_3_200000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) ratio (E019)
```

**Published run: $\text{n}_{\max} = 2\,000\,000$ and $\pi(\text{n}_{\max}) = 148\,933$**
```{figure} ../_static/experiments/e019_hero_2000000.png
:width: 80%
:alt: Prime counting vs x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_2_2000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_3_2000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) ratio (E019)
```

**Published run: $\text{n}_{\max} = 10\,000\,000$ and $\pi(\text{n}_{\max}) = 664\,579$**
```{figure} ../_static/experiments/e019_hero_10000000.png
:width: 80%
:alt: Prime counting vs x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_2_10000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_3_10000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) ratio (E019)
```

**Published run: $\text{n}_{\max} = 30\,000\,000$ and $\pi(\text{n}_{\max}) = 1\,857\,859$**
```{figure} ../_static/experiments/e019_hero_30000000.png
:width: 80%
:alt: Prime counting vs x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_2_30000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) (E019)
```

```{figure} ../_static/experiments/e019_hero_3_30000000.png
:width: 80%
:alt: Approximation error pi(x) - x/log(x) ratio (E019)
```

*Prime counts shown are $\pi(\text{n}_{\max})$, i.e., the number of primes $\le \text{n}_{\max}$.*
In this experiment we evaluate $\pi(x)$ for all integers $x$ with $2 \le x \le \text{n}_{\max}$.

**Tags:** `number-theory`, `quantitative-exploration`, `visualization`  
See: {doc}`../tags`.

## Highlights

- Visual sanity check for your prime pipeline: compute $\pi(x)$ from a sieve mask and compare to a standard smooth baseline.
- Two complementary plots: $\pi(x)$ vs. $x/\log(x)$, and the finite-range error curve $\pi(x) - x/\log(x)$.
- Ratio plot: $\pi(x) / (x/\log(x))$ makes relative deviation visible when the two curves nearly overlap.
- Writes reproducible artifacts (`params.json`, `report.md`, and figures) into `out/e019/`.

## Goal

Visualize how well the Prime Number Theorem baseline $x/\log(x)$ tracks the prime-counting function $\pi(x)$ on a **finite**
 range and makes the approximation error visible.

## Background (quick refresher)

### Prime-counting function

The **prime-counting function** $\pi(x)$ is the number of primes $p$ with $p \le x$.

### The PNT baseline

The Prime Number Theorem (PNT) states that

$$
\pi(x) \sim \frac{x}{\log x}
\quad \text{as } x \to \infty.
$$

Here $\log$ is the natural logarithm.
The symbol “$\sim$” is **asymptotic**: it does not claim the two expressions are close for small or moderate $x$.

### Optional background pages

- {doc}`../background/prime-counting-approximations`
- {doc}`../background/prime-counting-bounds`
- {doc}`../background/prime-numbers`

## Research question

On the finite range $2 \le x \le \text{n}_{\max}$:

1. How close is $x/\log(x)$ to $\pi(x)$ as $x$ grows?
2. What does the finite-range error $\pi(x) - x/\log(x)$ look like (trend, scale, rough smoothness)?

## Why this qualifies as a mathematical experiment

- **Finite procedure:** compute primes up to $\text{n}_{\max}$, then compute the cumulative count $\pi(x)$.
- **Observable(s):** the comparison curve $\pi(x)$ vs. $x/\log(x)$ and the finite-range error curve.
- **Parameter space:** $\text{n}_{\max}$ (and, if you extend it, sampling choices such as linear vs. log-spaced grids).
- **Outcome:** figures plus a short report capturing the key observation and caveats.
- **Reproducibility:** outputs saved to `out/e019/` with a parameter snapshot.

## Experiment design

- **Computation:**
  - Build a prime mask for $\{0,1,\dots,\text{n}_{\max}\}$ using a sieve.
  - Convert the mask to a cumulative array so that $\pi(x)$ can be read off quickly.
  - For each integer $x$ with $2 \le x \le \text{n}_{\max}$, compute the baseline $x/\log(x)$.
- **Axes (what the plots mean):**
  - In the first plot, the horizontal axis is $x$ and the vertical axis is a **count**:
    it shows $\pi(x)$ and $x/\log(x)$ for the same $x$ values.
  - In the second plot, the horizontal axis is $x$ and the vertical axis is the **count difference**
    $\pi(x) - x/\log(x)$.
- **Outputs (typical):**
  - `figures/fig_01_pi_vs_x_logx.png`
  - `figures/fig_02_pi_minus_x_logx.png`
  - `figures/fig_03_pi_over_x_logx_ratio.png`
  - `params.json`
  - `report.md`

## How to run

```bash
make run EXP=e019
```


Override the upper bound (computes primes up to $\text{n}_{\max}$):

```bash
make run EXP=e019 ARGS="--n-max 5000000"
```
Direct invocation (always works):

```bash
uv run --extra dev python -m mathxlab.experiments.e019 --out out/e019
# Override the upper bound:
uv run --extra dev python -m mathxlab.experiments.e019 --out out/e019 --n-max 5000000
```

## Notes / pitfalls

- **Finite-range behavior:** the PNT statement is asymptotic; interpret these plots as “finite-range behavior”.
- **Parameter exposure:** use `--n-max` to change $\text{n}_{\max}$; keep published snapshots consistent with `params.json` / `report.md`.
- **Scale masking:** on a linear axis, $\pi(x)$ and $x/\log(x)$ are close in *relative* terms, so runs with different $\text{n}_{\max}$ can look very similar. Use the ratio plot $\pi(x) / (x/\log(x))$ for a relative view.
- **Log domain:** avoid $x=0$ or $x=1$ because $\log(x)$ is $0$ or undefined there; we start at $x=2$.
- **Visual overclaims:** a smooth-looking error curve does not imply a theorem about error terms.

## Extensions

- Add a second baseline $\mathrm{Li}(x)$ and compare $\pi(x) - \mathrm{Li}(x)$.
- Add explicit bounds (Rosser–Schoenfeld / Dusart) and visualize which ranges they dominate.
- Add a sampling mode:
  - **linear**: evaluate all integers $x$ (current version), and
  - **log**: evaluate a log-spaced grid to “see” larger scales more evenly.

## Published run snapshot

```{include} ../reports/e019.md
:start-after: "<!-- REPORT:BEGIN -->"
:end-before: "<!-- REPORT:END -->"
```

::: {dropdown} params.json (snapshot)
:open:

```{literalinclude} ../params/e019.json
:language: json
```

:::

## References

- Prime-counting function overview: {cite}`WikipediaContributors2025PrimeCountingFunction`.
- Analytic number theory background and PNT context: {cite}`apostol1976introanalyticnumbertheory`.
- Classic reference for $\pi(x)$, error terms, and explicit-formula viewpoints: {cite}`titchmarsh1986`.

See also {doc}`../references`.

## Related experiments

- {doc}`e020` (Prime counting: compare $\pi(x)$ to $\mathrm{Li}(x)$)
- {doc}`e021` (Prime counting: explicit bounds for $\pi(x)$)
- {doc}`e024` (Ulam spiral: prime distribution visualization)

## Parameters (example)

- $\text{n}_{\max}$: `2_000_000` (range: $2 \le x \le \text{n}_{\max}$)

### Recommended $\text{n}_{\max}$ range

This experiment classifies primes up to $\text{n}_{\max}$ using a sieve, so runtime and memory scale roughly with $\text{n}_{\max}$.

A practical range that works well for most runs:

- **Minimum (still meaningful):** $\text{n}_{\max} = 200\,000$
- **Default / recommended:** $\text{n}_{\max} = 2\,000\,000$
- **Comfortable upper range on typical laptops (varies):** $\text{n}_{\max} = 10\,000\,000$
- **Above ~$\text{n}_{\max} = 30\,000\,000$:** expect noticeably higher runtime (and potentially memory pressure), depending on the machine and sieve implementation.

## Summary

We compute the prime-counting function $\pi(x)$ up to $\text{n}_{\max}=2\,000\,000$ and compare it to the smooth baseline
$x/\log(x)$ on the same range.

## Key observations

- The baseline $x/\log(x)$ tracks the overall growth of $\pi(x)$ but underestimates it on this range.
- The error curve $\pi(x)-x/\log(x)$ is positive and grows to about $11\,084$ at $x=2\,000\,000$.

## Interpretation

This is exactly the kind of behavior the PNT suggests: $\pi(x)$ is “close to” $x/\log(x)$ in a relative sense, but the absolute difference
can still be large on finite ranges. The plots are therefore best read as **finite-range behavior** rather than as a direct numerical validation
of the asymptotic statement.

## Caveats

- The PNT statement is asymptotic: do not treat any single finite-range plot as proof.
- Visualization choices (sampling density, line smoothing, axis scaling) can make convergence look better or worse than it is.
- For better baselines on moderate ranges, comparing to $\mathrm{Li}(x)$ is often more informative (see the related experiment).