Development

This page describes the development workflow and the conventions used in this repository.

Workflow overview

Use the Makefile targets for everything. They wrap uv run ... so you do not need to activate a virtual environment manually.

Typical day-to-day:

make final

Documentation-only build:

make docs

Clean caches and build artifacts:

make clean

Remove the virtual environment (full reset):

make clean-venv

Makefile workflow

The Makefile is the single entry point for development tasks (env setup, quality checks, docs builds, and running experiments).

  • Prefer make final before pushing.

  • Prefer make docs to validate documentation changes.

  • Use make run EXP=<id> to execute an experiment and write artifacts to out/<id>/.

Dependency groups

This summary is included from the Makefile documentation:

uv installs dependencies from your pyproject.toml. This project uses “extras”:

  • default: runtime dependencies (needed to run the package)

  • dev: developer tools (ruff, mypy, pytest, …)

  • docs: documentation tools (sphinx, theme, myst, bibtex, …)
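
A hedged sketch of how such extras might be declared in pyproject.toml (the package lists here are illustrative; the authoritative lists live in the repository's pyproject.toml):

```toml
[project.optional-dependencies]
# dev: developer tools
dev = ["ruff", "mypy", "pytest", "pytest-cov", "pytest-xdist"]
# docs: documentation tools
docs = ["sphinx", "myst-parser", "sphinxcontrib-bibtex"]
```

uv resolves these with uv sync --extra dev, uv sync --extra docs, or uv sync --all-extras.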

What the Makefile uses

  • Most developer commands run via:

    uv run --extra dev ...
    
  • Documentation build runs via:

    uv run --extra docs ...
    

Why docs-deps uses --all-extras

make docs-deps runs:

uv sync --all-extras

because the docs pipeline also runs a small helper under the dev extra before invoking Sphinx.


Virtual environment behavior

Where is the venv?

The venv is always created at:

  • .venv/

Minimum Python version

The Makefile checks PYTHON_MIN (default is 3.14). If your Python is older, make python-check fails.

Common “reset” if your venv is broken

make clean-venv
make install-dev

Formatting (ruff)

Formatting is purely about how code looks, not what it does.

Targets

  • make format: formats selected repo paths (use for normal formatting)

  • make format-check: checks formatting only, makes no changes (use for a CI-like check)

  • make fmt: broad auto-fix, ruff fixes + formats everything (use when you want the repo cleaned up quickly)


Typical usage

Before committing:

make fmt

If CI says “formatting changed”:

make format

Linting (ruff)

Linting looks for potential bugs and bad patterns, for example:

  • unused imports

  • variables shadowing other names (e.g. built-ins or outer-scope variables)

  • common correctness issues ruff knows how to detect

Targets

  • make lint: ruff in check-only mode, does not edit files

  • make lint-fix: ruff with auto-fix, edits files

Typical usage

  • Use make lint when you only want to see problems.

  • Use make lint-fix when you want ruff to fix safe issues automatically.


Typing (mypy)

Typing checks help catch issues like:

  • calling functions with wrong argument types

  • returning the wrong types

  • forgetting to handle None

Target

make mypy

It runs:

  • mypy mathxlab tests experiments

Common tip for juniors

If mypy errors look scary, start with the first error in the output. Many later errors are “follow-up noise” caused by an earlier wrong type.


Tests (pytest): fast / slow / perf

This repo separates tests using pytest markers:

  • fast tests: not slow and not perf

  • slow tests: slow and not perf

  • perf tests: perf

Markers are set in tests like:

import pytest

@pytest.mark.slow
def test_big_case():
    ...
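
Custom markers must also be registered so pytest does not warn about unknown marks; one minimal sketch of how that can be done (the real project may register them in pyproject.toml instead):

```python
# conftest.py (sketch): register the custom markers used above
def pytest_configure(config):
    config.addinivalue_line("markers", "slow: long-running tests")
    config.addinivalue_line("markers", "perf: performance/benchmark tests")
```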

Coverage and threshold

Tests collect coverage for the library packages (not for experiment scripts). Coverage fails the target if it drops below 80%.

Stable temp paths

pytest is invoked with stable temp directories:

  • temp_pytest_cache

  • temp_pytest

They are cleaned by make clean.


pytest (fast tests)

make pytest

What it does:

  • runs only fast tests (not slow and not perf)

  • runs with coverage

  • fails if coverage < 80%

This is the main “developer loop” test target.


pytest-xdist (fast tests in parallel)

make pytest-xdist

Runs fast tests using xdist:

  • -n auto --dist=load

Use this when the test suite gets bigger and you want faster local feedback.


pytest-slow (two-phase: fast then slow)

make pytest-slow

What it does (important detail):

  1. deletes .coverage

  2. runs fast tests first in best-effort mode (failures in this phase do not fail the target — this is intentional in the Makefile)

  3. runs slow tests with:

    • xdist by default (PYTEST_XDIST_SLOW=-n auto --dist=load)

    • --cov-append to combine coverage from both phases

    • coverage threshold (80%)

Warning

If you want “fast tests must pass”, run make pytest separately. make pytest-slow is designed to always get through the slow suite even if fast tests are currently failing.


Performance tests (pytest-perf): what they are and how to use them

Performance tests are still pytest tests, but marked with @pytest.mark.perf. They exist to catch accidental slowdowns (e.g. a function becomes 10× slower).

Why the Makefile forces thread counts to 1

Math/scientific libraries sometimes use multiple CPU threads automatically (BLAS/OpenMP/etc.). That makes timings noisy and hard to compare.

So perf targets set:

  • OMP_NUM_THREADS=1

  • MKL_NUM_THREADS=1

  • OPENBLAS_NUM_THREADS=1

  • NUMEXPR_NUM_THREADS=1

This makes timings more reproducible across machines and runs.
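
If you run perf code outside the Makefile (e.g. in a notebook), the same pinning can be sketched in Python, with the caveat that it only works when the variables are set before the math libraries are imported:

```python
import os

# Must run BEFORE importing numpy/scipy/etc.: the BLAS/OpenMP thread
# pools are sized when those libraries initialize, not afterwards.
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS",
            "OPENBLAS_NUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ[var] = "1"
```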


pytest-perf (run perf-marked tests)

make pytest-perf

Runs only:

  • -m "perf"

and shows progress output.

Use this when:

  • you changed a performance-sensitive function

  • you want to ensure you didn’t introduce an obvious regression

pytest-perf-baseline (update accepted baseline numbers)

make pytest-perf-baseline

Same as pytest-perf, but also passes:

  • --perf-update-baseline

Use this only when:

  • you intentionally changed performance (e.g. algorithm changed)

  • and you want to accept the new timings as the baseline

Warning

Baseline updates should be done on a reasonably idle machine. If you run baseline updates while your system is busy, you may “bake in” slow/noisy numbers.


Performance suite (non-pytest): snapshots and comparisons

In addition to pytest-perf, the Makefile also provides a small “perf runner” pipeline:

perf (dev snapshot)

make perf

Runs:

  • python mathxlab/tools/run_perf.py --mode dev --overwrite

This is typically used during development to write/update a performance snapshot.

perf-release (release snapshot)

make perf-release

Runs:

  • python mathxlab/tools/run_perf.py --mode release --overwrite

This is usually for “release-ish” measurements (more stable, fewer surprises).

perf-compare (compare two snapshots)

make perf-compare A=v0.1.0 B=v0.2.0

Runs:

  • python mathxlab/tools/compare_perf.py --a $(A) --b $(B)

Use this when you want a readable report of “what got faster/slower” between two saved snapshots.

Tip

If you’re not sure what valid values for A and B are, run make perf once and look at the output written by the script. It usually prints the snapshot identifiers/paths it created.


Running experiments

Experiments are Python modules like:

  • mathxlab/experiments/e001.py

  • mathxlab/experiments/e002.py

Forward CLI arguments

make run EXP=e001 ARGS="--seed 123 --n 200000"

Note

On Windows, always quote ARGS="..." if it contains spaces.

Run all experiments

make out

This finds all mathxlab/experiments/e???.py files and runs them sequentially.


Full reference

For the complete target-by-target reference and troubleshooting, see Makefile.

Formatting, linting, typing, tests

  • Formatting: Ruff formatter

  • Linting: Ruff

  • Typing: mypy

  • Tests: pytest

CI formatting behavior

In CI, formatting runs in check mode (ruff format --check). Locally it formats in place.

Experiment authoring guidelines

When adding a new experiment:

  1. Add a new module under mathxlab/experiments/, e.g. e002_...py.

  2. Prefer deterministic outputs:

    • --seed argument if randomness is involved

    • write results to a single --out directory

  3. Keep the experiment runnable as a module:

    • python -m mathxlab.experiments.e002

  4. Update the docs:

    • add a short entry to Experiments Gallery

    • optionally add a dedicated page under docs/experiments/ later

Report contract for algorithmic experiments (Phase 2 and later)

For experiments involving algorithms (primality tests, factorization, explicit bounds), the out/e###/report.md file is part of the experiment’s scientific contract.

It must state (when applicable):

  • Deterministic vs probabilistic. Always label this explicitly.

  • Probability of error for probabilistic methods (bases / repetitions).

  • Correctness cross-checks against a trusted reference for CI-safe ranges.

  • Known counterexamples / failure modes (e.g., Carmichael numbers for Fermat).

  • Finite-range behavior vs asymptotics (do not oversell small N).

Use this drop-in template for report sections (copy/paste into report.md or generate it in code):

Algorithmic guarantees

  • Method: (name the algorithm)

  • Status: DETERMINISTIC | PROBABILISTIC

If probabilistic

  • Randomness / bases: (list bases used or how randomness was sampled)

  • Conservative error statement:

    • For Miller-Rabin, a common bound is: P(false prime) <= 4^{-k} after k independent random bases.

    • If you use a fixed base set (engineering choice), state the intended input range.

Correctness cross-check

State how you validated correctness for a CI-safe range.

  • Reference: (e.g., sieve ground truth, deterministic trial division)

  • Checked range: (e.g., n <= 1_000_000)

  • Result: mismatches = 0 (or list the smallest mismatch as a witness)

Known counterexamples / failure modes (when applicable)

Use this section when the method is known to fail on structured inputs.

  • Fermat test: Carmichael numbers pass for all coprime bases (smallest: 561).

  • Fermat base-2 pseudoprime: 341 = 11 * 31 satisfies 2^(n-1) mod n = 1 but is composite.

  • Miller-Rabin: specific bases can be fooled by strong pseudoprimes (state the bases used).

  • Pollard rho: may stall for unlucky seeds; retries and parameter changes are expected.
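
The first two bullets can be verified in a few lines, using only the numbers already stated above:

```python
import math

# 341 = 11 * 31 passes the base-2 Fermat test despite being composite.
assert pow(2, 340, 341) == 1
assert 341 == 11 * 31

# 561, the smallest Carmichael number, passes the Fermat test
# for EVERY base coprime to it.
assert all(pow(a, 560, 561) == 1
           for a in range(2, 561) if math.gcd(a, 561) == 1)
```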

Runtime knobs (CI-safe)

List the knobs that keep runtime bounded in CI.

  • n_max: (upper bound)

  • sample_size: (how many candidates were tested)

  • max_rounds / max_retries: (for randomized algorithms)

  • seed: (if randomness is involved)

Finite-range behavior (for asymptotics / explicit bounds)

When referencing an asymptotic statement (e.g., PNT), explicitly separate:

  • Theory statement: what is true as x -> infinity.

  • Finite range used here: x in [A, B].

  • Where it becomes meaningful: state a measurable criterion (e.g., relative error < 5%) and the smallest x where that holds.
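
A concrete illustration of this theory-vs-finite-range split for the PNT (the sieve and the sample points are chosen for illustration; real experiments use their own ranges):

```python
import math


def pi_table(limit: int) -> list[int]:
    """Cumulative prime counts pi(0..limit) via a simple sieve."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, int(limit**0.5) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = b"\x00" * len(is_prime[p * p :: p])
    counts, total = [], 0
    for flag in is_prime:
        total += flag
        counts.append(total)
    return counts


# Theory statement: pi(x) ~ x / ln x as x -> infinity.
# Finite range used here: x in [10**3, 10**6].
pi = pi_table(10**6)
for x in (10**3, 10**4, 10**5, 10**6):
    rel_err = abs(pi[x] - x / math.log(x)) / pi[x]
    print(f"x={x:>8}  pi(x)={pi[x]:>6}  rel err of x/ln x = {rel_err:.1%}")
```

Even at x = 10**6 the leading-order approximation x/ln x still has a relative error near 8%, which is exactly the kind of finite-range caveat the report should state explicitly.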

Documentation

Docs are built with Sphinx + MyST.

Build locally:

make install-docs
make docs

Deployed website:

  • GitHub Pages from the docs workflow

Contributing (high-level)

  • Create a feature branch.

  • Open a PR against main.

  • CI must pass before merge.

  • Keep PRs small and well-scoped.