We are currently working on new rules for what content should and shouldn't be allowed on this website, and are looking for feedback! See Esolang:2026 topicality proposal to view and give feedback on the current draft.
PrimeIndex
PrimeIndex is an esoteric programming language by User:AndrewBayly in which the source code, read as one large integer N, is encoded as the N-th prime number. Decoding a PrimeIndex program means recovering N from a given prime — that is, computing how many primes precede it. This is not merely slow; it is, as far as anyone has proven, an open problem in computational number theory whether it can be done in time polynomial in the number of digits of the prime. PrimeIndex is the fifth member of an "asymmetry" family of languages by the same author, and is generally considered the most mathematically elegant of the set: its entire mechanism is described by one sentence, with no arbitrary constants, no bitwise operations, and no auxiliary encoding key.
- PrimeScript — hard to execute (a prime is used as a multiplicative encoding key; decoding requires factoring)
- Square — hard to execute (decoding requires a permutation search to reassemble blocks)
- Trapdoor — hard to write (a valid program must satisfy a modular constraint)
- Hard — hard to write and execute (both directions require computing bits of π at factorial-indexed positions)
- PrimeIndex — hard to write and execute (both directions require locating a specific prime's ordinal position, an open problem)
PrimeIndex is a sister language to PrimeScript: PrimeScript uses a prime as an encoding key multiplied against the source, while PrimeIndex dispenses with the key entirely and makes a prime be the program.
Language overview
To encode a tinylisp program in PrimeIndex:
- Read the program's UTF-8 bytes as the digits of one large base-256 integer N.
- Compute the N-th prime number, pN.
- pN is the PrimeIndex program.
To decode a PrimeIndex program (a prime P):
- Confirm P is actually prime. (See Program validity below — this step is cheap and mandatory.)
- Compute π(P), the count of primes less than or equal to P. This is N.
- Interpret N as a stream of bytes and decode as text.
- Compile and execute the recovered text as tinylisp.
Because encoding requires a search (see below) while decoding requires only a single evaluation, PrimeIndex is unusual among its siblings in having a genuine cost asymmetry between its own two directions, in addition to being hard in both directions relative to writing an arbitrary program by hand.
Why this is actually hard
Computing π(x) — equivalently, finding the index of a given prime — has no known algorithm running in time polynomial in the number of digits of x. Every method developed since the 19th century (Legendre's formula, Meissel's refinement, the Lehmer extension, the analytic method of Lagarias–Miller–Odlyzko, the combinatorial Deléglise–Rivat and Gourdon algorithms) achieves running time sub-linear in x itself — typically around O(x2/3) — but every one of these is still exponential in x's bit-length, since x is itself exponential in its own digit count. Whether a genuinely polynomial-time (in digit-length) algorithm for π(x) exists is, as far as has been published, an unresolved question in computational number theory — comparable in spirit to the open status of integer factorization, though the two problems are not known to be equivalent.
This is not a designed obstacle in the way Hard's π-digit indexing or Trapdoor's modular constraint are. It piggybacks directly on a problem number theorists have been chasing since 1870, with a public trail of record-setting computations: Meissel computed π(108) by hand in the 1870s; Lehmer extended the method in the 1950s; Lagarias, Miller and Odlyzko gave an analytic algorithm in 1985; Deléglise and Rivat improved the combinatorial approach in 1996; and the current record, computed by Baugh and Walisch in 2022, is π(1029). Every time that record advances, the set of PrimeIndex programs that can actually be decoded grows a little — making PrimeIndex likely the only esoteric programming language whose practical expressiveness tracks an actively published area of mathematics.
Mapping source length onto this:
| Source length | N (integer) | N-th prime ≈ | Cost of finding its index |
|---|---|---|---|
| 1 byte | ≤ 255 | ~1,600 | instant |
| 4 bytes | ~4.3×109 | ~9.5×1010 | seconds on ordinary hardware |
| 8 bytes | ~1.8×1019 | ~8×1020 | days on ordinary hardware |
| 12 bytes | ~7.9×1028 | ~5.3×1030 | beyond the current (2022) world record |
Because encoding requires a binary search over roughly log2(N) evaluations of π(x) to locate the N-th prime, while decoding requires only one, writing a given PrimeIndex program is on the order of 30–60 times more expensive than reading it back, for realistic program lengths.
Program validity
A value is only a valid PrimeIndex program if the encoder could plausibly have produced it — and the encoder only ever emits primes. A decoder that accepts arbitrary integers therefore has a strictly larger domain than the encoder's range, which is inconsistent with the language's own definition. Consequently, decoding must begin by verifying primality of the input and rejecting composite values outright, before any index-finding is attempted.
Usefully, this check is cheap: primality testing (e.g. Miller–Rabin) runs in time polynomial in the digit-length of the candidate, in sharp contrast to finding that candidate's index. PrimeIndex therefore exhibits, in miniature, the "verification is easy, search is hard" asymmetry familiar from complexity theory — checking that a piece of text is a valid PrimeIndex program is fast even for enormous numbers; discovering what it decodes to is the genuinely open-problem-hard part.
Example
The single-character tinylisp program 7 (which evaluates to 7, tinylisp treating bare numerals as self-evaluating) has ASCII value 55. The 55th prime is 257. Encoding 7 therefore produces the PrimeIndex program 257; decoding 257 recovers index 55, byte value 55, and text 7. This is comfortably within the trivial range for both directions — the entire interesting behaviour of the language happens once a program exceeds a handful of bytes, at which point the required prime already grows out of reach of ordinary computation, and eventually out of reach of double-precision floating point arithmetic entirely, independent of the number-theoretic difficulty.
As with Hard, there is no guarantee that a successfully decoded integer, once turned into bytes and then text, is valid tinylisp — 7 works only because tinylisp's grammar is permissive enough that a bare numeral is already a complete program.
Computational class
tinylisp is Turing complete, and the map from source integer to its index-th prime is a bijection between the positive integers and the primes, so PrimeIndex is Turing complete in the fully formal sense: every computable function has a corresponding PrimeIndex program. As with Hard, this is Turing completeness in its least useful form — no program longer than a handful of bytes can currently be produced or decoded by any known method, and unlike Hard's fixed asymptotic wall, PrimeIndex's limit is a moving one, tied to the current state of the art in analytic and combinatorial number theory.
Truth-machine
As with Hard, a truth-machine cannot be written in PrimeIndex in any practical sense. Any tinylisp program capable of branching on input and looping indefinitely is well beyond the few bytes for which the corresponding prime's index can currently be found. The truth-machine exists as a specific, well-defined integer — but locating or verifying it as a prime's index is, today, equivalent to setting a new prime-counting record.
External resources
- Reference implementation and interactive demo — a browser-based (HTML/JS) implementation featuring a real Meissel-style π(x) evaluator (φ(x,a) digit recursion with a wheel-sieve base case, memoization, and the Meissel P2 correction term), validated against brute-force sieving and the published π(109), π(1010) and π(1011) reference values; live encode/decode with real timing, an analytical time estimator calibrated to the implementation's own measured throughput for inputs beyond exact-precision range, and a Miller–Rabin validity check on decode.