Smallest universal function
The smallest universal function of a programming language is the smallest function (or similar) definable in the language which exhibits computational universality, it is a universal machine. Only Turing complete languages can implement universal functions. The task of finding the smallest universal function is a golfing exercise. All of these functions are interpreters, albeit not necessarily for a commonly described language.
Rules
- The program must define something which takes input, typically via function parameters or variable/counter initialization, and then operates on it in such a way that an output is generated according to the inputs
- There can be either 2 inputs (input and program) or 1 input (program), so long as the function can still compute any function's value of the challenge's category
- Eval, exec, and similar methods of builtin self-interpretation are disallowed
- Builtin cross-interpretation is also disallowed, but the rules for this are very language-specific. This disqualifies things like HQ9+B's
B
instruction.
- Builtin cross-interpretation is also disallowed, but the rules for this are very language-specific. This disqualifies things like HQ9+B's
- The inputs to the function must be able to be given as either finite literals or finite data input
- Infinite inputs and inputs which have user definable side effects outside of the function operation are disallowed (e.g. custom classes)
- Depending on the challenge, a normalized output must be given
- Universal semidecider: no output restrictions
- Universal decider: must return the language's boolean values
- Universal computer: must be able to return all finite strings of a given description
- All ASCII strings, all strings of bits, all integer values, etc. are all valid outputs, as long as all of them can be reached by the valid inputs
- Programs must be able to produce arbitrary elements of any recursively enumerable language, up to some trivial encoding (it must be decodable by a transducer)
Non-esoteric languages
x86 machine code
The page for the golf challenge gives a list of Turing complete languages which came about from size optimization. The smallest one listed is MiniMAX with between 14 and 32 bytes of machine code (depending on what I/O conventions are used).
Python
2-Tag (champion)
All of these implement 2-tag. They all take in lists of integers as input a, with either a list of lists or a dict from ints to lists for the program r. The halting condition is if a halt symbol is reached, which is any negative or zero integer.
The decider's answer is given as the final halting symbol on a nearly empty tape, a 0 halting symbol is true, with a negative one being false.
The computer returns the entire word of the tag system's output, discarding the halting symbol.
- Semidecider: 41 bytes
def f(a,r): while a[0]>0:a=a[2:]+r[a[0]]
- Decider: 56 bytes
def f(a,r): while a[0]>0:a=a[2:]+r[a[0]] return a==[0]
- Computer: 55 bytes
def f(a,r): while a[0]>0:a=a[2:]+r[a[0]] return a[1:]
(1, 0)-L
This semidecider implements a very strict form of (1, 0)-L-systems. The inputs are a list of integers, and a dictionary from pairs of integers to lists of integers (which may be empty). All pairs which can occur in the program must be specified in the ruleset, no default fallbacks are allowed. A decider and computer could be made using this base function in a similar way as the 2-tag example.
The procedure is to pair up a shifted copy of the word (with 0 at the start) with a non-shifted copy, pairing up prior elements with current elements. These pairs then index into a dictionary of productions. The productions are concatenated to form the word. The halting condition is if the first symbol becomes 0 or negative.
- Semidecider: 57 bytes
def f(a,r): while a[0]>0:sum(map(r.get,zip([0]+a,a)),[])
Theoretical devices
Turing machines
There are many contenders for the smallest universal Turing machine, depending on the metric. Turing machine sizes are often notated with the score to golf being the product . According to the rules, weakly universal machines are disallowed, so the very small (2, 3) machine and similar are disqualified.
Reversible Turing machines
Reversible Turing machines are a common model of reversible computing. The scoring system between RTMs and TMs is the same. A 98 state, 10 symbol machine has been described by Morita.
Tag systems
Tag can be scored based on the number of symbols, the number of rules, and the production sum. Generally, the number of symbols and rules is the same (except for halting states, which do not have rules). The production sum is the sum of all the lengths of the rule productions. This set of 3 rules describes a 3 symbol tag system that has a production sum of 6:
a -> bc b -> a c -> aaa
UT19 is a 2-tag system which consists of 19 symbols, 19 rules, and a production sum of 40.
Diophantine equations
Diophantine equations are polynomial equations where the only valid solutions are integers. It was proven that any recursively enumerable set can be represented as a Diophantine equation. Universal Diophantine equations have since been found. The scoring system for these equations is based on the degree (highest fixed power) and number of variables in the equation. Note that , , are all degree 2. It has been proven that Diophantine equations in degrees less than 3 and variables less than 2 are solvable, and thus can only model decidable problems. Universal degree 4 and variable count 9 equations are known, although if the scoring is based on the product then the smallest is in the degree 4, 58 variable equation described by Jones.
L-systems
L-systems are cellular grammars which model biological growth. L-systems operate by replacing every occurence of a symbol in a string with its replacement at once. Contextual L-systems, those which can conditionally replace based on the symbol's surroundings, are Turing complete. Scoring is based on the number of symbols and the number of replacement rules.
It is unknown if there are purpose designed universal L-systems like there are universal Turing machines or universal Diophantine equations. The following are rough upper bound estimates based on translation schemes.
- (1, 0)-L: ca. 200 symbols, ca. 1300 rules (translated from UT19)
- (1, 1)-L: symbols, rules (translated from a universal unrestricted grammar)
Unrestricted grammars
Unrestricted grammars are a restricted form of semi-Thue systems and are able to characterize all outputs of a given Turing machine. Universal grammars are therefore guaranteed to exist, but they are not well covered in the literature. One should also note that the linguistic concept of universal grammar is distinct from a computationally universal grammar.
Counter machines
The scoring system for counter machines can be based on the number of states (instructions) and counters/registers. The Gregušová-Korec universal Minsky machine has 37 states and uses 8 counters.
Reversible counter machines
Reversible counter machines are the reversible counterpart to counter machines. The scoring system for reversible counter machines is the same as for regular counter machines. A machine with a particularly small product is a 9 counter, 97 state machine described by Alhazov and Verlan.