MiniMAX

From Esolang
Jump to navigation Jump to search

MiniMAX is an esoteric programming language created by User:ais523 in 2006, in an attempt to create a ridiculously small interpreter. It was inspired by SMETANA.

Program structure

A MiniMAX program consists of a series of 3-word commands (a word could be a byte, two bytes, or a line of code, depending on the interpreter). Each word contains an integer. The words are interpreted as follows:

  1. Word of data to send to the previous command
  2. Word containing distance (in words) from the end of the previous command to the start of the next command
  3. Word which receives data from the next command (this value is disregarded; when the next command executes, it is filled in with the word of data that the next command sends back).

The first and 'zeroth' commands are in program locations chosen by the interpreter. (Thus, there is no such thing as a portable MiniMAX interpreter). The zeroth command need not be within the program; the interpreter need only document the distance it is from the start or end of the program, and guarantee that sending data to it from the first command will not affect the program if it is outside the program's bounds.

Many interpreters may wish to implement some way to exit the program, or I/O. For these, there are some reserved constructs:

  • An offset of 0 words from the previous to the next command is reserved; the implementation may assign any meaning to such a construct (including treating the next command as being immediately after the previous command, as an offset of 0 might suggest), but must document what it is. (The documentation doesn't count towards the size of the implementation, as it is likely to come out longer).
  • Specifying a location for the next command outside the bounds of the program is reserved. Specific locations may have specific meanings (for instance, exit the program). Unless an implementation documents an effect of going out of bounds, this is undefined behaviour.
  • A byte with ASCII code 7F (hexadecimal) cannot appear anywhere in the program but the last character, and must appear as the last character. This code was chosen because it isn't a printable character, and interpreters can specify an input format that doesn't use it in any legal offset (for instance, offsets are decimal numbers with a word per line, or a word is 16 bytes long and offsets are doubled and stored in littleendian two's-complement notation).
  • Apart from the zeroth and first commands, it is undefined behaviour for two consecutive commands to overlap.

When writing extensions, MiniMAX implementors are encouraged to put the shortness of the interpreter ahead of the usability of the interface.

MiniMAX also has one special feature for bounds extension. The third word of a command need not actually be within bounds; if it is out of bounds, the length of the program will be extended by one word, and that word will be filled with whatever value is sent to it from the next command. An implementation need not permit unlimited bounds extension, but implementations are encouraged to allow as much bounds extension as possible.

Framing ambiguity

Framing ambiguities are legal in MiniMAX; in fact, they're the only way for a program to retrieve stored data. Although each command is three words long, a command can start at any word of the program. (In practice, commands are often effectively four words long, because the offset from the end of one command to the start of the next cannot be 0, although it can be negative). This means that commands can overlap, so the receiving word of one command is the offset of another.

Computational class

A MiniMAX interpreter that allows unlimited bounds extension is probably Turing-complete (see below). If unlimited bounds extension is not allowed, as on all real computers (which do not have infinite storage), MiniMAX is a bounded-storage machine.

MiniMAX is known to be Turing-complete because it is possible to compile extended Smallfuck into it. The extension required (a new Smallfuck command) is shown below:

& (EXTEND)
Move one cell to the right and zero it. If at the rightmost cell of the tape, instead add a new cell to the right of the current cell and zero it.

By using temporaries and &, Smallfuck can be given an unbounded tape, despite the limit in its specification.

For the details of the compilation, see MiniMAX Turing-completeness proof. An interpreter can limit the distance from one command to the next-but-one (the previous to the next command), but the limit will probably be enough to compile a Brainfuck interpreter written in Smallfuck into MiniMAX, as long as I/O extensions are allowed (to both Smallfuck and MiniMAX).

Interpreting MiniMAX

The main loop of a MiniMAX interpreter can be written in 8 bytes of 8086 machine language, disassembling as follows:

MOVSW                         # one byte
LODSW                         # one byte
ADD SI,AX                     # two bytes
XCHG SI,DI                    # two bytes
JMP SHORT # to the MOVSW line # two bytes

This interpreter is not complete, as it needs code to read in the program, initialize registers, and exit. If the JMP is changed to a JNZ, the loop will end if the next command location is at memory location ds:0000 (which will be a specific location outside the bounds of the program, and which the interpreter would document as '135 words before the start of the program' or similar). This loop expects the code to be in memory stored in littleendian doubled two's-complement notation, using 16-bit words (an odd value stored in memory will cause a framing ambiguity at the byte level, which is an interpreter extension). It expects SI to point to the first word of the first instruction and DI to point to the third word of the zeroth instruction (which must be writable). Allowing for 5 bytes of machine-language code to set up SI and DI (both pointing at the same memory location, the first word of the MiniMAX code), and one byte to exit after a jump to ds:0000 (one byte is sufficient for a DOS COM file, although more may be needed under other operating systems; however, many operating systems would require extra data, say in the headers, which DOS COM files don't need), we get a 14-byte interpreter. Although the interpreter does not read in a program, this requirement can be circumvented by documenting that the program to interpret must be appended to the interpreter. (This requirement may seem a bit suspect, as it means that the null program is an interpreter for x86 machine language; a 32-byte interpreter with I/O extensions and program readin has been written).

The 14-byte interpreter (DOS .COM executable written in 8086 machine code), in full, disassembled, is shown below:

MOV SI,010E  # 3 bytes; 010E is the byte after the RET line
MOV DI,SI    # 2 bytes
MainLoop:    # assembler directive, 0 bytes
MOVSW        # 1 byte
LODSW        # 1 byte
ADD SI,AX    # 2 bytes
XCHG SI,DI   # 2 bytes
JNZ MainLoop # 2 bytes
RET          # 1 byte
<append program to interpret here>
      # Total: 14 bytes

The Mov, Add, Xchg sequence is what leads to the MAX in MiniMAX's name. (The Mini comes partly from its ridiculously short implementations, and partly from the Mov, INt sequence that the 32-byte implementation uses for its IO extensions).

Example

Here is an example run of a MiniMAX program (that doesn't do anything useful, it's just an example). Each running command is shown emphasized. This is designed for an interpreter that starts both the first and zeroth commands at the first word of the program.

Command # Program Description
0
1
2
3
1
-1
2
-1
-3
1
-3
-2
  Never actually runs.
1
1
2
3
1
-1
2
-1
-3
1
-3
-2
  Second command is 2 words to the right of the zeroth's end.
2
1
2
3
1
-1
2
-1
-3
1
-3
-2
  3rd command is 1 word left of the 1st's end; the 2 is copied from this command's first word to the 1st command's third word.
3
1
2
2
1
-1
2
-1
-3
1
-3
-2
  4th command is 1 word right of the 2nd's end; the 2 is copied from this command's first word to the 2nd command's third word.
4
1
2
2
1
-1
2
-1
2
1
-3
-2
  5th command is 2 words left of the 3rd's end; the -3 is copied from this command's first word to the 3rd command's third word.
5
1
2
2
1
-3
2
-1
2
1
-3
-2
  6th command is 3 words left of the 4th's end; the 1 is copied from this command's first word to the 4th command's third word.
6
1
2
2
1
-3
2
-1
2
1
-3
-2
1
And so on...

Note that all MiniMAX code is nonportable, because the interpreter might start the first, or especially the zeroth, command elsewhere. (The zeroth command need not be inside the bounds of the program.) Note also that if a command overlaps with the next command, and is not the zeroth, implementations will not necessarily handle the exact timings of changes in a command correctly, so avoiding this construct is important. (This does not happen in the example, at least the portion which has been shown.) This example has also not been tested.

See also