Portable Minsky Machine Notation

From Esolang
Jump to navigation Jump to search

Portable Minsky Machine Notation (PMMN) is a syntax for Minsky machines that was created by User:ais523 in 2015. Like Bitwise Cyclic Tag, it is equivalent (when seen as an esoteric programming language) to a mathematical formalization that has existed for a long time, but has the advantage that it has a concrete syntax, and thus it is possible to write compilers to and from it and interpreters for it.

Definition

Portable Minsky Machine Notation expresses Minsky machines. As such, the program's commands all refer to one of a (fixed) number of internal counters. In Portable Minsky Machine Notation, these are named using integers written in decimal: 0, 1, etc.. The counters have a minimum value, but no maximum value; they can store arbitrarily large integers. (The actual identity of the minimum value makes no difference to the language; 0 is the most common choice, although in some cases 1 will be easier to implement.) At the start of the program, each counter is at its minimum value. Note that although the counters themselves must be able to store unbounded integers, the input program may not directly mention an integer with value greater than 2 billion (thus indirectly placing a limit on the maximum number of counters that can be used); this restriction is to make it easier to construct a Portable Minsky Machine Notation parser, via removing the need for the parser to handle bignum arithmetic.

There are two commands that access memory. inc(counter) adds 1 to the value of a counter. dec(counter) subtracts 1 from the value of a counter, unless it is already at its minimum value. dec (but not inc) can also be used as the test of a control flow command, in order to read memory in addition to writing it; when used as a test, dec has its usual effect, but in addition, returns true if the decrement changed the counter's value, or false if the counter was at its minimum value already. These two commands are followed by a semicolon when used as part of a block or program.

There are also control flow commands: if and while. These are followed by a test (i.e. a dec command) in parentheses, followed by a block of code (which is just a sequence of commands enclosed by braces, { and }). An if command can additionally be followed by else and another block of code. These have the meanings that C, Java and Perl programmers would expect: if runs its first block if the test succeeds (and its second block, if it exists and if the test fails); while runs its block if the test succeeds, then when the block completes, jumps back to just before the while statement (thus causing the test to run again, and if it succeeds, causing the block to run again and another jump back, potentially infinitely many times).

Programs can also contain comments. These start with /*, end with a matching */, and cannot be nested. Whitespace is insignificant, except that it cannot appear inside a counter number, keyword, or comment marker.

Here's the complete grammar in a BNF-like format (with a program being a commandlist):

digit       ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
counter     ::= digit | counter digit
test        ::= "dec" '(' counter ')'
command     ::= "inc" '(' counter ')' ';'
             |  test ';'
             |  "if" '(' test ')' block
             |  "if" '(' test ')' block "else" block
             |  "while" '(' test ')' block
block       ::= '{' commandlist '}'
commandlist ::= command
             |  commandlist command

Extensions

These aren't part of the language itself, but are commonly provided by interpreters, and can be thought of as implementing a "larger" language along similar lines.

RLE extension

In order to avoid the need for long repetitive strings of inc command to encode increments by large numbers, an inc_by command is provided as syntactic sugar. The syntax is inc_by(counter, amount), where counter is a counter number and amount is a nonnegative integer expressed in decimal; this is exactly equivalent to a sequence of amount inc(counter) instructions. (Other repetitive sequences can be compressed via using inc_by to increment a temporary counter, and a while loop decrementing that counter in order to run them repeatedly.)

I/O extension

An extra two commands are provided that aren't part of the mathematical definition of a Minsky machine, but allow programs to do input and output. This is useful for testing, or when compiling programs from one language into another.

The input(counter) command reads one octet from standard input. If the read fails (either because of an error, or because standard input is at end-of-file), it does nothing. Otherwise, it adds 1 plus the numerical value of that octet (i.e. from 1+0 to 1+255) to the value of that counter. It cannot be used as a test.

The output(counter) command sets the value of the given counter to its minimum value. If it was already at its minimum value, nothing happens. Otherwise, 1 is subtracted from the difference between the counter's previous value and its minimum value, and the resulting number is interpreted as an octet and output to standard output. Programs should avoid trying to output octets with values outside the range 0 to 255, as this makes no sense and interpreters might not deal with the situation consistently.

On systems where newline is represented by something other than the octet with value 10, implementations should provide a command-line option to convert the platform-specific newline code to and from octet 10 on input and output respectively.

Computational class

Because Minsky machines are Turing complete, so is Portable Minsky Machine Notation.

Portable Minsky Machine Notation with 2 counters

Portable Minsky Machine Notation is Turing complete even with just 2 counters. We sketch how to simulate a Minsky machine with 2 counters x and y and N states. We encode states as 2x * 3y * N + s where s is the current simulated state (1..N). The main loop looks as follows (if (!x) b1 else b2 is sugar for if (x) b2 else b1),

while (dec(0)) {
    inc(0);
    while (dec(0)) {
         if (!dec(0))      <code for state 1>
         else if (!dec(0)) <code for state 2>
         ...
         else if (!dec(0)) <code for state N>
         else {
             inc(1);
             inc(0);
         }
    }
    while (dec(1)) { inc(0); }
}

This divides the encoded value by N, and executes different code for each possible remainder. The code for each state takes 2x * 3y in the second counter and is responsible for calculating the new encoded state from that. A state can either increment a counter, decrement a counter, or halt.

Increment

Multiply value by 2*N or 3*N and add the next state.

while (dec(1)) { inc_by(0, <2 if incrementing x, 3 if incrementing y>); }
while (dec(0)) { inc_by(1, N); }
inc_by(1, <next state>);

Decrement

Divide value by 2 or 3, then either multiply by N and add the next state or restore the old value and then multiply by N and update the state.

inc(1);
while (dec(1)) {
    if (!dec(1)) { // remainder 0
        while (dec(0)) { inc(1); }
        while (dec(1)) { inc_by(0, N); }
        inc_by(0, <next state for non-zero counter>);
    } else if (!dec(1)) { // remainder 1
        while (dec(0)) { inc_by(1, <2 if decrementing x, 3 if decrementing y>); }
        inc_by(1, 1);
        while (dec(1)) { inc_by(0, N); }
        inc_by(0, <next state for zero counter>);
    } else if (<decrementing y> && !dec(1)) { // remainder 2
        while (dec(0)) { inc_by(1, 3); }
        inc_by(1, 2);
        while (dec(1)) { inc_by(0, N); }
        inc_by(0, <next state for zero counter>);
    } else {
        inc(1);
        inc(0);
    }
}
while (dec(0)) { inc(1); }

Halt

Replace the encoded value by 0.

while (dec(1)) { }

The limitation of this construction is that it can only deal with decision problems; at the end of the program, both counters are 0. It's easy to extend this to a finite number of values (the main loop can check for that at the end and leave the corresponding value in the second counter), but it's not obvious whether one can go beyond that to arbitrary computable functions.

See also