Minsky machine to SMITH translation

From Esolang
Jump to navigation Jump to search

This is an article that provides information on one way of translating Minsky machines into SMITH. Thus, this also proves by translation that SMITH is a Turing-complete language.

This translation technique creates programs that use the COR and BLA instructions in abundance, and overwrite parts of their code during execution.

A translator program

Written in Python. Give a MM program file name as argument.

# Minsky Machine to SMITH translator
# written by Keymaker

import sys

if len(sys.argv) < 2:
    print "Error, need a Minsky Machine program as input."
    sys.exit()
try:
    f = open(sys.argv[1])
except IOError:
    print "Error in reading the program."
    sys.exit()
a = f.read().split("\n")
f.close()

m = []
ia = 2
for i in a:
    i = i.lower()
    if len(i) > 0:
        if i[0] == "#":
            continue
        n = i.split(" ")
        m.append(n) # hopefully your input is valid
        if len(n) == 2:
            ia = ia + 1
        elif len(n) == 4:
            ia = ia + 3
        elif len(n) == 5:
            ia = ia + 9

r = []
r.append("SUB R0, 1")
r.append("MOV R1, " + str(ia))
r.append("COR +" + str(ia) + ", +0, R1")
r.append("BLA +1, NOP, R3")

re = []
def getreg(reg):
    global re
    y = 0
    while y < len(re):
        if re[y] == reg:
            return y + 4
        y = y + 1
    re.append(reg)
    return y + 4

def tillinst(x):
    global m
    x = x - 1
    c = 0
    i = 0
    while i < x:
        a = m[i]
        if len(a) == 2:
            c = c + 1
        elif len(a) == 4:
            c = c + 3
        elif len(a) == 5:
            c = c + 9
        i = i + 1
    return c

x = 0
while x < len(m):
    c = m[x]
    if c[1] == "inc":
        r.append("SUB R" + str(getreg(c[2])) + ", R0")
        r.append("MOV R3, " + str(tillinst(int(c[3]))))
        r.append("BLA -" + str(tillinst(x + 1) + 2 + 2) + ", NOP, R1")
    elif c[1] == "dec":
        r.append("MOV R2, R" + str(getreg(c[2])))
        r.append("NOT R2")
        r.append("MUL R2, 3")
        r.append("BLA +1, NOP, R2")
        r.append("SUB R" + str(getreg(c[2])) + ", 1")
        r.append("MOV R3, " + str(tillinst(int(c[3]))))
        r.append("BLA -" + str(tillinst(x + 1) + 6 + 2) + ", NOP, R1")
        r.append("MOV R3, " + str(tillinst(int(c[4]))))
        r.append("BLA -" + str(tillinst(x + 1) + 8 + 2) + ", NOP, R1")
    elif c[1] == "halt":
       r.append("STOP")
    x = x + 1

for i in r:
    print i

The Minsky machine program

This translator does not do any error-checking and generally trusts the user to provide perfectly valid data. Valid data in this case means completely empty lines, comment lines beginning with a "#", and MM instruction lines. MM instructions are the following:

  • x INC A y
  • x DEC B y z
  • x HALT (this is not a necessary instruction but nonetheless supported)

Every line begins with a line number (the first in the program being "1") and the line numbers must be successive and there may not be gaps; each successive line must have a line number exactly 1 greater than the previous line. The translator also lower-cases everything while processing. x INC A y works by increasing counter called A, then jumping to line y. x DEC B y z works by seeing if B has value 0, if it has, it jumps to line z, otherwise it decreases B and then jumps to line y. x HALT simply halts the program.

In the final product, registers used in the Minsky Machine program begin from SMITH register R4, and register names ("A") are given SMITH register numbers in the order they are encountered in the source read by the translator program. Registers R0-R3 are used by the internal program logic.

On translation

This method, as mentioned, creates SMITH code that actively overwrites parts of it. Overwriting is a defined part of SMITH as it is now, even if it was not so when SMITH was first released. BLA instruction is used merely to save space: it would be possible to have a number of NOP instructions in the program and simply use COR to copy them over something, but BLA is much more convenient.

An INC instruction creates 3 lines of SMITH, a DEC 9 lines, and HALT creates only 1.

The way the resulting MM-to-SMITH programs work is the following: First there are two lines that are executed only once during the entire program, lines that initialize registers R0 and R1. R0 is set value -1, so that it can be used in incrementing (x - -1 -> x + 1). R1 is set the length of the entire program (minus the 2 initialization lines) in SMITH lines. R2 is used in a calculation during DEC, and R3 is used to define what instruction will be executed next. (Technically R2 and R3 could be one register, as at that point whatever the content of R3 were could be replaced with the data of R2, and then again with a new value for R3, but there are two registers for clarity.)

The program, generally, works simply:

  • Execute initialization lines.
  • Loop:
    • Take copy of the entire source, excluding the 2 initialization lines, and place it right at the end of the 'current source'. This new copy of the program is called 'next source'.
    • Set blank/NOP the amount of instructions that R3 has as its value. On the very first cycle the value is 0 which is right, since the program always begins in the very first MM instruction (line 1 in MM). If, say, MM program has instructions INC INC DEC as its three first instructions, and current "line" to be executed would be fourth, then R3 would be 15. (INC is worth 3 lines, DEC is worth 9 lines of SMITH code; 3 + 3 + 9 = 15.) Thus, the program would then set blank the 15 first lines of SMITH, execute the NOPs, and then arrive at the right line.
    • Execute instruction, whatever it is. Both INC and DEC set all the instructions of the current source blank/NOP, so that nothing else gets executed -- until the next cycle. Yes, it would be possible to set blank only the instructions after, but it s easier to simply overwrite the entire current source using the program length set in R1. So now, after execution, the instruction pointer moves over NOPs till it reaches the first line of the next source. The loop begins again: A copy of the entire source is taken, the right amount of instructions blanked, and so on. Execution halts if there is a halt in the MM program, translated into SMITH simply as STOP.

The usage of program 'memory' thus increases by the length of the SMITH program (excluding the two initialization lines that set the registers) on every cycle. Executing a large program with a long run-time is, needless to say, quite memory consuming.

An example of translation

Here is a Minsky Machine program that computes 7 * 7:

# set 7 into register A
1 INC a 2
2 INC a 3
3 INC a 4
4 INC a 5
5 INC a 6
6 INC a 7
7 INC a 8

# take copy of register A into B and C
8 DEC a 9 11
9 INC b 10
10 INC c 8

# repeat until B is zero, on every cycle empty C into A and D, copy D back to C
11 DEC b 12 17

12 DEC c 13 15
13 INC a 14
14 INC d 12

15 DEC d 16 11
16 INC c 15

# halt, now A should have 49 as its value
17 halt

Run through the translator program, the following SMITH program is received:

SUB R0, 1
MOV R1, 75
COR +75, +0, R1
BLA +1, NOP, R3
SUB R4, R0
MOV R3, 3
BLA -4, NOP, R1
SUB R4, R0
MOV R3, 6
BLA -7, NOP, R1
SUB R4, R0
MOV R3, 9
BLA -10, NOP, R1
SUB R4, R0
MOV R3, 12
BLA -13, NOP, R1
SUB R4, R0
MOV R3, 15
BLA -16, NOP, R1
SUB R4, R0
MOV R3, 18
BLA -19, NOP, R1
SUB R4, R0
MOV R3, 21
BLA -22, NOP, R1
MOV R2, R4
NOT R2
MUL R2, 3
BLA +1, NOP, R2
SUB R4, 1
MOV R3, 30
BLA -29, NOP, R1
MOV R3, 36
BLA -31, NOP, R1
SUB R5, R0
MOV R3, 33
BLA -34, NOP, R1
SUB R6, R0
MOV R3, 21
BLA -37, NOP, R1
MOV R2, R5
NOT R2
MUL R2, 3
BLA +1, NOP, R2
SUB R5, 1
MOV R3, 45
BLA -44, NOP, R1
MOV R3, 72
BLA -46, NOP, R1
MOV R2, R6
NOT R2
MUL R2, 3
BLA +1, NOP, R2
SUB R6, 1
MOV R3, 54
BLA -53, NOP, R1
MOV R3, 60
BLA -55, NOP, R1
SUB R4, R0
MOV R3, 57
BLA -58, NOP, R1
SUB R7, R0
MOV R3, 45
BLA -61, NOP, R1
MOV R2, R7
NOT R2
MUL R2, 3
BLA +1, NOP, R2
SUB R7, 1
MOV R3, 69
BLA -68, NOP, R1
MOV R3, 36
BLA -70, NOP, R1
SUB R6, R0
MOV R3, 60
BLA -73, NOP, R1
STOP

If run in the official interpreter with the debuggin mode -d (which is actually the only way to know what happens, there being no output), at the end the register R4 ("A") will have the value 49.