Malbolge

From Esolang
Jump to: navigation, search

Malbolge, invented by Ben Olmstead in 1998, is an esoteric programming language designed to be as difficult to program in as possible. The first "Hello, world!" program written in it was produced by a Lisp program using a local beam search of the space of all possible programs. It is modelled as a virtual machine based on ternary digits.

Etymology[edit]

The language is named after Malebolge [sic], the eighth level of hell in Dante's Inferno, which is reserved for perpetrators of fraud. The actual spelling Malbolge is also used for the sixth hell in the Dungeons and Dragons roleplaying game.

Memory initialization[edit]

The program is fed directly into the virtual machine's memory starting at position 0. Any whitespace is skipped, including newlines. Only valid instructions and whitespace are allowed in the source, but a bug in the reference implementation makes it possible to enter instructions with a code greater than 126, which hang that interpreter if executed. There's only one officially allowed NOP code that can be entered, even if many possible NOPs are possible at run time (each non-instruction that is a printable character is a NOP).

Each word in the remainder of the memory is initialized as the result of applying the crazy operation (see below) to the contents of the previous and the second previous word (in that order).

Virtual machine description[edit]

The virtual machine is based on trits (trinary, i.e. ternary or base 3, digits). Each machine word is ten trits wide, making it range from 0 to 2222222222 ternary = 59048 decimal. Each memory position holds a machine word; the addresses are one machine word wide too. Both data and code share the same memory space.

(At this place it may be interesting to know that in the 1960s and 1970s trit-based computers were developed in the "eastern bloc" and still some engineers think about trit-based computers when talking about new non-silicon-based technologies.)

There are three registers, each of which holds one machine word: the code register C which is a pointer to the instruction that is about to be executed, the data register D used for data manipulation and the accumulator A also used by several instructions to manipulate data.

If the instruction to execute is not in the range 33-126, execution stops (the reference interpreter hangs in this case due to a bug). Otherwise, in order to determine the actual instruction to execute, the value pointed to by the C register is added to the C register itself and the result divided by 94, taking the remainder. Depending on the result, the following instructions are executed:

(C+[C])%94 Description Pseudocode Op
4 Set code pointer to the value pointed to by the current data pointer. C = [D] i
5 Output the character in A, modulo 256, to standard output. PRINT(A%256) <
23 Input a character from standard input and store it in A. A = INPUT /
39 Tritwise rotate right. A = [D] = ROTATE_RIGHT([D]) *
40 Set data pointer to the value pointed to by the current data pointer. D = [D] j
62 Tritwise "crazy" operation (see table below). A = [D] = CRAZY_OP(A, [D]) p
68 No operation. NOP o
81 Stop execution of the current program. STOP v

(The Op column specifies the operation letter according to the original specification.)

Note: The table above uses a criterium for input and output instruction codes that matches the reference interpreter instead of the one from the specification, which are reversed with respect to each other.

If the result is not any of the values in the table, a NOP (no operation) is executed. However no NOP instructions are allowed in the source code except those for which (C+[C])%94 = 68.

As an example, in the first memory position (as well as in positions which are multiple of 94) the following instructions are allowed in the source code: ' ( < B Q b c u; in the second memory position the list is: & ' = A P a b t, and so on, decreasing the code at each step and wrapping from ! to ~.

The "crazy" operation is defined according to the following table. It operates on corresponding trits of each machine word (it's a "tritwise" operation).

A
0 1 2
[D] 0 1 0 0
1 1 0 2
2 2 2 1

After each instruction is executed, if the memory position pointed to by C is in the range 33-126 then it is encrypted using the following translation table:

ORIGINAL:    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
TRANSLATED:  5z]&gqtyfr$(we4{WP)H-Zn,[%\3dL+Q;>U!pJS72FhOA1CB6v^=I_0/8|jsb9m<.TVac`uY*MK'X~xDl}REokN:#?G"i@

(that is, ! becomes 5, " becomes z, and so on). If it is not in the range 33-126 then the result is undefined (the reference interpreter has a bug in this case that can potentially cause a crash).

After that encryption step is performed, both C and D are incremented modulo 310 (59049 decimal) and the execution cycle is repeated.

Computational class[edit]

It has been argued that Malbolge is a bounded-storage machine with interactive input; that is, that it is Turing-complete if one ignores its memory limitations (and a finite-state automaton if one does not.) The difficulty in showing this, however, is in showing that 59049 memory words are enough to implement the FSA that simulates a universal Turing machine, as practical Malbolge programs consume lots of memory. But thanks to Hisashi Iizawa, who wrote a 99 bottles of beer program in just 21945 instructions of Malbolge code [1], there are now arguments favouring the BSM hypothesis.

Sample programs[edit]

A cat program in Malbolge (this one does not stop on EOF; one which does is several orders of magnitude more complex):

(aBA@?>=<;:9876543210/.-,JH)('&%$#"!~}|{zy\J6utsrq
ponmlkjihgJ%dcba`_^]\[ZYXWVUTSRQPONMLKJIHGF('C%$$^
K~<;4987654321a/.-,\*)
j
!~%|{zya}|{zyxwvutsrqSonmlO
jLhg`edcba`_^]\[ZYXWV8TSRQ4
ONM/KJIBGFE>CBA@?>=<;{9876w
43210/.-m+*)('&%$#"!~}|{zy\
wvunslqponmlkjihgfedcEa`_^A
\>ZYXWPUTSRQPONMLKJIH*FEDC&
A@?>=<;:9876543210/.-m+*)(i
&%$#"!~}|{zyxwvutsrqpRnmlkN
ihgfedcba`_^]\[ZYXWVU7SRQP3
NMLKJIHGFEDCBA@?>=<;:z8765v
3210/.-,+*)('&%$#"!~}_{zyx[
vutsrqjonmlejihgfedcba`_^]@
[ZYXWVUTSRo

The same program translated to normalized Malbolge for clarity:

jivvvvvvvvvvvvvvvvvvvvvvv<ivvvvvvvvvvvvvvoj/ivvvvv
vvvvvvvvvvjivvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv**v**j<
v*oopvvvvvvvvv/vvvv/vv
j
ppopppp*ooooooooooooo*ooooj
o*oopoooooooooooooooo*ooooj
ooo*ooopooopooooooooo*ooooj
oooooooo*oooooooooooooooooj
ooopopooooooooooooooo*ooooj
o*oooopoooooooooooooo*ooooj
ooooooooooooooooooooo*ooooj
ooooooooooooooooooooo*ooooj
ooooooooooooooooooooo*ooooj
ooooooooooooooooooooo*ooooj
ooooooooooooooooooooo*ooooj
oooooopoooopooooooooooooooj
ooooooooooi

(Normalized Malbolge means with the instructions decrypted to match the original specification, but in this variant every instruction has a different value when interpreted as data, depending on the address modulo 94. The term normalized Malbolge was coined by Andrew Cooke as can be seen in the External resources section below.)

See also[edit]

External resources[edit]