Malbolge programming

From Esolang
Jump to: navigation, search

The Malbolge language has eluded many attempts to use it for some years. That was before Lou Scheffer published his cryptanalysis of the encryption algorithm it uses. This article walks the way he opened, explaining many of his findings in detail and other caveats that must be taken into account not mentioned by him.


Malbolge's probably biggest piece of evilness comes from the encryption of instructions as they are executed. That encryption, fortunately, has a few weaknesses that make writing Malbolge programs feasible.

Jumps are not encrypted

First of all let's examine the order in which instruction fetching and execution is performed:

  1. Fetch the instruction from [C] and decrypt it.
  2. Execute the instruction.
  3. Encrypt the byte at [C].
  4. Increment C.

In the case of the jump instruction, whose execution has the effect of changing the value of the code pointer C in step 2 above, the memory position that gets encrypted via step 3 is the final value of the code pointer, i.e. the address previous to the target of the jump, instead of the jump instruction itself which remains unencrypted. That's a great advantage as we will see later.

Cycles of length 2

Another weakness that can be exploited is the existence of length-2 cycles in the encryption step. Every possible instruction can be written as a 2-cycle in which the other instruction in the cycle is a NOP, that is, when an instruction is executed it turns into a NOP and when that NOP is executed it turns into the original instruction.

Here's a table of these memory addresses modulo 94 allowing period-2 instructions where the second is a NOP, together with the instruction itself (there are exactly two possible addresses for each of these instructions, so both are listed):

Address (mod 94) Alternative address (mod 94) Instruction
25 29 <
43 47 /
59 63 *
60 64 j
82 86 p

The remaining three instructions are irrelevant here: v is executed at most once per program, i is not encrypted as explained above, and o is part of the immutable NOPs mentioned below and the cycle length is not important.

Immutable NOPs

The last weakness that makes it practical to write Malbolge programs is the existence of at least one immutable NOP for every possible memory address. An immutable NOP is an instruction that always behaves as a NOP no matter the number of times it is executed. However very few addresses allow to directly enter immutable NOPs into the code because the range of allowed values in the source code is very limited. (Lou Scheffer notes that such values can always be entered by using a bug in the reference interpreter which allows values greater than 126 to be entered, then rotating with the * instruction, but other authors consider that making use of that bug is cheating.)

Note the distinction between the o instruction and any of the other NOPs: the former may be entered directly into the code while the latter can only be obtained by indirect means. Both can be immutable NOPs, though, depending on the address. The following addresses modulo 94 allow to directly enter an immutable NOP into the code by using the o instruction:

6, 8, 10, 17, 26, 27, 37, 42, 48, 51, 82, 86, 88, 92

In memory addresses other than these the o (NOP) instructions will mutate into a non-NOP instruction after a variable number of executions. For example, an o executed at position 0 modulo 94 will become a j after 29 executions.

The values 70 and 74 form the 2-cycle: an instruction with value 70 becomes 74 when executed and 74 becomes 70 and this is independent of the memory address (what is dependent is the meaning of that value). They are immutable NOPs for most addresses modulo 94. The following table lists all the addresses that are an exception to this rule, together with sample values which are immutable NOPs for them:

Address (mod 94) Value
7 60
11 58
24 60
25 60
28 60
29 60
43 60
47 58
59 60
60 58
63 60
64 60
82 62, 80
86 60, 76

The importance of immutable NOPs can easily be seen by realizing that the data pointer D is incremented with every instruction executed, just as the code pointer, so in order to position the data pointer as required it's sometimes necessary to advance it by inserting NOP instructions in the code.

Putting all together

We'll see now how to take advantage of these weaknesses to write actual programs.

Program flux

Let's assume we want to execute the following code (as in the cat program):

  1. Input a character.
  2. Output the character.
  3. Jump to step 1.

Easy as it looks, it's not so easy to write in Malbolge. To start with, this must be translated to the following:

  1. Input a character (at address 43 mod 94)
  2. Jump to address 25 mod 94.
  3. Output a character (at address 25 mod 94)
  4. Jump to address 60 mod 94.
  5. Reload the data pointer to start again.
  6. Jump to address 43 mod 94.

Since the instructions are at addresses which form a 2-cycle, in the first execution all instructions will perform their function while in the second they will act as NOPs, then the cycle will start again.

Now it starts to look more promising, doesn't it? Unfortunately that's not the whole story.

Store operation

Crazy as it is, the crazy operator can be very useful. At a minimum it provides a mechanism for storing values into memory, which is necessary for almost any operation except the most basic ones as the above. For example, checking for EOF before printing the character is not possible without using at least a partial version of the store operation, because there's no way to examine the contents of the accumulator without altering it.

It has to be made in several steps. Prior to the store itself the memory address <addr> in which it is performed must be initialized to 1111111111 ternary (29524 decimal). This can be done in this way:

  1. Set A to 1111111111t (29524 decimal).
  2. Op into <addr>.
  3. Set A to 2222222222t (59048 decimal).
  4. Op into <addr>.
  5. Set A to 0.
  6. Op into <addr>.

After step 6, the memory word at <addr> will contain 1111111111t no matter the initial value it contained. Now <addr> is ready for a value to be stored in it.

The store of the accumulator A into memory now consists of two steps: first OP the value into a location already containing 1111111111t; (this will swap the 0's and 1's in A); then OP the resulting value into <addr> (this will unswap the 0's and 1's, leaving the original value stored in both A and <addr>). Note that this method makes it necessary to perform the above 6 steps into just another memory address. That's part of the fun of writing Malbolge programs.

Load operation

There are two simple(ish) ways of loading a value from memory into A; a straightforward one is to perform ten rotates, but this is very costly. There's another one using the crazy operator: load A with 2222222222t (this requires just one rotation), then OP into the address with the value to load, then load A again with 2222222222t, then OP again into that address. After these operations the accumulator will hold the value originally present in that address.

Not so easy as it looks

Now that we've reached the point where Scheffer stops let's continue from there, pointing out the problems that are encountered on the way. For now we'll assume that the memory can be preloaded at will, even if that's not the case.

Data pointer load operations need fixup

Let's remember the algorithm for writing a cat program which does not stop on EOF:

  1. Input a character (at address 43 mod 94)
  2. Jump to address 25 mod 94.
  3. Output a character (at address 25 mod 94)
  4. Jump to address 60 mod 94.
  5. Reload the data pointer to start again.
  6. Jump to address 43 mod 94.

The first point to note is that step 5 can't be performed every two cycles: it must take effect immediately or otherwise the jump addresses have to be repeated again, meaning an avoidable waste of preinitialized data space. The solution to this problem is to revert the j instruction at step 5 just after it's executed. The way of doing this is to re-execute or "pseudo-execute" it (a pseudo-execution means to jump to the instruction immediately following it, so that the encryption step is performed on it anyway). We add a step 5b executed after step 5:

5b. Encrypt the j instruction of step 5.

This will complete the simplified version of the cat program. The design involves working with both the data and the code section at the same time, keeping track especially of the value of the D register, as follows (the choice of 97 as the start of the data area is arbitrary):


Address Content
97 60
98 42
99 (unused)
100 24
101 (unused)
102 59
103 96


Step C register D register Content of [D] Ins. C after D after
1 43 99 - / - -
2 44 100 24 i 24 -
3 25 101 - < - -
4 26 102 59 i 59 -
5 60 103 96 j - 96
5b 61 97 60 i 60 -
6 61 98 42 i 42 -

(Remember that both C and D are incremented between steps, and that the final value of C prior to the increment is the position that gets encrypted.)

This will loop forever; every two iterations it will input a byte and output it, and execute two NOPs at the next iteration. It does not stop on EOF; instead it outputs 2222222222t mod 256 = 168 repeatedly.

Now it should be apparent that it's possible to achieve execution of arbitrary code by means of jumping to the places where the instructions are to be executed, but there are still a few problems that need to be addressed: the overhead due to data manipulation and the initialization of memory with the desired values.

Initialization of memory

We've assumed that the memory can be preloaded at will, even if that's not the case. Why? Because it's possible (in principle) to create an initialization routine that is executed only once, with no loops or repetitions, so that there's no need to care about self-encryption. That routine's mission is to set up the memory just as needed. The problem is that the routine can very expensive in terms of number of instructions used: many instructions may be needed for one single memory word to reach the desired value. Fortunately it's much easier in Malbolge to write execute-once code than reusable code.

Overhead due to data manipulation

With the programming method exposed above, when a memory word at address <addr> needs to be manipulated (e.g. for a store operation), for each instruction that acts on it there's a need of:

  • A data pointer that positions the data register before it via a j instruction (it can be before or after <addr>).
  • A program pointer to fix up that j instruction.
  • A program pointer that jumps to the instruction that manipulates <addr>.
  • A program pointer that jumps to the next instruction, which is probably a j if more operations on <addr> are needed.
  • A certain number of immutable NOPs around the instruction that make the data register skip other words used to manipulate the value at <addr> (if it's the only instruction manipulating <addr>, then these are not necessary).

As we've explained things, it's not possible to reuse an instruction which already has the desired number of immutable NOPs around it in the same turn of the main loop, because the second time it's executed it acts as a NOP. This means that either of these two approaches has to be taken: either the instructions have to be allocated (in different addresses which have the same value mod 94) or they have to be fixed up on the fly in the same manner as the j instruction was. The second approach is by far the most desirable one because it means just one extra program pointer per instruction, while the first one implies initialization of lots of immutable NOPs and allocation of many extra instructions. So we add one extra word of overhead to the ones already stated:

  • A program pointer to fix up the data manipulation instruction for reuse.

Now it may be more apparent why the approach of making ten rotate operations for one load was not the preferred one, as that would mean ten times the overhead explained above plus the initialization of it all.

Example: How to store the accumulator into memory (long)

Here's an example of a subprogram that stores the accumulator into a memory position, for the purpose of illustrating how the influence of the overhead mentioned above applies to Malbolge programs. Please note that the code is untested and there might be bugs; it's more an example than anything else. It's likely to be possible to reduce the total size, but not much. The code pointers use labels while the data addresses are expressed numerically. Address 200 is chosen as the target of the store/load operation and address 300 is chosen as the ancillary address that has to be filled with 1111111111t as part of the store operation. All o instructions are assumed to be immutable NOPs. The labels for code pointers represent the address of the memory word prior to the address where the label points to. The labels ending in "f" are the fixup addresses of the corresponding instructions. In general, special care must be taken to ensure that the address prior to the target of a jump always contains a character in the printable ASCII range (33-126) or the reference interpreter may crash.

Data starting at address 185:

Address Content
185 L3f
186 L7b
187 L3f
188 L1
189 0
190 L1f
191 L5
192 L3f
193 L1
194 2222222222t
195 L1f
196 L4
197 1111111111t
198 L1f
199 L2
200 (initial value is irrelevant)
201 L2f
202 L3
203 191
204 L4f
205 L5f
206 L7f
207 (unused)
208 L3
209 186
210 L3
211 297
212 (jump to continuation of program)

Data starting at address 288:

Address Content
288 L3f
289 L1
290 0
291 L1f
292 L5b
293 L3f
294 L1
295 2222222222t
296 L1f
297 L4b
298 L3f
299 L2
300 (initial value is irrelevant)
301 L2f
302 L3
303 292
304 L4f
305 L5f
306 L7f
307 (unused)
308 L3
309 287
310 L6
311 (unused)
312 L3
313 184


Label D register Content of [D] Ins. Comment
L1 197 1111111111t * Load A with all 1's
L1f 198 L1f i
L1f 199 L2 i The * is now fixed up and we can jump to L2.
L2 200 ??????????t p Op all 1's into [200]
L2f 201 L2f i
L2f 202 L3 i
L3 203 191 j Rewind D for next pass over [200]
L3f 192 L3f i
L3f 193 L1 i
L1 194 2222222222t * Load A with all 2's
L1f 195 L1f i
L1f 196 L4 i
L4 197 (1111111111t) o
L4b 198 (L1f) o
199 (L2) o
200 ??????????t p Op all 2's into [200]
L4f 201 (L2f) o
202 (L3) o
203 (191) o
204 L4f i
L4f 205 (L5f) o The p is now fixed up.
206 (L7f) o
207 (-) o
208 L3 i
L3 209 186 j Rewind D again for next pass over [200]
L3f 187 L3f i
L3f 188 L1 i
L1 189 0 * Load A with all 0's
L1f 190 L1f i
L1f 191 L5 i
L5 192 (L3f) o
L5b 193 (L1) o
194 (2222222222t) o
195 (L1f) o
196 (L4) o
197 (1111111111t) o
198 (L1f) o
199 (L2) o
200 ??????????t p After this 'p', both A and [200] hold 1111111111t.
L5f 201 (L2f) o
202 (L3) o
203 (191) o
204 (L4f) o
205 L5f i
L5f 206 (L7f) o
207 (-) o
208 (L3) o
209 (186) o
210 L3 i
L3 211 297 j Change D to manipulate [300]
L3f 298 L3f i
L3f 299 L2 i
L2 300 ??????????t p A still holds 1111111111t
L2f 301 L2f i
L2f 302 L3 i
L3 303 292 j
L3f 293 L3f i
L3f 294 L1 i
L1 295 2222222222t * Load A with all 2's
L1f 296 L1f i
L1f 297 L4b i
L4b 298 (L3f) o
299 (L2) o
300 ??????????t p Op all 2's into [300]
L4f 301 (L2f) o
302 (L3) o
303 (292) o
304 L4f i
L4f 305 (L5f) o
306 (L7f) o
307 (-) o
308 L3 i
L3 309 287 j
L3f 288 L3f i
L3f 289 L1 i
L1 290 0 *
L1f 291 L1f i
L1f 292 L5b i
L5b 293 (L3f) o
294 (L1) o
295 (2222222222t) o
296 (L1f) o
297 (L4b) o
298 (L3f) o
299 (L2) o
300 ??????????t p Op all 0's into [300] (after this, [300] has all 1's)
L5f 301 (L2f) o
302 (L3) o
303 (292) o
304 (L4f) o
305 L5f i
L5f 306 (L7f) o
307 (-) o
308 (L3) o
309 (287) o
310 L6 i
L6 Do something that loads the accumulator with the value to store.
L7 288 (L3f) o
289 (L1) o
290 (0) o
291 (L1f) o
292 (L5b) o
293 (L3f) o
294 (L1) o
295 (2222222222t) o
296 (L1f) o
297 (L4b) o
298 (L3f) o
299 (L2) o
300 1111111111t p Op A into dummy position
L7f 301 (L2f) o
302 (L3) o
303 (292) o
304 (L4f) o
305 (L5f) o
306 L7f i
L7f 307 (-) o
308 (L3) o
309 (287) o
310 (L6) o
311 (-) o
312 L3 i
L3 313 184 j
L3f 185 L3f i
L3f 186 L7b i
L7b 187 (L3f) o
L7 188 (L1) o
189 (0) o
190 (L1f) o
191 (L5) o
192 (L3f) o
193 (L1) o
194 (2222222222t) o
195 (L1f) o
196 (L4) o
197 (1111111111t) o
198 (L1f) o
199 (L2) o
200 1111111111t p Complete the store operation
L7f 201 (L2f) o
202 (L3) o
203 (191) o
204 (L4f) o
205 (L5f) o
206 L7f i
L7f 207 (-) o
208 (L3) o
209 (186) o
210 (L3) o
211 (297) o
212 ... i Go on with program logic

Label values:

Label What it points to Value
L1 * 58
L1f fixup * at address 59 59
L2 p 81
L2f fixup p at address 82 82
L3 j 63
L3f fixup j at address 64 64
L4 o×3,p,o×3 172
L4b o×2,p,o×3 173
L4f fixup p at address 176 176
L5 o×8,p,o×4 261
L5b o×7,p,o×4 262
L5f fixup p at address 270 270
L6 depends on the program ???
L7 o×12,p,o×5 351
L7b o×13,p,o×5 350
L7f fixup p at address 364 364

This code requires preinitialization of 49 data words and 24 immutable nops, just to store the accumulator into memory; it does not include the code for loading it afterwards which needs lots of immutable NOPs to skip over data words that are used by the store operation. Perhaps now it's clearer why there are some doubts on whether 59049 words are enough for a complete program using this programming technique.

Reducing the number of immutable NOPs

Instead of controlling the program flow with the code register, which means that all commands are written in their correct order filled with immutable NOPs, it is possible to control the program flow using the data register. This technique has been developed by Hisashi Iizawa et al. and has been used in their 99 bottles of beer program.

Using this technique every instruction in Malbolge (except Jmp) will only occur once in the whole program at a position where the encryption function has a 2-cycle, followed by a Jmp instruction (there may be some exceptions, but this is the basic idea). Thus we know that after the execution of every instruction a Jmp instruction will follow.

Example: Loading 0t1111111111 into the A register

Code pointer to rotate instruction.
Code pointer to rotate instruction + 1 (restore)

The next data word will be a pointer to another Malbolge command, e.g. the crazy instruction, followed by the cell used by it.

When we control the program flow with the D register, branches can be executed by MoveD. If we use a MoveD instruction with a 2-cycle of the encryption function without a restore operation, we can build a loop that will be executed two times.

Using this technique it's possible to save many immutable NOPs, so the number of cells that have to be initialized decreases dramatically. Simple programs in Malbolge still need many memory cells. However, with this technique programs like a cat that halts on EOF are possible.

Programming with memory addresses is very confusing, and it may be time-consuming to manually write code in Malbolge to initialize the memory cells that are needed. To avoid this one can use a Malbolge assembler. Such an assembler can be found in the external resources: LMAO (Low-level Malbolge Assembler, Ooh!). LMAO comes with some examples written in the Malbolge assembler language HeLL (Hellish Low-level Language), so the user can understand the way HeLL works. One of the examples is a cat program that halts on EOF.

External resources