BF instruction minimalization

From Esolang
Jump to navigation Jump to search

Our goal is to reduce the number of BF instructions as far as possible (without regard to symmetric style), but not using tricks such as cycling through instructions with one instruction and executing them with the other. The idea is that the symbols representing the instructions are well defined and the functions they represent do not change during runtime. I/O is a requirement for BF-completeness. We start with the standard BF instructions:

8 instructions: ><+-[]., 

Formalizing requirements

Our goal of reducing the number of BF instructions without using tricks may require some formalism. Consider some language X. For the purposes of this page, we will call X BF-equivalent if there exists a simple mapping from X to BF and from BF to X. By simple mapping we mean that each command in X can be substituted by a finite number commands in BF which perform the same function. Conversely, each command in BF may also be substituted by a finite number of commands in X which perform the same function. Consider replacing "+" and ">" with "}". The simple mapping from X to BF would be

X  BF
-- ---
}  +>
<  <
-  -
[  [
]  ]
.  .
,  ,

The simple mapping from BF to X would be

BF X
-- -
>  -}
<  <
+  }<
-  -
[  [
]  ]
.  .
,  ,

Thus X is BF-equivalent. Notice that these substitution must be context-free. Braincrash and other Turning tarpits do not meet this requirement because there is not a single sequence of BF commands which may be substituted for "!" that performs the same function (it is context dependent). Unary and other encoding schemes are excluded for the same reason. Someone correct me if I've made any mistakes -- User:Orby

Further formalizing requirements

Another way of asking, "What is the smallest brainfuck derivative?" is to ask, "What is the rank of the brainfuck monoid?". Brainfuck is a concatenative language. Concatenative languages are monoids. Monoids have a rank (e.g. the cardinality of the smallest generating set). Clearly rank(bf) > 1, otherwise bf would be commutative.

Boolfuck (7 instructions)

BF is normally 8 or more bits wide. Let's just use one bit (and assume an 8-bit BF), that gives us a new intermediate language (similar to Boolfuck):

7 instructions: ><@[].,

The new instruction "@" flips the bit (this takes care of the previous "+" and "-" at once). Here is a conversion chart:

BF bit
-- ---
>  >>>>>>>>>
<  <<<<<<<<< 
+  >>>>>>>>[<]@>[@>]<<<<<<<<<[@]
-  @>>>>>>>>@[<@]>[>]<<<<<<<<<[@]
[  @[>>>>>>>>@[<@]>[>]<<<<<<<<<[@]>>>>>>>>[<]@>[@>]<<<<<<<<<@[@
]  >>>>>>>>>@<<<<<<<<<]>>>>>>>>>[<<<<<<<<<@>>>>>>>>>@]<<<<<<<<<]
.  >.<
,  >,<

BitChanger (6 instructions)

Next, it can be observed that the sequence ">@ < >@" performs the same function as ">" (see BitChanger):

6 instructions: }<[].,

The new instruction "}" is defined in terms of the previous ">@".

BF BitChanger
-- ----------
>  }<}}<}}<}}<}}<}}<}}<}}<}}<}
<  <<<<<<<<< 
+  }<}}<}}<}}<}}<}}<}}<}}<}[<]<}}<}[<}}<}]<<<<<<<<<[<}]
-  <}}<}}<}}<}}<}}<}}<}}<}}[<<}]}<}[}<}]<<<<<<<<<[<}]
[  <}[}<}}<}}<}}<}}<}}<}}<}}<}<}[<<}]}<}[}<}]<<<<<<<<<[<}]}<}}<}}<}}<}}<}}<}}<}}<}[<]<}}<}[<}}<}]<<<<<<<<<<}[<}
]  }<}}<}}<}}<}}<}}<}}<}}<}}<}<}<<<<<<<<<]}<}}<}}<}}<}}<}}<}}<}}<}}<}[<<<<<<<<<<}}<}}<}}<}}<}}<}}<}}<}}<}}<}<}]<<<<<<<<<]
.  }<}.<
,  }<},<

Combined I/O expansion (5 instructions)

5 instructions: }<[];
; = [.<]<[,<]

A five instruction solution, by User:jix can be formed by combining input and output. Simplified by User:calamari:

intermediate  jix
------------  ---
}             }
<             <
[             [
]             ]
.             [<}]}<}[<}]}<}[<}]<};
,             [<}]}<}[<}]<}}<}[<}];

We now have five instructions:

} >@
< <
[ [
] ]
; [.<]<[,<]

Skip If Zero expansion (4 instructions)

4 instructions: < } ^ ;

Four instruction possibility, by User:calamari. The program is contained in the unstated loop, +[ instructions ]:

instruction  meaning
-----------  -------
<            <
}            }
^            skips the next instruction if the byte at the pointer is 0
;            ; by jix

It's not immediately clear how to achieve loops of specified sections of code with this architecture (although it may be possible).

--Quintopia 23:34, 24 September 2008 (UTC)

Store the id of the loop and conditionally execute it's contents. ^ should be skip until next ^ if 0 at pointer and no-op for every second ^. I'm assuming that ^ works this way below. Example:

--<-<<+[+[<+>--->->->-<<<]>]<<--.<++++++.<<-..<<.<+.>>.>>.<<<.+++.>>.>>-.<<<+.

Can be divided into multiple parts and given each part an id

1 --<-<<+[
2 +[
3 <+>--->->->-<<<]
4 >]
5 <<--.<++++++.<<-..<<.<+.>>.>>.<<<.+++.>>.>>-.<<<+.

On the tape, every second cell is used for data, others are for the loop id. We do some command conversion

+ to >+< to -}}<< to }< 255 times and }}<<
- to >-< to -}-< to }< 255 times and } and }< 255 times and }<<
[ to >^<+>^< to -}^<}^< to }< 255 times and }^<}^<
] to >^<->^< if it contains no nested loops
> to ^}-}}<<<^ 255 times
< to ^}<<<}-}^ 255 times
insert +, so }< after the end of each part and ^}<^ 255 times after last part

The only problem with this is that ^ can't be nested, therefore this approach is impossible. ^ must have a pair that's different from it to make this work. Another way would be that ^ skips 2 commands instead of 1 and if one of them is a ^, it also skips that ^'s skipped commands, so ^^^}<}< would be entirely skipped if the cell at the pointer is 0. --GDavid (talk) 13:09, 26 August 2018 (UTC)

I/O goes left (3 instructions)

A three instruction possibility by User:ihope127:

( >@[
] ]
; [.<]<[,<]

This exploits the fact that ; always causes the pointer to move left.


Notes

On Jix's I/O

With the code "; = [.<]<[,<]" in the 5-instruction set by jix you can never output value 0, because it will skip the loop. The following can:

?         Core
--------------
;         [<,>]@[<.>@]

BoolFuck  ?
--------------
.         }[<}];
,         }[<}]<};

But the first loop "[<,>]" looks like an infinite one. We could make it like this: [<,>@]@[<.>@] so that the first loop breaks sometime but then it will be impossible to enter the second loop when appropriate as the value of the bit that selects the I/O operation is lost in the first loop. We could use two bits for selection or copy the bit in the cell next to it. -- (Tritonio) 150.140.227.112 22:50, 6 August 2008 (UTC)

[<[.<]]<[,<] would be able to print a single zero. --Quintopia 22:41, 24 September 2008 (UTC)

Combined <[

You can also combine < with [ (proposed by Aeron), creating <[ = |

?         Core
--------------
|         <[

BoolFuck  ?
--------------
<         |]
[         }|]}|

This creates a 4-source-instruction set with a 7-core-instruction set:

?         Core
--------------
}         >@
|         <[
]         ]
;         [<,>]@[<.>@]

BoolFuck  ?
--------------
@         |]}
<         |]
>         }|]}
[         }|]}|
]         ]
.         }}|]}|<}];
,         }}|]}|<}]<};

This looks inefficient, but it's a possibility.

But hang on...

User:Thematrixeatsyou has spotted a flaw with Aeron's version. If the Brainfuck architecture effectively remains unmodified, and the bit at the pointer equals 1, the program will enter an infinite loop. So it would require an alternative architecture, like the Gregor Richards version of BitChanger.

AshuraTheHedgehog's attempt

><+.,[]	removed -
><+.[]	. = , iff index is odd
>+.[]	> = < iff current cell is odd
>+[]	+ = . iff flag = 1 [uses bits]
>+;	; = [, ], [, ], etc
:;	Combined > and +

Iamcalledbob's attempt

This attempt starts the pointer at #1. The pointer cannot reach #0.

><@.,[]       change + and - into @, or change the bit(also change the #0 value)
><@[]         stdio is unnecessary.
><@|          |= [ or ]  like this: || = []
:<|           combined > and @
:|            : = < if the bit pointed is 1
|             | = : when #0 = 1

This attempt starts the pointer at #1. The pointer cannot reach #0.

><+-.,[]
|             | determines on input--- all the instructions above

Some considerations

There are some limits that need to be considered and that could show problems with some of these variants.

One instruction minimalizations

First of all a one instruction minimalization is impossible. Imagine that a minimalization has one command, and the + command in BF requires a of these commands and the > command requires b of these commands. If a program contains ab commands, it is impossible to tell whether it consists of a >s or b +s - two very different programs.

This assumes, of course, the instruction in question behaves the same way every time regardless of the previous state of the program, as is required to be able to call it an instruction minimalization rather than an encoding scheme. This is why Unary, for example, does not count as an instruction minimalization, nor do schemes encoding the standard instructions as a Turning tarpit.

Loop minimalizations

Many minimalizations use a system where [] is minimalized into a single | instruction. This is ambiguous, as a program like |||| cannot be determined to mean either [[]] or [][]. It is possible for such a language to be Turing-complete via specifying in advance how to match the markers (e.g. always matching the leftmost marker with the rightmost marker), but it will probably be unable to directly represent all brainfuck programs, and thus is not really a minimalization. (A minimalization that worked along these lines would need to imply a "universal" loop structure that can represent all possible loop structures, with the "unused" loops "commented out" by ensuring that the tape pointer was always 0 upon entering them, using tricks similar to the header comment.)

However...

There is other point, however, that, although it might be considered "cheating", has interesting implications. Imagine a BF equivalent where all commands are mapped to three bit binary strives. This is said to be invalid because its commands (0 and 1) have no mapping. However, we know BF to be Turing-complete, so we can implement any computable algorithm in it. This includes the commands +-[].,<>. Then, the 0 and 1 commands can increment some form of counter. Every third command, a conditional will execute, and use a sort of "switch/case" structure in order to execute the corresponding command. Thus, a language like this can actually be shown to have a direct mapping to BF, and therefore would be valid.

The reason such a scheme would be considered "cheating" is that, as mentioned above, it requires instructions to change behavior based on some internal state beyond what is on the tape, or else use the tape to encode said state, which would make it either not an instruction minimalization, not BF, or both.

TonyBrown148's attempt (2 instructions currently)

Start with these 8 commands:

+-<>,.[]

To make it easier, let's say that (abc) means 255 abc.

Step 1(Overflow control)

+<>,.[]

We will make the cell overflow when it reaches 256, so we don't need - because it can be replaced by (+).

Step 2(> and + combined)

<},.[]

} equals > and +, so > is (}<)} and + is <}.

Step 3(Merge I/O)

<};[]

The semicolon is , if current cell is 0 and . otherwise. It's okay because a character with ASCII 0 is invisible.

Step 4(Merge loop)

<};|

a single | is [ and || is ], if two |s are together, insert a (<})<} (it does nothing) between them.

Step 4.5(Change | )

<>;|

I changed the } into > to make it easier to the way of using |. The | is a [ if the next command is < and ] if the next command is >.

Step 4.75(Well... Change | again)

<>;|

But this time the | works like this:

|A|B|

Where A and B are a couple of >s which indicates where to start executing. x >s means executing from the x-th command after it. If this command reaches the end of the code, start from the beginning. Use A if current cell is non-zero and B if not. I call this the if-else-goto structure. Remember: this should count as 1 command no matter how many characters are there.

Step 5(merge < and |)

<>;

Three commands! The | is now the < but if the if-else-goto structure is <><>< (|>|>| in step 4.75) which does nothing, execute the normal < .

Step 6(merge < and ;)

<>

This time, if the two parts in the A and B of the if-else-goto has at least 2 >s (which means it skips the command after it), do ; instead of skipping. So 2 commands are possible!