mlatu-6
mlatu-6 is a concatenative esoteric programming language designed by User:Zhil based on the theory of concatenative calculus. It is a subset of mlatu.
Language Overview
mlatu-6 is a term rewriting language consisting of 6 primitive combinators and quotations. A quotation is a sequence of terms (a term is either a combinator or a quotation) wrapped in parentheses. When a primitive appears directly after an appropriate amount of quotations, a reduction can be made. The primitives are:
Command | Description |
---|---|
+
|
Copies the preceding quotation |
-
|
Removes the preceding quotation |
<
|
Removes a level of nesting from the preceding quotation |
>
|
Adds a level of nesting to the preceding quotation |
,
|
Concatenates the contents of the two preceding quotations into a new quotation |
~
|
Swap the contents of the two preceding quotations |
Uppercase letters of the latin alphabet are often used to denote an opaque sequence of terms. These can't be further reduced, even under strong reduction, and come in handy when you are trying to work out the stack effect of a combinator.
Usually programs are presented without whitespace, but interpreters may add a reduction to remove whitespace, so that it can be used to style programs.
Concatenative Theory
In the concatenative calculus, the stack possesses only 6 core capabilities: we can copy the top element, we can remove it, we can wrap it in a quote, we can unwrap it from its quote, we can concatenate the top two quotes, and we can swap the top two quotes. In fact, it only has 5 core capabilities, swapping can be constructed from the other 5 capabilities and the ability for quotations to nest, through great effort, but this was not known at the time of creating this language, and it results in much more of a tarpit. You will occasionally hear people refer to this tarpit language as mlatu-5.
If we make a base of 6 combinators that each do exactly one of these capabilities, in the simplest way possible, and give them simple 1-character names, we end up with exactly mlatu-6. The choice of these 6 combinators is unique. This makes it an excellent language for studying concatenative calculus.
Actually, nested quotations also possess the ability to swap quotes around, which is why we can construct ~
from the other 5 combinators with this equivalence:
~ = >(->)(-)>,>,+,<->,<>,>,+,<->,<
Evaluation Order
mlatu-6, unlike mlatu, allows reduction inside of quotations (so called transparent quotations), a strategy called strong reduction. By removing the pattern matching from mlatu, mlatu-6 seems to have the Church-Rosser property, meaning that while there might be multiple reduction paths, if they reach a normal form, it is unique.
So, for any given problem, there might be multiple reduction paths. Some of these paths might be infinite, even though finite paths exist. All finite paths should reduce to the same normal form, however.
mlatu on the other hand uses weak reduction, meaning we don't reduce inside of quotations. Some mlatu-6 interpreters might only offer weak reduction as well (and every mlatu interpreter is, in fact, a weak mlatu-6 interpreter).
Normal Order Reduction
To avoid dealing with this issue, we can define normal order reduction, which will avoid the infinite paths and guarantees a normal form if one exists (again, proof needed). To achieve this normal order, reductions cannot happen inside a quotation unless the outside environment of that quotation is already inert (in normal form). On top of that, we will perform reductions from left to right by convention, so that normal order resembles the equivalent stack machine evaluation, but this does not alter the end result, so implementers of normal order interpreters should be able to evaluate in any order, provided we follow the first rule (proof needed).
This type of reduction is called left-most outermost reduction.
A Simple Program
(+<)+<
Consider this simple program. The only operator that appears directly after a quotation is the second +
, so that will be our first reduction:
(+<)(+<)<
We copied the quotation, next up we can reduce the third <
:
(+<)+<
We arrived back at the initial state! Since the language is deterministic, this means there's no point in further reduction, we have reached an infinite loop. When our program can no longer be reduced, it is finished. Interpreters will often just stop after a set number of reductions too though.
Here is the same program, with a more compact display of the evaluation sequence:
(+<)+< (+<)(+<)< (+<)+<
The underlined term is the one which is to be evaluated at that step, with the bolded character being the combinator which will be applied to the preceding quotes. This styling also italicizes other valid evaluation paths, as can be seen in:
(+)(+-),<(~)+,- (++-)<(~)+,- ++-(~)+,- ++-(~)(~),- ++-(~~)- ++-
We could have also taken the italicized path first to arrive at the same final expression, but taking the leftmost valid path is typical.
Counting Program Length
The infinite program from the previous section is the smallest non-halting program at 6 symbols. Any other non-halting programs are at least 7 characters long. You will also notice that loops always seem to resemble this basic loop. Let's talk a bit more about program length:
(A)+
The length of this program is 1. When counting program length, we don't count the inert quotations that tend to precede the program. We can also write down the stack effect of this program: (A) -- (A)(A)
. This syntax means that the input is on the left, and the output is on the right. The "A" here is more like a variable, it could be any sequence of combinators, and it will be copied from the input to the output (twice in fact, in this case). We don't always have to include the inert quotations:
~
By itself, this program can't be reduced further, but we can trivially see that it would reduce further if we preceded it with two more combinators, so we say that the program has the stack effect (B)(A) -- (A)(B)
(~
is the swap operator, remember?).
In general, if we have a program which performs some stack effect when it's put after some amount of quotations, we can refer to it as its own combinator. They are often given a name, and help when constructing a larger program.
Base Lengths
Now that we have a method of counting program length, we can take a concatenative combinator, figure out the smallest program with the same stack effect as the combinator, and consider the length of that program to be the length of the combinator. k
which has the stack effect (B)(A) -- A
has a length of 3, since the smallest program with this stack effect is ~-<
.
The combinator cake
with stack effect (B)(A) -- ((B)A)(A(B))
seems to have a minimal length of 12 thanks to the program >~>>,+<~,~<
. In concatenative calculus, {k, cake}
forms a complete base, just like our 6 primitive combinators form one (complete means any combinator can be composed from either of these bases). We can add up the lengths of k
and cake
and arrive at a "base length" of 15. The mlatu-6 combinator base trivially has a base length of 6. A base length of 5 is possible because of the ~
equivalence defined above.
Now let's do something clever. Let's invent three new combinators as follows:
z = -< (B)(A) -- B hop = ~> (B)(A) -- (A)((B)) baba = ,+ (B)(A) -- (BA)(BA)
Here's why these three new combinators are really cool:
< = ()z ~ = hop< - = ()~z > = ()hop hop- + = ()baba , = baba-
This program is not valid mlatu-6, but we often use this syntax of including combinator names to make it clear what code does. To make it valid mlatu-6, we replace the names with their program and then remove whitespace.
We can still get all 6 combinators from the 3! This means that {a, b, c}
is a complete base, still with length 6! Isn't that cool? It seems like this is actually the ONLY such 3-combinator complete proper base where each combinator is length 2. A proof has not yet been found however.
Can we do a 2-combinator base of length 6? We sure can! User:PkmnQ came up with this marvel:
twee = ~,>+ (B)(A) -- ((AB))((AB)) z = -< (B)(A) -- B < = ()z u = (())twee z<z > = ()twee u - = >u ~ = >(>)twee u<<>twee u<< + = ()twee<~< , = ~twee u<
Meanwhile Moja came up with this one:
cwud = ,~>+ (C)(B)(A) -- (BA)((C))((C)) z = -< (B)(A) -- B < = ()z ~ = ()cwud z + = ()()cwud cwud z z - = ()~z > = ()()cwud-~- , = (()~)~>cwud z~<cwud z-
Here's the famous {k, cake}
base mentioned earlier:
k = ~-< = (B)(A) -- A cake = >~>>,+<~,~<, = (B)(A) -- ((B)A)(A(B)) - = ()k > = ()cake- ~ = >cake k < = ()~k + = ()cake<~< , = ((<)cake k<)cake()k cake()k
Can we go even further? Sure! We will construct a combinator that is able to construct z
and one other combinator we'll call unknown
.
double = ~>~>,~,< = (C)(B)(A) -- (B)(A)C base = (a)(---b) double z = -< (base) base = ((a)(---b)double)(a)(---b)double = (a)(---b)(a)(---b)double = (a)(a)(---b)---b = b if we set b = z, then (base) base = z (z) base = (a)(---z)z = a we pick unknown for a, base = (unknown)(---z) double = (unknown)(----<)~>~>,~,< = ((unknown)(----<))~,< (base) base = z (z) base = unknown
So for twee
and cwud
, we get ((~,>+)(----<))~,<
and ((,~>+)(----<))~,<
respectively.
From the base
combinator we can construct the 2-combinator base, and then from there the 6-combinator base.
When Moja discovered this base, she originally had base = (a)((((b)k)k)k) double
. But Zhil discovered that this was equivalent to base = (a)(---b) double
, which is much simpler.
Simplifying Reductions
Some constructs can always be replaced with others. For example, you can always replace +~
with +
because the top two items are the same after a copy. Here's a list of reductions you can do (empty right hand side means you can just remove them altogether):
+- == >< == ~~ == (), == +~ == + >- == - -- == ,- ~-- == -- ~,- == ,- +>~ == >+< +>~> == >+ >+<> == >+ >,<< == ,<
Busy Beavers
A busy beaver is the halting program that runs for the most reductions or produces the largest state before halting from any program of the same length. For mlatu-6, we study two kinds of busy beavers: reduction busy beavers count the number of reduction steps under normal order reduction, while size busy beavers count the output size after reduction.
Reduction Busy Beavers - Champions
Size | Reductions | Program |
---|---|---|
3 | 1 | ()+
|
4 | 2 | ()+<
|
5 | 3 | ()+>~
|
6 | 5 | (+)+<<
|
7 | 7 | (+)+<<<
|
8 | 12 | (+,+)+<<
|
9 | 37 | (+,+)+<<<
|
10 | >=6182 | (+,+)+<<<<
|
11 | >3*2^2059 | (+,+)+<<<<<
|
Size Busy Beavers - Champions
Size | Final size | Program |
---|---|---|
2 | 2 | ()
|
3 | 4 | ()+
|
4 | 8 | ()>+
|
5 | 12 | ()>>+
|
6 | 18 | ()>>++
|
7 | 24 | ()>>+++
|
8 | 66 | (+,+)+<<
|
9 | >=18416 | (+,+)+<<<
|
10 | >2^2060 | (+,+)+<<<<
|
Kerby Combinators
Brent Kerby's page on concatenative calculus contains a list of combinators. Here are these combinators and the smallest programs we've found for them. There might be mistakes in this list and the programs listed might not yet have been confirmed to be minimal. We also included versions with and without parentheses for the longer ones.
zap == (A) -- 1: - i == (A) -- A 1: < unit == (A) -- ((A)) 1: > dup == (A) -- (A)(A) 1: + cat == (B)(A) -- (BA) 1: , swap == (B)(A) -- (A)(B) 1: ~ m == (A) -- (A)A 2: +< z == (B)(A) -- B 2: -< t == (B)(A) -- (A)B 2: ~< nip == (B)(A) -- (A) 2: ~- swat == (B)(A) -- (AB) 2: ~, tack == (B)(A) -- (B(A)) 2: >, sap == (B)(A) -- AB 3: ~,< k == (B)(A) -- A 3: ~-< rep == (A) -- AA 3: +,< take == (B)(A) -- (A(B)) 3: ~>, run == (A) -- A(A) 4: +>,< dip == (B)(A) -- A(B) 4: ~>,< cons == (B)(A) -- ((B)A) 4: ~>~, w == (B)(A) -- (B)(B)A 6: (+)~,< 7: ~>+,~,< c == (C)(B)(A) -- (B)(C)A 6: (~)~,< 11: >~>~,~>,<~< poke == (C)(B)(A) == (A)(B) 7: >~>,~-< peek == (B)(A) -- (B)(A)(B) 8: >(+)~,<~ 9: >~>+,~,<~ dig == (C)(B)(A) -- (B)(A)(C) 8: (~)~>,<~ 9: >~>~,~>,< bury == (C)(B)(A) -- (A)(C)(B) 8: ~(~)~>,< 9: >~>,~>,<~ flip == (C)(B)(A) -- (A)(B)(C) 8: >~>,~>,< 9: ~(~)~>,<~ b == (C)(B)(A) -- ((C)B)A 9: (~>~,)~,< 13: >~>,~>,<>~,~< sip == (B)(A) -- (B)A(B) 11: >(+)~,<~>,< 12: >~>+,~,<~>,< cake == (B)(A) -- ((B)A)(A(B)) 12: >~>>,+<~,~<, (by Garklein and Moja) 19: >~>,+(>~,),~(>,),,< (by Forth Truffle) s == (C)(B)(A) -- ((C)B)(C)A 15: ((+>)~>,<,~)~,< (by Moja and Forth Truffle) 23: >~>,~>,<>>+,~>,<,>~,~,< (by Moja) s' == (D)(C)(B)(A) -- ((D)C)A(D)B 23: (~,)~,>(~>+)~,(~,<),~,< (by PkmnQ) 30: ~>,>~>,~>>+,~>,<~>,<~,>~,~,<~< (by sakito system) j == (D)(C)(B)(A) -- ((C)(D)A)(B)A 27: +>(>~>,)~,(~>~,)~(,),>,<~,< (by Garklein and Zhil) 29: >+>~,~>,<>~,>~,~>>,~>>,<,~,~< (by PkmnQ) j' == (E)(D)(C)(B)(A) -- ((D)A(E)B)(C)B 35: ~(~(~>~,~>,)~>,<)~>,<+(~(,)~>,<)~,< (by Garklein)
Open Challenges
- Finding some of the shortest programs with an element of chaos (aka they run for a while and don't just produce very simple repetitive output)