Simple translation

From Esolang
Jump to navigation Jump to search

A simple translation (ST) is a relationship between two programming languages. It is a particular type of equivalence between languages. It was developed to formalize the notion of a minimization.


Simple English definition

A language is a ST of another language if a pair of translation tables exist between those languages which preserve certain properties. For each row, a translation table contains a single symbol from the source language in the left column and a finite sequence of symbols from the destination language in the right column. Use the translation tables to convert from the source language, to the destination language and back. If semantics are preserved during this process, then the destination language is a ST of the source language.

Formal definition

Suppose we have language A with commands a1, a2, ... an and language B with commands b1, b2, ... bm (note that neither A nor B must possess a finite number of commands). A translation table from A to B looks like

A B equivalent
a1 β1,1β1,2...β1,p1
a2 β2,1β2,2...β2,p2
... ...
an βn,1βn,2...βn,pn

where βi,j is one of bk. A translation table from B to A looks like

B A equivalent
b1 α1,1α1,2...α1,q1
b2 α2,1α2,2...α2,q2
... ...
bm αm,1αm,2...αm,qm

where αi,j is one of ak. Notice that both tables must exist.

Take a program consisting of a sequence symbols in A denoted by x0. Translate that program to B using (1) to produce y. Translate y back to A using (2) to produce x1. If, for an arbitrary initial state, x0 halts if and only if x1 halts and they produce equivalent final states, then B is a ST of A.

Notice that if B is a ST of A, then B has implicit semantics defined by (2) in terms of A. This effectively provides a definition for B which may or may not match some other semantic interpretation. Under the implicit semantic interpretation, B is clearly Turing complete (TC) if and only if A is TC.

Thinking in terms of monoids

Concatenative programming languages can be thought of as monoids (reversible concatenative programming languages like Reversible Bitfuck can be thought of as groups). If a language A uses the symbols a1, a2, ..., an, then one way to think of that language is as a quotient of the free monoid generated by A:

 A* := <A>

where A = {a1, a2,... an}. Another way of writing this is defining a set of equivalence relations RA* x A* ((x, y) ∈ R if x and y are equivalent programs) and thinking of A0 as A* modulo R. We write this

 A0 := <A|R>

two languages can be said to be simple translations of each other if

 <A|R> = <B|S>

that is, if the monoids with different presentations are in-fact the same monoid.

Rank of monoid

If we have R0R, then rank(<A|R>) ≤ rank(<A|R0>) where rank(<A|R>) is the cardinality of the smallest set which generates <A|R>. This allows us to define subsets of the proper R of a language to set upper-bounds on it's rank. This is useful because, in general, R cannot be computed. One useful approach we have found is to define R0 to contain the obvious nops in a language. For example

 M := < {a, b, b', c, c'} | aa = bb' = cc' = ε >

represents the structure of a monoid of which the RBF monoid is a quotient (think a = '+', b = '>', b' = '<', c = '(', c' = ')'). R in this case contains many other relations, but these relations are sufficient to prove that rank(RBF) ≤ 3 because

 M0 := < { ab, b'c, c' } | aa = bb' = cc' = ε >

is equivalent to M:

 a  = (ab)(b'c)(c') = a(bb')(cc') = a
 b  = (ab)(b'c)(c')(ab) = a(bb')(cc')ab = aab = b
 b' = (b'c)(c') = b'(cc') = b'
 c  = (ab)(b'c)(c')(ab)(b'c) = a(bb')(cc')a(bb')c = aac = c
 c' = c'

so rank(RBF) ≤ rank(M) = rank(M0) ≤ 3

Another example

This example establishes an upper bound for the rank of brainfuck without I/O. Let C := { +, -, <, >, [, ] }. Then

 BF := <C|R>

for some R. Notice |C| = 6. We know R0 := { +- = -+ = <> = >< = ε } ⊆ R. Moreover,

 BF0 := < { >+, -, <, [, ] } | R0 >

is equivalent to <C|R0> because (>+)(-) = >(+-) = > and (<)(>+) = (<>)+ = +. Therefore rank(BF) ≤ 5.


Consider Nanofuck (NF) and Reversible Bitfuck (RBF). The following translations establish that they are STs of each other.

RBF NF equivalent
+ *{}
> *{}*
< {}
( *{}*{
) }
NF RBF equivalent
* +>
{ <(
} )

From RBF to NF and back gives us

+ *{} +><()
> *{}* +><()+>
< {} <()
( *{}*{ +><()+><(
) } )

clearly RBF and RBF' are equivalent (+ twice on the same cell is a NOP, () is a NOP, and >< is a NOP). This establishes that NF is a ST of RBF. In the same way we can prove that RBF is a ST of NF.


Simple translations are a useful tool in formalizing the notion of an instruction minimization. Suppose a language A contains n commands and a language B contains m commands. If m < n and B is a ST of A, then B can be said to be a minimization of A.

Another way of thinking about a minimization that is equivalent is to consider the rank of the monoid. RBF, for example, is a quotient of the monoid with the presentation

M := <{a, b, b', c, c'} | aa = bb' = b'b = cc' = c'c = ε>

That implies that rank(RBF) <= rank(M). The question then becomes, what is the rank of M? It is three.

Input and output

Input and output consideration may be added by considering the input buffer and output buffer part of the machine state.

See also