Simple translation
A simple translation (ST) is a relationship between two programming languages. It is a particular type of equivalence between languages. It was developed to formalize the notion of a minimization.
Definition
Simple English definition
A language is a ST of another language if a pair of translation tables exist between those languages which preserve certain properties. For each row, a translation table contains a single symbol from the source language in the left column and a finite sequence of symbols from the destination language in the right column. Use the translation tables to convert from the source language, to the destination language and back. If semantics are preserved during this process, then the destination language is a ST of the source language.
Formal definition
Suppose we have language A with commands a_{1}, a_{2}, ... a_{n} and language B with commands b_{1}, b_{2}, ... b_{m} (note that neither A nor B must possess an infinite number of commands). A translation table from A to B looks like
(1) | |
A | B equivalent |
---|---|
a_{1} | β_{1,1}β_{1,2}...β_{1,p1} |
a_{2} | β_{2,1}β_{2,2}...β_{2,p2} |
... | ... |
a_{n} | β_{n,1}β_{n,2}...β_{n,pn} |
where β_{i,j} is one of b_{k}. A translation table from B to A looks like
(2) | |
B | A equivalent |
---|---|
b_{1} | α_{1,1}α_{1,2}...α_{1,q1} |
b_{2} | α_{2,1}α_{2,2}...α_{2,q2} |
... | ... |
b_{m} | α_{m,1}α_{m,2}...α_{m,qm} |
where α_{i,j} is one of a_{k}. Notice that both tables must exist.
Take a program consisting of a sequence symbols in A denoted by x_{0}. Translate that program to B using (1) to produce y. Translate y back to A using (2) to produce x_{1}. If, for an arbitrary initial state, x_{0} halts if and only if x_{1} halts and they produce equivalent final states, then B is a ST of A.
Notice that if B is a ST of A, then B has implicit semantics defined by (2) in terms of A. This effectively provides a definition for B which may or may not match some other semantic interpretation. Under the implicit semantic interpretation, B is clearly Turing complete (TC) if and only if A is TC.
Thinking in terms of monoids
Concatenative programming languages can be thought of as monoids (reversible concatenative programming languages like Reversible Bitfuck can be thought of as groups). If a language A uses the symbols a_{1}, a_{2}, ..., a_{n}, then one way to think of that language is as a quotient of the free monoid generated by A:
A^{*} := <A>
where A = {a_{1}, a_{2},... a_{n}}. Another way of writing this is defining a set of equivalence relations R ⊆ A^{*} x A^{*} ((x, y) ∈ R if x and y are equivalent programs) and thinking of A_{0} as A^{*} modulo R. We write this
A_{0} := <A|R>
two languages can be said to be simple translations of each other if
<A|R> = <B|S>
that is, if the monoids with different presentations are in-fact the same monoid.
Rank of monoid
If we have R_{0} ⊆ R, then rank(<A|R>) ≤ rank(<A|R_{0}>) where rank(<A|R>) is the cardinality of the smallest set which generates <A|R>. This allows us to define subsets of the proper R of a language to set upper-bounds on it's rank. This is useful because, in general, R cannot be computed. One useful approach we have found is to define R_{0} to contain the obvious nops in a language. For example
M := < {a, b, b', c, c'} | aa = bb' = cc' = ε >
represents the structure of a monoid of which the RBF monoid is a quotient (think a = '+', b = '>', b' = '<', c = '(', c' = ')'). R in this case contains many other relations, but these relations are sufficient to prove that rank(RBF) ≤ 3 because
M_{0} := < { ab, b'c, c' } | aa = bb' = cc' = ε >
is equivalent to M:
a = (ab)(b'c)(c') = a(bb')(cc') = a b = (ab)(b'c)(c')(ab) = a(bb')(cc')ab = aab = b b' = (b'c)(c') = b'(cc') = b' c = (ab)(b'c)(c')(ab)(b'c) = a(bb')(cc')a(bb')c = aac = c c' = c'
so rank(RBF) ≤ rank(M) = rank(M_{0}) ≤ 3
Another example
This example establishes an upper bound for the rank of brainfuck without I/O. Let C := { +, -, <, >, [, ] }. Then
BF := <C|R>
for some R. Notice |C| = 6. We know R_{0} := { +- = -+ = <> = >< = ε } ⊆ R. Moreover,
BF_{0} := < { >+, -, <, [, ] } | R_{0} >
is equivalent to <C|R_{0}> because (>+)(-) = >(+-) = > and (<)(>+) = (<>)+ = +. Therefore rank(BF) ≤ 5.
Examples
Consider Nanofuck (NF) and Reversible Bitfuck (RBF). The following translations establish that they are STs of each other.
RBF | NF equivalent |
---|---|
+ | *{} |
> | *{}* |
< | {} |
( | *{}*{ |
) | } |
NF | RBF equivalent |
---|---|
* | +> |
{ | <( |
} | ) |
From RBF to NF and back gives us
RBF | NF | RBF' |
---|---|---|
+ | *{} | +><() |
> | *{}* | +><()+> |
< | {} | <() |
( | *{}*{ | +><()+><( |
) | } | ) |
clearly RBF and RBF' are equivalent (+ twice on the same cell is a NOP, () is a NOP, and >< is a NOP). This establishes that NF is a ST of RBF. In the same way we can prove that RBF is a ST of NF.
Minimization
Simple translations are a useful tool in formalizing the notion of an instruction minimization. Suppose a language A contains n commands and a language B contains m commands. If m < n and B is a ST of A, then B can be said to be a minimization of A.
Another way of thinking about a minimization that is equivalent is to consider the rank of the monoid. RBF, for example, is a quotient of the monoid with the presentation
M := <{a, b, b', c, c'} | aa = bb' = b'b = cc' = c'c = ε>
That implies that rank(RBF) <= rank(M). The question then becomes, what is the rank of M? It is three.
Input and output
Input and output consideration may be added by considering the input buffer and output buffer part of the machine state.