Simple translation

From Esolang
(Redirected from Simple Translation)
Jump to navigation Jump to search

A simple translation (ST) is a relationship between two programming languages. It is a particular type of equivalence between languages. It was developed to formalize the notion of a minimization.

Definition

Simple English definition

A language is a ST of another language if a pair of translation tables exist between those languages which preserve certain properties. For each row, a translation table contains a single symbol from the source language in the left column and a finite sequence of symbols from the destination language in the right column. Use the translation tables to convert from the source language, to the destination language and back. If semantics are preserved during this process, then the destination language is a ST of the source language.

Formal definition

Suppose we have language A with commands a1, a2, ... an and language B with commands b1, b2, ... bm (note that neither A nor B must possess an infinite number of commands). A translation table from A to B looks like

(1)
A B equivalent
a1 β1,1β1,2...β1,p1
a2 β2,1β2,2...β2,p2
... ...
an βn,1βn,2...βn,pn

where βi,j is one of bk. A translation table from B to A looks like

(2)
B A equivalent
b1 α1,1α1,2...α1,q1
b2 α2,1α2,2...α2,q2
... ...
bm αm,1αm,2...αm,qm

where αi,j is one of ak. Notice that both tables must exist.

Take a program consisting of a sequence symbols in A denoted by x0. Translate that program to B using (1) to produce y. Translate y back to A using (2) to produce x1. If, for an arbitrary initial state, x0 halts if and only if x1 halts and they produce equivalent final states, then B is a ST of A.

Notice that if B is a ST of A, then B has implicit semantics defined by (2) in terms of A. This effectively provides a definition for B which may or may not match some other semantic interpretation. Under the implicit semantic interpretation, B is clearly Turing complete (TC) if and only if A is TC.

The idea of implicit semantics can be formalized too. Suppose that A is equipped with a homomorphism to some semantic domain S. The category ST(A,S) over that homomorphism is the category whose objects are languages and arrows are translation tables equipped with paths to A and therefore to S; alternatively, ST(A,S) is the path category on a path-connected graph of translation tables which includes A as a vertex. Every object of ST(A,S) can be interpreted in S; as such, either every object is Turing-complete or none are. This also, by the underlying definition of category requiring everything in sight to commute, requires all of the different translations of B in ST(A,S) to be equivalent in S.

Thinking in terms of monoids

Concatenative programming languages can be thought of as monoids. (Additionally, though we will not use it here, reversible concatenative programming languages like Reversible Bitfuck can be thought of as groups.) If a language A uses the symbols a1, a2, ..., an, then one way to think of that language is as a quotient of the free monoid generated by A:

 A* := <A>

where A = {a1, a2,... an}. Another way of writing this is defining a set of equivalence relations RA* x A* ((x, y) ∈ R if x and y are equivalent programs) and thinking of A0 as A* modulo R. We write this:

 A0 := <A|R>

Two languages A and B can be said to be simple translations of each other if there exist equivalence relations R and S such that:

 <A|R> = <B|S>

That is, if the monoids with different presentations are in-fact the same monoid, which we might call C. The translation table is actually a pair of tables, one between A and C, and one between B and C, built from the generators of R and S respectively.

Rank of monoid

The rank of a monoid A is the cardinality of the smallest set of words which generates A. If we have R0R, then rank(<A|R>) ≤ rank(<A|R0>); by taking less of a quotient, it is possible to require more generators for programs. This allows us to define subsets of the proper R of a language to set upper-bounds on its rank. This is useful because, in general, R cannot be computed; the word problem is usually undecidable.

Examples

One useful approach we have found is to define R0 to contain the obvious nops in a language. For example

 M := < {a, b, b', c, c'} | aa = bb' = cc' = ε >

represents the structure of a monoid of which the RBF monoid is a quotient (think a = '+', b = '>', b' = '<', c = '(', c' = ')'). R in this case contains many other relations, but these relations are sufficient to prove that rank(RBF) ≤ 3 because

 M0 := < { ab, b'c, c' } | aa = bb' = cc' = ε >

is equivalent to M:

 a  = (ab)(b'c)(c') = a(bb')(cc') = a
 b  = (ab)(b'c)(c')(ab) = a(bb')(cc')ab = aab = b
 b' = (b'c)(c') = b'(cc') = b'
 c  = (ab)(b'c)(c')(ab)(b'c) = a(bb')(cc')a(bb')c = aac = c
 c' = c'

so rank(RBF) ≤ rank(M) = rank(M0) ≤ 3.

Another example establishes a lower bound for the rank of Brainfuck without I/O. Let C := { +, -, <, >, [, ] }. Then

 BF := <C|R>

for some R. We don't know exactly what R is, although it surely incorporates the standard algebraic semantics of Brainfuck. Notice |C| = 6. We know R0 := { +- = -+ = <> = >< = ε } ⊆ R. Moreover, let C0 := { >+, -, <, [, ] } be a five-element set of generators, and define:

 BF0 := < C0 | R0 >

On one hand, rank(BF0) ≤ |C0| = 5. On the other hand, BF0 is equivalent to <C|R0> because of the following equalities which give a simple translation from C0 to C:

(>+)(-) = >(+-) = >
(<)(>+) = (<>)+ = +

Therefore rank(BF) ≤ rank(BF0) ≤ 5.

Examples

Consider Nanofuck (NF) and Reversible Bitfuck (RBF). The following translations establish that they are STs of each other.

RBF NF equivalent
+ *{}
> *{}*
< {}
( *{}*{
) }
NF RBF equivalent
* +>
{ <(
} )

From RBF to NF and back gives us

RBF NF RBF'
+ *{} +><()
> *{}* +><()+>
< {} <()
( *{}*{ +><()+><(
) } )

clearly RBF and RBF' are equivalent (+ twice on the same cell is a NOP, () is a NOP, and >< is a NOP). This establishes that NF is a ST of RBF. In the same way we can prove that RBF is a ST of NF.

Minimization

Simple translations are a useful tool in formalizing the notion of an instruction minimization. Suppose a language A contains n commands and a language B contains m commands. If m < n and B is a ST of A, then B can be said to be a minimization of A.

Another way of thinking about a minimization that is equivalent is to consider the rank of the monoid. RBF, for example, is a quotient of the monoid with the presentation

M := <{a, b, b', c, c'} | aa = bb' = b'b = cc' = c'c = ε>

That implies that rank(RBF) <= rank(M). The question then becomes, what is the rank of M? It is three.

Input and output

Input and output consideration may be added by considering the input buffer and output buffer part of the machine state.

See also