Thue
- This is a featured language.
Thue is an esoteric programming language invented by John Colagioia in 2000. It is a matrioshka language based on nondeterministic string rewriting which the author describes as a constraint-programming version of a Turing tarpit.
Computational model
Thue is based on a semi-Thue grammar, which is a generalized form of Thue system in which each association between a pair of strings is only one-way.
Etymology
The language is presumably named after either Norwegian mathematician Axel Thue, or the grammar which bears his name (which the language employs).
It is worth noting that while the mathematician's name is pronounced approximately "TOO-eh", the associated documentation explicitly gives the pronounciation of the name of the language as "TOO-ay".
Language description
Each Thue program consists of two parts: a list of substitution rules of the form
<original string>::=<replacement string>
which is terminated with a line having both <original string> and <replacement string> either empty, or consisting entirely of whitespace:
::=
followed by a string representing the initial program state, which may extend over multiple lines.
Execution consists of picking, from the list of rules, an arbitrary rule whose original string exists as a substring somewhere in the program state, and replacing that substring by the rule's replacement string. This process repeats until there are no rules that can be applied, at which point, the program ends.
Edge cases
The case where a substitution rule has nothing but whitespace or an empty string on its left-hand side, but something other than whitespace or an empty string on its right-hand side, is not defined. Some implementations consider this to be a syntax error, while others (including the original interpreter) consider it to terminate the list of rules, just as a rule with whitespace on both sides of the ::= would.
It is also possible (and arguably reasonable) for an implementation to interpret it as a rule which rewrites the given whitespace (or zero-length string) to the right-hand side, when applied. However, no such implementations have been brought to the attention of this article.
Input and output
Input is performed using a special rule like this:
<original string>::=:::
When triggered, instead of replacing the original string with the string :::
it will be replaced with one line from the input stream.
Similarly, output is triggered by a line containing a rule like this:
<original string>::=~<string to output>
When triggered, the original string is deleted (i.e. replaced by the null string) and the string to the right of the ~
character is sent to the output stream.
Criticisms
Laurent Vogel complains that handling of newlines is not well defined in the specifications and proposes the following convention: when a string is sent to the output stream, no newline is printed at the end, except if the string is empty, in which case a newline is all that is printed.
Ørjan Johansen complains that input is a lousy hack that has no way of handling arbitrary strings (think Wikipedia:code injection here) and no well-defined behavior on end of file. This makes an otherwise powerful language useless for implementing I/O-supporting programs properly in.
Chris Pressey complains that the spec does not agree with the reference implementation on the matter of input. The spec defines :::
and ~
as "strings" or "symbols" which "trigger implicit rules", while the implementation treats them as special rule forms. This leaves open questions such as whether a Thue program with no rules, but with initial data containing :::
, is intended to replace that string with user input.
Sample programs
Here's the traditional "Hello World!" in Thue:
a::=~Hello World! ::= a
The following Thue program performs an increment of a binary number entered as the initial state surrounded by "_" characters, in this case the number 1111111111:
1_::=1++ 0_::=1 01++::=10 11++::=1++0 _0::=_ _1++::=10 ::= _1111111111_
The following sample program is to demonstrate Thue's nondeterminism (and to show an example of an infinite loop, besides). The program outputs bits in an undefined (and quite possibly random) sequence.
b::=~0 b::=~1 ac::=abc ::= abc
The following program is a typical truth machine:
*::=::: 1::=1_ _::=~1 0::=~0 ::= *
The following program shows the fallacy of attributing to Thue the same behavior as Markov algorithms. It is supposed to convert a group of * to Roman numerals and it does work correctly as a Markov algorithm. However, due to Thue's non-determinism, it does not work on Thue (Talk:Thue):
*::=I IIIII::=V IIII::=IV VV::=X VIV::=IX XXXXX::=L XXXX::=XL LL::=C LXL::=XC CCCCC::=D CCCC::=CD DD::=M DCD::=CM ::= ******************
This program is an InterpretMe interpreter:
*::=::: ::= *
The following program is a multiplier and multiplies 3x7. The result is on the state string.
(*::=a( *)::=)b (x)::= ab::=ba* *b::=b* a*::=*a {b::={ a}::=} *}::=}* {}::= ::= {(***x*******)}
This program converts the input unary numbers into decimal numbers and prints the result:
>**********::=*> >_::=<_0 >*_::=<_1 >**_::=<_2 >***_::=<_3 >****_::=<_4 >*****_::=<_5 >******_::=<_6 >*******_::=<_7 >********_::=<_8 >*********_::=<_9 *<::=<* _<*::=_>* _<_::=o l0::=~0 l1::=~1 l2::=~2 l3::=~3 l4::=~4 l5::=~5 l6::=~6 l7::=~7 l8::=~8 l9::=~9 o0::=ol0 o1::=ol1 o2::=ol2 o3::=ol3 o4::=ol4 o5::=ol5 o6::=ol6 o7::=ol7 o8::=ol8 o9::=ol9 i::=::: ::= _>i_
This program generates Kolakoski sequence and prints it:
[1>]1::=1[1>] [1>]2::=2[1>] [1>]*1::=[<1]1*[a1>] [1>]*2::=[<1]2*[a1>][a1>] (1)::=~1 (2)::=~2 1[<1]::=[<1]1 2[<1]::=[<1]2 >[<1]1::=1>[2>](1) >[<1]2::=2>[2>](2) [a1>]1::=1[a1>] [a1>]2::=2[a1>] [a1>]|::=1| [2>]1::=1[2>] [2>]2::=2[2>] [2>]*1::=[<2]1*[a2>] [2>]*2::=[<2]2*[a2>][a2>] 1[<2]::=[<2]1 2[<2]::=[<2]2 >[<2]1::=1>[1>](1) >[<2]2::=2>[1>](2) [a2>]1::=1[a2>] [a2>]2::=2[a2>] [a2>]|::=2| ::= |>[1>]12*2|
Computational class
By showing that there is a reduction to Thue from Type-0 languages in the Chomsky hierarchy (unrestricted grammars), which are computationally equivalent to Turing machines, we can show that Thue is Turing-complete.
Let some arbitrary set of non-zero-length potential substrings of the Thue state be called the nonterminals; the only restriction on these is that they don't show up in the final state of the program. All other substrings are terminals.
Both sides of every Thue substitution rule thus contain an arbitrary combination of terminals and nonterminals. Each Thue substitution rule is in this way equivalent to a production in an unrestricted grammar, and the Thue program is equivalent to the grammar itself.
See also
External resources
- Thue project at Cat's Eye Technologies
- Thue interpreter in JavaScript (from the Wayback Machine; retrieved on 4 May 2006) Working copy hosted in jsfiddle
- Laurent Vogel's Thue resources
- Keymaker's Thue programs
- Brainfuck interpreter in Thue (from the Wayback Machine; retrieved on 21 July 2011) (together with a less innovative Thue interpreter in Python)
- User:Jan jelo/BF interpreter in Thue
- Safalra's website Thue section (from the Wayback Machine; retrieved on 12 July 2006)
- Rue, a variant of Thue which uses regex
- "Thue in Java" Thue interpreter in Java by Mykola Makhin
- Thue interpreter in Common Lisp and a collection of Thue programs by Yoel Matveyev
- Public-domain Thue interpreter in Ruby
- Another Thue interpreter in JavaScript
- "Thue (programming language)" on en.Wikipedia
- Thue interpreter in javascript for node.js
- Thue graphical interpreter in Uxntal, video
- A Thue interpreter in Rust by Adam Soutar
- Thue interpreter in D
- Neothue, a dialect of Thue written in C++ with support for running Thue programs in compatibility mode