Tarpit

From Esolang
Jump to navigation Jump to search

Tarpit is a description language for state diagrams of finite state machines and their generalizations.

It is generally well established that the best description of finite state machines is a state diagram. But computers do not presently understand them. Tarpit is an attempt to solve this problem.

While Tarpit may be seen as related to the Funge class of languages, or wire-based cellular automata, this resemblance is really superficial, as such languages embed their control flow directly on the characters, rather than on the general structure described by them.

This initial specification supports only deterministic finite automata (DFAs); a revised spec at a later date may support a wider variety of automata. Despite the fact that it would require no additional effort, the use of this initial specification to encode an NFA is FORBIDDEN.

Generic Syntax

Lines

A line is the basic concept of Tarpit. A line may connect with itself, in which case it is a loop, or it may have two endpoints, in which case it must be directed.

A line is formed by - and | characters, along with / and \ characters to turn corners. Below is an example of a loop:


 /----\
 |    |
 \----/


Two different lines may share the same corners. Alternatively, the + character may be used to allow to lines to cross. In the below examples, the two as are on the same line, as are the two bs:

     b        b
     |        |
  a--\--b  a--+--a
     |        |
     a        b

For a directed line, one endpoint must be immediately next to a perpendicular line, and marked with an arrowhead (<, >, ^, or ', the last one being a downwards arrowhead so that V remains available as a letter) pointing at that perpendicular line. The arrowhead defines the direction of the line, as in the following example of a rightwards line:


 /----\ /--\ /----\
 |    |-/  \>|    |
 \----/      \----/

A directed line's head is the end with the arrowhead, and its tail is the other end. A line points to the line at the end of the arrowhead and, if there is a perpendicular line adjacent to its tail, then from that line.

States & Transitions

In a valid Tarpit program, every loop is associated with a single state. A loop cannot contain any other lines except possibly a single loop (which cannot contain any more lines). If it does contain another loop, then the two loops together are associated with the same state, and this state is accepting. Other than this, each loop is associated with a distinct state.

Every directed line must begin and end at a loop, and this line denotes a transition between the two states.

Labels

Every state and transition may be labeled. Any nonblank character, other than the symbol used to define lines, may be used in a label. In an accepting state, there must be no nonblank characters in between the two loops defining the state. The label of a state is all the characters inside the inner loop. Where a canonical description of the label is required, the characters should be arranged in reading order with the blanks on the ends trimmed and multiple blanks collapsed to one, so that the label of the following state is hi there:


      /--\
 /--\ | h|
 |i | | t|
 |he\-/r//
 |e     |
 \------/

The label of a state has no semantic purpose and is used only to make it easier to refer to states, as one can clearly see from the above example. It is RECOMMENDED that all labels match the regular expression q(0|[1-9][0-9]*).

Every edge's label is computed by beginning at its tail, and looking at the characters orthogonally adjacent to the - and | segments which are not part of the line. Blanks are trimmed and collapsed as with state labels, and additionally, any control characters are ignored and skipped. Other characters must appear exclusively on a single side of the line; if they appear on both sides, it is an error. So for instance, the following edge is labeled A:

   A
 ----->

DFAs

Syntactic Requirements

In a DFA, the following restrictions must be observed:

  1. There must be a unique unlabeled transition that has no from state. Its to state is called the initial state, and SHOULD be labeled q0.
  2. Every other transition must have a label of a single character.
  3. There may not be two transitions on the same character from the same state.

The set of characters which appear in transition labels are called the alphabet of the program.

Semantics

To execute a Tarpit DFA, the program begins with the initial state being current. The interpreter reads characters from the input one at a time. If any input is not in the alphabet of the program, it prints INVALID and quits. Otherwise, it looks at the transitions from the current state and the next input character. If any are labeled with that character, the input is advanced and the new current state is the state pointed to by that transition. Once the input is fully consumed, if the current state is an accepting state, the interpreter prints ACCEPTED and quits.

If there is no transition from the current state labeled with the next input character at any point, or if the end of input is reached and the current state is not accepting, the interpreter prints REJECTED and quits.

Examples

The following example accepts exactly the binary representations of nonnegative integers divisible by 3:

                           /\
    /----\                1|'
  0 |/--\|</-\  /--\      /--\
 /->||q0||-/1\->| q|<-----|  |
 \--|\--/|<-----|1 |-\0/->|q2|
    \----/      \--/ \-/  \--/