Transceternal

From Esolang
Jump to: navigation, search
Transceternal
Paradigm(s) Graph-rewriting paradigm
Designed by Hakerh400
Appeared in 2019
Computational class Turing complete
Major implementations Interpreter
File extension(s) .txt


Transceternal is very similar to Realm, but while Realm stores some data outside the graph (source code, instruction pointer, nested level, etc), this programming language stores everything in the memory.

Memory

Memory representation is the same as in Realm. We suggest reading it first, because in this article we do not explain things that are explained there.

Syntax

All strings are valid Transceternal source code. The first thing that parser does is tokenizing the source code. There are three cases:

  • If the source code is empty, or consists only of whitespace characters, the list of tokens contains three identical tokens (irrelevant which).
  • If the source code does not contain any whitespace characters, the list of tokens contains all non-whitespace characters in the same order that appear in the source code.
  • If the source code contains at least one whitespace character, the list of tokens contains all continuous sequences of non-whitespace characters in the same order that appear in the source code.

For example, empty program is tokenized to ["0", "0", "0"] (0 can be replaced by any other sequence of non-whitespace characters). Program abc is tokenized to ["a", "b", "c"]. Program abc defgh xyz is tokenized to ["abc", "defg", "xyz"].

Initial memory construction

After converting the source code to a list of tokens, we create a new node for each token. Then, iterate over tokens and for each token get the corresponding node and assign the next nodes (represented by the next tokens) to the 0-pointer and 1-pointer of the current node. We do it in the depth-first manner.

Here is an example. Given the following source code:

012032004051025

Each character is a token (because there is no whitespace). We start with the first token (digit 0) and it represents the root node. We add it to the stack (underscores mean that pointers are missing):

Stack:
0: _ _

On the stack we have a single node and it corresponds to the token 0. It is missing two pointers. Now, for each token, we add it to the first missing pointer of the node at the top of the stack, then we remove that node if it has two pointers, and if we have not seen the new token before, add a new entry to the top of the stack. In this example, The next token is 1, we add it to the first missing pointer of the root node, we do not remove the root node (bacause it has still 1 missing pointer) and we add 1 to the stack, because it is a new token:

Stack:
0: 1 _
1: _ _

We repeat the process until we consume all tokens. At the end, the stack will be empty and we will have the following graph (node tokens on the left and pointers on the right):

0: 1 4
1: 2 0
2: 0 3
3: 2 0
4: 0 5
5: 1 0

This is our initial memory. We still have two tokens left: 2 and 5, but since we have already formed the graph, we ignore them. On the other hand, if we had missing some tokens to fill all pointers, for each missing pointer we assign the node itself (so that each missing pointer points to the node that contains the pointer).

Initializing input string

After constructing a graph from the list of tokens, the graph is not yet ready for computation. We need to inject the input string first, since all information must be stored in the graph (the graph uniquely represents execution state). To do that, we convert the given input string, represented as an array of bytes, to an array of bits (8 bits for each byte, bits are from lowest to highest). For example, string abc is converted to the following array of bits:

100001100100011011000110

Then we convert the array of bits to a graph in the following way. For each bit we create a new node. 1-pointer of the node points to the next node (corresponding to the next bit in the array) and the 1-pointer of the last node points to B0 node (we will explain later what that node means). 0-pointer of each node points to B0 node or B1 node if the corresponding bit is 0 or 1, respectively (we will also explain what B1 node is).

The reverse of this algorithm (for converting a graph to an array of bits) we use later in program execution. After constructing a graph that corresponds to the input string, we create a new node and assign the old root node (from the step Initial memory construction) to the 0-pointer and we assign the input string graph to the 1-pointer. Then we make this new node the root node.

Node B0 is the node at address 000 (see addressing in Realm) and node B1 is the node at address 001 after step Initializing input string. Or, in other words, nodes at addresses 00 and 01, respectively, before step Initializing input string.

The graph is now ready and the computation can begin.

Parsing address from the given node

We do not apply this algorithm immediately, but we explain how it works, because it is used later in the computation. Given a node in the graph, we parse an array of bits from it in the following way.

Let the array be empty. Repeat the following until the current node becomes the node at address 000, or if we reached the same node two times while parsing. In the loop, if the 0-pointer of the current node points to the node at address 000, then add bit 0 to the sequence, otherwise add bit 1 to the sequence. Update the current node to point to the 1-pointer of the previous 0-node. After the loop ends, return the array of bits.

Note that this algorithm is somewhat a reverse of the algorithm for converting an array of bits to a graph. If we apply this algorithm to the graph that we obtained from the input array of bits, we will get the input array of bits back (this does not work always - for example if nodes at addresses 000 and 001 are equal, we will get all zeros in the sequence).

Computation

This is the main loop. If the node at address 01 is equal to the node at address 000, then terminate the program. Otherwise, do the following cases (we check cases in order, if a case matches, we skip other cases in that iteration).

Case 1

If the node at address 0100 is equal to the node at address 000, then let A1 be the parsed address at node 01010 and A2 be the parsed address at node 01011, then assign the node at address A2 to the address A1 and after that assign the node at address 011 to the address 01.

Case 2

If the node at address 0100 is equal to the node at address 001, then let A1 be the parsed address at node 01010, A2 be the parsed address at node 010110 and A3 be the parsed address at node 010111, then create a new node that points to nodes from addresses A2 and A3 (0-pointer and 1-pointer, respectively) and assign it to the address A1. Like in the previous case, after that assign the node at address 011 to the address 01.

Case 3

If no other cases can be applied, let A1 be the parsed address at node 010100 and A2 be the parsed address at node 010101. If the node at address A1 is equal to the node at address A2, then assign the node at address 01011 to the address 01, otherwise assign the node at address 011 to the address 01.

Obtaining the output

After the main loop ends, we obtain the output array of bits by parsing the address from the node at address 1, then we convert it to array of bytes (append zeros if necessary to form a byte) and it represents the output string.

Similarities to Realm

Case 1 corresponds to the assignment in Realm, Case 2 corresponds to allocating a new node and Case 3 corresponds to loops. Any Realm program can be converted to a Transceternal program.

Examples

Empty program

If the source code is empty, or consists only of whitespace characters, the list of tokens will contain three identical tokens (we say tokens are a for simplicity). Then we construct our initial graph. It is a graph that contains a single node whose both pointers point to itself:

Initial graph:
a: a a

Then we inject the input and switch roots:

Graph ready for computation:
root: a input
a: a a
input: ...

where input is the root node of the graph obtained by converting the input string of bits to a graph. Now we start the main loop. It says that if the node at address 01 is eqaul to the node at address 000, then break the main loop. The node at address 01 is a and the node at address 000 is also a, so the main loop has zero iterations. Now we need to extract the output. We parse the sequence of bits from the node at address 1 (which is node input). Since nodes at addresses 000 and 001 are identical (it is node a), all bits in the output are zeros, but the number of bits do not change.

Therefore, the empty program converts all ones from the input string of bits to zeros. In other words, it outputs the NULL character (character with char code 00) repeated the number of times equal to the length of the input string.

Cat

catacat

First, we construct the graph:

Graph:
root: c input
c: a t
a: t a
t: a c
input: ...

The main loop starts. The node at address 01 is t and the node at address 000 is also t, so the main loop ends without doing any iterations. Then we parse the output string of bits. The node at address 000 is t and the node at address 001 is a, so the output string is identical to the input string.

Output a digit

This program ignores the input and outputs ASCII digit 3

0122233445262778889A2B9C2A2

The main loop has a single iteration and it is the Case 1.

Hello, World!

Ignores the input and outputs the Hello, World! string:

00 01 02 02 02 03 02 04 02 05 02 06 00 07 02 08 02 09 00 0A 02 0B 00 0C 02 0D 00 0E 02 0F 02 10
00 11 00 12 02 13 02 14 02 15 00 16 00 17 02 18 00 19 00 1A 02 1B 02 1C 02 1D 00 1E 00 1F 02 20
00 21 00 22 02 23 00 24 00 25 00 26 00 27 02 28 00 29 00 2A 02 2B 02 2C 02 2D 00 2E 00 2F 02 30
00 31 02 32 02 33 02 34 02 35 02 36 02 37 02 38 00 39 02 3A 02 3B 00 3C 00 3D 00 3E 02 3F 00 40
02 41 00 42 02 43 00 44 00 45 00 46 00 47 02 48 00 49 00 4A 02 4B 02 4C 00 4D 02 4E 02 4F 00 50
00 51 00 52 02 53 02 54 02 55 00 56 00 57 02 58 00 59 00 5A 02 5B 02 5C 02 5D 00 5E 02 5F 02 60
00 61 00 62 02 63 00 64 02 65 02 66 02 67 02 68 00 69 02 6A 02 02 6B 6C 02 6D 6B 6E 02 6C 02

Note: this is not the hexadecimal representation of the source code, but the literal text. We could write it in a lot of different ways.

Interpreters

Interpreter