Cythan

From Esolang
Jump to navigation Jump to search

The Machine of Cythan is a language with no explicit instructions (being a ZISC), created by Cyprien Bourotte. The data is stored as an infinite line of positive integers, which represents both the program and the inputs. The idea behind it was to make an esolang that is easy to create an interpreter for, to understand, with very few rules, Turing complete and complex to code in. However, it turned out not to be Turing complete due to its reliance on lookup tables for arithmetic – this makes it impossible to create large numbers, not even for use as memory addresses.

It's development lasted around one full year, with 3 different interpreters.

How does it work ?

The Machine of Cythan consist of a positive band of positive intergers, like 1, 8, 4, 99, 0, 74

At every iteration of the machine, those two operations are executed in this order :

code[0] += 2
code[code[code[0]-1]] = code[code[code[0]-2]]

code[0] can be represented as the instruction pointer, and so the second operation as a mov.

We can recapitulate the instruction as :

  • Take the value pointed by the instruction pointer (cell at index 0), and the value behind
  • Execute a mov on the cells at index value 1 and value 2
  • If cell 0 didn't changed, add 2 to cell 0

This is an equivalent much easier to understand.

Examples

Code Number of iterations Result after the iterations Explanation
1 4 3 1111 9999 1 3, 4, 3, 9999, 9999 We took the content of cell 4 and put it in cell 3.
2 1000 5 1 0 9999 1 4, 9999, 5, 1, 0, 9999 We took the content of cell 5 and put it in cell 1.
1 9 9 1 3 9 9 <X> <X> is an empty instruction: we took the content of cell X and put it in cell X, doing nothing.
1 3 0 99 1 99, 3, 0, 99 We wrote to the cell 0 the content of cell 3 (99), basically, we jumped to cell 99.
1 3 0 1 1 1 3 0 1 Is an infinite loop. The v2 or v3 interpreter will automatically detect it and stop.
1 7 1 <BIT> 0 4 0 1111 9000 2 or 3 ? 1111 1 <BIT> 0 4 0 1111 9000 If BIT is a 0, jump to 9000 and if it's a 1, jump to 1111.

Cythan Compiler v1

This has nothing to do with the Cythan Machine has explained above.

There is 2 versions of the Cythan Machine, and 3 versions of the compiler. The first version of the Cythan Machine was very different, and was used during the creation of the language. It doesn't represent what the language was aimed for. It is interesting enough, but a lot more complex.

An OR gate program in the Cythan v1 can be represented as so:

1,1;-1,-2;-1,-4;13,9;13,10;16,1;14,-3;-2,10;-1,-2;-3,11;12,12;12,12;17,0;12,12;-1,-1;0,16;19,1;18,1;100,-1;999,-1

As you can see, in this vesion, data is grouped by pairs. The automatic incrementation of the instruction pointer can be changed during execution (and even set as negative), the data can be indexed to negative values, those are not grouped by pairs, all negative cells are initialized with -1, and some customely rules are made for operation to / from the -1 cell. This version is not intended for use anymore. It was used at the beginning for research purpose.

For more informations and to test it, you can read the older v1 wiki (CF. the Ressources section), and play with the Python Compiler / Interpreter.

The v1 compiler does accept some of the BCL syntax (CF. Cythan Compiler v3), but to know the exaustive list, go to to the Github Wiki.

Cythan Compiler v2

This is the exact machine described in the How does it work ? section.

The compiler was written in Rust, but isn't intended to be used anymore, since the version 3 is retro-compatible.

Cythan Compiler v3

This version of the Compiler was made by CCgauche.

The v3 compiler language is called the BCL: Basic Cythan Language. It's retro-compatible with the Cythan Compiler v2. The new v3 compiler added some abstractions:

Abstraction Use
~ Is replaced by the current cell index
+X Add to cell a number. Useful in combinaison with ~: to acces the 5th cell after the current one, you can use ~+5
<name> = (...) Create a const. Everytime <name> is in the program, it will be replaced by the content
<name> {...} Create a const function. Use <name>(<arg1>, <arg2>, ...) to use it
self.<X> Inside a function, replaced by first argument
self.<X>?<Y> Inside a function, replaced by Xnth arguments. If not existant, replace by Y
self.<X>..<Y>?<Z> Inside a function, replaced by the list of argument from X to Y. If one not existant, replaced by Z
'<ele>: Create a variable (called a pointor) named '<ele>, that contain the current cell index

An active problem of the compiler is that a pointer can only be use after decalartion. This will be corrected in the future.

An example

'start # We start with the instruction pointor at the 'start cell
0 # We use this element to test the or

'0:0 # Some constant pointors to 0 or 1
'1:1

no_op = (0 0) # An empty operation

stop {~+2 0 ~-2} # Do an infinite loop
jump { ~+2 0 self.0 } # Jump to cell self.0

or {
    # self.0 => jump if self.1 or self.2 is '1
    # self.1 => can be '0 or '1
    # self.2 => can be '0 or '1
    self.0 1 self.1 ~+1 0 0 self.2 ~+1 no_op
}

'start:or('good '0 '0) # start of the program
stop() # if result is 0
'good:stop() # if result is 1, will jump to good

This will compile to 4 0 0 1 17 1 2 8 0 0 2 12 0 0 16 0 14 19 0 17, once run, it will be 14 19 0 1 17 1 2 8 0 0 2 12 0 0 16 0 14 19 0 17 (the program jumped to index 14) If we change line 19 to 'start:or('gud '1 '0), 'start:or('gud '0 '1) or 'start:or('gud '1 '1) then it will jumped to index 19 (the second stop() instruction).

The OR gate is basically two if for both self.1 and self.2.

Computational class

Cython is a bounded-storage machine. Via the use of lookup tables, it is possible to implement arbitrary arithmetic as long as the values being operated on do not become larger than envisaged by the original program, and because the instruction pointer is memory-mapped, this allows for arbitrary control flow. However, it's impossible to create a value that's larger than any in the original program, except by reading the instruction pointer, and this is limited to constructing values that are no larger than the length of the program (because if you send the instruction pointer beyond the end of the program, there's no longer any way to retrieve it, and you can't add instructions there without being able to construct their address). This means that the data storage for any given program is limited to a finite number of addresses, each of which can only store a finite number of possible values.

Resources