TaiDoKu

TaiDoKu is a Esoteric programming language project since 2018. It looks extremely normal at first glance, but is actually super tedious to interpret, totally insane to compile, and almost impossible (but cool, of course) to program.

Naming
The name of this language consists of two parts: Tai and DoKu. Tai stands for Texualized Abstract Ideas. DoKu means the same as in SuDoKu.

Supported Operations
Just as any RISC architecture machine would support, TaiDoKu offers the following operations. Note that TaiDoKu is 4-bit addressable; all computation and data transfer units described below are of 4 bits.

I/O

 * in [pos] read 4-bit data from user's input into specified memory position.
 * out [pos] write 4-bit data to output from specified memory position.

Memory

 * sv [pos], src save the 4-bit value src into specified memory position.
 * ld [pos], [dst] load the 4-bit value to the place of dst from specified memory position.

Control

 * jmp [pos] jump to the specified memory position.
 * jeq [pos], s1, s2 jump to the specified memory position if s1 is equal to s2.
 * jl [pos], s1, s2 jump to the specified memory position if s1 is less than s2. Both are 2's complement, 4-bit signed integers.
 * jlu [pos], s1, s2 jump to the specified memory position if s1 is less than s2. Both are 2's complement, 4-bit unsigned integers.

Arithmetic/Logic

 * add [pos], s1, s2 add s1, s2 and save the result to the specified memory position.
 * and [pos], s1, s2 compute s1 and s2, and save the result to the specified memory position.
 * or [pos], s1, s2 compute s1 or s2 and save the result to the specified memory position.
 * xor [pos], s1, s2 compute s1 xor s2 and save the result to the specified memory position.
 * shl [pos], smt shift the content in the specified memory position to left smt times. While smt is a 4-bit value, only the last two bit is used here.
 * shrl [pos], smt shift the content in the specified memory position to right smt times, logically (without sign extension). While smt is a 4-bit value, only the last two bit is used here.
 * shra [pos], smt shift the content in the specified memory position to right smt times, arithmetically (with sign extension). While smt is a 4-bit value, only the last two bit is used here.

Meta Operations
The meta operations will be introduced later.

Computation Model
TaiDoKu neither distinguishes between text and data memory, nor runtime and instruction memory. With an universal view of all memory, A TaiDoKu program can easily self-modifying itself. (More precisely, no program can run without self-modification. See the reason below.)

Terms

 * Program A TaiDoKu program is made up of a pile of tiles, and then executes operations according to the content of the tile(s); at runtime, every operation follows the same execution cycle, which will be described in detail in later section.
 * Tile The basic unit of memory used in any TaiDoku program. A tile is a two-dimensional space consisting of 16*16 cells.
 * Cell A cell contains a hexadecimal number no larger than 16, so has 4-bit size. The total space of a tile is thus 128 bytes.
 * Steady state In the steady state, a tile should form a 4*4 classic SuDoKu solution, that is, each of the 16 columns, 16 rows, and 16 non-overlapping 4*4 regions, individually contains distinct 16 hexadecimal numbers. Every execution cycle begins in this state. Steady.gif
 * IP Instruction Pointer. At the very beginning of a TaiDoKu program, IP is at the (0,0) position of the first tile.  In each execution cycle, the IP first moves in the OP_Lookup stage, which will be stated later, and then moves in the actual execution stage if the instruction is jump or branch.  Note that in this article, operation or instruction is used interchangeably.
 * X and Y axis Tiles are two-dimensional objects, usually presented in a textual terminal. To obey the right-hand rule, it is defined that columns are parallel to X-axis, and rows to Y-axis.  Thus, when a algorithm traverses this two-dimensional space in row-major way, it does something like normal terminal operations: horizontal line first, carriage return, and then next line.  For instance, the C-4-5-6 sequence in the steady state figure shows the direction of X-axis, and C-8-3-7 shows that of Y.  Note that the upper-left position, where C is, is the origin (0,0).

Execution Cycle
Every execution cycle starts when the current tile is in its steady state. In order to perform an operation properly, preparation and refinement is required. Together they form a sequence as the pseudo code below

while program not halted begin /* at steady state here */

/* pos: current position of IP    * op: operation code *    * This method traverses the current tile, so      * the position of IP may likely change when return. * The return value, op, is the code of     * the operation of this execution cycle. */   op = OP_Lookup(&pos);

/* Execute the specified operation, and * move IP to the proper position. */   Execute(op, &pos);

/* Previous two functions change the * content of the current tile, * so refine the tile here. */   Refine;

/* steady state again */ end

OP Lookup
OP Lookup is a process to determine what operation to be executed in this execution cycle. In brief, the IP travels cells in the tile until some value is met twice. As the preface of refinement stage, each cell involved in this OP lookup procedure is marked as fixed, and becomes a constraint in the later refinement.

The pseudo code is as follows, v = the value of current cell

/* Once a v is seen twice, break.*/ while no duplicated v is visited begin /* Based on this hint, the IP (pos) is then moved to corresponding * position later. '+' is current IP. *     *   0     6     D     *      3  7  A     *   1  4  +  B  E     *      5  8  C     *   2     9     F     */ hint = (v + pos.x + pos.y) % 16

/* modify the content in current cell and mark it*/ do { cell = (cell+1)%16 } while the cell conflicts with other marked cells (in the same column, row, or region) mark(cell)

/* move IP according to the hint */ move(hint, &pos)

v = the value of current cell end return v

Execution
Once the OP Lookup procedure is over, the Execute function treats the return value, which is number between 0 and 15, as a opcode. For opcode smaller than 14, a basic operation is executed according to the order of the list. For instance, the in operation has opcode 0, and the shl operation, which shifts a number to left, has opcode 0xC. The operand required by the operation is loaded in row-major order. For example, assuming the following segment, 5 indicates a jeq operation, and this performs the jump to (A, 6) if F is equal to C, so the jump would not happen.

The remaining one opcode, 15 or 0xF, serves as a common opcode of the meta operations. The actual behavior of the operation depends on the second number in this case. Those functions are defined in the following subsection.

Meta Operations
The opcode of meta operations is 0xF, and the second hexadigit is defined as follows:
 * 1) meta.move direction,x,y recall: a TaiDoKu program is a pile of many tiles.  This operation moves the IP to the upper tile or lower tile.  Note that if the current tile is at the boundary, then treat this call as a circular move.  For example, after a meta.move up at the top tile, the IP moves to the bottom one; if there is only one tile, this operation has no effect.  The last significant bit serves as the direction indicator: 0) up, and 1) down. (x, y) is the new position of IP in the destination tile.
 * 2) meta.new.rand direction new a randomized tile above/below the current one.
 * 3) meta.new.duplicate direction new a duplicated tile above/below the current one.
 * 4) meta.delete direction delete a tile above/below the current one.  When there is only one tile, this operation has no effect.
 * 5) meta.fork direction, x, y fork a new thread, which is presented as a new IP at (x,y) in the tile above/below.
 * 6) meta.sync token,num num specify how many threads are to be synced on the token token.
 * 7) meta.shuffle i, j exchange all i's and j's in this tile.
 * 8) meta.copy [is_source(1),type(2),dir(1)],index copy the content of a part from (if is_source is 0) or to the tile above (if dir is 0) or below.  Valid types are columns (if type is 01), rows(10), and regions(11); index specifies the index of the objects.  Let the steady state figure be a reference. The 2-D-4-6-3-5-7-0-... region has index 4, and 0-6-3-2-5-... column has index 0xD.  If a invalid type 00 is given, it should be treated as invalid operation and cause a panic.
 * 9) reserved.  A reserved operation should cause a panic.
 * 10) reserved.
 * 11) reserved.
 * 12) reserved.
 * 13) reserved.
 * 14) reserved.
 * 15) meta.halt halts the current thread.  Once all threads halt, the program ends.

Refinement
In previous two sections, it has been shown that the content of the current tile definitely changes in each execution cycle. During the OP lookup procedure, the number of rewritten cells is less or equal to 17 due to the pigeonhole principle; while in the execution stage, some operation intentionally overwrites cells with the result of computation. In both case, the current tile needs refining to the steady state, and that is what this stage is all about.

To refine the current tile, an algorithm is needed. In brief, this function first flips the tile according to a calculated hint value, and then runs a backtracking algorithm on the tile from the original point in row-major way. That is, a order of (0,0), (0,1), (0,15), (1,0), and so on. The ordinary sequence used in the trial-and-error process in a SuDoKu backtracking algorithm is simply 0~9, but here, it depends on the least significant bit of the hint value, to follow a normal or gray code sequence.

The pseudo code is as follows:

/* The IP should remain at the cell right after the OP Lookup procedure * if the operation does not involve a jump. */ v = the value of current cell /* This serves as a hint to refine current tile */ hint = (v + pos.x + pos.y) % 16

/* Tile Flipping * * If hint has the 4th least significant bit (0x8), do a horizontal flip (about x axis) * If hint has the 3rd least significant bit (0x4), do a vertical flip (about y axis) * It hint has the 2nd least significant bit (0x2), do a transpose (about line y = x) * * Interestingly, the 8 possible outcomes for a set of all symmetries of the tile. */ if hint & 0x8 all tile[x][y] = tile[x][15-y] if hint & 0x4 all tile[x][y] = tile[15-x][y] if hint & 0x2 all tile[x][y] = tile[y][x]

/* determine the number sequence used in backtracking */ if hint & 0x1 seq[16] = 0~16 else seq[16] = 0,1,3,2,6,7,5,4,12,13,15,14,10,11,9,8

backtrack(0,0)

procedure backtrack input: x, y begin

/* traverse the seq[] until a viable value is found */ while tile[x][y] is the same as any of the follows tile[x][*], tile[*][y], and tile[*][*] in the same 4*4 region begin /* once current value is viable in the position, call to next cell */ if tile[x][y] is OK and not conflicts with any marked cells /* the ending condition is needed in practice */ recursive call backtrack(next_pos(x, y))

if the return value is FAILED tile[x][y] = trynext(seq, tile[x][y]) end

/* all numbers tried, but none of them can be used */ return FAILED

end

Format Specification
A TaiDoKu program should follow the following format  <0/1 = the following tile has IP or not> [x and y coordinates if 1] 16*16 hexadigits <2nd tile has IP or not> [x and y coordinates if 1] 16*16 hexadigits ...  [x and y coordinates if 1] 16*16 hexadigits

An example is like 2 1 0 0 9 D 7 1 E A F B 8 0 3 2 6 5 4 C 4 C 6 E 2 3 7 9 1 5 D F A B 8 0 A F 8 0 C 1 6 5 7 4 E B 2 3 D 9 2 3 B 5 D 4 8 0 6 C A 9 7 1 F E 7 2 E 3 4 D C F 5 6 0 A B 8 9 1 B 8 4 F 1 E A 3 C 9 2 7 5 D 0 6 0 9 A C B 8 5 6 3 E 1 D 4 F 7 2 1 6 5 D 7 9 0 2 4 B F 8 3 C E A 6 0 9 A 3 2 1 7 D 8 5 4 C E B F F 7 3 8 A C 4 D E 2 B 1 9 0 6 5 5 B 1 4 0 6 9 E F A 7 C D 2 3 8 C E D 2 F 5 B 8 9 3 6 0 1 4 A 7 3 5 C 6 9 7 D 1 B F 8 E 0 A 2 4 D 4 0 7 8 F 3 A 2 1 C 6 E 9 5 B E A F 9 6 B 2 C 0 D 4 5 8 7 1 3 8 1 2 B 5 0 E 4 A 7 9 3 F 6 C D 0 9 D 7 1 E A F B 8 0 3 2 6 5 4 C 4 C 6 E 2 3 7 9 1 5 D F A B 8 0 A F 8 0 C 1 6 5 7 4 E B 2 3 D 9 2 3 B 5 D 4 8 0 6 C A 9 7 1 F E 7 2 E 3 4 D C F 5 6 0 A B 8 9 1 B 8 4 F 1 E A 3 C 9 2 7 5 D 0 6 0 9 A C B 8 5 6 3 E 1 D 4 F 7 2 1 6 5 D 7 9 0 2 4 B F 8 3 C E A 6 0 9 A 3 2 1 7 D 8 5 4 C E B F F 7 3 8 A C 4 D E 2 B 1 9 0 6 5 5 B 1 4 0 6 9 E F A 7 C D 2 3 8 C E D 2 F 5 B 8 9 3 6 0 1 4 A 7 3 5 C 6 9 7 D 1 B F 8 E 0 A 2 4 D 4 0 7 8 F 3 A 2 1 C 6 E 9 5 B E A F 9 6 B 2 C 0 D 4 5 8 7 1 3 8 1 2 B 5 0 E 4 A 7 9 3 F 6 C D which has two tiles, and the only IP is initially at the first tile. A program should have at least one IP.

Dialect
As a new language, TaiDoKu is not popular enough to have many dialects. However, it is designed with the flexibility to develop dialects easily. Here are some possible directions to start with:
 * Change the rule of OP_Lookup
 * Add side effects to Execution
 * Change the rule of Refine
 * Enlarge the tile size, expand to higher dimensions
 * Any combinations of above