Tina


Tina (This is not assembly) is an esoteric programming language created by User:Oscarlo, with an assembly-like syntax and a deliberately overpowered “ALU matrix” instruction format. A Tina program is assembled into a linear list of instructions plus an initial memory image, then executed on a simple machine with unbounded signed integer cells and non-negative addresses.

A reference interpreter/assembler is implemented in Python.

Etymology

“Tina” is a backronym for “This is not assembly”. The language is intentionally close to assembly in feel (labels, cells, jumps, stack operations), while also featuring higher-level conveniences like conditional-branching ALU instructions and built-in memory/string operations.

Computational model

Tina executes a sequence of instructions over:

  • An unbounded memory of signed integers (cells default to 0).
  • A program counter (PC) indexing the instruction list.
  • Optional stack/frame support using two conventional cells named SP and FP when stack instructions are used.

Memory addresses must be non-negative; attempting to access a negative address is an error.
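
The memory model above can be sketched in a few lines of Python. This is an illustrative model, not code from the reference interpreter: the class name Memory and the choice of IndexError for negative addresses are assumptions made for this example.

```python
class Memory:
    """Sparse model of Tina memory: unbounded signed cells, default 0."""

    def __init__(self):
        self.cells = {}  # address -> value; missing addresses read as 0

    def _check(self, addr):
        # The article only says a negative address is an error;
        # IndexError is an assumption of this sketch.
        if addr < 0:
            raise IndexError(f"negative address: {addr}")

    def read(self, addr):
        self._check(addr)
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self._check(addr)
        self.cells[addr] = value


mem = Memory()
mem.write(400000, 10**30)   # far-away address, arbitrarily large value
print(mem.read(400000))     # the value just stored
print(mem.read(7))          # untouched cell reads as 0
```

A dictionary keeps the model sparse, so programs that use widely separated addresses (as the Brainfuck example does) cost memory only for the cells they touch.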

Syntax overview

  • One instruction or directive per line.
  • Labels are written as name: (multiple labels may precede a statement).
  • Comments begin with ; and continue to end of line.
  • Mnemonics are case-insensitive (the reference interpreter uppercases internally).

Example line:

loop:  ADD32SNEZ #1, counter, loop   ; increment and loop while counter != 0 (example only)

Assembler directives

These directives allocate and/or initialize memory in the initial image.

.cell name = value
Allocate one cell, optionally initialized. If the initializer is omitted, the cell is initialized to 0. Character literals like 'A' are allowed (one character only).
.block name, n
Allocate n zero-initialized cells.
.data name v1, v2, ...
Allocate a list of cells initialized to the given integers (supports decimal/hex via Python-style 0x).
.zstr name "text"
Allocate a null-terminated byte string. Each character must be in 0..255, and a trailing 0 cell is appended.

Symbols defined by these directives can be used as memory operands and (in immediates) as numeric addresses.
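
As a rough illustration of what .zstr places in the memory image, the following Python sketch turns a string into a list of cells. The helper name zstr_cells and the use of ValueError are hypothetical; the article only states that each character must be in 0..255 and that a trailing 0 is appended.

```python
def zstr_cells(text):
    """Cells for `.zstr name "text"`: one cell per character plus a trailing 0."""
    cells = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Assumed error type; the article only gives the 0..255 range.
            raise ValueError(f"character {ch!r} is outside 0..255")
        cells.append(code)
    cells.append(0)  # null terminator
    return cells


print(zstr_cells("Hi"))  # [72, 105, 0]
```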

Operands and addressing modes

Tina instructions take operands which can be immediate values or memory references.

Immediate

#n
Immediate integer n. In immediates, n may be:
  • a numeric literal (#10, #0xFF, #-3)
  • a memory symbol (#MSG yields MSG’s address)
  • a code label (#loop yields the instruction index of loop:)
  • optionally with a small decimal offset: #LABEL+3, #var-1

Immediates are not addressable (you cannot write to #...).

Memory (direct / indexed)

x
Direct cell at address x (where x is a symbol or numeric address).
x+K / x-K
Direct cell at address x+K (offset is decimal in the reference interpreter).

Indirect (pointer)

@x
Indirect cell at address mem[x].
@x+K
Indirect indexed cell at address mem[x]+K.

In pseudocode, reading @p+2 means mem[mem[p] + 2].
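
The direct and indirect modes can be modelled with one small Python function. This is a sketch under assumptions: memory is a plain dict, and the names effective_address and read_operand are invented for the example.

```python
def effective_address(mem, base, offset=0, indirect=False):
    """Return the cell address an operand refers to.

    direct:   x+K  -> base + offset
    indirect: @x+K -> mem[base] + offset
    """
    if base < 0:
        raise IndexError(f"negative address: {base}")
    addr = (mem.get(base, 0) if indirect else base) + offset
    if addr < 0:
        raise IndexError(f"negative address: {addr}")
    return addr


def read_operand(mem, base, offset=0, indirect=False):
    """Read the cell selected by the addressing mode (unset cells read as 0)."""
    return mem.get(effective_address(mem, base, offset, indirect), 0)


mem = {5: 100, 102: 42}   # cell p (address 5) holds the pointer 100
print(read_operand(mem, 5, 2, indirect=True))  # @p+2 -> mem[mem[5] + 2] -> 42
print(read_operand(mem, 5))                    # p    -> mem[5]          -> 100
```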

ALU instruction matrix

Most arithmetic/logic operations are expressed with a single general pattern:

<OP><WIDTH><OVF><COND> src, dst [, label]

Execution model:

  1. Read src and dst.
  2. Compute a new value (new_dst) using OP.
  3. Write new_dst back to dst.
  4. If a condition suffix (COND) is present, branch to label if the condition holds for new_dst; otherwise fall through.
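
The read, compute, write, branch sequence might look like this in Python. It is a minimal sketch covering only four base ops with unbounded integers; the names OPS and alu_step are hypothetical. The operand order (dst = dst - src for SUB) follows the comments in the example programs on this page.

```python
# Each op maps (src, dst) to the new value of dst.
OPS = {
    "MOV": lambda src, dst: src,
    "ADD": lambda src, dst: dst + src,
    "SUB": lambda src, dst: dst - src,
    "MUL": lambda src, dst: dst * src,
}


def alu_step(mem, op, src_value, dst_addr, cond=None):
    """Run one ALU instruction; return True if the branch is taken."""
    new_dst = OPS[op](src_value, mem.get(dst_addr, 0))  # steps 1 and 2
    mem[dst_addr] = new_dst                             # step 3
    return cond(new_dst) if cond is not None else False  # step 4


# Model of `SUBEQZ C48, tmp, done` with tmp at address 3 holding '0' (48).
mem = {3: ord("0")}
taken = alu_step(mem, "SUB", 48, 3, cond=lambda v: v == 0)
print(mem[3], taken)  # 0 True
```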

Base operations (OP)

The reference interpreter implements these ALU base ops:

  • Data movement/arithmetic: MOV ADD SUB MUL DIV MOD
  • Convenience arithmetic: INC DEC NEG ABS MIN MAX
  • Bitwise: AND OR XOR XNOR NOR NAND NOT
  • Shifts/rotates: SHL SHR SAR ROL ROR
  • Bit tricks: POPCNT CLZ CTZ
  • Comparisons (write results into dst): CMPEQ CMPLT CMPLE CMPGT CMP3
  • Swap: SWP (special: swaps two addressable operands; immediates are not allowed)

Width (WIDTH)

Optional: 8, 16, 32, 64.

If provided, results are interpreted as signed integers of that width.

If omitted, Tina uses unbounded integers. The exception is that some bit operations (such as rotates and bit counts) operate on a 64-bit view of the value in the reference interpreter.

Overflow/checked behavior (OVF)

Only meaningful when a width is present:

  • (none): wrap to the given width (two’s complement wrap)
  • S: saturating (clamp to min/max representable signed value)
  • C: checked (raises an error on overflow)
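
The three overflow behaviours can be compared side by side with a short Python function. This is an illustrative sketch; the function name fit and the use of OverflowError for checked mode are assumptions, since the article only says that checked mode raises an error.

```python
def fit(value, width, mode=""):
    """Fit an unbounded result into a signed integer of `width` bits.

    mode "":  two's complement wrap
    mode "S": saturate at the min/max representable value
    mode "C": raise on overflow
    """
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    if lo <= value <= hi:
        return value
    if mode == "S":
        return hi if value > hi else lo
    if mode == "C":
        raise OverflowError(f"{value} does not fit in {width} signed bits")
    wrapped = value & ((1 << width) - 1)      # keep the low `width` bits
    return wrapped - (1 << width) if wrapped > hi else wrapped


print(fit(127 + 1, 8))        # -128 (wrap, as with ADD8)
print(fit(127 + 1, 8, "S"))   # 127  (saturate, as with ADD8S)
```

Under this reading, the 8-bit tape cells in the Brainfuck example hold values from -128 to 127, and OUTB recovers the usual byte by masking with 0xFF.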

Conditional suffix (COND)

If present, the instruction takes an additional label argument.

Simple conditions (applied to new_dst):

  • LEQ (≤ 0), EQZ (== 0), NEZ (!= 0)
  • LTZ, GEZ, GTZ
  • ODD, EVN
  • POS (≥ 0), NEG (< 0)

Bit-test conditions:

  • BSET0..BSET63 (branch if bit k is 1)
  • BCLR0..BCLR63 (branch if bit k is 0)
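
The condition suffixes amount to a table of predicates on new_dst. The Python sketch below is one possible reading: LTZ, GEZ and GTZ are interpreted from their names, parity of negative numbers uses Python's modulo, and bit tests on negative values use Python's two's complement view. All of these are assumptions, as is the function name cond_holds.

```python
SIMPLE = {
    "LEQ": lambda v: v <= 0,
    "EQZ": lambda v: v == 0,
    "NEZ": lambda v: v != 0,
    "LTZ": lambda v: v < 0,
    "GEZ": lambda v: v >= 0,
    "GTZ": lambda v: v > 0,
    "ODD": lambda v: v % 2 == 1,
    "EVN": lambda v: v % 2 == 0,
    "POS": lambda v: v >= 0,   # as listed in the article
    "NEG": lambda v: v < 0,
}


def cond_holds(name, value):
    """Evaluate a condition suffix such as 'NEZ' or 'BSET3' on a value."""
    if name in SIMPLE:
        return SIMPLE[name](value)
    for prefix, want in (("BSET", 1), ("BCLR", 0)):
        digits = name[len(prefix):]
        if name.startswith(prefix) and digits.isdigit():
            k = int(digits)
            if not 0 <= k <= 63:
                raise ValueError(f"bit index out of range: {k}")
            return (value >> k) & 1 == want
    raise ValueError(f"unknown condition: {name}")


print(cond_holds("BSET3", 8))  # True: bit 3 of 8 is set
print(cond_holds("LEQ", 0))    # True: 0 <= 0
```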

SUBLEQ alias

SUBLEQ is SUB combined with the LEQ condition from the ALU matrix:

SUBLEQ src, dst, label  ≡  dst = dst - src; branch to label if new_dst <= 0

This makes it easy to write classic SUBLEQ-style code while still having a larger instruction set available.

Control flow and non-ALU instructions

These instructions are not expressed through the ALU matrix:

  • JMP label — unconditional jump
  • JMPI op — jump to the value read from op
  • BR op, label — branch if op != 0

Dedicated branches (branch based on op):

  • BZ, BNZ, BLTZ, BLEQZ, BGEZ, BGTZ, BODD, BEVN

Loop helper:

  • DJNZ dst, label — decrement dst, branch if the result is not zero

Misc memory ops:

  • ZAP dst — set destination to 0
  • XCH a, b — exchange two addressable operands (no immediates)

Stack and calls

These instructions use a conventional stack in memory and rely on two special cells if used:

  • .cell SP = ... must exist for PUSH/POP/CALL/RET/ENTER/LEAVE
  • .cell FP = ... must exist for ENTER/LEAVE

Instructions:

  • PUSH op — store value at mem[SP], then increment SP
  • POP dst — decrement SP, then load from mem[SP] into dst
  • CALL label, CALLI op — push return address, jump
  • RET — pop return address into PC
  • ENTER #n — push old FP, set FP = SP, allocate n locals by advancing SP
  • LEAVE — restore SP = FP, restore old FP from stack
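
The stack discipline described above (store then increment on PUSH, so the stack grows toward higher addresses) can be modelled as follows. This Python sketch keeps SP and FP as attributes rather than memory cells, and the class name StackModel is invented for the example; CALL and RET are left out because the exact return-address value is not specified here.

```python
class StackModel:
    """Upward-growing stack with a frame pointer, per the list above."""

    def __init__(self, base):
        self.mem = {}
        self.sp = base  # stands in for the SP cell
        self.fp = base  # stands in for the FP cell

    def push(self, value):
        self.mem[self.sp] = value  # store at mem[SP] ...
        self.sp += 1               # ... then increment SP

    def pop(self):
        self.sp -= 1               # decrement SP ...
        return self.mem.get(self.sp, 0)  # ... then load from mem[SP]

    def enter(self, n):
        self.push(self.fp)  # save old FP
        self.fp = self.sp   # new frame starts here
        self.sp += n        # reserve n locals

    def leave(self):
        self.sp = self.fp     # drop locals
        self.fp = self.pop()  # restore old FP


s = StackModel(300000)
s.push(7)
s.enter(2)
print(s.fp, s.sp)  # 300002 300004
s.leave()
print(s.pop())     # 7
```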

Built-in memory/string instructions

These are “library-like” instructions implemented directly by the interpreter:

  • MEMSET dstAddr, byte, n
Set n cells starting at address dstAddr to byte & 0xFF.
  • MEMCPY srcAddr, dstAddr, n
Copy n cells; overlap-safe in the reference interpreter.
  • MEMCMP aAddr, bAddr, n, dst
Compare n cells lexicographically; write -1/0/1 into dst.
  • STRLENZ srcAddr, dst
Length of a null-terminated byte string.
  • STRCPYZ srcAddr, dstAddr
Copy a null-terminated byte string including terminator.
  • STRCMPZ aAddr, bAddr, dst
Compare two null-terminated byte strings; write -1/0/1.

Note: srcAddr/dstAddr operands are read as values (addresses). For example, if p contains 100, then STRLENZ p, len measures from address 100.
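
To make the "operands are read as values" rule concrete, here is a Python sketch of STRLENZ and MEMCMP over a dict-based memory. The function names are hypothetical and the code is not taken from the reference interpreter.

```python
def strlenz(mem, addr):
    """Count cells from addr up to (not including) the first 0 cell."""
    n = 0
    while mem.get(addr + n, 0) != 0:
        n += 1
    return n


def memcmp(mem, a, b, n):
    """Lexicographic comparison of n cells; returns -1, 0 or 1."""
    for i in range(n):
        x, y = mem.get(a + i, 0), mem.get(b + i, 0)
        if x != y:
            return -1 if x < y else 1
    return 0


P = 5                                 # address of a pointer cell p
mem = {P: 100, 100: 72, 101: 105, 102: 0,   # "Hi" at address 100
       200: 72, 201: 111, 202: 0}           # "Ho" at address 200
print(strlenz(mem, mem[P]))   # 2: p holds 100, so the scan starts at 100
print(memcmp(mem, 100, 200, 2))  # -1: 'i' (105) < 'o' (111)
```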

Input and output

Byte input (INB) and integer input (INN) read from the same input stream, so the two can be interleaved.

  • INB dst, labelEOF
Read one byte. On EOF, writes -1 to dst and jumps to labelEOF.
  • INN dst, labelEOF
Read a signed decimal integer token (skipping whitespace). On EOF (or no integer available), jumps to labelEOF (reference interpreter does not guarantee writing anything to dst in this case).
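
The shared stream can be pictured as a single byte buffer with one read position. The Python sketch below is an approximation: the class name InputStream is invented, accepting a leading + sign is an assumption, and returning None stands in for "jump to labelEOF" in the INN case.

```python
class InputStream:
    """One buffer and one position shared by byte and integer reads."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def inb(self):
        """Return the next byte, or -1 at EOF (as INB writes to dst)."""
        if self.pos >= len(self.data):
            return -1
        b = self.data[self.pos]
        self.pos += 1
        return b

    def inn(self):
        """Skip whitespace and read a signed decimal integer, or None."""
        d, i = self.data, self.pos
        while i < len(d) and d[i:i + 1].isspace():
            i += 1
        j = i
        if j < len(d) and d[j:j + 1] in (b"+", b"-"):
            j += 1
        k = j
        while k < len(d) and d[k:k + 1].isdigit():
            k += 1
        if k == j:          # no digits found: EOF or not an integer
            return None
        self.pos = k        # consume the token only on success
        return int(d[i:k].decode("ascii"))


inp = InputStream(b"12 -7\nA")
print(inp.inn(), inp.inn())  # 12 -7
print(inp.inb())             # 10: the newline left behind by INN
```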

Output:

  • OUTB op — output low byte (& 0xFF)
  • OUTD op — output decimal integer text (no newline)
  • OUTHEX op — output as 64-bit-masked hex with 0x prefix
  • OUTBIN op — output as 64-bit-masked binary with 0b prefix
  • OUTZ symbolOrAddr — output null-terminated bytes starting at that address (note: this takes a symbol/address, not a full addressing-mode operand)
  • OUTZI op — output null-terminated bytes starting at the address read from op
  • OUTS op — output a length-prefixed string: addr = op, mem[addr] is length, bytes follow at addr+1..
  • EOL — output newline
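
The masking rules for the output instructions can be checked quickly in Python. This sketch returns the text or bytes each instruction would emit; the function names are hypothetical, and the lowercase digits produced by Python's hex() are an assumption about the exact output format.

```python
MASK64 = (1 << 64) - 1


def outb(value):
    """OUTB: the low byte of the value."""
    return bytes([value & 0xFF])


def outhex(value):
    """OUTHEX: 64-bit-masked hex with 0x prefix."""
    return hex(value & MASK64)


def outbin(value):
    """OUTBIN: 64-bit-masked binary with 0b prefix."""
    return bin(value & MASK64)


def outs(mem, addr):
    """OUTS: mem[addr] is the length, bytes follow at addr+1.."""
    length = mem.get(addr, 0)
    return bytes(mem.get(addr + 1 + i, 0) & 0xFF for i in range(length))


print(outhex(-1))                         # 0xffffffffffffffff
print(outb(-1))                           # b'\xff'
print(outs({10: 2, 11: 72, 12: 105}, 10))  # b'Hi'
```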

Debug/abort

  • TRAP #code — terminate the program with exit code code
  • ASSERT op, #code — if op == 0, terminate with code
  • BREAK, WATCH op — present but treated as no-ops by the reference interpreter (useful with tracing/debugging in other implementations)

Program termination

  • HALT returns exit code 0.
  • Falling off the end of the instruction list ends execution (exit code 0 in the reference interpreter).
  • TRAP/ASSERT can terminate with non-zero codes.

Implementation

The reference implementation is a combined assembler and interpreter written in Python 3.

Usage:

python tina.py program.tina < input > output
python tina.py program.tina --trace

Examples

This is a traditional "Hello World" program in Tina:

.zstr MSG "Hello, world!\n"

start:
  OUTZ MSG   ; print the null-terminated string stored at MSG
  HALT       ; exit with code 0

This program is a typical truth machine:

.cell ZERO = 0
.cell C48  = 48      ; '0'

.cell ch   = 0
.cell tmp  = 0

start:
  INB ch, done
  OUTB ch

  MOV ch, tmp
  SUBEQZ C48, tmp, done   ; tmp = ch - 48; if tmp == 0 => input was '0'

loop:
  OUTB ch
  SUBLEQ ZERO, ZERO, loop ; unconditional jump (classic SUBLEQ trick)

done:
  HALT

This program is a simple cat program:

.cell ZERO = 0
.cell ch   = 0

loop:
  INB ch, done            ; read one byte; on EOF jump to done
  OUTB ch                 ; echo it
  SUBLEQ ZERO, ZERO, loop ; 0 - 0 <= 0 always holds: unconditional jump

done:
  HALT

This program is an implementation of Fizz Buzz:

.cell ZERO    = 0
.cell ONE     = 1
.cell THREE   = 3
.cell FIVE    = 5

.cell i       = 1
.cell rem     = 100
.cell c3      = 3
.cell c5      = 5
.cell f       = 0
.cell b       = 0
.cell tmp     = 0
.cell sum     = 0

.zstr SFIZZ "Fizz"
.zstr SBUZZ "Buzz"

loop:
  ZAP f
  ZAP b

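  DJNZ c3, no_fizz    ; c3--; falls through only on every 3rd i
  MOV THREE, c3       ; reset the countdown
  MOV ONE, f          ; f = 1 => print "Fizz"
no_fizz:

  DJNZ c5, no_buzz    ; c5--; falls through only on every 5th i
  MOV FIVE, c5        ; reset the countdown
  MOV ONE, b          ; b = 1 => print "Buzz"
no_buzz: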

  ; if f != 0 print "Fizz"
  MOV f, tmp
  SUBEQZ ZERO, tmp, skip_fizz
  OUTZ SFIZZ
skip_fizz:

  ; if b != 0 print "Buzz"
  MOV b, tmp
  SUBEQZ ZERO, tmp, skip_buzz
  OUTZ SBUZZ
skip_buzz:

  ; if (f+b)==0 print the number
  MOV f, sum
  ADD b, sum
  SUBEQZ ZERO, sum, print_num
  SUBLEQ ZERO, ZERO, after_num
print_num:
  OUTD i
after_num:
  EOL

  ADD ONE, i
  DJNZ rem, loop
  HALT

This takes a number as input and calculates its factorial:

.cell ZERO = 0
.cell ONE  = 1

.cell n    = 0
.cell fact = 1
.cell tmp  = 0

start:
  INN n, eof
  MOV ONE, fact

loop:
  MOV n, tmp
  SUBLEQ ONE, tmp, print    ; tmp = n-1; if tmp <= 0 => n <= 1 => done

  MUL n, fact               ; fact *= n
  SUB ONE, n                ; n--
  SUBLEQ ZERO, ZERO, loop

print:
  OUTD fact
  EOL
  HALT

eof:
  HALT

This implements a simple Brainfuck interpreter:

; Brainfuck interpreter in Tina
;
; Input format on stdin:
;   1) First line: the Brainfuck program (only ><+-.,[] are kept; other chars ignored)
;   2) Remaining bytes after the newline are used as Brainfuck input for ','.
;
; Notes:
; - Tape cells are treated as 8-bit (wrapping) via ADD8/SUB8.
; - Data pointer moves right/left by 1 cell. Moving to a negative address will error.

; -------- fixed “heap” base addresses (picked far away from our .cell area) --------
.cell PROGBASE  = 100000    ; program bytes stored at PROGBASE + i
.cell JUMPBASE  = 200000    ; matching-bracket table at JUMPBASE + i (only for [ and ])
.cell STACKBASE = 300000    ; stack for bracket matching during load
.cell TAPEBASE  = 400000    ; BF data tape starts here

; -------- required register cells for PUSH/POP --------
.cell SP = 0
.cell FP = 0

; -------- interpreter state --------
.cell PLEN  = 0     ; program length (#instructions kept)
.cell IP    = 0     ; BF instruction pointer (0..PLEN)
.cell DP    = 0     ; BF data pointer (address into tape)

.cell OP    = 0     ; current program byte
.cell TMP   = 0
.cell FLAG  = 0

; pointers for indirect addressing
.cell PPROG = 0
.cell PJUMP = 0

start:
    MOV STACKBASE, SP
    ZAP PLEN
    ZAP IP
    MOV TAPEBASE, DP
    JMP load_loop

; --------------------------- load / filter program ---------------------------

load_loop:
    INB OP, load_done          ; EOF => done
    ; stop reading program at newline (LF)
    MOV OP, FLAG
    CMPEQNEZ #10, FLAG, load_done

    ; keep only ><+-.,[]
    MOV OP, FLAG
    CMPEQNEZ #62, FLAG, load_store   ; '>'
    MOV OP, FLAG
    CMPEQNEZ #60, FLAG, load_store   ; '<'
    MOV OP, FLAG
    CMPEQNEZ #43, FLAG, load_store   ; '+'
    MOV OP, FLAG
    CMPEQNEZ #45, FLAG, load_store   ; '-'
    MOV OP, FLAG
    CMPEQNEZ #46, FLAG, load_store   ; '.'
    MOV OP, FLAG
    CMPEQNEZ #44, FLAG, load_store   ; ','
    MOV OP, FLAG
    CMPEQNEZ #91, FLAG, load_store   ; '['
    MOV OP, FLAG
    CMPEQNEZ #93, FLAG, load_store   ; ']'

    JMP load_loop

load_store:
    ; PROG[PLEN] = OP
    MOV PROGBASE, PPROG
    ADD PLEN, PPROG
    MOV OP, @PPROG

    ; if OP == '[' => push its index
    MOV OP, FLAG
    CMPEQNEZ #91, FLAG, load_push

    ; if OP == ']' => pop and create jump links
    MOV OP, FLAG
    CMPEQNEZ #93, FLAG, load_pop

    INC #0, PLEN
    JMP load_loop

load_push:
    PUSH PLEN
    INC #0, PLEN
    JMP load_loop

load_pop:
    ; underflow check: if (SP-STACKBASE) <= 0 => unmatched ']'
    MOV SP, TMP
    SUB STACKBASE, TMP
    BLEQZ TMP, trap_unmatched_close

    POP TMP                  ; TMP = matching '[' index

    ; JUMP[TMP]  = PLEN
    MOV JUMPBASE, PJUMP
    ADD TMP, PJUMP
    MOV PLEN, @PJUMP

    ; JUMP[PLEN] = TMP
    MOV JUMPBASE, PJUMP
    ADD PLEN, PJUMP
    MOV TMP, @PJUMP

    INC #0, PLEN
    JMP load_loop

load_done:
    ; if stack not empty => unmatched '['
    MOV SP, TMP
    SUB STACKBASE, TMP
    BNZ TMP, trap_unmatched_open

    MOV STACKBASE, SP   ; reset stack
    ZAP IP              ; start execution
    JMP exec_loop

; --------------------------- execute BF program ---------------------------

exec_loop:
    ; if IP >= PLEN => halt
    MOV PLEN, TMP
    SUB IP, TMP
    BLEQZ TMP, done

    ; OP = PROG[IP]
    MOV PROGBASE, PPROG
    ADD IP, PPROG
    MOV @PPROG, OP

    ; dispatch on OP
    MOV OP, FLAG
    CMPEQNEZ #62, FLAG, op_gt       ; '>'
    MOV OP, FLAG
    CMPEQNEZ #60, FLAG, op_lt       ; '<'
    MOV OP, FLAG
    CMPEQNEZ #43, FLAG, op_plus     ; '+'
    MOV OP, FLAG
    CMPEQNEZ #45, FLAG, op_minus    ; '-'
    MOV OP, FLAG
    CMPEQNEZ #46, FLAG, op_dot      ; '.'
    MOV OP, FLAG
    CMPEQNEZ #44, FLAG, op_comma    ; ','
    MOV OP, FLAG
    CMPEQNEZ #91, FLAG, op_lbr      ; '['
    MOV OP, FLAG
    CMPEQNEZ #93, FLAG, op_rbr      ; ']'

    ; should not happen
    INC #0, IP
    JMP exec_loop

op_gt:      ; '>'
    INC #0, DP
    INC #0, IP
    JMP exec_loop

op_lt:      ; '<'
    DEC #0, DP
    INC #0, IP
    JMP exec_loop

op_plus:    ; '+'
    ADD8 #1, @DP
    INC #0, IP
    JMP exec_loop

op_minus:   ; '-'
    SUB8 #1, @DP
    INC #0, IP
    JMP exec_loop

op_dot:     ; '.'
    OUTB @DP
    INC #0, IP
    JMP exec_loop

op_comma:   ; ','
    INB TMP, op_comma_eof
    MOV TMP, @DP
    INC #0, IP
    JMP exec_loop

op_comma_eof:
    MOV #0, @DP
    INC #0, IP
    JMP exec_loop

op_lbr:     ; '['
    BZ @DP, op_lbr_zero
    INC #0, IP
    JMP exec_loop

op_lbr_zero:
    ; IP = JUMP[IP] + 1
    MOV JUMPBASE, PJUMP
    ADD IP, PJUMP
    MOV @PJUMP, TMP
    MOV TMP, IP
    INC #0, IP
    JMP exec_loop

op_rbr:     ; ']'
    BNZ @DP, op_rbr_nz
    INC #0, IP
    JMP exec_loop

op_rbr_nz:
    ; IP = JUMP[IP] + 1   (jump back to after matching '[')
    MOV JUMPBASE, PJUMP
    ADD IP, PJUMP
    MOV @PJUMP, TMP
    MOV TMP, IP
    INC #0, IP
    JMP exec_loop

done:
    HALT

; --------------------------- error exits ---------------------------

trap_unmatched_close:
    TRAP #1     ; saw ']' with no matching '['

trap_unmatched_open:
    TRAP #2     ; program ended with unmatched '[' remaining

Computational class

With unbounded integer cells, increment/decrement, and conditional and unconditional jumps, Tina can directly implement a register machine (Minsky machine), so it is Turing-complete, assuming unbounded time and memory. The Brainfuck interpreter in the Examples section gives a second, constructive argument.
