Unilinear

Unilinear is a deque-based RPN programming language originally based on the dc(1) Unix desk calculator and FALSE, with some major differences. All register sets are simultaneously random-access arrays and deques. The main stack is itself a register set, called STACK. Although it is a deque and is often treated like one using the t and T commands, its primary use is holding the inputs and results of operations. Code executes in a loop, or queue: a circular buffer that is automatically rotated so that the first character is always the next instruction. Every program, no matter how long, consists of a single line, and newline is an invalid character: a newline anywhere tells the interpreter to begin executing the program, and all lines in a program file after the first one are comments. This is why the language is called "Unilinear." {} is used for code blocks instead of [], since [] forms a code queue for use as a loop. All numeric literals are one digit, with longer numbers formed by arithmetic operations. There is no bracket matching; the escape character ' (single quote) is used instead. Another difference is that lambda functions can be attached, or bound, to commands using the ` command. The reference implementation is in Ruby.

History

Unilinear is based on RPN86, a language I wrote in TI-86 BASIC in 2008 that was itself based on dc and FALSE. This accounts for many of the unusual features, including the single-digit numbers. Because any "undefined" command or string enclosed in <> was interpreted as a TI-86 "equation" variable (like an eval function), I could simplify some of the operations, so <> could push a multi-digit number, an equation, or a TI-86 variable.

Since lists could only hold numbers, that language had "string references": indices into two lists, one containing start values and the other containing length values, used to extract a substring from a single global string. A string enclosed in {} would append that string to the global string and push a string reference onto the stack. The $ substring and # length operators only looked at and modified the string reference, and concatenation used a copy-on-write style system: if, say, stra+strb was equal to sub(rpnSTR,rpnSS(a),rpnSL(a)+rpnSL(b)), it would just change the string reference and not touch rpnSTR at all.

It initially didn't support nesting of the same type of block, and instead used a string reference with the x command. Operator overloading was also unsupported. Since I wanted loops inside of blocks, there are separate {} and [] groups. I eventually added the ' escape character (TI-86 has no backslash). There was an rpnTAG() list that held the data types, like REAL, LIST, VECTR, MATRX, EQU, STRNG, and the CPLX variants of the first 4. It always used one entry per stack item, so the X operator was just an equation that contained dimL(rpnTAG). It was more APL-like and polymorphic, and functions like . (which would be dot product on 2 vectors and logarithm otherwise) and * (which was for scalar multiplication, matrix multiplication, vector cross product, multiplication of corresponding elements of a list, etc.) automatically adjusted based on data type using the built-in TI math operators. There were also matrix transpose (superscript T) and matrix swap operators. The register set was stored in a matrix with the rows each associated with string references, so rpnREGS(rpnSTRR(rpnRC),index) would return the register numbered by index from the current register set. The , operator was originally an upside-down question mark that pushed 1 if equal and 0 if unequal for any data type (including matrices and vectors), since = was already used by xor, but when I rewrote it for the PC, I made , a string-equality operator instead.

Commands

  • 0123456789 - pushes the literal number onto the stack as an integer
  • ' - An escape character. The next character is neither treated as a jump target nor closes any group ""()<>[]{}
  • a - (#a -- $b) replaces a number by the ASCII character with that value
  • A - ($a -- #b) replaces the first character in a string by its ASCII numeric value
  • B - (#a --) jumps to the nth character in its argument based on `a', starting from index 0
  • b - ($a $b --) reads a file specified by its name in `a' into a column specified by `b'
  • C - clears the current register set
  • c - (?... --) clears the stack
  • D - (#... --) executes the next character/block as a math function (e.g. D(sin) is a sine)
  • d - (?a -- ?a ?a) duplicates `a', the same as `0s'
  • e - (?a --) deletes top of stack
  • F - (#a -- #b) replaces `a' with its fractional part
  • f - (#a -- #b) replaces `a' with its integral part
  • g - (-- #a) gets the number of registers in a column (register set)
  • h - (for "holdeval") executes the following instruction as a builtin, not its user-defined value
    • hb - ($a $b --) transfer the lines of a file named `a' into the register set `b'
    • hw - ($a $b --) write each entry in the register set `b', converted to strings, as lines in file `a'
    • others may be available in an implementation-dependent manner
  • H - gets the help of the following character, if it exists.
    • H^A (Ctrl-A) pushes a list of all defined commands.
    • H^B (Ctrl-B) pushes a list of all Builtins.
    • H^D (Ctrl-D) pushes a list of all commands with help available.
    • H^X (Ctrl-X) pushes a list of all Extended (user-defined) Commands.
  • I - (-- #a) pushes the ASCII value of a single inputted character, input without buffering (using conio or curses)
  • i - (-- $a) pushes a line of input from the user as a string
  • J - Jumps backwards to the previous : (colon) character; to skip a :, the ' escape character must come after the :
  • j - Jumps forwards to the next : (colon) character
  • k - (#a #lower #upper -- #cond) pushes 0 if the (inclusive) range `k'ontains the number, -1 if below the bounds, or 1 if above them
  • l - (-- $a) gets the current column's name
  • L - ($a --) loads a column by its name in `a'
  • m - (#a --) sets (makes) the number of registers in a column equal to `a'
  • M - (?a -- ??b) converts (makes) the top of stack into a different type
    • Mi - makes it an integer (truncating it if necessary)
    • Mf - makes it floating-point (if applicable)
    • Ms - makes it a string
    • non-numeric strings, when converted to numeric types, become 0
    • other conversions may be possible in some implementations or extensions
  • N - (#a --) swaps execution level with entry `a'-from-top (cooperative multitasking/coroutines)
  • n - cycles to the next execution level in the queue (cooperative multitasking/coroutines)
  • O - ($a --) replaces the parent instruction stream with the string popped from the stack
  • o - (-- $a) pushes the parent instruction stream onto the stack
  • p - ($a --) prints the top of stack followed by a newline
  • P - ($a --) prints the top of stack with no trailing newline
  • Q - breaks from the current loop/macro/subroutine
  • q - exits from the program
  • R - (#a -- #b) reads the register specified by `a' and pushes its value
  • r - (?a ?b -- ?b ?a) swaps the top two items on the stack, the same as `1s'
  • S - (#a -- #b) replaces `a' with its sign (a => signum(a))
  • s - (?a ... ?n -- ?n ... ?a) swaps the top of stack with the entry `a'-from-top
  • T - (?a ?b...?y ?z -- ?b...?y ?z ?a) rotates the stack left
  • t - (?a ?b...?y ?z -- ?z ?a ?b...?y) rotates the stack right
  • U - pushes nil or undef; U` undefines the macro represented by the next character, returning its meaning (if any) to the builtin
  • u - (-- #a) pushes the current number of execution levels
  • V - sets or clears the verbosity to the opposite of its current level (used for debugging or interactive calculations)
  • v - (#a -- #b) takes the square root of `a'
  • W - (#a #b --) writes the value of `a' into the register specified by `b'
  • w - ($a $b --) writes a column specified by `b' into a file specified by its name in `a'
  • X - pushes the current stack depth (number of items on the stack)
  • x - ($a --) appends a `Q' command to string `a' and executes it as a subroutine
  • Y - ($a --) rotates the specified column left COL(a b...y z)=>COL(b...y z a)
  • y - ($a --) rotates the specified column right COL(a b...y z)=>COL(z a b...y)
  • z - (#a $b --) swaps the top of column `b' with the entry `a'-from-top
  • Z - ($a $b --) moves the register set identified by `a' to the one identified by `b'
  • + - (?a ?b -- ?c) adds (or concatenates strings) `a' and `b' on the stack
  • - - (#a #b -- #c) subtracts `b' from `a' (a-b)
  • _ - (#a -- #b) negates `a'
  • * - (?a #b -- ?c) multiplies `a' and `b', or repeats the `a' string `b' times
  • / - (#a #b -- #c) divides `a' by `b'
  • % - (#a #b -- #c) returns `a' modulo `b'
  • ^ - (#a #b -- #c) returns `a' raised to the power of `b'
  • . - (#a -- #b) takes the natural logarithm of `a'
  • & - (#a #b -- #c) ANDs `a' and `b'
  • | - (#a #b -- #c) ORs `a' and `b'
  • = - (#a #b -- #c) XORs `a' and `b'
  • " - prints a string specified between " and " (double quote) characters, followed by a newline
  • # - ($a -- #b) replaces the string `a' with its length
  • $ - ($STR #START #LEN -- $SUBSTR) takes the substring of `STR' starting at position `START' and length `LEN'
  • , - ($a $b -- #cond) pops `a' and `b', pushes 0 if equal or 1 if not equal
  • ; - ($a $b #c -- #d) an "indexOf" or "strstr" function that returns the first position of `b' found in `a' beginning at position `c'
  • ? - (?a --) pops `a' and skips the next character/block if it is nonzero
  • ! - always skips the next character/block
  • : - A jump target for `j' and `J', otherwise a no-op
  • @ - (-- $a) fetches the next character/block from the parent instruction stream based on the grouping rules (as used by ? and !)
  • \ - pushes the following character onto the stack
  • ` - ($a --) defines the string at the top of stack as a macro representing the following character, automatically executing that macro instead of a builtin (if it exists) when that instruction is executed
  • ~ - clears the screen
  • ( - (GROUP) is a no-op, but ! and ? skip the enclosed commands; can be used for comments with !
  • < - <SYSEVAL> executes a string specified between < and > (angle brackets) as native code, using eval (implementation-dependent)
  • [ - [LOOP] executes a subroutine specified between [ and ] (square brackets) with no Q on the end
  • { - (-- $a) {STRING} or {LAMBDA} pushes a string specified between { and } (curly brace) characters
  • )>]} - end the corresponding block, and are no-ops if executed; see the opening character for details
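The stack effects listed for T and t can be modelled with an ordinary double-ended queue. The Python sketch below is only an illustration, not the reference implementation (which is in Ruby). The function names rotate_left and rotate_right are hypothetical, and the stack is written bottom to top, left to right, as in the stack-effect notation.

```python
from collections import deque


def rotate_left(stack):
    """Model of T: (a b ... y z -- b ... y z a)."""
    stack.rotate(-1)  # the leftmost (bottom) item moves to the right end (top)
    return stack


def rotate_right(stack):
    """Model of t: (a b ... y z -- z a b ... y)."""
    stack.rotate(1)  # the rightmost (top) item moves to the left end (bottom)
    return stack


stack = deque(["a", "b", "y", "z"])
after_T = list(rotate_left(deque(stack)))   # work on copies so `stack` is unchanged
after_t = list(rotate_right(deque(stack)))
```

Applying one rotation and then the other restores the original order, which is why the two commands are useful together for reaching the bottom of the stack.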

Some of these commands are actually extended commands, defined as part of the standard library. For example, "," is defined as {\ +r\ +rdt#rdt#-?j:TeTe1Q:Tdt#1-?jTTd>t<Ard>t<A-?Jj:TATA-hSd*}`,, ";" is defined as {l{STACK'}Lt!1ttt[1R0R2Rd1+2W1R#\$,?Q0R#2R-?(02WQ)]TTeeT1-TL}`;, and "k" is {ttdT-S1+?(eTe1_Q)T-S1-?(1Q)0}`k. In the definition of ",", "<" and ">" are themselves redefined extended commands.

SYSEVAL may be an eval(), the numeric address of a function pointer, a system() command, a function name, inline assembly or machine code, Unilinear code with special handling (e.g. all overloads or all PRE/POSTRUN effects disabled), or any or none of the above, depending on the implementation.

All commands beginning with "(<[{ strip all unescaped ' escape characters upon each parsing, so 2^n-1 (or 2r^1-) single quotes must be used to escape n levels of parsing.
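The 2^n-1 figure follows from each parse consuming one ' per character it protects, so every extra level doubles the quotes and adds one. The Python sketch below checks this under an assumed model of a single parse (each unescaped ' is removed and the character after it is kept literally). The function names are hypothetical and this is not the reference implementation.

```python
def quotes_needed(levels):
    """Number of ' characters needed to survive `levels` parses (2^n - 1)."""
    return 2 ** levels - 1


def strip_one_level(text):
    """Assumed model of one parse: drop each unescaped ', keep what it escapes."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "'" and i + 1 < len(text):
            out.append(text[i + 1])  # the escaped character survives literally
            i += 2
        elif text[i] == "'":
            i += 1  # a trailing ' with nothing to escape is dropped
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_levels(text, levels):
    """Apply the assumed single-parse model `levels` times."""
    for _ in range(levels):
        text = strip_one_level(text)
    return text


# A closing brace protected through three levels of parsing.
protected = "'" * quotes_needed(3) + "}"
```

After two parses the brace in `protected` is still preceded by one ', and only the third parse exposes it as a bare character.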

Data Types

  • # - Numeric (integer, floating-point, or other numeric types)
  • $ - String
  • ? - Any
  • ?? - Any, but may be converted into a different type than its original type

Giving the wrong data type may either cause type coercion (as if the proper M instruction were executed) or cause a type error. A type error will always be displayed on an output stream, but may possibly also terminate the program, kill the current execution stream (as though Q were executed), replace the return value with the output of U, or return to the main input loop.

All string indices start at 1, not 0. However, 0 or a negative number is interpreted as a starting index of 1. If the length of a substring is negative or 0, and the starting position is within the string bounds, an empty string is pushed. If the start or end of a substring is outside the bounds of the original string, the result is undefined, with the exception that it may not return any portion of the string prior to the specified starting position.
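These indexing rules can be summarised in a short Python sketch of the $ command. It is an illustration only: the function name substring is hypothetical, and raising an exception for the out-of-bounds case is an assumption, since the text above leaves that result undefined.

```python
def substring(text, start, length):
    """Sketch of $ : ($STR #START #LEN -- $SUBSTR) with 1-based indexing."""
    if start < 1:
        start = 1  # 0 or a negative start is read as index 1
    if start > len(text):
        # Undefined in the description; raising is this sketch's choice.
        raise ValueError("start is outside the string bounds")
    if length <= 0:
        return ""  # non-positive length with an in-bounds start gives ""
    end = start + length - 1  # 1-based index of the last character wanted
    if end > len(text):
        # Also undefined in the description; raising is this sketch's choice.
        raise ValueError("end is outside the string bounds")
    return text[start - 1:end]
```

For example, `substring("hello", 0, 2)` and `substring("hello", 1, 2)` both give `"he"`, because the 0 start is read as 1.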

Special Register Sets

  • ARGS - the command-line arguments, indexed as argv (or equivalent)
  • STACK - the main stack from which all commands get their operands
  • ISTACK - the instruction stream stack, the top holds the current execution level
  • ISTACK_V - holds the verbosity mode (set by `V') for each execution level
  • SKRULES - the skip rules for instructions like !, ?, and @, parsed by the SKRULES parser; each entry corresponds to the ASCII value of a command character
  • XCMDS - overloads for commands; each entry corresponds to the ASCII value of a command character, and is a string of length > 0 if overloaded or a number, empty string, or the output of `U' otherwise
  • PRERUN - each entry is executed prior to every command, except in a PRERUN, POSTRUN, CLEANUP, or XCMDS loop; the only way to overload `h'
  • POSTRUN - each entry is executed following every command, except in a PRERUN, POSTRUN, CLEANUP, or XCMDS loop
  • CLEANUP - upon executing a `q' command, the output of `U' is pushed, then each entry is executed; if the top of stack still contains the output of `U', exit, otherwise continue
  • 0 - the default register set loaded when the interpreter begins execution

Where it says "each entry is executed" assume it means "all entries that contain a string of nonzero length are executed in the order from index 0 to what `g' would return if that register set is loaded."

SKRULES Parser

  • >CC - seek until the character C (using ' as an escape character)
  • >NN - skip N times forward
  • <NN - skip N times backward
  • >WC - skip while the current character is C
  • ?>C<OPER> - execute rule for <OPER> if the next character is C
  • ?=C<OPER> - execute rule for <OPER> if this character is C
  • ?<C<OPER> - execute rule for <OPER> if the previous character was C
  • ?*C<OPER> - execute rule for <OPER> if character C is anywhere in the stream
  • ?C*<OPER> - execute rule for <OPER> if a change was previously made
  • ?c*<OPER> - execute rule for <OPER> if no change was made
  • Cc* - clear the changed flag
  • =CC - execute rules of character C as if C was the character, don't return
  • ==* - execute rules of the current character, don't return
  • &CC - execute rules of character C as if C was the character, return to the previous rules afterwards
  • &=* - execute rules of the current character, return to the previous rules afterwards
  • QQ - quit parsing immediately
  • ER<MESG> - print error message and quit parsing

Entries of the SKRULES register set are in this command language. The default is >N1, which simply skips the single-character command itself. They should represent the instruction format of each instruction except for !, which deliberately uses >N1 to allow negated conditionals and else statements. All non-strings, and all strings in which the first few characters do not form a valid SKRULE, are treated as if >N1 were specified.
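As an illustration of the >CC rule, the Python sketch below seeks forward to a character while honouring the ' escape. It is a guess at the behaviour described above, not the reference implementation: the name seek_until, the starting position argument, and the -1 return value for "not found" are all assumptions.

```python
def seek_until(stream, pos, target, escape="'"):
    """Return the index of the first unescaped `target` at or after `pos`, or -1."""
    i = pos
    while i < len(stream):
        if stream[i] == escape:
            i += 2  # skip the escape character and the character it protects
            continue
        if stream[i] == target:
            return i
        i += 1
    return -1  # assumed convention: no unescaped match in the stream
```

With the stream `ab')c)d`, seeking `)` from position 0 passes over the escaped parenthesis at index 3 and stops at index 5.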

Standard Conditionals

S1+?	(#a --) less than zero
?	(#a --) zero
S1-?	(#a --) greater than zero
S1+?!	(#a --) greater than or equal to zero
?!	(#a --) nonzero
S1-?!	(#a --) less than or equal to zero
-S1+?	(#a #b --) less than
-?	(#a #b --) equal
-S1-?	(#a #b --) greater than
-S1+?!	(#a #b --) greater than or equal to
-?!	(#a #b --) not equal
-S1-?!	(#a #b --) less than or equal to
kS1+?	(#a #min #max --) below range
k?	(#a #min #max --) in range
kS1-?	(#a #min #max --) above range
kS1+?!	(#a #min #max --) in or above range
k?!	(#a #min #max --) not in range
kS1-?!	(#a #min #max --) in or below range
#?	($a --) string empty
#?!	($a --) string not empty
,?	($a $b --) strings equal
,?!	($a $b --) strings not equal
0;S1+?!	($a $b --) string `a' contains string `b'
0;S1+?	($a $b --) string `a' does not contain string `b'
;S1+?!	($a $b #pos --) string `a' contains string `b' at or beyond position `pos'
;S1+?	($a $b #pos --) string `a' does not contain string `b' at or beyond position `pos'
d-?	(#a --) is the output of `U'
d-?!	(#a --) is not the output of `U'

These are immediately followed by a command or group of commands, which will be executed if the condition is true, or skipped according to SKRULES if the condition is false.
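The idioms in this table all reduce a comparison to "is the value zero?", because ? runs the following command only for zero. The Python sketch below models S, k, and ? to show how kS1+?, -S1+?, and the ! negation work. All function names are hypothetical, and the sketch covers plain numbers only.

```python
def sign(a):
    """S: replace a number with its sign (-1, 0, or 1)."""
    return (a > 0) - (a < 0)


def k(a, lower, upper):
    """k: 0 if lower <= a <= upper, -1 if a is below the range, 1 if above."""
    if a < lower:
        return -1
    if a > upper:
        return 1
    return 0


def runs_next(value):
    """?: the following command runs only when the popped value is zero."""
    return value == 0


def below_range(a, lower, upper):
    """kS1+? : k gives -1 below the range, so the sign plus 1 is zero."""
    return runs_next(sign(k(a, lower, upper)) + 1)


def less_than(a, b):
    """-S1+? : a-b is negative when a < b, so the sign plus 1 is zero."""
    return runs_next(sign(a - b) + 1)


def greater_or_equal(a, b):
    """-S1+?! : the ! skips what ? would have run, negating the test."""
    return not less_than(a, b)
```

The ! works as a negation because ? either runs it (so the real command is skipped) or skips it (so the real command runs).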

Properties of the Output of U

  • It is always referred to as the output of U, the value pushed by U, the value returned by U, nil, undef, or null. It is never referred to as U.
  • It may be an operand of any operation, and if the operation takes a type of ?, it is unchanged.
  • When it is the first (lower) operand (`a' in stack descriptions) of any binary mathematical operation or the only operand of a unary mathematical operation that does not coerce its argument, the operands are popped as usual and the result is the value pushed by U.
  • When it is the second (top) operand (`b' in stack descriptions) of any binary mathematical operation, the result is always the other operand (implying that if both are the value returned by U, it is also the value returned by U).
  • As the start index of a ; command, the substring is never found, and -1 is always returned.
  • It has a length (returned by the # command) of 0.
  • The , command compares it as equal to the empty string, based on their identical lengths.
  • A substring of it may either cause a type error, or return the value pushed by U, no matter what the bounds are.
  • It is the only value that may be divided by 0 without creating an error (strings, even the empty string, would create a type error, and all other numbers a "divide by 0" error).
  • Type conversion to string either implicitly or explicitly by Ms returns an empty string.
  • Type conversion to a number either implicitly or explicitly by Mi or Mf returns zero.
  • Type conversion to character either implicitly or explicitly by a returns a NULL (character code 0) character.
  • Type conversion of the (nonexistent) "first character" of an empty string to a number either implicitly or explicitly by A returns the value pushed by U.
  • As the operand to the ? operator, it always skips the next command (zero always executes the next command).
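Several of these rules can be captured in a few lines of Python. The sketch below is an illustration under assumptions, not the reference implementation: the Nil class and the names NIL, binary_math, to_string, and length are hypothetical, and only the binary-operation, string-conversion, and length rules are modelled.

```python
class Nil:
    """Stand-in for the value pushed by U (hypothetical name)."""

    def __repr__(self):
        return "nil"


NIL = Nil()


def binary_math(a, b, op):
    """Apply op(a, b), where `a` is the lower operand and `b` the top one."""
    if a is NIL:
        return NIL  # nil as the first operand: the result is nil
    if b is NIL:
        return a    # nil as the second operand: the result is the other operand
    return op(a, b)


def to_string(value):
    """Ms: nil converts to the empty string."""
    return "" if value is NIL else str(value)


def length(value):
    """#: nil has a length of 0."""
    return 0 if value is NIL else len(value)
```

Because the first-operand check comes before the operation is applied, dividing the nil value by 0 never reaches the division, which matches the rule that it is the only value that may be divided by 0 without an error.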

Examples

0dp1dp[dt+dpIe] !(Computes Fibonacci sequence, pausing after each number. Remove the Ie to eliminate the pause.)
Mf2/F?("Even"!()"Odd") !(Pops the top of stack and prints "Even" if it is even or "Odd" if it's odd.)
H^D[d<\Hr+x>d#?Q]e !(Gets help on all commands. The ^D is a literal Ctrl-D character.)
2s?!rex !(( cond true false -- ') if statement like in DUP or Factor.)

This line overloads the ~:!*a^S( commands to behave like the corresponding Underload commands:

\r`~\d`:\e`!\+`*{\)+\(r+}`a{@eo+dd#1$rd#1-1r$+O}`^\P`S{U`:1t\000$t)@dTr+td\),?h(TT1-tt)\(,?h(TT1+tt)TTdtrt?h(Td#1-T1+r$\d`:}`(