User:Zzo38/Programming languages with unusual features

From Esolang

Here I list various programming languages, VMs, computers, and so on that have some kind of unusual feature (or are otherwise possibly of interest). If you disagree with something, you may change it if there is enough agreement to do so, although discussion on the talk page first is usually preferred. (Note that a feature being unusual or otherwise of interest does not necessarily mean that it is unique. New programming languages continue to be made, and some will share features with others, so a feature is probably not unique even if it seems so at first.)

See also: User:Ian/Computer architectures and Prehistory of esoteric programming languages

Apollo Guidance Computer

Some of its features include:

  • Like many old computers, it uses ones' complement (with signed zero) rather than the two's complement of modern machines.
  • The registers are exposed on I/O ports.
  • The instruction to write the value from the accumulator into RAM also checks for overflow. If there is overflow, it skips the next instruction and then sets the accumulator to the carry amount.
  • The conditional branch instruction is unusual: It reads its operand and stores it in the accumulator, skips 0 to 3 instructions based on its value (positive nonzero, positive zero, negative nonzero, or negative zero), and then changes the accumulator one step toward zero.
  • Some memory addresses perform special operations when accessed, such as returning from interrupts, shifting the data written to them, etc.
  • The INDEX instruction adds its operand to the next instruction; both the operand bits and the opcode bits of the next instruction are affected. This is normally used for indexing, but can also be used to change the next instruction into a different kind of instruction.
  • It is possible to call the accumulator as a subroutine.
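The four-way test made by the conditional branch can be sketched in Python. This is a minimal model only, assuming a 15-bit ones' complement word; the names `encode` and `ccs_skip_count` are made up for illustration, and the accumulator update is deliberately left out.

```python
WORD_BITS = 15                      # assumed word width for this sketch
MASK = (1 << WORD_BITS) - 1         # all ones: the bit pattern of negative zero
SIGN = 1 << (WORD_BITS - 1)         # sign bit

def encode(n, negative_zero=False):
    """Encode a Python int as a ones' complement word."""
    if n == 0:
        return MASK if negative_zero else 0
    return n if n > 0 else ~(-n) & MASK   # negative: invert every bit of |n|

def ccs_skip_count(word):
    """How many instructions are skipped for a given operand word."""
    if word == 0:
        return 1                    # positive zero
    if word == MASK:
        return 3                    # negative zero
    return 2 if word & SIGN else 0  # negative nonzero, else positive nonzero
```

The two zeros are distinct bit patterns here, which is why the branch has four outcomes rather than three.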

BANCStar

BANCStar is a programming language which was actually in use, and is very strange (see the article on this wiki for some more information, although some of it is wrong). It was designed for financial systems, so that computer operators could fill in forms on the screen, produce reports, and so on.

Here are some features:

  • Each instruction consists of four signed 16-bit numbers, although instead of using bit fields like most instruction sets, the encoding is based on multiples of ten, presumably in order to make it easier to read (the compiler does not accept symbolic names or comments at all).
  • The "goto page" instruction points to pages, which are absolute; there is no relative addressing or label names. Pages seem to be introduced immediately after a prompt instruction with a negative screen position is encountered.
  • Block statements are (presumably) built into and executed by the VM, rather than being translated by a compiler beforehand.
  • There are no local variables; only globals.
  • There is a limit of 2000 variables in the entire system; string constants must also be stored in these variables, and so must any non-integer you wish to use, or anything you want to display on the screen or print on a form.
  • Some of these 2000 variables are special and are used for return address and other things.
  • A separate instruction is used to save the return address and to jump to the subroutine, while returning from the subroutine is done by using a "combination GOTO" to the saved return address.

BASIC

The GOSUB and RETURN commands can be used both at module level and inside SUB and FUNCTION routines, although only at module level is a RETURN command allowed to specify which label or line number to return to (meaning that it pops the GOSUB stack but goes to the specified label instead of to the point after the call); this can be used to implement something similar to the "DO RESUME #2" of INTERCAL (and I have done this once, as it seemed useful in the particular program I was writing).

You can specify types for variables based on what letter their name starts with. For example with "DEFSTR Q-S" then the variables called "QUARTZ" and "SAVEGAME" are string variables.

The "ON ERROR RESUME NEXT" command means that any command with an error will be skipped, and the program will continue with the next one as though nothing happened. You can specify "ON ERROR GOTO 0" to cancel this, so that it will stop if there is an error; this is misleading, because it does not actually go to line zero, even if there is a line numbered zero.

There is no 8-bit type nor any unsigned types; if you want to store an 8-bit value you must use a single-character string instead.

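The way that the DATA command works in BASIC is unlike any other programming language that I have seen: constants are listed in DATA statements within the program text, and READ consumes them in order through a single program-wide pointer, which RESTORE can reset.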

Some variants allow you to put words together with no spaces and it will work.

Apparently, one variant includes a command to increase the size of the compiled file with worthless data.

bc

  • If "quit" is encountered, it terminates immediately when it is read, regardless of any conditions around it, even if they are false.

C

  • Trigraphs, which were included because some computers do not use ASCII. Two question marks followed by one of certain characters are interpreted as a trigraph even inside a string literal or a comment (which can be used to detect whether the implementation supports trigraphs). There are also digraphs, which are similar but apply only where operators are expected.
  • You can write such things as 5["Hello, World!"] and it will work; it is interpreted as pointer arithmetic.
  • The syntax for types is confusing.
  • Macros can be used to override many things, including keywords, and can be used for various tricks. Some people dislike this, although some people do like the C macros. (I personally like C macros (it isn't only for stupid things), but would want some extensions as well, such as the option to make a proper hygienic macro if that is wanted, and a "scoped macro" similar to the vardef command in METAFONT.)
  • int a=4295010246;printf("%d",a); will print only the first five digits (42950) instead of all ten (if int is 32 bits; 4295010246 is a number whose high five decimal digits represent the same number as its low thirty-two bits).
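The arithmetic behind the last item can be checked with a short Python sketch. It assumes the common behaviour where an out-of-range value simply wraps to its low 32 bits (in C this conversion is implementation-defined); `truncate_to_int32` is a name invented for this example.

```python
def truncate_to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low

value = 4295010246                  # equals 2**32 + 42950
result = truncate_to_int32(value)
print(result)
```

Since 4295010246 = 4294967296 + 42950, the surviving low bits spell out the same digits that the full number begins with.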

COBOL

  • There is a command to alter a GOTO to one label to go to a different label instead.
  • Data levels are declared by numbers, where 01 denotes a record and greater numbers denote parts of a record or other data item. Some data levels have special purposes, such as 66 to rename a group of previously defined items, 77 for stand-alone items, and 88 to name a condition based on the values of other items.
  • It is possible to call a range of adjacent procedures; you do not need to call one at a time. For example, if you defined three procedures called ALPHA, BETA, and GAMMA, then you can write PERFORM ALPHA THRU GAMMA to call all of them.
  • You can abbreviate complex conditions, e.g. "a > b AND a > c OR a = d" can be abbreviated as "a > b AND c OR a = d".
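As a rough illustration of calling a range of adjacent procedures, here is a minimal Python sketch. It only models the idea that procedures have a textual order and that a range runs from the first named one to the last; `perform_thru` and the paragraph bodies are hypothetical.

```python
def perform_thru(paragraphs, first, last):
    """Run every procedure from `first` through `last` in definition order."""
    names = list(paragraphs)                 # dicts keep insertion order
    start, end = names.index(first), names.index(last)
    return [paragraphs[name]() for name in names[start:end + 1]]

# Hypothetical procedures, defined in this textual order.
paragraphs = {
    "ALPHA": lambda: "alpha ran",
    "BETA": lambda: "beta ran",
    "GAMMA": lambda: "gamma ran",
    "DELTA": lambda: "delta ran",
}

results = perform_thru(paragraphs, "ALPHA", "GAMMA")
```

DELTA is not run, because it lies after the end of the named range.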

dc

The dc programming language has the unusual feature that although it does not have arithmetic IF, it would probably be much better if it did; most programming languages work fine with the conditions they already have.

Features related to numbers:

  • You can use digits up to F even in bases lower than that, so for example "Adio" will always reset to base ten.
  • Numbers are interpreted using the input base that is active at the time they are executed, so if you define a block and then change the base and execute it, it is the new base which is now used, rather than the previous one.

Like INTERCAL, every register is also a stack, and it is possible to exit out of multiple blocks at once by using a computed number (it doesn't have to be a constant). Unlike INTERCAL, arrays are stored in the same register as single values, and are stashed together with them.

The following program prints "google" if the input is 2:

A3P256?^255/B1*P[gle]P
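The program above relies on out-of-range digits (A3 is 10*10+3 = 103, the code for "g") and on P printing a number as a sequence of bytes. The following Python trace is only a sketch of that reasoning, not an implementation of dc; the helper names are invented for the example.

```python
DIGITS = "0123456789ABCDEF"

def dc_number(text, ibase=10):
    """Positional value in which a digit may exceed the base (no range check)."""
    value = 0
    for ch in text:
        value = value * ibase + DIGITS.index(ch)
    return value

def dc_P(n):
    """Model of P on a number: emit it as big-endian base-256 bytes."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big").decode("latin-1")

def trace(user_input):
    out = dc_P(dc_number("A3"))                # A3P  : 103 is "g"
    repeated = 256 ** user_input // 255        # 256?^255/ : 0x0101...01
    out += dc_P(repeated * dc_number("B1"))    # B1*P : 111 is "o", once per byte
    return out + "gle"                         # [gle]P
```

Because 256^n divided by 255 (integer division) is a number made of n bytes each equal to 1, multiplying by 111 yields n copies of "o"; in this model an input of 3 would give "gooogle".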

Fairchild Video Entertainment System (Channel F)

It is an old video game console.

  • It doesn't have an address bus; instead, the program counter and data counter are shared among the cartridge and other devices, and there is a ROMC bus to command the access and to set their values. (The bus has separate commands for opcodes, operands, and data; this means that it might be possible to have ROM with separate address spaces for these three things, although as far as I know, this feature was never used.)
  • Bit shifting instructions can be either one at a time or four at a time.
  • There is no direct addressing; all memory access must be done by first setting the data counter register to the address that you want to access. Accesses will automatically increment the data counter.
  • Some of the registers are scratchpad memory, some of which are also used for other purposes, some of which can be addressed directly or indirectly, and some of which can be addressed only indirectly. Indirect addressing of scratchpad memory uses the indirect scratchpad addressing register, and can be used with postincrement and postdecrement.
  • The game controller has eight directions: right, left, backward, forward, counterclockwise, clockwise, up, down.
  • There is no subtraction command.
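Without a subtraction command, a difference can still be computed by negating one operand and adding. The Python sketch below shows the general two's complement technique on 8-bit values; it assumes complement, increment, and add operations are available, which the list above does not state, and `subtract_8bit` is an illustrative name.

```python
def subtract_8bit(a, b):
    """Compute (a - b) mod 256 using only complement, increment, and add."""
    negated = ((b ^ 0xFF) + 1) & 0xFF   # complement b, then add one
    return (a + negated) & 0xFF         # add, discarding the carry out
```

For example, subtracting 10 from 3 gives 249, which is -7 read as a two's complement byte.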

Free Hero Mesh

Free Hero Mesh is a puzzle game engine.

  • The syntax is a mixture of S-expression-like syntax with stack-based code. (WebAssembly also does this; I don't know if anything else does.)
  • There is a goto sigil (=:) and gosub sigil (,:). These work like the GOTO and GOSUB commands in BASIC, but with sigils instead of keywords.
  • The chain instruction changes Self to point to a different object of the same class (you cannot assign to Self directly), and has a result indicating if it is successful or not (specifying an object of the wrong class, or a value of a different type, is not an error, but makes it unsuccessful). This is the only way (other than sending messages) to access user-defined local variables of another object than yourself, but also affects all instructions which implicitly work on the Self object.
  • Many instructions accept , and/or = modifier sigils. Usually, , means to either use signed arithmetic (instead of unsigned) or to work on an object other than Self (where omitting the comma means to implicitly work on Self); = usually means assignment; there are a few exceptions (including the goto sigil and gosub sigil mentioned above). (Apparently TECO also has a command modifier sigil.)
  • Although there are types, the null object is represented by the number zero, not by the object type or a special null type. (Zero can also be used to mean no class, or to mean all classes, in some contexts.)
  • In most cases, specifying zero instead of an object where an object is expected will cause the operation to be ignored, and if the operation would have a result, the result is zero. This is not usually an error.
  • The working of arrays is unusual. Arrays are only static (there are no dynamic arrays); however, array references are only stored in variables, which can be reassigned. When a level is started, references to arrays that you have defined are stored in global variables. (You can make references to parts of arrays, but cannot allocate any new memory for arrays.)
  • Sounds cannot be compared (nor used where a boolean value is expected); all other values of other types can be compared (and can be used where booleans are expected).
  • The flip instruction reverses everything above the most recent mark on the stack (marks work similarly to PostScript). This is similar to how some programming languages can reverse a list, but there is no "list" value in Free Hero Mesh, so it does this instead. (Likewise, mbegin iterates over the list. Since lists are not a first-class value in Free Hero Mesh, it works by things like this instead. Actually, mbegin is the same as begin tmark while; you can use tmark to test for when you have finished removing the data from the list and found the mark. Other operations on lists are also available.) (Note also that arrays are not the same as lists; they are separate and work entirely differently.)
  • There is a %R substitution to display Roman numerals in popup messages.
  • Determining the character set of the source file potentially involves reading the entire file, reading other files (sometimes it will not even be known which other files), and/or evaluating macros (which are Turing-complete). (Although, in the most common cases, this won't be necessary.)
  • When an animation is set, there are two separate animations, the "logical" animation which runs within the time of a single turn and then expires before the next turn, and the "visual" animation which is visible on the screen for its set duration, which may span multiple turns. Only the logical animation affects the behaviour of the game, although logical and visual animations are not normally set independently. Furthermore, if you set too many animations in one turn, excess animations are simply ignored.
  • The (case) block, which can be used in two ways; one way is similar to the ON ... GOTO in BASIC, and the other way is like a lookup table to convert values instead of branching.
  • Text strings are normally used for display, but may also contain data (a \d escape) which is not displayed. Program code can read only the hidden data and not the displayed text. There is also a \q escape, which in addition to displaying a "quiz button", also affects the behaviour of the program, causing the input which dismisses the message to be handled by the game; if there is no \q escape, then the input to dismiss the message is ignored.
  • The fork ... else ... then block first executes the part before else and then the part after then; after returning from the subroutine (including implicitly at the end of the message block) then the part after else is executed, and then it continues again with the part after then.
  • The data types are: numbers (32-bit integers; whether they are treated as signed or unsigned depends on the operator used on them, and is not inherent to the value or to the type), classes (you cannot modify or create new classes at runtime, but can check classes of existing objects and can create new objects of a dynamically specified class), messages (dynamic dispatch is possible), strings, objects, sounds, marks (a singleton type), arrays (which cannot be dynamically allocated, although you can create slices which alias existing memory), and links (like an anonymous function, but there is no closure (you cannot save values)). All of them meet all four of Popplestone's criteria for first-class (they can be the actual parameters of functions, returned as results of functions, be the subject of assignments, and can be tested for equality), except sounds, which meet only the first three (attempting to test a sound for equality with anything is an error).
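The effect of the flip instruction described above can be pictured with a small Python sketch. It models the stack as a list and the mark as a sentinel object; what Free Hero Mesh itself does when no mark is present is not covered here (this sketch simply raises ValueError), and the names are illustrative.

```python
MARK = object()   # stand-in for the singleton mark value

def flip(stack):
    """Reverse, in place, everything above the most recent mark."""
    top_mark = len(stack) - 1 - stack[::-1].index(MARK)
    stack[top_mark + 1:] = stack[top_mark + 1:][::-1]
    return stack

stack = [0, MARK, 1, 2, 3]
flip(stack)       # the items above the mark become 3, 2, 1
```

Values below the mark, and the mark itself, stay where they are, so the "list" is just the run of stack entries above the mark.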

Possibly the most unusual feature is the preprocessor:

  • The built-in {edit} macro alters the first token of the definition of another macro. Unlike {define} and {append}, it is possible to add expanded tokens to a macro definition in this way. One use of this is to make auto-incrementing counters in the preprocessor.
  • The built-in {cat} macro concatenates several tokens (not necessarily strings), stripping out their sigils (if they have any), and produces a string.
  • The built-in {make} macro converts a string into another kind of token; it takes another argument to specify what kind.

The following code is a quine in Free Hero Mesh:

(Control(INIT"(Control(INIT%c%s%c34 over 34(PopUp 3)))"34 over 34(PopUp 3)))

The following code is a tag system implemented in the preprocessor of Free Hero Mesh:

{define "skip" {call \2}}
{define "1" {skip \1|"3"|"3"|"2"|"1"|"H"}}
{define "2" {skip \1|"3"|"3"|"1"}}
{define "3" {skip \1|"3"|"3"}}
{define "H" \1}
{call "2"|"1"|"1"}
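To follow what the macros above compute, here is a Python sketch of a 2-tag system with the same productions. It rests on my reading of the macros, which is an assumption: the called name is the head symbol, "skip" drops one further symbol, the listed strings are appended, and "H" halts and yields whatever remains. `RULES` and `run_tag` are names made up for this example.

```python
# Productions as read from the macro definitions (an interpretation).
RULES = {
    "1": ["3", "3", "2", "1", "H"],
    "2": ["3", "3", "1"],
    "3": ["3", "3"],
}

def run_tag(word, max_steps=1000):
    """Delete two symbols, append the head's production, stop at H."""
    word = list(word)
    for _ in range(max_steps):
        if not word or word[0] == "H":
            return word[1:]                  # what remains after the halt symbol
        head = word[0]
        word = word[2:] + RULES[head]
    raise RuntimeError("step limit reached")

final = run_tag(["2", "1", "1"])
```

Under this reading the word goes 211, 1331, 313321H, 3321H33, 21H3333, H3333331, and then halts.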

FurryScript

FurryScript is a domain-specific programming language (which may be considered esoteric). Although the author believes it works well for its purpose, it has a few strange things compared to other programming languages with a similar purpose.

Some of these features include:

  • No negative integer literals (although negative numbers are possible). (This is true in some other programming languages too, but the syntax makes negative integer literals unnecessary in many programming languages; this is not the case here.)
  • No scalar variables; only variables containing lists of scalars
  • Text strings can contain subroutine calls, continuations, and references to list variables
  • Picking a random entry from a list is done implicitly when a string references it, so you do not need a command to do this explicitly
  • Subtraction, but no addition
  • The return value of a subroutine is "OK", "bad", or "very bad"

The collection of operations is unusual; here are a few:

  • Discard the top value of the stack with a given probability
  • Pop two values from the stack, and discard one more if those two don't match
  • A sophisticated dice-rolling function

Glulx

Some features of Glulx are unusual for assembly language, even if some of them are stuff you might ordinarily find in high-level languages. Some of its features include:

  • The amount of bit shifting can be out of range and it will work. The bit shift amount is considered to be unsigned, so if it is not in the range 0 to 31, then all bits will be shifted out. This means that, for example, you can write "ushiftr 1,$,$" to convert 0 to 1 and other numbers to 0. (Some implementations of some programs already do this; Glulx guarantees it.)
  • There are five addressing modes: ROM, RAM, stack, local variables, and immediate. You can write to an immediate if that immediate is zero; doing so will discard the value to be written. (ROM and RAM are actually in the same address space, although they do have separate addressing modes.)
  • There are instructions for linear search, binary search, and linked search. You can use the linear search to implement the strlen function of C.
  • Glulx has a single calling convention for all programming languages (including assembly language), I suppose similar to what VAX does. There are two ways to call a subroutine, either with arguments on the stack, or with up to three arguments given as operands to the instruction. Each subroutine itself also specifies in the header how it receives its arguments (and how many local variables it has): either on the stack (which can be retrieved using stack operands or using instructions dealing with the stack), or copied into local variables. There is a built-in tail-call instruction (simply jumping won't work).
  • All numbers are big-endian in ROM and RAM, but use the native byte order in the stack. The stack is not addressable, so you cannot take the address of the stack or of local variables, and this byte order usually doesn't matter.
  • There is support for accelerated functions. You can tell the VM to replace calls to a function in your program with its own implementation. If the VM implementation does not implement that function, then your own function is called instead.
  • You can make VM saves, both in memory (in another separate address space which your program has no access to) and to I/O streams (which you can access). This includes not only the contents of RAM, but also the stack, program counter, etc. (Note that, unlike with Z-machine, you must open and close the I/O streams yourself.)
  • Since the ROM and RAM are in the same address space (and code is stored there), self-modifying code is possible. This makes it possible to implement dynamic linking; there is no built-in support for dynamic linking.
  • There are instructions for dealing with bit arrays of any length. Counting starts from the low bit of each byte, and then continues with the next byte and so on, so they are small-endian even though everything else is big-endian.
  • Most instructions deal with 32-bit data with any addressing mode. However, the copyb and copys instructions are an exception; they deal with 8-bit and 16-bit data being pointed to by ROM and RAM operands, but 32-bit data for stack operands.
  • Some instructions use relative branch offsets. If a relative branch offset is 0 or 1, then instead of branching, it will return from the currently executing subroutine.
  • It has built-in string compression. Compressed strings are stored in the same address space as everything else, and the table for controlling decompression of strings can be modified at runtime.
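Two of the points above, the out-of-range shift amounts and the low-bit-first bit arrays, can be sketched in Python. This is an illustrative model only: `ushiftr` mirrors the behaviour described for the unsigned shift, and `load_bit` is a made-up helper that handles non-negative bit numbers only.

```python
def ushiftr(value, amount):
    """Unsigned 32-bit right shift; out-of-range amounts shift everything out."""
    value &= 0xFFFFFFFF
    amount &= 0xFFFFFFFF              # the amount is treated as unsigned
    return value >> amount if amount < 32 else 0

def load_bit(memory, base, bit_number):
    """Read a bit, counting from the low bit of the byte at `base`."""
    byte = memory[base + bit_number // 8]
    return (byte >> (bit_number % 8)) & 1
```

With these definitions, `ushiftr(1, x)` is 1 when x is 0 and 0 for every other x, which is the zero test mentioned in the first item.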

JOSS

JOSS is a programming language from 1963.

  • Its line numbers have a major and minor part, rather than being a single number.
  • There is a direct mode like BASIC, although unlike BASIC, direct commands and numbered commands were both saved in the file, so direct commands could be repeated when the saved program was loaded.
  • It has ranges in the form of start(step)end as a first-class type.
  • There is a "parenthetical Do" command, which allows you to call a subroutine in direct mode without changing the main pointer; you could then cancel it by the "parenthetical Cancel" command.

MegaZeux

MegaZeux is a ZZT-like computer game. Some of its features include:

  • It has ZAP and RESTORE commands like ZZT.
  • The message passing also works like ZZT. (However, it is now possible to return from interrupts; returning from interrupts is optional.)
  • It is possible for an object to change its name at run time by the use of a special comment.
  • Some features (such as string manipulation and file I/O) require writing to variables to trigger some operations, or writing special values into variables to trigger some operations.
  • If you type a number in hexadecimal, it is automatically changed to decimal in the editor.
  • A program will stop executing after a certain number of commands are executed; it is possible for a program to change this number at run time.

MMIX

See MMIX (has Knuth read this yet?). Other features:

  • The no-operation instruction is not called "NOP" or "NOOP"; it is called "SWYM" instead. ("SWYM" is short for "Sympathize With Your Machinery".)
  • Forward references are resolved at loading time and not at compile time.
  • There is no relocatable code in MMIXAL.
  • There are some kinds of exotic stuff such as "sideways add" (a bit population count, but you can omit some bits from the count), "multiple or" (which can be used to convert endianness but can also be used for many other kinds of things), "multiple exclusive-or" (similar to multiple or; can be used for hash functions), and others, which can be useful even if exotic.
  • The most unusual feature of MMIX is its register stack.
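The "sideways add" operation mentioned above can be modelled in a few lines of Python. This sketch assumes 64-bit operands and the reading that a second operand masks out the bits to omit; `sideways_add` is an illustrative name rather than assembler syntax.

```python
MASK64 = (1 << 64) - 1

def sideways_add(y, z=0):
    """Count the bit positions where y has a 1 and z has a 0."""
    return bin(y & ~z & MASK64).count("1")
```

With z equal to zero this is an ordinary population count; setting bits in z removes those positions from the count.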

OASYS

OASYS is an old system for text adventure games. OASYS VM actually has two different programming languages, OAC (the original one) and OAA. The interpreter is OAI, and there is also a disassembler called OAD.

General features and VM features:

  • There are no general procedures, only methods that can be called on objects. (Only the initialization method can be called without an object.)
  • There is no support for arrays whatsoever (although a limited imitation of arrays is possible).
  • There are classes of objects, but all classes share the same properties and methods.
  • The types of values are integer, string, and object. Integer and string types are actually interchangeable, although variables are not typechecked at runtime.
  • The type is used at runtime to determine how to parse the user's input, to save/restore games, and to delete references to objects that have been destroyed.
  • Pointer variables cannot be stored in save games (but OAC does not support pointer variables); normally pointer values exist only on the stack. (This wouldn't be unusual in C, but this is a VM, and it automatically saves all global variables and objects in save games.)
  • The only operation on a string value is to print it (strings are represented as ID numbers into a constant string pool).
  • The only thing that can be allocated is an object, and all objects have the same set of properties and methods.
  • It is a stack machine, although there are no operations like Forth's DUP and DROP and so on. (A conditional branch to the immediately following statement can be a substitute for a DROP though.)
  • The standard interpreter crashes if there are no vocabulary words defined or if there are no properties defined.
  • Each argument of a method optionally has another method (called a selector) associated with it. If the user's input specifies a class of object as the argument value, the selector will be called on each object of that class, and the first one it returns nonzero for will be the chosen object.
  • Each method optionally has a string associated with it, which is displayed if it is being used as a selector and returns zero for all objects of the specified class.

OAC features:

  • There are no reserved words; any keyword can be used as the name of a class, variable, method, etc. A variable can even share the name of a method, or the name of a property with the name of a class, or whatever.
  • Static type checking is done to distinguish integers from strings even though they are actually interchangeable in the VM.
  • Everything must be defined before it is mentioned later in the program.
  • There are no delimiters between statements; when a token cannot be part of the current statement it will be a part of the next one.
  • No operators are needed for property accessing and method calls, nor are parentheses needed around the argument list or commas in between the arguments; just spaces will do.
  • No optimization is done by the compiler. If a string literal occurs more than once, it will compile multiple instances of that string literal into the binary (and there is no way to avoid this either; you can't define a constant). (The included documentation does document this feature.)
  • There is no support for include files or for multiple modules linked together.
  • If the name of a variable occurs in a phrase, that name will be included in the vocabulary list but the game will not actually understand any uses of that word. (The included documentation does document this feature.)
  • Pointer types are not possible at all (the compiler will only generate them temporarily on the stack before a DEREF or ASSIGN instruction; you cannot pass pointers anywhere).

OAA features:

  • OAA uses a much more terse (and strange) syntax than OAC, although it does add support for include files and macros, and several other features.
  • No type checking is done. A string literal is actually syntactically equivalent to a numeric literal, and is treated exactly like a numeric literal in all cases!
  • Names will have a prefix and/or suffix to indicate the types and so on (similar to BASIC and Perl, I suppose).
  • The name of a method or property or variable is allowed to be blank (other than the type prefix and/or type suffix) in OAA without causing problems. A method may also be anonymous and have no name at all. (The names &, &#, and %@ are examples of such names, and are used for predefined purposes; they are the same things called init, select_addressee, and player in OAC. Other blank names can be used for your own purposes.)
  • Unlike in OAC where a class must be defined before it is used, in OAA it is possible to use a class without ever defining it. The same is true of variables and properties. However, methods must be defined (but do not have to be defined before they are used; you can define them in any order).
  • Pointer types are possible (and can be passed anywhere), although they should not be used as the types of global variables or types of properties.

Example of Thue-Morse (including explanations):

[&]                 ; Declare the initialization procedure
%#1>                ; Initialize length counter
%@*>                ; Create first object
,#1>                ; Initialize loop counter
:                   ; Begin loop
  %@<.#<PI          ; Print current cell
  *.#%@<.#<NOT>     ; Create new cell
  %@%@<NXT>         ; Advance to next cell
  ,#,#<DN>          ; Decrement loop counter
  ,#</              ; Check if loop counter is now zero
    %#%#<2MUL>      ; Double length counter
    ,#%#<>          ; Reset loop counter
    %@FO>           ; Reset object pointer
    CR              ; Line break
|                   ; Repeat loop

See Deadfish for another example.

This happened to the author of this page once: I wanted to copy the DOS versions of OAC and OAI from one computer onto another one, but I had no disk nor a C compiler on the target computer, so instead I printed it out and completely rewrote OAC and OAI (and later, OAD and OAA) in BASIC. (Note: I did not invent OASYS, and the inventor did not document the VM; they only documented OAC. OAA is my own invention.)

Perl

  • Many kinds of blocks, such as if, can go either before or after a statement.
  • Some values can be treated as numbers or as strings depending on the context, and the value will be different in each case.
  • You can increment strings.
  • File handles are not an ordinary object but are something else.
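Incrementing a string carries within letters and digits, much like an odometer. The Python sketch below imitates that behaviour for strings consisting of letters followed by digits; it is a model of the idea, not Perl's implementation, and `perl_increment` is a name chosen for this example.

```python
import re

def perl_increment(s):
    """Sketch of ++ on a string matching /^[a-zA-Z]*[0-9]*$/."""
    if not s or not re.fullmatch(r"[a-zA-Z]*[0-9]*", s):
        raise ValueError("outside the pattern this sketch handles")
    chars = list(s)
    for i in range(len(chars) - 1, -1, -1):
        wrapped = {"z": "a", "Z": "A", "9": "0"}.get(chars[i])
        if wrapped is None:
            chars[i] = chr(ord(chars[i]) + 1)   # no carry needed
            return "".join(chars)
        chars[i] = wrapped                      # wrap around and carry left
    # The carry ran off the left end: extend with the same kind of character.
    extra = {"a": "a", "A": "A", "0": "1"}[chars[0]]
    return extra + "".join(chars)
```

Each character keeps its class (lowercase, uppercase, or digit) when it wraps, so "Az" becomes "Ba" and "zz" becomes "aaa".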

PL/I

Some of its features:

  • There are no reserved words. All keywords (even IF and DO) are not reserved.
  • An extremely sophisticated (and unusual for a non-Lisp system) preprocessor.
  • Apparently you can use a SQL statement anywhere a PL/I statement is allowed, and you can use UTF-16 but there is no support for UTF-8.

An early UNIX fortune file mentions the following:

  • You can allocate an array and then free the middle third.
  • You can multiply a character string by a bit string and assign the result to a float decimal.

POP-11

  • You can use a mixture of RPN and conventional notation (infix and functions).

PostScript

PostScript is often used for printable documents, although some users (such as myself) use it as a programming language (especially suitable for graphics). It is a stack-based programming language, like Forth and others, although with some unusual features.

Some of its features:

  • PostScript can be said to be both esoteric and non-esoteric, both general purpose and domain specific, and both text and binary.
  • There are no variables. Values are normally stored in dictionaries in the dictionary stack, and executing a name will look it up in the dictionary stack and execute whatever it refers to (or just push it to the operand stack, if it is not executable).
  • Procedures are just arrays, and can be manipulated just like any other array. Executing it involves executing each object it contains in order; if an object is not executable, it is just pushed to the operand stack. (Actually there is one exception: if a procedure contains an executable array, it is nevertheless not executed and is just pushed to the stack (although it remains executable); it must be executed explicitly. This is so that you can make flow control blocks.)
  • Whether or not an object is executable is independent of its type. You can use "cvx" and "cvlit" commands to make an object executable or not executable.
  • There is global memory and local memory. There are then VM saves, which save the contents of local memory; when a VM save is restored, all local memory is restored to its previous values while the global memory is left untouched.
  • One type of object is a mark, normally put in the stack to later make an array or dictionary from everything to the nearest mark, although it is an object like any other and can be stored in arrays, dictionaries, etc.
  • Many tokens have a binary representation. You can even mix text and binary tokens in the same program.
  • Normally the "bind" command replaces names of operators in a procedure with the operators themselves; however, if the procedure matches any procedures in the IdiomSet resource then it replaces the entire procedure with a different one instead.
  • The "currentfile" command accesses the source file; you can use this to read parts of the source file as data (or execute it possibly with a decompression filter).
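
The rules above about name lookup and procedures can be sketched in Python. This is only a toy model, not a PostScript interpreter: the class names Name and Proc, the functions invoke and encounter, and the two-operator system dictionary are all made up for illustration.

```python
class Name:
    """A name object; executable names are looked up, literal ones are pushed."""
    def __init__(self, text, executable=True):
        self.text = text
        self.executable = executable

class Proc(list):
    """An executable array (written { ... } in PostScript)."""

def lookup(name, dict_stack):
    # Search the dictionary stack from the top (last pushed) downwards.
    for d in reversed(dict_stack):
        if name.text in d:
            return d[name.text]
    raise KeyError(name.text)

def invoke(value, operands, dict_stack):
    """Execute an object reached indirectly (through a name or exec)."""
    if isinstance(value, Proc):
        for obj in value:
            encounter(obj, operands, dict_stack)
    elif isinstance(value, Name) and value.executable:
        encounter(value, operands, dict_stack)
    elif callable(value):
        value(operands, dict_stack)      # a built-in operator
    else:
        operands.append(value)           # anything else is just pushed

def encounter(obj, operands, dict_stack):
    """Handle an object met directly inside a procedure body."""
    if isinstance(obj, Name) and obj.executable:
        invoke(lookup(obj, dict_stack), operands, dict_stack)
    elif callable(obj):
        obj(operands, dict_stack)
    else:
        # Literals are pushed, and so is a nested Proc (the one exception).
        operands.append(obj)

def op_mul(operands, dict_stack):
    b = operands.pop()
    a = operands.pop()
    operands.append(a * b)

def op_exec(operands, dict_stack):
    invoke(operands.pop(), operands, dict_stack)

systemdict = {"mul": op_mul, "exec": op_exec}
userdict = {"double": Proc([2, Name("mul")])}    # like: /double {2 mul} def
dict_stack = [systemdict, userdict]

operands = []
invoke(Proc([3, Name("double")]), operands, dict_stack)   # like: 3 double
result_simple = list(operands)

operands = []
inner = Proc([4, Name("double")])
invoke(Proc([inner]), operands, dict_stack)      # the nested procedure is pushed
deferred = operands[0] is inner
invoke(Proc([Name("exec")]), operands, dict_stack)        # now it really runs
result_exec = list(operands)
```

The split into invoke and encounter is what models the exception: a procedure found through a name is run, while a procedure found directly inside another procedure is only pushed until something like exec asks for it.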

A quine in PostScript:

(dup == =)
dup == =

QUACKVM

Some of its features:

  • The program counter is memory-mapped (at address 0).
  • The memory is 32768 cells each storing a 16-bit number.
  • The only real flow-control instruction is CALL; anything else must be done using other instructions that manipulate cell 0 (for example, PUT ,,50 jumps the program flow to address 50).
  • Most instructions do more than one thing, for example MUL is actually a multiply, add, and test if zero, all in one instruction, while AND both does a bitwise AND of two numbers, and compares if the result is equal to a third number.
  • It uses bankswitching where each bank of the ROM or disk (the disk has bankswitching too in QUACKVM) has a set size and banks may be of any size.
  • There is one built-in instruction for decoding Huffman data. (This instruction has not been used so far, as far as I know.)
  • Every instruction can optionally store its result and can optionally have a conditional branch associated with it. For example, ADD with neither ends up doing nothing.
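
The memory-mapped program counter can be illustrated with a toy machine in Python. This is not QUACKVM: the tuple encoding, the OUT instruction, the two-operand PUT, and keeping the program apart from data memory are all assumptions made to keep the sketch short. It shows only that a store into cell 0 acts as a jump.

```python
def run(program, max_steps=100):
    """Run a toy machine whose cell 0 is the program counter.

    `program` maps addresses to tuples such as ("PUT", address, value);
    this encoding is hypothetical and is not QUACKVM's real format.
    """
    memory = {0: 1}            # cell 0 holds the address of the next instruction
    output = []
    for _ in range(max_steps):
        pc = memory[0]
        if pc not in program:
            break              # ran off the end of the toy program
        memory[0] = pc + 1     # advance first, so a store into cell 0 wins
        op, *args = program[pc]
        if op == "PUT":
            address, value = args
            memory[address] = value    # address 0 makes this a jump
        elif op == "OUT":
            output.append(memory.get(args[0], 0))
    return output

program = {
    1: ("PUT", 10, 7),
    2: ("PUT", 0, 5),      # jump to address 5
    3: ("PUT", 10, 99),    # never reached
    5: ("OUT", 10),
}
trace = run(program)
```

Because the jump is an ordinary store, no separate branch instruction is needed in this model.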

TECO

TECO is a very unusual text editor. (I have heard that Emacs was originally written in TECO, although now it uses Lisp.)

Some of its features:

  • Everything, including control characters, is a command.
  • It has 36 global and 36 local registers called "Q-registers".
  • Each register stores a string *and* a number.
  • There is a mode to display a lot of information whenever an error occurs, which is called "War and Peace mode" (I don't know why).
  • Normally it won't read past a CTRL+Z, but in "SUPER TECO mode" it does.
  • Strings can use any character you want to as a delimiter.
  • Search specifications are like regular expressions, but with control characters and some other worse stuff.
  • Numeric arguments come before a command; text arguments come after a command.
  • Instead of comments, you can use labels (goto targets).
  • You can modify commands by adding a colon in front. Usually this causes it to return 0 if it fails or -1 if it is successful, but some commands will do something else.

TeX

TeX is a (very good) typesetting system, but when you go beyond such things it can begin to get strange. It is guaranteed to be the same on all computers in all time periods; however some computers might run out of memory with some input files and/or execute them very slowly.

A number consists of individual tokens for each digit. Same thing with words; a dimension measured in points consists of a "p" token and a "t" token.

You can change the meaning of any character in the input, and of any control sequence (a token that begins with a control sequence introducer and then several letters). You can also configure which characters are letters, and make it different in different parts of the file; you can change which characters introduce a comment, which delimit parameters/groups, and what character is automatically inserted at the end of each line of input. And it does support trigraphs; you can change which character introduces a trigraph but it is always the same as the character to indicate a superscript in a math formula.

The stack is also unusual compared to other programming languages (at least most modern ones). Nearly all registers and macros and so on are global; subroutines do not have their own local variables. However, you can begin/end a group anywhere, and any changes made normally persist only until the end of the group, although you can tell a change to persist globally.
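
The grouping behaviour can be modelled roughly in Python. This is a sketch of the visible effect only, assuming a chain of scopes; TeX itself restores old values from a save stack. The class name Registers and its methods are invented for the illustration.

```python
class Registers:
    """Assignments are local to the current group unless marked global."""

    def __init__(self):
        self.scopes = [{}]            # scopes[0] is the outermost (global) level

    def begin_group(self):
        self.scopes.append({})

    def end_group(self):
        if len(self.scopes) == 1:
            raise RuntimeError("no group to end")
        self.scopes.pop()             # local changes are forgotten here

    def assign(self, name, value, global_=False):
        if global_:
            # A global assignment overrides every enclosing local value.
            for scope in self.scopes:
                scope.pop(name, None)
            self.scopes[0][name] = value
        else:
            self.scopes[-1][name] = value

    def get(self, name):
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)

regs = Registers()
regs.assign("count", 1)
regs.begin_group()
regs.assign("count", 2)                    # local to the group
inside = regs.get("count")
regs.assign("total", 10, global_=True)     # like prefixing with \global
regs.end_group()
after = regs.get("count")
total = regs.get("total")
```

Here the local assignment to count is undone when the group ends, while the global assignment to total survives it.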

For many things it can be useful to take advantage of features which are meant to do something else.

A golfed FizzBuzz program in TeX with typesetting output:

\newcount\-\let~\advance\day0\loop~\-1~\day1~\mit\ifnum\-=3\-0Fizz\fi\ifnum\fam=5Buzz\rm\fi\ifvmode\the\day\fi\endgraf\ifnum\day<`d\repeat\bye

ToonTalk

  • Any object can be "erased" so that it matches anything of the same type; it can later also be "unerased" to restore its original value.
  • The usual way to edit a program is (as far as I can tell from the documentation) to play back the program but to interrupt it at the point that you want changed. It is (probably) also possible to edit the XML representation, although the XML schema seems to be undocumented (the documentation only says that it is XML). (The HTML-based version uses JSON, although again the schema seems to be undocumented.)
  • It is possible for a program to alter the pattern of inputs that it expects, which is an object like any other one; if they are "erased" (see above) then they will match anything of the correct type.
  • A number can have an operation associated with it (addition by default). This is the operation that will be used when doing a calculation with that number.
  • To compare if one number is less or greater than another number, you will need a balancing scale. You can then match which way the scale is tilting.

TRON

  • The "additional addressing mode" allows an operand of a single instruction to have any number of additions and indirections. (For example, the C code y=*****x; can be implemented as a single instruction on TRON.)
  • There are instructions for inserting into, deleting from, and searching doubly linked queues. (VAX also has these, I think. Some of the other interesting features of VAX also seem to be in TRON, too.)
  • There are many instructions for dealing with fixed and variable length bit fields, including packing and unpacking, searching, and copying; they can also be used for some kinds of graphics operations.
  • There are two entry points to a C program, MAIN for starting by a message, and main for starting by command-line arguments.
  • TRON has a "hypertext file system", with "real objects" and "virtual objects".
  • File permissions can be read/write/execute by owner/group/others like UNIX has, but instead of one bit each, each one has a 4-bit number 0 to 15 indicating what user level is required to access that file (zero is the most privileged level). Each process has, in addition to the user and group, also the user level indicating which privilege it has.
  • The file system can include passworded files.
  • Files can have an expiry date.
  • Any file can potentially be any type of data (inside of a standardized container format, called TAD (TRON Application Databus)), and can have multiple records; the TAD main record contains all of the TAD data, but others can be used for fonts, execution, etc.
  • The left and right shift keys are different and produce different characters when pushed in combination with another key. There are also separate arrows (four ways each) on the left side of the keyboard and right side of the keyboard (I don't know what the significance is).
  • The "TRON Font Traceability System" apparently allows data to be interchanged among different character sets (but I do not know how it works; the documentation file seems to be missing).
  • There are special character codes for path delimiters, etc, rather than using the ordinary codes (like a slash) like most other operating systems do.
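
The permission scheme with 4-bit user levels can be sketched as follows. This is only my reading of the description above, not taken from TRON documentation: it assumes that the owner/group/others class is chosen the way UNIX does it, and that access is allowed when the process level is numerically less than or equal to the required level. The function name and the dictionary layout are hypothetical.

```python
def may_access(required_levels, file_owner, file_group,
               proc_user, proc_group, proc_level, kind):
    """Decide access from 4-bit user levels (0 is the most privileged).

    `required_levels` maps "owner"/"group"/"others" to a dict giving the
    level needed for "read", "write" and "execute". Hypothetical layout.
    """
    if proc_user == file_owner:
        who = "owner"
    elif proc_group == file_group:
        who = "group"
    else:
        who = "others"
    required = required_levels[who][kind]
    if not 0 <= required <= 15 or not 0 <= proc_level <= 15:
        raise ValueError("user levels are 4-bit numbers")
    # Assumption: a smaller number means more privilege.
    return proc_level <= required

levels = {
    "owner":  {"read": 15, "write": 15, "execute": 15},
    "group":  {"read": 8,  "write": 2,  "execute": 8},
    "others": {"read": 4,  "write": 0,  "execute": 4},
}
owner_write = may_access(levels, "ann", "staff", "ann", "staff", 15, "write")
group_read = may_access(levels, "ann", "staff", "bob", "staff", 5, "read")
group_write = may_access(levels, "ann", "staff", "bob", "staff", 5, "write")
other_write = may_access(levels, "ann", "staff", "eve", "guest", 0, "write")
```

Under these assumptions a required level of 15 admits everybody in that class and a required level of 0 admits only the most privileged processes.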

TUTOR

  • Its character set includes control characters for superscripts and subscripts, and the superscripts are used in the syntax of the programming language for exponentiation.
  • It has an "answer judging" control block, which starts with arrow, which also prompts for input. You can then specify the pattern matching commands to match the user's input, according to sequences of words, some of which may be optional and some of which may have multiple possibilities. There are multiple kinds of sub-blocks for specifying the patterns, e.g. answer allows continuing the program while wrong will ask the question again after it executes.
  • The answer judging will be able to automatically correct spelling errors (but you can use the specs command to control this feature), e.g. if "triangle" is expected, then "triangel" will also be accepted.
  • The join command is a variant of a subroutine call that is equivalent to textual substitution of the subroutine (presumably, similar to #include in C; so it can contain a part of a judging block); you can also use do for an ordinary subroutine call.
  • Control blocks other than judging blocks have mandatory indentation. However, unlike Python, the indentation is indicated by . rather than spaces, and the command at the end (e.g. endif) is still required, too.
  • It has graphics capabilities to draw lines, circles, text (including font sizes and rotated text), etc, on a monochrome display with 512x512 pixels.
  • There are many kinds of memory. One kind is "student variables" which are persistent but private to each user and to this program. There are also temporary and permanent common blocks; a common block shares memory between multiple instances of the program that multiple users are running.
  • You can define segmented arrays with whatever number of bits per byte that you want.
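
A segmented array with an arbitrary number of bits per item can be sketched in Python as fixed-width fields packed into one integer. This ignores how TUTOR lays the fields out across machine words; the class name and its methods are invented for the illustration.

```python
class SegmentedArray:
    """Unsigned fields of `bits_per_item` bits packed into one Python integer."""

    def __init__(self, bits_per_item, length):
        if bits_per_item < 1 or length < 0:
            raise ValueError("need a positive field width and a non-negative length")
        self.bits = bits_per_item
        self.length = length
        self.mask = (1 << bits_per_item) - 1
        self.storage = 0

    def _check(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)

    def __getitem__(self, index):
        self._check(index)
        return (self.storage >> (index * self.bits)) & self.mask

    def __setitem__(self, index, value):
        self._check(index)
        if not 0 <= value <= self.mask:
            raise ValueError("value does not fit in the field")
        shift = index * self.bits
        # Clear the old field, then merge in the new value.
        self.storage = (self.storage & ~(self.mask << shift)) | (value << shift)

cells = SegmentedArray(bits_per_item=3, length=20)
cells[0] = 5
cells[1] = 7
cells[19] = 2
cells[1] = 1      # overwrite without disturbing the neighbours
```

Twenty 3-bit items need 60 bits of storage in total, which is the kind of saving that makes odd field widths worthwhile.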

Unofficial-MagicKit Assembler

A 6502 assembler. Here are some of its features:

  • Nonstandard syntax. Uses square brackets for indirection, and uses a less-than sign to indicate zero-page addressing (rather than being implicit).
  • Support for stable unofficial opcodes. The author tends to find this feature useful.
  • Support for custom postprocessors and output-routines, written in 6502 assembly language (executed with a slightly modified version of lib6502).
  • Normally banks are fixed at 8K and only INCBIN can cross banks, although you can also define multiple banks to have the same name to make them contiguous.
  • Macros are text-based, although they can modify their own arguments and "go to" other macros; the only other flow control is "if" blocks.
  • Any expression can also ask which pass of the assembler is active (it uses two passes), as well as the number of errors and several other things.
  • It is even possible for assembly-language programs to interactively ask the user for input at compile-time.
  • Built-in support for NES/Famicom graphics and PC-Engine graphics.

For some examples of macros, see [1] and [2].

Uxn

  • Each instruction has, in addition to its ASCII name, also a hand gesture and a written sign.
  • The bit shift instruction (SFT) does both left and right shift together, first right and then left.
  • There is increment (INC) but not decrement (you can use #01 SUB instead, though).
  • There are two stacks, and all instructions except JSI and JCI can use either stack; one of the opcode bits controls this. Two instructions (JSR and STH) use both stacks; the opcode bit will exchange the use of the two stacks in that instruction.
  • Any instruction except JSI and JCI can optionally use "keep mode" to keep the operands on the stack. With stack manipulation instructions, this will keep the old data in its old order and then the reordered new data above it, e.g. DUPk will make two extra copies (while DUP makes just one copy), SWPk will be ( a b -- a b b a ), etc. (The JMI instruction cannot use the keep mode either, but it does not use the stacks at all.)
  • Some instructions take PC-relative addresses from the stack.
  • Nearly any instruction (except JCI, JMI, and JSI) has an 8-bit mode and a 16-bit mode, although the 16-bit mode really just operates on two 8-bit values and treats them as a single 16-bit value (this is often used in programs). Sometimes, this only affects some operands.
  • The assembler has some unusual features, including: all numbers are hexadecimal (there are no decimal numbers), string literals cannot have spaces in them (although you can add spaces by adding separate tokens with their ASCII codes, e.g. "Hello, 20 "World!), square brackets are ignored (anything inside is still parsed normally), etc.
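
The combined shift and the keep mode can be sketched in Python. The nibble layout used for SFT (low nibble is the right shift amount, high nibble is the left shift amount) is how I understand the Uxn reference; it is not stated above. The function names are only for illustration, and the stacks are plain Python lists.

```python
def sft(value, control, short=False):
    """Shift right by the low nibble of `control`, then left by the high nibble."""
    limit = 0xFFFF if short else 0xFF     # 16-bit mode or 8-bit mode
    right = control & 0x0F
    left = (control >> 4) & 0x0F
    return ((value >> right) << left) & limit

def swp(stack, keep=False):
    """SWP ( a b -- b a ); with keep=True, SWPk ( a b -- a b b a )."""
    a, b = stack[-2], stack[-1]
    if not keep:
        del stack[-2:]        # normal mode consumes the operands
    stack.extend([b, a])

def dup(stack, keep=False):
    """DUP ( a -- a a ); with keep=True, DUPk ( a -- a a a )."""
    a = stack[-1]
    if not keep:
        stack.pop()
    stack.extend([a, a])

left_only = sft(0x34, 0x10)                  # left by one
right_only = sft(0x34, 0x01)                 # right by one
both = sft(0x1248, 0x34, short=True)         # right by four, then left by three

swapped = [1, 2]
swp(swapped, keep=True)
copies = [7]
dup(copies, keep=True)
```

Keep mode falls out naturally here: the operands are read without being removed, and the result is pushed as usual.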

VAX

VAX uses a very orthogonal instruction set from what I have read.

  • You can write to immediates; they aren't read-only. (On NMOS 6502, writes to immediates don't work, and cause them to read instead.)
  • If a field is too big for one register, it uses the next register too.
  • There is no AND, but it does have AND NOT.
  • You can define your own microcodes and then call them by the XFC instruction.
  • It has one instruction that has a separate instruction set for converting BCD to ASCII.
  • Some instructions have a variable number of operands; it is impossible to know how many there are, or how long they are, until the instruction is fully decoded (and it may fill the entire address space).
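
Having AND NOT but no AND is not a real limitation, since an ordinary AND is AND NOT applied to a complemented mask. A small Python sketch on 32-bit values follows; the function names bic and mcom are borrowed from my recollection of the VAX bit-clear and move-complemented mnemonics, and the sketch does not model operand order or condition codes.

```python
MASK32 = 0xFFFFFFFF

def bic(mask, source):
    """AND NOT: clear in `source` every bit that is set in `mask`."""
    return source & ~mask & MASK32

def mcom(source):
    """One's complement of a 32-bit value."""
    return ~source & MASK32

def and_via_bic(a, b):
    """Ordinary AND, built from a complement followed by AND NOT."""
    return bic(mcom(a), b)

example = and_via_bic(0b1100, 0b1010)
```

When the mask is a constant, the complement can be done at assembly time, so the AND costs no extra instruction.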

Z-machine

There is 64K of addressable memory and the rest of it can contain only instructions and text strings (the first 64K also can), and the stack is not stored in this memory space, but registers are (this is the opposite of most CPU architectures).

There are many features Infocom has documented but never used, as well as many optimizations that could be made but weren't. Some features have been mentioned in the documentation but neither documented enough to use them nor ever used nor implemented, such as joystick interrupts and XZIP menus. (There is one picture file format they documented, which seems to have never been used for any games, and I have only ever seen a single file in that format; I know of no implementation other than my own (part of Farbfeld Utilities).)

The ORIGINAL? instruction is supposed to check if the game disk is the original rather than a pirated copy, although it is unclear how the interpreter is possibly supposed to know. Infocom has never used this instruction in any of their games though.

Although Infocom did not use it, the BCOM instruction with an immediate operand would actually be a more efficient way to write a long number (i.e. one that doesn't fit in 8 bits) into a local or global variable (in ZIP and EZIP only, not in XZIP and YZIP). For example, "SET 31,-1" encodes as "CD 4F 1F FF FF" which is 5 bytes long; "BCOM 0 >VAR31" would have the same effect and encode as only three bytes. In XZIP, figuring out the most efficient way may involve the compiler doing prime factorization (although not of numbers longer than 15 bits).
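
The arithmetic behind the BCOM trick can be checked with a few lines of Python. The sketch assumes 16-bit two's complement words and that a short immediate operand holds an unsigned 8-bit value; the function names are made up.

```python
def bcom(value):
    """Bitwise complement of a 16-bit word."""
    return ~value & 0xFFFF

def to_signed(word):
    """Interpret a 16-bit word as a two's complement number."""
    return word - 0x10000 if word & 0x8000 else word

def reachable_by_bcom():
    """Signed values obtainable by complementing an 8-bit immediate (assumed 0..255)."""
    return [to_signed(bcom(n)) for n in range(256)]

minus_one = to_signed(bcom(0))                   # what BCOM 0 stores
set_length = len(bytes.fromhex("CD 4F 1F FF FF"))  # the SET encoding quoted above
values = reachable_by_bcom()
```

Under these assumptions the trick covers exactly the constants from -256 to -1, which are the negative numbers that would otherwise need a two-byte operand.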

The documentation for the PTSIZE instruction (which is used to determine the length of a property) says that it is "guaranteed to return a meaningless answer if given any other kind of table". (It isn't actually true; it simply looks at the byte preceding the address given and performs a simple calculation on it, although this calculation differs between ZIP and EZIP.)

There are no absolute jumps, although there is absolute function call. Jumps can only be relative, although unconditional jumps can be computed rather than constant.

Although there is no bitwise XOR instruction (and bitwise shifts were not added until XZIP), the similar "DIP" (also invented by Infocom, and only ever used for one game, as far as is known) does have one.

ZZT

ZZT is an old DOS computer game in which you can make up your own game worlds. The source code was lost (or, at least they thought so; much later they actually managed to find it), but in 2020 someone managed to rewrite it (the rewritten source code produces the same executable file as the original). It is said to be object-oriented, but does not have many features of object-oriented programming languages (although it does have message passing).

Some of its features include:

  • The #ZAP and #RESTORE commands are used to change which of several labels of the same name are used when a branch to that label name is done. Initially, the first label of that name is "active"; #ZAP makes the next one "active" for branching to, etc.
  • The only arithmetic is adding a constant to one of several built-in variables, or to try to subtract a constant and check if it was successful (if it would go below zero, it keeps its current value instead).
  • Message passing is by name; it is possible for multiple objects to have the same name, in which case sending a message to that name will send to all objects with that name.
  • An object can be interrupted by receiving messages (unless it does not implement the message, in which case the message is simply ignored); there is no way to return from interrupts, but you can disable interrupts (by the #LOCK command).
  • Lines without # are text which will be displayed, and can include hyperlinks, both within the object's code and to external files.
  • The #BIND command causes an object's code to be replaced by that of another object.
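
The effect of #ZAP and #RESTORE on duplicate labels can be modelled in Python. This is a sketch of the behaviour described above, not of ZZT's implementation; in particular it assumes that #RESTORE reactivates every zapped label of the given name, and the class and method names are invented.

```python
class ObjectLabels:
    """Labels of one object's code, in program order, each active or zapped."""

    def __init__(self, names):
        self.labels = [[name, True] for name in names]

    def target(self, name):
        """Index of the label that a branch to `name` would reach, or None."""
        for index, (label, active) in enumerate(self.labels):
            if label == name and active:
                return index
        return None           # no active label of that name

    def zap(self, name):
        """Deactivate the currently active label, exposing the next one."""
        index = self.target(name)
        if index is not None:
            self.labels[index][1] = False

    def restore(self, name):
        # Assumption: every zapped label with this name becomes active again.
        for entry in self.labels:
            if entry[0] == name:
                entry[1] = True

labels = ObjectLabels(["touch", "shot", "touch", "touch"])
first = labels.target("touch")
labels.zap("touch")
second = labels.target("touch")
labels.zap("touch")
third = labels.target("touch")
labels.zap("touch")
none_left = labels.target("touch")
labels.restore("touch")
restored = labels.target("touch")
```

This is enough to make an object react differently the first, second, and third time it receives the same message.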