Tip

From Esolang
Jump to navigation Jump to search

Tip is an esoteric programming language created by User:ais523 in 2018, as a minimization of iterated Collatz functions (which were already pretty minimal to start with!). It allows for very terse implementations in many languages which have primitives for handling rational numbers. The name is an acronym for "times instruction pointer".

As the language is intended as a target language for interpreter golf, it has no fixed syntax (interpreters are expected to use whatever encoding of the program is most convenient, which probably will not even be a sequence of bytes, and as such Tip programs cannot be stored in a file without an additional encoding step). However, it is likely to be trivial to convert between the syntaxes required by various interpreters. (As such, the language falls into a similar category as languages like Algol, which also just defined a list of syntactic elements the language needed and left it up to the various implementations as to how to spell them.)

Specification

A Tip program consists of a list of commands, plus an initial value for the instruction pointer (which must be a positive integer). There are two commands that can appear within the command list; goto commands and the halt command (which ends the program). A goto command takes a positive rational number as its argument, and is written simply by writing the argument in question in whatever syntax the implementation requires. (As a special case, using 0 or 1 as a goto command argument is syntactically invalid; this enables implementations to allow either of these values to be used as a special case for representing the halt command without making the implementation nonportable to Tip programs intended for other interpreters. Interpreters need not be able to diagnose this particular syntax error, or indeed any form of syntax error.) A halt command will also typically be represented as a number, but the choice of number will depend on what is most convenient for the implementation; however, it must be something that's invalid as a goto command argument (i.e. 0, 1, a negative number, an infinite number, an irrational number, or an imaginary/complex number), thus ensuring that it can always be distinguished from a goto command.

Unusually, the program is seen not as a finite entity, but as covering the entirety of an infinite memory space; the program is conceptually repeated an infinite number of times to create an infinitely long program (in practice, implementations will instead use only one copy of the program and perform a modulo operation on the instruction pointer, thus saving the need to make an infinite number of copies before starting).

The instruction pointer starts with the initial value specified by the program. (Depending on the implementation, the language might be 0-indexed or 1-indexed, i.e. command 1 might be the second or first command of the program; it's trivial to convert programs between these two language dialects by moving the first command to the end or vice versa.) When executed, a goto command multiplies the instruction pointer by its argument, i.e. it can be seen as goto *(ip * a); or as ip *= a;; there's no implicit increment of the IP (i.e. whichever line the goto command goes to, that's the line which will execute next). The halt command halts the program when executed. The language has no RAM or other similar mutable addressable memory; the only data used are the program (which never changes) and the instruction pointer (which is just a number).

Allowing the instruction pointer to become anything other than an integer is undefined behaviour. (In order to prevent such undefined behaviour occurring, a Tip programmer can ensure that the denominator of each goto instruction is divisible by both its location within the program and the number of commands in the program, thus ensuring that whichever copy of the instruction is run, the IP will be divisible by the instruction's denominator and thus the result of the multiplication will be an integer.)

Relationships to other languages / language families

Tip is almost an OISC. It can be seen as an OISC either via treating the "halt argument" as a special case, by using 0 as a halt argument and "memory-mapping" halt behaviour, or by using an imaginary number as the halt argument to effectively move the IP "outside the bounds of the program"; even though the program is being treated as infinite, it's only infinite along a single line in the complex plane.

Tip is also a special case of a Collatz function (specifically, the special case where all si are 0); halting behaviour for Collatz functions is normally defined as entering a trivial infinite loop, in which case either 0 or 1 works as a halt instruction.

Computational class

Tip is Turing complete, which can be seen via implementing a Minsky machine with two counters. We use a construction in which the Minsky machine commands can be an increment of either counter, an unconditional decrement of either counter (undefined behaviour upon an attempt to decrement zero), or a zero-test (which goes to one of two states depending on whether a particular counter is zero).

Let p be a prime number that's a) at least 7, b) at least 2 more than the number of states of the Minsky machine. (Each state of the Minsky machine, including the halt state, is given an arbitrary number from 2 to p-1 inclusive.) Let x equal 2p-1, and y equal 5p-1; and define F(s, t) as the set of positive integers which are equal to 1 mod 10 and to t÷s mod p (where the division symbol refers to finite field division mod p); by the Chinese Remainder Theorem, such a set exists and has infinitely many elements, because p and 10 are coprime. Let f(s, t) be the smallest element of F(s,t).

At any given point in the program's execution (after the first Tip command has run), our invariant is that the Tip instruction pointer will equal x to the power of the first counter, times y to the power of the second counter, times an element of F(1, t) (where t is the number of the Minsky machine's current state).

The compiled Tip program consists of 10p instructions, with initial instruction pointer value 1, and the instruction in position n of the Tip program is determined as follows:

  • If n mod p is 0, the instruction is irrelevant (this construction never allows the instruction pointer to divide p).
  • If n mod p is 1, the instruction is a goto instruction, and its argument is x to the power of the first counter, times y to the power of the second counter, times f(1, t) (where t is the Minsky machine's initial state). This obviously means that at the start of the program (where the IP has value 1), the simulated Minsky machine will be initialised appropriately.
  • If n mod p is the number of the Minsky machine's halt state, the instruction is a halt instruction.
  • If n mod p is the number of a state that increments the first counter, the instruction is a goto instruction, and its argument is xf(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it).
  • If n mod p is the number of a state that increments the second counter, the instruction is a goto instruction, and its argument is yf(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it).
  • If n mod p is the number of a state that decrements the first counter, the instruction is a goto instruction, and its argument is f(s,t)/x (where s is the number of the state being compiled, and t is the number of the state that follows it).
  • If n mod p is the number of a state that decrements the second counter, the instruction is a goto instruction, and its argument is f(s,t)/y (where s is the number of the state being compiled, and t is the number of the state that follows it).
  • If n mod p is the number of a state that zero-tests the first counter, and n mod 2 is nonzero, the instruction is a goto instruction, and its argument is f(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it when the first counter is zero).
  • If n mod p is the number of a state that zero-tests the first counter, and n mod 2 is 0, the instruction is a goto instruction, and its argument is f(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it when the first counter is positive).
  • If n mod p is the number of a state that zero-tests the second counter, and n mod 5 is nonzero, the instruction is a goto instruction, and its argument is f(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it when the second counter is zero).
  • If n mod p is the number of a state that zero-tests the second counter, and n mod 5 is 0, the instruction is a goto instruction, and its argument is f(s,t) (where s is the number of the state being compiled, and t is the number of the state that follows it when the first second is positive).

It can be seen that, given our invariant for how the Tip IP and Minsky machine memory state relate, each Tip instruction will have exactly the same effect as the corresponding Minsky machine instruction. (By Fermat's Little Theorem, x and y both equal 1 mod p, and all f values are 1 mod 10, so multiplying or dividing by x or y won't change the value of the instruction pointer mod p, and multiplying by an f value won't change number of multiples of 2 or 5 that exist within the instruction pointer, and so we can treat the "two parts" of the IP's state independently.)

Note that more efficient constructions are available: for example, it's possible to inline a whole sequence of increments and decrements into a single Tip instruction. However, the simple construction above is enough to demonstrate the computational class of the language.

Variations

Although Tip is defined in terms of rational numbers, it's still Turing-complete if restricted to (bignum) decimal fixed-point numbers with sufficiently many decimal places; in the construction above, all the goto commands used are terminating decimals. It is not Turing-complete when using (binary) floating-point for the instruction pointer rather than an integer; even allowing for a bignum exponent, it becomes a one-counter machine, as only finitely many mantissas exist, and for any given mantissa, there's no way to distinguish more than finitely many values for the exponent via which command is running. (This in turn means that in most languages, it's difficult to accept the program as floating point, as the instruction will typically be "promoted" to floating point as a consequence.)

It's fairly simple to add batch I/O to the language. One simple construction that fits into the spirit of the rest of the language is to start at minus the IP's stated initial value, and to negate it (i.e. multiplying by -1) after a number of commands have run equal to the input (which must be a positive integer); and when the program halts, to output the number of times copies of the same command were run in a row immediately before halting (this will also be a positive integer); when input is not in use, this is equivalent to an (otherwise invalid) input of 0 (i.e. the IP gets negated twice before the program even runs).

The requirement to allow programs to specify the initial instruction pointer is not necessary (and is added only because taking it from input is often terser than requiring it to be hardcoded); always starting the instruction pointer at 1 is still sufficient for Turing-completeness, as shown by the construction above.

Example

Here's a simple program to double a number (it doubles 4 when I/O is not used, or the input when I/O is used):

  • Initial IP: 1
  • Commands: [1/4, 196608, 16, halt, 3, 16]

and here are debug traces, produced using the Perl implementation below:

With no I/O
Initial IP: 1
Program: [1/4 196608 16 0 3 16]
IP 1: running command: 196608 (index 1 of program)
IP 196608: running command: 1/4 (index 0 of program)
IP 49152: running command: 1/4 (index 0 of program)
IP 12288: running command: 1/4 (index 0 of program)
IP 3072: running command: 1/4 (index 0 of program)
IP 768: running command: 1/4 (index 0 of program)
IP 192: running command: 1/4 (index 0 of program)
IP 48: running command: 1/4 (index 0 of program)
IP 12: running command: 1/4 (index 0 of program)
IP 3: running command: 0 (index 3 of program)
Doubling 5
Initial IP: 1
Program: [1/4 196608 16 0 3 16]
IP -1: running command: 16 (index 5 of program)
IP -16: running command: 16 (index 2 of program)
IP -256: running command: 16 (index 2 of program)
IP -4096: running command: 16 (index 2 of program)
IP -65536: running command: 16 (index 2 of program)
IP 1048576: running command: 3 (index 4 of program)
IP 3145728: running command: 1/4 (index 0 of program)
IP 786432: running command: 1/4 (index 0 of program)
IP 196608: running command: 1/4 (index 0 of program)
IP 49152: running command: 1/4 (index 0 of program)
IP 12288: running command: 1/4 (index 0 of program)
IP 3072: running command: 1/4 (index 0 of program)
IP 768: running command: 1/4 (index 0 of program)
IP 192: running command: 1/4 (index 0 of program)
IP 48: running command: 1/4 (index 0 of program)
IP 12: running command: 1/4 (index 0 of program)
IP 3: running command: 0 (index 3 of program)
10

The program works by using the exponent of 4 to store the number being doubled, and the value of the instruction pointer mod 3 as a state machine.

Implementations

Perl

Here's an implementation in Perl, which sends copious debug output to stderr, and takes input from stdin but will run the program without input if given an empty file on stdin. It uses 0-indexing for the program, places the initial IP before the program, and represents halt as 0.

use warnings;
use strict;
use Math::BigRat;

my $ip = Math::BigRat->new(scalar <>);
my @program;
while (defined ($_ = <>)) {
    if (/H/) {
        push @program, 0;
    } else {
        push @program, Math::BigRat->new($_);
    }
}

print STDERR "Initial IP: $ip\n";
print STDERR "Program: [@program]\n";

my $input = <>;
my $inputdefined = defined $input;
my $lastci = .5;
my $output = 0;

while ($ip) {
    my $eip = $input ? -$ip : $ip;
    $eip < 0 and $input--;
    my $ci = $eip % @program;
    $ci < 0 and $ci += @program;
    my $cmd = $program[$ci]->copy();
    print STDERR "IP $eip: running command: $cmd (index $ci of program)\n";
    $ip->bmul($cmd);

    if ($ci == $lastci) {
        $output++;
    } elsif ($ip) {
        $output = 1;
        $lastci = $ci;
    }
}

$inputdefined and print "$output\n";

See also

External resources