Infinite Vector

From Esolang
Jump to: navigation, search

Infinite Vector is an esoteric programming language created by User:ais523 in 2015. It is based heavily around the concept of vectors. Contrary to its name, the vectors are not infinite; the "Infinite" in the name is merely to sound cool.

Note: this language is still something of a work in progress. If I ever try to implement it, I may end up changing some details to make doing so easier, and if I ever try to write programs in it, I may end up changing some details to make doing that easier.

Data types and storage

All data types in Infinite Vector are vector types: that is, they are a sequence of elements, each of which has the same data type. Vectors can be arbitrarily long (but must be finitely long). Because all data types are vectors, there is no need to specify "vector of" or the like in the name of a type, it's just assumed.

The types are as follows:

  • `flag`: vector of 1-bit unsigned integers
  • `byte`: vector of 8-bit unsigned integers
  • `word`: vector of 16-bit unsigned integers
  • `sentence`: vector of 32-bit unsigned integers
  • `paragraph`: vector of 64-bit unsigned integers
  • `page`: vector of 128-bit unsigned integers
  • `chapter`: vector of 256-bit unsigned integers
  • `char`: vector of 8-bit signed integers
  • `short`: vector of 16-bit signed integers
  • `long`: vector of 32-bit signed integers
  • `long long`: vector of 64-bit signed integers
  • `longer long`: vector of 128-bit signed integers
  • `long long long`: vector of 256-bit signed integers
  • `sink`: vector of 16-bit floating-point numbers
  • `float`: vector of 32-bit floating-point numbers
  • `double`: vector of 64-bit floating point numbers

A program can use any number of variables (each of which is a vector), each of which has a name (made out of lowercase ASCII characters and underscores; and any sequence of characters that corresponds to this syntax, if not a type name, is a variable name (or part of one) and is used for no other purpose). A single variable can change type during the execution of the program; whatever type is stored into the variable, the variable changes type to match. However, if a program reads from a variable before its first non-initialization write to that variable, then that variable must have the same type at the end of every iteration of the main loop and at every flow control command. (Because all commands treat types deterministically, the type of each variable at each point in the program can thus be determined statically.)

There are also "implicit vectors" which have all elements the same; the value of their elements is the same as their name (which must be an integer or floating-point number, according to the type). The type and length of an implicit vector is inferred from context (causing a compile error unless exactly one type and length is legal in that context). For example, a = a + 1 would add 1 to every element of a; this is conceptually an element-wise addition that adds corresponding elements of a, and a vector 1 that has the same number and type of elements as a does, all of which have value 1. Implicit vectors cannot be assigned to.

Commands

An Infinite Vector program consists of a sequence of commands. The vast majority of commands operate on vectors; there is almost no flow control. The entire program is placed in an implicit infinite loop (that can only be exited via exiting the program directly). Commands are separated by newlines.

Arithmetic

Arithmetic operations act element-wise on two vectors, which must have the same type and length, and produce a third vector as output (which can be assigned to either of the source variables or to a third variable), with the same type and length as the inputs (except in a few special cases, explained below). Infinite Vector compilers are allowed to reorder and rearrange sequences of arithmetic operations to mathematically equivalent sequences, even if this would produce different results (due to overflow behaviour or floating point rounding); to suppress this, place a non-arithmetic operation between the two operations that you don't want reordered.

a = b + c
a = b +? c

Addition adds corresponding elements of its input vectors. Floating-point overflow gives an appropriate infinity. There are two variants, which differ in their integer overflow behaviour (and work the same way on floating-point numbers): + wraps to the other end of the range (giving correct behaviour modulo 2 to the power of the number of bits), whereas +? returns the largest/smallest integer for positive/negative overflow.

a = b - c
a = b -? c

Subtraction subtracts corresponding elements of its input vectors. It has two variants, analogous to those for addition.

a = b * c
a = b *+ c
a = b *- c

Multiplication multiplies together corresponding elements of its input vectors. On floating-point numbers, all three operations are equivalent. On integers, the three variants differ in how they handle overflow: *- returns the value modulo 2 to the power of the number of bits, *+ returns the value divided by 2 to the power of the number of bits (rounding so that the modulus will work correctly), and plain * unconditionally doubles the width of the data type (regardless of the magnitude of the result) so that overflow is impossible.

a = b / c

Division divides corresponding elements of its input vectors. In most Infinite Vector implementations, inexact results will round towards zero, although an implementation could make the rounding behaviour customizable via command-line options. It only works on floating-point numbers. Division by zero produces NaN.

a = // b

Square root sets each element of the output vector to the square root of the corresponding element of the input vector (the square root of a negative number is NaN). The input must have a floating-point type.

a = /// b

Inverse square root sets each element of the output vector to the reciprocal of the square root of the corresponding element of the input vector (the inverse square root of a negative number is NaN, or possibly negative infinity if using a buggy processor stares at SSE, and of zero is infinity). The input must have a floating-point type.

a = b & c

Bitwise AND combines corresponding bits of corresponding elements of its input vectors, which must have an unsigned integer type, such that two 1 bits combine to form a 1 bit and other combinations combine to form a 0 bit.

a = b | c

Bitwise OR combines corresponding bits of corresponding elements of its input vectors, which must have an unsigned integer type, such that two 0 bits combine to form a 0 bit and other combinations combine to form a 1 bit.

a = b ^ c

Bitwise XOR combines corresponding bits of corresponding elements of its input vectors, which must have an unsigned integer type, such that two equal bits combine to form a 0 bit and two unequal bits combine to form a 1 bit. (This operation can also be used to complement every bit in a vector, as a = 1 ^ b.)

a = b << c

Left shift multiplies each element of its input vector, which must have an integer type, by two to the power of the corresponding element of the second input vector, to produce an output vector. Any overflow is lost (i.e. the result is reduced modulo 2 to the power of the number of bits).

a = b >> c

Right shift divides each element of its first input vector, which must have an integer type, by two to the power of the corresponding element of the second input vector, to produce an output vector. Rounding is towards negative infinity.

a = b <@< n

Left rotate is similar to a left shift, except that any bits that overflow past the top of the integer are not lost, but rather are divided by 2 to the power of the number of bits and added onto the result. Additionally, the right hand argument isn't a variable or implicit vector; it's just an integer, that is used as the right-hand argument for every element. The input vector must have an unsigned integer type.

a = b >@> n

Right rotate is similar to a right shift or left rotate, just with the obvious changes. It only works on unsigned integer types, and bits shifted off the LSB end of the integer are added on at the MSB end.

a = b <? c

Minimum uses the smaller of the corresponding elements from the input vectors as the output. (On a flag vector, this is equivalent to &.)

a = b ?> c

Maximum uses the larger of the corresponding elements from the input vectors as the output. (On a flag vector, this is equivalent to |.)

Conditionals

These instructions act element-wise on vectors of the same length. Depending on the instruction, some of the vectors are flags, and the others must all have the same type.

a = b ? c : d

The choosing instruction looks at the bits in a flag vector (b in this case), and uses them to determine which of the two possible corresponding elements in the vectors c and d to place in the return vector. If an element of b is 1, then the corresponding element of c will be returned as the corresponding element of a. If an element of b is 0, then the corresponding element of d will be returned as the corresponding element of a.

a = b < c
a = b = c
a = b > c
a = b <= c
a = b => c
a = b <> c
a = b <=> c
a = b !< c
a = b != c
a = b !> c
a = b !<= c
a = b !=> c
a = b !<> c
a = b !<=> c

There are fourteen comparison operations, as shown above. Each compares corresponding elements of two vectors given as arguments, and returns a flag vector. The return bits are 0 for false and 1 for true if there is no ! in the operator name, or 1 for false and 0 for true if there is an ! in the operator name. The comparison itself is true if < appears in the operator name and the first argument is less than the second; or if = appears in the operator name and the first argument is exactly equal to the second; or if > appears in the operator name and the first argument is greater than the second. (Note that NaN is not less than, equal to, or greater than anything, and nothing is less than, equal to, or greater than it. Thus a comparison like <=> is potentially useful, at least in that it won't always return true. It isn't very useful on integers, though.)

Changing types and lengths

In order to be able to mix vectors of different types, and different lengths, the following operations exist:

a = (type)b

The cast operation produces a vector as output that has the same length as the input vector, where corresponding elements are numerically equal (as far as is possible, e.g. inexact results will be rounded) but with a different type (which is specified as part of the operator's syntax). In the case of overflow on integer-to-integer conversions, the value will be reduced modulo 2 to the power of the number of bits in the destination. Overflow in floating-point-to-integer conversions is undefined behaviour, and overflow with floating point targets will produce a suitable infinity.

a = b ++ c
a = b ++? n

The first version of the 'concatenate operation concatenates two vectors of the same type and length to form a vector that's twice as long as the original. The ++? version concatenates a single element n (which is a literal floating point or integer, not a variable or implicit vector) to a vector, and only if the vector contains an odd number of elements (thus making it possible to split it into two halves).

a = b --? n

This operation, which (just to be different) doesn't have a name, entirely deletes the last element of a vector (making it one element shorter) if it has the same value as the single element given; the command may only be used if the vector currently has an even number of elements. As such, it's basically the reverse of ++. Note that you don't have to worry about whether NaN values are equal or not for this, because you have no way to specify them.

a ++ b = c

The deconcatenate operation splits a vector into two equal-length parts, which when concatenated form the original vector. (Obviously, this only works if the input vector has an even length. If its length is odd, this causes undefined behaviour.)

a = b $ c

The mingle operation (from INTERCAL, although it works on types other than flags) interleaves elements in two vectors that have the same type and length; that is, the first element of the output is the first element of the first input, the second element of the output is the first element of the second input, the third element of the output is the second element of the first input, the fourth element of the output is the second element of the second input, and so on.

a $ b = c

The demingle operation is the reverse of the mingle operation; it produces two outputs which, when mingled, give the original input. (Just like deconcatenation, this only works on an even-length vector.)

Control

a !!!

The maybe skip operation !!! takes two vectors as arguments. For each nonzero element in the argument, nothing happens, and the program proceeds as normal. Where an element in the argument is zero, it is flagged as a "skip element", while retaining its value. Any operations that read skip elements (arithmetic, casts, copies, concatenations, etc.) produce skip elements (with the correct value) as output for any calculation that depends on a skip element ("dependency" is interpreted strictly here, like taint in Perl, e.g. 0 bitwise-AND a skip element produces a skip-0.) At the end of the current main loop iteration, all skip elements cease to be skip elements, but might or might not have their value spontaneously set to zero (depending on what is faster/easier for the compiler).

The purpose of this operation is to allow Infinite Vector implementations to entirely skip sequences of operations that would have no useful effect, and just zero the relevant sections of output instead. As such, a few scattered skip elements are unlikely to make any difference, but a long string of them might lead to noticeable performance savings. (That said, the semantics of !!! mean that it could correctly be implemented to do nothing at all, and still comply with the spec; as such, it is best considered to be an optimization hint.)

# command

The initialize modifier # is placed before a command, and causes it to only run on the first iteration of the main loop; it will be ignored in all other iterations.

a :-(

The exit if zero operation :-( exits the program if every single element of the vector used as an argument is zero. Otherwise it does nothing.

I/O

Input and output only happen in a batch style (all input happens at the start of a program, all output at the end). If the program contains a variable input, it will be initialized by reading input from the environment; its type must be inferrable via examination of the program, and its length will be determined from the input itself. Likewise, a variable named output will have its value output to the environment at program exit.

Computational class

Unclear; it's clearly possible to store infinite amounts of data, and simulate control flow via the use of bitmasks that suppress unwanted commands (sort-of like ABSTAIN/TRY AGAIN control flow in INTERCAL), but it's unclear if you can get data at one end of a long vector to affect data at the other end (note that any sufficiently long vector has to either grow, shrink, or remain approximately the same size every main loop iteration, because all the commands that affect vector size are unconditional doublings, halvings, or ±1 with a parity constraint). It may well be that this language is a linear bounded automaton.