7

7 is an esoteric programming language created by User:ais523 towards the end of 2016. It's heavily inspired by Underload, but has much more powerful input/output capabilities and a different set of commands. One major difference is that it's based entirely on combinators; unlike Underload, it doesn't have literals (although they can be simulated fairly easily).

Syntax

Although the language 7 has twelve different commands, only eight of them can appear in source files. These have the names 0 through 7 inclusive. (Six of the named commands have the purpose of appending other commands to the frame, which is how the four anonymous commands can get involved in a program.)

These commands can either be written with one command per byte (with their numerical names encoded using the ASCII digits), or using octal, with three bits per command. When octal is packed into bytes, it's done in such a way that the view of the input as octal and as octets will look the same in big-endian representation.

A 7 is never allowed as the last command of the program (ending a program with one would not be useful anyway); instead, any sequence of consecutive 1 bits at the end of the program is ignored, somewhat like trailing whitespace. If the program appears to end partway through a command, it is padded up to the command boundary with 1 bits. So for example, if a program ends with a 3 command, that command could be expressed with a single 0 bit (rather than the usual three bits 011), because the trailing 1 bits are implied.
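The packing rules above can be sketched in Python. This is only an illustration of the description given here, not the reference interpreter's code: the function names `pack` and `unpack` are made up, and the sketch assumes that bits are filled most significant bit first and that a packer pads the final byte with 1 bits.

```python
def pack(commands: str) -> bytes:
    """Pack a string of command digits (0-7) into bytes, three bits per command."""
    bits = "".join(format(int(c), "03b") for c in commands)
    # Trailing 1 bits are implied, so drop them and refill only to a byte boundary.
    bits = bits.rstrip("1")
    bits += "1" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def unpack(data: bytes) -> str:
    """Recover the command digits from a packed program."""
    bits = "".join(format(byte, "08b") for byte in data)
    # Ignore trailing 1 bits, then pad back up to a command boundary with 1 bits.
    bits = bits.rstrip("1")
    bits += "1" * (-len(bits) % 3)
    return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))


print(pack("403"))        # b'\x81'
print(unpack(b"\x81"))    # 403
```

Here the nine bits 100 000 011 of 403 shrink to seven once the trailing 1 bits are dropped, so the whole program fits in one byte.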

Data storage

A 7 program can store data in two places. One is the command list, which initially stores the program; as the name suggests, this is a list of commands. It's used like a stack, with commands running from the start of the list and being removed as they do so, and the command list only ever being refilled via adding on to the front. The other is the frame, which is a construct that has many similarities to a stack. It's a string that can hold commands, but also bars, which are not commands and serve as dividers between sections of commands. Each section is manipulated as a group, and is quite similar to a stack element from a stack-based language (the commands only manipulate sections near the end of the frame). The initial state of the frame is two bars, and nothing else.

Commands

The twelve commands form six pairs, each of which contains a passive and an active command. All the passive commands do much the same thing; they append the corresponding active command to the frame. The active commands are much more varied in their behaviour.

0 (passive), 6 (active)

The active command pacifies the last section of the frame, then removes the bar to its left. The pacification of a list of commands is defined as follows: identify the passive substrings of the list (favouring one long substring, rather than two shorter adjacent ones), prepend a 7 and append a 6 to each such substring, then convert every command outside the passive substrings (all such commands will be active) to its passive equivalent. A passive substring is a substring for which all commands are named, for which 7 and 6 commands are correctly matched (as though they were parentheses), and in which the substring 76 does not appear.
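One way to read the pacification rule is sketched below in Python. It is an interpretation of the definition above rather than the reference implementation. Since the anonymous commands have no names, the sketch invents the letters D, O, S and X for the active counterparts of 2, 3, 4 and 5; those letters, and the function name `pacify`, are assumptions made purely for illustration.

```python
# Passive command -> active counterpart. D, O, S, X are hypothetical stand-ins
# for the four anonymous commands.
ACTIVE = {"0": "6", "1": "7", "2": "D", "3": "O", "4": "S", "5": "X"}
PASSIVE = {active: passive for passive, active in ACTIVE.items()}


def pacify(section: str) -> str:
    """Pacify a section, following the definition in the text."""
    # Passive commands 0-5 always belong to some passive substring.
    covered = [c in ACTIVE for c in section]
    unmatched_sevens = []
    for i, c in enumerate(section):
        if c == "7":
            unmatched_sevens.append(i)
        elif c == "6" and unmatched_sevens:
            j = unmatched_sevens.pop()
            # A matched 7...6 pair is usable only if it is non-empty (no "76")
            # and everything inside it is itself part of a passive substring.
            if i - j > 1 and all(covered[j + 1:i]):
                covered[j] = covered[i] = True
    out = []
    i = 0
    while i < len(section):
        if covered[i]:
            # Take the longest run, favouring one long substring over two short ones.
            j = i
            while j < len(section) and covered[j]:
                j += 1
            out.append("7" + section[i:j] + "6")
            i = j
        else:
            # Everything outside a passive substring is active: make it passive.
            out.append(PASSIVE[section[i]])
            i += 1
    return "".join(out)


print(pacify("123"))     # 71236
print(pacify("67DOSX"))  # 012345
print(pacify("1D2"))     # 7162726
```

A section made up only of active commands contains no passive substrings, so pacifying it simply maps each command back to its passive name, as in the second call.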

1 (passive), 7 (active)

The active command appends a bar to the frame.

2 (passive), plus an anonymous active command

The active command duplicates the last section of the frame (separating the original section from the new section with a bar).

3 (passive), plus an anonymous active command

The active command outputs the last section of the frame to the user, then discards the last two sections (and the bars to their left). If the section being output contains anonymous commands, it's pacified first (pacification always leads to a section with only named commands), and a 7 command is prepended. The first command of output via the 3 command specifies the output format, character encoding, and the like; see the language's documentation for more details.

This command can also be used for input, via "outputting" certain special codes that don't correspond to characters. The input is always converted to a nonnegative integer (either it's one already, or it's a character which is converted to an integer by taking its codepoint and adding 1; EOF + 1 is 0). Then it's used to make that many copies of the contents of the last section of the frame (unlike with the active counterpart of the 2 command, the copies are all run together, with no bars separating them).

4 (passive), plus an anonymous active command

The active command here rearranges the frame as follows: the last two sections are swapped, and the bar between them is replaced by two bars.

5 (passive), plus an anonymous active command

The active command here moves the last section of the frame to the start of the command list, and removes the bar before it.

Execution model

The program repeatedly executes the first command of the command list (removing it from the list in the process) until the command list is empty. At that point, the program cycles; any bars at the very end of the frame are removed, then the last section of the frame is copied to the command list (this differs from the active counterpart of the 5 command, which would move it). If the command list is ever empty while the frame consists entirely of bars, the program exits; this is a successful exit. (It also exits, with more of a failure-type exit, if the 3 command runs out of sections to discard. Running out of sections in other commands should produce an error in well-behaved interpreters, although very simple interpreters may prefer to leave this as undefined behaviour.)
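The execution model and the frame operations can be tied together in a small Python sketch. It is a simplified illustration built only from the descriptions on this page, not a replacement for the real interpreter. Several things in it are assumptions: the frame is held as a string with `|` for bars, the letters D, O, S and X are invented stand-ins for the anonymous active counterparts of 2, 3, 4 and 5, output is merely recorded as a string of commands (no output format is decoded and input is not supported), and pacification is handled only for sections that consist entirely of active commands.

```python
ACTIVE = {"0": "6", "1": "7", "2": "D", "3": "O", "4": "S", "5": "X"}
PASSIVE = {active: passive for passive, active in ACTIVE.items()}


def pacify_active_only(section):
    """Special case of pacification: every command in the section is active."""
    if any(c not in PASSIVE for c in section):
        raise NotImplementedError("general pacification is outside this sketch")
    return "".join(PASSIVE[c] for c in section)


def split_last(frame):
    """Return (frame before the last bar, last section)."""
    head, bar, last = frame.rpartition("|")
    if not bar:
        raise RuntimeError("ran out of sections")
    return head, last


def run(program, max_steps=100_000):
    commands, frame, outputs = program, "||", []
    for _ in range(max_steps):
        if not commands:
            frame = frame.rstrip("|")            # cycling drops trailing bars
            if not frame:
                return outputs                   # only bars were left: successful exit
            commands = frame.rpartition("|")[2]  # copy (not move) the last section
            continue
        cmd, commands = commands[0], commands[1:]
        if cmd in ACTIVE:                        # passive 0-5: append the active command
            frame += ACTIVE[cmd]
        elif cmd == "7":                         # append a bar
            frame += "|"
        elif cmd == "6":                         # pacify last section, drop bar to its left
            head, last = split_last(frame)
            frame = head + pacify_active_only(last)
        elif cmd == "D":                         # duplicate the last section
            frame += "|" + frame.rpartition("|")[2]
        elif cmd == "S":                         # swap last two sections, doubling the bar
            head, b = split_last(frame)
            head, bar, a = head.rpartition("|")
            frame = head + bar + b + "||" + a
        elif cmd == "X":                         # move last section to the command list
            frame, last = split_last(frame)
            commands = last + commands
        elif cmd == "O":                         # output, then discard two sections
            head, bar, last = frame.rpartition("|")
            if any(c in "DOSX" for c in last):
                last = "7" + pacify_active_only(last)
            outputs.append(last)
            frame, bar2, _ = head.rpartition("|")
            if not (bar and bar2):
                return outputs                   # failure-type exit
        else:
            raise ValueError(f"unknown command {cmd!r}")
    raise RuntimeError("step limit reached")


hello = "5325101303040432004515131401430134321027403"
print(run(hello) == [hello[:-4]])  # True
```

Running the Hello, world! program from the examples section through this sketch records a single output: the commands that precede the 7, still in passive form and starting with the format selector 5. Decoding that string into text is left to the real interpreter.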

Example programs

Hello, world!

5325101303040432004515131401430134321027403

This makes use of output format 5, which uses the commands 0 through 5 (which are all passive and thus can be handled consistently when pushing them to the frame) to encode a string in Baudot (technically US-TTY, which is a Baudot variant); pairs of base-6 digits give 36 possibilities, compared to the 32 that Baudot has. Most of the program encodes the text "Hello, world!\n"; the 7 separates the data from the code that prints it, which is a simple 403 (you need to swap the data above the code that was left on the frame, and pacify it to convert the active commands back to passive, before you print it).

Factorial

177172051772664057074056167770236713351357263

The core of this program is 205 177266405707405616. When the last section of the frame is a sequence of passive commands with a 7 following it, the active counterpart of 205 will append the active counterparts of those commands to the section before, with all the other changes cancelling each other out. Underload programmers will realise that this is similar to a multiplication. Meanwhile, the active counterpart of 177266405707405616 implements an increment instruction; 177266 appends a suitably pacified 2 section to the frame, 405 prepends it to the section before, and 707405616 appends 64057 to the last section of the frame (which, as we maintain it ending with a 7 and only ever read it via running it, is equivalent to adding a 405 just before the final 7, because 76 is a no-op). Prepending 2 and appending 405 in 7 is very similar to prepending : and appending ~* in Underload, which would increment an integer seen.

As such, the program basically forms a loop, repeatedly incrementing the last section of the frame, then multiplying the penultimate section by it. The rest of the program is to deal with input, output, and setting up the initial state of the frame.

Underload translation

The named 7 commands could be approximately translated to Underload along the following lines:

  • 0 becomes (a*(*)*)*
  • 1 becomes (())*
  • 2 becomes (:)*
  • 3 becomes (S!)* (this is the least accurate translation)
  • 4 becomes (~()~)*
  • 5 becomes (^)*
  • 6 becomes a*(*)*
  • 7 becomes ()

The translation will not work directly, due to things like cycling, but it can give a good idea of how 7 works.
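The table above can be applied mechanically, as in the following Python sketch. The name `to_underload` is made up for this example, and the result inherits all the caveats of the approximate translation: it is a reading aid, not a program that behaves like the original.

```python
# Approximate 7 -> Underload translation, copied from the list above.
TO_UNDERLOAD = {
    "0": "(a*(*)*)*",
    "1": "(())*",
    "2": "(:)*",
    "3": "(S!)*",   # the least accurate entry
    "4": "(~()~)*",
    "5": "(^)*",
    "6": "a*(*)*",
    "7": "()",
}


def to_underload(program: str) -> str:
    """Translate each named 7 command to its approximate Underload equivalent."""
    return "".join(TO_UNDERLOAD[c] for c in program)


print(to_underload("7403"))  # ()(~()~)*(a*(*)*)*(S!)*
```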

Translating the other way:

  • (x) becomes 7x or 7x6 or sometimes even 177x66 (the quoting rules act differently between the languages, so it's a bit hard to give an accurate correspondence)
  • ! becomes 13
  • S becomes 23
  • : becomes 2
  • ^ becomes 5
  • ~ becomes 443
  • * becomes 443403
  • a becomes 17164430

Because 7 programs are initially given in passive form, understanding the quoting rules can be fairly confusing. (Additionally, it tends to be a better idea to use different algorithms from Underload on the small scale; in Underload, * is slightly shorter than ~*, but both are widely useful as appending and prepending text are both common operations; meanwhile, in 7, ~* to prepend is considerably shorter, with * being longer because it's much more rarely used; to append a constant, you can just use a sequence of passive commands.)

External resources

  • A (rather slow) interpreter written in Perl, which also contains the language specification as documentation, can be found here.