Funciton

From Esolang
(Redirected from Funciton/List handling)
Jump to navigation Jump to search

Funciton (pronounced: /ˈfʌŋkɪtɒn/) is a two-dimensional, minimalistic, declarative programming language invented by User:Timwi in 2011.

                   ╓───╖
                   ║ ! ║
                   ╙─┬─╜   ┌───╖  ╔═══╗
               ┌─────┴─────┤ > ╟──╢ 2 ║
               │           ╘═╤═╝  ╚═══╝
 ╔════╗  ┌───╖ │             │
 ║ −1 ╟──┤ + ╟─┴─┐           │
 ╚════╝  ╘═╤═╝   │           │
         ┌─┴─╖   │    ╔═══╗  │
         │ ! ║   │    ║ 1 ║  │
         ╘═╤═╝   │    ╚═╤═╝  │
           │   ┌─┴─╖  ┌─┴─╖  │
           │   │ × ╟──┤ ? ╟──┘
           │   ╘═╤═╝  ╘═╤═╝
           └─────┘      │

Example: The factorial function.

(None of the functions seen here (addition, multiplication, greater-than) are built-in.
All of them have their own implementation in Funciton.)

Features

  • Programs use the Unicode box-drawing characters to create a graphical representation of data flow.
  • Funciton is not electronics. Programs look a bit like circuit diagrams, but they are not circuit diagrams.
  • Funciton is invariant under 90° rotations. This means you can take any program or any individual function declaration and turn it by 90° without altering its semantics or the semantics of anything that calls it.
  • Funciton has only one datatype, the arbitrary-size integer. However, Funciton has built-in semantics to express strings and anonymous functions as integers.
  • Funciton has only five built-in instructions (NAND, less-than, shift-left, function invocation and lambda expressions); all other functionality (arithmetic, string handling, lazy sequences, etc.) is implemented in terms of these.
  • Funciton has a fully working interpreter and a compiler to IL; see External resources at the bottom.

Syntactic elements

Function declaration header
  ╓───╖       ╒═══╕
──╢ + ╟──   ──┤ + ├──
  ╙───╜       ╘═══╛
Function call
  ┌───╖      ╓───┐      ╔═══╕      ╒═══╗
──┤ + ╟──  ──╢ + ├──  ──╢ + ├──  ──┤ + ╟──
  ╘═╤═╝      ╚═╤═╛      ╙─┬─┘      └─┬─╜
    │          │          │          │
NAND or splitter
  │       │                   │
──┴──     ├──     ──┬──     ──┤
          │         │         │
less-than and shift-left
      │
    ──┼──
      │
Function return value
           │
  ────     │
           │
Integer literal
  ╔═══╗
  ║ 1 ╟──
  ╚═══╝
Read string from STDIN
  ╔═══╗
  ║   ╟──
  ╚═══╝
Lambda expression
    │          │          │          │
  ╒═╧═╗      ╓─┴─╖      ╔═╧═╕      ╔═╧═╗
──┤   ╟──  ──╢   ╟──  ──╢   ├──  ──╢   ╟──
  ╘═╤═╝      ╚═╤═╝      ╚═╤═╛      ╙─┬─╜
    │          │          │          │
Lambda invocation
    │          │          │          │
  ┌─┴─╖      ┌─┴─┐      ╓─┴─┐      ╒═╧═╕
──┤   ╟──  ──┤   ├──  ──╢   ├──  ──┤   ├──
  └─┬─╜      ╘═╤═╛      ╙─┬─┘      └─┬─┘
    │          │          │          │
Comment
  ╔════════════╗
  ║  Hi, Bob!  ║
  ╚════════════╝

Introduction

Simply speaking, Funciton programs consist of lines and boxes. Each line carries data from one box to the next.

A Funciton program must have exactly one loose end; a line that just stops somewhere. This is the program output.

Boxes with a double-struck line and at least one outgoing line are literals. Such a box may contain an integer written in decimal. A very simple program might look like this:

  ╔════╗
  ║ 47 ║
  ╚═╤══╝
    │

This program prints /, which is Unicode character number 47.

Boxes with a double-struck line can be empty, too. In this case, such a box represents the data passed to the program via STDIN. The data is encoded according to the string encoding described later. (Multiple such boxes in a program do not read multiple different inputs from STDIN. The interpreter reads STDIN only once, and then all such boxes return the same data.)

  ╔═══╗
  ║   ║
  ╚═╤═╝
    │

This program prints its input, so it is a cat program.

NAND and the splitter

A T-junction in which the two parallel connectors are inputs and the perpendicular one is an output calculates the NAND of two numbers. In C notation, NAND(a,b) = ~(a & b). Consider the following programs:

  ╔═══╗   ╔═══╗       ╔═══╗   ╔═══╗       ╔════╗  ╔════╗
  ║ 5 ║   ║ 5 ║       ║ 5 ║   ║ 0 ║       ║ -2 ║  ║ -1 ║
  ╚═╤═╝   ╚═╤═╝       ╚═╤═╝   ╚═╤═╝       ╚══╤═╝  ╚══╤═╝
    │       │           │       │            │       │
    └───┬───┘           └───┬───┘            └───┬───┘
        │                   │                    │

The above three programs calculate −6, −1 and 1, respectively. (The actual output will be an unprintable character.)

Remember that numbers in Funciton are arbitrary-size integers. Negative numbers can be conceptualized in two ways that are equivalent:

  • You can think of it as infinitely many 1-bits. -1 has all bits set. There is a least significant bit, but no most significant bit (any more than there is a most significant 0-bit in a non-negative number).
  • You can also think of it as finitely many bits followed by a sign bit, in which case the bit-shift operation performs sign extension.

Also remember that Funciton programs are rotation-invariant, so all of the following programs do exactly the same thing:

  ╔════╗  ╔════╗           │          ╔════╗               ╔════╗
  ║ -2 ║  ║ -1 ║       ┌───┴───┐      ║ -1 ╟──┐         ┌──╢ -2 ║
  ╚══╤═╝  ╚══╤═╝       │       │      ╚════╝  │         │  ╚════╝
     │       │      ╔══╧═╗  ╔══╧═╗            ├──     ──┤
     └───┬───┘      ║ -1 ║  ║ -2 ║    ╔════╗  │         │  ╔════╗
         │          ╚════╝  ╚════╝    ║ -2 ╟──┘         └──╢ -1 ║
                                      ╚════╝               ╚════╝
 
          ╔══════════════════════════════════════════════════╗
          ║  All four of the above programs are equivalent.  ║
          ╟──────────────────────────────────────────────────╢
          ║  Also, a double-lined box with no outgoing       ║
          ║  lines (like this one) is a comment.             ║
          ╚══════════════════════════════════════════════════╝

A T-junction in which the two parallel connectors are outputs and the perpendicular one is an input is a splitter. This is not an operation, but simply a syntactic element that splits a line into two. It allows you to access the same value more than once. For example, consider the following program:

  ╔═══╗              ╔═══╗              ╔═══╗
  ║ 1 ║ ╔════════╗   ║ 2 ║ ╔════════╗   ║ 3 ║
  ╚═╤═╝ ║ ↓ NAND ║   ╚═╤═╝ ║ ↓ NAND ║   ╚═╤═╝
    │   ╚════════╝     │   ╚════════╝     │
    └─────┬────────────┴─────┬────────────┘
          │ ╔════════════╗   │
          │ ║ splitter ↑ ║   │
          │ ╚════════════╝   │
          │     ╔════════╗   │
          │     ║ ↓ NAND ║   │
          │     ╚════════╝   │
          └───────┬──────────┘
                  │

This program computes the value (3 NAND 2) NAND (2 NAND 1) and thus outputs character number 2 (which, again, is an unprintable character). (The order of the NAND arguments is significant due to its short-circuit semantics explained later.)

Whether a T-junction is a NAND operation or simply a splitter is implicit in which incoming lines are inputs and which are outputs:

╔═════════════════╗    ╔═════════════════════╗    ╔══════════════════════════╗
║ This is a NAND. ║    ║ This is a splitter. ║    ║ This program is invalid. ║
╚═════════════════╝    ╚═════════════════════╝    ╚══════════════════════════╝
 
                                                           ╔═══╗
╔═══╗         ╔═══╗         ──────┬──────                  ║ 3 ║
║ 3 ╟────┬────╢ 2 ║               │                        ╚═╤═╝
╚═══╝    │    ╚═══╝             ╔═╧═╗                        │    ╔═══╗
         │                      ║ 3 ║                        ├────╢ 2 ║
         │                      ╚═══╝                        │    ╚═══╝

(Though notice that the second program is also invalid because it has more than one output.)

Declaring a function

A function declaration consists of a declaration header and one or more loose ends. The header is a box, containing the function name, that has two double-lined and two single-lined edges on opposite sides. A loose end defines a function output.

We can now construct a function for each of the familiar logic gates:

                                                  ┌──────────────────────┐
╓─────╖        ╓─────╖            ╓────╖          │  ╓─────╖             │
║ not ║     ┌──╢ and ╟──┐      ┌──╢ or ╟──┐       ├──╢ xor ╟─────┐       │
╙──┬──╜     │  ╙─────╜  │     ┌┴┐ ╙────╜ ┌┴┐      │  ╙─────╜ ┌┐  │    ┌┐ │
   │        └─────┬─────┘     └┬┘        └┬┘      └───┬──────┤├──┴──┬─┤├─┘
  ┌┴┐            ┌┴┐           └────┬─────┘           │      └┘     │ └┘
  └┬┘            └┬┘                │                 └────┬────────┘
   │              │                                        │

Notice that NOT consists of a splitter and a NAND.

The lines emerging from the declaration header define inputs to the function. Loose ends define outputs. A function can have multiple outputs. The directions in which the inputs “leave” the declaration header and the directions in which the outputs “point” determine how the function can be called; more about this later.

Less-than and shift-left

Less-than and shift-left are both calculated by a single operator, the cross:

      ╔═══╗
      ║ a ║
      ╚═╤═╝
 ╔═══╗  │
 ║ b ╟──┼──── (a < b)
 ╚═══╝  │
        │
        │
    (a SHL b)

The less-than operation returns −1 for true and 0 for false. This enables boolean logic using the NAND operation; if true were represented by 1, then NOT 1 would be −2.

The shift-left operation can also do shift-right simply by giving it a negative operand.

In order to use this to evaluate only less-than or only shift-left, we need a little trick.

The Starkov construct

The Starkov construct, named for Cambridge computer scientist and entrepreneur Roman Starkov, who discovered this construct months into Funciton’s existence, is a trick to “nullify” a wire. It consists of a NAND operator that is connected to itself. The following declarations for a less-than and greater-than function demonstrate this:

 ╔╤════════════╗   ╔╤═══════════════╗
 ║│ less than  ║   ║│ greater than  ║
 ╚╧════════════╝   ╚╧═══════════════╝
                      ╓───╖
 ┌────────┐         ┌─╢ > ╟──┐
 │ ╓───╖  │         │ ╙───╜  │
 └─╢ < ╟──┼─────┐   └────────┼─────┐
   ╙───╜  ├──┐  │            ├──┐  │
          └──┘               └──┘

In these examples, the less-than part of the cross operator is connected to the function output, while the shift-left part is connected to a Starkov construct. Since this does not constitute a function output, it is never evaluated and thus effectively nullifies that part of the calculation.

Calling a function

It would be no use declaring functions if you couldn’t call them. A call box consists of two adjacent double-lined edges and two single-lined ones. This is best demonstrated by an example, so let’s define some more comparison operators:

  ╔╤══════════════╗   ╔╤═══════════════╗
  ║│ less than    ║   ║│ greater than  ║
  ║│ or equal to  ║   ║│ or equal to   ║
  ╚╧══════════════╝   ╚╧═══════════════╝
        ╓───╖               ╓───╖
     ┌──╢ ≤ ╟──┐         ┌──╢ ≥ ╟──┐
     │  ╙───╜  │         │  ╙───╜  │
     │         │         │         │
     │  ┌───╖  │         │  ┌───╖  │
     └──┤ < ╟──┘         └──┤ > ╟──┘
        ╘═╤═╝               ╘═╤═╝
         ┌┴┐                 ┌┴┐
         └┬┘                 └┬┘
          │                   │

These declarations may at first look wrong. After all, “less-than-or-equal-to” is not the same as “NOT less-than”! To understand why this declaration is correct, you have to consider the direction in which data is flowing:

So basically, what goes in to the right of the call node comes out of the left of the declaration box (if the orientation of the outputs in the declaration and the call are the same). The fact that the wires take a 180° turn effectively swaps the order of the operands.

The conditional operator

The conditional operator, written in C-like languages as c ? y : n, which returns n if c is zero and y otherwise, is equivalent to the expression ((a ≠ 0) NAND b) NAND ((a = 0) NAND c). Here we can really show off the splitter: the test for a ≠ 0 is only evaluated once even though it is used in two places.

  ╔═══╗  ┌───╖  ╓───╖
  ║ 0 ╟──┤ ≠ ╟──╢ ? ╟─┐
  ╚═══╝  ╘═╤═╝  ╙─┬─╜ │        ╔══════════════════════════════╗
           │      ├─┬─┤        ║  conditional operator        ║
           │      │  ┌┴┐       ║  WITH A PROBLEM (read on)    ║
           │      │  └┬┘       ╚══════════════════════════════╝
           │      └─┬─┘
           └────────┘

Now, I hate to break it with you, but it’s not quite this simple. The above declaration has a big problem which we will come to later.

Addition

Incomplete first version

Addition will be our first recursive function. The simplest way to do addition is to add and subtract 1 at a time:

+(a, b) = a ? +(a−1, b+1) : b;

However, this would be incredibly slow (it would be exponential-time). We can do better — we can do linear-time by harnessing the bit-shift operation. We will implement addition as a recursive function using the following formula:

+(a, b) = b ? +(a XOR b, (a AND b) SHL 1) : a

You can easily convince yourself that this works by considering the following:

  • The XOR does the addition on each individual bit, ignoring the carry.
  • The AND calculates the carry. SHL carries it to the next bit and then the recursive call adds it to the previous result.

Unfortunately, we run into three problems. The first problem is immediately apparent here:

            ╓───╖
         ┌──╢ + ╟──┐
         │  ╙───╜  │
    ┌────┴────┬────┴────┐
    │  ┌┐     │      ┌┐ │
  ┌─┴──┤├──whoops!─┬─┤├─┴─┐
  │    └┘     │    │ └┘   │          ╔═══════════════════════════════════╗
  │    ┌──────┴───┬┘      │          ║  addition                         ║
  │   ┌┴┐        ┌┴┐      │          ║  WITH THREE PROBLEMS (read on)    ║
  │   └┬┘        └┬┘      │          ╟───────────────────────────────────╢
  │ ┌──┴─╖  ┌───╖ │       │          ║  If → is 0, answer is ←, else:    ║
  │ │ << ╟──┤ + ╟─┘       │          ║  XOR to add digits without carry  ║
  │ ╘══╤═╝  ╘═╤═╝         │          ║  AND to calculate carry           ║
  │  ╔═╧═╗    │           │          ║  then recursive call              ║
  │  ║ 1 ║    │           │          ╚═══════════════════════════════════╝
  │  ╚═══╝  ┌─┴─╖         │
  └─────────┤ ? ╟─────────┘
            ╘═╤═╝
              │

Problem #1: crossing lines (solution: cross-nop)

We need for those two lines to cross. But as we saw before, crossing two lines causes less-than and shift-left to be calculated; we don’t want that. We need a cross that is a no-op. A cross-nop.

Fortunately, there is a way to cross two lines. We can declare a function that takes two inputs and two outputs, and which... does nothing:


      ╒═══╕          ╔═════════════╗
      │ · ├──        ║  cross-nop  ║
      ╘═╤═╛          ╚═════════════╝
        │

Notice that this declaration says that:

  • the function has two inputs which are adjacent
  • the function has two outputs which are adjacent, and opposite the inputs
  • each output is equal to the input opposite to it

which is exactly what we want.

There is a way to cross two values without this function, although it involves multiple NAND operations. Consider this a puzzle challenge! Can you find it?

Problem #2: infinite recursion (solution: short-circuit evaluation)

Another problem with our addition function above is that it may potentially get stuck in infinite recursion. The function specifies a terminating condition (namely, b); but we need to make sure that the rest of the code (the part containing the recursive call) is not going to get evaluated when b is zero.

To this end, the NAND operator has built-in short-circuit semantics:

  ╔═══╗   ╔═══╗
  ║ x ║   ║ 0 ║      ╔═══════════════════════════════╗
  ╚═╤═╝   ╚═╤═╝      ║  short-circuit evaluation:    ║
  ┌─┴─╖     │        ║  the NAND operator returns    ║
  │ F ║     │        ║  −1 without evaluating F(x).  ║
  ╘═╤═╝     │        ╚═══════════════════════════════╝
    └───┬───┘
        │

Whenever the first operand to a NAND operation evaluates to 0, the second operand is never evaluated. Notice that if the NAND’s output is pointing down, the “first” operand is on the right (and analogously for the other possible rotations).

The conditional operator, fixed

Now the problem with the conditional operator we defined earlier may become apparent. Our implementation does ((c = 0) NAND n) NAND (y NAND (c ≠ 0)); the second inner NAND has its operands in the wrong order. We need to evaluate c ≠ 0 first in order to know whether to evaluate y. Therefore, we need to flip it around. At this point we run into the line-crossing problem again, so we use our newly-defined cross-nop:

 ╔═══╗  ┌───╖    ╓───╖
 ║ 0 ╟──┤ ≠ ╟────╢ ? ╟─┐
 ╚═══╝  ╘═╤═╝    ╙─┬─╜ │        ╔════════════════════════╗
          │      ┌─┴─╖ │        ║  conditional operator  ║
          │  ┌───┤ · ╟─┤        ╟────────────────────────╢
          │  │   ╘═╤═╝ │        ║  Returns ↓ if ← ≠ 0    ║
          │  └──┬──┤  ┌┴┐       ║  and → if ← = 0        ║
          │     │  │  └┬┘       ╚════════════════════════╝
          │        └─┬─┘
          └──────────┘

In cases where we already know that the input is a boolean (0 or −1), the call to is redundant, so for performance optimization let’s also provide an “unsafe” version:

                 ╓───╖
          ┌──────╢ ‽ ╟─┐
          │      ╙─┬─╜ │        ╔══════════════════════════════╗
          │      ┌─┴─╖ │        ║  conditional operator where  ║
          │  ┌───┤ · ╟─┤        ║  ← is assumed to be 0 or −1  ║
          │  │   ╘═╤═╝ │        ║  (if it isn’t, all operands  ║
          │  └──┬──┤  ┌┴┐       ║  are evaluated and unwanted  ║
          │     │  │  └┬┘       ║  bitwise operations happen)  ║
          │        └─┬─┘        ╚══════════════════════════════╝
          └──────────┘

Problem #3: it doesn’t work with negative numbers

Well, it works with some negative numbers. But there are combinations of inputs for which the recursion is infinite (the terminating condition is never reached).

Specifically, it works fine only if:

  • a ≥ 0 and b ≥ 0, or
  • a + b < 0

But it doesn’t manage to do the jump from a negative input to a positive (or zero) output. a XOR b will be negative and stay negative, and the (a AND b) SHL 1 will be positive and just keep on growing indefinitely.

The increment function

Let’s take a quick diversion to implement a simpler operation: the increment function, which is defined as:

♯(a) = a + 1

But we can’t define it in terms of the addition function because we still haven’t succeeded in writing one that works for all inputs. Let’s instead write a recursive function that does this:

  • If the input is −1, return 0.
  • If the least significant bit is 0, change it to 1.
  • If the least significant bit is 1, change it to 0 and call this function recursively on the next bit.
                            ╓───╖
                            ║ ♯ ║             ╔══════════════════════════════╗
                            ╙─┬─╜             ║  increment                   ║
                   ┌──────────┴────────┐      ╟──────────────────────────────╢
   ╔════╗  ┌────╖  │    ╔═══╗          │      ║  If ↓ = −1, return 0         ║
   ║ -1 ╟──┤ << ╟──┴─┬──╢ 1 ║          │      ║  If ↓ & 1 = 0, return ↓ | 1  ║
   ╚════╝  ╘══╤═╝    │  ╚═══╝          │      ║  Else, recurse with ↓ >> 1   ║
            ┌─┴─╖   ┌┴┐   ╔═══╗ ╔════╗ │      ╟──────────────────────────────╢
            │ ♯ ║   └┬┘   ║ 0 ║ ║ -1 ║ │      ║  Note: can’t actually use    ║
            ╘═╤═╝    │    ╚═╤═╝ ╚══╤═╝ │      ║  >> because >> depends on    ║
           ┌──┴─╖  ┌─┴─╖  ┌─┴─╖  ┌─┴─╖ │      ║  − (unary minus) which in    ║
           │ << ╟──┤ ? ╟──┤ ‽ ╟──┤ = ║ │      ║  turn depends on the         ║
           ╘══╤═╝  ╘═╤═╝  ╘═╤═╝  ╘═╤═╝ │      ║  increment function.         ║
            ╔═╧═╗ ┌┐ │ ┌┐   │      ├───┘      ╚══════════════════════════════╝
            ║ 1 ╟─┤├─┴─┤├──────────┘
            ╚═══╝ └┘   └┘

Note that the formula uses both << (shift-left) and >> (shift-right), but we haven’t defined a >> function yet. Fortunately, the built-in SHL operation can perform SHR by simply giving it a negative number, so we use SHL −1.

Introducing private functions

Since the addition function we already defined is potentially an infinite loop, we don’t really want any code to call it other than code that knows what it’s doing. Therefore, we will declare it private. A private function is a function that can only be called by functions within the same source file; in other words, its scope is its containing file. Private functions are marked with an extra little tag in the declaration box as seen here:

                    ╓┬───╖
                 ┌──╫┘+p ╟──┐
                 │  ╙────╜  │
            ┌────┴─────┬────┴────┐
            │ ┌───┐  ┌─┴─╖    ┌┐ │
          ┌─┴─┤   ├──┤ · ╟──┬─┤├─┴─┐
          │   └───┘  ╘═╤═╝  │ └┘   │          ╔═══════════════════════════════════╗
          │    ┌───────┴───┬┘      │          ║  addition in the case of          ║
          │   ┌┴┐         ┌┴┐      │          ║  (a≥0 & b≥0) | (¬b)≥a             ║
          │   └┬┘         └┬┘      │          ╟───────────────────────────────────╢
          │ ┌──┴─╖  ┌────╖ │       │          ║  If → is 0, answer is ←, else:    ║
          │ │ << ╟──┤ +p ╟─┘       │          ║  XOR to add digits without carry  ║
          │ ╘══╤═╝  ╘══╤═╝         │          ║  AND to calculate carry           ║
          │  ╔═╧═╗     │           │          ║  then recursive call              ║
          │  ║ 1 ║     │           │          ╚═══════════════════════════════════╝
          │  ╚═══╝   ┌─┴─╖         │
          └──────────┤ ? ╟─────────┘
                     ╘═╤═╝
                       │

Finally, we define the real addition function in terms of this private one. We know that the above function works for negative numbers if the result is negative. Therefore, if we can tell (using ≥) that it is going to be non-negative, just negate both operands; then the result will be negative, which we negate again to get the real result. Since binary negation is actually one away from unary minus, we use the increment function to correct the discrepancy.

      ┌──────────────────────────────┐  ╔══════════════════════════════════════════╗
      │       ╓───╖                  │  ║  addition                                ║
      ├───────╢ + ╟───────┐          │  ╟──────────────────────────────────────────╢
    ┌─┴─╖     ╙───╜       │          │  ║  If (a≥0 & b≥0) | (¬b)≥a, use +p         ║
 ┌──┤ · ╟─────────────────┴──┐       │  ║  Else, negate both inputs, then use +p,  ║
 │  ╘═╤═╝          ┌─────────┴──┐    │  ║  then increment and negate result        ║
 │    │            │  ┌────╖  ┌─┴─╖  │  ╚══════════════════════════════════════════╝
 │    │            └──┤ +p ╟──┤ · ╟──┴──────────────────┐
┌┴┐  ┌┴┐              ╘═╤══╝  ╘═╤═╝┌───╖  ╔═══╗  ┌───╖  │
└┬┘  └┬┘                │       └──┤ ≤ ╟──╢ 0 ╟──┤ ≥ ╟──┴─┐
 │    │                 │          ╘═╤═╝  ╚═══╝  ╘═╤═╝    │
 │ ┌──┴─╖  ┌───╖  ┌┐  ┌─┴─╖          └──────┬──────┘      │
 │ │ +p ╟──┤ ♯ ╟──┤├──┤ ‽ ╟─────────────────┤             │
 │ ╘══╤═╝  ╘═══╝  └┘  ╘═╤═╝                 │             │
 │    │                 │                   │             │
 └────┤                                   ┌─┴─╖           │
      └───────────────────────────────────┤ < ╟───────────┘
                                          ╘═══╝

Strings

As alluded to in the introduction, Funciton programs receive their input (which is assumed to be a Unicode string) as a single humongous integer. The format in which strings are encoded could be referred to as UTF-21: every 21 bits form a Unicode character. This way the complexity arising from surrogates in UTF-16 is avoided, while still being more memory-efficient than UTF-32.

The least significant 21 bits contain the first character in the string. Thus, if you were to look at the bit pattern of an integer, the string would appear to be stored backwards (unless you’re Arabic or Hebrew lol). The operation s SHR 21 removes the first character and thus appears to be shifting the characters in the string to the left.

The empty string is 0 and all other strings are positive integers. The NUL character (U+0000) is encoded as 0x110000, which is one higher than the highest legitimate Unicode codepoint. A compliant interpreter must replace NUL with this code when reading STDIN and replace this code back with NUL when outputting.

This takes us directly to some simple functions on strings. Here are string length and string concatenation:

                      ╓───╖
                      ║ ℓ ║                                                        ┌────┐   ╓───╖
       ╔═════╗ ┌────╖ ╙─┬─╜     ╔═══╤═════╤═════════════════╗                      │    ├───╢ ‼ ╟──┐
       ║ −21 ╟─┤ << ╟───┴─┐     ║ ← │  ℓ  │  String length  ║                      │  ┌─┴─╖ ╙───╜  │
       ╚═════╝ ╘═╤══╝     │     ╚═══╧═════╧═════════════════╝                      │  │ ℓ ║        │
               ┌─┴─╖      │                                                        │  ╘═╤═╝        │
               │ ℓ ║      │                                                        │  ┌─┴─╖ ┌────╖ │
               ╘═╤═╝      │                                                        │  │ × ╟─┤ << ╟─┘
               ┌─┴─╖      │             ╔═════╤═══════════════════════════╤═══╗    │  ╘═╤═╝ ╘═╤══╝
               │ ♯ ║      │             ║  ‼  │  Concatenate two strings  │ → ║    │  ╔═╧══╗  │
               ╘═╤═╝      │             ╚═════╧═══════════════════════════╧═══╝   ┌┴┐ ║ 21 ║ ┌┴┐
         ╔═══╗ ┌─┴─╖      │                                                       └┬┘ ╚════╝ └┬┘
         ║ 0 ╟─┤ ? ╟──────┘                                                        └───────┬──┘
         ╚═══╝ ╘═╤═╝                                                                       │
                 │

The string concatenation function allows us to implement 99 bottles of beer on the wall.

Lists

The Funciton language does not have any built-in functionality for lists; but that is not necessary because we can implement lists entirely in Funciton itself. The encoding used to represent lists in a single humongous integer is not fundamental to the language; anyone could come up with a different encoding and implement all the list handling functions for that.

(However, the official Funciton interpreter has a feature that allows you to trace the execution of a specific function. This feature will detect values that encode lists and decode them for the purposes of debugging. The encoding described here is the encoding understood by this feature of the interpreter as well as the one used by all the list handling functions.)

Since strings are encoded as integers, a list of integers can automatically double as a list of strings. Furthermore, since lists themselves are encoded as single integers, one can also have lists of lists, arbitrarily nested.

List encoding grammar

Since integers are arbitrary-size, we cannot assume each element in a list fits into any particular number of bits. Therefore, we have to define a variable-length encoding for integers. The encoding described here chops each integer into chunks of 21 bits. (The number 21 is arbitrary, but since characters in a string are already 21 bits, I decided to reuse the number.)

  • In the above diagram, the least significant bit is on the right.
  • Each element in the list is split into chunks of 21 bits. (The number 0 has no chunks.)
  • The least significant bits of the final encoding contain the least significant chunk of the element.
  • Each chunk is preceded by a continuation bit: 0 = more chunks to come, 1 = no more chunks.
  • The number is preceded by a sign bit. If the sign bit is set, the entire element is bitwise negated.

Notice that this encoding has several consequences:

  • The empty list is represented by the number 0.
  • The number 2 encodes a list containing one element, which is 0 (no chunks, sign bit is 0).
  • The number 3 encodes a list containing one element, which is −1 (no chunks, sign bit is 1).
  • The number 1 is not a valid list.
  • Negative numbers also never constitute a valid list.
  • Many of the list functions will get stuck in infinite loops if you pass them an invalid list.

Examples

List containing two items, each ≤ 21 bits long (again, in this diagram, the least significant bit is on the right):

List containing one item between 22 and 42 bits long:

Lambda expressions

Since Dec 2013, Funciton supports lambda expressions (anonymous functions). There are two new types of box for this:

 ╔══════════════════════════════╗          ╔══════════════════════════════╗
 ║       Lambda expression      ║          ║       Lambda invocation      ║
 ╟──────────────────────────────╢          ╟──────────────────────────────╢
 ║             output 1         ║          ║           lambda             ║
 ║                ↓             ║          ║             ↓                ║
 ║              ╔═╧═╕           ║          ║           ┌─┴─╖              ║
 ║  output 2 → ─╢   ├─ → input  ║          ║  input → ─┤   ╟─ → output 2  ║
 ║              ╚═╤═╛           ║          ║           └─┬─╜              ║
 ║                ↓             ║          ║             ↓                ║
 ║              lambda          ║          ║          output 1            ║
 ╚══════════════════════════════╝          ╚══════════════════════════════╝

Every lambda has one input (parameter) and two outputs (return values). In cases where only one output is desired, the convention is that the lambda expression returns 0 for the second output and the invocation uses a Starkov construct to discard it.

Since a lambda can easily itself return a lambda, any number of parameters are possible through currying. Similarly, if more than two return values are desired, simply return the first value in output 1 and another lambda in output 2 which will successively produce the remaining values.

Since we already have lists, we can now implement interesting operations that take a list and a lambda. The map and filter functions (well-known to functional programmers) can be implemented thusly:

    ╔═════╤══════════════════════════════╗        ╔═════╤══════════════════════════╗
    ║  ꜰ  │  filter                      ║        ║  ᴍ  │  map                     ║
    ╟─────┴──────────────────────────────╢        ╟─────┴──────────────────────────╢
    ║  Returns a new list containing     ║        ║  Returns a new list contain-   ║
    ║  only those elements that match a  ║        ║  ing the results of passing    ║
    ║  predicate provided as a lambda    ║        ║  every element through the     ║
    ╚════════════════════════════════════╝        ║  provided lambda function      ║
      ┌───────────────────────┐                   ╚════════════════════════════════╝
      │     ┌─────────┐ ┌───╖ │                     ┌───────────────────────────┐
      │     │ ┌───╖   ├─┤ ‹ ╟─┘                     │       ╓───╖               │
      │     └─┤ › ╟─┐ │ ╘═╤═╝                       ├───────╢ ᴍ ╟───────┐       │
      │       ╘═╤═╝ │ │   └─┬─────────┐             │       ╙───╜       │       │
      │       ┌─┴─╖ │ │     │ ╓───╖   │             │ ┌─────────────────┴───┐   │
    ┌─┴─╖   ┌─┤ · ╟─┘ │     └─╢ ꜰ ╟─┐ │             │ │   ┌───┬─┐           │   │
    │ ꜰ ╟───┤ ╘═╤═╝   │       ╙───╜ │ │             │ │ ┌─┴─╖ └─┘   ┌───╖ ┌─┴─╖ │
    ╘═╤═╝   │ ┌─┴─╖ ┌─┴─╖           │ │             │ └─┤   ╟───────┤ › ╟─┤ ᴍ ║ │
      │     └─┤ ? ╟─┤   ╟───────┐   │ │             │   └─┬─╜       ╘═╤═╝ ╘═╤═╝ │
      │       ╘═╤═╝ └─┬─╜ ┌─┐   ├───┘ │             │   ┌─┴─╖ ╔═══╗ ┌─┴─╖ ┌─┴─╖ │
      │ ╔═══╗ ┌─┴─╖   └───┴─┘ ┌─┴─╖   │             └───┤ ‹ ║ ║ 0 ╟─┤ ? ╟─┤ · ╟─┘
      │ ║ 0 ╟─┤ ? ╟───────────┤ · ╟───┘                 ╘═╤═╝ ╚═══╝ ╘═╤═╝ ╘═╤═╝
      │ ╚═══╝ ╘═╤═╝           ╘═╤═╝                       └─────────────────┘
      └─────────────────────────┘

Observe that both examples use only the first lambda output as they use a Starkov construct to swallow the other one.

The following (partial) code would use the above filter function to extract only the odd numbers from a list.

           (list)
             ↓
           ┌─┴─╖
           │ ꜰ ╟──
           ╘═╤═╝
 ╔═══╗     ╔═╧═╕  ╔═══╗
 ║ 1 ╟──┬──╢   ├──╢ 0 ║
 ╚═══╝ ┌┴┐ ╚═╤═╛  ╚═══╝
       └┬┘   │
        └────┘

This example provides a lambda expression that bitwise-ands its input with 1, thus returning 0 for even numbers, causing those to be filtered out. (The second return value is set to zero as is conventional for single-output lambdas.)

Since the only datatype in Funciton is the arbitrary-size integer, a compliant interpreter must allocate a non-zero integer to every lambda closure the program creates. The lambda expression box returns an integer that identifies the closure, and the lambda invocation box will use the number to identify the lambda closure to invoke. This approach has many advantages; in particular, you can automatically have lists of functions. The integers returned are required to be non-zero as a convenience so that the user code can still use the number 0 to mean null or false in cases where a lambda is optional.

Now that we have lambda expressions, we can do some of the things that are typical of functional programming languages. For example, if we take two lambdas that each produce only one output, we can implement function composition easily:

           ╓───╖
      ┌────╢ ∘ ╟────┐
      │    ╙───╜    │
      │   ┌───┬─┐   │          ╔════════════════════════╗
      │ ┌─┴─╖ └─┘ ┌─┴─╖ ┌─┐    ║  Function composition  ║
      └─┤   ╟─────┤   ╟─┴─┘    ╟────────────────────────╢
        └─┬─╜     └─┬─╜        ║  (f∘g)(x) = f(g(x))    ║
          │ ╔═══╗ ╔═╧═╕        ╚════════════════════════╝
          │ ║ 0 ╟─╢   ├─┐
          │ ╚═══╝ ╚═╤═╛ │
          │       └─┘   │
          └─────────────┘

Lazy sequences

Another exciting use-case for lambda expressions is lazy sequences (also known as iterators or enumerables).

First, we decide to use the number 0 to represent an empty sequence. The lambda expression feature specifically allocates only positive integers to represent lambdas for exactly this kind of purpose. This way we can always use the ? function to test if a sequence is empty.

Second, we make use of the ability for lambda expressions to return two outputs. The first output returns the first element (the head), while the second returns the rest of the sequence (the tail). If the rest of the sequence is empty, this is 0; otherwise, it’s another lambda. This way, we can retrieve element after element simply by calling the lambdas until we get 0.

To illustrate this, let’s write a function that takes an input value and returns a lazy sequence containing just that value and nothing else:

           ╓───╖
           ║ ⌑ ║
           ╙─┬─╜
     ╔═══╗ ╔═╧═╕  ┌─┐
     ║ 0 ╟─╢   ├──┴─┘
     ╚═══╝ ╚═╤═╛
             │

The function’s argument goes into the lambda expression’s first output, making it the head of the sequence. The second output is set to 0, which indicates the end of the sequence. The completed lambda expression is returned as the function’s output. (The lambda expression doesn’t care what value you pass into it when you call it, so it uses a Starkov construct to nullify its input.)

We can go further than this. If we wanted a sequence containing lots of repetitions of a value, we could concatenate sequences like the above, but there’s a more elegant way. Consider this function:

           ╓───╖
           ║ ⁞ ║
           ╙─┬─╜
           ╔═╧═╕  ┌─┐
         ┌─╢   ├──┴─┘
         │ ╚═╤═╛
         └───┴──┐
                │

This is almost the same function as above, only instead of the 0 (which would indicate the end of the sequence) it returns another copy of the same lambda. This means that every time you call the lambda, you get the value and another copy of the same lambda. You keep getting the same value and never reach the end of the sequence. We have created an infinite sequence that repeats a value indefinitely.

Since the sequences are lazy — meaning the next element in a sequence is not evaluated until you ask for it by invoking the lambda — they do not actually have to terminate. You only need a terminating sequence if your code actually traverses the whole sequence until it encounters the 0 that indicates its end.

To round this up, let’s take a look at the truncate function. You could use this to turn an infinite sequence into a finite one for output.

         ┌────────────┬───────────┐
       ┌─┴─╖          │  ╓───╖    │
     ┌─┤   ╟─────┐    └──╢ ȶ ╟──┐ │
     │ └─┬─╜   ┌─┴─╖     ╙───╜  │ │
     │   └─────┤ · ╟────┐       │ │
     │         ╘═╤═╝    │       │ │    ╔═════════════════════════════╗
     │   ┌───┐ ┌─┴─╖  ╔═╧═╕     │ │    ║  truncate                   ║
     │  ┌┴┐  │ │ ȶ ╟──╢   ├─┬─┐ │ │    ╟─────────────────────────────╢
     │  └┬┘  │ ╘═╤═╝  ╚═╤═╛ └─┘ │ │    ║  Truncates a lazy sequence  ║
     │ ┌─┴─╖ └───┘    ┌─┴─╖     │ │    ║  after at most n elements   ║
     │ │ ♯ ║ ┌────────┤ · ╟───┐ │ │    ╚═════════════════════════════╝
     │ ╘═╤═╝ │        ╘═╤═╝   ├─┘ │
     │  ┌┴┐  │        ┌─┴─╖   │   │
     │  └┬┘  │   ┌────┤ ? ╟───┘   │
     │   └───┘   │    ╘═╤═╝       │
     │         ╔═╧═╗  ┌─┴─╖       │
     └─────────╢ 0 ╟──┤ ? ╟───────┘
               ╚═══╝  ╘═╤═╝
                        │

If the input sequence is empty, or n = 0, return an empty sequence. In all other cases, return a lambda expression that will retrieve the head of the original sequence and then call truncate recursively with n decremented. Eventually this will reach 0 and the sequence terminates.

Observe how this will evaluate each element of the original sequence only if and when the new sequence is evaluated. All the other familiar functional operations — such as map, filter, zip, etc. — can be implemented similarly, allowing them to be lazy and work on infinite sequences — unlike the map and filter function we defined for lists above, which always evaluate the entire list.

All functions

The interpreter comes with an extensive library of useful functions, all listed here:

Fundamentals

· cross-nop (used to cross wires, otherwise a no-op)
< / > / ≤ / ≥ less than / greater than / less than or equal to / greater than or equal to
= / ≠ equal to / not equal to
≷ / ⋛ / ⪌ in range (exclusive / min-inclusive max-exclusive / inclusive)
≶ / ⋚ / ⪋ out of range (exclusive / min-exclusive max-inclusive / inclusive)
? / ‽ conditional operator (if c≠0 / c, then y, else n)
function composition

String handling

The number 0 represents the empty string.

string length
string concatenation
ʃ substring (string, index, length)
ʘ index of first occurrence of a substring, or −1 if not found (haystack, needle)
repeat string a number of times (string, number)
reverse string
ǂ single split: given a separator, return everything before and after its first occurrence
parse string containing decimal digits into an integer
convert integer to string

Arithmetic

<< / >> shift left / shift right
^ bitwise XOR (exclusive or)
increment (plus one)
decrement (minus one)
+ addition (plus)
unary minus (negative)
| absolute value
× multiplication (times)
÷ integer division (round down)
% modulo (negative for negative a; the way most programming languages define it)
mod modulo (always non-negative for positive b; the way mathematicians define it)
÷% division and modulo (two-input, two-output function)
exponentiation (to the power of) (up-arrow notation)
! factorial
A Ackermann function

List handling

The number 0 represents the empty list.

shift returns head and tail
unshift returns result list only
« push adds element to end of list
» pop returns leg and foot
[ indexing returns nth element and its bit offset
] reverse indexing returns (length − n)th element’s index and bit offset
replace element set nth element to a new element
index of returns index of first element that matches a predicate (or −1)
last index of returns index of last element that matches a predicate (or −1)
aggregate (fold) computes a single value by iteratively applying a lambda to each element
ʙ bit length returns bit length of all items in list
concat concatenates two lists
ʗ count counts the number of elements that match a predicate
distinct removes duplicate elements from a list
ɛ indexing returns nth element
ɜ reverse indexing returns (length − n)th element
extract sublist takes index and length, works just like substring
filter returns a list containing only elements that match a predicate
ɢ index of index of first occurrence of element (or −1)
ʛ last index of index of last occurrence of element (or −1)
ʜ shift like ‹, but returns head bitlength and tail
ɪ insert insert element at index
ʟ length returns number of items in list
map passes every element through a lambda and returns the new list
ʀ remove remove element at index
reverse reverse list
ʁ bit offset bit offset of nth element
bit offset (reverse) bit offset of (length − n)th element
unshift returns unshifted list and bitlength of new element
ʏ flatten concatenates all the lists in a list of lists into a single list

Lazy sequences

The number 0 represents the empty sequence. Everything else is a lambda expression that returns the head element and the tail. The tail is another lazy sequence, or 0 if the sequence ends here.

return monadic return (lazy sequence containing a single value)
one repeat a value indefinitely
two alternate two values indefinitely
three cycle three values indefinitely
count up all integers starting at x
count up n consecutive integers starting at x
ʭ concatenate concatenate two sequences
ʬ append append a single element to the end of a sequence
equal −1 if two sequences are equal, 0 otherwise
pairs maps a function over the Cartesian product of two sequences
index of index of first element satisfying a predicate (or −1)
last index of index of last element satisfying a predicate (or −1)
first or default first element or default value if sequence empty
last or default last element or default value if sequence empty
min find the minimum value (smallest integer)
max find the maximum value (largest integer)
all −1 if all elements satisfy a predicate, 0 otherwise
ɓ bind monadic bind (map + concatenate)
ɕ count number of elements
ƈ contains −1 if element is in the sequence, 0 otherwise
ɗ default adds an element to a sequence only if it is empty
exists −1 if any element satisfies a predicate, 0 otherwise
element retrieve element by its index
ƒ filter return only elements that match a predicate
ʩ fold/aggregate computes a value by iteratively applying a lambda to each element
ɠ group group by the results of a selector (result is a sequence of sequences)
ɱ map pass every element through a lambda
omit omit (remove) element by its index
ƥ permutations compute all permutations (result is a sequence of sequences)
product product of all elements in a sequence
ɹ reverse reverse a lazy sequence
ʂ subsequences compute all subsequences (result is a sequence of sequences)
ʆ sort asc ascending sort according to the results of a selector
ƪ sort desc descending sort according to the results of a selector
sum sum of all elements in a sequence
ʓ skip remove the first n elements
ȶ truncate stop after at most n elements
ʑ zip pass every pair of same-index elements through a lambda

Additional functions

converts a lazy sequence to a list
converts a list to a lazy sequence
ǁ lazy string split: lazily separate a string at every occurrence of a substring
list string split: separate a string at every occurrence of a substring and return a list
ʝ string join: concatenate sequence of strings using separator
string join: concatenate list of strings using separator
convert string to a lazy sequence of its characters
lazy sequence of primes up to n
lazy sequence of prime factorization of n (empty for n < 2)
λFibo Fibonacci numbers as an infinite lazy sequence
Fibo Fibonacci function (full integer range)
Lucas Lucas function (full integer range)

Example programs

See also

External resources