Referencement
Paradigm(s) | functional, string-rewriting |
---|---|
Designed by | Hakerh400 |
Appeared in | 2019 |
Computational class | Turing complete |
Major implementations | Interpreter |
File extension(s) | .txt |
Referencement is a functional programming language, which is similar to lambda calculus, but this language also has mechanisms for comparing two expressions and assigning a value to an identifier (which makes it possible for a function to achieve side effects). Instead of working with pointers, this language has a simple notation for representing expressions, so that the computation can easily be done using a pencil and paper.
Introduction
Lambda calculus
Lambda calculus is a mathematical system for manipulation lambda expressions. Lambda expression can be represented as a tree which has three different types of nodes: Identifier, Abstraction and Invocation. Identifier is a node which has no children and the value of an identifier is the string representing its name. Abstraction is a node which has a single child, and the value of Abstraction node is a string representing the name of the Abstraction's identifier. Invocation has two children (the order is important) and no other parameters.
The goal is usually to determine whether two lambda expressions are equal or not. Two expressions are considered to be equal if and only if one can be reduced into another (i.e. can be transformed into another by applying some sequence of reduction rules). There are three types of reductions: alpha, beta and eta reduction. Alpha reductions says that argument name of any abstraction can be changed as long as all identifiers that are it's descendant change their name to the same value (excluding the identifiers that are descendants of another abstraction which has the same argument name and that itself is a descendant of the first abstraction). Beta reduction says that if we have an invocation whose first child is an abstraction, then we can replace that invocation with the child of the abstraction, but also replace all identifiers that have the same name as the abstraction's argument with the second child of the invocation. Eta reduction says that if we have an abstraction which contains an invocation whose the second child is an identifier having the same name as the abstraction's argument and the first child is any node which has no descendants that are identifiers having the same name as the abstraction's argument, then we can replace that abstraction with the first child of the invocation.
This was just a brief introduction to lambda calculus for those who are not familiar with it. We also suggest reading Lambda calculus article on Wikipedia. In this paragraph we explained a way of representing lambda expressions as trees and elements as nodes, rather than operators and operands.
Limitations of lambda calculus
While the lambda calculus is proven to be Turing complete, it significantly imposes the paradigm. All functions are "pure" (they cannot have side effects). All identifiers are constant (actually once an identifier gets a value by applying a beta reduction, the identifier gets replaced by the value, making it disappear from the expression). Performing a recursion is tricky and not trivial to implement in most cases. The order of executing function (applying reductions) is not known in advance, it can be random, but the result is guaranteed to remain the same independently of the order of applying reduction rules. Dynamically defining a variable inside a function (abstraction) is impossible directly, but it can be achieved by encapsulating the function into another function and providing the value as its argument (pattern known as IIFE). There are some languages (like Examinable Invocation Vector) that provide a way for programming directly into lambda calculus and performing I/O by examining the content of the reduced lambda expression, but it is pretty much unusable for programming, not only because of lacking any predefined functions and the need to define everything from scratch (while dealing with non-intuitive paradigm), but also because it is very hard to write a fast interpreter (a trivial implementation may have exponential time and space complexity for a lot of expression patterns).
Closures and references
Closures and references do not exist in original lambda calculus, but they represent the feature most practical programming languages offer. First, we modify reduction rules: we forbid alpha and eta reductions and we limit beta reduction to only the leftmost and innermost pattern where beta reduction can be applied. Then we allow functions to have side effects and each time when we perform beta reduction, we instantiate a new closure. Closure is an object that contains references to the parent closure and to the value that has been passed to it during the closure instantiation.
Then it allows referencing identifiers by searching for the identifier name accross the closure chain. It also allows modifying identifier value, which implies side effects (because we can modify identifier of any closure, even a closure than can be referenced from two other closures).
The problem with this approach is because it requires object instantiation and references (which are usually represented as pointers), making it non-trivial for doing it using a pencil and paper. Programming language Referencement aims to solve this problem - it provides a syntax notation for representing closures and references, so that it can be done on a paper, by following simple reduction rules.
Syntax
Any lambda expression is also a valid Referencement expression (lambda symbols are omitted). However, this language introduces some new things. It is possible to specify whether an argument is passed by a value, or by a reference. For example:
x. &y. x (y x) y
In the example above, we have two abstractions and then we have invocation x
with invocation y
with x
and all that invoked with y
. Note the &
symbol. It means that y
is passed by reference. We will explain later what it exactly means.
There are also two new things: abstraction parameters (not to be confused with arguments) and native identifiers. These two things cannot be contained initially in the user's code, but they can appear during reduction. Here is an example:
5-a-7. [2] a
This example shows an abstraction whose argument is a
and having 0th parameter set to 5
and 1st parameter set to 7
. The [0]
thing is an identifier representing the 2nd native identifier (native identifiers are represented in brackets and are not bound to any abstraction). They are handled in a special way.
There are three different abstraction parameters, labeled as 0th, 1st and 2nd parameter. Each abstraction can have or not have any of them. The 2nd parameter is written in braces instead of the function name. For example:
3-{9}-11. {9}
This is an abstraction whose parameters are 3
, 11
and 9
respectively. The name of the abstraction's argument is the literal string {9}
and it cannot be changed.
User-defined identifiers can consist only of one or more alphanumeric characters and/or underscores.
Program start
When program starts, the following expression:
(&a. b. [0] a b) (&a. &b. &c. [1] a b c) (&a. [2] a) (&a. [3] a) (&a. [4] a)
is concatenated to the initial source code (it implicitly constructs the required number of invocations). Then the reduction rules (explained later) are applied until the program terminates.
I/O
Input and output are sequences of bits. If we work with ASCII text, we construct the input array of bits by converting each byte to bit array of length 8, then reversing bits inside each byte, prepending each bit with 1
and appending infinitely many zeros. For example, ASCII string abc
converts to the following sequence of bits:
111010101011111010111010101111101111101010111110000000000...
Output format is similar, but there is no padding with 1s. The following sequence of output bits
100001100100011011000110
represents the same string abc
.
Reduction rules
Apply the following steps iteratively.
Start from the global (top-level) node (the root node of the whole expression). If it is not an invocation, terminate the program. Otherwise go to the first child (left operand) and keep going left as long as the left node is an invocation. When we find a node that is not an invocation, then if the right node of the current invocation is also an invocation, go right then keep going left and so on, until we find an invocation whose both nodes are not invocations. For example:
(x. (y. y) (z. z)) ((a. a a) (b. b) (c. c))
in this example we will pick (a. a a) (b. b)
according to the rules.
For simplicity, we call the left node of the picked invocation A
and the right node B
. Once we picked the invocation, we decide what we will do with it according to the next two paragraphs.
Extended beta reduction
If the left node of the invocation we picked in the previous step is an abstraction, then we apply this step. First, if B
has no 0th parameter, we assign it the 0th parameter as the lowest non-negative integer that is not the 0th parameter of any other abstraction, except A
(but including its descendants). So, in our example, because no abstractions have 0th parameter at all, we assign 0
to the 0th parameter of B
. Now we have:
(x. (y. y) (z. z)) ((a. a a) (0-b. b) (c. c))
Then if B
has no 1st parameter or A
is not a by-reference abstraction, then assign the lowest non-negative integer that is not 1st parameter of any other abstraction, except A
and B
(but including their descendants). So, since in our example no abstractions have 1st parameter at all, we assign 0
to the 1th parameter of B
:
(x. (y. y) (z. z)) ((a. a a) (0-b-0. b) (c. c))
The next step is to apply the standard beta reduction to the invocation A
with B
:
(x. (y. y) (z. z)) ((0-b-0. b) (0-b-0. b) (c. c))
Then we keep going. In the next iteration, we pick invocation (0-b-0. b) (0-b-0. b)
. We have new A
(which is (0-b-0. b)
) and B
(which is (0-b-0. b)
). Now, B
has 0th parameter, so we don't change it. However, A
is not by reference, so we need to update the 1st parameter of B
. The lowest 1st parameter excluding A
and B
is 0
, so it actually remains unchanged. Then we apply standard beta reduction:
(x. (y. y) (z. z)) ((0-b-0. b) (c. c))
We proceed similarly. Here are the remaining steps:
(x. (y. y) (z. z)) (0-c-0. c) (y. y) (z. z) 0-z-0. z
Once we reached the 0-z-0. z
, no more rules can be applied and we terminate the program.
Native identifiers
If A
is a native identifier, then apply this step. There are five possible native identifiers that can appear. They are [0]
, [1]
, [2]
, [3]
, [4]
. Each one is handled in a special manner.
Identifier [0]
There must be two expressions after it and they must be abstractions. For example, consider this expression:
(0-&a-0. &b. &c. [1] a b c) ([0] (0-&a-0. &b. &c. [1] a b c) (1-&a-1. [2] (0-&a-0. &b. &c. [1] a b c)))
Here the first abstraction after [0]
is (0-&a-0. &b. &c. [1] a b c)
(we call it X
) and the second one is (1-&a-1. [2] a)
(we call it Y
). First, we find the lowest 2nd parameter that is not used by any abstraction. In this case, it is 0
. Then we replace all abstractions that have the same 1th parameter as the 1th parameter of X
(in this case it is 0
) with copies of Y
. If Y
itself contains an abstraction that has the same 1th parameter as X
, then replace it (that abstraction) with identifier {z}
where z
is the lowest 2nd parameter we found (in this case {0}
). Finally, we replace [0] X
with (&{z}. {z} {z})
and we encapsulate Y
into an abstraction with by-reference argument {z}
.
Identifier [1]
There must be three abstractions after it (we call them X
, Y
and Z
respectively). If the 0th parameter of X
is equal to the 0th parameter of Y
then replace [1] X Y Z
with Z Z
, otherwise replace it with Z
. For example, if we have
[1] (0-&a-0. a) (1-&a-1. a) (2-&a-2. a)
it will be reduced to
2-&a-2. a
However, if we had
[1] (0-&a-0. a) (0-&a-1. a) (2-&a-2. a)
it would be reduced to
(2-&a-2. a) (2-&a-2. a)
Identifier [2]
Pattern: [2] X
If the next input bit is 1
replace it (the expression) with X X
, otherwise replace it with X
.
Identifier [3]
Pattern: [3] X
Output bit 0
and replace the expression with X
.
Identifier [4]
Pattern: [4] X
Output bit 1
and replace the expression with X
.
Examples
Simple examples
Suppose the source code is the identity function:
a.a
It is the simplest possible source code (it cannot be empty, so the identity function is the simplest expression we can write). The first thing that the interpreter does is to append wrappers of native identifiers. Here we show step-by-step what happens:
1. (a. a) (&a. b. [0] a b) (&a. &b. &c. [1] a b c) (&a. [2] a) (&a. [3] a) (&a. [4] a) 2. (0-&a-0. b. [0] a b) (&a. &b. &c. [1] a b c) (&a. [2] a) (&a. [3] a) (&a. [4] a) 3. (b. [0] (0-&a-0. &b. &c. [1] a b c) b) (&a. [2] a) (&a. [3] a) (&a. [4] a) 4. [0] (0-&a-0. &b. &c. [1] a b c) (1-&a-1. [2] a) (&a. [3] a) (&a. [4] a) 5. (&{0}. {0} {0}) (&{0}. 1-&a-1. [2] a) (&a. [3] a) (&a. [4] a) 6. (0-&{0}-0. 1-&a-1. [2] a) (0-&{0}-0. 1-&a-1. [2] a) (&a. [3] a) (&a. [4] a) 7. (1-&a-1. [2] a) (&a. [3] a) (&a. [4] a) 8. [2] (0-&a-0. [3] a) (&a. [4] a) 9. (0-&a-0. [3] a) (&a. [4] a) 10. [3] (0-&a-0. [4] a) 11. 0-&a-0. [4] a
In step 8 we read bit 0
and in step 10 we write bit 0
. The final reduced expression have no effects to the program's output (it is irrelevant).
Here is another example. Source code is
a. b. c. d. e. (a. b. b) (a d e) d a
and the steps are:
1. (a. b. c. d. e. (a. b. b) (a d e) d a) (&a. b. [0] a b) (&a. &b. &c. [1] a b c) (&a. [2] a) (&a. [3] a) (&a. [4] a) 2. (b. c. d. e. (a. b. b) ((0-&a-0. b. [0] a b) d e) d (0-&a-0. b. [0] a b)) (&a. &b. &c. [1] a b c) (&a. [2] a) (&a. [3] a) (&a. [4] a) 3. (c. d. e. (a. b. b) ((0-&a-0. b. [0] a b) d e) d (0-&a-0. b. [0] a b)) (&a. [2] a) (&a. [3] a) (&a. [4] a) 4. (d. e. (a. b. b) ((0-&a-0. b. [0] a b) d e) d (0-&a-0. b. [0] a b)) (&a. [3] a) (&a. [4] a) 5. (e. (a. b. b) ((0-&a-0. b. [0] a b) (1-&a-1. [3] a) e) (1-&a-1. [3] a) (0-&a-0. b. [0] a b)) (&a. [4] a) 6. (a. b. b) ((0-&a-0. b. [0] a b) (1-&a-1. [3] a) (2-&a-2. [4] a)) (1-&a-1. [3] a) (0-&a-0. b. [0] a b) 7. (a. b. b) ((b. [0] (1-&a-1. [3] a) b) (2-&a-2. [4] a)) (1-&a-1. [3] a) (0-&a-0. b. [0] a b) 8. (a. b. b) ([0] (1-&a-1. [3] a) (2-&a-2. [4] a)) (1-&a-1. [3] a) (0-&a-0. b. [0] a b) 9. (a. b. b) ((&{0}. {0} {0}) (&{0}. 2-&a-2. [4] a)) ((&{0}. {0} {0}) (&{0}. 2-&a-2. [4] a)) (0-&a-0. b. [0] a b) 10. (a. b. b) ((1-&{0}-1. 2-&a-2. [4] a) (1-&{0}-1. 2-&a-2. [4] a)) ((&{0}. {0} {0}) (&{0}. 2-&a-2. [4] a)) (0-&a-0. b. [0] a b) 11. (a. b. b) (2-&a-2. [4] a) ((&{0}. {0} {0}) (&{0}. 2-&a-2. [4] a)) (0-&a-0. b. [0] a b) 12. (b. b) ((&{0}. {0} {0}) (&{0}. 2-&a-2. [4] a)) (0-&a-0. b. [0] a b) 13. (b. b) ((1-&{0}-1. 2-&a-2. [4] a) (1-&{0}-1. 2-&a-2. [4] a)) (0-&a-0. b. [0] a b) 14. (b. b) (2-&a-2. [4] a) (0-&a-0. b. [0] a b) 15. (2-&a-1. [4] a) (0-&a-0. b. [0] a b) 16. [4] (0-&a-0. b. [0] a b) 17. 0-&a-0. b. [0] a b
In step 16 we output bit 1
.
Hello, World!
0.1.2.3.4.(5.6.5 6)(7.8.9.0 9(0.0 4 3 9)7 7 7 8 7 7 8 7 8 7 8 7 7 8 8 7 7 7 8 8 7 8 8 7 7 7 8 8 7 8 8 7 8 8 8 8 7 8 8 7 7 7 8 8 7 8 7 7 7 7 7 7 7 8 7 7 8 8 8 7 8 7 8 7 8 8 8 8 7 8 8 7 7 8 7 7 8 8 8 7 7 7 8 8 7 8 8 7 7 7 8 7 7 8 8 7 8 7 7 7 7 8 7 7)(0.1.1)(0.1.0)0
Cat program
(&0.&1.&2.(3.4.5.6.7.(8.(9.(a.(b.(c.1(0(2 c(d.e.e))(2 b(d.e.d))(2 a(f.g.f 1(h.1(0(g 1))(a f g))c 1)) (2 9(h.(i.1(0(2 i c)(5(h.2 i b)))i)1))(2 8(i.i 7 6 1)))(a 9(h.8(9 1))))1)1)1)1)1)2)((&d.d d)(&d.&e.d d))(&d.&e.e)
Invert bits
(&0.&1.&2.(3.4.5.6.7.(8.(9.(a.(b.(c.(d.1(0(2 d(e.f.f))(2 c(e.f.e))(2 b(g.g d c))(2 a(h.i.h 1(j.1(0(i 1))(a h i))d 1))(2 9(j.(g.1(0(2 g d)(5(j.2 g c)))g)1))(2 8(g.g 7 6 1)))(a 9(j.8(b(9 1)))))1)1)1)1)1 )1)2)((&e.e e)(&e.&f.e e))(&e.&f.f)
Reverse bits
(&0.&1.&2.(3.4.5.6.7.(8.(9.(a.(b.(c.(d.(e.(f.(g.(h.(i.(j.1(0(2 j(k.j))(2 i(l.m.m))(2 h(l.m.l))(2 g(n .n i h))(2 f(l.m.(o.1(0(2 o i)(4 l m(k.2 o h)))o)1))(2 e(p.q.p(k.1(0(q 1))j)(k.l.m.m 1)1))(2 d(l.l)) (2 c(p.q.p 1(k.1(0(q 1))(c p q))i 1))(2 b(l.m.r.s.(o.1(0(2 o i)(e r(k.t.1(0(e s(k.2 m t)))(d(k.2 l t )))d(k.2 o(s m l))))o)1))(2 a(k.(n.1(0(2 n i)(5(k.2 n h)))n)1))(2 9(n.n 7 6 1))(2 8(b i i))(c a(k.2 8(b(a 1)8))))(c(k.g(f(8 i h)i))(k.1(0(9(8 i i)))(2 8(8 i h)))))1)1)1)1)1)1)1)1)1)1)1)1)2)((&l.l l)(& l.&m.l l))(&l.&m.m)