PricK
Paradigm(s) | Concatenative |
---|---|
Designed by | User:Olus2000 |
Appeared in | 2023 |
Memory system | Stack-based |
Computational class | Total, primitive recursive |
Reference implementation | Python |
Influenced by | BlooP |
File extension(s) | .prick |
PricK is a minimalistic stack language equivalent in power to primitive recursive functions. Its only looping construct is a bounded loop, which makes sure every PricK program always terminates on any input, making it a total language. It was designed to prove that a bounded loop language can be capable of interpreting itself.
Language overview
PricK's memory consists of an unbounded stack and an unbounded random access memory, both capable of holding natural numbers which are the sole data type in the language. Like most concatenative languages it has very simple syntax based on whitespace-separated tokens:
<program> ::= <definition> <program> | <body> <definition> ::= <body> ':' <identifier> <body> ::= <empty> | <identifier> <body> | <loop> <body> <loop> ::= '[' <body> '|' <body> ']'
Body of the program is the one after the last definition.
PricK memory is initialised with zeros, and PricK stack yields zeros when empty so it can also be viewed as initialised with infinite zeros.
Words
Valid identifiers are all non-whitespace strings other than :
, [
, |
and ]
. Some identifiers initially correspond to builtin functions:
Identifier | Operation |
---|---|
@
|
Replace the number at the top of the stack with the contents of the cell at that address |
!
|
Pop an address and a value from the stack and set the contents of the cell at that address to that value |
#
|
Push a zero to the stack |
++
|
Increment number at the top of the stack |
Loops
The main means of flow control in PricK are bounded while loops. They consist of two code blocks: predicate (between [
and |
) and body (between |
and ]
). Execution of a loop proceeds as follows:
0. One number is popped from the stack and set as the loop's bound;
1. Predicate is executed;
2. One number is popped from the stack as a condition for step 3;
3. If either the condition or the bound is zero execution continues after the loop;
4. Otherwise the body is executed;
5. Bound is decremented and loop execution continues from step 1.
Definitions
Definition is a body followed by an identifier assignment with :
. Code following a definition can use the assigned identifier to call the body of the definition. The body of a definition isn't available under any identifier to itself or the code before it, making it impossible to use recursion which would break the bounded loop restriction. Definitions may redefine identifiers, including the builtin identifiers @
, !
, #
and ++
, but not syntax elements :
, [
, |
and ]
(though these characters are allowed as parts of longer identifiers).
Computational class
PricK is Total and its programs are equivalent to primitive recursive functions.
BlooP translation
PricK's computational class can be shown by showing its equivalence to BlooP. Unlike for Turing Complete languages a translation both ways is required to prove equivalence.
PricK to BlooP
The main hardship in this part is to encode the potentially infinite memory and stack in BlooP, which only allows storage in cells that are explicitly mentioned in code. Fortunately the cells are unbounded, so it is possible to encode series of values as a product of powers of primes, and pairs of numbers using a pairing function.
DEFINE PROCEDURE ''MINUS'' [M,N]: BLOCK 0: BEGIN LOOP AT MOST M TIMES: BLOCK 1: BEGIN IF NOT OUTPUT + N < M, THEN: ABORT LOOP 1; OUTPUT <= OUTPUT + 1; BLOCK 1: END; BLOCK 0: END. DEFINE PROCEDURE ''DIVISIBLE?'' [M,N]: BLOCK 0: BEGIN CELL(0) <= 0; LOOP AT MOST M TIMES: BLOCK 1: BEGIN IF NOT CELL(0) < M, THEN: ABORT LOOP 1; CELL(0) <= CELL(0) + N; BLOCK 1: END; IF CELL(0) = M, THEN: BLOCK 1: BEGIN OUTPUT <= YES; BLOCK 1: END BLOCK 0: END. DEFINE PROCEDURE ''DIV'' [M,N]: BLOCK 0: BEGIN CELL(0) <= 0; LOOP AT MOST M TIMES: BLOCK 1: BEGIN IF CELL(0) + N > M, THEN: ABORT LOOP 1; CELL(0) <= CELL(0) + N; OUTPUT <= OUTPUT + 1; BLOCK 1: END; BLOCK 0: END. DEFINE PROCEDURE ''GET'' [ENCODED,PRIME]: BLOCK 0: BEGIN CELL(0) <= ENCODED; LOOP AT MOST ENCODED TIMES: BLOCK 1: BEGIN IF NOT DIVISIBLE? [CELL(0),PRIME], THEN: ABORT LOOP 1; CELL(0) <= DIV [CELL(0),PRIME]; OUTPUT <= OUTPUT + 1; BLOCK 1: END; BLOCK 0: END. DEFINE PROCEDURE ''POW'' [M,N]: BLOCK 0: BEGIN OUTPUT <= 1; LOOP N TIMES: BLOCK 1: BEGIN OUTPUT <= OUTPUT * M; BLOCK 1: END; BLOCK 0: END. DEFINE PROCEDURE ''SQRT'' [N]: BLOCK 0: BEGIN LOOP AT MOST N TIMES: BLOCK 1: BEGIN IF (OUTPUT + 1) * (OUTPUT + 1) > N, THEN: ABORT LOOP 1; OUTPUT <= OUTPUT + 1; BLOCK 1: END; BLOCK 0: END.
The stack can be represented as a cons list, with head (top of stack) and rest (stack without the top) encoded using Szudzik's ElegantPair function.
DEFINE PROCEDURE ''REST'' [STACK]: BLOCK 0: BEGIN OUTPUT <= SQRT [STACK]; CELL(0) <= MINUS [STACK,OUTPUT * OUTPUT]; IF NOT CELL(0) < OUTPUT, THEN: BLOCK 1: BEGIN OUTPUT <= MINUS [CELL(0),OUTPUT]; BLOCK 1: END BLOCK 0: END. DEFINE PROCEDURE ''TOP'' [STACK]: BLOCK 0: BEGIN OUTPUT <= SQRT [STACK]; CELL(0) <= MINUS [STACK,OUTPUT * OUTPUT]; IF CELL(0) < OUTPUT, THEN: BLOCK 1: BEGIN OUTPUT <= CELL(0); BLOCK 1: END BLOCK 0: END. DEFINE PROCEDURE ''PUSH'' [STACK,ELT]: BLOCK 0: BEGIN OUTPUT <= STACK + ELT + ELT * ELT; IF ELT < STACK, THEN: BLOCK 1: BEGIN OUTPUT <= (STACK * STACK) + ELT; BLOCK 1: END BLOCK 0: END.
Memory can be encoded as a product of powers of primes, with nth prime encoding nth cell and 0th prime being 2. The hard part of this is to find the nth prime, which the following code accomplishes naively by checking each number up to (n+2)^2
for primality and taking the nth one of those. This is a very generous upper bound based on the one listed on wikipedia.
DEFINE PROCEDURE ''PRIME?'' [N]: BLOCK 0: BEGIN CELL(0) <= 2; LOOP AT MOST N TIMES: BLOCK 1: BEGIN IF DIVISIBLE? [N,CELL(0)], THEN: QUIT BLOCK 0; IF CELL(0) * CELL(0) > N, THEN: ABORT LOOP 1; CELL(0) <= CELL(0) + 1; BLOCK 1: END; OUTPUT <= YES; BLOCK 0: END. DEFINE PROCEDURE ''NTH-PRIME'' [N]: BLOCK 0: BEGIN CELL(0) <= 0; OUTPUT <= 2; LOOP AT MOST (N + 2) * (N + 2) TIMES: BLOCK 1: BEGIN IF NOT CELL(0) < N, THEN: ABORT LOOP 1; OUTPUT <= OUTPUT + 1; IF PRIME? [OUTPUT], THEN: BLOCK 2: BEGIN CELL(0) <= CELL(0) + 1; BLOCK 2: END BLOCK 1: END; BLOCK 0: END. DEFINE PROCEDURE ''GET-AT'' [MEMORY,INDEX]: BLOCK 0: BEGIN OUTPUT <= GET [MEMORY,NTH-PRIME [INDEX]]; BLOCK 0: END. DEFINE PROCEDURE ''SET'' [MEMORY,INDEX,VALUE]: BLOCK 0: BEGIN OUTPUT <= DIV [MEMORY,POW [NTH-PRIME [INDEX],GET-AT [MEMORY,INDEX]]] * POW [NTH-PRIME [INDEX],VALUE]; BLOCK 0: END.
With all of this out of the way a translation can be defined. Cells 0 and 1 will hold stack and memory respectively. Cell 2 will be used for a stack of loop bounds. User-defined words will be called with these as arguments, and return a pairing of memory and stack.
Feature of PricK | Translation to BlooP |
---|---|
Definition of word x
|
DEFINE PROCEDURE ''x'' [STACK,MEMORY]: BLOCK 0: BEGIN CELL(0) <= STACK; CELL(1) <= MEMORY; CELL(2) <= 0; ... OUTPUT <= PUSH [CELL(1),CELL(0)]; BLOCK 0: END. |
Main body of the program | DEFINE PROCEDURE ''UNUSED-IDENTIFIER'' [STACK,MEMORY]: BLOCK 0: BEGIN CELL(0) <= STACK; CELL(1) <= MEMORY; CELL(2) <= 0; ... OUTPUT <= PUSH [CELL(1),CELL(0)]; BLOCK 0: END. |
Call to a user-defined word x
|
CELL(3) <= x [CELL(0),CELL(1)]; CELL(0) <= TOP [CELL(3)]; CELL(1) <= REST [CELL(3)]; |
Call to builtin @
|
CELL(3) <= TOP [CELL(0)]; CELL(0) <= REST [CELL(0)]; CELL(3) <= AT [CELL(1),CELL(3)]; CELL(0) <= PUSH [CELL(0),CELL(3)]; |
Call to builtin !
|
CELL(3) <= TOP [CELL(0)]; CELL(0) <= REST [CELL(0)]; CELL(4) <= TOP [CELL(0)]; CELL(0) <= REST [CELL(0)]; CELL(1) <= SET [CELL(1),CELL(3),CELL(4)]; |
Call to builtin #
|
CELL(0) <= PUSH [CELL(0),0]; |
Call to builtin ++
|
CELL(0) <= PUSH [REST [CELL(0)],TOP [CELL(0)] + 1]; |
[ .pred. | .body. ]
|
CELL(2) <= PUSH [CELL(2),TOP [CELL(0)]]; CELL(0) <= REST [CELL(0)]; .pred. CELL(3) <= TOP [CELL(0)]; CELL(0) <= REST [CELL(0)]; CELL(4) <= TOP [CELL(2)]; CELL(2) <= REST [CELL(2)]; LOOP AT MOST CELL(4) TIMES: BLOCK N: BEGIN IF CELL(3) = 0, THEN: ABORT LOOP N; .body. .pred. CELL(3) <= TOP [CELL(0)]; CELL(0) <= REST [CELL(0)]; BLOCK N: END; |
This completes the first half of the proof: BlooP is at least as powerful as PricK.
BlooP to PricK
For the sake of this translation availability of all extensions mentioned in the Language Extensions section is assumed, that is: stack shuffling and math primitives, decimal number support, and auxiliary stack. Main hardship in this part is handling the arbitrary breaking out of loops and blocks. To do that each bock, loop, and if will return a status output of 0
or N 1
indicating that it should break out of block N
. Then the rest of the code after the block will be executed or not based on the status output.
Because each procedure in BlooP uses a predefined amount of CELL
s and an OUTPUT
variable, it is possible to assign each of them a base
offset such that ranges [base,base+max_cell_num+1]
don't overlap for any two procedures. This allows allocation of PricK memory cells to those variables in a way that never has to be saved because no two procedures overlap and a procedure can't call itself.
Feature of BlooP | Translation to PricK |
---|---|
DEFINE PROCEDURE ''X'' [ARG1,ARG2,...,ARGN]: BLOCK M: BEGIN .body. BLOCK M: END. |
N [ 1 | >aux ] 0 base ! 0 .body. 1 [ | drop 0 ] base @ N [ 1 | aux> drop ] : X |
IF EXPR, THEN: BLOCK N: BEGIN .body. BLOCK N: END .after. |
1 [ EXPR | .body. ] 1 swap 1 [ | drop dup N != 1 swap 1 [ | 0 0 0 ] swap drop 0 ] 1 [ | 0 .after. 0 ] |
LOOP AT MOST EXPR TIMES: BLOCK N: BEGIN .body. BLOCK N: END; .after. |
EXPR [ 1 swap 1 [ | drop dup N != dup 1 [ | 0 swap 0 ] swap drop 0 0 ] | 0 .body. ] 1 swap 1 [ | drop dup N != 1 swap 1 [ | 0 0 0 ] swap drop 0 ] 1 [ | 0 .after. 0 ] |
LOOP EXPR TIMES: BLOCK N: BEGIN .body. BLOCK N: END; .after. |
EXPR [ 1 swap 1 [ | drop dup N != dup 1 [ | 0 swap 0 ] swap drop 0 0 ] | 0 .body. ] 1 swap 1 [ | drop dup N != 1 swap 1 [ | 0 0 0 ] swap drop 0 ] 1 [ | 0 .after. 0 ] |
IF EXPR, THEN: QUIT BLOCK N; .after. |
1 [ EXPR | drop N 1 ] 1 swap 1 [ | 0 0 ] 1 [ | 0 .after. 0 ] |
IF EXPR, THEN: ABORT LOOP N; .after. |
1 [ EXPR | drop N 1 ] 1 swap 1 [ | 0 0 ] 1 [ | 0 .after. 0 ] |
CELL(N) <= EXPR |
EXPR N base + ++ ! |
OUTPUT <= EXPR |
EXPR base ! |
X [EXPR1,EXPR2,...,EXPRN] |
EXPR1 EXPR2 ... EXPRN X |
EXPR1 + EXPR2 |
EXPR1 EXPR2 + |
EXPR1 * EXPR2 |
EXPR1 EXPR2 * |
EXPR1 OR EXPR2 |
EXPR1 EXPR2 + |
EXPR1 AND EXPR2 |
EXPR1 EXPR2 * |
EXPR1 > EXPR2 |
EXPR1 EXPR2 - |
EXPR1 = EXPR2 |
EXPR1 EXPR2 != 1 swap 1 [ | drop 0 0 ] |
Access to argument number N , 1-indexed
|
N [ 1 | aux> ] dup N [ 1 | swap >aux ] |
Access to CELL(N)
|
N base + ++ @ |
Access to OUTPUT
|
base @ |
Literal number N
|
N |
YES |
1 |
NO |
0 |
This completes the second part of the proof: PricK is at least as powerful as BlooP, thus proving the equivalence.
Self interpreter
Because there is no universal primitive recursive function PricK and equivalent languages aren't capable of self-interpretation.
Language extensions
Implementations may provide additional builtin functions to make computation more efficient. Here are some propositions for such additions, along with proofs that they don't increase PricK's power.
Basic stack shuffling
By delegating specific cells in memory as scratch variables it is possible to shuffle values on the stack. Basic stack shuffling commands common in other stack languages can be implemented in PricK:
# ++ : tmp0 # ++ ++ ++ : tmp1 # ++ ++ ++ ++ ++ : tmp2 tmp0 ! tmp0 @ : id id tmp0 @ : dup tmp0 ! : drop tmp0 ! tmp1 ! tmp0 @ tmp1 @ : swap tmp1 ! id tmp1 @ tmp0 @ : over tmp2 ! swap tmp2 @ swap : rot
Maths
Base PricK provides only the incrementation operator, but many other mathematical functions are well within its power, although at a cost of significant performance loss which is why they are expected to be provided as builtin optimised functions:
[ # ++ | ++ ] : + # swap # swap [ # ++ | swap drop dup ++ ] drop : -- [ # ++ | -- ] : - tmp0 ! tmp1 ! # tmp0 @ [ # ++ | tmp1 @ + ] : * tmp2 ! # swap id ++ tmp0 @ [ tmp2 @ - dup | swap ++ swap ] drop : / over over swap - rot rot - + : !=
An interesting side effect of totality of PricK is that the slightly altered naive division algorithm presented here won't loop forever on division by zero but instead will return the dividend as if division by one occured.
Numbers
An expected convenience feature is defining all numeric identifiers to be functions that push the corresponding numbers to the stack. There's no way to define an infinite amount of identifiers in base PricK, but the following defintions facilitate a sort of space-separated decimal notation:
# ++ ++ ++ ++ ++ dup + * : 0 0 ++ : 1 0 ++ ++ : 2 0 ++ ++ ++ : 3 0 ++ ++ ++ ++ : 4 0 ++ ++ ++ ++ ++ : 5 0 ++ ++ ++ ++ ++ ++ : 6 0 ++ ++ ++ ++ ++ ++ ++ : 7 0 ++ ++ ++ ++ ++ ++ ++ ++ : 8 0 ++ ++ ++ ++ ++ ++ ++ ++ ++ : 9 # 2 1 3 7 : 2137
Auxiliary stack
To facilitate arbitrary stack shuffling it's useful to have a second stack where data is stored temporarily. Addition of such stack in an implementation won't expand PricK's capabilities since it can be emulated in memory, for example by using even memory cells for it, and leaving odd memory cells for normal RAM use. This way RAM remains unbounded and the new stack is unbounded as well.
# 7 : asp asp @ dup -- -- asp ! @ : aux> asp @ ++ ++ dup asp ! ! : >aux asp @ @ : aux@ dup + ++ @ : @ dup + ++ ! : ! # : tmp0 # ++ : tmp1 # 2 : tmp2 # 3 : asp
Code examples
For clarity examples in this section assume that facilities implemented in the Language Extensions section are implemented.
Fibonacci numbers
The naive recursive implementation requires some significant changes to turn it into a bounded loop which makes it non-trivial, so a different approach is presented here. Input and output is one number on the stack.
tmp0 ! # # ++ tmp0 @ [ # ++ | over + swap ] drop
Compact variant
To make parsing and interpretation simpler a compact version of PricK was conceived: in it each character is a separate token no matter if it's whitespace or not, and the builtin increment function is represented by the character +
. Undefined tokens (characters) are treated as no-ops, so whitespace can still be used for code clarity until it's assigned a body.
Definitions from Stack shuffling, Maths and Numbers sections translated to compact PricK:
#+:a #+++:b #+++++:c a!a@:i ia@:% a!:. a!b!a@b@:~ b!ib@a@:o c!~c@~:r [#+|+]:p #~#~[#+|~.%+].:- [#+|-]:m a!b!#a@[#+|b@+]:* c!#~i+a@[c@-%|~+~].:/ #+++++%p*:0 0+:1 0++:2 0+++:3 0++++:4 0+++++:5 0++++++:6 0+++++++:7 0++++++++:8 0+++++++++:9 #2137:J