FUnctional staCK

From Esolang
Jump to navigation Jump to search

FUnctional staCK is a minimal staCK based programming language created by User:CatCatDeluxe that shares some traits with FUnctional languages. For short, it can be called FUCK, a perfectly innocuous name.

Despite the perfectly innocuous name, this is actually quite a serious project, with an interpreter in Zig currently in the works. The interpreter is also for a school project, where this language will probably be referred to by a better name.

FUCK's main features are:

  • A global stack, the main method of passing around values.
  • Local variable declaration with pattern matching
  • First class functions, which can capture local variables

The three data types in FUCK are functions, numbers (64-bit floats), and symbols.

Comments are denoted by --, like in lua. This may cause syntax confusion with the rules for names, but at least it looks nice.

Examples

Cat program

-- read characters until a newline
{getch! (10: | @!)}!
-- print the stack in reverse order
{val: @! val putch! | ()}!

Truth machine

get-num!
(
| 1: {1 print! @!}!
| 0: 0 print!
)

Features

Primitives

Numbers

Numbers are 64-bit floating point values. A numeric literal is pushed by simply writing it. Exponential syntax is supported.

1
2.0
1e100

Symbols

A symbol is a unique identifier, written by an apostrophe followed by a name.

'symbol
'symboly-symbol
'underscore_symbol

A symbol can also be an operator name. The rules for what a symbol can be called are exactly the same as for regular identifiers. See more info in Names.

'+
'*+===+++==*

During compilation, symbols are simply turned into integer IDs. If two identifiers are considered equal, two symbols will also be equal. This means that the following are all equal:

'symboly-symbol
'symboly_symbol
'symbolySymbol

(These rules are explained in more detail in Names.)

Match statements

Match statements are one of the two methods of control flow in FUCK. They are composed of one or more branches, which each contain zero or more checks/declarations. Each branch can to store values to locals and perform checks to decide whether to execute or not.

Match statements are written using :. A match statement spans the entire enclosing block. The names and/or checks are placed before the colon, and the code to execute is placed after.

The below example is a match with one branch and no checks. In its single branch, the local a = 1 and b = 2. After executing, the stack would be 2, 1, since the locals were pushed in reverse order.

1 2
(a b: b a)

The following code will pick the second branch, because there are not enough items on the stack for the first one. After this program executes, there will be no values on the stack, because they were consumed and stored in d and e.

1 2
(a b c: c b a | d e: )

If no branches in a match can execute, an error is emitted:

1
(a b c: c b a | d e: ) -- error! no branch accepts just one value

An empty branch can be added to suppress the error. It must be added at the end, because the first branch to pass all checks will be the one to execute. After this program executes, the stack is 1, since the branch that executed did not consume any values.

1
(a b c: c b a | d e: | ()) -- nothing 

note: The branch cannot be completely whitespace (e.g. the second branch of (a: |)), because branches that are completely empty are ignored. You could keep that branch from being ignored by adding a colon (a: | :) or an empty block (a: | ()).

Shadowing

Name shadowing is allowed. Declaring a name in a branch will shadow names from upper scopes.

Checks

Instead of just naming locals, a branch may also include checks, which are regular code inside of round brackets. Instead of consuming a value and assigning it a name, the value is moved to a temporary second stack, where code contained inside of the brackets is run. The check passes if, after this code, the temorary stack is non-empty and its top value is truthy.

1 2
-- As long as the second-to-top value = 1, the top value is added back on.
((1=) b: b)
-- stack = [2]

Writing an identifier i before a branch's : is shorthand for (i=):

1 'sym 2
-- both are identical, but one is nicer to look at:
((1=) ('sym=) x: x)
(1 'sym x: x)
Special case: name equality

If a name appears twice or more in a branch's checks, the branch's will execute only if each value consumed by that name are all equal. _ does not count, because it does not bind any name at all.

For example, to create your own = function you could write:

{a a: 1 | _ _: 0}

Functions

Functions are created by enclosing code in curly brackets.

{+} -- Adds the top two values

(note: + is just a regular identifier. The only difference between operator-like identifiers and regular ones is that a call instruction is automatically placed after operator-like names.)

The top value of the stack can be called using the character !. Of course, it must be a function or else an error is emitted.

1 2 {+}!
-- stack = [3]

Functions can be combined with matches, and are allowed to capture locals:

1 2 (a b: {b a})
-- stack = [{2 1}]
!
-- stack = [2, 1]
Recursion

The @ character can be used to push the currently executing function onto the stack, which makes control flow via recursion easy. For example, this short function would reverse the stack:

{
| val: @! val
| : -- This empty branch is needed to avoid an error once the stack is empty
}

The only limit to depth of recursion is your available memory. Any call that would be the last operation in a function (with control flow taken into account) is optimized into a tail call.

Function checks

Another kind of check for match branches exists, which can match on the result of a function.

{1 2 3}
({a b c}: b b b)
-- stack = [2, 2, 2]

Function checks can even contain nested checks:

{1 {2 3} 4}
({a {b c} (4=)}: c b a)
-- stack = [3, 2, 1]

To perform a function check, the function is executed in an empty temporary stack. It is only executed once per match block. That means if multiple branches use a function check, it is only run once and the results are saved for the next check. This is important to note for functions with side effects.

In this truth-machine example, get-num (which reads a single number from stdin) is only called once:

get-num
(
| {1}: {1 print! @!}!
| {0}: 0 print!
)

Name equality applies across levels to function checks, too:

1 {1 {1} 1}
(a {a {a} a}: 1 | _: 0)
-- stack = [1]

Standard library

The FUCK "standard library" contains the following functions, which are defined globally. The behavior of a function when it does not have the correct number or types of arguments is to emit an error and halt execution, unless stated otherwise.

Name Function
= If the top two values are equal, pushes 1. Otherwise, pushes 0.
~= Inverse of =
+ Adds the top two numbers.
-

Subtracts the top two numbers. The top is subtracted from the second-to-top. For example, 1 2 - is -1.

* Multiplies the top two numbers.
/ Divides the top two numbers. The second-to-top is divided by the top. For example, 1 2 / is 0.5.
and Boolean AND
or Boolean OR
not Boolean NOT
getch Blocks until a character is read from stdin.
putch Writes a character to stdout. It would be cool if it printed a unicode character if the number is above 127 but that feature may or may not be implemented.
print Prints a text representation of the top value.

Specifics

Names

Names are used to refer to locals, captures, and possibly global constants. There are two types of names, regular names and operator names. The only difference between the a regular and operator name is that an operator name will automatically insert a call instruction after being referenced, while a regular name will not.


A regular name consists of groups of length 1+ of the characters a-z and 0-9, separated by single delimiter characters, either - or _. Uppercase letters may also separate groups, and are converted to lowercase and become the first letter of a group.

Additional rules:

  • Uppercase letters directly following the start of a group become lowercase and part of the same group.
  • Any leading or trailing delimiters are not included in the name.
  • A name may not start with a digit, because the parser will try to parse a number instead.

Regular names are considered equal when they have the same groups, not when their text is exactly the same.


An operator name consists of any non-alphanumeric, non-reserved, printable ASCII character. The only reserved characters in the language are ()[]{},.|'"#` and whitespace. All other printable ASCII characters are allowed in a name, as long as it follows the other restrictions.


A single underscore will parse to an empty regular name, which is an error when used to try to reference something, but can be used in a declaration to not assign a name to the value.


The syntax might seem ambiguous, but it was carefully designed (at 3am) (to be amusing). It has already been implemented in the interpreter and will not be removed or changed.

To help, here are some examples of what is and isn't a name. Regular names are highlighted in blue, operator names are highlighted in green, and single-underscore names are highlighted in red. Text that is not part of a name is not highlighted. Note that once the language is complete, some of these ambiguous situations will emit a parser warning.

-a-hyphenated-name_ -- Leading and trailing separators are not counted
double--hyphen (this is now a comment)
_______ -- These are all separate names.

Truthiness

A value in FUCK is considered truthy as long as it is not the empty function or the floating point number 0.

Equality

The methods of determining equality for the different data types are as follows:

  • Numbers: Exact floating-point comparison
  • Symbols: Compares IDs. Symbols with the equal identifiers have the same ID. For more info, see Names
  • Plain functions: Functions are equal if their code is the same.
  • Closures: Closures are equal if their code is the same, and their captures are all equal.
  • Builtin functions: Some names, such as +, = or getch, are builtin functions. When pushed, they are wrappers for function pointers. Builtins are equal if their function pointers are equal.

Using this, you can easily test for deep equality of data structures that use closures to store values.

Data structures

Although the language has few features, many data structures can be implemented using closures. Due to the limited feature set, the performance of most of these data structures are not fast, but a lot can be done with the few features included.

Tuple

A tuple can be created simply by writing a function that pushes the values:

{1 2 3}
'sym 4 (x y: {x y})

This is a function to create a tuple of length n. When called it will push all the values at once:

{
| 0: {}
| val n:
    n 1- @!
    (next: {val next!})
}

Linked list

A linked list can be implemented as a function that returns some value, and a the next node of the list, with an empty function marking the last element:

{1 {2 {3 {4 {}}}}

The list can be easily unpacked using pattern matching:

{
| {val next}: val next @!
| {}:
}

The length can be obtained using a similar method combined with an accumulator:

{list:
  0 -- use an accumulator to take advantage of tail calls
  list {
  | {_ next}: 1+ next @!
  | {}:
  }!
}

Finally, a function to create a list of length n can be implemented like this:

{
| 0: {}
| head n:
    (n 1-) @! -- create the tail with n - 1 elements
    (tail: {head tail}) -- compose the list
}

Map

This is not an actual hash map, but it does map keys to values. The map cannot be called directly to retrieve a value, because then modifying it would be impossible.

Create a map (an empty function is an empty map):

{{}}

Get by key, push nothing if not found:

{
| k {k v _}: v
| k {_ _ next}: k next @!
| k {}:
}

Add or modify a pair:

{
| newk newv {newk _ next}: {newk newv next} -- modify an entry
| newk newv {k v next}: newk newv next @! (next: {k v next}) -- search for the next entry
| newk newv {}: {newk newv {}} -- update an empty map
}