C()

From Esolang

C() (pronounced C called) is a sketch for an esoteric programming language by User:Rdococ that tries to approximate purely functional programming at a low level.

Read 'C' as 'generic low-level language'. :)

Ideas

C() is a thought experiment to answer the question, "Can there exist a language that is purely functional, low-level and C-like?" I define the terms as follows:

  • Low-level means that a language closely matches hardware machine code; it does not abstract too far away from it.
  • Purely functional means that a language models some side effects explicitly, and abstracts all others away as optimizations.
  • C-like means that the language 'feels' like it is similar to C in the same way that C++ is. This means that it should target the von Neumann computer architecture, be performant, and it should also allow the programmer to shoot themselves in the foot.

The first two definitions are somewhat in tension with each other: the language must not abstract too far from the hardware, yet it must also abstract away invisible side effects.

My approach was to create a low-level language that requires side effects to be explicitly modelled, in a similar fashion to Rust. In keeping with C-likeness, the ownership checking in C() is neither comprehensive nor safe.

Changes

Firstly, a miscellaneous change: the main() procedure can now return a value of any type. This value is the 'return value' of the program, and implementations may choose to print it to standard output.

Expressions

In C(), there are no statements within functions. All function bodies are single expressions, the value of which is returned from the function.

int add(int x, int y): x + y;

The meaning of the = operator has been changed. a.b = c returns a copy of a with the mutation made. With pointers, *a.b = c updates the value of *a in place and constitutes a lend (see the section on pointers below). Constants can still be declared as usual with const.

struct Point {
  int x;
  int y;
};
Point mkpoint(int x, int y): Point {x, y};
Point movedUp(Point p): p.y = p.y + 1;

Point main(): movedUp(mkpoint(2, 3));
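
For comparison, the value-returning form of = can be approximated in plain C by passing the struct by value and returning the modified copy. This is only a sketch of the intended semantics, not a description of how a C() implementation would work; the names mk_point, moved_up and moved_up_example are chosen here to mirror the example above and are not part of C().

```c
#include <assert.h>

/* Plain-C approximation of C()'s value-returning `=` (illustrative only). */
typedef struct {
    int x;
    int y;
} Point;

/* Mirrors mkpoint: builds a Point from its fields. */
Point mk_point(int x, int y) {
    Point p = {x, y};
    return p;
}

/* Mirrors movedUp: p is the callee's own copy, so the caller's value
   is left untouched and the modified copy is returned. */
Point moved_up(Point p) {
    p.y = p.y + 1;
    return p;
}

/* Usage: the original stays (2, 3) while the result is (2, 4). */
void moved_up_example(void) {
    Point original = mk_point(2, 3);
    Point moved = moved_up(original);
    assert(original.y == 3);
    assert(moved.x == 2 && moved.y == 4);
}
```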

Pointers

There are owning and non-owning pointers, which are classified as different types. Owning pointers can be implicitly cast into non-owning pointers, but not vice versa.

C() borrows the new operator from C++ and uses it to create pointers to values, including primitives and structs. new returns an owning pointer.

Owning pointers use the symbol ! rather than *, though * is the dereferencing operator for both.

Point !p1 = new Point;

Owning pointers can be reused for new values with the = operator. Specifically, *a.b = c updates *a in-place and returns the pointer, still owned.

Point !moveRight(Point !p): *p.x = *p.x + 1;
Point main(): *moveRight(lend new Point {2, 3});
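
A rough C equivalent of the in-place update might look as follows. It assumes that new maps onto malloc and that an owning pointer is an ordinary C pointer; new_point, move_right and run are illustrative names. Note that in C, *p.x parses as *(p.x), so the sketch writes p->x where C() writes *p.x. The sketch also frees the pointer explicitly, which the C() example above does not do.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} Point;

/* Stand-in for `new Point {x, y}`: the caller owns the result. */
Point *new_point(int x, int y) {
    Point *p = malloc(sizeof *p);
    assert(p != NULL); /* sketch only: abort on allocation failure */
    p->x = x;
    p->y = y;
    return p;
}

/* Mirrors moveRight: update in place and hand the same pointer back. */
Point *move_right(Point *p) {
    p->x = p->x + 1;
    return p;
}

/* Mirrors main: dereference the returned pointer to get the result
   by value, then release the allocation (an addition of this sketch). */
Point run(void) {
    Point *p = move_right(new_point(2, 3));
    Point result = *p;
    free(p);
    return result;
}
```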

There are three ways you can pass owning pointers into functions.

  • f(p) - This gives the called function a non-owning pointer. Using = to update the value under the pointer results in a compile-time error.
  • f(lend p) - This gives the called function an owning pointer. = can update the value under the pointer.
  • f(give p) - This gives the called function an owning pointer, and also frees the pointer after the call returns.

Non-owning pointers can only be passed into functions the first way.
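
The three calling conventions can be imitated in C, with the caveat that C checks none of them apart from const. In this sketch, a const pointer stands in for the non-owning case, a plain pointer for lend, and a plain pointer followed by free at the call site for give. All function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} Point;

/* f(p): the callee gets a read-only view; const forbids updates. */
int sum(const Point *p) {
    return p->x + p->y;
}

/* f(lend p): the callee may update in place and returns the pointer,
   so ownership goes back to the caller. */
Point *move_right(Point *p) {
    p->x += 1;
    return p;
}

/* Callee used with f(give p): it may update the value it was given. */
int bump_and_read_x(Point *p) {
    p->x += 1;
    return p->x;
}

int passing_example(void) {
    Point *p = malloc(sizeof *p);
    assert(p != NULL);
    p->x = 2;
    p->y = 3;

    int before = sum(p);        /* f(p): read-only access, 2 + 3 */
    p = move_right(p);          /* f(lend p): x becomes 3, p still owned here */
    int x = bump_and_read_x(p); /* f(give p): callee owns p during the call */
    free(p);                    /* ...and p is freed once the call returns */

    assert(before == 5);
    return x;
}
```

After the free, p must not be used again; C() would reject a second give p on the same variable only in the simple cases described below.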

It is a compile-time error to pass in a pointer multiple times if you are lending, giving or updating it. However, this is the extent of C()'s "borrow checking". If you store the pointer in a struct and pass that, the pointer is still usable and this results in undefined behaviour. You are simply notifying C() that you are using the pointer in certain ways, and it is your responsibility to ensure that you are.

If a value is given to one function and then given to another, it will be freed more than once, causing undefined behaviour. Use lend for every call except the last one instead. Both lend and give are needed to recreate the behaviour of uniqueness typing in other purely functional languages.

The order of evaluation in functions is undefined. Engineering a situation where a value is pointed to by multiple owning pointers and updated in place is undefined behaviour.

Arrays

Arrays behave in the same way as pointers, as they are essentially syntactic sugar for pointers in C. new T[N] creates an owning array pointer. !T[] is the type of owning array pointers, and T[] the type of non-owning ones.
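
As a loose C analogy, an owning array pointer corresponds to a heap buffer that the holder must eventually free, and a non-owning one to a const view of that buffer. The helper names below are invented for this sketch, and the mapping of new T[N] onto malloc is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for `new int[n]`: the caller owns the returned buffer. */
int *new_int_array(size_t n) {
    int *a = malloc(n * sizeof *a);
    assert(a != NULL); /* sketch only: abort on allocation failure */
    return a;
}

/* Owning parameter (!int[] in C()): update in place, return the pointer. */
int *fill(int *values, size_t n, int v) {
    for (size_t i = 0; i < n; i++) {
        values[i] = v;
    }
    return values;
}

/* Non-owning parameter (int[] in C()): read-only view of the elements. */
int total(const int *values, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += values[i];
    }
    return sum;
}
```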

Functions

No purely functional language is complete without first-class functions. C() supplements C's function pointers with true first-class functions and simplifies the syntax.

!int(int) addTo(int x): (int (int y): x + y);

First-class functions capture variables lexically by value, but those variables can themselves be pointers. First-class functions are themselves pointers, to accommodate the varying sizes of lambdas with different captured variables, and can be owning or non-owning.

The type of a first-class function is written as !T(U, V, ...) or *T(U, V, ...), depending on the types the function takes and returns, and whether the function is owned or not. This type does not include the types of lexically captured values.

First-class functions can be called regardless of whether they are owned or unowned. If an owned first-class function is called with the syntax (give f)(...), it will be freed after the call returns.
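
One way to picture such a first-class function in C is a heap-allocated record holding a code pointer and the captured values. The sketch below hard-codes a single captured int, whereas real closures would vary in size, which is why C() makes them pointers. IntFn, add_to, call_fn and call_and_free are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* A first-class int(int) function: code pointer plus captured values. */
typedef struct IntFn {
    int (*call)(const struct IntFn *self, int y);
    int captured_x; /* captured by value, as in C() */
} IntFn;

/* Body of the lambda `int (int y): x + y`. */
static int add_to_body(const IntFn *self, int y) {
    return self->captured_x + y;
}

/* Mirrors addTo: returns an owned, heap-allocated closure. */
IntFn *add_to(int x) {
    IntFn *f = malloc(sizeof *f);
    assert(f != NULL); /* sketch only: abort on allocation failure */
    f->call = add_to_body;
    f->captured_x = x;
    return f;
}

/* Plain call f(y): works for owning and non-owning closures alike. */
int call_fn(const IntFn *f, int y) {
    return f->call(f, y);
}

/* Mirrors (give f)(y): call the closure, then free it. */
int call_and_free(IntFn *f, int y) {
    int result = f->call(f, y);
    free(f);
    return result;
}
```

The closure returned by add_to(5) can be called any number of times with call_fn, and must then be passed to call_and_free exactly once.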

Conclusion

With the most straightforward implementation, you can bypass C()'s checks to perform side effects. Alas, we are attempting to square a circle. Our result is an allegedly "pure" functional language with unsafe and insufficiently checked uniqueness typing, as well as C-like syntax. If anything, it's the functional equivalent to C++.

Examples

Add examples of your own if you feel bored.

Vector arithmetic

struct Vector {
	float x;
	float y;
};

Vector !translate(Vector !a, Vector *b): *a = {*a.x + *b.x, *a.y + *b.y};
Vector !scale(Vector !a, float b): *a = {*a.x * b, *a.y * b};
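
For readers more familiar with C, the same two operations could be written roughly as below. The owning parameter becomes a plain pointer that is updated and returned, and the non-owning parameter becomes a const pointer. This is a sketch of the semantics only; vector_example is an illustrative helper.

```c
#include <assert.h>

typedef struct {
    float x;
    float y;
} Vector;

/* a is updated in place and returned; b is only read. */
Vector *translate(Vector *a, const Vector *b) {
    *a = (Vector){a->x + b->x, a->y + b->y};
    return a;
}

/* Multiplies both components of a by b, in place. */
Vector *scale(Vector *a, float b) {
    *a = (Vector){a->x * b, a->y * b};
    return a;
}

/* Usage: calls chain because each function returns its first argument. */
void vector_example(void) {
    Vector v = {1.0f, 2.0f};
    Vector d = {0.5f, 0.5f};
    scale(translate(&v, &d), 2.0f);
    assert(v.x == 3.0f && v.y == 5.0f);
}
```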

Higher order functions

Arrays can be used in a similar way to linked lists if you perform pointer arithmetic. The map function here recurses by modifying the values array it owns in place, performing pointer arithmetic to get the "tail" of the array, mapping that, and then undoing the pointer arithmetic to return the modified head.

!float[] map(*float(float) f, !float[] values, int size): size == 0 ? values : map(f, lend ((values[0] = f(values[0])) + 1), size - 1) - 1;
float combine(*float(float, float) f, float[] values, int size): size == 0 ? 0 : (size == 1 ? values[0] : f(values[0], combine(f, values + 1, size - 1)));
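
The recursion pattern may be easier to follow in ordinary C, where the head is updated, the tail is mapped via values + 1, and the final - 1 restores the head pointer. The sketch uses plain C function pointers instead of first-class functions, and map_in_place, twice and add are names made up for the example.

```c
#include <assert.h>

/* Recursive in-place map: update the head, map the tail, and undo the
   pointer arithmetic so that the head pointer is returned. */
float *map_in_place(float (*f)(float), float *values, int size) {
    if (size == 0) {
        return values;
    }
    values[0] = f(values[0]);
    return map_in_place(f, values + 1, size - 1) - 1;
}

/* Right fold without a seed: 0 for an empty array, the sole element
   for a singleton, otherwise f(head, combine(tail)). */
float combine(float (*f)(float, float), const float *values, int size) {
    if (size == 0) {
        return 0.0f;
    }
    if (size == 1) {
        return values[0];
    }
    return f(values[0], combine(f, values + 1, size - 1));
}

/* Hypothetical helpers for the usage example. */
float twice(float v) { return v * 2.0f; }
float add(float a, float b) { return a + b; }

void map_example(void) {
    float data[3] = {1.0f, 2.0f, 3.0f};
    float *head = map_in_place(twice, data, 3);
    assert(head == data);                   /* same array, updated in place */
    assert(combine(add, data, 3) == 12.0f); /* 2 + (4 + 6) */
}
```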