Geo

Geo is a simple interpreted toy programming language. I, b_jonas, created it in 2003 for a programming course that required implementing a parser using yacc and lex. I published it around 2006.

Language
The Geo language is dynamically typed. A value is either an integer or an array. Arrays are always handled by reference, can be mutated in place, and are reference counted. Some simple arithmetic is available, as is printing integers. Arrays are extended automatically to whatever length you need, as if they contained an infinite sequence of zero integers.
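The array behavior described above can be modeled roughly in Python. This is only a sketch: the class name GeoArray and its methods are made up for illustration and are not taken from the interpreter, and it assumes that reading past the end does not grow the storage, which the description does not specify. Negative indices are not modeled.

```python
class GeoArray:
    """Hypothetical model of a Geo-style array: reads as endless zeros, grows on write."""

    def __init__(self, items=()):
        self.items = list(items)

    def __getitem__(self, index):
        # Reading past the stored end yields 0 (assumption: reads do not grow the array).
        return self.items[index] if index < len(self.items) else 0

    def __setitem__(self, index, value):
        # Writing past the end pads the gap with zeros first.
        if index >= len(self.items):
            self.items.extend([0] * (index + 1 - len(self.items)))
        self.items[index] = value


a = GeoArray()
b = a        # arrays are handled by reference, so b names the same array
b[5] = 7     # mutating through b is visible through a
print(a[5], a[100], len(a.items))
```

Python's own reference counting stands in here for the reference counting the interpreter does by hand.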

Conditionals and looping are supported via a somewhat strange syntax stolen from the scan language. The ? operator, applied to a condition and optionally given a value expression, breaks out of the innermost parenthesized expression if the condition is true, and makes that parenthesized expression return the value in that case. When the condition is false, the value expression is not evaluated. The value expression can be omitted, in which case a zero integer is used. The ! operator is similar, but the condition is negated. There are two kinds of parenthesized expressions: ordinary ones, which execute the statements and the final expression in order and return the value of that expression, and loops, closed with *) instead of ), which run the statements in an infinite loop. The consequence of these rules is that an ordinary parenthesized expression with a break in front of its final expression is an if-else conditional expression; a loop that starts with a break is a while loop; and a loop that ends with a break is a do-while loop.
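To make the break rules concrete, here is a loose Python model of them. It is a sketch, not how the interpreter works: the names Break, brk_if, brk_unless, block and loop are invented for this illustration, and a Python exception stands in for breaking out of the innermost parenthesized expression.

```python
class Break(Exception):
    """Carries the value that the enclosing parenthesized expression returns."""

    def __init__(self, value=0):
        self.value = value


def brk_if(cond, value=lambda: 0):
    # Break when the condition is true; the value is evaluated only in that case.
    if cond:
        raise Break(value())


def brk_unless(cond, value=lambda: 0):
    # Negated form: break when the condition is false.
    if not cond:
        raise Break(value())


def block(body):
    # Ordinary parenthesized expression: run once, return the final value unless a break fires.
    try:
        return body()
    except Break as b:
        return b.value


def loop(body):
    # Loop form: repeat the body forever; only a break ends it.
    while True:
        try:
            body()
        except Break as b:
            return b.value


def absolute(x):
    # if-else: a break in front of the final expression.
    def body():
        brk_if(x < 0, lambda: -x)
        return x
    return block(body)


def while_steps(n):
    # while loop: the break comes first in the loop body.
    state = {"n": n, "steps": 0}
    def body():
        brk_unless(state["n"] > 0)
        state["n"] -= 1
        state["steps"] += 1
    loop(body)
    return state["steps"]


def do_while_steps(n):
    # do-while loop: the break comes last, so the body runs at least once.
    state = {"n": n, "steps": 0}
    def body():
        state["n"] -= 1
        state["steps"] += 1
        brk_unless(state["n"] > 0)
    loop(body)
    return state["steps"]
```

Because block and loop each catch the exception themselves, a break only ever leaves the innermost enclosing one, which matches the rule above.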

Variables are implicitly created when you refer to them, and they are global unless you declare them with the var keyword, which localizes them to the innermost parenthesized expression. In assignments, the lvalue can be a single variable, an indexed array, or a bracketed array of lvalues. You can thus explode the contents of an array into individual variables, or do parallel assignments.
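The three lvalue shapes suggest a recursive assignment routine. The Python sketch below shows one way this could look; the tagged-tuple encoding of lvalues and the function name assign are assumptions made for the example, as is the choice to read missing elements as zero when exploding a short array (in line with arrays behaving as if padded with zeros).

```python
def assign(env, lvalue, value):
    """Store value into lvalue. The lvalue encoding is a hypothetical one:
    ("var", name), ("index", name, i) or ("list", [lvalue, ...])."""
    kind = lvalue[0]
    if kind == "var":
        env[lvalue[1]] = value
    elif kind == "index":
        arr = env.setdefault(lvalue[1], [])   # variables spring into existence on first use
        i = lvalue[2]
        arr.extend([0] * (i + 1 - len(arr)))  # grow with zeros if needed
        arr[i] = value
    elif kind == "list":
        # Bracketed array of lvalues: hand each target its element, recursively.
        for i, target in enumerate(lvalue[1]):
            item = value[i] if i < len(value) else 0
            assign(env, target, item)
    else:
        raise ValueError(f"unknown lvalue kind: {kind!r}")


env = {"x": 1, "y": 2}

# Parallel assignment: the right-hand side is built before anything is stored.
assign(env, ("list", [("var", "x"), ("var", "y")]), [env["y"], env["x"]])

# Exploding an array into variables and an array slot.
assign(env, ("list", [("var", "p"), ("index", "q", 2), ("var", "r")]), [10, 20])
print(env)
```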

I was supposed to add user-defined first-class functions, but I abandoned the interpreter before I implemented them.

Interpreter
The interpreter is written in old C++ together with yacc (bison) and lex (flex). The C++ code is ancient and might break on newer optimizing compilers.

The input is lexed into a token list, which is stored indefinitely. The yacc grammar doesn't build a parse tree; it executes expressions immediately and remembers only their values. Loops are implemented by rewinding the stored token stream, feeding the yacc parser old tokens. Conditionals are implemented with grammar rules that skip a balanced sequence of tokens without evaluating their meaning. Localized variables are stored on a separate stack rather than in the parser stack, for some unknown reason.
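The rewinding and skipping tricks can be shown on a much smaller scale. The Python sketch below is not the Geo interpreter and does not use yacc: it is a hand-written loop over a made-up token set (emit, dec, brk0), with hypothetical function names, meant only to show a stored token list being replayed for loops and skipped over for breaks.

```python
def skip_to_close(tokens, pos):
    """Return the index just past the closer of the group that contains pos,
    stepping over nested groups without executing anything."""
    depth = 0
    while True:
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            depth += 1
        elif tok in (")", "*)"):
            if depth == 0:
                return pos
            depth -= 1


def run(tokens, counter):
    out = []
    starts = []  # positions just after each currently open "("
    pos = 0
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            starts.append(pos)
        elif tok == ")":
            starts.pop()
        elif tok == "*)":
            pos = starts[-1]       # rewind: feed the old tokens again
        elif tok == "emit":
            out.append(counter)    # executed immediately, no tree is built
        elif tok == "dec":
            counter -= 1
        elif tok == "brk0":
            if counter == 0:       # break: skip the rest of the group unevaluated
                pos = skip_to_close(tokens, pos)
                starts.pop()
        else:
            raise ValueError(f"unknown token: {tok!r}")
    return out


print(run("( brk0 emit dec *)".split(), 3))
```

The token list has to stay around for as long as a loop might rewind into it, which is why it cannot be discarded as it is consumed.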

Lvalues can be arbitrarily long, and while parsing one you can't tell in advance whether it will turn out to be an lvalue or an rvalue. Nevertheless, lvalues and rvalues are parsed with different grammar rules and stored as different types. Technically this would make the grammar require unbounded lookahead, so there's a kludge that converts a partly built lvalue to an rvalue as soon as we find out that it can't be used as an lvalue.

Example
The following code prints the prime numbers up to 1000 using Eratosthenēs's sieve.

max= 1000; var a= []; var n= 2; ( ( a[n]?; print n; var k= n; ( k<max!; k+= n; a[k]= 1; *); ); ++n<max!; *)
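For readers who find the one-liner dense, here is a Python rendering that follows the same control flow step by step. It is a translation for illustration only: the function name primes_below is made up, results are collected in a list instead of printed, and the array is preallocated because Python lists do not grow on demand the way Geo arrays do.

```python
def primes_below(limit):
    # The inner loop can write up to index limit + n - 1, so 2 * limit slots are enough.
    composite = [False] * (2 * limit)
    found = []
    n = 2
    while True:
        if not composite[n]:          # a[n]?  skips the rest of the group for composites
            found.append(n)           # print n
            k = n
            while k < limit:          # k<max!  ends the inner loop
                k += n
                composite[k] = True   # a[k]= 1
        n += 1
        if not n < limit:             # ++n<max!  ends the outer loop, do-while style
            break
    return found


primes = primes_below(1000)
print(primes[:10], len(primes))
```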

Links
Compiler source code