Geo

Geo is a simple interpreted toy programming language. I, b_jonas, created it in 2003 for a programming course that required implementing a parser using yacc and lex. I published it to the world around 2006.

Language

The Geo language is dynamically typed. A value is either an integer or an array. Array values are always handled by reference, can be mutated in place, and are reference counted. Some simple arithmetic and the printing of integers are available. Arrays are automatically extended to whatever length you need, as if they contained infinitely many zero integer values.
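The array behaviour described above can be modelled in a few lines of Python. This is only a sketch: the class name GeoArray and its methods are hypothetical and have nothing to do with the actual interpreter source, only non-negative indices are handled, and whether a read past the end grows the real array is not known (here it does not).

```python
class GeoArray:
    """Hypothetical model of a Geo array: reads as 0 past its stored end."""

    def __init__(self, items=()):
        self.items = list(items)

    def __getitem__(self, index):
        # Reading beyond the stored part yields 0 (assumption: no growth on read).
        return self.items[index] if index < len(self.items) else 0

    def __setitem__(self, index, value):
        # Writing beyond the stored part pads the gap with zeros first.
        if index >= len(self.items):
            self.items.extend([0] * (index + 1 - len(self.items)))
        self.items[index] = value


a = GeoArray()
b = a        # b names the same array: array values are handled by reference
b[5] = 7     # a mutation made through b is visible through a
```

After this runs, a[5] is 7 and any other index of a reads as 0.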

Conditionals and looping are supported via a somewhat strange syntax stolen from the scan language: cond ? expr breaks out of the innermost parenthesized expression if cond is true and makes that parenthesized expression return the value of expr in that case. When the condition is false, expr is not evaluated. The expr part can be omitted, in which case a zero integer is used. cond ! expr is similar but the condition is negated. There are two kinds of parenthesized expressions: ordinary ones like (stmt0; stmt1; ...; stmtN; expr), which execute the statements and the expression in order and return the value of the expression, and loops like (stmt0; stmt1; ...; stmtN; *), which run the statements in an infinite loop. The consequence of these rules is that (cond? exprT; exprF) is an if-else conditional expression; (cond!; stmt0; ...; stmtN; *) is a while loop; and (stmt0; ...; stmtN; cond!; *) is a do-while loop.
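One way to get a feel for these rules is to model the break-out as an exception that the innermost enclosing block catches. The Python sketch below does that. All names in it (BreakOut, question, bang, block, loop, if_else, count_to) are made up for illustration, and it says nothing about how the real interpreter is built.

```python
class BreakOut(Exception):
    """Raised by the models of `cond ? expr` and `cond ! expr`."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def question(cond, expr=lambda: 0):
    # cond ? expr : leave the innermost block with expr's value if cond is true.
    # expr is a thunk, so it is only evaluated when the break actually happens.
    if cond:
        raise BreakOut(expr())


def bang(cond, expr=lambda: 0):
    # cond ! expr : the same, with the condition negated.
    question(not cond, expr)


def block(body):
    # (stmt0; ...; expr): run the body once; its return value is the final expr.
    try:
        return body()
    except BreakOut as brk:
        return brk.value


def loop(body):
    # (stmt0; ...; *): repeat the body until something breaks out of it.
    while True:
        try:
            body()
        except BreakOut as brk:
            return brk.value


def if_else(cond):
    # Models (cond ? 1; 2)
    def body():
        question(cond, lambda: 1)
        return 2
    return block(body)


def count_to(limit):
    # Models i = 0; (i < limit !; i += 1; *)
    state = {"i": 0}

    def body():
        bang(state["i"] < limit)
        state["i"] += 1

    loop(body)
    return state["i"]
```

Because each block or loop has its own try, a break raised inside a nested block never escapes past that block, which matches the "innermost parenthesized expression" rule.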

Variables are implicitly created when you refer to them, and they are global except when you use the var keyword, which localizes them to the innermost parenthesized expression. In assignments, the lvalue can be a single variable, an indexed array, or a bracketed array of lvalues. You can thus explode the contents of an array into individual variables, e.g. [var0, var1, var2] = arr, or do parallel assignments, e.g. [var0, var1] = [expr0, expr1].
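Assignment to a bracketed array of lvalues is naturally recursive, which the following Python sketch illustrates. The tuple encoding of lvalues ("var", "index", "list"), the function name assign, and the dictionary env are all hypothetical. The sketch also assumes that missing source elements read as zero, in line with arrays behaving as if padded with zeros, and it does not model var scoping.

```python
def assign(lvalue, value, env):
    """Store value into lvalue, where lvalue is one of
    ("var", name), ("index", name, i) or ("list", [lvalue, ...])."""
    kind = lvalue[0]
    if kind == "var":
        env[lvalue[1]] = value
    elif kind == "index":
        arr = env.setdefault(lvalue[1], [])
        i = lvalue[2]
        # Pad with zeros so that position i exists before writing to it.
        arr.extend([0] * (i + 1 - len(arr)))
        arr[i] = value
    elif kind == "list":
        # Explode the array: element i of the value goes to target i.
        for i, target in enumerate(lvalue[1]):
            item = value[i] if i < len(value) else 0  # assumption: zero padding
            assign(target, item, env)
    else:
        raise ValueError("unknown lvalue kind: %r" % (kind,))


env = {}
# Models [var0, var1, var2] = arr with arr = [10, 20]
assign(("list", [("var", "var0"), ("var", "var1"), ("var", "var2")]), [10, 20], env)
# Models [p, [q, r]] = [1, [2, 3]], a nested target
assign(("list", [("var", "p"), ("list", [("var", "q"), ("var", "r")])]), [1, [2, 3]], env)
```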

I was supposed to add user-defined first-class functions, but I abandoned the interpreter before I implemented them.

Interpreter

The interpreter is written in old C++ together with yacc (bison) and lex (flex). The C++ code is ancient and might break on newer optimizing compilers.

The input is tokenized into a token list which is stored indefinitely. The yacc grammar doesn't build a parse tree; it executes expressions immediately and remembers only their value. Loops are implemented by rewinding the stored token stream, feeding the yacc parser old tokens. Conditionals are implemented with grammar rules that skip a balanced sequence of tokens without evaluating their meaning. Localized variables are stored on a separate stack rather than in the parser stack, for some unknown reason.
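The two tricks, rewinding the stored tokens for loops and skipping a balanced run of tokens for a taken break, can be shown on a deliberately tiny made-up token language. The Python sketch below is a hand-written loop rather than a yacc grammar, and its tokens ("inc", "brk") and function names are invented for illustration; it is not derived from the real interpreter.

```python
def skip_balanced(tokens, pos):
    """Return the index just past the ')' that closes the block containing pos,
    stepping over nested parentheses without interpreting anything."""
    depth = 0
    while True:
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            depth += 1
        elif tok == ")":
            if depth == 0:
                return pos
            depth -= 1


def run(tokens, limit):
    """Toy language: 'inc' adds 1 to a counter, 'brk' leaves the innermost
    block once the counter has reached limit, '*' restarts the block."""
    counter = 0
    starts = []  # token positions just after each '(' we are currently inside
    pos = 0
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            starts.append(pos)
        elif tok == ")":
            starts.pop()
        elif tok == "*":
            pos = starts[-1]  # rewind: the same stored tokens are replayed
        elif tok == "inc":
            counter += 1
        elif tok == "brk" and counter >= limit:
            pos = skip_balanced(tokens, pos)  # rest of the block is skipped unevaluated
            starts.pop()
    return counter


program = ["(", "brk", "inc", "*", ")", "inc"]
result = run(program, 3)
```

Here the loop body is replayed until the counter reaches 3, the break skips to just past the closing parenthesis, and the final "inc" leaves result at 4.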

An expression that might be an lvalue can be arbitrarily long, and you can't tell in advance whether it is an lvalue or an rvalue. Nevertheless, lvalues and rvalues are parsed with different grammar rules and stored as different types. This would technically make the grammar require an unbounded lookahead, so there's a kludge for converting a partly built lvalue to an rvalue as soon as we find out that it can't be used as an lvalue.
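The conversion step can be pictured as reading out every place that a partly built lvalue refers to. The Python sketch below is a guess at the general idea only: the tuple encoding of lvalues, the name to_rvalue, and the choice that an unset variable or array slot reads as zero are all assumptions, not taken from the interpreter.

```python
def to_rvalue(lvalue, env):
    """Turn an lvalue description into the value it currently denotes.
    lvalue is ("var", name), ("index", name, i) or ("list", [lvalue, ...])."""
    kind = lvalue[0]
    if kind == "var":
        return env.get(lvalue[1], 0)  # assumption: unset variables read as 0
    if kind == "index":
        arr = env.get(lvalue[1], [])
        i = lvalue[2]
        return arr[i] if i < len(arr) else 0  # past the end reads as 0
    if kind == "list":
        # A bracketed array of lvalues becomes an array of their values.
        return [to_rvalue(sub, env) for sub in lvalue[1]]
    raise ValueError("unknown lvalue kind: %r" % (kind,))


env = {"x": 4, "arr": [1, 2]}
# [x, arr[5], y] was being parsed as an lvalue, then turned out not to be
# the target of an assignment, so it is demoted to a plain value.
value = to_rvalue(("list", [("var", "x"), ("index", "arr", 5), ("var", "y")]), env)
```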

Example

The following code prints the prime numbers below 1000 using Eratosthenēs's sieve.

max= 1000;
var a= [];
var n= 2;
(
    (
        a[n]?;
        print n;
        var k= n;
        (
            k<max!;
            k+= n;
            a[k]= 1;
        *);
    );
    ++n<max!;
*)
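
For readers unfamiliar with the syntax, here is a rough Python transliteration of the same program. It is a reading aid under a few assumptions: ++n<max is taken to mean that n is incremented and then compared, a defaultdict stands in for the zero-filled Geo array, and the primes are collected in a list instead of printed. The function name primes_below is made up.

```python
from collections import defaultdict


def primes_below(limit):
    a = defaultdict(int)       # stands in for a Geo array: unset cells read as 0
    found = []
    n = 2
    while True:                # outer ( ... *)
        if not a[n]:           # a[n]?;  leaves the inner block when n is marked
            found.append(n)    # print n;
            k = n              # var k= n;
            while k < limit:   # k<max!;
                k += n         # k+= n;
                a[k] = 1       # a[k]= 1;
        n += 1                 # ++n<max!;
        if not n < limit:
            break
    return found
```

Note that the inner loop may mark one multiple at or beyond the limit, which is harmless in Geo because the array grows on demand.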

Links

Compiler source code