Zaddy

From Esolang
Jump to navigation Jump to search
Zaddy
Designed by Corbin
Appeared in 2025
Computational class pushdown automata
Reference implementation [1]
Influenced by CHR, META II, OMeta, Parsley, Tmg, Zephyr ASDL

META Restricted CHR(Zaddy), or Zaddy for short, is a living language for describing automatically-described compilers. It is described via its own metacompiler, which compiles itself to RPython. Zaddy was forked from META II following Neighbors' approach.[1]

Formally, Zaddy is the smallest language accepted by the reference implementation and compiling itself to a fixed point, rather than a language generated by some grammar.

Packrat PEG parsing

Zaddy implements Ford's Packrat algorithm for fast recursive-descent parsing with backtracking.[2]

Tree algebras

Zaddy supports definitions of tree algebras using inline Zephyr ASDL. For example, the following abstract syntax models a Python-like indentation-sensitive basic-block target language:

.signature py
stmt = Statement(string line)
     | Compound(string head, stmt* block)
     | Conditional(string test, stmt* block)
     | Handler(stmt* block, stmt* handler)

Zephyr tree algebras provide Herbrand universes over strings and lists. A homomorphism between tree algebras can be defined as a functor; the recursion scheme will always be a hylomorphism.

.functor py
Statement(line)          -> [.line(line)] ;
Compound(head, block)    -> .if (block)
                            .then [.line(head ':'), .block(block)]
                            .else [.line(head ': pass')] ;
Conditional(test, block) -> .if (block)
                            .then [.line('if ' test ':'), .block(block)]
                            .else [] ;
Handler(block, handler)  -> .if (block) .then
                                .if (handler)
                                .then [.line('try:'), .block(block), .line('except ParseError:'), .block(handler)]
                                .else (*block)
                            .else [] ;

This functor uses the builtin tree algebra, which is specialized for writing line-oriented languages with indented blocks, for output. This builtin algebra has the signature:

.signature builtin
builtin = Line(string) | Block(builtin*)

Constraint Handling Rules

Zaddy embeds Constraint Handling Rules (CHR), including a pure fragment of Prolog, for describing non-homomorphic rewriting of tree algebras. For example, the following rules implement the greater-than and maximum relations over natural numbers:

.rules
gte(N, zero).
gte(N, M) :- succ(X, N), succ(Y, M), gte(X, Y).
maxLeft   @ max(N, zero, M) <=> N = M.
maxRight  @ max(N, M, zero) <=> N = M.
maxSucc   @ succ(X, L), succ(Y, N), succ(Z, M) \ max(L, N, M) <=>
            max(X, Y, Z).

Building on top of those relations, the following rules[3] implement an efficient union-find data structure with user-defined equivalence:

.rules
make      @ make(A) <=> root(A, zero).
union     @ union(A, B) <=> find(A, X), find(B, Y), link(X, Y).
findNode  @ branch(A, B), find(A, X) <=> find(B, X), branch(A, X).
findRoot  @ root(A, _) \ find(A, X) <=> X = A.
linkEq    @ link(A, A) <=> true.
linkLeft  @ link(A, B), root(A, N), root(B, M) <=>
            gte(N, M) | branch(B, A), succ(M, M1), max(R, N, M1), root(A, R).
linkRight @ link(B, A), root(A, N), root(B, M) <=>
            gte(N, M) | branch(B, A), succ(M, M1), max(R, N, M1), root(A, R).

The reference implementation compiles CHR to a miniKanren-style persistent constraint store[4] encapsulated in a simple logic monad.[5] Rules refine the constraint store iteratively in a process similar to the loop-oriented compilation of CHR(Java).[6]

Binding-time Hoare logic

Zaddy unifies the concepts of wikipedia:Hoare logic and binding-time improvement[7] into a single class of semantic actions. The parser becomes an abstract interpreter over a specified semantic domain.

For example, consider the following construction from Zaddy's reference implementation, where a parsing flag "pf" controls some internal state. We can model the flag as a Boolean value:

.domain
pf : bool ;

Then, we may annotate the parser with Hoare triples. The annotations must be precise; Zaddy will compile them down to approximations which will be automatically simulated. For example, to set the flag, we have a precondition that the flag is the complement of being enabled (and so may be either disabled or unknown) and a postcondition that the flag is enabled:

SET = .pre { ~+pf } { .out('self.pf = True' .nl) } .post { +pf } ;

This can be seen as a Hoare specification of Schorre's original "SET" assembly instruction from META II. When compiled, Zaddy will generate optimizing parsers which only set the parse flag when it must. This can create Jones improvements — binding-time improvements in the generated parsers and their generated outputs. For example, in this fragment, an error guard is removed if it statically cannot be entered, and since the .error() method happens to throw an exception, we can know that the flag is set:

.pre { ~+pf } { .out('if not self.pf: self.error()' .nl) } .post { +pf }

We can perform a similar improvement when parsing tokens that cannot fail:

$( TX3 .pre { ~+pf } { .out('if not self.pf: return' .nl) } .post { +pf } )

For correctness, the parsing machine can reset its abstract domain with the .top command, as shown here at the top-level rule for tokens:

TR = ID .out('def parse' * '(self):' .nl .lm+) ':' .top TX1 ';' .out(.lm-) ;

Examples

The hello-world of parsing languages is wikipedia:S-expressions over some simple collection of atoms. The following grammar generates a parser which compiles S-expressions to RPython code.

.syntax SEXP
SEXP = .out('def parseSExpOntoDomain(d):' .nl .lm+ 'return ')
       EXP .out(.nl) ;
EXP = LIST / ATOM .out('d.atom("' * '")') ;
LIST = OPEN
       ( CLOSE .out('d.sexp([])')
       / .out('d.sexp([' .nl .lm+)
         $( EXP .out(',' .nl) )
         CLOSE .out(.lm- '])') ) ;
.tokens
ATOM  : WS .token CHAR $CHAR .tokout ;
OPEN  : WS .any('() ;
CLOSE : WS .any(')) ;
CHAR  : .any('a:'z) ;
WS    : $.any(10!13!' ) ;
.domain
.end

The following example compiles Brainfuck to C. It is a handwritten translation of this program in Tmg by Andriy Makukha which is based on this program in C by Cody Burgett. It is able to compile and run Jon Ripley's Lost Kingdom, generating over 187 thousand lines of C and taking two minutes to compile with GCC.

.functor c
Include(h)      -> [.line('#include' h)] ;
Statement(l)    -> [.line(l ';')] ;
Block(head, cs) -> [.line(head ' {'), .block(cs), .line('}')] ;
.functor bf
Right(n)    -> [c.Statement('ptr += ' n)] ;
Left(n)     -> [c.Statement('ptr -= ' n)] ;
Plus(n)     -> [c.Statement('*ptr += ' n)] ;
Minus(n)    -> [c.Statement('*ptr -= ' n)] ;
Output      -> [c.Statement('putchar(*ptr)')] ;
Input       -> [c.Statement('*ptr = getchar()')] ;
Loop(exprs) -> [c.Block('while (*ptr)', exprs)] ;
Program(p)  -> [c.Include('<stdio.h>'),
                c.Include('<stdlib.h>'),
                c.Block('int main(int argc, char **argv)', [
                    c.Statement('unsigned char *ptr = calloc(30 * 1000, 1)'),
                    c.Block('if (!ptr)', [
                        c.Statement('fprintf(stderr, "Error allocating memory.\\n")'),
                        c.Statement('return 1')
                    ])] p [
                    c.Statement('free(ptr)'),
                    c.Statement('return 0')
                ])] ;
.grammar BF
BF := SQN:p COMMENT -> bf.Program(p) ;
SQN := ( COMMENT OP )*:ops -> *ops ;
OP := '>'+:cs -> bf.Right(#cs)
    / '<'+:cs -> bf.Left(#cs)
    / '+'+:cs -> bf.Plus(#cs)
    / '-'+:cs -> bf.Minus(#cs)
    / '.'     -> bf.Output
    / ','     -> bf.Input
    / '[' SQN:exprs ']' -> bf.Loop(exprs) ;
COMMENT := ( !( '>' / '<' / '+' / '-' / '.' / ',' / '[' / ']' ) .any )* ;

References

  1. Tutorial: Metacompilers Part 1, James M. Neighbors. 2008. https://www.bayfronttechnologies.com/mc_tutorial.html
  2. Bryan Ford. 2002. Packrat parsing: simple, powerful, lazy, linear time, functional pearl. In Proceedings of the seventh ACM SIGPLAN international conference on Functional programming (ICFP '02). Association for Computing Machinery, New York, NY, USA, 36–47. https://doi.org/10.1145/581478.581483
  3. Schrijvers, T., & Frühwirth, T.W. (2004). Implementing and Analysing Union-Find in CHR. https://exia.informatik.uni-ulm.de/fruehwirth/Papers/wclp-union-find.pdf
  4. Bender, D. & Kuper, L. & Byrd, W. & Friedman, D. (2015). Efficient representations for triangular substitutions: A comparison in miniKanren. https://users.soe.ucsc.edu/~lkuper/papers/walk.pdf
  5. Hemann, J., & Friedman, D. P. (2013). µKanren: A Minimal Functional Core for Relational Programming. Workshop on Scheme and Functional Programming, Alexandria, United States. http://webyrd.net/scheme-2013/papers/HemannMuKanren2013.pdf
  6. Wolf, A. (2001). Adaptive Constraint Handling with CHR in Java. In: Walsh, T. (eds) Principles and Practice of Constraint Programming — CP 2001. CP 2001. Lecture Notes in Computer Science, vol 2239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45578-7_18 https://github.com/endario/references/blob/master/Adaptive%20Constraint%20Handling%20with%20CHR%20in%20Java%20Wolf.pdf
  7. N.D. Jones, C.K. Gomard, and P. Sestoft, Partial Evaluation and Automatic Program Generation. With chapters by L.O. Andersen and T. Mogensen. Prentice Hall International, June 1993. xii + 415 pages. ISBN 0-13-020249-5. https://studwww.itu.dk/~sestoft/pebook/