Zaddy

From Esolang
Jump to navigation Jump to search
Zaddy
Designed by Corbin
Appeared in 2025
Computational class pushdown automata
Reference implementation [1]
Influenced by META II, Parsley, Tmg

META Restricted Zaddy, or Zaddy for short, is a living language for describing automatically-described compilers. It is described via its own metacompiler, which compiles itself to RPython. Zaddy was forked from META II.

Formally, Zaddy is the smallest language accepted by the reference implementation and compiling itself to a fixed point, rather than a language generated by some grammar.

Binding-time Hoare logic

Zaddy unifies the concepts of wikipedia:Hoare logic and binding-time improvement (The Book) into a single class of semantic actions. The parser becomes an abstract interpreter over a specified semantic domain.

For example, consider the following construction from Zaddy's reference implementation, where a parsing flag "pf" controls some internal state. We can model the flag as a Boolean value:

.domain
pf : bool ;

Then, we may annotate the parser with Hoare triples. The annotations must be precise; Zaddy will compile them down to approximations which will be automatically simulated. For example, to set the flag, we have a precondition that the flag is the complement of being enabled (and so may be either disabled or unknown) and a postcondition that the flag is enabled:

SET = .pre { ~+pf } { .out('self.pf = True' .nl) } .post { +pf } ;

This can be seen as a Hoare specification of Schorre's original "SET" assembly instruction from META II. When compiled, Zaddy will generate optimizing parsers which only set the parse flag when it must. This can create Jones improvements — binding-time improvements in the generated parsers and their generated outputs. For example, in this fragment, an error guard is removed if it statically cannot be entered, and since the .error() method happens to throw an exception, we can know that the flag is set:

.pre { ~+pf } { .out('if not self.pf: self.error()' .nl) } .post { +pf }

We can perform a similar improvement when parsing tokens that cannot fail:

$( TX3 .pre { ~+pf } { .out('if not self.pf: return' .nl) } .post { +pf } )

For correctness, the parsing machine can reset its abstract domain with the .top command, as shown here at the top-level rule for tokens:

TR = ID .out('def parse' * '(self):' .nl .lm+) ':' .top TX1 ';' .out(.lm-) ;

Examples

The hello-world of parsing languages is wikipedia:S-expressions over some simple collection of atoms. The following grammar generates a parser which compiles S-expressions to RPython code.

.syntax SEXP
SEXP = .out('def parseSExpOntoDomain(d):' .nl .lm+ 'return ')
       EXP .out(.nl) ;
EXP = LIST / ATOM .out('d.atom("' * '")') ;
LIST = OPEN
       ( CLOSE .out('d.sexp([])')
       / .out('d.sexp([' .nl .lm+)
         $( EXP .out(',' .nl) )
         CLOSE .out(.lm- '])') ) ;
.tokens
ATOM  : WS .token CHAR $CHAR .tokout ;
OPEN  : WS .any('() ;
CLOSE : WS .any(')) ;
CHAR  : .any('a:'z) ;
WS    : $.any(10!13!' ) ;
.domain
.end

The following example compiles Brainfuck to C. It is a handwritten translation of this program in Tmg by Andriy Makukha which is based on this program in C by Cody Burgett. While not perfect, it is able to compile and somewhat run Jon Ripley's Lost Kingdom, generating over 2 megalines of C and taking several minutes to compile with GCC.

.syntax BF
BF = .out('#include <stdio.h>' .nl
          '#include <stdlib.h>' .nl
          'int main(int argc, char **argv) {' .nl .lm+
          'unsigned char *ptr = calloc(30 * 1000, 1);' .nl
          'if (!ptr) {' .nl .lm+
          'fprintf(stderr, "Error allocating memory.\\n");' .nl
          'return 1;' .nl .lm-
          '}' .nl)
     SQN
     .out('free(ptr);' .nl
          'return 0;' .nl .lm-
          '}' .nl) ;
SQN = $( LOOP / OP ) ;
LOOP = WHILE  .out('while (*ptr) {' .nl .lm+)
       SQN END .out(.lm- '};' .nl) ;
OP = RIGHT .out('--ptr;' .nl)
   / LEFT  .out('++ptr;' .nl)
   / PLUS  .out('++*ptr;' .nl)
   / MINUS .out('--*ptr;' .nl)
   / OUT   .out('putchar(*ptr);' .nl)
   / IN    .out('*ptr = getchar();' .nl) ;
.tokens
RIGHT   : COMMENT .token .any('>) .tokout ;
LEFT    : COMMENT .token .any('<) .tokout ;
PLUS    : COMMENT .token .any('+) .tokout ;
MINUS   : COMMENT .token .any('-) .tokout ;
OUT     : COMMENT .token .any('.) .tokout ;
IN      : COMMENT .token .any(',) .tokout ;
WHILE   : COMMENT .token .any('[) .tokout ;
END     : COMMENT .token .any(']) .tokout ;
COMMENT : $.not('>!'<!'+!'-!'.!',!'[!']) ;
.domain
.end