Metacompiler

From Esolang
Jump to navigation Jump to search

The more the language can see its own structures, the more liberated you can be from the tyranny of a single implementation. ~ A. Kay [1]

In compiler theory, a metacompiler is a compiler which can compile itself. More specifically, a metalanguage is a language where each program represents a compiler and a metacompiler is a program in a metalanguage which describes a compiler for the metalanguage.

Definition

In general formal language theory, a metalanguage is any language which has the capability to describe another language. Within software development and compiler theory, a metalanguage is a programming language which implements compilers.

Let a (Canadian cross) compiler be a program with three associated languages:

  1. The build or implementation language is a language that the program belongs to
  2. The host or source language is a language that the program accepts as input
  3. The target or compiled language is the language which the program can emit as output given the host language

The idea is that the compiler will execute on a machine which natively interprets the build language, emitting an intermediate program which can be executed on an interpreter, emulator, or hardware for the host language. The three languages are often grouped into a tombstone diagram for the compiler.

Then, a metalanguage is a host language which contains a program whose build language contains its target as a sublanguage and furthermore which compiles to its own build text. Such a program is called a metacompiler for the metalanguage. Symbolically, let mc be a metacompiler and let h be the hardware interpreter for its build language; then there exists a fixed point f such that:

f = h(f)(mc) = h(h(f)(mc))(mc) = h(…f…)(mc)

Sometimes f is called a bootstrap program because it must be produced via some ad-hoc boostrapping process. However, once a working version of f exists via some handwritten program that behaves like it, f can be regenerated endlessly in a canonical form by applying mc.

Note that the choice of target language is dependent upon the choice of metalanguage, so a metacompiler is limited in abstractive power by the features of its metalanguage.

Futamura projections

In wikipedia:partial evaluation we are often interested in self-applicable specializers: programs which take as input source code for two programs representing a function and a value and return a single program representing the application of that function onto that value. A compiler-generating compiler cogen applied to a specializer mix twice obeys the following equations, known as the third Futamura projection:[1]

cogen = mix(mix, mix)
mix = cogen(mix, [mix, mix])

Where the square brackets [] indicate pairing. More primitively, we might imagine a "square root"[2] of mix which is curried to accept only one argument at a time:

mcgen = mc(mc)
mc = mcgen(mc, mc)

This is precisely the one-language version of the more general definition given earlier. Moreover, since the fourth, fifth, and sixth Futamura projections are merely iterations of the third Futamura projection[3], creating incremental "stepping stone" compilers, reusing mc at each stage of projection leads to a fixed point where mcgen no longer changes.

Examples

References

  1. N.D. Jones, C.K. Gomard, and P. Sestoft, Partial Evaluation and Automatic Program Generation. With chapters by L.O. Andersen and T. Mogensen. Prentice Hall International, June 1993. xii + 415 pages. ISBN 0-13-020249-5.
  2. Dave Long. 2015. META II: Digital Vellum in the Digital Scriptorium: Revisiting Schorre’s 1962 compiler-compiler. Queue 13, 1 (December 2014 / January 2015), 50–60. https://doi.org/10.1145/2716276.2724586
  3. Robert Glück. 2009. Is there a fourth Futamura projection? In Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation (PEPM '09). Association for Computing Machinery, New York, NY, USA, 51–60. https://doi.org/10.1145/1480945.1480954