Compiler Generation

The previous section suggests broader use of the fundamental technique of compilation by making fast code generation facilities available at run time, that is RTCG. But compilers are too hard to write, too hard to change, and not portable enough. Compiler generation attempts to address these problems by automating the process of creating a compiler (as much as is practical) from an interpreter.

How hard is it to write a compiler? It depends on several things. As the source language becomes higher level and the target language becomes lower level, or it has global consistency requirements (such as static types or modules), complexity increases quickly. It is relatively easy to write a compiler that generates lisp, harder to write one that generates C, and harder yet to generate machine code directly. The difficulties are compounded by generating fast code.

In addition, it is much harder to change a language encoded as a compiler instead of as an interpreter. Thus if software is still under development, its interfaces have not settled, or new interfaces are being created, it is easier to work with an interpreter. Thus compiler generation is especially useful for prototyping, exploratory programming, evolutionary programming, and bottom-up programming.

For example, consider the construction of a toolkit for artificial life (alife) researchers working with cellular automata (CA). Creating a CA rule corresponds to writing a program. Creating a new class of rules corresponds to creating a new language (eg a Margolus Neighborhood language [CAM]). The researcher is exploring a language space. Interpreters are too slow for interactive, visual-bandwidth calculation. Current practice is to compile CA rules to look-up tables, but this fails for any non-trivial state size, as required by eg reaction-diffusion systems [WiKa91]. User-directed evolution [Sims91] of CAs represents an ideal application for nitrous.

What I really want is light(er)weight languages.

Compiler generation attempts to address these problems by automating the process of creating a compiler from an interpreter [someone]. Returning to the power example, this amounts to automatically generating power-gen from power [note coding]:

Off-line (aka self-applicable) partial evaluation (PE) has matured and now with plenty of foresight and a certain amount of tweeking you can convert even a higher-order interpreter into a compiler (whose target language is Scheme) [Jorgensen92].

This makes languages lightweight. Since languages are generated from their definitions they are as easily modified as interpreters (almost). Furthermore, the interpreter can be used to debug as needed. The complexity of the compiler that was originally largely in bookkeeping has been replaced by programmer understanding of binding times, lifting, and termination issues (picking bookkeeping strategies).