Peter Lee

Staged Computation

Quick links to:
  • Optimizing ML with Run-Time Code Generation, by Peter Lee and Mark Leone, PLDI'96. This is probably the best overview/introduction to the concept.
  • Lightweight Run-Time Code Generation, by Mark Leone and Peter Lee, PEPM'94. An early introduction, with connections to partial evaluation.
  • An Overview of Work on Staged Computation

    Like the informal principle of the time-space tradeoff, there is also a tradeoff between general-purpose and special-purpose programs. General-purpose programs are desirable and even necessary for many applications. As a simple example, consider the choice between a general numerical-analysis routine and one that has been written specially for sparse inputs. The special version can be much faster, but might not work well in cases where the inputs are dense. Since generality usually incurs a penalty in run-time performance, programmers still take the trouble to write efficient specialized programs and use them whenever possible. Unfortunately, it is not always possible to predict ahead of time whether the specialized versions can be used.

    Of course, these tradeoffs are as old as computer programming, and so is the idea of attacking them with general-purpose programs that generate special-purpose code at run time. For example, in 1968 Ken Thompson implemented a search algorithm that compiled a user-supplied regular expression into an executable finite-state machine in the form of native code for the IBM~7094. This program was both general (because it accepted any regular expression) and fast (because it generated special-purpose code quickly).

    In order to automate the construction of such programs, we have been developing a prototype compiler for run-time code generation. Our system, called Fabius, is a compiler that takes ordinary programs written in a subset of ML and automatically compiles them into native code that generates optimized native code at run time. The dynamically generated code is often much more efficient than statically generated code because it is optimized using run-time values. Furthermore, Fabius optimizes the code that dynamically generates code by using partial evaluation techniques, thereby completely eliminating the need to manipulate any intermediate representation of code at run time. This results in extremely low overhead: the average cost of run-time code generation is about six instructions executed per generated instruction.

    Although not every program benefits from run-time code generation, we have had little trouble finding realistic programs that run significantly faster - sometimes by more than a factor of four - when run-time code generation is used. In some cases, the ML programs compiled by Fabius even outperform corresponding C programs. For example, the BSD packet filter interpreter, which the BSD operating-system kernel uses for fast selection of network packets on behalf of user processes, runs nearly 40% faster on typical inputs when translated from C to ML and compiled by Fabius.

    Language Design for Staged Computation

    One significant difficulty in the Fabius approach is how to come up with annotations of the programs to best (and, importantly, predictably) exploit run-time code generation. We believe, in fact, that the language must provide the expressive power to allow the programmer to specify computation stages explicitly, and that such specifications must be checkable for consistency against a suitable formal model. Our current research is therefore aimed at making use of earlier work by Frank Pfenning and Rowan Davies on type systems based on temporal and modal logics.

    For more details, see the papers listed at the top of this page. You may also contact me or Mark Leone.