Safe Explicitly Staged Programming
Frank Pfenning
Carnegie Mellon University
Joint work with Peter Lee, Rowan Davies, Aleksandar Nanevski, and Philip Wickline

The Fox Project at Carnegie Mellon
Advanced Programming Languages for Systems Software
  Karl Crary
  Robert Harper
  Peter Lee
  Frank Pfenning
  Many former and current students...

Threads of Research
Typed Intermediate and Low-Level Languages
Karl Crary, Robert Harper
Collaboration with Cornell
Proof-Carrying Code and Logical Frameworks
Peter Lee, Frank Pfenning
Collaborations: Berkeley, Princeton, Yale, Stanford
Commercial Java tech transfer to Cedilla Systems
Staged Computation
Peter Lee, Frank Pfenning

Unifying Themes
Problems motivated by the practice of software development
Logic and type theory as indispensible tools for their solution
Theory and system building go hand-in-hand

Talk Outline
Motivation
Explicit Staging and Types
Assessment
Summary and Future Work

Modularity vs. Efficiency
Goal: building systems that are
highly modular
correct
easy to adapt
fast to construct
How can we achieve this and attain efficiency?
Our approach:
   Types and explicitly staged computation

Configuration in the Foxnet

What is Staged Computation?
Explicit or implicit division of a computation into static or dynamic phases
Run-time code generation
Partial evaluation
In-lining
Macro processing

Staging in Algorithm Design
Standard informal practice
Example: Sparse matrix multiplication
Example: JIT compilation [e.g. packet filter]
Widely applicable
Flexible
Difficult to automate
Hard to maintain during program evolution

Example: Just-in-Time Compilation

Partial Evaluation
Binding-time analysis: static (early) and dynamic (late) data
One-time global program specialization
Serious implementations [e.g. Tempo]
Well-developed theory
Binding-time analysis automatic
Can be unpredictable for programmer
Difficult to maintain under program changes

Run-Time Code Generation
Dynamic generation of code based on values only available at run-time
Serious implementations [e.g. `C, Cyclone]
Widely applicable, including systems programming and middleware
Performance hard to predict
Programs (very) difficult to write

Example: Run-Time Code Generation

Example: Run-Time Code Generation

Why is Staged Programming Difficult?
Departure from simple operational model
Reasoning about program-generating programs
Automation leading to unpredictability
Instability under program evolution

Our Thesis
The inherent tensions between modularity, safety, and efficiency can be relieved through programming with explicit, transparent, and statically verifiable constructs for staging
Applications particularly in complex systems where modularity and data abstraction are critical

Explicit Staging
Staging behavior should be explicit in programs
Promotes readability and maintainability
Supports reliable reasoning about behavior
Ameliorates unpredictability of optimizations
Not unduly burdensome

Transparency
The meaning of staging constructs should be clearly and unambiguously specified
Portability across platforms and compilers
Evolution from an art to a science
Some independence from language paradigm
Issues of efficient implementation remain

Static Verification
Correct staging should be statically enforced
Catch mistakes early
Diagnose notoriously difficult staging errors
Support safe program revision
Prevent inconsistencies at module boundaries

Talk Outline
Motivation
Explicit Staging and Types
Assessment
Summary and Future Work

Types to Guide Language Design
Elegance and uniformity
Static verifiability
Orthogonality to other constructs
Conservative extension property
Consistency across module boundaries
Some measure of  language independence
Reasoning about programs

Intensional Expressions
A type of intensional expressions is a good model for many forms of staged computation
Intensional Expression = Source Code
Term = Compiled Code
Coerce to programs by evaluation (abstractly)
Analyze and transform structurally
In contrast, programs (terms) can only be executed

Manipulating Expressions
E — expression, u — expression variable
box E — term, x — term variable
let box u = M in N —                                   substitute expression denoted by M for u in N
embed(x) = let box u = x in box (...u...)
eval(x) = let box u = x in u
simplify(x) = case x of box (0*u) Þ box (0) | ...

Intensional Typing
Typing judgment
   D ; G |- M : A
D declares expression variables u :: A
G declares value variables x : A
M is a term
A is a type

Expression Types
A — terms denoting expressions of type A
Introduction rule (constructor)
D ; • |- E : A
          D ; G |- box E : A
E depends only on expression variables!

Expression Types Continued
Elimination rule (destructor)
D ; G |- M : A      D,u::A ; G |- N : C
D ; G |- (let box u = M in N) : C
Operational semantics captures staging
                               M Þ box E    [E/u] N Þ V
box E Þ box E        let box u = M in N Þ V

Some Simple Programs
eval : A ® A
   eval (x) = let box u = x in u
quote : A ® A
   quote (x) = let box u = x in box (box u)
app : (A ® B) ® A ® B
   app (f) (x) =
         let box g = f in
              let box u = x in box (g u)

Some Other Examples
match : regexp ® (string ® bool)
eval_poly : poly ® (float ® float)
iprod : int ® (vector ® (vector ® float))
packet_filter :
       instruction array ´ int
         ® (int ´ int ´ int array ® int)

Example from Foxnet
type connection = (msg ® unit) ´ (unit ® unit)
    a pair of functions to send and abort
open : address
               ® ((connection ® (indata ® unit))
                          ® unit)

A Logical Analysis
Types satisfy precisely the laws of the modal logic S4
Multiple world semantics as computation stages
Admits several concrete realizations

Requirements Revisited
Explicit staging via “box” and “let box”
Transparency via simple specification
Static verification via modal type-checking

Run-Time Code Generation
A — generator for code of type A
Modal restrictions guarantee that source expression will be available at run-time
[Davies & Pfenning ‘96, ‘00, ‘01]
Implementation via lightweight generating extensions
[Leone & Lee ‘96]
[Wickline, Lee, Pfenning, Davies ‘98]

Reflective Programming
Extension by pattern matching against expressions [Nanevski & Pfenning]
Intensional interpretation guarantees soundness
Less efficient to implement but more expressive
Meta-ML [Taha et al. ‘99,’00] incorporates modal and temporal types

Partial Evaluation
Requires different type system based on temporal logic
OA — A at the next stage of computation
Captures traditional binding-time analysis precisely and logically
[Davies ‘96]

Talk Outline
Motivation
Explicit Staging and Types
Assessment
Summary and Future Work

Programming
Staging errors caught quickly
Explicit staging not too intrusive
Conservative over host language
Reasonable expressive power of primitives
Predictable programmer efficiency model

Modularity
Types guarantee module consistency
Can express some inter-module optimizations
Not as appropriate for some automatic improvements (e.g. in-lining)

Implementation
PML implementation of light-weight run-time code generation in progress
Results with predecessor Fabius encouraging [Leone & Lee ‘96]
Efficient implementation quite difficult
More experiments needed

Matrix Multiplication in Fabius

Reasoning about Staged Programs
Developed type theory for reasoning about staged programs [Pfenning’01]
Enabled by clean and minimal design
Required formalization of “dead code”

Talk Outline
Motivation
Explicit Staging and Types
Assessment
Summary and Future Work

Summary
Logical and type-theoretic foundation for staged computation
Implementation as run-time code generation or reflective programming
Applications in complex systems where tensions between modularity, safety, and efficiency are greatest

Future Work
Complete implementation, including modules
Optimizations
Transfer to other languages
Applications, including program composition for complex systems (e.g. build your own “OS”)

Conclusion
Designed an explicit, transparent, and statically verifiable language for staged computation
Conservative over existing language(s)
Tools of logic and type theory are critical