Safe Explicitly Staged Programming

Frank Pfenning

Carnegie Mellon University

Joint work with Peter Lee, Rowan Davies, Aleksandar Nanevski, and Philip Wickline

The Fox Project at Carnegie Mellon

Advanced Programming Languages for Systems Software

Karl Crary

Robert Harper

Peter Lee

Frank Pfenning

Many former and current students...

Threads of Research

Typed Intermediate and Low-Level Languages

Karl Crary, Robert Harper

Collaboration with Cornell

Proof-Carrying Code and Logical Frameworks

Peter Lee, Frank Pfenning

Collaborations: Berkeley, Princeton, Yale, Stanford

Commercial Java tech transfer to Cedilla Systems

Staged Computation

Peter Lee, Frank Pfenning

Unifying Themes

Problems motivated by the practice of software development

Logic and type theory as indispensible tools for their solution

Theory and system building go hand-in-hand

Talk Outline

Motivation

Explicit Staging and Types

Assessment

Summary and Future Work

Modularity vs. Efficiency

Goal: building systems that are

highly modular

correct

easy to adapt

fast to construct

How can we achieve this and attain efficiency?

Our approach:

Types and explicitly staged computation

Configuration in the Foxnet

What is Staged Computation?

Explicit or implicit division of a computation into static or dynamic phases

Run-time code generation

Partial evaluation

In-lining

Macro processing

Staging in Algorithm Design

Standard informal practice

Example: Sparse matrix multiplication

Example: JIT compilation [e.g. packet filter]

Widely applicable

Flexible

Difficult to automate

Hard to maintain during program evolution

Example: Just-in-Time Compilation

Partial Evaluation

Binding-time analysis: static (early) and dynamic (late) data

One-time global program specialization

Serious implementations [e.g. Tempo]

Well-developed theory

Binding-time analysis automatic

Can be unpredictable for programmer

Difficult to maintain under program changes

Run-Time Code Generation

Dynamic generation of code based on values only available at run-time

Serious implementations [e.g. `C, Cyclone]

Widely applicable, including systems programming and middleware

Performance hard to predict

Programs (very) difficult to write

Example: Run-Time Code Generation

Why is Staged Programming Difficult?

Departure from simple operational model

Reasoning about program-generating programs

Automation leading to unpredictability

Instability under program evolution

Our Thesis

The inherent tensions between modularity, safety, and efficiency can be relieved through programming with explicit, transparent, and statically verifiable constructs for staging

Applications particularly in complex systems where modularity and data abstraction are critical

Explicit Staging

Staging behavior should be explicit in programs

Promotes readability and maintainability

Supports reliable reasoning about behavior

Ameliorates unpredictability of optimizations

Not unduly burdensome

Transparency

The meaning of staging constructs should be clearly and unambiguously specified

Portability across platforms and compilers

Evolution from an art to a science

Some independence from language paradigm

Issues of efficient implementation remain

Static Verification

Correct staging should be statically enforced

Catch mistakes early

Diagnose notoriously difficult staging errors

Support safe program revision

Prevent inconsistencies at module boundaries

Talk Outline

Motivation

Explicit Staging and Types

Assessment

Summary and Future Work

Types to Guide Language Design

Elegance and uniformity

Static verifiability

Orthogonality to other constructs

Conservative extension property

Consistency across module boundaries

Some measure of language independence

Reasoning about programs

Intensional Expressions

A type of intensional expressions is a good model for many forms of staged computation

Intensional Expression = Source Code

Term = Compiled Code

Coerce to programs by evaluation (abstractly)

Analyze and transform structurally

In contrast, programs (terms) can only be executed

Manipulating Expressions

E — expression, u — expression variable

box E — term, x — term variable

let box u = M in N — substitute expression denoted by M for u in N

embed(x) = let box u = x in box (...u...)

eval(x) = let box u = x in u

simplify(x) = case x of box (0*u) Þ box (0) | ...

Intensional Typing

Typing judgment

D ; G |- M : A

D declares expression variables u :: A

G declares value variables x : A

M is a term

A is a type

Expression Types

A — terms denoting expressions of type A

Introduction rule (constructor)

D ; • |- E : A

D ; G |- box E : A

E depends only on expression variables!

Expression Types Continued

Elimination rule (destructor)

D ; G |- M : A      D,u::A ; G |- N : C

D ; G |- (let box u = M in N) : C

Operational semantics captures staging

                               M Þ box E    [E/u] N Þ V

box E Þ box E        let box u = M in N Þ V

Some Simple Programs

eval : A ® A

   eval (x) = let box u = x in u

quote : A ® A

   quote (x) = let box u = x in box (box u)

app : (A ® B) ® A ® B

   app (f) (x) =

         let box g = f in

              let box u = x in box (g u)

Some Other Examples

match : regexp ® (string ® bool)

eval_poly : poly ® (float ® float)

iprod : int ® (vector ® (vector ® float))

packet_filter :

instruction array ´ int

® (int ´ int ´ int array ® int)

Example from Foxnet

type connection = (msg ® unit) ´ (unit ® unit)

    a pair of functions to send and abort

open : address

               ® ((connection ® (indata ® unit))

                          ® unit)

A Logical Analysis

Types satisfy precisely the laws of the modal logic S4

Multiple world semantics as computation stages

Admits several concrete realizations

Requirements Revisited

Explicit staging via “box” and “let box”

Transparency via simple specification

Static verification via modal type-checking

Run-Time Code Generation

A — generator for code of type A

Modal restrictions guarantee that source expression will be available at run-time

[Davies & Pfenning ‘96, ‘00, ‘01]

Implementation via lightweight generating extensions

[Leone & Lee ‘96]

[Wickline, Lee, Pfenning, Davies ‘98]

Reflective Programming

Extension by pattern matching against expressions [Nanevski & Pfenning]

Intensional interpretation guarantees soundness

Less efficient to implement but more expressive

Meta-ML [Taha et al. ‘99,’00] incorporates modal and temporal types

Partial Evaluation

Requires different type system based on temporal logic

OA — A at the next stage of computation

Captures traditional binding-time analysis precisely and logically

[Davies ‘96]

Talk Outline

Motivation

Explicit Staging and Types

Assessment

Summary and Future Work

Programming

Staging errors caught quickly

Explicit staging not too intrusive

Conservative over host language

Reasonable expressive power of primitives

Predictable programmer efficiency model

Modularity

Types guarantee module consistency

Can express some inter-module optimizations

Not as appropriate for some automatic improvements (e.g. in-lining)

Implementation

PML implementation of light-weight run-time code generation in progress

Results with predecessor Fabius encouraging [Leone & Lee ‘96]

Efficient implementation quite difficult

More experiments needed

Matrix Multiplication in Fabius

Reasoning about Staged Programs

Developed type theory for reasoning about staged programs [Pfenning’01]

Enabled by clean and minimal design

Required formalization of “dead code”

Talk Outline

Motivation

Explicit Staging and Types

Assessment

Summary and Future Work

Summary

Logical and type-theoretic foundation for staged computation

Implementation as run-time code generation or reflective programming

Applications in complex systems where tensions between modularity, safety, and efficiency are greatest

Future Work

Complete implementation, including modules

Optimizations

Transfer to other languages

Applications, including program composition for complex systems (e.g. build your own “OS”)

Conclusion

Designed an explicit, transparent, and statically verifiable language for staged computation

Conservative over existing language(s)

Tools of logic and type theory are critical


	Frank Pfenning
	Carnegie Mellon University
	Joint work with Peter Lee, Rowan Davies, Aleksandar Nanevski, and Philip Wickline


	Advanced Programming Languages for Systems Software
	Karl Crary
	Robert Harper
	Peter Lee
	Frank Pfenning
	Many former and current students...


	Typed Intermediate and Low-Level Languages
		Karl Crary, Robert Harper
		Collaboration with Cornell
	Proof-Carrying Code and Logical Frameworks
		Peter Lee, Frank Pfenning
		Collaborations: Berkeley, Princeton, Yale, Stanford
		Commercial Java tech transfer to Cedilla Systems
	Staged Computation
		Peter Lee, Frank Pfenning


	Problems motivated by the practice of software development
	Logic and type theory as indispensible tools for their solution
	Theory and system building go hand-in-hand


	Motivation
	Explicit Staging and Types
	Assessment
	Summary and Future Work


	Goal: building systems that are
		highly modular
		correct
		easy to adapt
		fast to construct
	How can we achieve this and attain efficiency?
	Our approach:
	Types and explicitly staged computation


	Explicit or implicit division of a computation into static or dynamic phases
	Run-time code generation
	Partial evaluation
	In-lining
	Macro processing


	Standard informal practice
	Example: Sparse matrix multiplication
	Example: JIT compilation [e.g. packet filter]
	Widely applicable
	Flexible
	Difficult to automate
	Hard to maintain during program evolution


	Binding-time analysis: static (early) and dynamic (late) data
	One-time global program specialization
	Serious implementations [e.g. Tempo]
	Well-developed theory
	Binding-time analysis automatic
	Can be unpredictable for programmer
	Difficult to maintain under program changes


	Dynamic generation of code based on values only available at run-time
	Serious implementations [e.g. `C, Cyclone]
	Widely applicable, including systems programming and middleware
	Performance hard to predict
	Programs (very) difficult to write


	Departure from simple operational model
	Reasoning about program-generating programs
	Automation leading to unpredictability
	Instability under program evolution


	The inherent tensions between modularity, safety, and efficiency can be relieved through programming with explicit, transparent, and statically verifiable constructs for staging
	Applications particularly in complex systems where modularity and data abstraction are critical


	Staging behavior should be explicit in programs
	Promotes readability and maintainability
	Supports reliable reasoning about behavior
	Ameliorates unpredictability of optimizations
	Not unduly burdensome


	The meaning of staging constructs should be clearly and unambiguously specified
	Portability across platforms and compilers
	Evolution from an art to a science
	Some independence from language paradigm
	Issues of efficient implementation remain


	Correct staging should be statically enforced
	Catch mistakes early
	Diagnose notoriously difficult staging errors
	Support safe program revision
	Prevent inconsistencies at module boundaries


	Elegance and uniformity
	Static verifiability
	Orthogonality to other constructs
	Conservative extension property
	Consistency across module boundaries
	Some measure of language independence
	Reasoning about programs


	A type of intensional expressions is a good model for many forms of staged computation
	Intensional Expression = Source Code
	Term = Compiled Code
	Coerce to programs by evaluation (abstractly)
	Analyze and transform structurally
	In contrast, programs (terms) can only be executed


	E — expression, u — expression variable
	box E — term, x — term variable
	let box u = M in N — substitute expression denoted by M for u in N
	embed(x) = let box u = x in box (...u...)
	eval(x) = let box u = x in u
	simplify(x) = case x of box (0*u) Þ box (0) \| ...


	Typing judgment
	D ; G \|- M : A
	D declares expression variables u :: A
	G declares value variables x : A
	M is a term
	A is a type


	A — terms denoting expressions of type A
	Introduction rule (constructor)
	D ; • \|- E : A
	D ; G \|- box E : A
	E depends only on expression variables!


	Elimination rule (destructor)
	D ; G \|- M : A D,u::A ; G \|- N : C
	D ; G \|- (let box u = M in N) : C
	Operational semantics captures staging
	M Þ box E [E/u] N Þ V
	box E Þ box E let box u = M in N Þ V


	eval : A ® A
	eval (x) = let box u = x in u
	quote : A ® A
	quote (x) = let box u = x in box (box u)
	app : (A ® B) ® A ® B
	app (f) (x) =
	let box g = f in
	let box u = x in box (g u)


	match : regexp ® (string ® bool)
	eval_poly : poly ® (float ® float)
	iprod : int ® (vector ® (vector ® float))
	packet_filter :
	instruction array ´ int
	® (int ´ int ´ int array ® int)


	type connection = (msg ® unit) ´ (unit ® unit)
	a pair of functions to send and abort
	open : address
	® ((connection ® (indata ® unit))
	® unit)


	Types satisfy precisely the laws of the modal logic S4
	Multiple world semantics as computation stages
	Admits several concrete realizations


	Explicit staging via “box” and “let box”
	Transparency via simple specification
	Static verification via modal type-checking


	A — generator for code of type A
	Modal restrictions guarantee that source expression will be available at run-time
	[Davies & Pfenning ‘96, ‘00, ‘01]
	Implementation via lightweight generating extensions
	[Leone & Lee ‘96]
	[Wickline, Lee, Pfenning, Davies ‘98]


	Extension by pattern matching against expressions [Nanevski & Pfenning]
	Intensional interpretation guarantees soundness
	Less efficient to implement but more expressive
	Meta-ML [Taha et al. ‘99,’00] incorporates modal and temporal types


	Requires different type system based on temporal logic
	OA — A at the next stage of computation
	Captures traditional binding-time analysis precisely and logically
	[Davies ‘96]


	Staging errors caught quickly
	Explicit staging not too intrusive
	Conservative over host language
	Reasonable expressive power of primitives
	Predictable programmer efficiency model