/ 8P,,9Types Induce Invariants These invariants provide Safety properties: well-typed programs do not go wrong . Equational properties: when are two expressions interchangeable in all contexts. Representation independence (parametricity).d)< Types as Safety CertificatesTyping is a sufficient condition for these invariants to hold. Well-typed implies well-behaved. Not (necessarily) checkable at run-time! Types form a certificate of safety. Type checking = safety checking. A practical sufficient condition for safety.?J$N N !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\x^_`bcdefghijklmnopqrstwyz{|}~Root EntrydO){s@Current User7GSummaryInformation(PowerPoint Document(DocumentSummaryInformation8 0" DWingdingsRoman(Wblbv0b( 00DCourier Newman(Wblbv0b( 01@DSymbol Newman(Wblbv0b( 0 `. @n?" dd@ @@``@8h% %Z! #$&c$3f@$uʚ;2Nʚ;g43d3dv0xbppp@<4!d!d` 04b)Wb<4dddd` 04b)Wb<4BdBd` 04b)Wbg4,d,dv0xbxp@pp2*___PPT9z?%O=iV4The Practice of Type Theory in Programming Languages 55Robert Harper Carnegie Mellon University August, 2000|AcknowledgementsThanks to Reinhard Wilhelm for inviting me to speak! Thanks to my colleagues, former, and current students at Carnegie Mellon.& f6 LAn Old Story Once upon a time (es war einmal), there were those who thought that typed high-level programming languages would save the world. Ensure safety of executed code. Support reasoning and verification. Run efficiently (enough) on stock hardware. If we all programmed in Pascal (or Algol or Sima]ula or & ), all of our problems would be solved. BZpZaZpa RWhat Happened Instead VThings didn t worked out quite as expected or predicted. COTS software is mostly written in low-level, unsafe languages (ie, C, C++) Some ideas have been adopted (eg, objects and classes), most haven t. Developers have learned to work with less-than-perfect languages, achieving astonishing results.*9ZZ9 ,Languages Ride Again But the world has changed: strong safety assurances are more important than ever. Mobile code on the internet. Increasing reliance on software in real life . Schneider made a strong case for language-based security mechanisms. Languages aren t just languages any more. Rich body of work on logics, semantics, type systems, verification, compilation.RME}9M!} aLanguage-Based Security Key idea: program analysis is more powerful than execution monitoring. This talk is about one approach to taking this view seriously, typed certifying compilation.FS Type Theory and Languages Type theory has emerged as the central organizing principle for language & Design: genericity, abstraction, and modularity mechanisms. Implementation: type inference, flow analysis. Semantics: domain theory, logical relations.2K@ What is a Type System?A type system is a syntactic discipline for enforcing levels of abstraction. Ensures that bad things do not happen. A type system rules out programs. Adding a function to a string Interpreting an integer as a pointer Violating interfaceslMZ'Z#ZXZM' XZv!What is a Type System? .How can this be a good thing? Expressiveness arises from strictures: restrictions entail stronger invariants Flexibility arises from controlled relaxation of strictures, not from their absence. A type system is fundamentally a verification tool that suffices to ensure invariants on execution behavior.m@!; /bTypes Induce InvariantsPTypes induce invariants on programs. If e : int, then its value must be an integer. If e : int int, then it must be a function taking and yielding integers. If e : filedesc, then it must have been obtained by a call to open. If e : int{H}, then no low clearance expression can read its value. % (>/ 8P,,9Types Induce Invariants These invariants provide Safety properties: well-typed programs do not go wrong . Equational properties: when are two expressions interchangeable in all contexts. Representation independence (parametricity).d)< Types as Safety CertificatesTyping is a sufficient condition for these invariants to hold. Well-typed implies well-behaved. Not (necessarily) checkable at run-time! Types form a certificate of safety. Type checking = safety checking. A practical sufficient condition for safety.?J$N N% 1 $The HLL Assumption This is well and good, but & Programs are compiled to unsafe, low-level machine code. We want to know that the object code is safe. HLL assumption: trust the correctness of the compiler and run-time system. A huge assumption. Spurred much research in compiler correctness.(gKB = & Certifying Compilers Idea: propagate types from the source to the object code. Can be checked by a code recipient. Avoids reliance on compiler correctness. Based on a new approach to compilation. Typed intermediate languages. Type-directed translation.|:ZMZ(Z9Z%?(9 dITyped Intermediate LanguagesGeneralize syntax-directed translation to type-directed translation. intermediate languages come equipped with a type system. compiler transformations translate both a program and its type. translation preserves typing: if e:T then e*:T* after translationE ,%,Typed Intermediate LanguagesClassical syntax-directed translation: Source = L1 L2 & Ln = Target : T1 Type system applies to the source language only. Type check, then throw away types.'G1#' !1#Typed Intermediate LanguagesType-directed translation: Source = L1 L2 & Ln = Target : : : T1 T2 & Tn Maintain types during compilation. Translate a program and its type. Types guide translation process.Pr#C <#,&Typed Closure ConversionThe type ST becomes the ADT type Envval env : Envval code : Env * S -> T The application F(a) becomes call(F.code,(F.env, a)) Functions are implementations of the ADT. Essentially, functions are represented as objects. Environment = private fields, code = method.0*` &*`-Typed Object Code ;Typed Assembly Language (TAL) type information ensures safety generated by compiler very close to standard x86 assembly Type information captures types of registers and stack type assumptions at branch targets (including join points) Relies heavily on polymorphism! eg, callee-saves registers, enforcing abstractionZZZZXZ Z2ZZX 2> Typed Assembly Language fact: ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho} jgz r1, positive mov r1,1 ret positive: push r1 ; sp : int::{t1:int,sp:rho}::rho sub r1,r1,1 call fact[int::{r1:int,sp:rho}::rho] imul r1,r1,r2 pop r2 ; sp : {r1:int,sp:rho}:: ret|+; Tracking Stronger PropertiesFamiliar type systems go a long way. Ensures minimal sanity of code. Ensures compliance with interfaces. Especially if you have polymorphism. Refinement types take a step further. Track value range invariants. Array bounds checks, null pointer checks, red-black invariants, etc.%ZiZ&ZcZiRHO&CFRefinement TypespFirst idea: subset types. e : { x : T | P(x) } iff e:T and |= P(e) Examples: Pascal-like sub-ranges0..n = { n : int | 0 n < length(A) } Non-null objects Red-black condition on RBT s) l ,?PC -Refinement typesChecking value range properties is undecidable! eg, cannot decide if 0 e < 10 for general expressions e Checker must include a theorem prover to validate object code. either complex and error prone, or too weak to be usefulv0:?90"9HWRRefinement TypesSecond idea: proof carrying code. (e, p) : { x:T | P(x) } iff e:T and p |- P(e) Provide a proof of the range property. How to obtain it? How to represent it? Verifier checks the types and the proof. using a proof checker, not a proof finder2".'')* ' bP &KFinding ProofsTo use A[n] safely, we must prove that 0 n < size(A). If we insert a run-time check, it s easy! if 0 n < size(A) then *(A+4n) else fail In general we must find proofs. Instrumented analysis methods. Programmer declarations. c* 8 $ 8lR+*Representing ProofsHow do we represent the proofs? Need a formal logic for reasoning about value range properties (for example). Need a proof checker for each such formalism. But which logic should we use? How do we accommodate change? Which properties are of interest?x |@ |@Logical Frameworks3The LF logical framework is a universal language for defining logical systems. Captures uniformities of a large class of logical systems. Provides a formal definition language for logical systems. Proof checking is reduced to a very simple form of type checking. One type checker yields many proof checkers!OvB- (1" Logical FrameworksSpecify the syntactic categories as LF types: term : type. formula : type. proof : formula -> type. For each formula, there is a type of proofs of that formula in the logic.b.6J6Logical FrameworksSpecify generators of each category: zero : term. succ : term -> term. implies : formula -> formula. modus_ponens :proof (implies A B) ->proof A -> proof B. Validity checking = type checking!N%z# z#General Certified CodehThe logic is part of the safety certificate! Logic of type safety. Logic of value ranges. Logic of space requirements. Proofs are LF terms for that logic. Checker is parameterized on specification of the logic (an LF signature ). LF type checker checks proofs in any logic (provided it is formalized in LF).-J$$J Some Challenges Can certified compilation really be made practical? TALC [Morrisett] for safe C . TILT [CMU] for Standard ML [in progress]. SML/NJ [Yale] for Standard ML [in progress]. Touchstone [Necula, Lee] for safe C .&44 Some Challenges Can refinements be made useful and practical? Dependent ML [Pfenning, Xi] Dependently-Typed Assembly [Harper, Xi] Experience with ESC is highly relevant. A difference is that refinements are built in to the language.L.D(?.D(? uZSome PredictionsCertifying compilation will be standard technology. Code will come equipped with checkable safety certificates. Type systems will become the framework for building practical development tools. Part of the program text. Mechanically checkable.4<Q2 < +2 Further Information http://www.typetheory.com Pbr (c( F/0|DTimes New Roman(Wbbv0b( 0DArialNew Roman(Wbbv0b( 0" DWingdingsRoman(Wbbv0b( 00DCourier Newman(Wbbv0b( 01@DSymbol !"#$%&'()*+,-./01234568 Oh+'0 px4 L\ | The Practice of Type TheoryRobert Harperf Dagstuhl Tenth Anniversary tOC:\WINDOWS\Profiles\rwh\Application Data\Microsoft\Templates\Research Talk.potRobert Harperfi1878t HMicrosoft PowerPointwh\@B@` |=a@{s:G g 1& &&#TNPP 2OMi & TNPP &&TNPP 3f- "-- !-- "-&G& - &Gy& --iyH-- @"Arial- .12 mThe Practice of Type Theory $!!' !!$!!#!! . .+2 ewin Programming Languages !'! !11!!! !! !! .--Q1-- @"Arial- .2 Y Robert Harper. ..2 Carnegie Mellon University # . .2 AbAugust, 2000.--"System-&TNPP &% 1 $The HLL Assumption This is well and good, but & Programs are compiled to unsafe, low-level machine code. We want to know that the object code is safe. HLL assumption: trust the correctness of the compiler and run-time system. A huge assumption. Spurred much research in compiler correctness.(gKB = & Certifying Compilers Idea: propagate types from the source to the object code. Can be checked by a code recipient. Avoids reliance on compiler correctness. Based on a new approach to compilation. Typed intermediate languages. Type-directed translation.|:ZMZ(Z9Z%?(9 dITyped Intermediate LanguagesGeneralize syntax-directed translation to type-directed translation. intermediate languages come equipped with a type system. compiler transformations translate both a program and its type. translation preserves typing: if e:T then e*:T* after translationE ,%,Typed Intermediate LanguagesClassical syntax-directed translation: Source = L1 L2 & Ln = Target : T1 Type system applies to the source language only. Type check, then throw away types.'G1#' !1#Typed Intermediate LanguagesType-directed translation: Source = L1 L2 & Ln = Target : : : T1 T2 & Tn Maintain types during compilation. Translate a program and its type. Types guide translation process.Pr#C <#,&Typed Closure ConversionThe type ST becomes the ADT type Envval env : Envval code : Env * S -> T The application F(a) becomes call(F.code,(F.env, a)) Functions are implementations of the ADT. Essentially, functions are represented as objects. Environment = private fields, code = method.0*` &*`-Typed Object Code ;Typed Assembly Language (TAL) type information ensures safety generated by compiler very close to standard x86 assembly Type information captures types of registers and stack type assumptions at branch targets (including join points) Relies heavily on polymorphism! eg, callee-saves registers, enforcing abstractionZZZZXZ Z2ZZX 2> Typed Assembly Language fact: ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho} jgz r1, positive mov r1,1 ret positive: push r1 ; sp : int::{t1:int,sp:rho}::rho sub r1,r1,1 call fact[int::{r1:int,sp:rho}::rho] imul r1,r1,r2 pop r2 ; sp : {r1:int,sp:rho}:: ret|+; Tracking Stronger PropertiesFamiliar type systems go a long way. Ensures minimal sanity of code. Ensures compliance with interfaces. Especially if you have polymorphism. Refinement types take a step further. Track value range invariants. Array bounds checks, null pointer checks, red-black invariants, etc.%ZiZ&ZcZiRHO&CFRefinement TypespFirst idea: subset types. e : { x : T | P(x) } iff e:T and |= P(e) Examples: Pascal-like sub-ranges0..n = { n : int | 0 n < length(A) } Non-null objects Red-black condition on RBT s) l ,?PC -Refinement typesChecking value range properties is undecidable! eg, cannot decide if 0 e < 10 for general expressions e Checker must include a theorem prover to validate object code. either complex and error prone, or too weak to be usefulv0:?90"9HWRRefinement TypesSecond idea: proof carrying code. (e, p) : { x:T | P(x) } iff e:T and p |- P(e) Provide a proof of the range property. How to obtain it? How to represent it? Verifier checks the types and the proof. using a proof checker, not a proof finder2".'')* ' bP &KFinding ProofsTo use A[n] safely, we must prove that 0 n < size(A). If we insert a run-time check, it s easy! if 0 n < size(A) then *(A+4n) else fail In general we must find proofs. Instrumented analysis methods. Programmer declarations. c* 8 $ 8lR+*Representing ProofsHow do we represent the proofs? Need a formal logic for reasoning about value range properties (for example). Need a proof checker for each such formalism. But which logic should we use? How do we accommodate change? Which properties are of interest?x |@ |@Logical Frameworks3The LF logical framework is a universal language for defining logical systems. Captures uniformities of a large class of logical systems. Provides a formal definition language for logical systems. Proof checking is reduced to a very simple form of type checking. One type checker yields many proof checkers!OvB- (1" Logical FrameworksSpecify the syntactic categories as LF types: term : type. formula : type. proof : formula -> type. For each formula, there is a type of proofs of that formula in the logic.b.6J6Logical FrameworksSpecify generators of each category: zero : term. succ : term -> term. implies : formula -> formula. modus_ponens :proof (implies A B) ->proof A -> proof B. Validity checking = type checking!N%z# z#General Certified CodehThe logic is part of the safety certificate! Logic of type safety. Logic of value ranges. Logic of space requirements. Proofs are LF terms for that logic. Checker is parameterized on specification of the logic (an LF signature ). LF type checker checks proofs in any logic (provided it is formalized in LF).-J$$J Some Challenges Can certified compilation really be made practical? TALC [Morrisett] for safe C . TILT [CMU] for Standard ML [in progress]. SML/NJ [Yale] for Standard ML [in progress]. Touchstone [Necula, Lee] for safe C .&44 Some Challenges Can refinements be made useful and practical? Dependent ML [Pfenning, Xi] Dependently-Typed Assembly [Harper, Xi] Experience with ESC is highly relevant. A difference is that refinements are built in to the language.L.D(?.D(? uZSome PredictionsCertifying compilation will be standard technology. Code will come equipped with checkable safety certificates. Type systems will become the framework for building practical development tools. Part of the program text. Mechanically checkable.4<Q2 < +2 Further Information http://www.typetheory.com Pbr\ 8s