CS 15-212: Principles of Programming
(Spring 2009)

In this course, there will be two types of class meetings: lectures and recitations.
At a glance ...
|
|
Sun 11 Jan
Lecture 1
|
Welcome and Course Introduction
We outline the course, its goals, and talk about various administrative
issues.
Inductive Computing
We begin by reviewing a few simple proofs of numerical properties by
mathematical and complete induction, and identify their distinguishing
feature: the necessary fallback on a base case. We use this observation to
express the participating entities computationally as inductive function
definitions. We conclude by discussing the differences between inductively
defined functions and recursive functions.
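As a first taste, in the notation of the ML language introduced next
lecture (a minimal sketch):
    (* an inductive function definition: the base case mirrors the base
       case of a proof by induction, and the recursive call is on a
       strictly smaller argument *)
    fun sum 0 = 0                  (* base case *)
      | sum n = n + sum (n - 1)    (* inductive case, for n > 0 *)
    (* as a recursive function, sum is also defined on negative inputs,
       where it never reaches the base case -- one such difference *)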
|
|
Mon 12 Jan
Recitation 1
|
Exercises on Inductive Computing
Exercises:
Readings:
|
|
Tue 13 Jan
Lecture 2
|
Introduction to SML
This lecture introduces the technical underpinnings of the language
Standard ML as well as some elementary constructs. In particular, it
discusses the functional programming paradigm, the characteristics of
strongly typed languages, and interpreter-based execution.
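For instance, a first exchange with the interpreter might look as follows
(a sketch, in the output format of SML/NJ-style systems):
    - val x = 3 + 4;
    val x = 7 : int
    - fun double n = 2 * n;
    val double = fn : int -> int
    - double x;
    val it = 14 : int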
Concepts:
Further readings:
|
|
Wed 14 Jan
Recitation 2
|
SML, Style
|
|
Sun 18 Jan
Lecture 3
|
Inductive Data Structures
The concept of inductive definition, introduced in the specific case of
natural numbers, is easily extended to generic data structures as long as
they are finite. These correspond to the notion of freely-generated
expressions from abstract algebra. We demonstrate this technique in the
case of lists and trees.
Such an inductive definition of data structures is paralleled by a simple
generalization of traditional proofs by induction, called structural
induction.
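In the notation of the next lecture, such freely-generated structures might
be declared as follows (constructor names are illustrative):
    datatype intlist = Nil  | Cons of int * intlist
    datatype inttree = Leaf | Node of inttree * int * inttree

    (* structural recursion over the tree, mirroring structural induction *)
    fun size Leaf = 0
      | size (Node (l, _, r)) = 1 + size l + size r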
Concepts:
Further readings:
|
|
Mon 19 Jan
Recitation 3
|
Exercises on Inductive Data Structures
Exercises:
Readings:
|
|
Tue 20 Jan
Lecture 4
|
Datatypes, Patterns, and Lists
The algebraic concept of inductive data structure is available in ML as the
datatype mechanism. This provides a way to represent data in a natural
fashion. We define datatypes for a variety of data structures and show how
to write functions for them. We see that the structure of a datatype
definition naturally leads to clausal function definitions, a convenient way
to work with them based on pattern matching.
We discuss at length ML's predefined support for lists and introduce the
programming concept of polymorphism.
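A minimal sketch of these ideas:
    datatype shape = Circle of real | Rect of real * real

    (* a clausal definition: one clause per constructor, selected by
       pattern matching *)
    fun area (Circle r) = Math.pi * r * r
      | area (Rect (w, h)) = w * h

    (* polymorphic: length : 'a list -> int works at any element type *)
    fun length [] = 0
      | length (_ :: xs) = 1 + length xs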
Further readings:
|
|
Wed 21 Jan
Recitation 4
|
Exercises on Datatypes, Patterns, and Lists
|
|
Sun 25 Jan
Lecture 5
|
Proving Properties of Programs
Inductive definitions can be used not only to describe data structures, but
also to give a formal specification to programming concepts such as typing
and evaluation. We concentrate on ML evaluation and show how the
methodology of structural induction can be used to prove properties about
programs. We apply it to termination proofs, equivalence proofs, and to
prove the correctness of an ML function with respect to a specification.
Along the way, we encounter common techniques for proving uncooperative
theorems by induction, in particular the concept of generalization.
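As an illustration of generalization (a standard example, not necessarily
the one used in lecture), proving that the two reversal functions below
agree fails by direct induction; one must generalize the claim to
revA (xs, acc) = rev xs @ acc for all acc:
    fun rev [] = []
      | rev (x :: xs) = rev xs @ [x]

    fun revA ([], acc) = acc
      | revA (x :: xs, acc) = revA (xs, x :: acc)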
Concepts:
Further readings:
|
|
Mon 26 Jan
Recitation 5
|
Exercises on Properties of Programs
Exercises:
Readings:
|
|
Tue 27 Jan
Lecture 6
|
Declarations, Binding, and Scope
We take a closer look at how ML manages declarations. Declarations evaluate
to environments, which collect bindings of variables to values that can be
used in subsequent declarations or expressions.
We introduce the mechanisms provided in ML for local declarations and expand
on the rules of scope, which explain how references to identifiers are
resolved. This is somewhat tricky for recursive function declarations.
We also discuss tail recursion, a form of recursion that is somewhat like
the use of loops in imperative programming. This form of recursion is
often especially efficient and easy to analyze. Accumulator arguments play
an important role in tail recursion.
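A minimal sketch combining these ideas:
    fun sum n =
      let
        (* loop is visible only inside this declaration; each recursive
           call is the last thing it does, like one iteration of a loop *)
        fun loop (0, acc) = acc
          | loop (i, acc) = loop (i - 1, acc + i)
      in
        loop (n, 0)
      end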
Further readings:
|
|
Wed 28 Jan
Recitation 6
|
Exercises on Declarations, Binding and Scope
|
|
Sun 1 Feb
Lecture 7
|
Representation Invariants
As inductive definitions get more complex, it becomes a challenge to
convince oneself (and prove to others) that they are actually correct. The
argument usually relies on invariants that the representation is expected to
satisfy at all times. Making these invariants explicit is extremely useful.
We demonstrate this on red/black trees, a relatively complex kind of search
tree.
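For instance, the representation might be declared as follows; the datatype
alone enforces none of the invariants (ordering, no red node with a red
child, equal number of black nodes on every path), which must be stated
explicitly and preserved by every operation:
    datatype color = Red | Black
    datatype 'a rbtree = Empty
                       | Node of color * 'a rbtree * 'a * 'a rbtree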
Concepts:
Further readings:
|
|
Mon 2 Feb
Recitation 7
|
Exercises on Representation Invariants
Exercises:
Readings:
|
|
Tue 3 Feb
Lecture 8
|
Modularity
We discuss the separation between specification and implementation in the
context of code reuse. The specification describes the functionality of an
abstract data type, with its syntactic aspects expressed as a module
interface. The implementation consists of specific code that realizes this
functionality, code whose details are invisible to the user of a module.
We also introduce advanced concepts such as parametric modules.
The discussion is made concrete by examining the specific module system of
ML, in particular the concepts of signature, structure, functor, and
ascription.
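A minimal sketch using illustrative names: a signature for stacks, an
implementation by lists, and opaque ascription (:>) hiding the
representation:
    signature STACK =
    sig
      type 'a stack
      val empty : 'a stack
      val push  : 'a * 'a stack -> 'a stack
      val pop   : 'a stack -> ('a * 'a stack) option
    end

    structure ListStack :> STACK =
    struct
      type 'a stack = 'a list
      val empty = []
      fun push (x, s) = x :: s
      fun pop [] = NONE
        | pop (x :: s) = SOME (x, s)
    end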
Further readings:
|
|
Wed 4 Feb
Recitation 8
|
Exercises on Modularity
Exercises:
Readings:
|
|
Sun 8 Feb
Lecture 9
|
First-Class Functions
Higher-order functions are functions that manipulate other functions. We
introduce the notion of nameless functions and functions as first-class values.
We discuss currying, a common transformation between some traditional
functions and some higher-order functions, and study situations where one or
the other representation is advantageous.
We present some standard higher-order functions that make it possible to
work concisely with lists and other inductive data structures.
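A few illustrative bindings (a minimal sketch):
    val double = fn x => 2 * x        (* a nameless function, given a name *)

    fun add x y = x + y : int         (* curried: add : int -> int -> int *)
    val incr = add 1                  (* partial application *)

    val ds = map double [1, 2, 3]     (* predefined map: [2, 4, 6] *)
    val s  = foldr (op +) 0 [1, 2, 3] (* folding a list down to 6 *)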
Concepts:
Further readings:
|
|
Mon 9 Feb
Recitation 9
|
Exercises on Higher-Order Functions
Readings:
|
|
Tue 10 Feb
Lecture 10
|
Continuations
One very useful application of higher-order functions is as a way to control
the execution of a program: continuations act as "functional accumulators."
The basic idea of the technique is to implement a function f by defining a
tail-recursive function f' that takes an additional argument, the
continuation. This continuation is a function; it encapsulates the
computation that should be done on the result of f. In the base case,
instead of returning a result, we call the continuation. In the recursive
case we augment the given continuation with whatever computation should be
done on the result.
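A minimal sketch of this recipe for summing a list (the names sum' and k
are illustrative):
    (* sum' : int list * (int -> int) -> int *)
    fun sum' ([], k) = k 0                  (* base case: call the continuation *)
      | sum' (x :: xs, k) =
          sum' (xs, fn r => k (x + r))      (* augment the continuation *)

    fun sum xs = sum' (xs, fn r => r)       (* initial continuation *)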
Concepts:
Further readings:
|
|
Wed 11 Feb
Recitation 10
|
Exercises on Continuations
Readings:
|
|
Sun 15 Feb
Lecture 11
|
Puzzles and games
Implementing games efficiently requires explicit and complex control over
execution. In this lecture, we look at games and their representation as a
mathematical problem. We define games and game trees, and study two
strategies for choosing the next move in a game.
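A classic example of such a strategy is minimax search; a toy sketch over
hand-built game trees (the representation is illustrative, and every Node
is assumed to have at least one child):
    datatype player = Max | Min
    datatype gtree  = Leaf of int                 (* value of a final position *)
                    | Node of player * gtree list

    (* Max picks the largest child value, Min the smallest *)
    fun value (Leaf v) = v
      | value (Node (Max, ts)) = foldl Int.max (valOf Int.minInt) (map value ts)
      | value (Node (Min, ts)) = foldl Int.min (valOf Int.maxInt) (map value ts)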
Further readings:
|
|
Mon 16 Feb
Recitation 11
|
Exercises on Game Tree Search
|
|
Tue 17 Feb
Lecture 12
|
Exceptions
Exceptions are another way to control execution programmatically. We see
that exceptions can be used not only to signal error conditions, but also in
backtracking search procedures or other patterns of control where a
computation needs to be partially undone.
Exceptions are the first type of effect that we encounter; they may
cause an evaluation to be interrupted or aborted.
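A minimal sketch of exception-based backtracking (the tree type and
predicate are illustrative):
    exception NotFound

    datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

    (* search the left branch; if it fails, undo and try the right one *)
    fun find p Leaf = raise NotFound
      | find p (Node (l, x, r)) =
          if p x then x
          else (find p l handle NotFound => find p r)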
Further readings:
|
|
Wed 18 Feb
Recitation 12
|
Exercises on Exceptions
Exercises:
Readings:
|
|
Sun 1 Mar
Lecture 14
|
Co-Inductive Definitions
Inductive data structures such as lists and trees are by construction
finite. Surprisingly, many of the operations defined on them still make
sense when the finiteness constraint is removed, provided we take care that
these operations return useful results even when applied to potentially
infinite entities. Mathematically, such data structures are said to be
co-inductively defined. We briefly examine how to prove properties of
co-inductive definitions.
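In ML-like notation, a potentially infinite stream and a productive
operation on it might look as follows (a sketch; the supporting machinery
is the subject of the next lecture):
    datatype 'a stream = Cons of 'a * (unit -> 'a stream)

    fun ones () = Cons (1, ones)       (* the infinite stream 1, 1, 1, ... *)

    (* smap is productive: it yields the head now, the tail on demand *)
    fun smap f (Cons (x, t)) = Cons (f x, fn () => smap f (t ()))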
Further readings:
|
|
Mon 2 Mar
Recitation 15
|
Exercises on Co-Inductive Definitions
Exercises:
Readings:
|
|
Tue 3 Mar
Lecture 15
|
Demand-Driven Computation
Data streams (as in YouTube for example) are a prominent computational
instance of a co-inductive data structure. We discuss how to support them
in a programming language through the concept of lazy evaluation: functions
in ML are evaluated eagerly, meaning that the arguments are reduced before
the function is applied. An alternative is for function applications and
constructors to be evaluated in a lazy manner, meaning expressions are
evaluated only when their values are needed in a further computation.
In an eager language such as ML, lazy evaluation can be simulated by relying
on the fact that functions are values in order to "suspend" a computation.
This style of evaluation is essential when working with potentially infinite
data structures, such as streams, which arise naturally in many
applications. Streams are then lazy lists whose values are determined by
suspended computations that generate the next element of the stream only
when forced to do so.
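A minimal sketch of suspensions and streams (without memoization, which
comes later):
    type 'a susp = unit -> 'a                 (* a suspended computation *)
    fun force (s : 'a susp) = s ()            (* run it *)

    datatype 'a stream = Cons of 'a * 'a stream susp

    fun from n = Cons (n, fn () => from (n + 1))   (* n, n+1, n+2, ... *)

    fun take (_, 0) = []
      | take (Cons (x, s), n) = x :: take (force s, n - 1)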
Concepts:
Further readings:
|
|
Wed 4 Mar
Recitation 16
|
Exercises on Demand-Driven Computation
Readings:
|
|
Sun 8 Mar
Lecture 16
|
Decidability
This lecture investigates the limits of computation. We introduce the
Halting problem, a simple problem that no program will ever be able to
solve. We then show that many other problems are not computable using two
important techniques: diagonalization (which is a direct argument) and
problem reduction (which shows that a problem is undecidable by giving a
reduction from another undecidable problem).
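The flavor of diagonalization can even be sketched in ML: assume,
hypothetically, a function halts that decides termination, and construct a
program on which it must be wrong:
    (* hypothetical: halts p = true iff p () terminates -- no such
       function can actually be written *)
    fun diag (halts : (unit -> unit) -> bool) : unit -> unit =
      let
        fun loop () : unit = loop ()
        fun d () = if halts d then loop () else ()
      in
        d
      end
    (* if halts d is true then d loops forever; if false, d halts
       immediately: either way, halts is wrong about d *)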
Concepts:
Further readings:
|
|
Mon 9 Mar
Recitation 17
|
Exercises on Decidability
Exercises:
Readings:
|
|
Tue 10 Mar
Lecture 17
|
State and Ephemeral Data Structures
The programming techniques used so far in the course have, for the most
part, been "purely functional". Some problems, however, are more naturally
addressed by keeping track of the "state" of an internal
machine. Typically this requires the use of mutable storage. ML supports
mutable cells, or references, that store values of a fixed type. The value
in a mutable cell can be initialized, read, and changed (mutated), and
these operations result in effects that change the store. Programming with
references is often carried out with the help of imperative
techniques. Imperative functions are used primarily for the way they
change storage, rather than for their return values.
References allow us to create data structures that are ephemeral, i.e., that
change over time. Their main advantage is the ability to maintain state as
a shared resource among many routines. Another advantage in some cases is
the ability to write code that is more time-efficient than purely functional
code. On the other hand, they are conceptually complex and therefore
error-prone: our routines may accidentally and irreversibly change the
contents of a data structure; variables may be aliases for each other. As a
result it is much more difficult to prove the correctness of code involving
ephemeral data structures.
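A minimal sketch of these operations:
    val counter = ref 0                (* allocate a mutable cell holding 0 *)

    fun tick () =
      (counter := !counter + 1;        (* mutate the store *)
       !counter)                       (* read the new contents *)

    (* tick () = 1, then tick () = 2: the same expression evaluates to
       different values -- state at work *)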
Concepts:
Further readings:
|
|
Wed 11 Mar
Recitation 18
|
Exercises on References and Ephemeral Data Structures
Exercises:
Readings:
|
|
Sun 15 Mar
Lecture 18
|
Computability
Although the Halting problem shows that there is no hope of building a
program that will give a yes/no answer to many problems of interest, it is
possible to write programs that will return a yes answer but may run
forever when the answer is no. This is called a semi-decision procedure.
For some other problems, it is always possible to correctly return a no,
but a positive answer may never be returned. There is also a class of
problems for which neither a yes nor a no answer is guaranteed to be
returned in finite time.
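A minimal sketch of a semi-decision procedure as unbounded search (the
predicate and search space are illustrative):
    (* returns a witness if p holds of some natural number;
       diverges when no witness exists *)
    fun search (p : int -> bool) : int =
      let fun loop n = if p n then n else loop (n + 1)
      in loop 0 end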
Concepts:
Further readings:
|
|
Mon 16 Mar
Recitation 19
|
Exercises on Computability
Exercises:
Readings:
|
|
Tue 17 Mar
Lecture 19
|
Memoization
We continue with streams, and complete our implementation by introducing a
memoizing delay function. Memoization ensures that a suspended expression
is evaluated at most once. When a suspension is forced for the first time,
its value is stored in a reference cell and simply returned when the
suspension is forced again. The implementation that we present makes a
subtle and elegant use of a "self-modifying" code technique with circular
references.
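A minimal sketch of such a memoizing delay (type and function names are
illustrative):
    type 'a susp = unit -> 'a

    fun delay (f : unit -> 'a) : 'a susp =
      let
        val cell = ref f
        (* first forcing: compute, then overwrite the cell with a
           constant function -- self-modification via the circular
           reference between cell and first *)
        fun first () =
          let val v = f ()
          in cell := (fn () => v); v end
        val () = cell := first
      in
        fn () => (!cell) ()
      end

    fun force (s : 'a susp) = s ()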
Concepts:
Further readings:
|
|
Wed 18 Mar
Recitation 20
|
Exercises on Memoization
|
|
Sun 22 Mar
|
No class (Spring Break)
|
|
Mon 23 Mar
|
|
|
Tue 24 Mar
|
|
|
Wed 25 Mar
|
|
Sun 29 Mar
Lecture 20
|
Language Hierarchy and Regular Languages
With this class, we begin looking at the mathematical ingredients that
are involved in building a programming language. We start by studying
languages in general and classify them into a hierarchy based on their
expressiveness and complexity. Several layers of this hierarchy are found
in a typical interpreter. We examine in some detail regular languages and
their relations to regular expressions and finite-state automata.
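A finite-state automaton is essentially a transition function iterated over
the input; a toy sketch of a DFA accepting the binary strings containing an
even number of 1s:
    (* states: 0 = even number of 1s seen so far (accepting), 1 = odd *)
    fun step (q, #"1") = 1 - q
      | step (q, _)    = q

    fun accepts s =
      foldl (fn (c, q) => step (q, c)) 0 (String.explode s) = 0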
Concepts:
Further readings:
|
|
Mon 30 Mar
Recitation 21
|
Exercises on Language Hierarchy and Regular Languages
Exercises:
Readings:
|
|
Tue 31 Mar
Lecture 21
|
Regular Expressions and Lexical Analysis
Regular expressions - and their underlying finite-state automata - are
useful in many different applications, and are central to text processing
languages and tools such as awk, Perl, emacs and grep. Regular expression
pattern matching has a number of simple and elegant implementations in ML.
Regular expressions are the key ingredient of lexical analysis, a
pre-processing step carried out by many applications to recognize legitimate
words. Examples include compiling programming languages, processing natural
languages, or manipulating HTML pages to extract structure. As an example,
we study a lexical analyzer for a simple language of arithmetic expressions.
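One such implementation uses continuations; a sketch restricted to a
star-free core (Kleene star requires extra care to guarantee termination):
    datatype regexp = One                       (* empty string *)
                    | Char of char
                    | Times of regexp * regexp  (* concatenation *)
                    | Plus of regexp * regexp   (* alternation *)

    (* match r cs k: does a prefix of cs match r, with the continuation k
       accepting the remaining characters? *)
    fun match One cs k = k cs
      | match (Char c) (c' :: cs) k = c = c' andalso k cs
      | match (Char _) [] _ = false
      | match (Times (r1, r2)) cs k = match r1 cs (fn cs' => match r2 cs' k)
      | match (Plus (r1, r2)) cs k = match r1 cs k orelse match r2 cs k

    fun accepts r s = match r (String.explode s) null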
Concepts:
Further readings:
|
|
Wed 1 Apr
Recitation 22
|
Exercises on Regular Expressions and Lexical Analysis
Exercises:
Readings:
|
|
Mon 27 Apr
2-5pm (1190) Final
|
Final
|
Iliano Cervesato