CS 15-212: Principles of Programming
(Spring 2009)

Course Information


Logistics

Lectures:  Su,Tu   16:00 - 17:20 (room 2147)
Recitations:  Mo,We 11:00 - 11:50 (room 2147)

Class Webpage:   http://qatar.cmu.edu/cs/15212

Instructor: Iliano Cervesato
Office hours:  by appointment (check schedule)
Office:  CMU-Q 1008
Email: 
Co-instructor: Thierry Sans
Office hours:  by appointment
Office:  CMU-Q 1019
Email: 

Course Links

Calendar of Classes

Click on a class day to go to that particular lecture or recitation. Due dates for homeworks are set in bold. The due date of the next homework blinks.

Coursework Calendar

Item      Weight   Posted               Due (23:59)   Corrected
Hw1       5%       11 Jan               17 Jan        24 Jan
Hw2       5%       17 Jan               24 Jan        31 Jan
Hw3       5%       24 Jan               31 Jan        07 Feb
Hw4       10%      31 Jan               14 Feb        21 Feb
Midterm   10%      24 Feb               -             03 Mar
Hw5       10%      14 Feb               07 Mar        14 Mar
Hw6       10%      07 Mar               21 Mar        04 Apr
Hw7       5%       21 Mar               04 Apr        11 Apr
Hw8       10%      04 Apr               18 Apr        25 Apr
Final     15%      27 Apr (2pm), 1190   -             04 May

About this course


Description

This course introduces students who have experience with basic data structures and algorithms to more advanced skills, concepts and techniques in programming and Computer Science in general. This will be accomplished along three dimensions.

Prerequisites

You must have completed CS 15-211 (Fundamental Data Structures and Algorithms).

Software

The course relies extensively on the programming language Standard ML (SML) and related utilities, mainly ML-Lex, ML-Yacc and Concurrent ML. The particular implementation we will be working with is Standard ML of New Jersey (SML/NJ), version 110.65.

SML at CMU-Q

A reference build has been made available in the Unix clusters. To run it, you need to log in to your Unix account. In Windows, you do this by launching PuTTY and specifying unix.qatar.cmu.edu as the machine name. When the PuTTY window comes up, type sml, do your work, and then hit CTRL-D when you are done.

You can edit your files directly under Unix (the easiest way is to run the X-Win32 utility from Windows and then run the Emacs editor from the PuTTY window by typing emacs - see also this tutorial).

If you want to do all this from your own laptop, you first need to install X-Win32 from here. PuTTY is pre-installed in Windows.

SML on Your Own Laptop

If you want, you can install a personal copy of SML/NJ on your laptop. To do this, download this file and follow these instructions. Personal copies are for your convenience: all software will be evaluated on the reference environment on unix.qatar.cmu.edu. You need to make sure that your homework assignments work there before submitting them. To do so, transfer your files onto unix.qatar.cmu.edu and test them there, using the PSFTP utility which comes with PuTTY (or any of the many more user-friendly FTP front-ends).

Documentation

Useful documentation can be found on the SML/NJ web site. The following files will be particularly useful:

Readings

The 15-212 Wiki

The material for all lectures can be found on the 15-212 wiki. This is a wiki, not a textbook. The main differences are:

Further References

Grading

Assessment

Course Objectives

This course seeks to develop students who:

  1. can leverage the mathematical structure of a problem to develop a solution
  2. can use abstraction and modularity to manage complexity
  3. can use formal arguments to prove the correctness of a problem solution
  4. master a non-declarative programming paradigm
  5. have gained advanced skills, concepts and techniques in programming and Computer Science

Learning Outcomes

Upon successful completion of this course, students will be able to:

  1. explain and use basic programming language concepts such as typing, evaluation, declarations, expressions, values, and types
  2. explain and use advanced programming language concepts such as data types, pattern matching, polymorphism, higher-order functions, continuations, exceptions, streams, memoization, modularity, formal language hierarchy
  3. design recursive algorithms and develop recursive programs
  4. use mathematical induction to prove program correctness
  5. model problems in Computer Science using lists, trees and graphs
  6. program symbolic solutions to problems using data types and pattern matching
  7. use polymorphism and functional arguments to build reusable program modules
  8. develop abstract and parametric modules for code reusability
  9. write a grammar for a language and program a basic parser
  10. recognize non-computable problems and give formal arguments to support non-computability claims

Schedule of Classes

In this course, there will be two types of lectures:

At a glance ...


Sun 11 Jan
Lecture 1
Welcome and Course Introduction
We outline the course, its goals, and talk about various administrative issues.

Inductive Computing
We begin by reviewing a few simple proofs of numerical properties by mathematical and complete induction and infer their distinguishing factor as the necessary fallback on a base case. We use this observation to give a computational expression to the participating entities as inductive function definitions. We conclude by discussing the differences between inductively defined functions and recursive functions.
Mon 12 Jan
Recitation 1
Exercises on Inductive Computing
Exercises:
Readings:
Tue 13 Jan
Lecture 2
Introduction to SML
This lecture introduces the technical underpinnings of the language Standard ML as well as some elementary constructs. In particular, it discusses the functional programming paradigm, the characteristics of strongly typed languages, and interpreter-based execution.
Wed 14 Jan
Recitation 2
SML, Style

Sun 18 Jan
Lecture 3
Inductive Data Structures
The concept of inductive definition, introduced in the specific case of natural numbers, is easily extended to generic data structures as long as they are finite. These correspond to the notion of freely-generated expressions from abstract algebra. We demonstrate this technique in the case of lists and trees. Such inductive definition of data structures is paralleled by a simple generalization of traditional proofs by induction, called structural induction.
Mon 19 Jan
Recitation 3
Exercises on Inductive Data Structures
Exercises:
Readings:
Tue 20 Jan
Lecture 4
Datatypes, Patterns, and Lists
The algebraic concept of inductive data structure is available in ML as the datatype mechanism. This provides a way to represent data in a natural fashion. We define datatypes for a variety of data structures and show how to write functions for them. We see that the structure of a datatype definition naturally leads to clausal function definitions, a convenient way to work with them based on pattern matching. We discuss at length ML's predefined support for lists and introduce the programming concept of polymorphism.
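As an illustration of how a datatype definition leads to one function clause per constructor, here is a minimal sketch. It is written in Python rather than the course's Standard ML, and the names Leaf, Node, size and to_list are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical names (Leaf, Node, size, to_list): illustrative only,
# not taken from the course material.

@dataclass(frozen=True)
class Leaf:
    """The empty tree: base case of the inductive definition."""

@dataclass(frozen=True)
class Node:
    """A tree built from a left subtree, a value, and a right subtree."""
    left: "Tree"
    value: int
    right: "Tree"

Tree = Union[Leaf, Node]

def size(t: Tree) -> int:
    # One case per constructor, mirroring a clausal definition.
    if isinstance(t, Leaf):
        return 0
    return size(t.left) + 1 + size(t.right)

def to_list(t: Tree) -> list:
    # In-order traversal, again by cases on the constructor.
    if isinstance(t, Leaf):
        return []
    return to_list(t.left) + [t.value] + to_list(t.right)

example = Node(Node(Leaf(), 1, Leaf()), 2, Node(Leaf(), 3, Leaf()))
```

Each function follows the shape of the type: the Leaf case is the base case and the Node case recurses on the two subtrees.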
Further readings:
Wed 21 Jan
Recitation 4
Exercises on Datatypes, Patterns, and Lists

Sun 25 Jan
Lecture 5
Proving Properties of Programs
Inductive definitions can be used not only to describe data structures, but also to give a formal specification to programming concepts such as typing and evaluation. We concentrate on ML evaluation and show how the methodology of structural induction can be used to prove properties about programs. We apply it to termination proofs, equivalence proofs, and to prove the correctness of an ML function with respect to a specification. Along the way, we stumble upon common techniques to prove uncooperative theorems by induction, in particular the concept of generalization.
Mon 26 Jan
Recitation 5
Exercises on Properties of Programs
Exercises:
Readings:
Tue 27 Jan
Lecture 6
Declarations, Binding, and Scope
We take a closer look at how ML manages declarations. Declarations evaluate to environments, which collect bindings of variables to values that can be used in subsequent declarations or expressions. We introduce the mechanisms provided in ML for local declarations and expand on the rules of scope, which explain how references to identifiers are resolved. This is somewhat tricky for recursive function declarations. We also discuss tail recursion, a form of recursion that is somewhat like the use of loops in imperative programming. This form of recursion is often especially efficient and easy to analyze. Accumulator arguments play an important role in tail recursion.
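The relationship between plain recursion, tail recursion with an accumulator, and loops can be sketched as follows. This is an illustrative Python sketch, not course code (the course uses Standard ML), and the function names are hypothetical. Note that Python, unlike SML/NJ, does not optimize tail calls, so the second version shows only the shape of the technique.

```python
def sum_list(xs):
    # Plain recursion: the addition happens after the recursive call returns.
    if not xs:
        return 0
    return xs[0] + sum_list(xs[1:])

def sum_list_acc(xs, acc=0):
    # Tail-recursive form: the recursive call is the last thing done,
    # and the accumulator argument carries the partial result.
    if not xs:
        return acc
    return sum_list_acc(xs[1:], acc + xs[0])

def sum_list_loop(xs):
    # The same computation written as a loop over an accumulator.
    acc = 0
    for x in xs:
        acc += x
    return acc
```

All three functions compute the same sum; the accumulator version and the loop differ only in notation.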
Wed 28 Jan
Recitation 6
Exercises on Declarations, Binding and Scope
Exercises:
Readings:

Sun 1 Feb
Lecture 7
Representation Invariants
As inductive definitions get more complex, it becomes a challenge to convince oneself (and prove to others) that they are actually correct. The argument usually relies on invariants that the representation is expected to satisfy at any time. Making these invariants explicit is extremely useful. We demonstrate it on red/black trees, a relatively complex search tree.
Mon 2 Feb
Recitation 7
Exercises on Representation Invariants
Exercises:
Readings:
Tue 3 Feb
Lecture 8
Modularity
We discuss the separation between specification and implementation in the context of code reuse. The specification describes the functionalities of an abstract data type, with its syntactic aspects expressed as a module interface. The implementation consists of specific code that realizes these functionalities, code whose details are invisible to the user of a module. We also introduce advanced concepts such as parametric modules. The discussion is concretized by examining the specific module system of ML, in particular the concepts of signature, structure, functor and ascription.
Wed 4 Feb
Recitation 8
Exercises on Modularity

Sun 8 Feb
Lecture 9
First-Class Functions
Higher-order functions are functions that manipulate other functions. We introduce the notion of nameless functions and functions as first-class values. We discuss currying, a common transformation between some traditional functions and some higher-order functions, and study situations where one or the other representation is advantageous. We present some standard higher-order functions that allow us to work concisely with lists and other inductive data structures.
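A minimal sketch of currying and of the standard list functionals, written in Python for illustration (the course itself uses Standard ML). The helper names curry and uncurry are hypothetical; map, filter and functools.reduce are the Python standard-library counterparts of the usual higher-order list functions.

```python
from functools import reduce

def curry(f):
    # Turn a two-argument function into a function that returns a function.
    return lambda x: lambda y: f(x, y)

def uncurry(g):
    # The inverse transformation.
    return lambda x, y: g(x)(y)

add = lambda x, y: x + y      # a nameless function bound to a name
add3 = curry(add)(3)          # partial application of the curried form

squares = list(map(lambda x: x * x, [1, 2, 3]))
evens = list(filter(lambda x: x % 2 == 0, [1, 2, 3, 4]))
total = reduce(add, [1, 2, 3, 4], 0)
```

The curried form is convenient when a function is specialized on its first argument, as with add3; the uncurried form is the natural argument for reduce.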
Further readings:
Mon 9 Feb
Recitation 9
Exercises on Higher-Order Functions
Tue 10 Feb
Lecture 10
Continuations
One very useful application of higher-order functions is as a way to control the execution of a program: continuations act as "functional accumulators." The basic idea of the technique is to implement a function f by defining a tail-recursive function f' that takes an additional argument, the continuation. This continuation is a function; it encapsulates the computation that should be done on the result of f. In the base case, instead of returning a result, we call the continuation. In the recursive case we augment the given continuation with whatever computation should be done on the result.
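The recipe described above can be sketched on list length. This is an illustrative Python sketch with hypothetical names (length_cps plays the role of f'); the course works in Standard ML, where the recursive call would be a genuine tail call.

```python
def length_cps(xs, k):
    # k is the continuation: what to do with the length of xs.
    if not xs:
        # Base case: instead of returning 0, call the continuation on it.
        return k(0)
    # Recursive case: extend the continuation with the pending "+ 1".
    return length_cps(xs[1:], lambda n: k(n + 1))

def length(xs):
    # Start with the identity continuation.
    return length_cps(xs, lambda n: n)
```

Here the continuation accumulates the work that plain recursion would leave on the call stack.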
Wed 11 Feb
Recitation 10
Exercises on Continuations

Sun 15 Feb
Lecture 11
Puzzles and games
Implementing games efficiently requires explicit and complex control of the execution. In this lecture, we look at games and their representation as a mathematical problem. We define games and game trees, and study two strategies for choosing the next move in a game.
Mon 16 Feb
Recitation 11
Exercises on Game Tree Search
Tue 17 Feb
Lecture 12
Exceptions
Exceptions are another way to control execution programmatically. We see that exceptions can be used not only to signal error conditions, but also in backtracking search procedures or other patterns of control where a computation needs to be partially undone. Exceptions are the first type of effect that we encounter; they may cause an evaluation to be interrupted or aborted.
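One way to picture exceptions as a backtracking mechanism is a coin-change search. The sketch below is in Python for illustration (the course uses Standard ML); the exception name Change and the function make_change are assumptions of this example, not taken from the course material.

```python
class Change(Exception):
    """Raised when a branch of the search cannot make exact change."""

def make_change(coins, amount):
    # coins: denominations to try, in order; each coin may be reused.
    if amount == 0:
        return []
    if amount < 0 or not coins:
        raise Change()
    try:
        # Try using the first coin once more...
        return [coins[0]] + make_change(coins, amount - coins[0])
    except Change:
        # ...and backtrack to the remaining coins if that branch fails.
        return make_change(coins[1:], amount)
```

Raising Change abandons the current partial solution, and the nearest handler resumes the search from the last point where another choice was available.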
Wed 18 Feb
Recitation 12
Exercises on Exceptions

Sun 22 Feb
Lecture 13
Combinators
Combinators are functions of functions, that is, higher-order functions used to combine functions. One example is the function composition operator. The basic idea is to think at the level of functions, rather than at the level of values returned by those functions. Combinators are defined using the pointwise principle.
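A small sketch of combinators defined pointwise, in Python for illustration (the course uses Standard ML). The names compose, plus and times are hypothetical.

```python
def compose(f, g):
    # Function composition: (f o g)(x) = f(g(x)).
    return lambda x: f(g(x))

def plus(f, g):
    # Pointwise sum: (f + g)(x) = f(x) + g(x).
    return lambda x: f(x) + g(x)

def times(f, g):
    # Pointwise product: (f * g)(x) = f(x) * g(x).
    return lambda x: f(x) * g(x)

double = lambda x: 2 * x
succ = lambda x: x + 1
```

Each combinator takes functions and returns a function; the argument x appears only inside the definition, which is what defining it pointwise means.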
Further readings:
Mon 23 Feb
Recitation 13
Midterm review
Tue 24 Feb
Midterm
Midterm
Wed 25 Feb
Recitation 14
Exercises on Combinators
Exercises:
Readings:

Sun 1 Mar
Lecture 14
Co-Inductive definitions
Inductive data structures such as lists and trees are meant to be finite. Surprisingly, many of the operations defined on them make sense also when removing the finiteness constraint, except that we take care that these operations return useful results even when applied to potentially infinite entities. Mathematically, such data structures are said to be co-inductively defined. We briefly examine how to prove properties about co-inductive definitions.
Mon 2 Mar
Recitation 15
Exercises on Co-Inductive Definitions
Exercises:
Readings:
Tue 3 Mar
Lecture 15
Demand-Driven Computation
Data streams (as in YouTube for example) are a prominent computational instance of a co-inductive data structure. We discuss how to support them in a programming language through the concept of lazy evaluation: functions in ML are evaluated eagerly, meaning that the arguments are reduced before the function is applied. An alternative is for function applications and constructors to be evaluated in a lazy manner, meaning expressions are evaluated only when their values are needed in a further computation. In an eager language such as ML, lazy evaluation can be simulated by relying on the fact that functions are values to "suspend" the computation. This style of evaluation is essential when working with potentially infinite data structures, such as streams, which arise naturally in many applications. Streams are then lazy lists whose values are determined by suspended computations that generate the next element of the stream only when forced to do so.
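The idea of suspending a computation inside a function can be sketched in any eager language. The following Python sketch is illustrative only (the course uses Standard ML); the names cons, from_n, smap and take are hypothetical, and a stream cell is assumed to be a pair of a value and a zero-argument function.

```python
def cons(head, tail_thunk):
    # A stream cell: a value plus a suspended computation of the rest.
    return (head, tail_thunk)

def from_n(n):
    # The infinite stream n, n+1, n+2, ...; the tail is not computed yet.
    return cons(n, lambda: from_n(n + 1))

def smap(f, s):
    # Apply f to every element, lazily.
    head, tail_thunk = s
    return cons(f(head), lambda: smap(f, tail_thunk()))

def take(k, s):
    # Collect the first k elements, forcing one suspension per element.
    out = []
    for _ in range(k):
        head, tail_thunk = s
        out.append(head)
        s = tail_thunk()
    return out
```

For instance, take(3, from_n(0)) inspects only a finite prefix of the stream, so the infinite definition of from_n causes no non-termination.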
Further readings:
Wed 4 Mar
Recitation 16
Exercises on Demand-Driven Computation

Sun 8 Mar
Lecture 16
Decidability
This lecture investigates the limits of computation. We introduce the Halting problem, a simple problem that no program will ever be able to solve. We then show that many other problems are not computable through two important techniques: diagonalization (which is a direct argument), and problem reduction (which shows that a problem is undecidable by giving a reduction from another undecidable problem).
Mon 9 Mar
Recitation 17
Exercises on Decidability
Exercises:
Readings:
Tue 10 Mar
Lecture 17
State and Ephemeral Data Structures
The programming techniques used so far in the course have, for the most part, been "purely functional". Some problems, however, are more naturally addressed by keeping track of the "state" of an internal machine. Typically this requires the use of mutable storage. ML supports mutable cells, or references, that store values of a fixed type. The value in a mutable cell can be initialized, read, and changed (mutated), and these operations result in effects that change the store. Programming with references is often carried out with the help of imperative techniques. Imperative functions are used primarily for the way they change storage, rather than for their return values. References allow us to create data structures that are ephemeral, i.e., that change over time. Their main advantage is the ability to maintain state as a shared resource among many routines. Another advantage in some cases is the ability to write code that is more time-efficient than purely functional code. On the other hand, they are conceptually complex and therefore error-prone: our routines may accidentally and irreversibly change the contents of a data structure; variables may be aliases for each other. As a result it is much more difficult to prove the correctness of code involving ephemeral data structures.
Further readings:
Wed 11 Mar
Recitation 18
Exercises on References and Ephemeral Data Structures
Readings:

Sun 15 Mar
Lecture 18
Computability
Although the Halting problem says that there is no hope to build a program that will give a yes/no answer to many problems of interest, it is possible to write programs that will return a yes answer but may run forever when the answer is no. This is called a semi-decision procedure. For some other problems, it is always possible to correctly return a no, but a positive answer may never be returned. There is also a class of problems for which either a yes or a no answer may not be returned in finite time.
Further readings:
Mon 16 Mar
Recitation 19
Exercises on Computability
Exercises:
Readings:
Tue 17 Mar
Lecture 19
Memoization
We continue with streams, and complete our implementation by introducing a memoizing delay function. Memoization ensures that a suspended expression is evaluated at most once. When a suspension is forced for the first time, its value is stored in a reference cell and simply returned when the suspension is forced again. The implementation that we present makes a subtle and elegant use of a "self-modifying" code technique with circular references.
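A minimal sketch of a memoizing delay, in Python for illustration (the course uses Standard ML). It uses an explicit flag in a dictionary that stands in for a reference cell, which is a simpler variant than the self-modifying technique mentioned above; the names delay, force and expensive are hypothetical.

```python
def delay(thunk):
    # Wrap a zero-argument function so that it runs at most once.
    cell = {"done": False, "value": None}   # stands in for a reference cell

    def force():
        if not cell["done"]:
            cell["value"] = thunk()          # first force: evaluate and store
            cell["done"] = True
        return cell["value"]                 # later forces: return stored value

    return force

calls = []

def expensive():
    # Records each evaluation so that repeated work would be visible.
    calls.append(1)
    return 42

susp = delay(expensive)
```

Forcing susp twice returns the same value, and the calls list shows that expensive ran only once.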
Further readings:
Wed 18 Mar
Recitation 20
Exercises on Memoization

Sun 22 Mar
No class (Spring Break)
Mon 23 Mar
Tue 24 Mar
Wed 25 Mar

Sun 29 Mar
Lecture 20
Language Hierarchy and Regular Languages
With this class, we begin looking at the mathematical ingredients that are involved in building a programming language. We start by studying languages in general and classify them into a hierarchy based on their expressiveness and complexity. Several layers of this hierarchy are found in a typical interpreter. We examine in some detail regular languages and their relations to regular expressions and finite-state automata.
Mon 30 Mar
Recitation 21
Exercises on Language Hierarchy and Regular Languages
Exercises:
Readings:
Tue 31 Mar
Lecture 21
Regular Expressions and Lexical Analysis
Regular expressions - and their underlying finite-state automata - are useful in many different applications, and are central to text processing languages and tools such as awk, Perl, emacs and grep. Regular expression pattern matching has a number of simple and elegant implementations in ML. Regular expressions are the key ingredient of lexical analysis, a pre-processing step carried out by many applications to recognize legitimate words. Examples include compiling programming languages, processing natural languages, or manipulating HTML pages to extract structure. As an example, we study a lexical analyzer for a simple language of arithmetic expressions.
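To give a flavor of lexical analysis driven by regular expressions, here is a small Python sketch using the standard re module. It is not the lexer studied in the course (which is written in Standard ML); the token set (integers, the four arithmetic operators and parentheses) and the names TOKEN_RE and tokenize are assumptions of this example.

```python
import re

# Optional whitespace, then either a run of digits or a single operator/paren.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character near position {pos}")
        tok = m.group(1)
        # Classify the matched word as a number or an operator.
        tokens.append(("NUM", int(tok)) if tok.isdigit() else ("OP", tok))
        pos = m.end()
    return tokens
```

The lexer turns a string such as "12 + 3" into a list of classified tokens and rejects characters outside the assumed alphabet.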
Further readings:
Wed 1 Apr
Recitation 22
Exercises on Regular Expressions and Lexical Analysis

Sun 5 Apr
Lecture 22
Grammars
Context-free grammars arise naturally in a variety of applications. The "Abstract Syntax Charts" in programming language manuals are one instance. The underlying machine for a context-free language is a pushdown automaton, which maintains a read-write stack that allows the machine to "count".
Further readings:
Mon 6 Apr
Recitation 23
Exercises on Grammars
Exercises:
Readings:
Tue 7 Apr
Lecture 23
Parsing
In this lecture we rely on context-free grammars to parse programs. We look at the two main parsing techniques, shift-reduce and recursive descent. Shift-reduce parsing uses a stack to delay application of rewrite rules, enabling operator precedence to be enforced. Recursive descent parsing is another style that uses recursion in a way that mirrors the grammar productions. Although parser generator tools exist for restricted classes of grammars, a direct implementation can allow greater flexibility and better error handling. We present an example of a shift-reduce parser for a grammar of arithmetic expressions.
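The way recursive descent mirrors grammar productions can be sketched as follows. Note that this is a recursive descent parser, not the shift-reduce parser presented in the lecture, and it is written in Python rather than Standard ML. The grammar, the token representation (ints for numbers, one-character strings for operators and parentheses) and all function names are assumptions of this example.

```python
# Assumed grammar, one function per nonterminal:
#   expr   ::= term   (('+' | '-') term)*
#   term   ::= factor (('*' | '/') factor)*
#   factor ::= NUMBER | '(' expr ')'

def parse(tokens):
    tree, rest = parse_expr(tokens)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree

def parse_expr(tokens):
    left, rest = parse_term(tokens)
    while rest and rest[0] in ("+", "-"):
        right, after = parse_term(rest[1:])
        left, rest = (rest[0], left, right), after   # left-associative
    return left, rest

def parse_term(tokens):
    left, rest = parse_factor(tokens)
    while rest and rest[0] in ("*", "/"):
        right, after = parse_factor(rest[1:])
        left, rest = (rest[0], left, right), after
    return left, rest

def parse_factor(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if isinstance(head, int):
        return head, rest
    if head == "(":
        tree, rest = parse_expr(rest)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return tree, rest[1:]
    raise SyntaxError(f"unexpected token {head!r}")
```

Operator precedence falls out of the layering of the grammar: term sits below expr, so multiplication binds tighter than addition.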
Further readings:
Wed 8 Apr
Recitation 24
Exercises on Parsing
Exercises:
Readings:

Sun 12 Apr
Lecture 24
Programming Language Semantics
Formal languages give us a way to determine whether a program is syntactically valid. Getting them to tell us whether it makes any sense, or how to compute a result, is better addressed by semantic means. In this lecture, we introduce one of the basic infrastructures for doing so, and apply it to describe how to evaluate a program.
Further readings:
Mon 13 Apr
Recitation 25
Exercises on Programming Language Semantics
Exercises:
Readings:
Tue 14 Apr
Lecture 25
Interpreters
We implement the evaluation semantics as an evaluator, and put it together with lexical analysis and parsing. The result is an interpreter that evaluates arithmetic expressions directly, rather than by constructing an explicit translation of the code into an intermediate language, and then into machine language, as a compiler does. Our first example uses the basic grammar of arithmetic expressions, interpreting them in terms of operations over the rational numbers. We then extend this simple language to include conditional statements, variable bindings, function definitions, and recursive functions.
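A minimal sketch of an evaluator over the rational numbers, in Python for illustration (the course's interpreter is written in Standard ML). The tuple encoding of syntax trees, the treatment of nonzero values as true in conditionals, and the name evaluate are assumptions of this example; function definitions and recursion are left out.

```python
from fractions import Fraction

def evaluate(tree, env=None):
    """Evaluate a tuple-encoded expression to a Fraction."""
    env = env or {}
    if isinstance(tree, int):
        return Fraction(tree)                 # numeric literal
    if isinstance(tree, str):
        return env[tree]                      # variable lookup
    if tree[0] == "let":                      # ("let", name, bound, body)
        _, name, bound, body = tree
        return evaluate(body, {**env, name: evaluate(bound, env)})
    if tree[0] == "if":                       # ("if", cond, then, otherwise)
        _, cond, then, otherwise = tree
        branch = then if evaluate(cond, env) != 0 else otherwise
        return evaluate(branch, env)
    op, left, right = tree                    # binary arithmetic
    a, b = evaluate(left, env), evaluate(right, env)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        return a / b
    raise ValueError(f"unknown operator {op!r}")
```

The environment is extended functionally: a let builds a new dictionary for its body and leaves the enclosing environment unchanged.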
Further readings:
Wed 15 Apr
Recitation 26
Exercises on Interpreters
Exercises:
Readings:

Sun 19 Apr
Lecture 26
Concurrency
All techniques discussed so far assumed a sequential model of computation, where all the code was executed on a single machine. Whenever computation takes place on several processors at the same time, we have concurrency. One form of concurrency is parallel computing, where the computation is split among the available processors. Another form is distributed computing where independent machines collaborate by exchanging information. In this lecture, we introduce this general taxonomy from a mathematical perspective and discuss some of the issues that distinguish concurrent from sequential systems, especially communication. We explore the main techniques to address these problems.
Mon 20 Apr
Recitation 27
Exercises on Concurrency
Tue 21 Apr
Lecture 27
Networking
Networked applications are an especially important case of distributed systems. We explore a few of the main concepts by showing how to write a simple web server.
Further readings:
Wed 22 Apr
Recitation 28
Final review

Mon 27 Apr
2-5pm
(1190)
Final
Final

Iliano Cervesato