Logistics

Lectures:  M,W   10:00 - 11:20 (room A165)
Recitations:  T., 10:30 - 11:20 (room A156B)

Class Webpage:   http://qatar.cmu.edu/course/15-212

Instructor:
Paul Zagieboylo
Office hours:  Monday at least, by appointment for now
Office:  A068
Email: 
Phone:  492-8459
Iliano Cervesato
Office hours:  U-R 15:00-16:00
Office:  C148
Email: 
Phone:  492-8955
TA:
Jessica Mink
Office hours:  by appointment
Office:  Wean 3130
Email: 

Bulletin Boards:
Announcements (from instructors):  qatar.class.15-212.announce
Discussion (among students and instructors):  qatar.class.15-212.discuss
To post, send email to: 

News!

26 Apr 2007 The solutions for assignments 3, 4, and 5 have been posted.
18 Apr 2007 Sample finals have been posted.
11 Apr 2007 Homework 6 posted, due 24 Apr.
29 Mar 2007 Homework 5 posted, due 12 Apr. Note the change in due date.
8 Mar 2007 Homework 4 posted, due 20 Mar
19 Feb 2007 Solutions to homework 2 posted
18 Feb 2007 Past midterms have been posted.
13 Feb 2007 Homework 3 posted, due 6 Mar
31 Jan 2007 Homework 2 posted, due 13 Feb
31 Jan 2007 Solutions to homework 1 posted
17 Jan 2007 Homework 1 posted, due 30 Jan
14 Jan 2007 Essential Unix tutorial added
14 Jan 2007 Web page created

About this course

Description

This course introduces students who already have experience with basic data structures and algorithms to more advanced skills, concepts, and techniques in programming and in Computer Science in general. This will be accomplished along three dimensions.

Prerequisites

You must have completed CS 15-211 (Fundamental Data Structures and Algorithms).

Software

The course relies extensively on the programming language Standard ML (SML). The particular implementation we will be working with is Standard ML of New Jersey (SML/NJ), version 110.59.

A reference build has been made available on the Unix cluster. To run it, you need to log in to your Unix account. On Windows, you do this by starting PuTTY and specifying unix.qatar.cmu.edu as the machine name. When the PuTTY window comes up, type sml, do your work, and hit CTRL-D when you are done.
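
For concreteness, here is roughly what a first session at the sml prompt might look like (the exact banner and prompt may differ from version to version; the inputs are just invented examples):

    Standard ML of New Jersey v110.59
    - 3 + 4 * 5;
    val it = 23 : int
    - fun square (x : int) = x * x;
    val square = fn : int -> int
    - square 12;
    val it = 144 : int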

You can edit your files directly under Unix (the easiest way is to use Emacs - see this tutorial), or you can edit them on a campus machine and put them on your "I"-drive, or you can edit them on your local machine and transfer them to the Unix servers.

Documentation can be found on the SML/NJ web site. The following two files will be particularly useful:

If you want, you can install a personal copy of SML/NJ on your laptop. To do this, download this file and follow these instructions. Personal copies are for your convenience: all software will be evaluated on the reference environment on unix.qatar.cmu.edu. You need to make sure that your homework assignments work there before submitting them.

Readings

No textbook is required, and only a few lectures have handouts: what a great reason to come to class! It is in your interest to read the handouts before class. Most lectures in the class schedule below reference parts of Professor Harper's forthcoming book: they are relevant to the topic of the class, but we will not necessarily follow them strictly, or at all.

The code presented in each class is available electronically by following the "code" links in the class schedule below.

Further References

Grading

Tasks and Percentages

Schedule of Classes

Mon 15 Jan.
Lecture 1
Welcome and Course Introduction
Evaluation and Typing

We outline the course and its goals, and discuss various administrative issues. We also introduce the language ML, which is used throughout the course.

Tue 16 Jan.
Recitation
Practice, Style, Hints
  • Field trip to the Computer Lab (A055)
Wed 17 Jan.
Lecture 2
Binding, Scope, and Functions

We introduce declarations which evaluate to environments. An environment collects a set of bindings of variables to values which can be used in subsequent declarations or expressions. We also discuss the rules of scope which explain how references to identifiers are resolved. This is somewhat tricky for recursive function declarations.
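
As a small illustrative sketch (an invented example, not the posted lecture code), the declarations below show how a later binding shadows an earlier one without affecting functions already defined, and how a fun declaration is visible inside its own body:

    val pi = 3.14159
    fun area (r : real) = pi * r * r    (* pi refers to the binding above *)
    val pi = 0.0                        (* shadows pi, but area is unaffected *)
    val a = area 2.0                    (* still 12.56636 *)

    (* A recursive declaration: the name fact is in scope in its own body. *)
    fun fact (0 : int) : int = 1
      | fact n = n * fact (n - 1)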

  • Key Concepts: Declaration, Environment, Functions, Scope, Standard Libraries, Recursive Functions
  • Code
  • See also Programming in Standard ML: Chapters 3, 4
  • Homework 1 out (due on Tue 30 Jan., 2:12am Doha time)
Mon 22 Jan.
Lecture 3
Recursion and Induction

We review the methods of mathematical and complete induction and show how they can be applied to prove the correctness of ML functions. Key is an understanding of the operational semantics of ML. Induction can be a difficult proof technique to apply, since we often need to generalize the theorem we want to prove, before the proof by induction goes through. Sometimes, this requires considerable ingenuity. We also introduce clausal function definitions based on pattern matching.
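
For instance (a sketch, not the lecture's own code), the following clausal definition of exponentiation can be proved correct by mathematical induction on the exponent:

    (* exp (b, n) == b^n for n >= 0.
       Proof by induction on n: the base case n = 0 is immediate; the
       inductive step uses exp (b, n) == b * exp (b, n - 1) together with
       the induction hypothesis for n - 1. *)
    fun exp (b : int, 0) : int = 1
      | exp (b, n) = b * exp (b, n - 1)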

Tue 23 Jan.
Recitation
Scoping in recursive functions; Complete induction
Wed 24 Jan.
Lecture 4
Datatypes, Patterns, and Lists

One of the most important features of ML is that it allows the definition of new types with so-called datatype declarations. This means that programs can be written to manipulate the data in a natural representation rather than in complex encodings. This goes hand-in-hand with clausal function definitions using pattern matching on given data types. We introduce lists and polymorphic datatypes and functions.
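
A minimal sketch of these ideas (an invented example, not the posted code): a datatype declaration introduces new constructors, clausal definitions pattern-match on them, and a type variable such as 'a makes a definition polymorphic.

    datatype shape = Circle of real | Rect of real * real

    fun areaOf (Circle r)    = 3.14159 * r * r
      | areaOf (Rect (w, h)) = w * h

    (* Lists are a built-in polymorphic datatype with constructors nil and ::.
       This redefinition of length shadows the one in the standard library. *)
    fun length nil       = 0
      | length (_ :: xs) = 1 + length xs        (* 'a list -> int *)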

Mon 29 Jan.
Lecture 5
Structural Induction and Tail Recursion

We discuss the method of structural induction on recursively defined types. This technique parallels standard induction on predicates, but has a unique character of its own, and arises often in programming. We also discuss tail recursion, a form of recursion that is somewhat like the use of loops in imperative programming. This form of recursion is often especially efficient and easy to analyze. Accumulator arguments play an important role in tail recursion. As examples, we consider recursively defined lists and trees.
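
A small hedged sketch of both ideas (invented examples, not the lecture code): a tail-recursive sum whose helper carries the running total in an accumulator, and a tree type to which structural induction applies.

    (* Tail-recursive sum: the recursive call is the entire result,
       so no work is left pending on the stack. *)
    fun sum' (nil,     acc : int) = acc
      | sum' (x :: xs, acc)       = sum' (xs, acc + x)
    fun sum xs = sum' (xs, 0)

    (* A recursively defined tree type; a proof about size proceeds by
       structural induction on the tree. *)
    datatype tree = Leaf | Node of tree * int * tree
    fun size Leaf = 0
      | size (Node (l, _, r)) = size l + 1 + size r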

Tonight 2:12am Doha time: Homework 1 due

Tue 30 Jan.
Recitation
Lists; Equality types
Wed 31 Jan.
Lecture 6
Higher Order Functions and Staged Computation

We discuss higher order functions, specifically, passing functions as arguments, returning functions as values, and mapping functions over recursive data structures. Key to understanding functions as first class values is understanding the lexical scoping rules. We discuss staged computation based on function currying.
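
A brief sketch (invented example): a curried function can be partially applied, the result is itself a function, and staging lets work done for the first argument be reused for every second argument.

    (* Curried addition: plus 1 is itself a function of type int -> int. *)
    fun plus (x : int) (y : int) = x + y
    val increment = plus 1

    (* map applies a function to every element of a list. *)
    val ys = map increment [1, 2, 3]        (* [2, 3, 4] *)

    (* Staged computation: the summation over p happens once, when the
       first argument is supplied, and is reused for every x. *)
    fun staged (p : int list) =
      let val s = foldl (op +) 0 p          (* done once *)
      in  fn x => x + s                     (* reused for every x *)
      end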

Mon 05 Feb.
Lecture 7
Data Structures

  • Key Concepts: Signatures and structures, Signature ascription, Opaque and transparent ascription, Data abstraction, Data persistence, Representation invariants, Binary search trees
  • Code
  • See also Programming in Standard ML: Chapters 18, 20
Tue 06 Feb.
Recitation
Currying, folding, and mapping
Wed 07 Feb.
Lecture 8
Representation Invariants

We demonstrate a complicated representation invariant using Red/Black Trees. The main lesson is to understand the subtle interactions of invariants, data structures, and reliable code production. In order to write code satisfying a strong invariant, it is useful to proceed in stages. Each stage satisfies a simple invariant, and is provably correct. Together, the stages satisfy the strong invariant.
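
A minimal sketch of the representation with its invariants stated as comments (an assumed formulation, not the posted code, which develops the operations in stages):

    datatype color = Red | Black
    datatype 'a rbtree =
        Empty
      | Node of color * 'a rbtree * 'a * 'a rbtree

    (* Representation invariants:
       1. ordering: keys in the left subtree are smaller and keys in the
          right subtree are larger (a binary search tree);
       2. no Red node has a Red child;
       3. every path from the root to an Empty node passes through the
          same number of Black nodes.
       Every operation must be shown to preserve these invariants. *)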

Mon 12 Feb.
Lecture 9
Continuations

Continuations act as "functional accumulators." The basic idea of the technique is to implement a function f by defining a tail-recursive function f' that takes an additional argument, called the continuation. This continuation is a function; it encapsulates the computation that should be done on the result of f. In the base case, instead of returning a result, we call the continuation. In the recursive case, we augment the given continuation with whatever computation should be done on the result. Continuations can be used to advantage for programming solutions to a variety of problems. In today's lecture we'll look at a simple example where continuations are used to efficiently manage a certain pattern of control. We'll see a related and more significant example in an upcoming lecture when we look at regular expressions.
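
As a small hedged example of the technique (not necessarily the one used in lecture), a tail-recursive product of a list that passes a continuation; the continuation lets us discard all pending work as soon as a 0 is seen.

    (* prod' (xs, k) computes k applied to the product of xs,
       except that it returns 0 immediately if any element is 0. *)
    fun prod' (nil,     k : int -> int) = k 1
      | prod' (0 :: _,  k)              = 0      (* bail out: ignore k *)
      | prod' (x :: xs, k)              = prod' (xs, fn p => k (x * p))

    fun prod xs = prod' (xs, fn p => p)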

Tonight 2:12am Doha time: Homework 2 due

Tue 13 Feb.
Recitation
Review
Wed 14 Feb.
Lecture 10
Regular Expressions

Regular expressions, and their underlying finite-state automata, are useful in many different applications, and are central to text processing languages and tools such as awk, Perl, emacs and grep. Regular expression pattern matching has a simple and elegant implementation in SML using continuation passing.
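
A hedged sketch of such a matcher (close in spirit to, but not necessarily identical to, the posted code): the continuation k receives the unconsumed suffix of the input.

    datatype regexp =
        Zero                          (* matches nothing           *)
      | One                           (* matches the empty string  *)
      | Char of char
      | Times of regexp * regexp      (* concatenation             *)
      | Plus  of regexp * regexp      (* alternation               *)
      | Star  of regexp               (* iteration                 *)

    (* match r cs k = true iff cs = p @ s where p matches r and k s = true.
       Caveat: the Star case can loop if r itself accepts the empty string. *)
    fun match Zero            _   k = false
      | match One             cs  k = k cs
      | match (Char c)        nil k = false
      | match (Char c) (c' :: cs) k = c = c' andalso k cs
      | match (Times (r1, r2)) cs k = match r1 cs (fn cs' => match r2 cs' k)
      | match (Plus  (r1, r2)) cs k = match r1 cs k orelse match r2 cs k
      | match (Star r)         cs k =
          k cs orelse match r cs (fn cs' => match (Star r) cs' k)

    fun accepts r s = match r (explode s) null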

Mon 19 Feb.
Lecture 11
Review
Tue 20 Feb.
Recitation
Tail Recursion vs Continuations
Wed 21 Feb. Midterm
Mon 26 Feb.
Lecture 12
Combinators

Combinators are functions of functions, that is, higher-order functions used to combine functions. One example is ML's composition operator o. The basic idea is to think at the level of functions, rather than at the level of values returned by those functions. Combinators are defined using the pointwise principle. Currying makes this easy in ML. We first discuss combinators of functions of type int -> int. Then we discuss rewriting our regular expression matcher using combinators. We use staging: the regular expression pattern matching is in one stage, the character functions are in another.
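
A short sketch of the pointwise idea (an invented example): lifting addition on integers to an operation on functions of type int -> int, and combining the result with composition.

    infix 6 ++

    (* Pointwise sum: (f ++ g) x = f x + g x. *)
    fun f ++ g = fn x => f x + g x

    fun const c = fn _ => c
    fun double (x : int) = 2 * x

    val h = (double ++ const 1) o double      (* h x = 4 * x + 1 *)
    val v = h 10                              (* 41 *)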

  • Key Concepts: Function spaces, Combinators, Pointwise principle
  • Code
Tue 27 Feb.
Recitation
Searching, failing, and stopping early
Wed 28 Feb.
Lecture 13
Exceptions, n-Queens

Exceptions play an important role in the system of static and dynamic checks that make SML a safe language. Exceptions are the first type of effect that we will encounter; they may cause an evaluation to be interrupted or aborted. We have already seen simple uses of exceptions in the course, primarily to signal that invariants are violated or exceptional boundary cases are encountered. We now look a little more closely at what exceptions are and how they can be used. In addition to signaling error conditions, exceptions can sometimes also be used in backtracking search procedures or other patterns of control where a computation needs to be partially undone.
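
As a much smaller invented example of the backtracking pattern (the lecture's n-Queens code is more involved), raising an exception abandons the current branch of a search and lets a handler try an alternative:

    exception NotFound

    datatype tree = Leaf | Node of tree * int * tree

    (* find p t returns some element of t satisfying p; raising NotFound
       undoes the current branch and lets the caller try the other subtree. *)
    fun find p Leaf = raise NotFound
      | find p (Node (l, x, r)) =
          if p x then x
          else (find p l handle NotFound => find p r)

    (* A caller can convert the exception back into an option. *)
    fun findOpt p t = SOME (find p t) handle NotFound => NONE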

Mon 05 Mar.
Lecture 14
Functors and Substructures

A functor is a parameterized module that acts as a kind of function, taking zero or more structures as arguments and returning a new structure as a result. Functors greatly facilitate hierarchical organization in large programs. In particular, as discussed in the next few lectures, they can enable a clean separation between the details of particular definitions and the higher-level structure, allowing the implementation of "generic" algorithms that are easier to debug and maintain, and that maximize code reuse.
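
A minimal sketch of the mechanism (invented signatures and structures, not the lecture code): a functor builds a set structure from any structure providing an ordered key type.

    signature ORDERED =
    sig
      type t
      val compare : t * t -> order
    end

    (* Given any ordered key type, build a set structure. *)
    functor SetFun (K : ORDERED) =
    struct
      type set = K.t list                    (* unordered, no duplicates *)
      val empty : set = []
      fun member (x, s) = List.exists (fn y => K.compare (x, y) = EQUAL) s
      fun insert (x, s) = if member (x, s) then s else x :: s
    end

    structure IntSet = SetFun (struct type t = int  val compare = Int.compare end)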

Tonight 2:12am Doha time: Homework 3 due

Tue 06 Mar.
Recitation
Ascription, where, and functors
Wed 07 Mar.
Lecture 15
Game Tree Search

In this lecture we give an example of modularity and code reuse by illustrating a generic game tree search algorithm. By carefully specifying the interface between the game and the search procedure, the code can be written very generally, yet still applied to a wide variety of games. We illustrate this through a very simple minimax game tree search algorithm, but the underlying concepts and techniques become even more important as the sophistication of the search algorithm increases.
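
A hedged sketch of how such an interface might be organized (the names, and the negamax formulation of minimax, are assumptions for illustration, not the posted code):

    signature GAME =
    sig
      type state
      type move
      val moves    : state -> move list        (* legal moves; nil if game over *)
      val play     : state * move -> state
      val estimate : state -> int              (* value for the player to move  *)
    end

    (* The search is written once, against the interface, for any game. *)
    functor MiniMax (G : GAME) =
    struct
      fun maxList (x :: xs) = foldl Int.max x xs
        | maxList nil       = raise Empty

      (* value of a position looking depth moves ahead *)
      fun search (s, 0)     = G.estimate s
        | search (s, depth) =
            case G.moves s of
              nil => G.estimate s
            | ms  => maxList (map (fn m => ~ (search (G.play (s, m), depth - 1))) ms)
    end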

  • Key Concepts: Interface design, Modularity, Game tree search, Min-Max, Alpha-Beta Pruning
  • Code
  • Homework 4 out (due on Tue 20 Mar., 2:12am Doha time)
Mon 12 Mar.
Lecture 16
Mutation and State

The programming techniques used so far in the course have, for the most part, been "purely functional". Some problems, however, are more naturally addressed by keeping track of the "state" of an internal machine. Typically this requires the use of mutable storage. ML supports mutable cells, or references, that store values of a fixed type. The value in a mutable cell can be initialized, read, and changed (mutated), and these operations result in effects that change the store. Programming with references is often carried out with the help of imperative techniques. Imperative functions are used primarily for the way they change storage, rather than for their return values.
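
A tiny sketch of the reference primitives (an invented example):

    val r : int ref = ref 0          (* create a cell initialized to 0     *)
    val _ = r := !r + 1              (* read it with !, update it with :=  *)
    val now = !r                     (* 1 *)

    (* An imperative function: its interesting behaviour is the effect on r,
       not its return value (which is just unit). *)
    fun incrementBy (n : int) : unit = r := !r + n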

Tue 13 Mar.
Recitation
Arrays and mutable state
Wed 14 Mar.
Lecture 17
Ephemeral Data Structures

Previously, within the purely functional part of ML, we saw that all values were persistent. At worst, a binding might shadow a previous binding. As a result, our queues and dictionaries were persistent data structures: adding an element to a queue did not change the old queue; instead it created a new queue, possibly sharing values with the old queue, but not modifying it in any way.

Now that we are able to create cells and modify their contents, we can create ephemeral data structures: data structures that change over time. The main advantage of such data structures is their ability to maintain state as a shared resource among many routines. Another advantage, in some cases, is the ability to write code that is more time-efficient than purely functional code. The disadvantages are error and complexity: our routines may accidentally and irreversibly change the contents of a data structure, and variables may be aliases for each other. As a result, it is much more difficult to prove the correctness of code involving ephemeral data structures. As always, it is a good idea to keep mutation to a minimum and to be careful about enforcing invariants.

We present two examples. First, we consider a standard implementation of hash tables: we use arrays to implement generic hash tables as a functor parameterized by an abstract hashable equality type. Second, we revisit the queue data structure, now defining an ephemeral queue. The queue signature clearly indicates that internal state is maintained. Our implementation uses a pair of reference cells containing mutable lists, and highlights some of the subtleties involved when reasoning about references.

We end the lecture with a few words about ML's value restriction. The value restriction is enforced by the ML compiler in order to avoid runtime type errors: all expressions must have well-defined, lexically-determined static types.
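
A much smaller invented example in the same spirit (the posted code implements hash tables and queues): an ephemeral stack whose signature makes the use of internal state explicit.

    signature STACK =
    sig
      type 'a stack
      val new  : unit -> 'a stack
      val push : 'a * 'a stack -> unit      (* modifies the stack in place *)
      val pop  : 'a stack -> 'a option
    end

    structure EphemeralStack :> STACK =
    struct
      type 'a stack = 'a list ref
      fun new () = ref nil
      fun push (x, s) = s := x :: !s
      fun pop s =
        case !s of
          nil     => NONE
        | x :: xs => (s := xs; SOME x)
    end

    (* Because of the value restriction, a fresh stack must be used at a
       monomorphic type, e.g.
         val s : int EphemeralStack.stack = EphemeralStack.new ()       *)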

  • Key Concepts: Ephemeral data structures, Maintaining state with mutable storage, Value restriction
  • Code
  • See also Programming in Standard ML: Chapter 28
Mon 19 Mar.
Lecture 18
Streams, Demand-Driven Computation

Functions in ML are evaluated eagerly, meaning that the arguments are reduced before the function is applied. An alternative is for function applications and constructors to be evaluated in a lazy manner, meaning expressions are evaluated only when their values are needed in a further computation. Lazy evaluation can be implemented by "suspending" computations in function values. This style of evaluation is essential when working with potentially infinite data structures, such as streams, which arise naturally in many applications. Streams are lazy lists whose values are determined by suspended computations that generate the next element of the stream only when forced to do so.
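
A minimal (non-memoizing) sketch of the idea, using invented names rather than the posted code:

    (* A stream is a suspended computation that, when forced, yields the head
       and the (still suspended) tail. *)
    datatype 'a stream = Stream of unit -> 'a * 'a stream

    fun force (Stream f) = f ()

    (* The infinite stream of integers n, n+1, n+2, ... *)
    fun from (n : int) = Stream (fn () => (n, from (n + 1)))

    fun take (s, 0) = nil
      | take (s, k) = let val (x, s') = force s in x :: take (s', k - 1) end

    val firstFive = take (from 1, 5)         (* [1, 2, 3, 4, 5] *)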

  • Key Concepts: Demand-driven computation, Eager vs. lazy evaluation, Suspensions, Streams as infinite lists
  • Code
  • See also Programming in Standard ML: Chapter 31

Tonight 2:12am Doha time: Homework 4 due

Tue 20 Mar.
Recitation
Operations on streams; Sequences and flip-flops
Wed 21 Mar.
Lecture 19
Streams, Laziness and Memoization

We continue with streams, and complete our implementation by introducing a memoizing delay function. Memoization ensures that a suspended expression is evaluated at most once. When a suspension is forced for the first time, its value is stored in a reference cell and simply returned when the suspension is forced again. The implementation that we present makes a subtle and elegant use of a "self-modifying" code technique with circular references.
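
A simpler hedged formulation of a memoizing suspension (not the self-modifying one presented in lecture; names are invented):

    (* A memoizing suspension: the thunk runs at most once. *)
    datatype 'a cell = Unevaluated of unit -> 'a | Evaluated of 'a
    type 'a susp = 'a cell ref

    fun delay (f : unit -> 'a) : 'a susp = ref (Unevaluated f)

    fun force (s : 'a susp) : 'a =
      case !s of
        Evaluated v   => v
      | Unevaluated f =>
          let val v = f ()
          in  s := Evaluated v; v           (* remember the result *)
          end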

Mon 26 Mar. No class (Spring Break)
Tue 27 Mar.
Wed 28 Mar.
Mon 2 Apr.
Lecture 20
Lexical Analysis and Grammars

Many applications require some form of tokenization or lexical analysis to be carried out as a preprocessing step. Examples include compiling programming languages, processing natural languages, or manipulating HTML pages to extract structure. As an example, we study a lexical analyzer for a simple language of arithmetic expressions.
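
As a hedged sketch (an invented example, not the posted analyzer, which works over streams), a tokenizer for such a language might look like:

    datatype token = NUM of int | PLUS | TIMES | LPAREN | RPAREN

    (* accumulate the digits of a number *)
    fun getNum (c :: cs, n) =
          if Char.isDigit c
          then getNum (cs, 10 * n + (Char.ord c - Char.ord #"0"))
          else (n, c :: cs)
      | getNum (nil, n) = (n, nil)

    fun tokenize nil          = nil
      | tokenize (#" " :: cs) = tokenize cs
      | tokenize (#"+" :: cs) = PLUS   :: tokenize cs
      | tokenize (#"*" :: cs) = TIMES  :: tokenize cs
      | tokenize (#"(" :: cs) = LPAREN :: tokenize cs
      | tokenize (#")" :: cs) = RPAREN :: tokenize cs
      | tokenize (c :: cs)    =
          if Char.isDigit c
          then let val (n, rest) = getNum (c :: cs, 0) in NUM n :: tokenize rest end
          else raise Fail ("unexpected character: " ^ str c)

    val toks = tokenize (explode "12 + 3 * (4)")
    (* [NUM 12, PLUS, NUM 3, TIMES, LPAREN, NUM 4, RPAREN] *)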

  • Key Concepts: Tokenization, Lexical analysis, Grammar
  • Code (assumes the file stream.sml)
Tue 03 Apr.
Recitation
Languages
Wed 04 Apr.
Lecture 21
Grammars and Parsing

Context-free grammars arise naturally in a variety of applications. The "Abstract Syntax Charts" in programming language manuals are one instance. The underlying machine for a context-free language is a pushdown automaton, which maintains a read-write stack that allows the machine to "count".

Mon 09 Apr.
Lecture 22
More Parsing and Evaluation (maybe)

In this lecture we continue our discussion of context-free grammars, and demonstrate their role in parsing. Shift-reduce parsing uses a stack to delay application of rewrite rules, enabling operator precedence to be enforced. Recursive descent parsing is another style that uses recursion in a way that mirrors the grammar productions. Although parser generator tools exist for restricted classes of grammars, a direct implementation can allow greater flexibility and better error handling. We present an example of a shift-reduce parser for a grammar of arithmetic expressions.

Tue 10 Apr.
Recitation
TBA
Wed 11 Apr.
Lecture 23
Evaluation

We now put together lexical analysis and parsing with evaluation. The result is an interpreter that evaluates arithmetic expressions directly, rather than by constructing an explicit translation of the code into an intermediate language, and then into machine language, as a compiler does. Our first example uses the basic grammar of arithmetic expressions, interpreting them in terms of operations over the rational numbers. In this and the next lecture we extend this simple language to include conditional statements, variable bindings, function definitions, and recursive functions.
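
The posted code interprets expressions over the rational numbers; as a much reduced sketch over the integers (invented names, only the evaluation phase), the shape of such an interpreter is:

    (* Abstract syntax for the expression language, and a direct evaluator. *)
    datatype expr =
        Num   of int
      | Plus  of expr * expr
      | Times of expr * expr

    fun eval (Num n)          = n
      | eval (Plus  (e1, e2)) = eval e1 + eval e2
      | eval (Times (e1, e2)) = eval e1 * eval e2

    (* eval (Plus (Num 1, Times (Num 2, Num 3)))  evaluates to 7 *)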

Tonight 2:12am Doha time: Homework 5 due

Mon 16 Apr.
Lecture 24
Interpreters and Recursion

We introduce declaration environments, type environments, and value environments, to distinguish between static declarations and runtime evaluations. The parser produces declaration environments. The type-checker uses the declaration environments to build type environments, and thus perform compile-time type-checking. The evaluator uses the declaration environments to build value environments, and thus perform execution-time evaluation. We extend our set of values to include functions. In order to do this properly, we introduce the notion of a closure, which encapsulates the function definition as an expression together with the necessary variable bindings in the value environment.

Tue 17 Apr.
Recitation
Decidability, tractability, and tiling
Wed 18 Apr.
Lecture 25
Computability, Part I

In this and the next lecture we discuss the computability of functions in ML. By the Church-Turing thesis this is the same notion of computability as we have in recursion theory, with Turing machines, etc. There are two main ideas to show that certain functions are not computable: diagonalization (which is a direct argument), and problem reduction (which shows that a problem is undecidable by giving a reduction from another undecidable problem).

  • Key Concepts: Halting problem, Decision problem, Decision procedure, Semi-decision procedure, Diagonalization argument, Problem reduction, Equality of functions
  • Some notes on computability
Mon 23 Apr.
Lecture 26
Computability, Part II

Tonight 2:12am Doha time: Homework 6 due

Tue 24 Apr.
Recitation
Review for the FINAL
Wed 25 Apr.
Lecture 27
Review for the FINAL
TBA Final exam

Acknowledgments

The provided code and the notes were authored by Michael Erdmann (CMU).


Iliano Cervesato