15-150: Principles of Functional Programming

Lecture 14: Regular Expressions

Regular expressions--and their underlying finite-state automata--are useful in many different applications, and are central to text processing languages and tools such as awk, Perl, emacs and grep.

Regular expression pattern matching has a simple and elegant implementation in SML using continuation passing.
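To make the continuation-passing idea concrete, here is a minimal sketch of such a matcher. It is not necessarily the course's sample code: the datatype constructor names, the helper `accepts`, and the progress check `cs' <> cs` in the `Star` case are assumptions chosen for illustration. The continuation `k` receives the part of the input that the regular expression did not consume and decides whether the overall match succeeds.

```sml
(* Assumed representation of regular expressions. *)
datatype regexp =
    Zero                      (* matches no string *)
  | One                       (* matches only the empty string *)
  | Char of char              (* matches exactly this one character *)
  | Plus of regexp * regexp   (* alternation *)
  | Times of regexp * regexp  (* concatenation *)
  | Star of regexp            (* zero or more repetitions *)

(* match : regexp -> char list -> (char list -> bool) -> bool
   match r cs k tries to split cs into a prefix matching r and a
   suffix accepted by the continuation k. *)
fun match Zero _ _ = false
  | match One cs k = k cs
  | match (Char _) [] _ = false
  | match (Char c) (c' :: cs) k = (c = c') andalso k cs
  | match (Plus (r1, r2)) cs k = match r1 cs k orelse match r2 cs k
  | match (Times (r1, r2)) cs k =
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Star r) cs k =
      k cs orelse
      (* Assumption: require r to consume input before repeating, so
         that cases such as Star One do not loop forever. *)
      match r cs (fn cs' => cs' <> cs andalso match (Star r) cs' k)

(* accepts r s holds when r matches all of s: the initial
   continuation succeeds only if no input is left over. *)
fun accepts r s = match r (String.explode s) List.null

(* Usage: the regular expression (a+b)*c *)
val r = Times (Star (Plus (Char #"a", Char #"b")), Char #"c")
val yes = accepts r "abac"   (* true *)
val no  = accepts r "abca"   (* false *)
```

Note how backtracking comes for free: in the `Plus` and `Star` cases, if the continuation rejects one way of splitting the input, `orelse` simply tries the next alternative.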

Key Concepts

Sample Code

It is possible to do proofs of correctness for our regular expression matcher in slightly different ways. The first proves that the matcher returns true if and only if it is given correct input. The second set of notes shows that the matcher returns true if it is given correct input and returns false otherwise. These are slightly different perspectives, and they lead to slightly different proof techniques. Let's suppose that the matcher and all continuations involved are total, i.e., always return either true or false. This requires proof, but let's assume it. In that case, the two perspectives on how to prove correctness are logically equivalent, and it is largely a matter of taste and convenience which one to pick. Previous experience in 15-150 suggests that the first proof perspective, namely "matcher returns true iff correct input", leads to simpler proof steps. So, we advocate that perspective in this course.
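One way to write down the two perspectives precisely is sketched below. The notation is an assumption made for illustration: `match r cs k` is a continuation-passing matcher applied to a regular expression r, a character list cs, and a continuation k; L(r) is the language of r; and @ is list append. "Correct input" is read here as: cs splits into a prefix in L(r) and a suffix that k accepts.

```latex
% Perspective 1: a single "if and only if" statement.
\[
  \mathtt{match}\; r\; cs\; k \cong \mathtt{true}
  \iff
  \exists\, p,\, s.\;\;
  cs \cong p \mathbin{@} s
  \;\wedge\; p \in L(r)
  \;\wedge\; k\; s \cong \mathtt{true}
\]

% Perspective 2: two separate implications.
\[
  \bigl(\exists\, p,\, s.\;\; cs \cong p \mathbin{@} s \wedge p \in L(r) \wedge k\; s \cong \mathtt{true}\bigr)
  \implies \mathtt{match}\; r\; cs\; k \cong \mathtt{true}
\]
\[
  \neg\bigl(\exists\, p,\, s.\;\; cs \cong p \mathbin{@} s \wedge p \in L(r) \wedge k\; s \cong \mathtt{true}\bigr)
  \implies \mathtt{match}\; r\; cs\; k \cong \mathtt{false}
\]
```

Under the totality assumption, "not true" means "false", so the last implication is just the contrapositive of the left-to-right direction of the first statement. That is why the two perspectives coincide once totality is granted.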

The first set of notes is a good way to get a concise overall perspective on the key issues involved in regular expression matching. It is useful in part because of its brevity. The notes only outline a proof, so do not use them as a template for doing 15-150 assignments.

The second set of notes works out a correctness proof in detail, using the simpler-to-follow proof technique we just mentioned. It is a long proof, but an excellent template for how to prove facts about the regular expression matcher. When doing a homework assignment, this set of notes is a useful reference and template.

Some notes on Regular Expression Matching by Bob Harper

Some notes on Regular Expression Matching by Dan Licata, including a proof of termination.