15-150: Principles of Functional Programming

Lecture 14: Regular Expressions

Regular expressions, and the finite-state automata underlying them, are useful in many applications and are central to text-processing languages and tools such as awk, Perl, Emacs, and grep.

Regular expression pattern matching has a simple and elegant implementation in SML using continuation passing.
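To make the continuation-passing idea concrete, here is a minimal sketch of such a matcher. It is an illustration under stated assumptions, not necessarily the code linked from this page: the constructor names (Char, One, Zero, Times, Plus, Star) and the helper accept are chosen here for the example. The continuation k receives the part of the input left over after a prefix has matched, and decides whether the overall match succeeds.

```sml
(* Assumed representation of regular expressions (names are illustrative). *)
datatype regexp =
    Char of char               (* matches exactly that one character *)
  | One                        (* matches only the empty string *)
  | Zero                       (* matches nothing *)
  | Times of regexp * regexp   (* concatenation *)
  | Plus of regexp * regexp    (* alternation *)
  | Star of regexp             (* zero or more repetitions *)

(* match : regexp -> char list -> (char list -> bool) -> bool
   match r cs k tries to split cs into a prefix matched by r and a
   suffix accepted by the continuation k. *)
fun match (Char a) cs k =
      (case cs of
           [] => false
         | c :: cs' => a = c andalso k cs')
  | match One cs k = k cs
  | match Zero _ _ = false
  | match (Times (r1, r2)) cs k =
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Plus (r1, r2)) cs k =
      match r1 cs k orelse match r2 cs k
  | match (Star r) cs k =
      (* Either stop repeating, or match r once more. The check
         cs' <> cs insists that r consumed some input, so the
         recursion terminates even when r accepts the empty string. *)
      k cs orelse
      match r cs (fn cs' => cs' <> cs andalso match (Star r) cs' k)

(* accept : regexp -> string -> bool
   Hypothetical top-level wrapper: the whole string must be consumed. *)
fun accept r s = match r (String.explode s) List.null
```

With these definitions, accept (Times (Char #"a", Star (Char #"b"))) "abb" evaluates to true, while accept (Star One) "a" evaluates to false. The second call terminates only because of the progress check in the Star clause.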

Key Concepts

Sample Code

Example evaluation of the matcher

The notes linked below discuss regular expressions.

The notes also discuss proofs of correctness, a topic we will examine during the next lecture. The two sets of notes approach proofs of correctness for our regular expression matcher in slightly different ways:

These are slightly different perspectives, and they lead to slightly different proof techniques. Let's suppose that the matcher and all continuations involved are total, i.e., they always return either true or false. This requires proof, but let's suppose we know it. In that case, the two perspectives on how to prove correctness are logically equivalent, and which one to pick is largely a matter of taste and convenience. Previous experience in 15-150 suggests that the first proof perspective, namely "matcher returns true iff 'good' input", is conceptually simpler.
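One common way to make "matcher returns true iff 'good' input" precise is sketched below. This is an assumed formulation for illustration, not necessarily the exact wording used in either set of notes. The names match, r, cs, and k stand for the matcher, a regular expression, the input character list, and a total continuation. The notation L(r) for the language of r, and the variables p and s for a prefix and suffix of the input, are introduced here.

```latex
\[
  \texttt{match}\ r\ cs\ k = \texttt{true}
  \iff
  \exists\, p,\, s.\;\;
    cs = p \mathbin{@} s
    \;\wedge\; p \in L(r)
    \;\wedge\; k\ s = \texttt{true}
\]
```

Read this way, "good" input is input that can be split into a prefix in the language of r and a suffix that the continuation accepts.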

The first set of notes works out a correctness proof in detail, using the simpler-to-follow proof technique just mentioned. It is a long proof, but an excellent template for how to prove facts about the regular expression matcher, and a useful reference when doing a homework assignment. The second set of notes is useful in part because of its brevity: it is a good way to get a concise overall perspective on the key issues involved in regular expression matching. The second set only outlines a proof, so do not use it as a template for doing 15-150 assignments.

The second set of notes also discusses standardization of regular expressions.

Detailed notes on Regular Expression Matching

Concise notes on Regular Expression Matching