# 15-150: Principles of Functional Programming

# Lecture 14: Regular Expressions

Regular expressions, and their underlying finite-state automata, are
useful in many different applications and are central to
text-processing languages and tools such as `awk`, `Perl`, `emacs`,
and `grep`.

Regular expression pattern matching has a simple and elegant
implementation in SML using continuation passing.
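As a rough illustration of the continuation-passing idea, here is a minimal sketch of such a matcher. The datatype `regexp`, the functions `match` and `accept`, and the constructor names are assumptions made for this example, not necessarily the definitions used in lecture. The continuation `k` receives whatever input remains after a prefix has been matched and decides whether the overall match succeeds.

```sml
(* Hypothetical representation of regular expressions. *)
datatype regexp =
    Char of char                 (* a single character *)
  | One                          (* the empty string *)
  | Zero                         (* the empty language *)
  | Times of regexp * regexp     (* concatenation *)
  | Plus of regexp * regexp      (* alternation *)
  | Star of regexp               (* zero or more repetitions *)

(* match : regexp -> char list -> (char list -> bool) -> bool
 * match r cs k tries to split cs into a prefix in the language of r
 * and a suffix that the continuation k accepts. *)
fun match (Char a) cs k =
      (case cs of
         [] => false
       | c :: cs' => a = c andalso k cs')
  | match One cs k = k cs
  | match Zero _ _ = false
  | match (Times (r1, r2)) cs k =
      (* match r1 first, then hand the leftover input to r2 *)
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Plus (r1, r2)) cs k =
      match r1 cs k orelse match r2 cs k
  | match (Star r) cs k =
      (* either stop here, or match r once more; the check cs' <> cs
       * insists that r consumed some input, so the recursion on
       * Star r always sees a shorter list *)
      k cs orelse
      match r cs (fn cs' => cs' <> cs andalso match (Star r) cs' k)

(* accept : regexp -> string -> bool
 * The initial continuation succeeds only when no input is left. *)
fun accept r s = match r (String.explode s) List.null
```

For example, under these definitions `accept (Times (Char #"a", Star (Char #"b"))) "abbb"` evaluates to `true`, while the same expression applied to `"ba"` evaluates to `false`. The progress check in the `Star` case is one design choice among several; another common approach is to restrict attention to regular expressions in a standard form.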

### Key Concepts

- Formal language
- Finite-state automaton
- Regular expression
- Continuation passing
- Proof-directed debugging

It is possible to prove the correctness of our regular expression
matcher in slightly different ways. The first set of notes
proves that the matcher returns `true` if and only
if it is given correct input. The second set of notes shows that the
matcher returns `true` if it is given correct input and returns
`false` otherwise. These are slightly different perspectives,
and they lead to slightly different proof techniques. Let's suppose that
the matcher and all continuations involved are total, i.e., always
return either `true` or `false`. This requires proof,
but let's assume it. In that case, the two perspectives on how to
prove correctness are logically equivalent. It is largely a matter of
taste and convenience which one to pick. Previous experience in
15-150 suggests that the first proof perspective, namely "matcher
returns `true` iff correct input", leads to simpler proof steps.
So, we advocate that perspective in this course.
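To see why totality matters for this equivalence, here is a sketch under an assumed specification. The symbols are introduced only for this illustration: $L(r)$ is the language of the regular expression $r$, $cs$ is the input, $k$ is the continuation, and $P(r, cs, k)$ is one common way to state "correct input".

$$
P(r, cs, k) \;\equiv\; \exists\, p, s.\;\; cs = p \mathbin{@} s \;\wedge\; p \in L(r) \;\wedge\; k(s) = \texttt{true}
$$

$$
\text{(1)}\quad \mathtt{match}\ r\ cs\ k = \texttt{true} \iff P(r, cs, k)
$$

$$
\text{(2)}\quad \bigl(P(r, cs, k) \Rightarrow \mathtt{match}\ r\ cs\ k = \texttt{true}\bigr) \;\wedge\; \bigl(\neg P(r, cs, k) \Rightarrow \mathtt{match}\ r\ cs\ k = \texttt{false}\bigr)
$$

Statement (2) implies (1) directly: if $P$ fails, the matcher returns `false`, so it does not return `true`. Going from (1) to (2) uses totality: (1) says that when $P$ fails the result is not `true`, and only the assumption that the result is always `true` or `false` lets us conclude that it is `false` instead of, say, a computation that never returns.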

The first set of notes is a
good way to get a concise overall perspective on the key issues
involved in regular expression matching. It is useful in part because of
its brevity. These notes only outline a
proof, so do not use them as a template for doing 15-150
assignments.

The second set of notes works out a correctness proof in detail,
using the simpler-to-follow proof technique we just mentioned. It is
a long proof, but an excellent template for how to prove facts about
the regular expression matcher. When doing a homework assignment,
this set of notes is a useful reference and template.