Regular expressions, and their underlying finite-state automata, are useful in many different applications and are central to text processing languages and tools such as `awk`, `Perl`, `emacs`, and `grep`.

Regular expression pattern matching has a simple and elegant implementation in SML using continuation passing.
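To give a feel for the idea, here is a minimal sketch of a continuation-passing matcher. It is an illustration written for this page, not the code from the notes: the datatype `regexp`, its constructor names, and the helper `accepts` are assumptions, and the length check in the `Star` case is one common way to guarantee termination rather than necessarily the approach the notes take.

```sml
(* Hypothetical datatype for regular expressions; names are assumptions. *)
datatype regexp =
    Zero                      (* matches no string *)
  | One                       (* matches only the empty string *)
  | Char of char              (* matches exactly the given character *)
  | Plus of regexp * regexp   (* alternation *)
  | Times of regexp * regexp  (* concatenation *)
  | Star of regexp            (* zero or more repetitions *)

(* match : regexp -> char list -> (char list -> bool) -> bool
 * Intended behavior: match r cs k returns true iff cs can be split
 * as p @ s with p in the language of r and k s returning true. *)
fun match Zero _ _ = false
  | match One cs k = k cs
  | match (Char c) cs k =
      (case cs of
         [] => false
       | c' :: cs' => c = c' andalso k cs')
  | match (Plus (r1, r2)) cs k =
      match r1 cs k orelse match r2 cs k
  | match (Times (r1, r2)) cs k =
      (* match r1 first; the continuation matches r2 on what is left *)
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Star r) cs k =
      (* either stop here, or match r once and repeat on a strictly
       * shorter remainder, so that a case like Star One cannot loop *)
      k cs orelse
      match r cs (fn cs' =>
        length cs' < length cs andalso match (Star r) cs' k)

(* accepts : regexp -> string -> bool
 * The initial continuation succeeds only if the whole input is consumed. *)
fun accepts r s = match r (String.explode s) List.null

(* Usage: the regular expression a*b *)
val r = Times (Star (Char #"a"), Char #"b")
val yes = accepts r "aaab"   (* true *)
val no  = accepts r "aba"    (* false *)
```

The continuation `k` plays the role of "what remains to be matched": each case consumes a prefix of the input and hands the suffix to `k`, which is why no explicit backtracking machinery is needed.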

- Formal language
- Finite-state automaton
- Regular expression
- Continuation passing
- Proof-directed debugging

The notes linked below discuss regular expressions.

The notes also discuss proofs of correctness, a topic we will examine during the next lecture. The two sets of notes approach proofs of correctness for our regular expression matcher in slightly different ways:

- The first set of notes proves that the matcher returns `true` if and only if it is given 'good' input. Here 'good' means that the input string can be split into a prefix and a suffix, such that the prefix is in the language of the given regular expression and the given continuation returns `true` when called on the suffix. (See the specs for `match`. Also note that the actual code converts strings to lists of characters, for simplicity.)
- The second set of notes shows that the matcher returns `true` if it is given 'good' input and returns `false` otherwise.

These are slightly different perspectives, and lead to slightly
different proof techniques. Let's suppose that the matcher and all
continuations involved are total, i.e., always return either
`true` or `false`. This requires proof, but let's
assume it. In that case, the two perspectives on how to prove
correctness are logically equivalent. It is largely a matter of taste
and convenience which one to pick. Previous experience in 15-150
suggests that the first perspective, namely "the matcher returns
`true` iff it is given 'good' input", leads to simpler proof steps,
so we advocate that perspective in this course.
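To make the comparison concrete, the two statements can be written side by side. The notation here is an assumption chosen for this page, not taken from either set of notes: $L(r)$ is the language of $r$, $\cong$ is extensional equivalence, and $\mathrm{Good}(r, cs, k)$ abbreviates the 'good' input condition.

```latex
\begin{align*}
\mathrm{Good}(r, cs, k) \;&:\Longleftrightarrow\;
  \exists\, p, s.\;\; cs = p \mathbin{@} s \;\wedge\; p \in L(r)
  \;\wedge\; k\,s \cong \texttt{true} \\[4pt]
\text{(1)}\quad & \texttt{match}\; r\; cs\; k \cong \texttt{true}
  \iff \mathrm{Good}(r, cs, k) \\
\text{(2)}\quad & \mathrm{Good}(r, cs, k) \implies
  \texttt{match}\; r\; cs\; k \cong \texttt{true},
  \qquad
  \neg\,\mathrm{Good}(r, cs, k) \implies
  \texttt{match}\; r\; cs\; k \cong \texttt{false}
\end{align*}
```

The right-to-left direction of (1) is the first half of (2). The left-to-right direction of (1), read contrapositively, says that input which is not 'good' does not make the matcher return `true`; totality then turns "does not return `true`" into "returns `false`", which is the second half of (2).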

The first set of notes works out a correctness proof in detail, using the simpler-to-follow proof technique we just mentioned. It is a long proof, but an excellent template for how to prove facts about the regular expression matcher. When doing a homework assignment, this set of notes is a useful reference and template. The second set of notes is useful in part because of its brevity. These notes are a good way to get a concise overall perspective on the key issues involved in regular expression matching. The notes only outline a proof, so do not use them as a template for doing 15-150 assignments.

The second set of notes also discusses standardization of regular expressions.