\documentclass[11pt,twoside]{article}
\usepackage{palatino}
\usepackage{math-cmds}
\usepackage{math-envs}
\usepackage{latin-abbrevs}
\usepackage{verbatim}
\usepackage{amssymb}
\usepackage{proof}
\usepackage{ifthen}
\usepackage{code}
\usepackage{fancyhdr}
\input{label-defns}
\input{generic-defns}
\input{syn-defns}
\input{minml-defns}
% \input{par-defns}
% \input{tinyc-defns}
% \input{mach-defns}
% \input{fj-defns}
\newcommand{\lp}{\mathcd{(}}
\newcommand{\rp}{\mathcd{)}}
\newcommand{\DD}{\mathcal{D}}
\newenvironment{infrule}[2]{\begin{equation}\label{j:#1-r:#2}}{\end{equation}}
\newcommand{\rref}[2]{\ref{j:#1-r:#2}}
\newcommand{\lecdate}{August 28, 2003}
\newcommand{\lecnum}{2}
\newcommand{\lectitle}{Inductive Definitions}
\addtolength{\oddsidemargin}{30pt}
\addtolength{\evensidemargin}{-22pt}
\title{Supplementary Notes on Inductive Definitions}
\author{15-312: Foundations of Programming Languages \\
Frank Pfenning}
\date{Lecture \lecnum\\ \lecdate}
\begin{document}
\pagestyle{fancyplain}
\setlength{\headheight}{14pt}
% \renewcommand{\chaptermark}[1]{\markboth{#1}{}}
% \renewcommand{\sectionmark}[1]{\markright{\thesection\ #1}}
\lhead[\fancyplain{}{\bfseries L\lecnum.\thepage}]%
{\fancyplain{}{\bfseries\lectitle}}
\chead[]{}
\rhead[\fancyplain{}{\bfseries\lectitle}]%
{\fancyplain{}{\bfseries L\lecnum.\thepage}}
\lfoot[{\small\scshape Supplementary Notes}]{{\small\scshape Supplementary Notes}}
\cfoot[]{}
\rfoot[{\small\scshape\lecdate}]{{\small\scshape\lecdate}}
\maketitle
These supplementary notes review the notion of an inductive definition
and give some examples of rule induction. References to Robert Harper's
draft book on \emph{Programming Languages: Theory and Practice} are
given in square brackets, by chapter or section.

Given our general goal of defining and reasoning about programming
languages, we will have to deal with a variety of description tasks.
The first is to describe the grammar of a language. The second is to
describe its static semantics, usually via some typing rules. The third
is to describe its dynamic semantics, often via transitions of an
abstract machine. On the surface these appear to be very different
formalisms (grammars, typing rules, abstract machines), but it turns out
that they can all be viewed as special cases of \emph{inductive
definitions} [Ch.~1]. Following standard practice, inductive definitions
will be presented via judgments and inference rules providing evidence
for judgments.

The first observation is that context-free grammars can be rewritten in
the form of inference rules [Ch.~4.1]. The basic judgment has the form
\[ s \ A \]
where $s$ is a string and $A$ is a non-terminal. This should be read
as the judgment that \textit{$s$ is a string of syntactic category $A$}.

As a simple example we consider the language of properly matched
parentheses over the alphabet $\Sigma = \{ \lp, \rp \}$. This language
can be defined by the grammar
\[ M \bnfdef \varepsilon \bnfalt M\, M \bnfalt \lp M \rp \]
with the only non-terminal $M$. Recall that $\varepsilon$ stands
for the empty string.
Rewritten as inference rules we have:
\begin{infrule}{M}{eps}
\infer{\varepsilon\ M}{}
\end{infrule}
\begin{infrule}{M}{concat}
\infer{s_1\, s_2\ M}{s_1\ M \qquad s_2\ M}
\end{infrule}
\begin{infrule}{M}{paren}
\infer{\lp s\rp\ M}{s\ M}
\end{infrule}
Our interpretation of these inference rules as an inductive definition
of the judgment $s\ M$ for a string $s$ means:
\begin{quote} \it
$s\ M$ holds if and only if there is a deduction of $s\ M$
using rules (\rref{M}{eps}), (\rref{M}{concat}), and
(\rref{M}{paren}).
\end{quote}
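This ``if and only if'' reading suggests a direct computational check:
iterate the three rules to a fixed point over strings of bounded length.
The following Python sketch (ours, not part of the notes; the function
name is invented for illustration) does exactly that.

```python
def matched_up_to(n):
    """All strings s with  s M  of length <= n, computed by iterating
    the rules (eps), (concat), and (paren) to a fixed point."""
    derivable = {""}  # rule (eps): the empty string is derivable
    while True:
        new = set()
        # rule (concat): if s1 M and s2 M then s1 s2 M
        for s1 in derivable:
            for s2 in derivable:
                if len(s1) + len(s2) <= n:
                    new.add(s1 + s2)
        # rule (paren): if s M then ( s ) M
        for s in derivable:
            if len(s) + 2 <= n:
                new.add("(" + s + ")")
        if new <= derivable:  # no new strings: fixed point reached
            return derivable
        derivable |= new
```

For instance, with $n = 4$ the computed set is exactly
$\{\varepsilon,\ \lp\rp,\ \lp\rp\lp\rp,\ \lp\lp\rp\rp\}$.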
Based on this interpretation we can prove properties of strings
in the syntactic category $M$ by rule induction. Here is a very
simple example.
\begin{theorem}[Counting Parentheses]
If $s\ M$ then $s$ has the same number of left and right parentheses.
\end{theorem}
\begin{proof}
By rule induction. We consider each case in turn.
\paragraph*{(Rule~\rref{M}{eps})} Then $s = \varepsilon$.
\begin{tabbing}
$s$ has $0$ left and $0$ right parens \` Since $s = \varepsilon$
\end{tabbing}
\paragraph*{(Rule~\rref{M}{concat})} Then $s = s_1\, s_2$.
\begin{tabbing}
$s_1\ M$ \` Subderivation \\
$s_2\ M$ \` Subderivation \\
$s_1$ has $n_1$ left and $n_1$ right parens for some $n_1$ \` By i.h. \\
$s_2$ has $n_2$ left and $n_2$ right parens for some $n_2$ \` By i.h. \\
$s$ has $n_1+n_2$ left and $n_1+n_2$ right parens \` Since $s = s_1\, s_2$
\end{tabbing}
\paragraph*{(Rule~\rref{M}{paren})} Then $s = \lp s'\rp$.
\begin{tabbing}
$s'\ M$ \` Subderivation \\
$s'$ has $n'$ left and $n'$ right parens for some $n'$ \` By i.h. \\
$s$ has $n'+1$ left and $n'+1$ right parens \` Since $s = \lp s'\rp$
\end{tabbing}
\end{proof}
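The theorem can also be spot-checked mechanically: build strings that
carry a derivation by construction, applying a randomly chosen rule at
each step, and compare the parenthesis counts. A sketch in Python (ours;
the function name is invented for illustration):

```python
import random

def derive(depth):
    """Return a string s with  s M, built by applying one of the three
    rules at each step; depth bounds the recursion."""
    if depth == 0:
        return ""                             # rule (eps)
    choice = random.randrange(3)
    if choice == 0:
        return ""                             # rule (eps)
    if choice == 1:                           # rule (concat)
        return derive(depth - 1) + derive(depth - 1)
    return "(" + derive(depth - 1) + ")"      # rule (paren)

# every derivable string has equally many left and right parentheses
for _ in range(1000):
    s = derive(6)
    assert s.count("(") == s.count(")")
```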
The grammar we gave, unfortunately, is ambiguous [Ch.~4.2]. For example,
there are infinitely many derivations of $\varepsilon\ M$, because
\[ \varepsilon = \varepsilon\, \varepsilon
= \varepsilon\, \varepsilon\, \varepsilon = \cdots
\]

In the particular example of this grammar, we could avoid rewriting it
if we can show that the abstract syntax tree [Ch.~5.1] we construct is
the same, independently of the derivation of a particular judgment.

An alternative is to rewrite the grammar so that it defines the same
language of strings, but the derivation of any particular string is
uniquely determined. In order to illustrate the concept of
simultaneous inductive definition, we use two non-terminals $L$
and $N$, where the category $L$ corresponds to $M$, while
$N$ is an auxiliary non-terminal.
\[
\begin{array}{rcl}
L & \bnfdef & \varepsilon \bnfalt N\, L \\
N & \bnfdef & \lp L \rp
\end{array}
\]
One can think of $L$ as a list of parenthesized expressions, while $N$ is
a single, non-empty parenthesized expression. This is readily
translated into an inductive definition via inference rules.
\begin{infrule}{L}{eps}
\infer{\varepsilon\ L}{}
\end{infrule}
\begin{infrule}{L}{concat}
\infer{s_1\, s_2\ L}{s_1\ N \qquad s_2\ L}
\end{infrule}
\begin{infrule}{N}{paren}
\infer{\lp s\rp\ N}{s\ L}
\end{infrule}
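The determinacy of the new grammar shows up operationally: since an $N$
must begin with a left parenthesis, a recognizer never has to guess
which production to use. Here is a sketch of such a recursive-descent
recognizer in Python (ours; the notes themselves do not give code):

```python
def recognizes_L(s):
    """Decide  s L  for the grammar  L ::= eps | N L  and  N ::= ( L )."""
    def parse_L(i):
        # L ::= eps | N L: take an N exactly when a '(' is next
        while i < len(s) and s[i] == "(":
            i = parse_N(i)
        return i

    def parse_N(i):
        # N ::= ( L ): consume '(', an L, then ')'
        i = parse_L(i + 1)
        if i >= len(s) or s[i] != ")":
            raise ValueError("unmatched '('")
        return i + 1

    try:
        return parse_L(0) == len(s)
    except ValueError:
        return False
```

On $\lp\lp\rp\rp\lp\rp$ the recognizer succeeds; on $\lp\rp\rp$ the
final $\rp$ is left unconsumed and it fails.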
Note that the definitions of $s\ L$ and $s\ N$ depend on each other.
This is an example of a \emph{simultaneous inductive definition}.

Now there are two important questions to ask: (1) is the new grammar
really equivalent to the old one, in the sense that it generates the
same set of strings, and (2) is the new grammar really unambiguous? The
latter is left as a (non-trivial!) exercise; the former we discuss here.


At a high level we want to show that for any string $s$, $s\ M$ iff
$s\ L$. We break this down into two lemmas, because an
``if-and-only-if'' statement can rarely be proven by a single
induction; the two directions require different considerations.

We first consider the direction where we assume $s\ M$ and try to show
$s\ L$. When writing out the cases we notice we need an additional
lemma. As is often the case, the presentation of the proof is therefore
different from its order of discovery. To read this proof in a more
natural order, skip ahead to Lemma~\ref{lm:M-in-L} and pay particular
attention to the last step in the case of rule (2). That step
motivates the following lemma.
\begin{lemma}[Concatenation]
\label{lm:concat}
If $s_1\ L$ and $s_2\ L$ then $s_1\, s_2\ L$.
\end{lemma}
\begin{proof}
By induction on the derivation of $s_1\ L$. Note that induction
on the derivation of $s_2\ L$ will not work in this case!
\paragraph*{(Rule~\rref{L}{eps})} Then $s_1 = \varepsilon$.
\begin{tabbing}
$s_2\ L$ \` Assumption \\
$s_1\, s_2\ L$ \` Since $s_1\, s_2 = \varepsilon\, s_2 = s_2$
\end{tabbing}
\paragraph*{(Rule~\rref{L}{concat})} Then $s_1 = s_{11}\, s_{12}$.
\begin{tabbing}
$s_{11}\ N$ \` Subderivation \\
$s_{12}\ L$ \` Subderivation \\
$s_2\ L$ \` Assumption \\
$s_{12}\, s_2\ L$ \` By i.h. \\
$s_{11}\, s_{12}\, s_2\ L$ \` By rule~(\rref{L}{concat})
\end{tabbing}
\end{proof}
Now we are ready to prove the left-to-right implication.
\begin{lemma}
\label{lm:M-in-L}
If $s\ M$ then $s\ L$.
\end{lemma}
\begin{proof}
By induction on the derivation of $s\ M$.
\paragraph*{(Rule~\rref{M}{eps})} Then $s = \varepsilon$.
\begin{tabbing}
$s\ L$ \` By rule~(\rref{L}{eps}) since $s = \varepsilon$
\end{tabbing}
\paragraph*{(Rule~\rref{M}{concat})} Then $s = s_1\, s_2$.
\begin{tabbing}
$s_1\ M$ \` Subderivation \\
$s_2\ M$ \` Subderivation \\
$s_1\ L$ \` By i.h. \\
$s_2\ L$ \` By i.h. \\
$s_1\, s_2\ L$ \` By concatenation (Lemma~\ref{lm:concat})
\end{tabbing}
\paragraph*{(Rule~\rref{M}{paren})} Then $s = \lp s'\rp$.
\begin{tabbing}
$s'\ M$ \` Subderivation \\
$s'\ L$ \` By i.h. \\
$\lp s'\rp\ N$ \` By rule~(\rref{N}{paren}) \\
$\varepsilon\ L$ \` By rule~(\rref{L}{eps}) \\
$\lp s'\rp\ L$ \` By rule~(\rref{L}{concat}) and $\lp s'\rp\, \varepsilon = \lp s'\rp$
\end{tabbing}
\end{proof}
The right-to-left direction presents a slightly different problem,
namely that the statement ``\textit{If $s\ L$ then $s\ M$}'' does not
speak about $s\ N$, even though $L$ and $N$ depend on each other. In
such a situation we typically have to generalize the induction
hypothesis to also assert an appropriate property of the auxiliary
judgments ($s\ N$, in this case). This is the first alternative proof
below. The second alternative proof uses a proof principle called
inversion, closely related to induction. We present both proofs to
illustrate both techniques.
\begin{lemma}[First Alternative, Using Generalization]
\label{lm:L-in-M}
\begin{enumerate}
\item \label{pt:L} If $s\ L$ then $s\ M$.
\item \label{pt:N} If $s\ N$ then $s\ M$.
\end{enumerate}
\end{lemma}
\begin{proof}
By simultaneous induction on the given derivations. There are two
cases to consider for part~\ref{pt:L} and one case for part~\ref{pt:N}.
\paragraph*{(Rule~\rref{L}{eps})} Then $s = \varepsilon$.
\begin{tabbing}
$s\ M$ \` By rule~(\rref{M}{eps}) since $s = \varepsilon$
\end{tabbing}
\paragraph*{(Rule~\rref{L}{concat})} Then $s = s_1\, s_2$.
\begin{tabbing}
$s_1\ N$ \` Subderivation \\
$s_2\ L$ \` Subderivation \\
$s_1\ M$ \` By i.h.(\ref{pt:N}) \\
$s_2\ M$ \` By i.h.(\ref{pt:L}) \\
$s_1\, s_2\ M$ \` By rule~(\rref{M}{concat})
\end{tabbing}
\paragraph*{(Rule~\rref{N}{paren})} Then $s = \lp s'\rp$.
\begin{tabbing}
$s'\ L$ \` Subderivation \\
$s'\ M$ \` By i.h.(\ref{pt:L}) \\
$\lp s'\rp\ M$ \` By rule~(\rref{M}{paren})
\end{tabbing}
\end{proof}
For this particular lemma, we could have avoided the generalization and
instead proven (\ref{pt:L}) directly by using a new form of argument
called \emph{inversion}. It is called inversion because it allows us to
reason from the conclusion of an inference rule to its premises, while
normally an inference rule works from the premises to the conclusion.
This is confusing (and often applied incorrectly), so make sure you
understand why and when this is legal by carefully examining the
following proof.
\addtocounter{thm}{-1}
\begin{lemma}[Second Alternative, Using Inversion]
If $s\ L$ then $s\ M$.
\end{lemma}
\begin{proof}
By induction on the given derivation. Note that there are only two
cases to consider here instead of three, because only two rules
have a conclusion of the form $s\ L$.
\paragraph*{(Rule~\rref{L}{eps})} Then $s = \varepsilon$.
\begin{tabbing}
$s\ M$ \` By rule~(\rref{M}{eps}) since $s = \varepsilon$
\end{tabbing}
\paragraph*{(Rule~\rref{L}{concat})} Then $s = s_1\, s_2$.
\begin{tabbing}
$s_1\ N$ \` Subderivation \\
$s_1 = \lp s_1'\rp$ and $s_1'\ L$ for some $s_1'$ \` By inversion \\
$s_1'\ M$ \` By i.h. \\
$\lp s_1'\rp\ M$ \` By rule~(\rref{M}{paren}) \\
$s_2\ L$ \` Subderivation \\
$s_2\ M$ \` By i.h. \\
$\lp s_1'\rp\, s_2\ M$ \` By rule~(\rref{M}{concat}) \\
$s\ M$ \` Since $s = s_1\, s_2 = \lp s_1'\rp\, s_2$
\end{tabbing}
In this last case, the first line reminds us that we have a
subderivation of $s_1\ N$. By examining all inference rules we can see
that there is exactly one rule that has a conclusion of this form,
namely rule ($\rref{N}{paren}$). Therefore $s_1\ N$ \emph{must} have been
inferred with that rule, and $s_1$ must be equal to $\lp s_1'\rp$ for
some $s_1'$ such that $s_1'\ L$. Moreover, the derivation of $s_1'\ L$
is a subderivation of the one we started with and we can therefore apply
the induction hypothesis to it. The rest of the proof is routine.
\end{proof}
Now we can combine the preceding lemmas into the theorem
we were aiming for.
\begin{theorem}
$s\ M$ if and only if $s\ L$.
\end{theorem}
\begin{proof}
Immediate from Lemmas~\ref{lm:M-in-L} and~\ref{lm:L-in-M}.
\end{proof}
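The theorem itself can be tested exhaustively on small strings. The
sketch below (ours) compares a recognizer for $M$, here implemented with
the familiar balance-counter characterization of the matched-parenthesis
language (an assumption not proved in these notes), against a
deterministic recognizer following the $L$/$N$ grammar, over every
string in $\{\lp, \rp\}$ of length at most $8$:

```python
from itertools import product

def is_M(s):
    """Balance check: the depth never goes negative and ends at zero.
    (Standard characterization of the matched-parenthesis language.)"""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def is_L(s):
    """Recursive-descent recognizer for L ::= eps | N L, N ::= ( L )."""
    def parse_L(i):
        while i < len(s) and s[i] == "(":
            i = parse_N(i)
        return i

    def parse_N(i):
        i = parse_L(i + 1)
        if i >= len(s) or s[i] != ")":
            raise ValueError("unmatched '('")
        return i + 1

    try:
        return parse_L(0) == len(s)
    except ValueError:
        return False

# the two judgments agree on every string over {(, )} of length <= 8
for n in range(9):
    for letters in product("()", repeat=n):
        s = "".join(letters)
        assert is_M(s) == is_L(s)
```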
\paragraph*{Some advice on inductive proofs.} Most of the proofs
that we will carry out in the class are by induction. This is simply
due to the nature of the objects we study, which are generally defined
inductively. Therefore, when presented with a conjecture that does not
follow immediately from some lemmas, we first try to prove it by
induction as given. This might involve a choice among several different
given objects or derivations over which we may apply induction. If one
of them works we are, of course, done. If not, we try to analyze the
failure in order to decide whether (a) we need to separate out a
\emph{lemma} to be proven first, (b) we need to \emph{generalize the
induction hypothesis}, or (c) our conjecture might be false and we
should look for a \emph{counterexample}.


Finding a lemma is usually not too difficult, because it is suggested
by the gap in the proof attempt that you cannot fill. For example, in
the proof of Lemma~\ref{lm:M-in-L}, case
(Rule~\rref{M}{concat}), we obtain $s_1\ L$ and $s_2\ L$ by induction
hypothesis and have to prove $s_1\, s_2\ L$. Since there are no
inference rules that would allow such a step, but it seems true
nonetheless, we prove it as Lemma~\ref{lm:concat}.

Generalizing the induction hypothesis can be a very tricky balancing
act. The problem is that in an inductive proof, the property we are
trying to establish occurs twice: once as an inductive assumption and
once as a conclusion we are trying to prove. If we strengthen the
property, the induction hypothesis gives us more information, but the
conclusion becomes harder to prove. If we weaken the property, the
induction hypothesis gives us less information, but the conclusion is
easier to prove. Fortunately, there are easy cases such as the first
alternative of Lemma~\ref{lm:L-in-M} in which the nature of the mutually
recursive judgments suggested a generalization.

Finding a counterexample varies greatly in difficulty. Mostly, in this
course, counterexamples arise only if there are glaring deficiencies in
the inductive definitions, or a rather obvious failure of properties
such as type safety. In other cases, finding one might require a very
deep insight into the nature of a particular inductive definition and
cannot be gleaned directly from a failed proof attempt. An example of a difficult
counterexample is given by the extra credit Question 2.2 in Assignment 1
of this course. The conjecture there is that every tautology is a
theorem. However, there is very little in the statement of this
conjecture or in the definitions of \emph{tautology} and \emph{theorem}
that would suggest a means to either prove or refute it.
\paragraph*{Three pitfalls to avoid.} The difficulty with inductive
proofs is that one is often blinded by the fact that the proposed
conjecture is true. Similarly, if set up correctly, it will be true
that in each case the induction hypothesis does in fact imply the
desired conclusion, but the induction hypothesis may not be strong
enough to prove it. So you must resist the temptation to declare
something ``clearly true''; prove it instead.

The second kind of mistake in an inductive proof that one often
encounters is a confusion about the direction of an inference rule. If
you reason backwards from what you are trying to prove, you are thinking
about the rules bottom up: ``\textit{If I only could prove $J_1$ then I
could conclude $J_2$, because I have an inference rule with premise
$J_1$ and conclusion $J_2$.}'' Nonetheless, when you write down the
proof in the end you must use the rule in the proper direction. If you
reason forward from your assumptions using the inference rules top-down
then no confusion can arise. The only exception is the proof principle
of inversion, which you can \emph{only} employ if (a) you have
established that a derivation of a given judgment $J$ exists, and (b)
you consider all possible inference rules whose conclusion matches $J$.
In no other case can you use an inference rule ``backwards''.

The third mistake to avoid is to apply the induction hypothesis to a
derivation that is not a subderivation of the one you are given. Such
reasoning is circular and unsound. You must always verify that when you
claim something follows by induction hypothesis, it is in fact legal to
apply it!
\paragraph*{How much to write down.}
Finally, a word on the level of detail in the proofs we give and the
proofs we expect you to provide in the homework assignments. The proofs
in this handout are quite pedantic, but we ask you to be just as
pedantic unless otherwise specified. In particular, you \emph{must}
show any lemmas you are using, and you \emph{must} show the generalized
induction hypothesis in an inductive proof (if you need a
generalization). You also \emph{must} consider all the cases and
\emph{justify each line} carefully. As we gain a certain facility with
such proofs, we may relax these requirements once we are certain you
know how to fill in the steps that one might omit, for example, in a
research paper.
\end{document}