\documentclass[11pt,twoside]{scrartcl}

%opening
\newcommand{\lecid}{15-414}
\newcommand{\leccourse}{Bug Catching: Automated Program Verification}
\newcommand{\lecdate}{} %e.g. {October 21, 2013}
\newcommand{\lecnum}{12}
\newcommand{\lectitle}{Procedures and Ghost State}
\newcommand{\lecturer}{Matt Fredrikson}

\usepackage{listings}

\usepackage{lecnotes}

\usepackage{tikz}

\usepackage[irlabel]{bugcatch}


\usetikzlibrary{automata,shapes,positioning,matrix,shapes.callouts,decorations.text}

\tikzset{onslide/.code args={<#1>#2}{%
  \only<#1>{\pgfkeysalso{#2}} % \pgfkeysalso doesn't change the path
}}

\tikzset{
    invisible/.style={opacity=0,text opacity=0},
    visible on/.style={alt={#1{}{invisible}}},
    alt/.code args={<#1>#2#3}{%
      \alt<#1>{\pgfkeysalso{#2}}{\pgfkeysalso{#3}} % \pgfkeysalso doesn't change the path
    },
  }

\lstset{
  basicstyle=\ttfamily,
  mathescape
}

\begin{document}

\maketitle
\thispagestyle{empty}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction}

The programs that we have discussed so far are somewhat limited. By restricting the statements allowed in programs to simpler forms, we have been able to understand the fundamental ideas behind the formal semantics and proof techniques for reasoning about program behavior. Importantly, the relative simplicity of the language allowed us to do this without becoming overwhelmed with a significant number of cases and details that need to be considered for rigor, but are not essential to these fundamental ideas.

``Real'' programming languages universally support more advanced ways of structuring programs that encourage abstraction, modularity, and reuse. One such construct is the procedure, which gives programmers a way to encapsulate some functionality so that it can be invoked repeatedly in the future. In this lecture we will introduce procedures into the language we have been studying.
We will allow our procedures to make use of call-by-value arguments and recursion, but we will not incorporate some features widely used in imperative languages, such as local variables and explicit return values.

We will develop ways of reasoning about procedure calls compositionally, using induction to establish contracts for recursive procedures: pre- and postconditions, along with termination.
Having established contracts, we will see how logical monotonicity yields a general rule for making repeated use of previously proved contracts at call sites.
To deal with calls that pass arbitrary terms as arguments, we will introduce \emph{ghost state}, which simplifies call expressions by adding proof context that keeps track of essential relationships across call sites and updates.

\section{Review: programs so far}

So far, we've defined a fairly simple programming language with support for arrays, conditionals, and loops.
\[
\begin{array}{llll}
%
\text{term syntax}
&
  \astrm,\bstrm ~\bebecomes&
  x &(\text{where}~x~\text{is a variable symbol}) \\
  && \alternative 
  c &(\text{where}~c~\text{is a constant literal}) \\
  && \alternative
  a(\astrm) &(\text{where}~a~\text{is an array symbol}) \\
  && \alternative
  \astrm+\bstrm & \\
  && \alternative
  \astrm\cdot\bstrm &
\\
\text{program syntax}
&
  \asprg,\bsprg ~\bebecomes&
  \pupdate{\pumod{x}{\astrm}}&(\text{where}~x~\text{is a variable symbol}) \\
  && \alternative
  \pupdate{\pumod{a(\astrm)}}{\bstrm}&(\text{where}~a~\text{is an array symbol}) \\
  && \alternative
  \ptest{\ivr} & \\
  && \alternative
  \pif{\ivr}{\asprg}{\bsprg} & \\
  && \alternative
  \asprg;\bsprg & \\
  && \alternative
  \pwhile{\ivr}{\asprg}
\end{array}
\]
Semantically, we modeled arrays as functions from their domain ($\mathbb{Z}$) to their range ($\mathbb{Z}$), so the states of our programs are maps from the set of all variables to $\mathbb{Z} \cup (\mathbb{Z} \to \mathbb{Z})$. We then defined the semantics of terms with arrays in them.
\begin{definition}[Semantics of terms]
The \emph{semantics of a term} $\astrm$ in a state $\iget[state]{\I}\in\linterpretations{\Sigma}{V}$ is its value \(\ivaluation{\I}{\astrm}\).
It is defined inductively by distinguishing the shape of term $\astrm$ as follows:
\begin{itemize}
\item \m{\ivaluation{\I}{x} = \iget[state]{\I}(x)} for variable $x$
\item \m{\ivaluation{\I}{c} = c} for number literals $c$
%\item \(\ivaluation{\I}{f(\theta_1,\dots,\theta_k)} = \iget[const]{\I}(f)\big(\ivaluation{\I}{\theta_1},\dots,\ivaluation{\I}{\theta_k}\big)\)
\item \m{\ivaluation{\I}{\astrm+\bstrm} = \ivaluation{\I}{\astrm} + \ivaluation{\I}{\bstrm}}
\item \m{\ivaluation{\I}{\astrm\cdot\bstrm} = \ivaluation{\I}{\astrm} \cdot \ivaluation{\I}{\bstrm}}
\end{itemize}
\end{definition}
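To make the inductive definition concrete, here is a small Python sketch of a term evaluator. This is an illustrative model only: the tuple representation of terms, the use of dicts for states, and the default value 0 for array positions outside the map are assumptions of the sketch, not part of the formal development (where arrays are total functions $\mathbb{Z}\to\mathbb{Z}$).

```python
# Illustrative model of term evaluation: terms are nested tuples, states map
# names to ints (variables) or dicts (arrays, standing in for Z -> Z).

def eval_term(term, state):
    """Evaluate a term in a state, mirroring the inductive definition."""
    if isinstance(term, int):                 # constant literal c
        return term
    if isinstance(term, str):                 # variable x
        return state[term]
    op = term[0]
    if op == "select":                        # array read a(e)
        _, a, e = term
        return state[a].get(eval_term(e, state), 0)  # default 0: a sketch assumption
    if op == "+":
        return eval_term(term[1], state) + eval_term(term[2], state)
    if op == "*":
        return eval_term(term[1], state) * eval_term(term[2], state)
    raise ValueError(f"unknown term: {term!r}")

state = {"x": 3, "a": {0: 7}}
print(eval_term(("+", "x", ("*", 2, "x")), state))   # 3 + 2*3 = 9
print(eval_term(("select", "a", 0), state))          # a(0) = 7
```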

\begin{definition}[Transition semantics of programs] \label{def:program-transition}
\indexn{\lenvelope\asprg\renvelope|textbf}%
Each program $\asprg$ is interpreted semantically as a binary reachability relation \m{\iaccess[\asprg]{\I}\subseteq\linterpretations{\Sigma}{V}\times\linterpretations{\Sigma}{V}} over states, defined inductively by
\begin{enumerate}
\item \m{\iaccess[\pupdate{\pumod{x}{\genDJ{x}}}]{\I} = \{(\iget[state]{\I},\iget[state]{\It}) \with \iget[state]{\It}=\iget[state]{\I}~\text{except that}~ \ivaluation{\It}{x}=\ivaluation{\I}{\genDJ{x}}\}}
\\
The final state $\iget[state]{\It}$ is identical to the initial state $\iget[state]{\I}$ except in its interpretation of the variable $x$, which is changed to the value that $\genDJ{x}$ has in initial state $\iget[state]{\I}$.

\item \m{\iaccess[\pupdate{\pumod{a(\astrm)}}{\bstrm}]{\I} = 
\{(\iget[state]{\I},\iget[state]{\It}) \with \iget[state]{\It}=\iget[state]{\I}~\text{except that}~\iget[state]{\It}(a) = \arrupd{\iget[state]{\I}(a)}{\ivaluation{\I}{\astrm}}{\ivaluation{\I}{\bstrm}}\}}
\\
The final state $\iget[state]{\It}$ is identical to the initial state $\iget[state]{\I}$ except in its interpretation of the array symbol $a$, which is updated at position $\ivaluation{\I}{\astrm}$ to take the value $\ivaluation{\I}{\bstrm}$.

\item \m{\iaccess[\ptest{\ivr}]{\I} = \{(\iget[state]{\I},\iget[state]{\I}) \with \imodels{\I}{\ivr}\}}
\\
The test \(\ptest{\ivr}\) stays in its state $\iget[state]{\I}$ if formula $\ivr$ holds in $\iget[state]{\I}$, otherwise there is no transition.

\item \m{\iaccess[\pif{\ivr}{\asprg}{\bsprg}]{\I} = \{(\iget[state]{\I},\iget[state]{\It}) \with \imodels{\I}{\ivr} ~\text{and}~ \iaccessible[\asprg]{\I}{\It} ~\text{or}~ \inonmodels{\I}{\ivr} ~\text{and}~ \iaccessible[\bsprg]{\I}{\It}\}}
\\
The \m{\pif{\ivr}{\asprg}{\bsprg}} program runs $\asprg$ if $\ivr$ is true in the initial state and otherwise runs $\bsprg$.

\item \m{\iaccess[\asprg;\bsprg]{\I} = \iaccess[\asprg]{\I} \compose\iaccess[\bsprg]{\I}}
\(= \{(\iget[state]{\I},\iget[state]{\It}) \with (\iget[state]{\I},\iget[state]{\Iz}) \in \iaccess[\asprg]{\I},  (\iget[state]{\Iz},\iget[state]{\It}) \in \iaccess[\bsprg]{\I}\}\)
\\
The relation \m{\iaccess[\asprg;\bsprg]{\I}} is the composition \(\iaccess[\asprg]{\I} \compose\iaccess[\bsprg]{\I}\) of relation \(\iaccess[\bsprg]{\I}\) after \(\iaccess[\asprg]{\I}\) and can, thus, follow any transition of $\asprg$ through any intermediate state $\iget[state]{\Iz}$ to a transition of $\bsprg$.

\item \m{\iaccess[\pwhile{\ivr}{\asprg}]{\I} = \big\{(\iget[state]{\I},\iget[state]{\It}) \with}
there are an $n$ and states
\(\iget[state]{\Iz[0]}=\iget[state]{\I},\iget[state]{\Iz[1]},\iget[state]{\Iz[2]},\dots,\iget[state]{\Iz[n]}=\iget[state]{\It}\)
such that for all $0\leq i<n$:
\textcircled{1} the loop condition is true \m{\imodels{\Iz[i]}{\ivr}} and
\textcircled{2} state $\iget[state]{\Iz[i+1]}$ is reachable from state $\iget[state]{\Iz[i]}$ by running $\asprg$, so
\m{\iaccessible[\asprg]{\Iz[i]}{\Iz[i+1]}}
and \textcircled{3} the loop condition is false \m{\inonmodels{\Iz[n]}{\ivr}} in the end$\big\}$
\\
The \(\pwhile{\ivr}{\asprg}\) loop runs $\asprg$ repeatedly when $\ivr$ is true and only stops when $\ivr$ is false.
It reaches no final state if $\ivr$ remains true forever; for example, \m{\iaccess[\pwhile{\ltrue}{\asprg}]{\I} = \emptyset}.
\end{enumerate}
\end{definition}
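The transition relations above can also be modeled executably. The following Python sketch is illustrative only: guards and right-hand sides are represented as Python callables on states (so term evaluation is not re-modeled), and the array-assignment case is omitted for brevity.

```python
# Executable model of the transition semantics (a sketch, not the formal
# relation): states are dicts; tests, guards, and right-hand sides are
# Python callables on states.

def run(prog, state):
    """Return the list of final states reachable from `state`."""
    op = prog[0]
    if op == "assign":                       # x := e
        _, x, e = prog
        return [{**state, x: e(state)}]
    if op == "test":                         # ?Q: no transition when Q fails
        return [state] if prog[1](state) else []
    if op == "if":                           # if(Q) alpha else beta
        _, q, a, b = prog
        return run(a if q(state) else b, state)
    if op == "seq":                          # alpha; beta: compose relations
        _, a, b = prog
        return [t for m in run(a, state) for t in run(b, m)]
    if op == "while":                        # while(Q) alpha
        _, q, body = prog
        if not q(state):                     # Q false: stop in current state
            return [state]
        return [t for m in run(body, state) for t in run(prog, m)]
    raise ValueError(f"unknown program: {prog!r}")

# s := 0; i := 1; while(i <= n) { s := s + i; i := i + 1 }
summing = ("seq", ("assign", "s", lambda s: 0),
           ("seq", ("assign", "i", lambda s: 1),
            ("while", lambda s: s["i"] <= s["n"],
             ("seq", ("assign", "s", lambda s: s["s"] + s["i"]),
                     ("assign", "i", lambda s: s["i"] + 1)))))
print(run(summing, {"n": 4})[0]["s"])   # 1 + 2 + 3 + 4 = 10
```

Note how a failing test yields the empty list of final states, matching \m{\iaccess[\ptest{\ivr}]{\I}} having no transition when $\ivr$ is false.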

\section{Adding procedure calls}

Now we will extend our language with procedure calls. We'll assume that our language doesn't have any scoping conventions, so a procedure can read and modify any variable or array in the state.

We update the program syntax to add a new alternative for procedure calls, distinguished by the presence of parentheses after the procedure name:
\[
\begin{array}{llll}
\text{program syntax}
&
  \asprg,\bsprg ~\bebecomes&
  \pupdate{\pumod{x}{\astrm}}&(\text{where}~x~\text{is a variable symbol}) \\
  && \alternative
  \pupdate{\pumod{a(\astrm)}}{\bstrm}&(\text{where}~a~\text{is an array symbol}) \\
  && \alternative
  \ptest{\ivr} & \\
  && \alternative
  \pif{\ivr}{\asprg}{\bsprg} & \\
  && \alternative
  \asprg;\bsprg & \\
  && \alternative
  \pwhile{\ivr}{\asprg} \\
  && \alternative
  \pcall{m}{\astrm_1,\ldots,\astrm_n}
\end{array}
\]

\subsection{First, without recursion}
If the body of \keywordfont{m} does not make any recursive calls, then we can reason about calls to \keywordfont{m} in a straightforward way.

In the call $\pcall{m}{\astrm_1,\ldots,\astrm_n}$, the $\astrm_1, \ldots, \astrm_n$ are called the \emph{actual parameters}. For the corresponding declaration,
\begin{lstlisting}
    proc m(x${}_1$, $\ldots$, x${}_n$) {...}
\end{lstlisting}
the $x_1, \ldots, x_n$ are called the \emph{formal parameters}. The actual parameters are terms that are evaluated in the calling context, using the current state at the moment the call is made. The formal parameters are variables that are assigned the corresponding values of the actuals, for later use in the procedure body. This convention corresponds to the typical call-by-value semantics present in many languages.
To simplify matters when reasoning compositionally, we will assume that \textbf{the only free variables appearing in a procedure are its formal arguments}.
This means that the body of a procedure is only allowed to ``read'' from its arguments, and not from any other variables.

When formalizing the semantics of procedure calls, we need to be careful about the state used to evaluate the actuals. We might be tempted to reduce the semantics of a procedure call to a seemingly equivalent sequence of assignments and inlining operations. For example, in the following let $\asprg$ be the body of \keywordfont{m} and $v_1,\ldots,v_n$ be fresh variables, and assume for the sake of clarity that we are only concerned for the moment with non-recursive procedures.
\[
\iaccess[\pcall{m}{\astrm_1,\ldots,\astrm_n}]{\I} 
= 
\iaccess[\pumod{v_1}{x_1};\ldots;\pumod{v_n}{x_n};\pumod{x_1}{\astrm_1};\ldots;\pumod{x_n}{\astrm_n};\asprg;\pumod{x_1}{v_1};\ldots;\pumod{x_n}{v_n}]{\I} 
\]
The idea behind this definition is that before entering the procedure body, the current values of the variables corresponding to the formal parameters are stored in a set of fresh new variables. The formal parameter variables are then assigned to the terms given as actuals, and the procedure body is run. When it completes, the formals are restored to their values before the call.

This definition captures the notion of locality that we want with respect to formal parameters: within the procedure they take the values passed in, and the calling context need not worry about variables that happen to collide with formal parameters being overwritten. However, this definition introduces spurious dependencies among the actuals. Consider the following program.
\begin{lstlisting}
    proc foo(x, y) {
      ...
    }

    x := 1;
    a := 0;
    foo(a, x+2);
\end{lstlisting}
In this example, if we used the above semantics, then the value passed to \keywordfont{foo} in the formal parameter $y$ would be 2 rather than the 3 we would expect: the formal parameter $x$ is assigned the value of $a$ (that is, 0) before the actual $x+2$ is evaluated, so $y$ receives $0+2=2$.

What we want the semantics to encode is a parallel assignment of all of the formal parameters to their actuals. This leads us to the following definition.
\[
\begin{array}{l@{\hskip 0pt}l}
\iaccess[\pcall{m}{\astrm_1,\ldots,\astrm_n}]{\I} =
\{
  (\iget[state]{\I},\iget[state]{\It}) \with &
  \exists~\iget[state]{\Iz[1]},\iget[state]{\Iz[2]}~\text{where}~\iget[state]{\Iz[1]}=\iget[state]{\I}~\text{except}~
    \iget[state]{\Iz[1]}(x_i)=\ivaluation{\I}{\astrm_i}~\text{for}~i=1,\ldots,n, \\
  &
  (\iget[state]{\Iz[1]},\iget[state]{\Iz[2]}) \in \iaccess[\asprg]{\I}, \\
  &
  \iget[state]{\It}=\iget[state]{\Iz[2]}~\text{except}~\iget[state]{\It}(x_i)=\iget[state]{\I}(x_i)~\text{for}~i=1,\ldots,n
\}
\end{array}
\]
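The difference between the two definitions can be seen concretely in a small Python sketch (illustrative; the helper names are inventions of this sketch). Sequential binding lets the assignment to one formal corrupt the evaluation of a later actual that mentions it, while simultaneous binding evaluates every actual in the calling state:

```python
# Sequential vs. simultaneous binding of formals to actuals (illustrative).
# With formals (x, y) and a call like foo(a, x+2), sequential binding lets
# the assignment to x corrupt the evaluation of the second actual.

def bind_sequential(state, formals, actuals):
    s = dict(state)
    for x, e in zip(formals, actuals):   # each actual sees earlier bindings
        s[x] = e(s)
    return s

def bind_parallel(state, formals, actuals):
    vals = [e(state) for e in actuals]   # all actuals use the calling state
    return {**state, **dict(zip(formals, vals))}

state = {"x": 1, "a": 0}
formals = ["x", "y"]
actuals = [lambda s: s["a"], lambda s: s["x"] + 2]    # foo(a, x+2)
print(bind_sequential(state, formals, actuals)["y"])  # 2: x already overwritten
print(bind_parallel(state, formals, actuals)["y"])    # 3: what we expect
```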

\subsection{Semantics, with recursion}

Consider the following procedure that implements the factorial function. It assumes that the variable $n$ contains the ``input'', and stores the computed factorial value in $r$.

\begin{lstlisting}
proc fact(n) {
  if(n = 0) { r := 1 } 
  else { fact(n-1); r := r * n; }
}
\end{lstlisting}
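For reference, the procedure computes the same function as the usual recursive factorial. A direct Python transliteration (illustrative; the result is returned rather than left in a shared variable $r$):

```python
# Python transliteration of fact (illustrative): the result r is returned
# instead of being stored in a shared variable.

def fact(n):
    if n == 0:
        r = 1
    else:
        r = fact(n - 1) * n      # fact(n-1); r := r * n
    return r

print(fact(5))   # 120
```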

Before going further, let's have another look at the semantics of procedure calls. In non-recursive cases, we can always inline the body of a procedure, replacing all procedure calls with their corresponding bodies, and repeating that process until there are no further opportunities to do so. When we reach this point, we will be left with a program that doesn't have any procedure calls, and we can apply the semantics for other program constructs that we have built up over the semester.

Now that we want to account for recursion, things are different. If we use this process of inlining, we may never finish because each time we replace a call with its body, we introduce at least one more call to the same procedure! Let's formalize this a bit, and see if we can arrive at further insights.

If $\asprg$ is the body of a recursive procedure \keywordfont{m}, then we will use the notation $\recunfold{\asprg}{k}$ to denote a \emph{syntactic approximation} of $\asprg$ after $k$ levels of inlining recursive calls. When $k=0$, we simply replace the entire body with $\pabort$, which you will recall from earlier is defined as:
$
\pabort \equiv \ptest{\mathit{false}},~\text{with semantics}~\iaccess[\pabort]{\I} = \emptyset
$.
To be precise, we define the syntactic approximation $\recunfold{\asprg}{k}$ inductively on $k$ as follows:
\begin{equation}
\begin{array}{ll}
\recunfold{\asprg}{0} &= \pabort \\
\recunfold{\asprg}{k+1} &= \subst[\asprg]{\pcallnoarg{m}}{\recunfold{\asprg}{k}}
\end{array}
\end{equation}
where $\subst[\asprg]{\pcallnoarg{m}}{\recunfold{\asprg}{k}}$ denotes $\asprg$ with all occurrences of $\pcallnoarg{m}$ replaced by $\recunfold{\asprg}{k}$.

We will write $\recunfold{\keywordfont{m}}{k}$ to denote the procedure obtained by replacing the body with $\recunfold{\asprg}{k}$. So, for example applying this to the \keywordfont{fact} procedure from before, \recunfold{\keywordfont{fact}}{1} would correspond to the program:
\begin{lstlisting}
proc fact${}^{(1)}$(n) {
  if(n = 0) { r := 1 } 
  else { abort; r := r*n; }
}
\end{lstlisting}
The second approximation \recunfold{\keywordfont{fact}}{2} would give us:
\begin{lstlisting}
proc fact${}^{(2)}$(n) {
  if(n = 0) { r := 1 } 
  else { 
    if(n-1 = 0) { r := 1 } 
    else { abort; r := r*n; }; 
    r := r*n
  }
}
\end{lstlisting}
In general, we would write \recunfold{\keywordfont{fact}}{k} when $k > 0$ as:
\begin{lstlisting}
proc fact${}^{(k)}$(n) {
  if(n = 0) { r := 1 } 
  else { fact${}^{(k-1)}$(n-1); r := r*n; }
}
\end{lstlisting}
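The effect of the syntactic approximation can be simulated directly. In the following illustrative Python sketch, the \texttt{Abort} exception stands in for $\pabort$, which has no transitions:

```python
# Simulating the syntactic approximation fact^(k) (illustrative): after k
# levels of inlining, the remaining recursive call is replaced by abort.

class Abort(Exception):
    """Stands in for abort = ?false: the run reaches no final state."""

def fact_k(k, n):
    """fact^(k): fact with recursion unfolded k times."""
    if k == 0:
        raise Abort                       # fact^(0) is abort
    if n == 0:
        return 1
    return fact_k(k - 1, n - 1) * n       # fact^(k) calls fact^(k-1)

print(fact_k(3, 2))        # 2: three unfoldings suffice for n = 2
try:
    fact_k(2, 2)           # too few unfoldings: behaves like abort
    print("terminated")
except Abort:
    print("abort")
```

As the text observes, a call with input $n$ terminates in the approximation exactly when the unfolding depth exceeds $n$.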
Let's think about this in the context of concrete executions of \keywordfont{fact}. If we call $\keywordfont{fact}(n)$ with $n = 0$, then we can reason about its behavior by considering \recunfold{\keywordfont{fact}}{1}. The reason is that we will never reach the ``else'' branch, and return the correct answer $r = 1 = 0!$. So we know that:
\[
\iaccess[\recunfold{\keywordfont{fact}}{1}(n)]{\I} \subseteq \iaccess[\pcall{fact}{n}]{\I}
\]
Similarly, if we call $\pcall{fact}{n}$ with $n \le 1$, then we can reason about the behavior by considering \recunfold{\keywordfont{fact}}{1} and \recunfold{\keywordfont{fact}}{2}. Looking back at the second approximation of \keywordfont{fact} listed above, we see that $n$ will be decremented immediately in the ``else'' branch, and tested against 0 before the $\pabort$ is ever reached. So we know that:
\[
\iaccess[\recunfold{\keywordfont{fact}}{1}(n)]{\I} \cup \iaccess[\recunfold{\keywordfont{fact}}{2}(n)]{\I} \subseteq \iaccess[\pcall{fact}{n}]{\I}
\]
Continuing with this line of reasoning, the pattern is clear. If we want to account for the behavior of calling $\keywordfont{fact}(n)$ where $n$ is at most $k$, then we need to ensure that,
\[
\iaccess[\recunfold{\keywordfont{fact}}{1}(n)]{\I} \cup \cdots \cup \iaccess[\recunfold{\keywordfont{fact}}{k+1}(n)]{\I} \subseteq \iaccess[\pcall{fact}{n}]{\I}
\]
But we want to characterize the full semantics of \keywordfont{fact}, imposing no upper bound on the value of $n$ from which we call it. We can take our reasoning to its logical conclusion, and define the semantics of a procedure call as shown in Definition~\ref{def:procsemantics}.

\begin{definition}[Semantics of procedure calls (with recursion)]
\label{def:procsemantics}
The semantics of $\pcall{m}{\astrm_1,\ldots,\astrm_n}$ are as follows:
\begin{equation}
\label{eq:recsemantics}
\textstyle
\iaccess[\pcall{m}{\astrm_1,\ldots,\astrm_n}]{\I} = \{(\iget[state]{\I},\iget[state]{\It}) \with (\iget[state]{\I},\iget[state]{\It}) \in \bigcup_{k\ge 0}\iaccess[\recunfold{\keywordfont{m}}{k}(\astrm_1,\ldots,\astrm_n)]{\I}\}
\end{equation}
\end{definition}
Although we did not talk about the inclusion of $\recunfold{\keywordfont{m}}{0}$ on our way to this definition, note that because $\iaccess[\recunfold{\keywordfont{m}}{0}]{\I} = \emptyset$, its inclusion is harmless.

\section{Reasoning compositionally about procedure calls}

Now that we understand the meaning of procedure calls, we will derive reasoning principles for dealing with them in correctness proofs.
We will start with a rule that lets us make use of the familiar notion of contracts, and then see how to show that a contract holds for a given procedure.

\subsection{Contracts}
We could reason about a call to \keywordfont{m} by inlining its body. However, we often know more about procedures because we write contracts: precondition-postcondition pairs that specify requirements at the call site and guarantees about the state afterwards. If we assume a precondition $A$ and postcondition $B$ for \keywordfont{m}, then we can avoid having to prove things directly about the body $\asprg$, and instead just show that the contract gives us what we need.

\begin{theorem}
The contract procedure call rule \irref{callb} is sound by derivation.
\[
\cinferenceRule[callb|$\dibox{\text{call}}$]{call/contract expand}
{\linferenceRule[formula]
  {
    \lsequent[g]{\Gamma}{A}
    &\lsequent[g]{A}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{B}}
    &\lsequent[g]{B}{\ausfml}
  }
  {
    \lsequent[L]{}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
  }
}{}%
\]
\end{theorem}
\begin{proof}
We can derive this rule using \irref{Mb} and two applications of \irref{cut}, together with weakening. The principle follows from monotonicity, and is not specific to the program in the box being a procedure call.
The proof is as follows.
\begin{sequentdeduction}[array]
\linfer[cut+weakenl+weakenr] {
  \lsequent[L]{}{A}\ \ 
  !\linfer[cut+weakenl] {
    \lsequent[g]{A}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{B}}
    !\linfer[Mb] {
      \lsequent[g]{B}{\ausfml}
    } {
      \lsequent[g]{\dbox{\pcall{m}{x_1,\ldots,x_n}}{B}}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
    }
  } {
    \lsequent[g]{A}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
  }
} {
  \lsequent[L]{}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
}
\end{sequentdeduction}
\end{proof}
Note that because the proof of \irref{callb} follows from monotonicity, which also holds for diamond modalities, the corresponding total correctness rule is sound as well.
\begin{theorem}
The contract procedure call rule \irref{calld} is sound by derivation.
\[
\cinferenceRule[calld|$\didia{\text{call}}$]{call/contract expand}
{\linferenceRule[formula]
  {
    \lsequent[g]{\Gamma}{A}
    &\lsequent[g]{A}{\ddiamond{\pcall{m}{x_1,\ldots,x_n}}{B}}
    &\lsequent[g]{B}{\ausfml}
  }
  {
    \lsequent[L]{}{\ddiamond{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
  }
}{}%
\]
\end{theorem}

The rules \irref{callb} and \irref{calld} are convenient in practice because we can decide on the contract $A,B$ once and for all before using the procedure, construct a proof of $\lsequent{A}{\dbox{\alpha}B}$, and reuse that proof whenever we need to reason about a call to \keywordfont{m}. All that we need to do for each call is derive a proof that the calling context entails the precondition ($\lsequent[L]{}{A}$), and a corresponding proof that the postcondition gives the property we're after ($\lsequent{B}{\ausfml}$). This sort of compositionality lets us reuse past work, and is key to scaling verification to larger and more complex programs.

\subsection{Total correctness}

Now that we've seen how to reason compositionally about procedure calls, let's see how we can go about establishing a contract to begin with.
We'll do so in the general case that includes recursion, as non-recursive procedures can be dealt with through sufficient inlining as described above.

We should expect that in order to reason about recursive procedure calls, we will need to use induction. 
Just as when reasoning about loops we need to find an inductive loop invariant that allows us to conclude things about executions that may continue for arbitrarily many steps, when reasoning about procedures we typically need to find a contract that allows us to reason about the effects of calls, recursive or otherwise.

Recall from our discussion of loop convergence the \irref{varwhile} rule.
\[
\dinferenceRule[varwhile|var]{while loop variant}
{\linferenceRule[sequent]
  {\lsequent[L]{}{\inv}
  &\lsequent[G]{\inv,\ivr,\var=n}{\ddiamond{\ausprg}{(\inv\land \var<n)}}
  &\lsequent[G]{\inv,\ivr}{\var\geq0}
  &\lsequent[G]{\inv,\lnot\ivr}{\ausfml}
  }
  {\lsequent[L]{}{\ddiamond{\pwhile{\ivr}{\ausprg}}{\ausfml}}}
}{n~\text{fresh}}%{n\not\in\ivr,\ausprg}
\]
This rule requires that we select a term $\var$ whose value decreases with each iteration of the loop, and which stays nonnegative as long as the loop continues executing; here 0 is the lower bound towards which $\var$ converges. Because $\var$ decreases on every iteration but can never drop below this bound, the loop must terminate.
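The variant argument can be checked dynamically for any concrete run. Here is an illustrative Python sketch (the function and its variant $\var = b$ are choices of this sketch, not from the lecture) that asserts, on every iteration, exactly the premises of \irref{varwhile}: the variant is nonnegative while the guard holds, and each execution of the body strictly decreases it:

```python
# Runtime check of the loop-variant premises for Euclid's algorithm, with
# variant var = b (illustrative sketch).

def gcd_with_variant(a, b):
    assert a >= 0 and b >= 0     # precondition (plays the role of J)
    while b != 0:
        n = b                    # remember the variant's value: var = b = n
        assert n >= 0            # premise: J, Q |- var >= 0
        a, b = b, a % b          # loop body
        assert b < n             # premise: the body strictly decreases var
    return a

print(gcd_with_variant(24, 18))  # 6
```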

How can we use similar reasoning with a recursive procedure \keywordfont{m}? Intuitively, we want to associate a similar term $\var$ with each call to \keywordfont{m}, and argue that this term decreases each time \keywordfont{m} makes a recursive call. The rule \irref{recd} below captures this reasoning, where $\asprg$ is the body of \keywordfont{m} and $\bar{x}$ is the list of \keywordfont{m}'s formal arguments.
\[
\dinferenceRule[recd|$\didia{\text{rec}}$]{recursive termination}
{\linferenceRule[sequent]
  {
    \lsequent[L]{\forall \bar{x} . A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}B}{A \land \var = n \limply \ddiamond{\asprg}B}
    &\lsequent{A}{\var \ge 0}
  }
  {
    \lsequent[L]{}{A \limply \ddiamond{\pcall{m}{\bar{x}}}B}
  }
}{n~\text{fresh}}
\]
Intuitively, this rule says that if we want to conclude that \keywordfont{m} terminates in a state described by $B$ when starting in one described by $A$, then we reason about the body under the assumption that the variant term $\var$ has value $n$ when it begins executing. We are allowed to assume that recursive calls beginning in a state where $\var < n$ will terminate in one described by $B$. Importantly, we must also show that the variant $\var$ is bounded below by 0.

\begin{theorem}[\cite{Apt09}]
The rule \irref{recd} is sound. That is, if we have
\begin{equation}
\label{eq:recd1}
\imodels{}{(\forall \bar{x} . A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}B) \limply (A \land \var = n \limply \ddiamond{\alpha}B)}
\end{equation}
and
\begin{equation}
\label{eq:recd2}
\imodels{}{A\limply\var \ge 0}
\end{equation}
then it is the case that
\begin{equation}
\label{eq:recd3}
\imodels{}{A \limply \ddiamond{\pcall{m}{\bar{x}}}B}
\end{equation}
\end{theorem}
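The same idea can be checked dynamically for recursion. The following illustrative Python sketch uses $\var = x$, as in the \keywordfont{fact} example below: every recursive call must be made in a state where the variant is nonnegative and strictly smaller than it was on entry.

```python
# Runtime check of the recursion-variant premises of recd for factorial,
# with variant var = x (illustrative sketch).

def fact_checked(x, bound=None):
    assert x >= 0                        # premise: A |- var >= 0
    if bound is not None:
        assert x < bound                 # the recursive call decreased var
    if x == 0:
        return 1
    # recursive call, passing the entry value of the variant as the bound
    return fact_checked(x - 1, bound=x) * x

print(fact_checked(4))   # 24
```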
% \begin{proof}
% See~\cite{Apt09}.

% Note that because $n$ does not appear in $A$ or $\var$, we can conclude that $A \equiv \exists n . (A \land \var < n)$. So using the fact that $\imodels{}{A \limply \var \ge 0}$, we have $\exists n . (A \land \var < n) \limply \exists n\ge0 . (A \land \var < n)$. From this we see that it is sufficient to show that the following is implied by (\ref{eq:recd1}) and (\ref{eq:recd2}).
% \[
% \imodels{}{\exists n\ge0.A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}B}
% \]
% Furthermore, because $n$ does not appear in $\asprg$ or $B$, $\exists n\ge0 . (A \land \var < n) \limply \ddiamond{\pcallnoarg{m}}B$ is true if $A \land \var < n \limply \ddiamond{\pcallnoarg{m}}B$ is true for all $n \ge 0$. We then write our goal as follows:
% \[
% \imodels{}{A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}B}
% \]
% and proceed by induction on $n$.
% \begin{description}
% \item[Basis $n = 0$.] We want to show:
% \[
% \imodels{}{A \land \var < 0 \limply \ddiamond{\pcall{m}{\bar{x}}}B}
% \] 
% Fix an arbitrary state $\iget[state]{\I}$. We have that $\imodels{\I}{A \limply \var \ge 0}$, so $\inonmodels{\I}{A \land \var < 0}$. This gives us $\imodels{\I}{A \land \var < 0 \limply \ddiamond{\pcall{m}{\bar{x}}}B}$.

% \item[Induction step $n>0$.] We want to show:
% \[
% \imodels{}{A \land \var < n + 1 \limply \ddiamond{\pcall{m}{\bar{x}}}B}
% \]
% Fix an arbitrary state $\iget[state]{\I}$. The inductive hypothesis gives us that: 
% \[
% \imodels{\I}{A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}B}
% \] 
% From this and (\ref{eq:recd1}), we have that $\imodels{\I}{A \land \var = n \limply \ddiamond{\asprg}B}$. 
% But $\var < n+1 \equiv (\var < n \lor \var = n)$, so then $\imodels{\I}{A \land \var < n+1 \limply \ddiamond{\asprg}B}$. 
% This concludes the inductive case.
% \end{description}
% \end{proof}

\paragraph{Back to \keywordfont{fact}.}
Let's apply this to \keywordfont{fact} from before, and prove that it terminates. One minor annoyance is that \irref{recd} relies on a fresh variable $n$, which we also used in \keywordfont{fact}. We'll modify the program slightly to ensure that the steps of our proof line up with the symbols in \irref{recd}.
\begin{lstlisting}
proc fact(x) {
  if(x = 0) { r := 1 } 
  else { fact(x-1); r := r*x; }
}
\end{lstlisting}
We will establish the following contract.
\[
\begin{array}{ll}
A &\equiv x \ge 0 \\
B &\equiv r = x! \\
\end{array}
\]
In the proof, we'll use the following shorthand to keep the proof steps more concise:
\[
\asfml \equiv
\forall x. x\ge 0 \land \var < n \limply \ddiamond{\pcall{fact}{x}}r = x!
\]
We have only to decide what to use as a variant. Looking at the code, each time \keywordfont{fact} calls itself, the value of $x$ decreases from the value it had on entry to the procedure. So we will use $\var \equiv x$. The proof is then as follows, where $\asprg$ denotes the body of \keywordfont{fact}.
\begin{sequentdeduction}
\linfer[recd+implyr+andl] {
  \linfer[ifd+implyr] {
    \textcircled{a}\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 
    &\linfer[composed+assignd] {
      \lsequent{\asfml,x=n,x>0}{
      \ddiamond{\pcall{fact}{x-1}}r*x=x!}
    } {
      \lsequent{\asfml,x=n,x>0}{\ddiamond{\pcall{fact}{x-1};\pumod{r}{r*x}}r=x!}
    }
  } {
    \lsequent{\asfml,x\ge 0,x=n}{\ddiamond{\alpha}r=x!}
  }
  &\linfer[id]{\lclose}{\lsequent{x \ge 0}{x \ge 0}}
} {
  \lsequent{}{x \ge 0 \limply \ddiamond{\pcall{fact}{x}}{r=x!}}
}
\end{sequentdeduction}
The proof \textcircled{a} corresponds to the branch of the conditional where $x=0$.
Note that this is the base case of the recursive procedure, and follows easily as seen below:
\begin{sequentdeduction}
\linfer[assignd] {
  \linfer[qear]{\lclose}{\lsequent{\asfml,x=n,x=0}{1=x!}}
} {
  \lsequent{\asfml,x=n,x=0}{\ddiamond{\pumod{r}{1}}r=x!}
}
\end{sequentdeduction}
Now we continue with the other branch where the main proof left off, 
where we are ready to deal with the call to \keywordfont{fact}. 
Here is where the fact that we have universally quantified over $x$ will help, as we can instantiate $x$ with $x-1$ to reflect the fact that this is the value of the actual argument that the recursive call is given.
Note that we leave out the antecedent premise for the application of \irref{implyl}; it follows directly from the assumptions $x = n, x > 0$.
\begin{sequentdeduction}[array]
\linfer[alll] {
  \linfer[implyl] {
    \linfer[M] {
      \linfer[qear] {
        \lclose
      } {
        \lsequent{r = (x-1)!}{r*x=x!}
      }
    } {
      \lsequent{\ddiamond{\pcall{fact}{x-1}}{r = (x-1)!},x=n,x>0}{\ddiamond{\pcall{fact}{x-1}}r*x=x!}
    }
  } {
    \lsequent{x-1\ge 0 \land x-1 < n \limply \ddiamond{\pcall{fact}{x-1}}{r = (x-1)!},x=n,x>0}{\ddiamond{\pcall{fact}{x-1}}r*x=x!}
  }
} {
  \lsequent{\forall x. x\ge 0 \land x < n \limply \ddiamond{\pcall{fact}{x}}{r = x!},x=n,x>0}{\ddiamond{\pcall{fact}{x-1}}r*x=x!}
}
\end{sequentdeduction}
This completes the proof.

\section{Ghost state}

One thing to notice about the \irref{calld} and \irref{recd} rules is that they only apply to calls whose actual parameters are variables, not arbitrary terms.
So, for example, if we wanted to prove the following sequent, we could not use the \irref{recd} rule, because $a+b$ is not a variable.
\[
\lsequent{a \ge 0, b \ge 0}{\ddiamond{\pcall{fact}{a+b}}{r = (a+b)!}}
\]
What we need is a way of introducing the fact that when we call the factorial procedure, we set the formal argument $x$ to the term $a+b$, and also a way of remembering that we did this afterwards.

We will see how to do this by the use of ``ghost state'' in our proof.
Consider the following proof rule \irref{iassign}, together with its diamond counterpart \irref{iassignd}.
\[
\dinferenceRule[iassign|\text{IA}]{assignment introduction}
{\linferenceRule[sequent]
  {
    \lsequent[L]{}{\dibox{\pumod{y}{\astrm}}{\asfml}}
  }
  {
    \lsequent[L]{}{\asfml}
  }
}{y~\text{new}}
\]
\[
\dinferenceRule[iassignd|\text{IA}]{assignment introduction}
{\linferenceRule[sequent]
  {
    \lsequent[L]{}{\didia{\pumod{y}{\astrm}}{\asfml}}
  }
  {
    \lsequent[L]{}{\asfml}
  }
}{y~\text{new}}
\]
\irref{iassign} is essentially the assignment axiom in reverse: it introduces into the program a new assignment that was not present before. This rule is sound by the assignment axiom, which allows us to conclude that $\asfml \lbisubjunct \dibox{\pumod{y}{\astrm}}{\asfml}$ because $y$ is not mentioned in $\asfml$.

This rule allows us to introduce a new fresh variable into our proof that remembers a value at a particular point. Because the variable never existed in the program, but will affect the proof, we call it a \emph{ghost variable}. 
When using ghost variables, it is important to make sure that the proof maintains forward momentum. At this point in the semester, it may have become second nature to immediately apply \irref{assignb} whenever you see an assignment statement. This would be counterproductive with a ghost variable, as it would leave us right where we began.
\begin{sequentdeduction}
\linfer[iassign] {
  \linfer[assignb] {
    \lsequent[L]{}{\asfml}
  } {
    \lsequent[L]{}{\dibox{\pumod{y}{\astrm}}{\asfml}}
  }
} {
  \lsequent[L]{}{\asfml}
}
\end{sequentdeduction}
In order to move the proof forward after introducing a ghost variable, use the \irref{assignbeqr} rule that we introduced in previous lectures.
\[
\dinferenceRule[assignbeqr|$\dibox{:=}_=$]{}
{\linferenceRule[sequent]
  {
    \lsequent[L]{y=\astrm}{\asfml(y)}
  }
  {
    \lsequent[L]{}{\dibox{\pumod{x}{\astrm}}{\asfml(x)}}
  }
}{y~\text{new}}
\]
In fact, we can reduce the tedium of repeating these steps by stating a derived rule that combines them into one. The rule \irref{ighost} below does exactly this: it introduces a fresh variable $y$ into the context that remembers the value of a term $\astrm$.
\[
\dinferenceRule[ighost|\text{GI}]{}
{\linferenceRule[sequent]
  {
    \lsequent[L]{y=\astrm}{\asfml}
  }
  {
    \lsequent[L]{}{\asfml}
  }
}{y~\text{new}}
\]
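One way to derive \irref{ighost} from the rules above is to introduce the assignment with \irref{iassign} and then immediately discharge it with \irref{assignbeqr}. Because $y$ is fresh, $\asfml$ does not mention $y$, so substituting $y$ into the postcondition leaves it unchanged:
\begin{sequentdeduction}
\linfer[iassign] {
  \linfer[assignbeqr] {
    \lsequent[L]{y=\astrm}{\asfml}
  } {
    \lsequent[L]{}{\dibox{\pumod{y}{\astrm}}{\asfml}}
  }
} {
  \lsequent[L]{}{\asfml}
}
\end{sequentdeduction}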
Now let's go back to the factorial procedure and see \irref{ighost} in action.
\begin{sequentdeduction}[array]
\linfer[ighost] {
  \linfer[applyeqr] {
    \lsequent{a \ge 0, b \ge 0, x = a+b}{\ddiamond{\pcall{fact}{x}}{r = x!}}
  } {
    \lsequent{a \ge 0, b \ge 0, x = a+b}{\ddiamond{\pcall{fact}{a+b}}{r = (a+b)!}}
  }
} {
  \lsequent{a \ge 0, b \ge 0}{\ddiamond{\pcall{fact}{a+b}}{r = (a+b)!}}
}
\end{sequentdeduction}
Now the proof is approachable using \irref{calld}.
\begin{sequentdeduction}[array]
\linfer[calld] {
  \linfer[qear]{\lclose}{\lsequent{a \ge 0, b \ge 0, x = a+b}{x \ge 0}}
  !\lsequent{x \ge 0}{\ddiamond{\pcall{fact}{x}}{r = x!}}
  !\linfer[id]{\lclose}{\lsequent{r = x!}{r = x!}}
} {
  \lsequent{a \ge 0, b \ge 0, x = a+b}{\ddiamond{\pcall{fact}{x}}{r = x!}}
}
\end{sequentdeduction}
Of course, the middle premise is exactly what we proved earlier when demonstrating the use of \irref{recd}.
Notice, however, that if we had not applied \irref{applyeqr} to the $(a+b)$ in the postcondition immediately after \irref{ighost}, then we would not have been able to complete the rightmost premise.
The reason is that the context gained from \irref{ighost}, relating $x$ to $a+b$, is no longer available when reasoning about the post-state, because \irref{calld} is a special case of monotonicity.
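To see this concretely, had we applied \irref{calld} to the goal with $(a+b)!$ still in the postcondition, the rightmost premise would have been
\[
\lsequent{r = x!}{r = (a+b)!}
\]
which cannot be closed, since the assumption $x = a+b$ introduced by \irref{ighost} is no longer in the context.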
Often, using ghost state in a proof requires a bit of planning and foresight to set things up so that the proof will eventually close.

\subsection{Invariants over destructive updates}
Another common use of ghost variables is to deal with the fact that some programs destructively update variables whose original values are needed for contracts.
An example is any in-place sorting procedure, which must modify its input.
\begin{lstlisting}
    proc BubbleSort(a, n) {
      i := n-1;
      while(1 $\le$ i) {
        PushRight(a, i);
        i := i - 1;
      }
    }
\end{lstlisting}
Any complete specification must say not only that the resulting array is sorted, but also that it contains the same elements as the original array that was passed in.
We capture this in the contract below, whose postcondition $B$ uses the $\keywordfont{perm}$ predicate, which is true when its arguments are permutations of each other.
The ghost variable $c$ ``memorizes'' the original array $a$ (the precondition $A$ requires $\keywordfont{arrayeq}(a,c)$) so that it can be referred to in the postcondition.
\[
\begin{array}{ll}
A &\equiv \keywordfont{arrayeq}(a,c) \land 0 < n \\
B &\equiv \keywordfont{perm}(a,c) \land \keywordfont{sorted}(a,0,n)
\end{array}
\]
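To see the idea outside the proof calculus, here is an executable sketch of the same specification in Python (the names \lstinline{bubble_sort} and \lstinline{push_right}, and the use of runtime assertions in place of the contract, are our own illustration). The ghost copy \lstinline{c} is an ordinary variable that is never written after initialization and is only read in the final assertions:

```python
from collections import Counter

def push_right(a, i):
    # One bubbling pass: afterwards a[i] holds the largest of a[0..i].
    for j in range(i):
        if a[j] > a[j + 1]:
            a[j], a[j + 1] = a[j + 1], a[j]

def bubble_sort(a, n):
    c = list(a)  # ghost copy: remembers the original contents of a
    i = n - 1
    while 1 <= i:
        push_right(a, i)
        i = i - 1
    # Postcondition B: a is sorted on [0, n) and is a permutation of c.
    assert a == sorted(a)
    assert Counter(a) == Counter(c)
    return a
```

Deleting \lstinline{c} and both assertions leaves the program's behavior unchanged, which is exactly the defining property of ghost state.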
This is a useful and general tool to use in proofs, and it is not restricted to procedure contracts.
Consider the \keywordfont{PushRight} procedure itself.
\begin{lstlisting}
    proc PushRight(a, i) {
      j := 0;
      while(j < i) {
        if(a(j) > a(j+1)) {
          t := a(j+1);
          a(j+1) := a(j);
          a(j) := t;
        }
        j := j + 1;
      }
    }
\end{lstlisting}
This procedure increases the range for which $a$ is sorted by one element, and maintains along the way that the array is a permutation of its original contents.
A nice way to prove this is to utilize the fact that permutations are transitive.
\[
\forall x,y,z . \keywordfont{perm}(x,y) \land \keywordfont{perm}(y,z) \limply \keywordfont{perm}(x,z)
\]
Namely, we can maintain a loop invariant which says that $a$ is a permutation of the original (ghosted) array $c$:
\[
J \equiv \keywordfont{perm}(a,c)
\]
Then to show that the invariant is preserved, we introduce a new ghost variable $d$ to hold the contents of $a$ at the beginning of each loop iteration, and prove that the contents of $a$ at the end of the iteration are a permutation of $d$.
Transitivity does the rest.
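Spelled out: at the start of an iteration the invariant gives $\keywordfont{perm}(a,c)$, and ghosting $d$ with the current contents of $a$ yields $\keywordfont{perm}(d,c)$. After the body establishes $\keywordfont{perm}(a,d)$ for the updated $a$, instantiating the transitivity axiom with $x = a$, $y = d$, $z = c$ restores the invariant:
\[
\keywordfont{perm}(a,d) \land \keywordfont{perm}(d,c) \limply \keywordfont{perm}(a,c)
\]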

\section{Summary of today's rules}

\[
\cinferenceRule[callb|$\dibox{\text{call}}$]{call/contract expand}
{\linferenceRule[formula]
  {
    \lsequent[L]{}{A}
    &\lsequent[g]{A}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{B}}
    &\lsequent[g]{B}{\ausfml}
  }
  {
    \lsequent[L]{}{\dbox{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
  }
}{}%
\]

\[
\cinferenceRule[calld|$\didia{\text{call}}$]{call/contract expand}
{\linferenceRule[formula]
  {
    \lsequent[L]{}{A}
    &\lsequent[g]{A}{\ddiamond{\pcall{m}{x_1,\ldots,x_n}}{B}}
    &\lsequent[g]{B}{\ausfml}
  }
  {
    \lsequent[L]{}{\ddiamond{\pcall{m}{x_1,\ldots,x_n}}{\ausfml}}
  }
}{}%
\]

\[
\dinferenceRule[recd|$\didia{\text{rec}}$]{recursive termination}
{\linferenceRule[sequent]
  {
    \lsequent[L]{\forall \bar{x} . A \land \var < n \limply \ddiamond{\pcall{m}{\bar{x}}}{B}}{A \land \var = n \limply \ddiamond{\asprg}{B}}
    &\lsequent{A}{\var \ge 0}
  }
  {
    \lsequent[L]{}{A \limply \ddiamond{\pcall{m}{\bar{x}}}{B}}
  }
}{n~\text{fresh}}
\]

\[
\dinferenceRule[ighost|\text{GI}]{}
{\linferenceRule[sequent]
  {
    \lsequent[L]{y=\astrm}{\asfml}
  }
  {
    \lsequent[L]{}{\asfml}
  }
}{y~\text{new}}
\]

\bibliography{platzer,bibliography}
\end{document}