\section{Application model}
\label{sec:app_model}

Our model interactive application has a very simple main loop which
repeatedly gets the next message from the operating system and calls
the appropriate handler for the message.  Messages result from user
actions that occur aperiodically.  Message handlers compute display
and sound updates that serve as feedback to the user and determine his
subsequent actions.  The call to the message handler is the root node
of an activation tree whose structure is not fully known until after
it has been executed, but which depends on the handler, the message,
and the state of the program.  It is important to bound the execution
time of the tree from above, to maintain responsiveness, and from
below, since there is no benefit to executing faster than the user can
perceive.  Even for trees which cannot possibly be executed with
imperceptible delay, bounding their execution times makes the user's
waiting time predictable and consistent.
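As a concrete illustration, the loop just described can be sketched in
a few lines of Python.  The message kinds, handler names, and state
layout below are hypothetical, not taken from any particular windowing
system.

```python
from collections import namedtuple

Message = namedtuple("Message", ["kind", "payload"])

def main_loop(get_message, handlers, state):
    # Fetch each aperiodic message and dispatch to its handler; the
    # handler call is the root of an activation tree whose shape depends
    # on the handler, the message, and the program state.
    while True:
        msg = get_message()
        if msg.kind == "quit":
            break
        handlers[msg.kind](msg, state)

# Hypothetical usage: two queued messages, then quit.
queue = [Message("mouse_move", (3, 4)),
         Message("menu", "blur"),
         Message("quit", None)]
state = {"log": []}
handlers = {
    "mouse_move": lambda m, s: s["log"].append(("moved", m.payload)),
    "menu": lambda m, s: s["log"].append(("command", m.payload)),
}
main_loop(lambda: queue.pop(0), handlers, state)
```

In a real application `get_message` would block in the operating
system until the user acts; here a list stands in for the queue.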

%The  remainder  of this  section describes this  model  in more detail
%using examples from an image editing application and from an acoustic
%room modeling application.

\subsubsection*{Example applications}

We will use two interactive applications as examples in the following.
The first is an image editing application and the second is an acoustic
room modeling application.\footnote{To better understand these
applications, we implemented very simple versions which we believe are
similar to their full-featured counterparts (e.g., Adobe
Photoshop~\cite{PHOTOSHOP-MANUAL}) except for their feature sets.}
These applications serve as models for broader classes of
applications.  The room modeling application is a computer aided
design application based on a physical simulation and can also be seen
as a computational steering application.  Image editing is typical of
the large document editing applications common on personal computers.
As described in Section~\ref{sec:outline}, we intend to extend our
application set using an infrastructure to collect relevant traces
from off-the-shelf programs.  Ideally, we will be able to add existing
commercial Microsoft Windows applications to our application set.

Image editing gives the user tools to manipulate an in-memory visual
image as a whole and in part.  Some tools involve image-processing
operations such as boxcar convolution on large regions of the image,
while others involve emulating real-world tools such as pens, brushes,
or spray paint cans.  It is important to point out that image sizes
are growing rapidly: the resolutions supported by photographic film,
drum scanners, and digital cameras imply that truly vast (hundreds of
megabytes to gigabytes) images will need to be edited.  At the same
time, the functionality of image editing software, in terms of the
sophistication of image filters and how they can be applied, is
advancing rapidly.

Acoustic room modeling involves designing a human space using a CAD
tool and being able to hear what such a room will sound like from
different positions in it.  For a given room configuration (room
geometry and material properties, listener positions, and loudspeaker
positions), we compute an impulse response function for each
listener/loudspeaker pair.  For a particular listener/loudspeaker
pair, we convolve the music signal coming from the loudspeaker with
the impulse response function, giving the room-filtered signal the
listener would hear from that loudspeaker.  By doing this for each
loudspeaker and summing the resulting room-filtered signals, we
simulate what the listener would hear given the room
configuration.\footnote{Ignoring the listener's Head Related Transfer
Function, Doppler effects during movement, and other issues that are
beyond the scope of this document.  Interested parties should look
at~\cite{BEGAULT-3D-SOUND}.} The impulse responses are computed by
simulating the wave equation using a finite difference
method~\cite{SMITH-FINITE-DIFF}. In steady state, periodic convolution
and summation occurs to compute the sound output.  When the user
changes the room configuration by moving a loudspeaker, wall, or
himself, the (expensive) computation of the impulse responses is
repeated.
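The steady-state convolution-and-summation stage described above can
be sketched as follows.  This is a minimal, unoptimized Python sketch
(a real implementation would use FFT-based convolution); the function
names are ours, and signals and impulse responses are plain lists of
samples.

```python
def convolve(signal, impulse_response):
    # Discrete convolution: out[n] = sum over k of signal[k] * h[n - k].
    n_out = len(signal) + len(impulse_response) - 1
    out = [0.0] * n_out
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def listener_output(signals, impulse_responses):
    # Room-filter the signal from every loudspeaker (one impulse
    # response per loudspeaker/listener pair) and sum the results to
    # simulate what the listener hears.
    filtered = [convolve(s, h)
                for s, h in zip(signals, impulse_responses)]
    n = max(len(f) for f in filtered)
    return [sum(f[i] if i < len(f) else 0.0 for f in filtered)
            for i in range(n)]
```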

As an alternative to sound, the user can view the impulse response
functions directly, or can view the frequency response characteristics
of the room computed from them.  In this mode, the acoustic room
modeling application shares many characteristics with the design
optimization applications (for example, shape
optimization~\cite{malcevic97}) currently being discussed in the Quake
project~\cite{QUAKE-SUPERCOMPUTING-96,QUAKE-JOURNAL-98}.  In both
acoustic room modeling and design optimization, the user repeatedly
adjusts the model parameters (furniture position and composition; soil
and rock composition), simulates the physical system, and views the
results.  In the room modeling application, the user iteratively
adjusts the parameters to achieve a flat response, while in the Quake
case, the user is iteratively debugging his physical model of
geological structures and composition.  In both cases, the user may
also single-step the simulation to discover what is happening in
detail.


\subsubsection*{Aperiodic user actions generate aperiodic messages}

Most computation in an interactive application arises from aperiodic
user actions that result in aperiodic messages being delivered to the
program.  The aperiodicity is due to the variable ``think time''
humans need to decide their next action~\cite{ENDO-LATENCY-OSDI96}.
Consider a user using a brush tool to draw freehand in an image
editing application.  Every time he moves the mouse, the operating
system sends the application a message announcing that fact.  He may
slow down due to indecision, or stop, or release the mouse button,
ending his drawing.  Next, he may select a menu option to run a boxcar
convolution on the whole image, resulting in a menu message that
delivers the news, or he may pause for a second or an hour.  The
aperiodicity is even more overt in the acoustic room modeling
application, where the user is, in effect, meandering around the
simulated room.


\subsubsection*{Message handlers and activation trees}

In response to the arrival of a message, the application calls the
appropriate handler for the message.  The handler performs the necessary
computation, which may be very simple or very complex.  For
example, a message announcing mouse movement in the acoustic room
modeling application may simply mean that a line representing a room
wall needs to be moved, or it may require that all the impulse
response functions be recomputed.  

%
% Activation tree example here
%
\begin{figure}
\centerline{
\begin{tabular}{cc}
%\epsfxsize=2.5in
%\epsfbox{chroom.eps} &
\epsfxsize=3in
\epsfbox{listmove.eps} &
\epsfxsize=3in
\epsfbox{stepsim.eps} \\
(a) Listener Moved & (b) Step Simulation \\
\end{tabular}
}
\caption{Activation trees from acoustic room modeling application}
\label{fig:room_model_trees}
\end{figure}

The call to the message handler is the root node of an activation
tree.  The structure of the tree depends on the message and program
state.  In general, the structure is unknown until after it has been
executed.  Figure~\ref{fig:room_model_trees} shows two activation
trees from the acoustic room modeling program.  To the left of each
edge is the amount of data the called procedure needs, while to
the right is the amount of data it produces.  In
Figure~\ref{fig:room_model_trees}(a) a listener has changed his
position, and thus the impulse responses from each of the sound
sources to that listener must be recomputed.  For each source/listener
pair, the simulation is stepped a sufficient number of times to
propagate the initial impulse to the listener, including several
reflections.  The series of pressure values at the listener position
is extracted to form the impulse response, which is then displayed.
In Figure~\ref{fig:room_model_trees}(b), the underlying simulation
itself is being stepped and visualized to aid in debugging.  After
every $k$ steps, the grid of all the pressure values in the room is
displayed.
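The stepping at the leaves of these trees can be illustrated with a
one-dimensional analogue of the finite-difference scheme.  The actual
application simulates the 3-D wave equation with a material property
at every grid point, so the sketch below is only a simplified
stand-in; the coefficient and grid size are illustrative assumptions.

```python
def step(prev, prev2, r=0.25):
    # One explicit finite-difference step of the 1-D wave equation:
    #   u_t[i] = 2*u_{t-1}[i] - u_{t-2}[i]
    #            + r*(u_{t-1}[i+1] - 2*u_{t-1}[i] + u_{t-1}[i-1])
    # where r folds together the wave speed and time/space step sizes
    # (r <= 1 for stability).  Boundaries are held at zero pressure.
    n = len(prev)
    nxt = [0.0] * n
    for i in range(1, n - 1):
        nxt[i] = (2 * prev[i] - prev2[i]
                  + r * (prev[i + 1] - 2 * prev[i] + prev[i - 1]))
    return nxt

def impulse_response_at(listener, n_steps, n=16):
    # Inject a unit impulse near one end of the grid, step the
    # simulation, and record the pressure series at the listener cell:
    # that series is the impulse response.
    prev2 = [0.0] * n
    prev = [0.0] * n
    prev[1] = 1.0
    series = []
    for _ in range(n_steps):
        cur = step(prev, prev2)
        series.append(cur[listener])
        prev2, prev = prev, cur
    return series
```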

\subsubsection*{Feedback}

The message handler ultimately generates feedback to the user in the
form of display updates or sound.  This feedback helps the user
determine what his next action will be and when he will undertake it.
On a large time scale this is clear: the user of an image editor will
not apply another filter until he has seen the effects of the one he
just applied.  On smaller time scales it can be even more important.
Consider the same user drawing freehand.  This results in a stream of
mouse movement messages, and the handler changes a few pixels for each
message.  Without these few changed pixels, the user is drawing blind.

\subsubsection*{Responsiveness}
%\label{sec:response}

Achieving responsiveness in an interactive application amounts to
providing timely, but not needlessly early, feedback to individual
user actions.  If the feedback arrives too late, or there is too much
jitter across a series of similar actions, the utility of the program
is degraded, perhaps severely.  For example, high-jitter or
long-latency feedback makes freehand drawing impossible in an image
editor.

Although we require fast feedback, there is no benefit to it arriving
too early, since humans can only perceive rapidity up to a point.  It
makes no difference to the user of an image editor whether it responds
to his freehand drawing in 5 milliseconds or 50, but the difference
between 50 and 150 is the difference between usability and
uselessness.  

Beyond this threshold of human perceptibility, there are other similar
thresholds of annoyance in waiting time and ``sweet spots'' between
them.  To the user, responsiveness means knowing in which sweet spot
the wait for an action will fall and that the wait for similar actions
will always be in the same sweet spot (jitter will be low).
Intuitively, the user is annoyed at having to wait, and so wants to do
something with that time.  When the waiting time crosses a threshold
of annoyance, he decides to do something else during the wait.  If he
knows which sweet spot the wait is in, he can meaningfully decide what
to do during the wait.  For example, you may wait and read a magazine
during a 10-30 minute oil change, but you'll probably leave your car
and go to work during a 100-300 minute tune-up.  If the mechanic tells
you it will take 10-30 minutes, but it takes 100-300, you will be very
annoyed.  Conversely, if he says 100-300 and it really takes 10-30,
you will get no benefit.

Similarly, applying a filter to a realistically sized image or recomputing
impulse response functions in a reasonably sized room will necessarily
take a perceptible amount of time, but the difference between waiting
10-30 seconds or 100-300 seconds is the difference between waiting
patiently and getting a cup of coffee.  Another example is loading a
web page from a remote site, which again necessarily takes a
perceptible amount of time, yet still has a sweet spot.

% Figure
%\ref{fig:sweet_spots} illustrates potentially interesting sweet spots.

%\begin{figure}
%\label{fig:sweet_spots}
%\caption{``Sweet spots''}
%\end{figure}

What we really want is to bound the execution time of a call to a
handler from above and below in order to fit it into the sweet spot
the user or programmer expects.  Of course, if we achieve the upper
bound, the lower bound can always be achieved by inserting artificial
delay.  However, note that minimizing the actual execution time and
then adding delay is much more difficult than simply trying to meet
the bounds.
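The lower bound via artificial delay can be sketched directly.  This
is an illustrative Python fragment, not the mechanism we propose for
meeting the (much harder) upper bound; the function name and window
parameters are hypothetical.

```python
import time

def run_in_window(handler, lower, upper):
    # Run the handler, then pad with artificial delay so the user's
    # total wait never undershoots the lower edge of the expected
    # sweet spot.  Meeting the upper bound is the hard part and is
    # not enforced here; `upper` is recorded only for the caller.
    start = time.monotonic()
    result = handler()
    elapsed = time.monotonic() - start
    if elapsed < lower:
        time.sleep(lower - elapsed)   # artificial delay
    return result, max(elapsed, lower)
```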

\subsubsection*{Tradeoffs and discussion}

Stepping the acoustic room modeling simulation by $k$ steps, as shown
in Figure~\ref{fig:room_model_trees}(b), is a good example to
illustrate the tradeoffs involved in remote execution of calls.
Consider a system with two hosts: $A$, which is the user's workstation
and computes at 25 MFLOP/s, and $B$, which is a shared compute server
which computes at 50 MFLOP/s.  The two machines are connected by a
10 MB/s network.  We begin executing on $A$ and must decide whether
to execute Forward($k$) on $A$ or on $B$.  Let us suppose for
the moment that the whole subtree rooted at Forward($k$) will execute
on the same host as the root node.  The problem size $N$ depends on the
size of the space to be simulated and the peak frequency we want to
resolve.  Omitting the derivation for space reasons, for a cube 4
meters on a side and a frequency of 3000 Hz (telephone quality), the
problem size is $N=2{,}744{,}000$, where each of the $N$ values gridding
the cube is a double (8 bytes) representing the pressure at
a point in the space.  For the purposes of this discussion, we also
associate one scalar material property with each point.

A simulation step involves computing the $t$th set of $N$ output
values from the $(t-1)$th and $(t-2)$th sets and the material property
set, using 26 floating point operations per point.  On $A$, a step can
be computed in $26N/25$~MFLOP/s $\approx 2.85$ seconds, while on $B$
computing a step takes only $26N/50$~MFLOP/s $\approx 1.43$ seconds, but we
must first transfer $3N$ doubles (the $t-1$ and $t-2$ pressure values,
and the material property values) to $B$ and then transfer $3N$
doubles (the last three pressure value sets) back to $A$ to be
displayed.  This transfer requires $8\times 6N / 10$~MB/s $\approx 13.17$ seconds.
If the number of steps $k$ is less than $\lceil 13.17/1.43 \rceil =
10$, it is faster to perform the computation on $A$; otherwise it is
faster to do it on $B$.  Notice that when stepping the simulation for
debugging, $k$ is likely to be small, perhaps even one.  On the other
hand, to recompute the impulse responses due to a listener moving
(Figure~\ref{fig:room_model_trees}(a)), we need to perform at least
$N^{1/3} = 140$ steps, and probably several times that number in order
to capture several reflections.
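The breakeven arithmetic above can be packaged as a small cost model.
The function below simply restates the derivation in code, with every
parameter passed explicitly; the function and parameter names are
ours.

```python
import math

def breakeven_steps(n, flops_per_point, local_mflops, remote_mflops,
                    doubles_each_way, net_mb_per_s):
    # Per-step compute time on each host, in seconds.
    t_local = flops_per_point * n / (local_mflops * 1e6)
    t_remote = flops_per_point * n / (remote_mflops * 1e6)
    # One-time transfer cost: doubles_each_way * n doubles out, plus
    # the same amount back (8 bytes per double).
    t_xfer = 8 * 2 * doubles_each_way * n / (net_mb_per_s * 1e6)
    # Remote execution wins once k steps of per-step savings cover
    # the transfer cost.
    return math.ceil(t_xfer / (t_local - t_remote))

# The numbers from the text: N = 140^3 points, 26 FLOP/point,
# 25 vs. 50 MFLOP/s, 3N doubles each way, 10 MB/s network.
k = breakeven_steps(140**3, 26, 25, 50, 3, 10)
```

Varying the remote speed or network bandwidth in this model shows how
load on $B$ or network contention moves (or eliminates) the tradeoff
point, as discussed next.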

Since $B$ is a shared machine, we may not always get the full 50
MFLOP/s from it, so we must take the future load on $B$ into account
when we make our decision.  For example, if other jobs put a unity
load on $B$, then there is no tradeoff point.  Similarly, the actual
bandwidth available on the network may vary, again changing the
tradeoff point. It is also important to point out that with a large
number of steps, an incorrect decision on where to execute
Forward($k$) or SourceListenerResponse($i$,$j$) will be felt for a
long time.  However, if we allow ourselves to map individual Step()s
(or Block()s in a more fine-grained environment), we can potentially
recover better from bad decisions.




