The Whyline
The central idea of the Whyline is that it allows programmers
to ask questions about their program's failures in terms of their
program's output. The Whyline's design was heavily motivated by
our formative studies of programmers' debugging
strategies. To illustrate its use, consider this scenario (which
comes from a real user study).
Ellen is creating a Pac-Man game, and trying to
make Pac shrink when the ghost is chasing and touches Pac. She
plays the world and makes Pac collide with the ghost, but to her
surprise, Pac does not shrink...
Pac did not shrink because Ellen (a pseudonym) has code that prevents
Pac from resizing after the big dot is eaten. Either Ellen did
not notice that Pac ate the big dot, or she forgot about the dependency.
When Ellen played the world, the Alice programming environment hid the code and expanded
the worldview and property panel, as seen in Figure 1. This relates
property values to program output. Ellen presses the why button
after noticing that Pac did not shrink, and a menu appears with
the items why did and why didn't, as in Figure 2. The submenus
contain the objects in the world that were or could have been affected.
The menu supports exploration and diagnosis by increasing the visibility
of possible questions and decreasing the viscosity of considering them.
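As a rough illustration of how such a menu might be assembled, the sketch below groups candidate questions by question type and then by object. It is not the Whyline's implementation: the function name, the (object, change) pair format, and the choice to list every output the code could produce under why didn't are all assumptions made for this example.

```python
from collections import defaultdict


def build_question_menu(possible_outputs, executed_outputs):
    """Group candidate questions by question type, then by object.

    Both arguments are lists of (object_name, change) pairs (hypothetical
    format). `possible_outputs` holds every output the code could produce;
    `executed_outputs` holds the outputs that actually occurred at runtime.
    """
    def group(pairs):
        by_object = defaultdict(list)
        for obj, change in pairs:
            if change not in by_object[obj]:  # skip duplicate entries
                by_object[obj].append(change)
        return dict(by_object)

    return {
        "why did": group(executed_outputs),
        "why didn't": group(possible_outputs),
    }


# Hypothetical outputs for the Pac-Man world.
possible = [("Pac", "resize 0.5"), ("Pac", "move forward"),
            ("Ghost", "move toward Pac")]
executed = [("Pac", "move forward"), ("Ghost", "move toward Pac")]

menu = build_question_menu(possible, executed)
for question, objects in menu.items():
    for obj, changes in objects.items():
        print(question, obj, changes)
```

Here the question why didn't Pac resize .5? is available because the resize appears in the code, even though it never ran.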
Because Ellen expected Pac to resize after touching the ghost,
she selects why didn't and scans the property changes and animations
that could have happened. When she hovers the mouse over a menu
item, the code that caused the output in question is highlighted
and centered in the code area (see Figure 2). This supports diagnosis
by exposing hidden dependencies between the failure and the code
that might be responsible for it. This also avoids premature commitment
in diagnosis by showing the subject of the question without requiring
that the question be asked.
Ellen asks why didn't Pac resize .5? and the camera focuses on
Pac to increase his visibility. The Whyline answers the question
by analyzing the runtime actions that did and did not happen, and
provides the answer shown in Figure 3. The actions included are
only those that prevented Pac from resizing: the predicate whose
expression was false and the actions that defined the properties
used by the expression. By excluding unrelated actions, we support
observation and hypothesizing by increasing the visibility of the
actions that likely contain the fault. To support diagnosis, the
names and colors are the same as the code that caused them. This
improves consistency and closeness of mapping with code.
The arrows represent data and control flow causality. Predicate
arrows are labeled true or false and dataflow arrows are labeled
with the data used by the action they point to. The arrows support
progressive evaluation, and thus hypothesizing, by helping Ellen
follow the runtime's computation and control flow.
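The shape of such an answer can be sketched in a few lines. The example below is a simplified illustration, not the Whyline's actual analysis: the Action record, the trace format, the why_didnt function, and the guard expression "if not BigDot.isEaten" are all hypothetical. It keeps only the false predicate guarding the missing action and the assignments that defined the properties its expression used, and labels each arrow as described above.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Action:
    """One recorded runtime action (hypothetical trace record)."""
    time: float
    kind: str                      # "assign" or "predicate"
    label: str                     # text shown for the action
    writes: Optional[str] = None   # property an assignment defines
    reads: Tuple[str, ...] = ()    # properties a predicate's expression uses
    value: Any = None              # assigned value, or the predicate's result


def why_didnt(trace: List[Action], guard_label: str, missing: str):
    """Explain why `missing` never ran: return (actions, arrows)."""
    false_guards = [a for a in trace
                    if a.kind == "predicate" and a.label == guard_label
                    and a.value is False]
    if not false_guards:
        return [], []
    guard = max(false_guards, key=lambda a: a.time)  # most recent failure

    actions, arrows = [guard], []
    for prop in guard.reads:
        # Latest assignment to this property before the guard was evaluated.
        defs = [a for a in trace
                if a.kind == "assign" and a.writes == prop
                and a.time <= guard.time]
        if defs:
            last_def = max(defs, key=lambda a: a.time)
            actions.append(last_def)
            # Dataflow arrow, labeled with the data the predicate used.
            arrows.append((last_def.label, guard.label,
                           f"{prop} = {last_def.value}"))
    # Control-flow arrow, labeled with the predicate's outcome.
    arrows.append((guard.label, missing, "false"))
    return sorted(actions, key=lambda a: a.time), arrows


trace = [
    Action(1.0, "assign", "Pac.size set to 1.0",
           writes="Pac.size", value=1.0),
    Action(2.0, "assign", "BigDot.isEaten set to True",
           writes="BigDot.isEaten", value=True),
    Action(3.0, "predicate", "if not BigDot.isEaten",
           reads=("BigDot.isEaten",), value=False),
]

actions, arrows = why_didnt(trace, "if not BigDot.isEaten", "Pac resize 0.5")
for source, target, label in arrows:
    print(f"{source} --[{label}]--> {target}")
```

The unrelated assignment to Pac.size is left out of the answer, which mirrors the idea of excluding actions that did not prevent the resize.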
Along the x-axis is event-relative time, improving the closeness
of mapping to the time-based Alice runtime system. Along the y-axis
are event threads: this allows co-occurring events to be shown,
supporting juxtaposibility.
Ellen interacts with the timeline by dragging the time cursor
(the vertical black line in Figure 3). Doing so changes all properties
to their values at the time represented by the time cursor's location.
This supports exploration of runtime data. When Ellen moves the
cursor over an action, the action and the code that caused it become
selected, supporting diagnosis and repair. These features allow
Ellen to rewind, fast-forward, and scrub the execution history, receiving
immediate feedback about the state of the world. This exposes hidden
dependencies between actions and data that might not be shown directly
on the Whyline, and between current values and program output.
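One simple way to support this kind of time cursor is to log every property change with a timestamp and look up the latest change at or before the cursor. The sketch below assumes that approach; the PropertyHistory class and its method names are hypothetical and are not taken from Alice or the Whyline.

```python
from bisect import bisect_right


class PropertyHistory:
    """Log of property changes that can be queried at any past time."""

    def __init__(self):
        self._times = {}   # property name -> change times, in order
        self._values = {}  # property name -> values, in the same order

    def record(self, time, prop, value):
        # Assumes changes arrive in nondecreasing time order per property.
        self._times.setdefault(prop, []).append(time)
        self._values.setdefault(prop, []).append(value)

    def value_at(self, prop, cursor_time):
        """Value of `prop` at `cursor_time`, or None if not yet set."""
        times = self._times.get(prop, [])
        i = bisect_right(times, cursor_time)  # changes at or before cursor
        return self._values[prop][i - 1] if i else None

    def snapshot(self, cursor_time):
        """All property values as of the time cursor's position."""
        return {p: self.value_at(p, cursor_time) for p in self._times}


history = PropertyHistory()
history.record(0.0, "Pac.size", 1.0)
history.record(0.0, "BigDot.isEaten", False)
history.record(2.0, "BigDot.isEaten", True)

print(history.snapshot(1.0))  # state before the big dot is eaten
print(history.snapshot(2.5))  # state after the big dot is eaten
```

Dragging the cursor then amounts to calling snapshot with a new time and redrawing the world from the returned values.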
To reduce the viscosity of exploration, Ellen can double-click
on an action to implicitly ask what caused this to happen? and
actions causing the runtime action are revealed. Ellen can also
hover her mouse cursor over expressions in the code to see current
values and to evaluate expressions based on the current time. This
improves the visibility of runtime data and supports progressive
evaluation. Finally, the Whyline supports provisionality by making
previous answers available through the Questions I've Asked button.
The button prevents the hard mental operation of recalling facts
determined earlier in debugging activity.
"So this says Pac didn't resize because BigDot.isEaten
is true. Oh! The ghost isn't chasing because Pac ate the big dot.
Let's try again without getting the big dot."
Without the Whyline, the misperception could have led to an unnecessary
search for non-existent errors. In fact, in numerous user tests without the
Whyline, users frequently did just this.
Validation of Effectiveness of the Whyline
In comparing equivalent debugging scenarios between user tests
with and without the Whyline, we have shown that the Whyline reduced
programmers' average debugging time by a factor of 7.8. Furthermore,
the Whyline helped programmers complete 40% more of their task
than without the Whyline. We are in the process of refining the
Whyline and performing more formal investigations of the Whyline's
effectiveness.