Proposal to Participate in the CHI'99
Workshop on End-User Programming and Blended-User Programming
February 26, 1999
Brad A. Myers
Human Computer Interaction Institute
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3891
+1 412 268 5150
FAX: +1 412 268 1266
bam+@cs.cmu.edu
http://www.cs.cmu.edu/~bam
John F. Pane
Computer Science Department
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3891
+1 412 268 8078
FAX: +1 412 268 1266
pane+chi99@cs.cmu.edu
http://www.cs.cmu.edu/~pane
Studying End-User Programming
The call for this workshop points out that there has been some study
of the characteristics and needs of "end-user programmers, who do some
programming in addition to their other job duties." Yet the designs
of most of today's end-user programming languages seem to have been based
more on the intuitions of the designers, consistency with earlier languages,
and technical issues such as efficiency or ease of compilation than on such
study. It is our
position that end-user programming languages should be designed with more
attention to how end-users think and to what is most natural for them.
In the Natural Programming Project, we are using a three-step process to
create a sounder basis for language design for end-users. First, we surveyed
the existing literature in Empirical Studies of Programmers (ESP) and the
Psychology of Programming to find out what is known about novice programmers,
and wrote a report that can be used to guide the design of new programming
systems.
Second, we have performed a series of studies to investigate how non-programmers
think about and express their solutions to programming problems.
Finally, we have started to build a series of programming languages and
environments based on these findings.
Of course, these programming environments will take advantage of modern
techniques that are known to make programming easier. These will
include the techniques described in the call for this workshop, such
as "build[ing] interfaces by checking boxes and browsing applets in complex
development environments." In addition, the raw computing power that is
now available on the desktop provides an opportunity to trade off some
computational efficiency in order to improve the efficiency of the programmer.
While our research has thus far focused on novices, we believe that
the techniques we are investigating are applicable to all languages aimed
at people who are not professional programmers, including the group
identified in the call as "blended-user programmers." Some of the people
in our studies did have some experience with programming, and we found
that the ways they expressed algorithms were not vastly different from
those of the non-programmers.
Studies: The Language and Structure in Non-Programmers' Solutions to Programming
Problems
We recently conducted a pair
of studies of the natural language and structure used by non-programmers
in solutions to a set of problems that are essentially programming tasks.
When the results are compared with the language and structure that are
required by popular modern programming languages, some interesting contrasts
emerge. We believe that using these results to guide the design
of future programming languages can make programming more accessible to
beginners and end-users, because the language would be a better fit to
the programmer's natural abilities, and thus would be easier to learn and
use.
The studies asked people to describe in their own words how they would
express algorithms. The first study was conducted with 14 fifth graders
at East Hills International Studies Academy (a public K-5 school) in Pittsburgh.
The children were evenly divided by gender and were racially diverse. They
were asked to describe how they would make PacMan move about the screen,
eating dots and killing or being killed by monsters. A real risk in designing
a study like this is that the experimenter could bias the subjects by the
language used in asking the questions. For example, the experimenter cannot
just ask: "How would you tell the monsters to turn blue when the PacMan
eats a power pill?" because this may lead the participants to simply parrot
the question back. Therefore, the participants were shown depictions of
the scenarios and asked to write down using their own words or diagrams
how they would instruct the computer to implement the actions shown. This
enabled the experimenter to show the images and ask general questions to
prompt the participants for their responses. As responses, the participants
could both draw pictures and write text on the unlined blank paper that
we supplied. This allowed us to examine whether the participants preferred
linguistic (textual) or graphical notations.
To analyze the data, we gave the participants' responses to five people
who are not affiliated with the project, and asked them to classify what
they saw in the answers. These raters were all programmers, and were paid
to participate. Among the observations from this study are:
-
Much of the control (54% of all utterances) was expressed in an "event
language" (also called the "production language") style, with rules to
control behaviors. For example: "When PacMan eats all the dots, he goes
to the next level." This result is already reflected in some of today's
end-user programming languages. The event-based style used by Visual Basic,
Lingo for Director, and HyperTalk for HyperCard is a form of rule-based
style, since the code has the form "if this event happens, then execute
this code."
-
Iterations were usually expressed implicitly, by operating on sets of objects.
In fact, 95% of the participants' utterances about multiple objects used
a set/subset specification. For example, "When PacMan eats all of the yellow
balls he goes to the next level." The participants did not use any form of
iteration or explicit counting, as would be required in most programming
languages.
-
When there were multiple options for a conditional, the most frequent construction
was a set of mutually exclusive rules, which appeared 37% of the time (for
example: "When the monster is green he can kill PacMan. When the monster
is blue PacMan can eat the monster"). The next most popular construction
was to specify a general condition that was subsequently modified with
exceptions, which appeared 27% of the time (for example: "When you encounter
a ghost the ghost should kill you. But if you get a power pill you can
eat them"). This is in contrast to conventional languages that generally
require the conditional to be set up in advance using "ANDs," "NOTs" and
"ORs," forcing the user to think about all the cases first, and resulting
in a complicated Boolean expression.
-
The students expected objects to be moving as their normal behavior, and
wrote commands that would alter the motion (97% of the utterances). For
example, "If PacMan hits a wall, he stops." This is in contrast to some
conventional languages and environments where to make something move requires
setting its position at each clock timer tick.
-
Since the graphical entities were expected to have their own continuous
movements and behaviors, this suggests the students expected them to have
"object"-like behavior. To further investigate this question, we asked
the raters to look for expressions that suggested an object-oriented perspective,
which they identified in 61% of the utterances.
-
When inserting items, most subjects (74%) treated the data structure as
a list, and just inserted the new item without making room first (as would
be required with an array). To sort the items, the subjects usually inserted
the new item in an indeterminate place, and then specified the sort operation
afterwards.
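To make the first two observations concrete, here is a minimal Python sketch. It is our illustration only: the names Game, when, and run_rules are hypothetical and are not taken from any system described in this proposal. It contrasts an explicit counting loop with a set-level condition attached to an event-style rule.

```python
from dataclasses import dataclass, field


@dataclass
class Game:
    """Toy game state; every field here is an illustrative assumption."""
    dots: set = field(default_factory=set)   # positions of uneaten dots
    level: int = 1


def all_dots_eaten_loop(dot_flags):
    """Conventional style: iterate over the dots and count explicitly."""
    remaining = 0
    for eaten in dot_flags:
        if not eaten:
            remaining += 1
    return remaining == 0


RULES = []  # registered (condition, action) pairs


def when(condition):
    """Register an event-style rule of the form 'when <condition>, do <action>'."""
    def register(action):
        RULES.append((condition, action))
        return action
    return register


# "When PacMan eats all the dots, he goes to the next level."
@when(lambda game: not game.dots)            # test on the set as a whole
def go_to_next_level(game):
    game.level += 1
    game.dots = {(0, 0), (0, 1)}             # hypothetical layout for the new level


def run_rules(game):
    """Fire every rule whose condition currently holds."""
    for condition, action in RULES:
        if condition(game):
            action(game)
```

In this sketch the rule reads almost like the participant's sentence, and the only loop left is inside run_rules, which the rule author never writes.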
Many researchers have identified control structures as a common area of
difficulty for novice programmers [Hoc 1990]. It is interesting that many
of the strategies that the subjects used, as noted above, serve to eliminate
control structures by making loops and conditionals implicit. This provides
further evidence that a new language that supports these natural
tendencies may be easier to learn.
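As one illustration of how a conditional can be made implicit, the Python sketch below contrasts a single up-front Boolean expression with the "general rule, then exceptions" form that many participants used. The Rule class and all other names are hypothetical; this is a sketch under those assumptions, not a description of an existing language.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def outcome_boolean(touching_ghost: bool, has_power_pill: bool) -> Optional[str]:
    """Conventional style: every case is spelled out with AND / NOT in advance."""
    if touching_ghost and not has_power_pill:
        return "ghost kills PacMan"
    if touching_ghost and has_power_pill:
        return "PacMan eats ghost"
    return None


@dataclass
class Rule:
    """A general rule whose result can be overridden by later exceptions."""
    condition: Callable[[dict], bool]
    result: str
    exceptions: List["Rule"] = field(default_factory=list)

    def evaluate(self, state: dict) -> Optional[str]:
        if not self.condition(state):
            return None                      # the rule does not apply at all
        for exception in self.exceptions:
            overridden = exception.evaluate(state)
            if overridden is not None:
                return overridden            # an exception takes precedence
        return self.result


# "When you encounter a ghost the ghost should kill you.
#  But if you get a power pill you can eat them."
ghost_rule = Rule(
    condition=lambda s: s["touching_ghost"],
    result="ghost kills PacMan",
    exceptions=[
        Rule(condition=lambda s: s["has_power_pill"], result="PacMan eats ghost"),
    ],
)
```

Both forms give the same answers, but in the second one the author states the general case first and adds the exception afterwards, without writing a combined Boolean expression.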
We next performed a follow-on study using database access with both
children and adults, and both programmers and non-programmers. The goal was
to investigate how well the observations generalize to other domains and
to other populations. We again showed the participants pictures to avoid
biasing the answers. This time the pictures were of the database tables
before and after various operations, and we asked them to write how the
computer should carry out the operations. This study was administered to
19 adults with various levels of programming ability, and to 21 fifth-grade
children, four of whom had programmed before. The analysis of this second
study is still ongoing, but we do have some preliminary results:
-
96.5% of the time, multiple objects were handled by operating on the set
as a whole, rather than iterating through the individual elements, which
is consistent with our first study.
-
Also, as in the first study, subjects did not construct complex conditionals
using ANDs, ORs, and NOTs. Instead, they would express independent conditions
(as in "Black is for G and L. Gold is for B, C, H, J, and S") or a general
case first and exceptions afterwards.
-
Most mathematical operations were expressed in a natural language style,
such as "Add 10,000 points to the scores in Round 1 and Round 3" rather
than a mathematical style ("score + 10000") or a programming language style
("score = score + 10000"). However, this natural language style appeared
to lead to more errors in the specification, such as failing to handle
boundary cases in ranges.
-
The subjects used the words AND, OR and THEN in various ways, often inconsistently,
and usually in ways that would not work in a conventional programming language.
For instance, AND often means "then," as in "Cross out the highest score,
and add the lower scores."
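The gap between these phrasings and conventional code can be sketched in Python as follows. The table layout and the function names are assumptions made for illustration; they do not reproduce the study materials. The example takes the participant sentence "Add 10,000 points to the scores in Round 1 and Round 3" and writes it in both styles.

```python
def sample_table():
    """Hypothetical score table, invented for this example."""
    return [
        {"player": "A", "round": 1, "score": 500},
        {"player": "B", "round": 2, "score": 700},
        {"player": "C", "round": 3, "score": 900},
    ]


def add_points_loop(rows, points):
    """Conventional style: index loop, Boolean test, explicit assignment.

    Note that the English "Round 1 and Round 3" has to become `or` here.
    """
    for i in range(len(rows)):
        if rows[i]["round"] == 1 or rows[i]["round"] == 3:
            rows[i]["score"] = rows[i]["score"] + points


def add_points_to(rows, rounds, points):
    """Aggregate style: name the subset, then apply one operation to it."""
    selected = [row for row in rows if row["round"] in rounds]
    for row in selected:
        row["score"] += points


table = sample_table()
add_points_to(table, rounds={1, 3}, points=10_000)
```

The call on the last line stays close to the original sentence: the caller names a set of rounds and an amount, and never writes the loop or the Boolean test.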
We believe that the results of these studies will be useful to anyone
designing a programming language that is meant to be easy to learn.
One way to define programming is as the process of transforming a mental plan
in familiar terms into one that is compatible with the computer. The closer
the language is to the user's original plan, the easier this refinement
process will be. This is closely related to the concept of directness which,
as part of "direct manipulation," is a key principle in making user interfaces
easier to use. User interface designers and researchers have been promoting
directness at least since Shneiderman identified the concept in 1982, but
it is not even a consideration in most programming language designs.
By focusing on making the programming language closer to the way people
think, the programming process can be made easier.
Future Work
With the findings from these studies, combined with Human-Computer Interaction
principles and knowledge that has been gained from years of research in
the area of Empirical Studies of Programmers, we are creating a set of
new programming languages and environments. One of these environments will
be for children to create their own software. Another is aimed at
authoring video productions.
The environment for children, which is called HANDS and is part of the
PhD thesis of John Pane, will use a new metaphor to help make the aspects
of programming more concrete. The target domain for HANDS will be
the games and simulations that children (and their teachers) are most interested
in creating. However, HANDS will be a general-purpose programming
system in which any program can be written. To make the system easier
to learn, data will be represented on cards so they can be visible and
directly manipulated. The code will use the production-language (event-based)
style that was used by subjects in the experiments. The editor will be
tightly coupled with the environment, in the style of Visual Basic, so
that graphical parts of the application can be specified graphically.
The video editing project is a new effort that will be funded by the
new Digital Libraries-II initiative. It will create a new kind of
video editing facility that will combine direct manipulation, programming-by-example,
scripting and intelligent features to make it much easier to create productions
that combine digital video, synthetic graphics, and interactivity.
Part of this project will be to explore what kinds of features are necessary
to provide significant customizability and programmability to authors of
the video presentations. Since video authors will range from children to
professionals, it is important to investigate what kinds of features are
appropriate, and how these can be made easy to learn.
Summary
By combining HCI principles, ESP results, and the findings of new studies
about what is natural, we expect to generate a body of knowledge about
how to make programming easier for anyone who isn't a professional programmer.
We will then create new languages and environments using this knowledge
and a more accessible, concrete computational model that we hope will allow
people to create a broad range of programs more easily than with other
programming environments.
Why We Want to Participate in the Workshop
Both Brad Myers and John Pane would like to participate in this workshop.
John Pane's PhD thesis work has included the studies discussed above,
and will include the development of a new programming language for children
based on the results of the studies. Some children will have prior
experience with Basic or other programming languages, while others will
be complete beginners, so the term "blended" applies. It will be very useful
to discuss what programming paradigms and features children (and adults)
need to learn, so we can incorporate them into the planned system.
Brad Myers is the principal investigator of the Natural Programming
project, and John Pane's advisor. He is just now starting a new initiative
as part of the Natural Programming project to develop a scripting language
for video productions. It will be very useful to discuss with others
what features are necessary in these kinds of programming environments.
Short Biography of the Authors
Brad A. Myers is a Senior Research Scientist in the Human-Computer
Interaction Institute in the School of Computer Science at Carnegie Mellon
University, where he is the principal investigator for various projects
including the User Interface Software Project, the Demonstrational Interfaces
Project, the Pebbles project, and the Natural Programming Project. He is
the author or editor of over 180 publications, including the books "Creating
User Interfaces by Demonstration" and "Languages for Developing User Interfaces,"
and he is on the editorial board of five journals. Myers received a PhD
in computer science at the University of Toronto where he developed the
Peridot UIMS. He received the MS and BSc degrees from the Massachusetts
Institute of Technology during which time he was a research intern at Xerox
PARC. From 1980 until 1983, he worked at PERQ Systems Corporation. His
research interests include User Interface Development Systems, user interfaces,
Programming by Example, programming languages for kids, Visual Programming,
interaction techniques, window management, and programming environments.
He belongs to SIGCHI, IEEE, IEEE Computer Society, and Computer Professionals
for Social Responsibility.
John Pane is a PhD student in the Computer Science Department
at Carnegie Mellon University. In his thesis research he is designing a
new programming language and environment for children. He has emphasized
usability throughout this design, by applying prior results from empirical
studies of programmers and the psychology of programming, as well as new
empirical studies that investigate
the natural ways that non-programmers express problem solutions. His thesis
is that this focus on usability will produce a system that is easier for
children to learn and use than existing systems, and that this same technique
can be used to create effective new programming systems for other novice
audiences. Prior to beginning his PhD studies, he spent nearly a decade
researching and developing structure-editor technology in programming environments
for beginners.