Obtaining data on user behavior



We chose to study the users of the nn and xrn news readers because both are popular and in widespread use, yet have very different user interfaces. The interface to the nn news reader is text-based, while xrn has a graphical user interface. Studying two news readers with different interfaces allowed us to isolate differences in reading behavior caused by the user interfaces. Further, when we subsequently modified the news readers, as described in later chapters, we had a larger pool of potential users from which to draw.

To obtain data that would give an accurate picture of the behavior of most users, we wanted to collect data from all the nn and xrn users at PARC. We chose not to ask for a group of volunteers to let us collect data on their reading habits, for fear of biasing our survey with a self-selected sample. Unfortunately, because we did not have explicit permission from each individual to record information about him or her, this choice meant that we had to record data in such a way that no individual could be identified from the collected data.

To preserve the anonymity of the user community, all the data was recorded in a global append-only log file. The log file consists of a series of records, each representing one session with a news reader. A session is defined as the time from when a user starts a news reader program until he or she exits it or is inactive for four hours. For every newsgroup the user looked at during the session, an entry was made in the log file. We considered an article to have been "read" if the text of the article was displayed to the user. An excerpt of raw data from the global log file is shown in the figure below. Because nn and xrn have different user interfaces, they recorded similar, but not identical, types of information for each newsgroup.

Figure: Excerpt of raw log file data from nn and xrn.
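The inactivity rule in the session definition can be illustrated with a short sketch. This is not the code used in nn or xrn; it assumes a hypothetical list of activity timestamps for one run of a news reader, and it assumes that a gap of exactly four hours already ends a session, which the text does not specify. Sessions that end because the user exits the program are not modeled here.

```python
from datetime import datetime, timedelta

# Inactivity threshold taken from the session definition in the text.
INACTIVITY_LIMIT = timedelta(hours=4)


def split_sessions(event_times):
    """Group activity timestamps into sessions.

    A new session starts whenever the gap since the previous event is
    at least INACTIVITY_LIMIT (assumption: "at least", not "more than").
    """
    sessions = []
    for t in sorted(event_times):
        if sessions and t - sessions[-1][-1] < INACTIVITY_LIMIT:
            # Still within the inactivity window: same session.
            sessions[-1].append(t)
        else:
            # First event, or the gap was too long: start a new session.
            sessions.append([t])
    return sessions


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    start = datetime(2000, 1, 1, 9, 0)
    events = [start, start + timedelta(minutes=30), start + timedelta(hours=6)]
    for session in split_sessions(events):
        print(session[0], "to", session[-1])
```

Each returned session would correspond to one record in the log file.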

xrn begins the record for each session with a line identifying the record as coming from xrn and the time at which the record was entered into the log. The record contains one line beginning with GRP: for each newsgroup the user was shown. For each newsgroup, xrn recorded: 1) how many articles were available to be read in the group (AVAIL:); 2) how many articles were actually read (READ:); 3) how much time, in seconds, was spent both reading and selecting articles (TIME:); and 4) whether one of the "catch up" buttons was used. Catch up buttons in xrn allow a user to ignore an entire set of articles with one action. The use of a catch up button in a group is shown by an asterisk.
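As a rough illustration of how such per-newsgroup lines might be read back for analysis, the sketch below parses one xrn GRP: line. The field labels (GRP:, AVAIL:, READ:, TIME:) and the asterisk come from the description above, but the exact layout of a line (field order, spacing, and the asterisk appearing at the end) is an assumption, as are the names `XRN_GRP` and `parse_xrn_group`.

```python
import re

# Assumed layout: "GRP: <group> AVAIL: <n> READ: <n> TIME: <seconds> [*]".
# The real log format may order or space the fields differently.
XRN_GRP = re.compile(
    r"^GRP:\s*(?P<group>\S+)\s+"
    r"AVAIL:\s*(?P<avail>\d+)\s+"
    r"READ:\s*(?P<read>\d+)\s+"
    r"TIME:\s*(?P<time>\d+)"
    r"(?P<catchup>\s*\*)?\s*$"
)


def parse_xrn_group(line):
    """Parse one xrn newsgroup line; return None if it does not match."""
    m = XRN_GRP.match(line)
    if m is None:
        return None
    return {
        "group": m.group("group"),
        "avail": int(m.group("avail")),
        "read": int(m.group("read")),
        "time": int(m.group("time")),  # seconds reading and selecting
        "catch_up": m.group("catchup") is not None,  # asterisk present
    }


if __name__ == "__main__":
    # Made-up sample line, not taken from the actual log.
    print(parse_xrn_group("GRP: comp.lang.c AVAIL: 12 READ: 3 TIME: 45 *"))
```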

nn begins the record for each session with a line identifying the record as coming from nn and the time at which the record was entered. The record contains one line beginning with GRP: for every newsgroup the user was shown. For each newsgroup, nn recorded: 1) how many articles were available to be read in the group (AVAIL:); 2) how many articles were actually read (READ:); 3) how many seconds were spent reading the selected articles (TIME_READ:); and 4) how many seconds the user spent scanning the subjects of available articles while choosing which articles to read (TIME_SCAN:).
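A similar sketch for nn lines is given below. Again, only the field labels come from the description above; the line layout and the names `NN_GRP` and `parse_nn_group` are assumptions. The derived `time_total` field, which adds TIME_READ and TIME_SCAN to obtain a figure roughly comparable to xrn's single TIME value, is one plausible way to compare the two readers rather than a method stated in the text.

```python
import re

# Assumed layout:
# "GRP: <group> AVAIL: <n> READ: <n> TIME_READ: <s> TIME_SCAN: <s>".
# The real log format may order or space the fields differently.
NN_GRP = re.compile(
    r"^GRP:\s*(?P<group>\S+)\s+"
    r"AVAIL:\s*(?P<avail>\d+)\s+"
    r"READ:\s*(?P<read>\d+)\s+"
    r"TIME_READ:\s*(?P<time_read>\d+)\s+"
    r"TIME_SCAN:\s*(?P<time_scan>\d+)\s*$"
)


def parse_nn_group(line):
    """Parse one nn newsgroup line; return None if it does not match."""
    m = NN_GRP.match(line)
    if m is None:
        return None
    time_read = int(m.group("time_read"))
    time_scan = int(m.group("time_scan"))
    return {
        "group": m.group("group"),
        "avail": int(m.group("avail")),
        "read": int(m.group("read")),
        "time_read": time_read,  # seconds reading selected articles
        "time_scan": time_scan,  # seconds scanning subject lines
        # Assumption: the sum approximates xrn's combined TIME field.
        "time_total": time_read + time_scan,
    }


if __name__ == "__main__":
    # Made-up sample line, not taken from the actual log.
    print(parse_nn_group(
        "GRP: comp.lang.c AVAIL: 12 READ: 3 TIME_READ: 40 TIME_SCAN: 15"
    ))
```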


David A. Maltz (dmaltz@cs.cmu.edu)