A Literature Survey for Virtual Environments: Military Flight Simulator Visual Systems and Simulator Sickness
Randy Pausch and Thomas Crea
Computer Science Report No. TR-92-25
August, 1992
This work was supported in part by the National Science Foundation, the Science Applications International Corporation, the Virginia Engineering Foundation, and the Virginia Center for Innovative Technology.

ABSTRACT

Researchers in the field of virtual environments, or virtual reality, surround a participant with synthetic stimuli. The flight simulator community, primarily in the U.S. military, has a large amount of experience with aircraft simulations, and VE researchers should be aware of the major results in this field. In this survey of the literature, we have especially focused on military literature that may be hard for traditional academics to locate via the standard journals. We concentrate on research which produces specific, measured results that may have applicability to VE researchers. We also assume no background other than relatively basic computer graphics, and explain basic simulator terms and concepts as necessary. We have included our annotated bibliography as an appendix. The major areas we have concentrated on are:

• the effects of display parameters, including field-of-view and scene complexity
• the effect of lag in system response
• the effect of refresh rate in graphics update
• existing theories on causes of simulator sickness
• after-effects (subject experience after simulator sessions)
Many of the results we cite are contradictory. Our global observation is that with flight simulator research, like most human-computer interaction research, there are very few "correct" answers. Almost always, the answer to a specific question turns on the task the user was attempting to perform with the simulator.

INTRODUCTION

Researchers in the field of virtual environments, or virtual reality, surround a participant with synthetic stimuli. Therefore, VE researchers must be aware of the phenomena that can occur when various aspects of the stimuli cause undesired artifacts. The flight simulator community, primarily in the U.S. military, has a large amount of experience with aircraft simulations, and VE researchers should be aware of the major results in this field. In flight simulation, the basic goal is to develop aviation skills which the pilot can then transfer into a real-world mission. In this paper, we present major results from the flight simulation literature having to do both with the measured success of skill transfer, and with effects such as simulator sickness and after-effects. Many of the results we found were contradictory; the specific task being performed must be taken into account when asking questions such as "how much does latency affect performance?"

The history of flight simulation dates back to 1929 when Edwin A. Link patented the first ground-based flight trainer [Stark 1992]. Stark cited 1934 as the first time the U.S. Army used simulators to train their pilots and World War II as the time when the United States and its allies purchased 10,000 "blue box" Link Trainers to teach instrument flying and radio navigation skills. Since that time, flight simulators have advanced well beyond basic instrument and radio navigation trainers. Today's simulators enable pilots to "feel" the simulated emergency in motion-based systems and conduct air-to-air combat in visually-based systems. Presently, researchers direct much of their efforts toward promoting realistic scenarios for pilot training.

Initially, many aviators were skeptical regarding the training value of simulators and preferred training in the actual aircraft. As the government trimmed operational budgets, the military needed to reduce flight training costs and directed research efforts towards cost-effectiveness studies. Orlansky and String [1977] investigated a collection of studies since 1939 and concluded simulators had significant positive effects upon training. In one study using the 2-B-24 Flight Trainer for instrument training of undergraduate helicopter pilots in the Army, Caro [1973] stated there was a "90% reduction in the amount of aircraft time to attain course objectives." Specifically, Caro [1972] showed that previous students needed 60 hours of actual aircraft time and 26 hours in the older 1-CA-1 trainer. Upon use of the 2-B-24, students now achieved the same training goals in only 6.5 aircraft hours and just under 43 simulator hours on average. Orlansky and String's results were instrumental in promoting simulator use. They concluded that "hourly operating costs of flight simulators were approximately 5% to 20% of the hourly operating costs of the aircraft they emulate." They also predicted that military flying hours would be reduced to almost 17 percent by 1981, and that the procurement cost of these simulators could be amortized in 2.2 years. Consequently, the military strongly encouraged flight simulator use in all areas of training. Today, simulators serve as a major training resource for the United States military services and many commercial aviation companies.
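
The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration only, not a calculation taken from the cited studies. The hour figures are those quoted above from Caro; the function names are hypothetical, and the cost comparison assumes (our assumption, not the authors') that the 5% to 20% cost ratio reported by Orlansky and String applies equally to both trainers.

```python
# Illustrative arithmetic only; not taken from the cited studies.
# Hour figures are those quoted above from Caro [1972, 1973].
OLD_AIRCRAFT_HOURS = 60.0    # aircraft hours under the earlier program
OLD_TRAINER_HOURS = 26.0     # hours in the older 1-CA-1 trainer
NEW_AIRCRAFT_HOURS = 6.5     # aircraft hours after the 2-B-24 was introduced
NEW_SIMULATOR_HOURS = 43.0   # "just under 43" hours, rounded up here


def aircraft_time_reduction(old_hours: float, new_hours: float) -> float:
    """Return the fractional reduction in aircraft hours."""
    if old_hours <= 0:
        raise ValueError("old_hours must be positive")
    return (old_hours - new_hours) / old_hours


def program_cost(aircraft_hours: float, trainer_hours: float,
                 trainer_cost_ratio: float) -> float:
    """Return program cost in units of one aircraft operating hour.

    trainer_cost_ratio is the trainer's hourly cost as a fraction of the
    aircraft's hourly cost (ASSUMPTION: the same ratio for both trainers).
    """
    return aircraft_hours + trainer_hours * trainer_cost_ratio


if __name__ == "__main__":
    reduction = aircraft_time_reduction(OLD_AIRCRAFT_HOURS, NEW_AIRCRAFT_HOURS)
    print(f"aircraft time reduction: {reduction:.1%}")
    for ratio in (0.05, 0.20):  # the 5% to 20% range quoted above
        old = program_cost(OLD_AIRCRAFT_HOURS, OLD_TRAINER_HOURS, ratio)
        new = program_cost(NEW_AIRCRAFT_HOURS, NEW_SIMULATOR_HOURS, ratio)
        print(f"cost ratio {ratio:.0%}: old {old:.2f}, new {new:.2f}")
```

Going from 60 to 6.5 aircraft hours is a reduction of roughly 89%, which is consistent with the "90% reduction" Caro reported.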

Despite the cost benefits of training in simulators over the actual aircraft, pilots experience a phenomenon called simulator sickness which is not present in the actual aircraft. Simulator sickness is a common side-effect for many users which is often associated with motion sickness. Kennedy and Frank [1985] describe motion sickness as a general term for a collection of symptoms one experiences when subjected to abrupt, periodic, or unnatural accelerations, and common symptoms include: loss of skin color, inability to coordinate voluntary muscular movements, and nausea. The term simulator sickness is typically used to refer to sickness caused by the incorrect aspects of the simulation, not sickness caused by a correct simulation of a nauseating experience, such as a turbulent airplane flight.

There are subtle differences between motion sickness and simulator sickness. For instance, Casali [1986] concluded from research conducted by Money (1970) that it is "...generally accepted that stimulation of the vestibular apparatus of the inner ear is necessary for the inducement of motion sickness in humans." In contrast, Daunton, Fox, and Crampton [1984] showed that the symptoms of motion sickness, along with the illusions of self-motion, can be elicited in human subjects by visual stimulation alone. This phenomenon of visually induced motion sickness (VIMS) is a case in which the user becomes sick without any vestibular stimulation. Although its symptoms are similar to those of motion sickness, VIMS shows how simulator sickness can be distinct from motion sickness. Kennedy, Frank and McCauley [1985b] best depict these subtleties via a diagram showing a schematic relationship among simulator sickness, motion sickness, and perceptual adaptation, which is simply the ability of the human central nervous system to adjust and respond to a stimulus better the next time it is encountered. The diagram illustrates that although the three overlap, each also has its own unique characteristics. Later, Cheung, Howard, and Money [1991] raised another issue, asserting, "Labyrinthine-defective subjects [those with inner-ear damage] experience no sickness symptoms, which strongly suggests that the vestibular system is necessary for sickness induced by moving visual fields."

The origins of simulator sickness are unclear and no single factor appears to cause illness in all simulators. Sickness may arise as a result of unique, individual factors or because of improper simulation from the hardware device. Some, but not all, symptoms of simulator sickness are identical to those of motion sickness. Kennedy and Frank [1985] describe simulator sickness as both polysymptomatic (many symptoms) and polygenic (many distinct sources). For instance, whether the simulator device is a moving or motionless platform is one possible source, and it has been well documented that a motion system operated at a frequency of 0.2 Hz (cycles/second) is more likely to induce sickness than at other frequencies [Kennedy et al., 1987]. Another primary source of stimuli is the visual imagery in the simulator.

VISUAL SYSTEMS

Many of the characteristics of a simulator's visual display system can be described on two levels: a quantitative, physical level and a qualitative, psychophysical (or perceptual) level. These qualitative measures are more intuitive human perceptual measurements as opposed to quantitative physical measurements. A 1981 Advisory Group for Aerospace Research and Development (AGARD) paper, "Characteristics of Flight Simulator Visual Systems", described the display system in detail and discussed three basic characteristics of visual systems: energy, spatial, and temporal properties.

ENERGY PROPERTIES
Several energy properties pertaining to flight-simulator visual systems include:

• luminance,
• contrast,
• resolution, and
• color.
The energy properties listed are physical measurements of the hardware device, and most people do not fully understand them. Understanding these properties is important for recognizing how the visual display system corresponds to the human visual system. For instance, luminance may not be understood by many, but its psychophysical (perceptual) correlate, light, is familiar to most people, who see it on a daily basis. Luminance is the energy property measured in candela/meter², a unit once based on the light emanating from various types of standard candles. Today, it is based on the amount of light emanating from a blackbody surface at the temperature of melting platinum [AGARD 1981]. Another psychophysical term is visual acuity, which is the ability of the human eye to recognize fine detail. For instance, the more dots/inch an individual can distinguish, the better his or her visual acuity. Resolution, on the other hand, is the level of small, recognizable detail at which a device can represent an image. A CRT which is able to represent 500 pixels/inch has greater resolution than a CRT which represents 100 pixels/inch. Light or brightness correlates to luminance, and visual acuity correlates to resolution. Light and visual acuity are what the human perceives; luminance and resolution are the actual physical measurements.

It is important to view the energy properties of visual displays with the following in mind. Luminance, contrast, and resolution are very much interrelated and cannot be considered in isolation from each other. Any adjustment of one of these variables without a corresponding adjustment of the other two will result in an improper visual display. Consequently, these properties must be carefully balanced with the task to be performed and the capabilities of the human visual system in order to achieve optimum performance [AGARD 1981].

Color is an energy property whose value concerning user performance is uncertain. "Whether color should be used in visual systems is a debatable question. There is little experimental evidence of its effect, and presently no substantial objective evidence either for or against the use of color in visual flight simulators" [AGARD 1981]. In a study of bombing performance in the 2B35 TA-4J flight simulator, Kellogg, Kennedy, and Woodruff [1984] determined there were no statistically significant differences between performance with color and black-and-white visual displays. They concluded that color visual scene presentation did not enhance performance, and determined that the only advantage was that pilots preferred the aesthetic considerations of color. Despite the fact that "the highest resolution color television display has significantly lower resolution than many monochrome displays," there is a subjective preference of color over monochrome [AGARD 1981]. Also, if tasks depend on rapid discrimination of objects, color may provide benefit: "An object is more easily recognized in color than monochrome... in a projected image of the sky and ground, if the sky is blue and the ground is brown/green, the pilot will rarely mistake his orientation, even in violent combat maneuvers" [AGARD 1981].

SPATIAL PROPERTIES
Several spatial properties pertaining to flight-simulator visual systems include:

• field-of-view
• viewing region
• depth perception
• scene content.
Field of View
Field of view is a spatial property that defines the horizontal and vertical dimensions of the display screen in terms of angles from the design eyepoint. In general, findings show that wider field-of-view displays tend to enhance performance while also increasing the likelihood of simulator sickness, but study results are inconsistent across various systems.
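
As a concrete illustration of expressing screen dimensions as angles from the design eyepoint, the following minimal Python sketch computes the angle subtended by a flat screen. It assumes a simplified geometry that is not taken from the cited literature: the screen is flat, centered on the line of sight, and perpendicular to it. The function name and parameters are hypothetical.

```python
import math


def field_of_view_deg(extent: float, eye_distance: float) -> float:
    """Angle (degrees) subtended at the design eyepoint by a flat screen.

    extent       -- screen width (or height), in any length unit
    eye_distance -- distance from design eyepoint to screen, same unit
    ASSUMPTION: the screen is flat, centered on, and perpendicular to
    the line of sight from the design eyepoint.
    """
    if extent <= 0 or eye_distance <= 0:
        raise ValueError("extent and eye_distance must be positive")
    return math.degrees(2.0 * math.atan(extent / (2.0 * eye_distance)))


if __name__ == "__main__":
    # A screen twice as wide as its distance from the eye subtends 90 degrees.
    print(field_of_view_deg(2.0, 1.0))
```

Under this geometry the same screen yields a wider field of view as the design eyepoint moves closer, which is why the field of view is specified as an angle and not as a physical screen size.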

Research conducted at the Naval Training Systems Center's (NTSC) Visual Technology Research Simulator (VTRS) studied field-of-view and considered its effect on performance relative to cost-effectiveness, incorporating simulator sickness as a variable in the conclusions. In two carrier landing studies, Westra et al. [1986] showed there was no transfer advantage for those trained with a wide field-of-view compared to those trained with the lower cost narrow field-of-view; Westra [1983] also showed that there were some advantages for the wide field-of-view conditions, although these effects were small and/or short lived and generally disappeared after the user completed a few trials within the simulator. On the other hand, in two other studies pertaining to helicopter shipboard landings, Westra et al. [1987] determined pilot performance was significantly better in all phases of the approach, hover, landing, and precision hover task with the wider field-of-view display; Westra and Lintern [1985] also determined field-of-view had marginal positive effects on a few performance measures.

The advantages gained from wider displays varied with each study and further depend on the task performed. Determining the correct field-of-view for any particular simulator is not straightforward. The studies Westra and colleagues conducted at the VTRS considered transfer of training advantages relative to cost-effectiveness, and they based their recommendations on these considerations. The following discussion touches on just one aspect which should be considered when determining field-of-view effectiveness. In terms of increased simulator sickness, Van Cott [1990] showed that a wide field of view provides more stimulation and results in a more compelling display of motion. Furthermore, Kennedy, Fowlkes, and Hettinger [1989] stated that restricting the field-of-view may reduce the properties which cause nausea. These results imply that simulator sickness generally occurs more frequently with a wider field-of-view display. While the width of a display is a factor in determining whether people will experience illness, it should be understood that the display's field-of-view cannot be considered separately from other factors. For instance, Anderson and Braunstein [1985] demonstrated linear vection (perception of self-motion induced from visual stimulation) with a visual angle as small as 7.5 degrees in the central visual field. They achieved these results by placing observers in an environment in which they were exposed to a moving display of randomly positioned dots and concluded that motion and texture may be more critical than the size of the field-of-view. A common example of this effect occurs when movie producers depict craft moving at very high speeds, using moving dots or stars to simulate spaceflight.

Viewing Region
The viewing region is another spatial property; it is important because a display produces good imagery only when viewed from within a specified region. The center of the viewing region is the design eyepoint of the system, and as the observer moves away from the design eyepoint, the image becomes increasingly distorted. Once the observer is outside the boundary of the viewing region, the imagery disappears or its quality is unacceptable [AGARD 1981].

In a cinema, the viewing region is substantially larger than that of a computer generated image. Real-image displays (screens and projectors) degrade gradually and the imagery can be useful at locations considerably distant from the design eyepoint [AGARD 1981]. On the other hand, computer-generated imagery (CGI) degrades quickly and must be properly calibrated for the design eyepoint in the virtual display. CGI displays have a smaller viewing region and any view away from the design eyepoint has implications concerning crewmember susceptibility. Lilienthal [1992] stated that "In the 2F87(F) (P-3C) simulator, the flight engineer, who was behind the pilots [location] and thus out of the design eye point of the visual system, saw distorted visual cues." He explains that flight engineers reported a significant amount of simulator sickness until they used a baffle to prevent them from having a direct view of the visual scene, and this modification reduced reports of sickness significantly. He also mentions that viewing the visual scene outside the design eyepoint causes users to see distorted visual cues which induce symptoms of simulator sickness and a lack of balance while standing.

Several of these distorted visual cues are dynamic. "Graphic displays, such as those used in flight simulator visual systems, provide accurate representations of three-dimensional space only when viewed from the geometric center of projection. If the head is moved outside the center of projection, geometric distortions occur in the projected imagery which provide inappropriate visual information for self-motion" [Rosanski, 1982, cited in Kennedy, Fowlkes, and Hettinger, 1987]. Kennedy, Fowlkes, and Hettinger also state that these optical distortions may be magnified by highly detailed imagery and wide field of view systems, and the irregularities introduced by distortions may provide inappropriate self-motion information.
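
The geometric distortion described above can be illustrated with a small Python sketch. It is a deliberately simplified model of our own, not one drawn from the cited papers: a flat screen, horizontal angles only, and an eye displaced sideways from the design eyepoint. All names are hypothetical.

```python
import math


def apparent_angle_deg(point_x: float, screen_distance: float,
                       eye_offset_x: float = 0.0) -> float:
    """Horizontal angle (degrees) at which a screen point is seen.

    point_x         -- lateral position of the point on the screen,
                       measured from the point facing the design eyepoint
    screen_distance -- distance from the design eyepoint to the screen
    eye_offset_x    -- lateral displacement of the viewer's eye from the
                       design eyepoint (0.0 means at the design eyepoint)
    ASSUMPTION: flat screen, eye moves parallel to the screen.
    """
    return math.degrees(math.atan2(point_x - eye_offset_x, screen_distance))


def angular_error_deg(point_x: float, screen_distance: float,
                      eye_offset_x: float) -> float:
    """Difference between the angle actually seen and the intended angle."""
    intended = apparent_angle_deg(point_x, screen_distance, 0.0)
    seen = apparent_angle_deg(point_x, screen_distance, eye_offset_x)
    return seen - intended


if __name__ == "__main__":
    # Hypothetical viewer displaced 0.5 units sideways, screen 1 unit away.
    for x in (-1.0, 0.0, 1.0):
        print(x, round(angular_error_deg(x, 1.0, 0.5), 2))
```

Because the error differs from one screen position to another, the displaced viewer sees a warped scene and not merely a shifted one, which is one way to picture the inappropriate self-motion information the authors describe.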

Depth Perception
There are three factors important in the discussion of depth perception:

• absolute depth, or real-world distance
• relative depth, or distance between objects
• depth order, otherwise known as z-order.
Depth perception is the property of vision that allows us to see the world as three-dimensional rather than flat. Unfortunately, depth is difficult to simulate and it is difficult to build hardware that permits the use of binocular (both eyes fusing separate images) vision. Most visual systems, although they may be biocular (both eyes, same view), provide only monocular (one eye/image) cues. Although humans can adapt and use monocular cues to accomplish a variety of tasks, Hale [1987] claims binocular vision is clearly superior. Hale concluded this result via a literature review which included an article by Upton and Strothers (1972) that stated stereo viewing was superior to monocular viewing. In Hale's review, another paper by Martin and Warner (1985) compared four different field of view angles: 40 degrees monocular, 40 degrees binocular, 90 degrees binocular, and 120 degrees binocular. The subjective response from the questionnaire indicated a progressive increase in pilot ratings from the 40 degrees monocular field-of-view to the 90 degrees binocular field-of-view for various aspects of the mission. There was very little difference in ratings between the 90 degrees binocular and 120 degrees binocular field-of-view. Martin and Warner indicated this may suggest that increasing the field-of-view beyond 90 degrees will not significantly improve pilot performance.

Scene Content
Scene content is a spatial property that simply refers to the level of detail available for the given scene. There are varying reports on the performance advantages of scene detail depending on the task performed and the study. The conclusions Westra and colleagues reached with the carrier landing studies indicate scene detail had very small effects, and the helicopter shipboard landing studies indicate a range of small to large effects. Specifically, Westra et al. [1987] stated that the "largest" scene detail effects occurred during the approach, hover, and landing phases. A possible explanation for this is that the takeoff and landing tasks performed by any pilot typically require greater concentration than normal in-flight tasks.

TEMPORAL PROPERTIES
Temporal properties are potentially the most important aspects of a simulation (or Virtual Environment) system, but they are also among the most difficult to measure. Temporal properties include:

• lag
• time lag
• refresh rates
• update rates
Lag occurs when the CRT cannot completely discharge the image before the scan of the "next" image. If lag is excessive, it will cause smearing of a moving image, and after-images (old images) may remain visible [AGARD 1981]. This lag is associated with the rate at which the phosphor dissipates; the section on refresh rates discusses this topic further.

Time Lag
Although time lag may refer to either the instrument, motion, or visual system, most research concerning time lag pertains to the motion and visual systems. Our discussion will focus once again on the visual system for previously stated reasons, that is, the majority of information we receive is from visual stimuli. Frank, Casali, and Wierwille [1988] confirm this point as they cite Newell and Smith (1969), who show that our reliance on visual stimuli transfers to simulators. Frank, Casali, and Wierwille later concluded that visual delay is far more disruptive to a simulator operator's control performance and physical comfort than motion delay.

At the Navy's Visual Technology Research Simulator, Westra and Lintern [1985] compared two simulator systems, with system visual lags of 217 milliseconds and 117 milliseconds, in their studies of helicopter landings on small ships. They indicated that pilot performance was better with the shorter 117-millisecond lag system, and although lag had small effects on objective performance measures, pilots noticed the increased lag and believed it had a detrimental effect on their performance.

Uliano et al. [1986] conducted another study as part of the Navy's VTRS program on three visual throughput delay systems with varying amounts of lag at 215 +/- 70 milliseconds, 177 +/- 23 milliseconds, and 126 +/- 17 milliseconds. Here, they concluded that lag had no effect on illness in any of the conditions. They also noted that pilots were almost unanimously aware of the two longest lags, and that simulator performance was the worst under the longest lag condition.

Westra et al. [1987] conducted a second study of helicopter landings on small ships at the VTRS using system visual lags of 183 milliseconds and 117 milliseconds. Once again, they concluded that the shorter system lag improved performance only slightly. They also concluded that the 183-millisecond system is marginally acceptable for performance and mentioned that there is a substantial accumulation of empirical evidence indicating increased lag contributes to deteriorated operator performance. After this study, they recommended a constant condition of 117 milliseconds for future VTRS transfer-of-training research. Within this paper, they cite Ricard et al., who "contrasted delays of 68 and 128 milliseconds and reported significantly lower error rates on all their measures of helicopter shipboard landing performance with the shorter delay." The Ricard study generated one display frame (refresh) every 33 milliseconds, and they learned that a difference of 66 milliseconds (two frames) produces a "just noticeable difference" in performance effects while 33 milliseconds is less than "just noticeable".
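
To relate the lags quoted in these studies to display frames, the conversion can be sketched as follows. This is illustrative Python, not part of any cited study; it assumes the 33-millisecond frame period of the Ricard study for every system, which the other studies do not state, and the names are hypothetical.

```python
FRAME_PERIOD_MS = 33.0  # one display frame every 33 ms (Ricard study)


def delay_in_frames(delay_ms: float,
                    frame_period_ms: float = FRAME_PERIOD_MS) -> float:
    """Express a delay in milliseconds as a number of display frames."""
    if frame_period_ms <= 0:
        raise ValueError("frame_period_ms must be positive")
    return delay_ms / frame_period_ms


if __name__ == "__main__":
    # System visual lags quoted above, in milliseconds.
    # ASSUMPTION: the 33 ms frame period is applied to all of them.
    for lag_ms in (117.0, 183.0, 217.0):
        print(f"{lag_ms:.0f} ms is about {delay_in_frames(lag_ms):.1f} frames")
    # The 66 ms "just noticeable difference" corresponds to two frames.
    print(delay_in_frames(66.0))
```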

The time lag issues discussed above deal strictly with transport delay, that is, "the time period from stick input to the completion of the first field of video output" [Westra et al. 1987]. Lilienthal [1992] recommended a limit on transport delays of 100-125 milliseconds to ensure that pilot technique is not affected by the delay, asserting that large transport delays (over 150 milliseconds) made it difficult, if not impossible, for a pilot to adapt to the system. For large transport delays, pilots could not predict with any accuracy the length of the delay, and attempts to guess and lead the system failed. As a result, pilots would overcompensate and produce oscillations, which would cause abnormal accelerations sometimes leading to sickness.

Another problem is cue asynchrony, which is a greater concern in terms of simulator sickness [Lilienthal, 1992]. Lilienthal describes cue asynchrony as the difference in delay between any two systems (i.e., motion, visual, or instruments) and recommends that the delay between any two cues be less than 35 milliseconds because "the motion cues may give the impression of motion in one direction while the delayed visual cues give the impression of motion in another direction." Kennedy, Fowlkes, and Hettinger [1989] state that there were only two experiments addressing lags and asynchronies. The first study (Uliano et al. 1986) reported no differences in sickness ratings, and the second (Frank, Wierwille, and Casali 1987) showed that transport delays affected performance (i.e., manual control) behaviors, but that the size of the delay did not affect reports of simulator sickness [Kennedy, Fowlkes, and Hettinger, 1989].
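
Lilienthal's recommendations on transport delay and cue asynchrony can be written down as a simple check. The sketch below is a hypothetical illustration in Python, not a tool described in the literature. It assumes that the upper end of the 100-125 millisecond range is the limit, that asynchrony is the spread between the fastest and slowest cue, and it uses made-up sample delays.

```python
TRANSPORT_DELAY_LIMIT_MS = 125.0  # ASSUMPTION: upper end of the 100-125 ms range
LARGE_TRANSPORT_DELAY_MS = 150.0  # delays above this are described as "large"
MAX_CUE_ASYNCHRONY_MS = 35.0      # recommended bound between any two cues


def classify_transport_delay(delay_ms: float) -> str:
    """Place a transport delay relative to the recommended thresholds."""
    if delay_ms <= TRANSPORT_DELAY_LIMIT_MS:
        return "within recommended limit"
    if delay_ms <= LARGE_TRANSPORT_DELAY_MS:
        return "above recommended limit"
    return "large"


def cue_asynchrony_ms(cue_delays_ms: dict) -> float:
    """Largest difference in delay between any two cueing systems."""
    delays = list(cue_delays_ms.values())
    if len(delays) < 2:
        raise ValueError("need at least two cueing systems")
    return max(delays) - min(delays)


def asynchrony_acceptable(cue_delays_ms: dict) -> bool:
    """True if every pair of cues differs by less than 35 ms."""
    return cue_asynchrony_ms(cue_delays_ms) < MAX_CUE_ASYNCHRONY_MS


if __name__ == "__main__":
    # Hypothetical delays for one simulator, in milliseconds.
    delays = {"visual": 117.0, "motion": 90.0, "instruments": 100.0}
    print(classify_transport_delay(delays["visual"]))
    print(cue_asynchrony_ms(delays), asynchrony_acceptable(delays))
```

Checking the spread between the fastest and slowest cue is sufficient here because, if that pair is within the bound, every other pair is as well.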

Refresh Rate
Refresh rate refers to how often the display is repainted, and it is closely tied to the time it takes for the display phosphor to dissipate. The most familiar example of refresh rates is the television set. Television displays in the U.S. operate at 30 Hz in a 2:1 interlaced mode. That is, each raster line on the screen is painted 30 times a second, such that the electron beam paints every other line during one sweep of the frame buffer, and the alternate set of lines during the next pass. The electron beam continually alternates between these sets of lines, sweeping down the screen 60 times a second. The human visual system is generally not susceptible to flicker at 30 Hz in the fovea or central vision; however, the observer may still perceive flicker with peripheral vision. The point at which flicker becomes visually perceptible is called the flicker fusion frequency threshold. Refresh rate, brightness (light or luminance), field of view, and color are all factors that contribute to determining this threshold.
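
The 2:1 interlace scheme described above can be sketched in Python as follows. This is an illustrative model only: line numbers are zero-based, and which set of lines is painted first is an assumption made here for concreteness. The names are hypothetical.

```python
FRAME_RATE_HZ = 30     # each raster line is painted 30 times a second
FIELDS_PER_FRAME = 2   # 2:1 interlace: two passes (fields) per full frame


def field_rate_hz(frame_rate_hz: int = FRAME_RATE_HZ,
                  fields_per_frame: int = FIELDS_PER_FRAME) -> int:
    """Number of top-to-bottom sweeps of the screen per second."""
    return frame_rate_hz * fields_per_frame


def lines_in_field(field_index: int, total_lines: int) -> list:
    """Zero-based raster lines painted during the given sweep.

    ASSUMPTION: even-numbered sweeps paint lines 0, 2, 4, ... and
    odd-numbered sweeps paint lines 1, 3, 5, ...
    """
    first_line = field_index % FIELDS_PER_FRAME
    return list(range(first_line, total_lines, FIELDS_PER_FRAME))


if __name__ == "__main__":
    print(field_rate_hz())        # sweeps per second
    print(lines_in_field(0, 10))  # first pass over a 10-line toy raster
    print(lines_in_field(1, 10))  # second pass paints the alternate lines
```

Two consecutive sweeps together cover every line once, so the screen is swept 60 times a second while any individual line is repainted only 30 times a second.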

Two fundamental characteristics regarding flicker are refresh rate and brightness. As the level of brightness increases, the speed of refresh must also increase in order to suppress flicker. Also, as the speed of refresh increases, the costs increase. As a result, many users of flight simulators with slower refresh rates will reduce the visibility to dusk conditions (lower luminance) in order to prevent flicker. Further, since the peripheral visual system is more sensitive to motion than the central visual system, larger field-of-view displays increase the likelihood that the observer will perceive flicker [Lilienthal 1992]. Once again, refresh rates must increase with larger field-of-view displays in order to suppress flicker.

Slower refresh rates require more persistent phosphors, which are not suitable for displaying moving images because they will cause the images to smear [AGARD 1981]. Also, slower refresh rates promote flicker, which Van Cott [1990] cites as a contributor to simulator sickness. Lilienthal [1992] also states that flicker is distracting, induces eye fatigue, and appears to be associated with simulator sickness, and that if the cost of higher refresh rates is too high, then the trade-off should be made with luminance specifications.

There are two general categories of flicker in the literature: small-field flicker, which refers to elements in single lines or small groups of lines corresponding to the central visual system, and large-field flicker, which refers to all portions of the display and the peripheral visual system. Large-field flicker appears as random movements across the display and is more objectionable than small-field flicker [AGARD 1981]. Kennedy [1990] supports this argument and found that large-field flicker may be interpreted as motion in the background, and the discomfort reported from flicker may cause sickness.

Update Rate
The refresh rate indicates how often the frame buffer is examined and displayed to the screen. Update rate refers to the generation frame rate or the frequency at which complete images are generated and rendered into the frame buffer for display. Unlike refresh rate, which is a hardware-determined constant, update rate can vary dramatically based on scene complexity and other factors.
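
The relationship between the two rates can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not a description of any particular image generator: image k is assumed to be complete at time k/update rate (with image 0 available at time zero), refresh i occurs at time i/refresh rate, and both rates are constant integers. The names are hypothetical.

```python
def refreshes_per_update(refresh_hz: int, update_hz: int) -> float:
    """Average number of screen refreshes shown per generated image."""
    if refresh_hz <= 0 or update_hz <= 0:
        raise ValueError("rates must be positive")
    return refresh_hz / update_hz


def displayed_image_indices(refresh_hz: int, update_hz: int,
                            n_refreshes: int) -> list:
    """Index of the newest completed image shown at each refresh.

    ASSUMPTION: image k is complete at time k / update_hz and refresh i
    happens at time i / refresh_hz, so refresh i shows image
    floor(i * update_hz / refresh_hz).
    """
    if refresh_hz <= 0 or update_hz <= 0:
        raise ValueError("rates must be positive")
    return [(i * update_hz) // refresh_hz for i in range(n_refreshes)]


if __name__ == "__main__":
    # Hypothetical system: 60 Hz refresh, but only 20 new images per second.
    print(refreshes_per_update(60, 20))
    print(displayed_image_indices(60, 20, 9))
```

When the update rate falls below the refresh rate, the same image is redisplayed on several consecutive refreshes, so a complex scene can reduce the effective frame rate even though the hardware refresh rate is unchanged.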

SIMULATOR SICKNESS

GENERAL FINDINGS
Havron and Butler (1957) and Miller and Goodson (1958) were the first pairs of researchers to mention the phenomenon of simulator sickness by name [Frank et al. 1983]. Research on simulator sickness steadily increased through 1980, and then reported incidents nearly doubled by 1985 [Kennedy and Frank, 1985]. A majority of the reports investigate the commonality of simulator sickness and the percentage of the population that actually experiences the illness. The reported rate of incidence varies, as Casali and Frank [1987] point out in their review of the literature which documents incidence rates ranging from 0% to nearly 90% in flight devices and even higher in some ground vehicle devices. Kennedy, et al. [1987] provide more concise results taken from U.S. Navy studies conducted over a two year period at ten flight simulator sites. These studies showed less variation, with incidence rates ranging from 12%-60% at these simulators.

Several studies have attempted to determine whether certain individuals or groups of individuals were more susceptible than others. For example, Kennedy et al. [1987] claim that "perhaps as much as 80% of the simulator sickness problem resides in perhaps 20% of the population." They later went on to say that "only about 30% of the individuals become ill under even the worst simulator conditions." In an attempt to isolate various individual sources, Kennedy and Frank [1985] address several, including gender, age, and physiological condition.

SPECIFIC FINDINGS
Regarding gender, Kennedy and Frank [1985b] claim that women are more susceptible to motion sickness than men. They mention a postulate concerning motion sickness which stated "that perhaps hormonal influences are at play, since women are most susceptible during their menstrual cycle (Schwab, 1954)." More importantly, they noted that women exhibit larger fields of view than men, and it is a well documented fact that simulator sickness appears more prevalent in simulators with wide fields of view.

Kennedy and Frank [1985] address age as a factor and state that susceptibility is highest for individuals from about two years of age through puberty. Then, susceptibility rapidly decreases through age 21, decreasing gradually thereafter, and almost disappears at age 50.

Illness is another factor which increases a person's susceptibility to simulator sickness. Previously, Frank et al. [1984] addressed the physiological state of the individual and advised against using the simulator if the subject was ill. Other reasons to avoid simulator use include fatigue, sleep loss, hangover, upset stomach, periods of emotional stress, head colds, ear infection, ear blocks, upper respiratory illness, and current medication. They further recommend not using simulators more than necessary when suffering from the effects of flu or possibly after receiving a flu shot, primarily because the literature on motion sickness and vomiting shows that these symptoms are cumulative [deWit, 1957 and Cordts, 1982, cited in Frank et al., 1984].

Gender, age and physiological condition are only a few of a larger number of individual sources which could be considered when studying simulator sickness. Crewmember position and experience are two other sources, and the following breakdown of these issues may better reveal the complexity of simulator sickness and explain why the origin is so elusive.

Casali and Wierwille [1986] looked at crewmember susceptibility and point out that simulator-induced sickness may be a function of the aircrew member's position in the simulator cockpit. One explanation for the lower incidence among pilots than among co-pilots or other crewmembers may be the amount of control the participant exercises. Lackner [1990] found that when subjects generated input themselves they were less susceptible to motion (simulator) sickness. He makes the point that the person controlling or anticipating the motion becomes sick less often than the passengers, a phenomenon similar to the experience of many automobile passengers whose car sickness diminishes when they are the driver. Another explanation might be the participant's position in the simulator relative to the optimal viewing position, or the design eyepoint. There is only one design eyepoint in a simulator, and as you move away from this point it is more difficult to view the imagery. In two-pilot cockpits, the design eyepoint is usually located between the two pilots or at the pilot's station. Consequently, the farther a crewmember is from the design eyepoint, the greater the chances for sickness.

The level of previous experience is another source from which to study simulator sickness, and the evidence here is inconclusive. Kennedy, et al. [1987] believe that more experienced pilots have greater difficulty than novice pilots, and they cite Havron and Butler (1957), Miller and Goodson (1960), McGuiness, Bouwman and Forbes (1981), and Kennedy (1981), whose studies support this argument. Specifically, they state that Miller and Goodson found 60% of the instructor pilots reported symptoms as compared to only 12% of the student pilots, and McGuiness, Bouwman and Forbes concluded that the more experienced aircrews [over 1500 flight hours] had a higher incidence of symptoms than the less experienced flight crews. On the other hand, Magee, Kantor, and Sweeney [1987] stated there was no evidence to indicate that experience influenced susceptibility to simulator sickness. The issue of inconsistent definitions of "novice" came up at the 1988 Advisory Group for Aerospace Research and Development (AGARD) conference during the concluding round table discussion. In the Magee, Kantor, and Sweeney study, novice pilots were those who were new to the C-130 aircraft but averaged 1500 flight hours, while previous studies defined novice pilots as those who had little or no total flight time. The outcome of the discussion was that the AGARD committee generally accepted "novice" to mean little or no flight experience. As a result, the committee recognized that more experienced pilots tend to experience greater difficulty than novices, and that the different criteria used explained the varying results.

The fact that experienced pilots have greater difficulty might be explained from several points already mentioned. Since more experienced pilots have clearer expectations of what should happen in the aircraft compared to the novice, an incorrect signal received by the experienced individual may also result in a greater mismatch discrepancy than for the novice. Further, since student pilots (novices) tend to handle the flight controls more than the instructor pilots (experienced), they may be less susceptible because they control the input to the system. Finally, if the optimal position is placed at the student pilot's location, this would be one explanation for the higher incidence rates for instructor pilots.

THEORIES
After introducing several generic sources for individual difference in susceptibility, Kennedy and Frank [1985] reviewed a number of theories attempting to explain the origin of motion sickness that surfaced in the literature. These theories include:

• vestibular (inner-ear) overstimulation
• fear/anxiety
• toxic reaction
• fluid shift
• perceptual conflict
The final theory, "perceptual conflict," proposed by Steele in 1968, is also known as the "sensory conflict" theory or the "cue-conflict" theory. This theory addresses the mismatch that occurs when one expects certain things to happen based upon previous experience, yet the visual or vestibular signal actually received conflicts with that expectation. Van Cott [1990] described this theory as sickness that arises when "motion information from vision, the vestibular system, and proprioceptors (sensory receptors) may be in conflict with the expected values of these inputs derived from past experience." Although this theory does not account for every possible source of simulator sickness, it is presently the most widely accepted working model explaining the illness. Cheung, Howard, and Money [1991] support the Kennedy and Frank conclusions concerning the vestibular system and simulator sickness, and their conclusions are consistent with the theory of sensory conflict.

SIMULATOR USAGE

ADVANTAGES AND DISADVANTAGES
The advantages mentioned previously include the success of transfer of training, that is, the carryover of those tasks learned in the simulator to the actual aircraft, and the cost-effectiveness gains. For additional transfer of training information, the reader should look at Waag's [1981] review of the literature concerning the training effectiveness of visual motion, or a collection of nearly 150 extracts compiled by Ayres et al. [1984]. Additionally, Stark [1992] provides pointers to papers on transfer of training. When considering the disadvantages of simulators, Frank, et al. [1983] first addressed a few key negative implications which they grouped into three broad areas concerning sickness. These are simulator after-effects, decreased simulator use, and compromised training.

DISADVANTAGES
After-effects
An initial look at simulator after-effects reveals that exposure to the simulator may result in future safety concerns. If an individual experiences side-effects (loss of skin color, sweating, nausea) as the result of a simulator session, the consequences of operating another vehicle such as a car or the actual aircraft after simulator exposure could be hazardous. Many of the reports on after-effects include examples where the user receives a conflict from the orientation cues used in the simulator. Kennedy et al. [1987] tell of an incident where one individual had to stop his car on the side of the road because the after-effects of a simulator experience were so pronounced, and Kellogg, Castore and Coward [1980] cited F-4 pilots reporting delayed perceptual after-effects occurring eight to ten hours following simulator flight. These observations led to additional studies which attempted to better understand the issues of after-effects.

In one of four studies conducted with helicopter simulators, Gower, et al. [1987] revealed that nearly 40% of the AH-64 (Apache) helicopter pilots reported symptoms lasting over an hour, and 14% reported symptoms lasting longer than six hours. In another study of UH-60 (Black Hawk) pilots, Gower and Fowlkes [1989] reported cases where individuals experienced delayed effects over 24 hours after exposure. They concluded that approximately eight percent of the aviation population experiences delayed problems beyond the simulator session for periods that exceed six to eight hours, and an even smaller population will experience symptoms for as long as one to two days. As a result of this study, many U.S. Army aviation units adopted a policy of no aircraft flying within six hours after simulator flight.

A series of techniques has been successful in helping pilots who are unable to adapt to flight simulators overcome simulator sickness. Dobie and May [1989] use a cognitive intervention program in which participants receive desensitization training and/or cognitive therapy in order to inoculate them against motion sickness. Rather than eliminating personnel susceptible to motion sickness from flight training programs, they took in a number of students who would otherwise have been eliminated, and their treatment resulted in 86% of these trainees returning to flight training without any further significant signs of airsickness.

Decreased Use
Decreased simulator use is another concern of Frank, et al. [1983]. If simulators produce unpleasant side effects, they may not be used because people will lack confidence in the training they receive. The effort to produce the most realistic simulator continues to be an active area of research. Several variations include simulators that may or may not include motion and/or visuals.

Compromised Training
Training in the simulator can be compromised in a number of ways, including fatigue and adaptation. Frank, et al. [1984] mention that simulator sickness symptoms may cause distractions and therefore interfere with learning. They refer to this as fatigue-decreased proficiency: using a simulator when fatigued, whether the fatigue is present upon entry or develops through exposure to the simulator during the training period, results in reduced proficiency. Ebenholtz [1990] claims that once the user experiences fatigue, the potential for positive learning effects from the simulator is decreased. Hamilton, et al. [1989] showed that over 50% of tested aircrews experienced increases in simulator sickness symptom frequency following training, with the most commonly reported symptoms being mild mental fatigue, physical fatigue, eyestrain and after-sensations of motion. The CH-47 (Chinook) Flight Simulator study conducted by Gower and Fowlkes [1989] also showed eyestrain and headache as the leading symptoms of asthenopia, a term optometrists use to refer to many eyestrain problems.

Adaptation
The capability of humans to adapt (perceptual adaptation) to simulation deficiencies can be a problem. An individual might use techniques to avoid simulator sickness that are detrimental if transferred to the actual aircraft. For example, many pilots restrict their head movement while in the simulator to avoid what is known as the pseudo-coriolis effect. The Coriolis force is an apparent force that, as a result of the earth's rotation, deflects moving objects (such as projectiles or air currents) to the right in the northern hemisphere and to the left in the southern hemisphere. "In simulators, large rapid head movements during angular motion of a simulator can cause vestibular coriolis effects, while head movements during visually represented angular motion can cause pseudo-coriolis effects" [Van Cott, 1990]. Lackner [1990] discussed how provocative head movements can be, but if pilots begin to restrict head movements, they may develop negative habits that carry over to in-flight conditions. Any pilot who learns to restrict their head movement in the simulator will develop bad habits in terms of basic flying skills and visual contact with other aircraft, let alone in any battlefield scenario amongst enemy aircraft.

CONCLUSION

Our goal with this paper was to collect references to significant results obtained by the simulator research community. We were surprised at the current isolation between the academic computer science community and the (primarily military) simulator community. We have collected these results and provided our bibliography in the hope that, as virtual environments research progresses, the computer science community will be able to learn from these results rather than re-establish them unnecessarily.

APPENDIX: ANNOTATED BIBLIOGRAPHY