Robert J. Laferriere, Michael L. Keller, Rich Gossweiler, Randy Pausch
Computer Science Department
University of Virginia
Charlottesville, Va 22901
contact author: Randy Pausch
(rjl2b, mlk4v, rich, pausch@virginia.edu)

Table of Contents

An Introductory Tutorial for Developing
Multi-User Virtual Environments

An Introductory Tutorial for Developing Multi-User Virtual Environments

An Introductory Tutorial for Developing
Multi-User Virtual Environments

Note to reviewers: The intended audience for this paper is probably at the college sophomore/junior level. Many computer graphics courses would like to have students implement distributed applications --- this paper is intended to be used, among other purposes, as a supplement in such courses. Therefore, in many cases, we have chosen simpler terminology and/or explained some concepts that would normally be used without explanation in PRESENCE.

Abstract

This paper is an introductory level tutorial describing how to implement multi-user virtual environments (VEs). The intended audience for this paper is students who are competent programmers and now wish to implement a distributed multi-participant application. We describe the fundamental concepts of distributed computing for multi-player simulations, and provide a concrete example which may be used as a template, including C source code available via the Internet. The template program demonstrates a simple multiple player, distributed application, where each player controls the position of a space ship, and the multiple programs communicate ship position data over the network. The template uses a technique called dead-reckoning to improve performance. We give detailed instructions on how to obtain and modify the template, so that students can quickly create their own distributed applications. We conclude by briefly discussing advanced issues, such as collision detection and physical modelling, reliability, and synchronization.

Introduction

Virtual Environments (VEs) are interactive computer simulations that immerse users in an alternate, yet believable reality. People move around, look at, and manipulate graphics objects using input devices ranging from the commonly accessible keyboard and mouse to more exotic devices, such as head-mounted displays and instrumented gloves. VEs come in many flavors, ranging from a single person on a single computer, to many people running in parallel across a distributed set of machines.

One potential use for VEs is to allow several people, or players [Blau92], to interact in a single simulation; for example, several students sitting at different computers connected over a network. These multi-user VEs extend interaction beyond the single-user VEs, but introduce more programming complexity because multiple programs on different machines have to be managed, and the participants must coordinate communication among themselves. This introductory tutorial describes a basic mechanism for coordinating multiple programs and the communication between the programs. This paper also provides a template which is intended to provide new authors with a basis for developing their own distributed VEs. We believe that a pragmatic tutorial of this type will allow new authors to explore the domain and that this will benefit the VE community.

Overview of Network Communications

Establishing a Phone Number

When several computers are connected together, the resulting configuration is called a Local Area Network (LAN) or Wide Area Network (WAN) depending on how geographically far apart the computers are located. Typically LANs are within one building, while WANs may span the whole country or even the world. In a system where there are more than two computers connected together, how does one computer identify which computer to send the data? Each computer needs to have a uniquely identifiable number, which acts like a phone number, allowing one computer to call and send data specifically to another computer. Since the UNIX Operating System supports multiple programs (called processes) which run simultaneously on the same machine, this unique identification number must exist, not just for the computer, but for each individual process wishing to communicate with other processes. In fact, a single process may wish to maintain several communication lines simultaneously, and therefore need several unique identification numbers. A unique identification number for each process can be created by using the identification number of the machine (the Internet address, such as 128.143.66.54) in combination with an abstraction called a port. A port is simply an additional number (for example, 6543) which allows a machine to provide several unique numbers for that machine. Programmers can try to open a line of communication by using the machine address and any particular port number. If the port is already in use, then the call fails. Alternatively, programmers can request that the operating system return the next unused port.

Figure 1: multiple processes with lines of communication

The operating system abstraction for a line of communication is called a socket. When two programs wish to communicate, they both create sockets, connect the sockets together, send data, and when done, close the sockets. For example, process A creates a socket using its machine name and a pre-established port number:

#define PORT 6543
char hostname[256];
int socket, newSocket;
gethostname(hostname, 256); /* name for the machine, can be string or number */
socket = CreateSocket(hostname, PORT);
Now process A can wait for another process to connect on the given port:

newSocket = WaitForConnection(socket);
In the meantime, process B can create its own socket and then connect to the waiting process A:

#define PROCESS_A_MACHINE 128.143.66.54
#define PORT 6543
int socket;
socket = ConnectToProcess(PROCESS_B_MACHINE, PORT);
Note that process A now has two sockets, one which may be used for more new connections (socket) and one which is directly connected to process B (newSocket). Also note that both knew to use port 6543 to establish the connection. This implies that authors of the two programs agreed in advance to use that number.

Getting Each Other's Number

If two sockets are created by two separate programs running on two separate machines, then how does either process know about the other process's new number? When a process starts up, it creates a socket using a port number that other processes need to find out about. This may be done by establishing ahead of time that each new process will use specific port numbers well-known to everyone beforehand. The danger here is that two completely unrelated programs on the same machine may try to use the same port number. One will obtain the port, and the other will fail. Then all of the other processes which want to connect to the failed process will connect to the wrong process. This problem compounds as more player-processes start up, since each process would try to use a fixed port number that could potentially conflict with a port already in use on that machine.

Another technique is to ask the operating system for the first free port number during run-time, and then to write the port number out to a well-known file. If the networked computers have a shared file system --- a system where multiple computers can read and write to the same files --- then a well-known file is simply an agreed upon file name from which to read and write. All connecting processes can read the file to find out about the machines and port numbers of the other participating programs.

A third technique involves a name server. For networked systems which do not support a shared file system, then a special name server process can be created. The name server does the same thing as a file: it serves as a repository for the machine names and port numbers of the processes which want to communicate with each other. Instead of being a file, it is a running process which simply waits for other processes to connect to it. The connecting process can tell the name server what port it will be using, or request the machines and port numbers of other processes. The name server must open one well-known port number which the other processes can use to access it. That one port number must be established between all of the programs in advance, but all of the other processes can dynamically establish their port numbers, substantially reducing the risk of a port number collision.

Correct Protocol

In addition to figuring out who is talking to whom, the computers must also agree upon a protocol for establishing how they transfer information back and forth. For example, in the real world, people use a radio protocol where they say "roger" to acknowledge that they received and understood the message and "over" when they are done [Santifaller91]. "Over" serves to let the other party know that the first party is not simply pausing, but is done speaking for the moment. For computers, a network message may include a special header or trailer to mark the message boundaries.

Figure 2: network message buffer

"Roger" is sent by the receiving party to acknowledge the receipt of a message. This "acking back" helps maintain order and a steady-state flow of communication between the sender and the receiver.

Figure 3: acking (acknowledging) back

Communication Efficiency

Because the programs may send arbitrarily long messages to each other, for purposes of efficiency, the network may break a long message into fixed-sized packets before sending the data across to the receiver. All of these packets must be received and put back together to form the original message. On networks where there are several data paths from one machine to another, some packets may take faster routes, or even get lost.

Figure 4: multiple routes

The receiver must detect incorrectly ordered packets, and also determine when a packet is lost and request a repeat. This requires additional overhead in the messages and in the time to process the messages. For reliable networks where packets are not lost or re-ordered, this overhead is not necessary. There are two common protocols which deal with the efficiency and reliability of network packets: Transfer Communication Protocol and Internet Protocol (TCP/IP) and User Datagram Protocol (UDP). TCP/IP guarantees that the packets will arrive and that they will arrive in order. An alternative is UDP, which is faster, but provides to guarantees. For the purposes of this tutorial, we will use the more reliable TCP/IP protocol. For a detailed discussion on this protocol and several examples of code, we recommend Stevens' excellent book [Stevens90].

Current VE Communication Models

At present, there are two popular methods for implementing distributed Virtual Environments. In a centralized controller model, one computer collects all of the data from the different machines, stores the changes in some collection of data structures (which we'll call the database), and then it sends the results back out to each participating machine, which then drawing the scene's contents, and handles any user input. Although this model allows the programmer to develop a simple data structure to store and handle the data, it is not scalable. As more and more programs enter the simulation, all of the programs suffer, since they all must wait longer on the bottlenecked, centralized computer as it updates the database from each of the other programs.

Figure 5: centralized model

An alternative, more scalable approach incorporates a distributed controller model. Here, each program maintains its own complete, local copy of the database as well as performing the drawing, computation and animation of objects. When a program makes changes to its own database, it sends the data out so that all of the other programs can update their individual databases correctly. This model is scalable, because no single CPU is the bottleneck. The problem now involves the number of messages being sent among players.

Dead Reckoning

One way to reduce the amount of messages being sent is to use an algorithm called dead-reckoning. Dead-reckoning is at the heart of popular simulation mechanisms, such as DIS (the Distributed Interactive Simulation protocol) [IEEE93], and is used in SIMNET[Pope89] and NPSNET[Macedonia95].

Consider a distributed space "dogfight" game with n players, each controlling his own space ship. Player X's program could move its own ship, and then send a message to all of the other n-1 players, telling them all where the ship has moved. Player Y's program would then move its ship and then tell everyone about the change. After each player-program has moved its ship just once for this round, this would result in n-1 messages from n players, or roughly n2 messages. As soon as a player-program moves its ship again, the player-program would have to send out n-1 more messages.

To reduce the amount of messages being sent, player X could instead send just the ship's position and its velocity (direction and speed). Then, when the ship moves again, if it moves along the same path as the previously-established velocity, player X does not need to tell the other players. The other players will already know where X's ship is because they have the ship's old position and current velocity. This ship position calculation, based on a last-known velocity, is called dead-reckoning.

Figure 6: actual path versus dead reckoning path

To accomplish this distributed model of simulation, every player program maintains a complete copy of the database, and each program is in charge of moving all of the objects within its database. A program may have direct control of a certain object (called its "live" object), or it may use dead-reckoning to move an object (called a "ghost", since the object is actually controlled by another process). To determine when an object has changed significantly in value, the program runs the dead-reckoning algorithm on the live object as well, and compares the difference. If the difference is significant, then the program informs the other players to update their ghost-objects with the new position and velocity.

Note that if player X's ship strays only slightly from the dead-reckoning path, then player X may decide that the difference is negligible, and not send messages to the other players. The player only needs to send a message out to the other players when the change is significant.

A Concrete Example

As a concrete example, the following describes the implementation of a two-dimensional (2D) multi-player, networked game. The code for this example is freely available via the Internet (see a later section entitled How To Obtain the Code). This code was developed in ANSI C using X-Windows graphics primitives and runs on SGI and Sun workstations under UNIX. We chose this level of simplicity so that we may provide template source code that can be used and extended immediately.

In this example, each player (program) controls the position and velocity of a single object, a space ship, but sees all of the participating space ships in the virtual environment.

Figure 7: two views of the simulation

Each player can accelerate, declerate or rotate his or her own ship, and maintains a complete database with all of the ships' positions and velocities. The simulation uses dead-reckoning, so it uses the current velocity to update the position of the other players' ships (ghosts). It must also run the dead-reckoning algorithm on its own live ship, to see if it has deviated significantly from the path the other players are maintaining. If so, then the program must send messages out to inform the players of the new position and velocity.

Time-Based versus Frame-Based Animation

On a particular computer, the VE is calculating the position of the ghosts using the last reported velocity. The program may draw the ship in one position, then clear the image and draw it again in another position. These drawings are called frames and if the frames are presented fast enough, the illusion of motion is conveyed to the user.

while (!done)
{
ClearScreen();
for (i=0;i<numberOfShips;i++)
{
ship[i].position = CalculateShipPosition(ship[i].velocity);
DrawShip(ship[i]);
}
}
If the velocity of a ship is specified in frames --- that is, the ship moves one centimeter per frame --- then the ship may end up at different locations on different machines. For fast machines drawing at sixty frames per second, in one second the ship will have moved over half a meter. For a slower machine, drawing at ten frames per second, the ship will only be a tenth of a meter along.

To solve this, the ship's speed must be specified in distance per unit of time, rather than distance per frame. Instead of 1cm/frame, we specify velocity as 1cm/second. On fast machines the animation would be smoother than on slower machines, but both would present the ship at the same position.

Using a File as a Name Server

As we mentioned earlier, each player-program that enters the simulation needs to find out about the other players, so that it can connect to them and communicate messages. The player-program also needs to leave its own machine/port number identification so that new programs can connect to it. We use a file to which everyone can read and write, as storage of this information. This requires a shared file system. If this is not an option, then the developer can use the same programmer's interface but will need to write the name server as a stand-alone process.

The file contains a list of active players, for each one, list a machine and port number. One potential problem is that two new players may attempt to access the file at the same time, resulting in one writing back the file and then the second one overwriting the file and destroying data. We perform a "move file" to guarantee that this condition won't occur. The first player moves the file to a temporary location. Then, should another program try to access the file, the second program will not find the file. The second program will wait until the file returns, knowing that when the file does return, what it will read will be complete, and that no one else will be in the process of updating the file. We assume that two programs cannot move the same file at the same time.

After a program has moved the file, the program reads in the list of players, sends messages to the other players, then writes the file back out with its own machine/port identification. Similarly, when leaving the game, the player-program moves the file, deletes its entry, moves the file back, and tells the other players of its intention to leave. The last player out will leave a zero-length (empty) file. A zero-length file is distinguished from no file, because no file means some other program is currently accessing the file, while zero-length means that there are no players (an initial condition).

Implementation of a Simple VE

To implement this virtual environment we assume that we have the following programmer's interface to carry out the network communication. We have written these functions in C, and are available via FTP. This library of functions is used by the developer when creating a multi-player simulation:

playerList=CreatePlayer(GAME_FILE): This function creates a socket for the player process, obtaining a new port number dynamically. The function then checks for the well-known name file. If the file does not exist, that means someone else is currently accessing it, and so this function waits for the file to return. Once the file exists, this function moves the file (to prevent other writers from accessing the file), reads in the player-list and adds its machine name and port number into the file. This function also sends messages to all of the other players, so that they may add the new player as a ghost to their local database.
DeletePlayer(GAME_FILE): This function removes the player from the file and sends messages to all of the other active players, so that they may remove the player from their local databases. If this is the last player, the function leaves a zero-length file.
AddPlayer(playerList): This function adds a new player to the current player-list.
WriteToPlayers(playerList, message): This function sends the message to all the players in playerlist.
ReadFromPlayers(playerList): This procedure checks for incoming messages from other players and returns the received messages. If no messages are available, null is returned and the program continues (execution does not pause to await a message).
There are the three main parts to participating in the virtual environment - entering the simulation, maintaining current state information with other participants, and finally leaving the simulation. A flowchart of these steps is shown in Figure 8:

Figure 8: flowchart of simulation

Entering the Simulation

When entering the simulation, a participant must create a socket so that others can connect to the new player, read in the game file to find out who is playing, and write out an entry to the game file, so that new players will know about this player. After this, the new player must connect with all of the existing players.

/*
* this involves creating a socket for the player (from the machine and a port number)
* moving and opening the GAME_FILE
* creating a list of players (id, machine, port) which is returned
* adding this player to the GAME_FILE
* connecting to all of the players
* moving the GAME_FILE back
*/
playerList = AddPlayer(GAME_FILE);

The Main Loop

Since a new player may enter the simulation at any time, the current players must periodically check their socket for new players. During each simulation cycle a player must get user input which controls the "live" ship, check for messages about a new player connecting, check for messages about ship updates from other players, and draw all of the ships (a ship's position is computed either by a message update, or by using the dead-reckoning algorithm). If the "live" ship controlled by this program has deviated significantly from the dead-reckoning path, then the program sends a message to all of the participating players.

while (!done)
{
message = readFromPlayers(playerSocket); /* read any messages from other players */
switch (message.type)
{
case UPDATE:
FixOwnGhostPosition(message);
break;
case NEW_PLAYER:
AddNewGhost(message);
break;
}
done = getUserInput(); /* user may press `q' to quit */
CalculateAllShipPositions();
if (CompareLiveAndDeadReckoning())
{
WriteAllPlayers(liveShipPosition);
}
DrawShips();
}

Exiting the Simulation

The simulation loop executes until the player quits the simulation. Upon exiting, the connections to all other players are closed and the player entry is removed from the simulation file. The process of deleting a player is shown in the following pseudocode:

DeletePlayer(GAME_FILE, playerSocket);
WriteAllPlayers(QUIT);
ClosePlayer(playerSocket);
If this player is the last one in the simulation, then the game file is left as a zero-length (empty) file.

Sample Execution Sequence

The following illustrations provide snapshots as the execution progresses. Players enter the game, participate in the simulation and leave the game:

The First Player In

When the first player-program enters the simulation, it updates the game file and creates a ship. Since there are no other players, there are no "ghost" ships for this player.

Figure 9: First Player In

The Next Player in (and the Rest of the Players)

When the next player and all players hereafter enter the simulation, they update the game file, connect to the existing players and inform them that it has entered the simulation.

Figure 10: Next Player In

Moving Around

With several players in the simulation, figure 11 shows how different players' screens appear when a ship moves. When a player moves its own ship, if the movement does not deviate much from the dead-reckoning, then no messages are sent. The ship still moves on all screens as a result of the dead reckoning algorithm:

Figure 11: Ship position on different screens

If the ship does deviate significantly from the dead reckoning path, then the player informs all of the other players to update their ghost ships with the new correct ship position and velocity.

Figure 12: Ship Position after Update

Leaving the Simulation

When a player-program wishes to leave the simulation, it informs the other players (so that they will remove the ghost ships from their local databases) and removes its entry from the game file.

Note that the game file is only accessed when new players enter the game and old ones leave. It serves one purpose: to let new players find out who is already in the simulation and how to contact them.

How To Obtain The Code

Note to reviewers: we are still user-testing the distribution mechanism of the skeleton code with a set of naive users: these instructions will change in the final version to reflect those changes.

To obtain the source code, makefile, datafile and executable, type the following:

· $> ftp uvacs.cs.virginia.edu
· login name: anonymous
· password: (your name and machine, for example dave@virginia.edu)
· $> binary
· $> cd pub
· $> cd multiplayerTutorial
· $> prompt
· $> mget *
· $> quit
Once you have done this, you can run the game by typing game at the command line. If problems arise:

· make sure you have a zero-length (empty) data file called gamefile
· make sure other players have read and execute permissions on the directory, and read and write permission on the gamefile. If you are not familiar with UNIX, you may need help with this, if problems occur.
Start the program from several machines and play the game. After you have experienced the game, try modifying the code. We recommend a simple change at first, such as the shape of the ships, just so you can successfully demonstrate that you are correctly re-building the executable with your changes.

Other Issues for a Multi-User VE

While the previous sections discussed some of the fundamental issues involved when setting up a multi-player simulation, there are several more advanced issues that go beyond the extent of this tutorial. Understanding these issues is not necessary to build your own distributed application from the template, but we mention them briefly.

Collision Detection

This simple example did not implement collision detection or perform physically correct modelling, but clearly these are potentially useful features. Collision detection quickly becomes more complicated as the number of objects in the environment increases, and the accuracy with which the checking is performed. The issue of collision detection is discussed in several references [Uchiki83] [García94].

Physically correct modelling is a complex topic. In order to perform realistic smooth updating, state variable characteristics of the object being modelled need to be considered. As an example, Moshell [Moshell94] developed a complex model for dynamic terrains that involves numerical methods to solve complex models of Newtonian objects. For interested readers, there are VE development platforms such as VEOS [Coco] and DIVE [Carlsson93], with more advanced primitives to simplify virtual environment development.

Reliability

As the distributed programs become more complicated, the chances for a process to fail increases. How do the other players find out that a client no longer exists? One technique is to have a program periodically "ping" the players to make sure that they are still up and running. Another technique is to make it the responsibility of the players to periodically issue "heartbeats" or "I-am-alive" messages to the other players.

Managing Clocks

If timing data is shared between processes, then their clocks have to be synchronized. One process might think it is 1:00 pm and another might think it is 2:00 pm. A well-known solution which involves passing everyone's clock time around to synchronize clocks is discussed by Lamport [Lamport78].

Simulation Time

Imagine the scenario where ship A and ship B are both on a collision course with ship C. If two ships collide, then they are instantly vaporized. If ship A and B hit C at exactly the same time, what happens? If the events are processed in a particular order different results may occur. If the event of A hitting C is processed first, then A and C may be removed from the game and B survives. Conversely, if the B-hits-C event is processed first, then A is left alive. One solution is to "freeze" time and process each event without regard to other events happening during that frozen moment. Within a given simulation-tick all simultaneous events are processed but their results are stored in a separate "new state". By doing it this way, events that happen at the same time are not influenced by the order in which they are processed. A hits C and A and C are marked for deletion. B hits C and B and C are marked for deletion. The new state does not have A, B or C.

Conclusion

We provide new authors with a technical framework to begin designing multi-player systems. By providing some basic background information, a simple 2D multi-player simulation example and an FTP site for the source code, we make it possible for new developers to immediately begin creating their own distributed, multi-player virtual environment.

Acknowledgments

This work arose from efforts to complete a short term project given to the graphics class at the University of Virginia. Much of the literature addressing distributed systems delved into complex issues at abstract levels, requiring a steep learning curve to even begin to explore multi-player simulations. This initiated the desire to produce an introductory tutorial. We thank the 1994 computer graphics class at the University of Virginia, especially David Shiflett and Jeff White, and we give special thanks to the Virginia User Interface Group for their work on the Alice system [DeLine93].

References

[Blau92] Brian Blau, Charles Hughes, Michael Moshell and Curtis Lisle, "Networked Virtual Environments," Computer Graphics, Special Issue on 1992 Symposium on Interactive Computer Graphics, March 1992, pp. 157-160.
[Carlsson93] Chirster Carlsson and Olof Hagsand, "DIVE--- A Platform for Multi-User Virtual Environments," Computers and Graphics, 6, 1993.
[Coco] Geoffrey P. Coco, and Dav Lion, "Experiences with Asynchronous Communication Models in VEOS, A Distributed Programming Facility for Uniprocessor LANs," Technical Report FJ-15, University of Washington.
[DeLine93] Robert DeLine, "Alice: A Rapid Prototyping System for Three-Dimensional Interactive Graphical Environments," Masters Thesis, University of Virginia, May, 1993.
[García94] Alejandro García-Alonso, Nicolás Serrano and Jaun Flaquer, "Solving the Collision Detection Problem," IEEE Computer Graphics and Applications, 14(3), May 1994, pp. 36-43, T385.I13 (ISSN 0272-1716).
[IEEE93] Institute of Electrical and Electronics Engineers, International Standard, ANSI/IEEE Std 1278-1993, Standard for Information Technology, Protocols for Distributed Interactive Simulation, March 1993.
[Lamport78] Leslie Lamport, "Time, Clocks and the Ordering of Events in a Distributed System," Communications of the ACM, 21(7), July 1978, pages 558-565, QA75.A68.
[Moshell94] Michael Moshell, Brian Blau, Xin Li and Curtis Lisle, "Dynamic Terrain," Simulation, 62(1), pages 29-40, TA34.S54, (ISSN 0037-5497).
[Pope89] Pope, Arthur, BBN Report No. 7102, The SIMNET Network and Protocols, BBN Systems and Technologies, Cambridge, Massachusetts, July 1989.
[Santifaller91] M. Santifaller, TCP/IP and NFS: Internetworking in a UNIX Environment, Addison-Wesley, Reading, MA, 1991.
[Stevens90] Richard W. Stevens, UNIX Network Programming, Prentice-Hall, N.J., 1990, QA76.76.063S755, (ISBN 0-13-949876-1).
[Uchiki83] T. Uchiki, T. Ohashi and M. Tokoro, "Collision Detection in Motion Simulation," Computers and Graphics, 7(3-4), 1983, pages 285-293.
[Macedonia95] Michael R. Macedonia, Michael J. Zyda, David R. Pratt, Paul T. Barham and Steven Zeswitz. "NPSNET: A Network Software Architecture for Large Scale Virtual Environments," submitted to the special issue of Presence on Networked VE & Teleoperation