15-410 Project 1: Night Driver
Table of Contents
Project OverviewIn this introductory project you will be writing three device drivers. These three device drivers will constitute a library which will be used to construct single-task "kernel applications". You will write a small game application to run on top of your library. In addition, we will provide you with a basic application that will test some aspects of your device-driver library.
The first is the console device driver. The console device driver is responsible for printing characters on the screen. Without it, it would be rather difficult later in the semester to obtain feedback as to what is going on in your operating system or user applications.
The second device driver you will write is the keyboard device driver. Whenever the user presses a key, the keyboard device driver needs to find out what key it is, whether it was pushed or released, and then make that information available to the operating system. An important aspect of the keyboard device driver is that it is interrupt driven. We will take a closer look at what this means later.
The third "device driver" is the timer handler. While we do not often think of the on board PC timer as a device, it generates interrupts just like the keyboard. The timer is a critical device that will be used later in context switching between processes.
It is not possible to test these device drivers on their own. In
order to develop and test them, you will be provided with a skeleton
kernel. The kernel performs some very basic machine initialization
such as transitioning to protected mode. Once this initialization
is complete, it transfers control to the application's
Once you have implemented the three device drivers, you will use the three of them to implement a video game modeled after Atari's 1976 arcade game Night Driver. A detailed explanation of the desired behavior is presented later in this handout.
Technology DisclaimerBecause of the availability, low cost, and widespread use of the x86 architecture, it was chosen as the platform for this sequence of projects. As its creator, Intel Corporation has provided much of the documentation used in the development of these projects. In its literature, Intel uses and defines terms like interrupt, fault, etc. On top of this the x86 architecture will be the only platform used in these projects.
The goal of this project set is certainly not to teach the idiosyncrasies of the x86 architecture (or Intel's documentation). That said, it will be necessary to become accustomed to the x86 way of doing things, and the Intel nomenclature, for the purposes of completing this project set. Just keep in mind that the x86 way of doing things is not the only way of doing things. It is the price to be paid for learning the principles of operating systems on a real world system instead of a simulated architecture.
Welcome to x86 Kernel ProgrammingThese projects will be a new experience for many of you because they are special in two ways. The first is that, as kernel programmers, you are subject to a number of constraints that do not show up in user-level programming. The second is that you will need to look down at the hardware you are running on and manipulate device registers/processor data structures to get your kernel to work. Because this is new for many of you, the following section is devoted to saying a few things about kernel programming, and then summarizing some of the important features of the x86 architecture which you will need to know.
Kernel Programming: Always Assuming the WorstProgramming in the kernel is quite different from writing user level applications. As has been emphasized in class, safety is a priority in kernel code - that is, checking pointers passed to you by user code before you dereference them, testing all possible cases to ensure there are no infinite loops, etc. Kernels need to be as absolutely bulletproof as possible. For example, the following line will happily execute as part of your kernel:
*(char*)0x0 = 0x1badd00d;
Wild pointer accesses are much more dangerous in kernel mode because valuable kernel data structures are not protected by the user mode/kernel mode protection schemes of the processor (discussed here shortly). Dereferencing and writing to a bad pointer like this within the kernel can overwrite crucial kernel data or reboot the (simulated) machine.
The processor checks the privilege level (PL) in a number of
circumstances, such as when an attempt is made to execute a privileged
instruction. These are instructions that modify control registers or
change how the processor is running (a list of these instructions can
be found in section 4.9 of intel-sys.pdf). These are instructions we
would want only the kernel to be able to execute, so they are only
allowed while the processor is running in more privileged levels (such
as PL0). If you try to execute such instructions at other privilege
levels, a fault occurs. Fun thought experiment: If privileged
instructions cause a fault, how does one then change privilege levels?
A segment is simply a portion of the address space. Once defined, we can associate important characteristics with the segment, like the privilege level required to access the segment or whether it contains code or data. A segment does not need to represent memory that actually exists - for example, we could define a segment ranging from 0 to 4GB (the entire 32 bit address space), even though we might only have 128MB of physical memory.
Because your kernels are going to be complicated enough as it is, we will steer clear of segmentation by using only four segments. These four segments all span the entire 32 bit address space (overlapping each other completely). Two of them require privilege level 0 to be accessed, and two of them require only privilege level 3. Of each privilege level pair, one segment represents code and the other data. We have set up these segments for you.
In this project, since no code will execute in user mode,
we will use only the two PL 0 segments.
To refer to these
segments when configuring kernel data structures (such as IDT entries),
Please refer to our segmentation document for a more detailed explanation of segmentation. Do not worry if parts of it do not make sense at the moment as they will become clearer later in the course.
Most devices in the x86 architecture are accessed through I/O ports.
These ports are controlled by special system hardware that has
access to the data, address, and control lines on the processor.
By using special hardware instructions to read and write from these
ports, we can use I/O ports without infringing upon the normal
address space of the kernel or user applications. This is because
these special instructions tell the hardware that this "memory"
reference is actually an I/O port, not a location in actual memory.
For more information on I/O ports, consult chapter 10 of intel-arch.pdf.
Both the timer and the keyboard use I/O ports. For convenience, an
assortment of C wrapper functions are provided to you to save you
from having to break into assembly language to read or write I/O
ports. These are located in lib/inc/x86/pio.h. The various
in functions read from an I/O port while the out
functions write to an I/O port. The letter after the name indicates
the size of the data being sent (b - byte, w - word/short, l -
long/int). For this project, in which you will be talking only to
primeval 8-bit I/O devices which date to the 1980's, you will
probably need only the byte-wide versions of these instructions,
There are also some devices which are accessed by reading and writing particular addresses in traditional memory. This is called "memory-mapped I/O". This kind of memory is part of the regular address space and therefore needs to be carefully managed. The video display hardware uses both memory-mapped I/O and I/O ports, so you must clearly understand the distinction.
When a device wants to raise a hardware interrupt, it communicates this desire to one of two programmable interrupt controllers (PICs) by asserting some control signals on interrupt request (IRQ) lines. The PICs are responsible for serializing the interrupts (taking possibly simultaneous interrupts and ordering them), and then communicating the interrupts to the processor through special control lines. The PICs tell the processor that a hardware interrupt has occurred and which request line the interrupt occurred on so the processor knows how to handle the interrupt. How devices are assigned to particular interrupt request lines is extremely complicated, but there are some conventions which are usually followed, displayed below.
The PIC chip used in old IBM compatibles only had 8 interrupt request (IRQ) lines. This proved to be limiting, so a second PIC was daisy chained off of the first one. When an interrupt is triggered on the second PIC, it in turn triggers an interrupt on the first PIC (on IRQ 2). The interrupt is then communicated to the processor.
Once the processor receives an interrupt from the PIC, it needs to know how to respond. The processor reads a data structure called the interrupt descriptor table (IDT). There is a descriptor in this table for each interrupt. A descriptor contains various information about how to resolve the interrupt, most importantly where the interrupt handler is located. The interrupt handler is a piece of code that the author of the device driver writes that gets executed when that device issues an interrupt. The IDT also stores the privilege level that is required to call the handler (DPL) and the segment to use while running the handler (which determines the privilege level to run under). Once the processor locates the appropriate entry in the IDT, it saves some information about what it was doing before the interrupt occurred, then starts executing at the address of the interrupt handler. Once the interrupt handler has run to completion, the processor uses the saved information to resume its previous task. Note that IRQ numbers do not necessarily correspond to matching indices into the IDT. The PICs have the ability to map their IRQ lines to any entry in the IDT. You will be given information on which IDT entries correspond to interrupts pertaining to your projects. For Project 1, the IDT entries will correspond to the timer interrupt and the keyboard interrupt. These IDT entries are further discussed in their respective sections.
Interrupt handlers typically need to execute as quickly as possible so that the processor is free to accept future interrupts. When an interrupt is sent by the PIC, the PIC will not send another interrupt from that same source until it gets acknowledged through an I/O port. This is because interrupt handlers usually manipulate critical data structures and would not withstand being interrupted by new invocations of themselves (i.e. they are not reentrant). In particular, an interrupt handler must never block on anything. Most interrupt handlers simply make a note of work that must be done as a result of the interrupt, clear the interrupt, then terminate, leaving the work to be done at a more convenient time. Note that it may be possible for one interrupt handler to be interrupted by a different interrupt handler, so long as they do not share data structures.
In addition to hardware and software interrupt handlers, the IDT also contains information about exception handlers. Exceptions are conditions in the processor that are usually unintended and need to be addressed. Page faults, divide-by-zero, and segmentation faults are all types of exceptions. We will cover both software interrupts and exceptions in more detail later, especially in Project 3. For now, we just need to look at how to install an interrupt handler.
An entry in the IDT can be one of three different types: a task gate, an interrupt gate, or a trap gate (a "gate" is just a descriptor that contains information about a transition to a different piece of code - think "gate to new procedure"). We will not be using task gates because they make use of the processor's hardware task switching functionality (which we also will not use for reasons we will discuss later in the course). The difference between an interrupt gate and a trap gate is that when the processor accesses an interrupt handler through an interrupt gate, it disables all further interrupts by clearing a flag in one of the processor's registers. Handling interrupts through trap gates does not do this.
For the interrupts in this project, we will be using trap gates (in later projects you will probably use a mixture of trap and interrupt gates). Note that a single interrupt source (timer, keyboard, etc.) will not signal a new interrupt to the processor until the processor has indicated that handling of the previous interrupt is ``done'', even if the system-wide interrupt-enable flag is on. You will need to give careful consideration to how long it is reasonable to disable interrupts globally and how long it is reasonable to wait before acknowledging a particular interrupt.
The format of the trap gate is given on page 151 of intel-sys.pdf. Note that regardless of the type of the gate, the descriptor is 64 bits long. You can use this fact to index into the IDT without knowing the type of the other gates in the table. To get the base address of the IDT, we use an instruction called
Initially, the PICs are programmed so that they call entries in the IDT that conflict with processor exceptions. The pic_init() function in lib/inc/x86/pic.h remaps the PICs to higher IDT entries. We have provided you with a skeleton version of night.c which shows you how and when to call pic_init().
Some fields in the trap gate are self-explanatory, however others are not so they are summarized here:
This may be the first time you are required to write code which packs a data structure into a hardware-defined memory format. There are several different ways to write this kind of code. Please make sure that, regardless of the approach or style you use, your code is structured in a reasonable way.
This information is indeed enough to resume executing whatever code was running when the interrupt first arrived. However, in order to service the interrupt we need to execute some code; this code will clobber the values in the general purpose registers. When we resume executing normal code, that code will expect to see the same values in the registers, as if no interrupt had ever occurred. So the first thing an interrupt handler must do is save all the general purpose registers, plus %ebp. We need not save the stack pointer because if there is no stack change (as is the case in this project since we have no user processes), the stack pointer will be correct once we pop off all of our interrupt-related information. If there is a stack change (as there will be in Project 3 when you do have user processes), the interrupt saves the stack pointer for us. So, to recap, the registers you need to save are %eax, %ebx, %ecx, %edx, %esi, %edi, and %ebp.
The easiest way to save these registers is to just save them on the stack. This is easily done with the
A final note about writing assembly. Comment your assembly profusely. One comment per instruction is not a bad rule of thumb. Keep your assembly code in separate files, and to export a symbol (e.g.,
For those of you who have not written assembly before, here is the expected form:
Assembly files normally have a .s or .S extension. The two extensions have different meanings to gcc; gcc will run the preprocessor on assembly files named with .S, while it passes files ending in .s to the assembler unaltered. Since you'll probably want to use #defines and other preprocessor statements within your assembly code, we strongly suggest naming your assembly with the .S extension. Note that the C preprocessor will swallow C-style comments in your .S files so they won't annnoy the assembler.
For C code to call your assembly-language functions, you should declare them in a header (.h) file. This is a good place for you to place doxygen comments for the assembly-languge code.
The pixel pattern displayed by the graphics adaptor is controlled by a region of main memory (memory mapped I/O). Each character on the console is represented in this region by a byte pair. The first byte in this pair is simply the character itself. The second byte controls the foreground and background colors used to draw the character. These byte pairs are stored in row-major order. We have configured your graphics controller to paint a screen consisting of 25 rows of 80 characters each. The location of video memory, as well as the color codes used for the second byte of each pair, are defined in 410kern/lib/inc/video_defines.h.
Writing a character to the console is as simple as writing a byte pair to video memory. Assuming the top left corner is (0,0), to write the character 'M' on the 2nd character of the 1st line, you would do something like this:
While our support code has already put the hardware in 80x25 format, you will also need to reposition or hide the hardware cursor. The hardware cursor is controlled by the Cathode Ray Tube Controller (CRTC), an important device on the video card. Communication with the CRTC is accomplished with a special pair of registers. Accessing these registers is a little different from writing to console memory as they use the I/O ports (discussed earlier).
int putbyte( char ch );
void putbytes(const char* s, int len);
void draw_char(int row, int col, int ch, int color);
char get_char(int row, int col);
void set_term_color(int color);
void get_term_color(int* color);
void set_cursor(int row, int col);
void get_cursor(int* row, int* col);
The actual timer interrupt handler in this project is quite simple, although in Project 3 you will be using the timer interrupt to trigger your scheduler. For this project, you'll be using the timer interrupt for to measure elapsed times.
To initialize the timer, first set its mode by sending
When the timer interrupt occurs the processor consults the IDT to find out where the timer handler is. The index into the IDT for the timer is
Your timer interrupt handler should save and restore the general purpose registers. You also need to tell the PIC that you have processed the most recent interrupt that the PIC delivered. This is done by sending an
Note: You will be testing this on an instruction set simulator. Even though you are simulating an older processor on relatively fast machine, Simics does not make an effort to exactly correlate the simulation to real wall clock time. If you have it set up properly, it will run in real time on real hardware.
The index into the IDT for the keyboard is KEY_IDT_ENTRY (defined in 410kern/lib/inc/keyhelp.h). When you receive a keyboard interrupt, your keyboard handler needs to read the scancode from the keyboard by reading a byte from the keyboard's I/O port (KEYBOARD_PORT defined in 410kern/lib/inc/keyhelp.h). You also need to tell the PIC that you have processed the keyboard interrupt. This is done by sending a INT_CTL_DONE to one of the PIC's I/O ports, INT_CTL_REG.
So that your device driver library might be used by the widest range of applications, such as a real-time game, we will ask you to structure your keyboard interrupt handler so it does as little work as reasonably possible. In particular, you should postpone processing the scan code until you are no longer inside an interrupt handler. For Project 1 you should assume that the process_scancode function is expensive timewise, so that performing that processing in the keyboard interrupt handler might disrupt the application.
Do not forget that your keyboard interrupt handler needs to save and restore the general purpose registers! Also, we have included functions that enable and disable interrupts in 410kern/lib/inc/x86/proc_reg.h (namely enable_interrupts() and disable_interrupts() respectively). You will most likely need to use them once in your keyboard driver to ensure there are no interrupt-related concurrency problems. Certain implementations may not need them. If you believe this is the case for your implementation, please add a comment explaining why this is so.
Before handing in your project, you should make certain that your drivers work with our test code, provided in 410test.c. To test your drivers against the provided test code, build your project using the 410test make target using "make 410test". Please note that this make target does not update the project's support files automatically; you should run make (or make afs or make web), followed by make clean, before running make 410test.
Although various printing functions are provided in stdio.h (namely printf(), putchar(), and puts()) they will not work until you implement putbyte() and putbytes().
Also, we have provided a function called
Atari introduced the Night Driver arcade game in 1976, when Professor Maggs was thirteen. As you can imagine, it made quite an impression. The object of this game is to drive as quickly as possible along a curvy, unlit, road on a moonless night. We're talking pitch black here. Of course when you drive at night, you run the risk of veering off the road into one of the deep and booby-trapped ditches on either side. Fortunately for you, the Federal Government has conscientiously applied the windfall of tax dollars collected from .com millionaires through the pernicious Alternative Minimum Tax. At the very edge of the road, on either side, they have placed reflectors. You should not drive over these reflectors, for if you do, disaster will ensue.
In the arcade version of the game, the driver controlled the car with a steering wheel, gear shift, and gas and brake pedals. (What, no clutch?) The score was based on how quickly the driver covered a fixed course (called a "level") without crashing (i.e., running over a reflector). The video display was very simple. It was black, with white reflectors scrolling off the bottom of the screen as the car passed them. Here's a screen shot of the arcade version of Night Driver from the video game emulator MAME. (This screenshot was taken from the web site Arcade@Home.)
The white box at the bottom indicates the width of the car. (You can think of the box as the hood of the car, as seen through the windshield.) The car itself never moved at all on the screen. Note that a much lesser home version of Night Driver was released by Atari in 1978. A serious gamer would not confuse the home version with the real thing.
You will write a simple version of Night Driver, in which the only control for the car is the steering wheel. (Since you do not have a steering wheel input device, you will use the keyboard to steer.) Your steering mechanism can be very simple, e.g., pressing the key to turn the steering wheel left might simply move the car one character to the left. You may either move the image of the car left and right, or you may keep the car in a fixed position, and move the reflectors left and right.
Your program must scroll the reflectors down the screen in real-time, and detect if the car has run over a reflector. The car may move along the road at a constant velocity, i.e., the reflectors may stream by your car at a constant rate. The car should not move up and down. The intent is to provide a driver's perspective of the road. It is NOT necessary, however, to draw the road in perspective (i.e., narrower at the top of the screen), as the arcade version did. The game should be playable in SIMICS WHEN SITTING AT AN ANDREW WORKSTATION, as that is how we will test it. A driving level should not be too long, i.e., it should not take more than thirty seconds to complete. (Unfortunately, throughout this assignment you will find yourself annoyed by the differences in speed between simics at an andrew workstation, simics over the network, and real PC hardware.)
If the the car hits a reflector, the game should stop and issue a suitably stern admonishment, and then give the driver the option of starting over. Otherwise, if the driver completes the level, the program should the tell the driver how much time has elapsed. The driver should then move on to the next level. It is not necessary to give a "score" to the driver.
You do not have to worry about designing roads. We will provide a pseudo-random-road generator that you can use to create a curvy road "on the fly". This is contained in 410kern/levels.c.
The levels API is very simple. There are two functions for you to call, seed_levels and init_levels. seed_levels takes in an int with which it will seed the random number generator. After the random number generator has been seeded you can call init_levels which will generate the levels for you. init_levels takes in the number of levels you wish to generate, the height (in lines) that each level should be, and some bounds information. All levels generated will be placed in the levels  struct. More documentation is in 410kern/inc/levels.h.
Feel free to write your own levels or level generator in place of ours and have fun with this!
The above description is a bare-minimum requirement for the
project. Feel free to be more creative: add perspective, model
momentum in steering, provide gears, brakes, and a gas pedal, or do
whatever you think would be fun. If an idea seems too unusual, feel
free to check with a TA before implementing it.
Testing and Debugging with SimicsDebugging operating system code is typically quite difficult. One of way of doing it is to have two machines, one running a standard production OS like Linux, and the other running (and presumably crashing) the OS being developed. A serial cable is used to attach the two machines and special serial handlers installed on the development OS allow a debugger on the production OS to debug the development code. Fortunately, you will not have to deal with this. For the duration of this course, you will be testing and debugging your projects with Simics. Simics is an instruction set simulator. This means that every instruction that Simics runs is simulated, rather than run on real hardware as in VMware. For this reason Simics runs quite a bit slower than VMware. The amount of code in your operating system is small enough, however, that the difference will not be noticeable.
Because Simics simulates every instruction executed, it has some powerful debugging capabilities. It allows, for example, breakpoints to be set on memory accesses to certain locations (specifically reads, writes, executes, or any combination thereof). It also allows temporal breakpoints - breakpoints that occur after a certain number of instructions have been executed. Like the popular debugger GDB (which most of you should be familiar with from 15-213) it has some symbolic debugger capabilities as well.By typing help at the Simics prompt, you will be presented with a list of different categories of commands. To see the commands in a specific category, type help CATEGORY. To get help on any of these specific commands, type help COMMAND. Please see our Simics Commands Guide for a list of important commands to know.
Getting Started with Simics
Simics has been installed on AFS and thus may be used from any Linux machine connected to AFS, subject to licensing restrictions. If you have added the 15-410 bin directory to your path, you can execute simics on your kernel by executing simics-linux.sh You may also use a Solaris machine, but your code still must be compiled and run on a Linux machine (you can ssh into linux.andrew.cmu.edu and compile and run there).
The BIOS and BootloaderBefore your kernel starts running, you will always see the BIOS startup followed by the bootloader. We chose GRUB as our bootloader for various reasons which we will not go into here. The bootloader has been configured to present you with a boot menu consisting of two choices. You will be able to choose between running your video game or the 410_test program.
Triple faults are easy to run into in Project 1. Since you are not installing many fault handlers, most IDT entries are invalid. If, for example, your Project 1 code makes an invalid memory reference, a protection fault could be raised. Since you have not installed a protection fault handler, there is a good chance that referencing the nonexistent handler will cause a "segment not present" exception to be raised. Since you have not installed a "segment not present" handler...
For more information see the 410 Triple Fault Page.
The meaning of each fault condition is described in Chapter 5 of intel-sys.pdf.
The 410test program provides a different kernel_main() function
which will link against the same $OBJS list and will call
your console and keyboard API functions. Before it does
that, however, it will call
See http://www.cs.cmu.edu/~410-s05/p1/handinP1.html for details.
Up to you! But here are some suggestions.
[Last modified Friday January 21, 2005]