The kernel project is a four week project, unlike the two projects you have
completed. It will be very important to adhere to a reasonable schedule
throughout the four weeks to ensure completion of the project. While
project four (the file system) is probably more code than the kernel,
the kernel is arguably the most complicated project conceptually.
Goals
The goal of this project set is certainly not to teach the
idiosyncrasies of the x86 architecture (or Intel's documentation). That
said, it will be necessary to become accustomed to the x86 way of doing
things, and the Intel nomenclature, for the purposes of completing this
project set. Just keep in mind that the x86 way of doing things is not
the only way of doing things. It is the price to be paid for learning
the principles of operating systems on a real world system instead of a
simulated architecture.
Important Dates
Wednesday, February 19th: Project three assigned.
Friday, February 28th:Checkpoint one.
Friday, March 7th:Checkpoint two.
Friday, March 14th:Checkpoint three.
Friday, March 21st: Project three is due at midnight (Friday night).
This is the last day of class before spring break. If you are planning on
leaving before this time, you may certainly turn in your kernel early.
You have over four weeks to complete your kernel, plan accordingly.
Groups
The kernel project is a group assignment. You should already be in a group
of two from the previous project. If you are not in a group, or you are
having other group difficulties, send email to staff-412@cs.cmu.edu.
Grading
The primary criteria for grading the kernel will be correctness
and robustness. Does your kernel crash if you pass a null
pointer to your printline system call? Kernels are expected to take
abuse and continue to run.
Although it is not the first thing we will look at, style is also important. Use an appropriate level of commenting, name variables consistently, and try not to resort to global variables when possible.
It is also our intent that part of the project 3 grade will be based on a face to face interview process. We reserve the right to give partners in a group different project grades if circumstances make that necessary.
Please hand in a design document with your kernel. It should talk about all the important design decisions you made, what works, and what does not. If you tell us about something that doesn't work, it puts you in a better position to be given more partial credit than if you do not tell us and we stumble across it ourselves, because it lets us know that you understand that there is a problem and have spent some nonzero amount of time trying to find it.
Handin
The procedure for handing in project 3 will be the same as it was for
project 2. You should have a group directory...
Hardware Primitives
Project one involved a small amount of talking to devices and dealing with
processor data structure (like the interrupt descriptor table). In this
project, you will need to understand more processor data structures (such
as the x86 page table structure), and you will need to incorporate some of
the functionality of project 1 as well. In this section we will look at
some of the important hardware primitives that you will need to use.
Segment Selector Registers
As was mentioned in the project 1 handout, we will be making as little use
of segmentation as possible in these projects. However we do need to specify
code, stack, and data segments for the processor. These segments are specified
in segment registers. Specifically, the code segment for the currently
running process is stored in the %cs register. The stack segment for the
currently running process is stored in the %ss register. It is possible to
specify up to four data segments for the processor, and they are stored in
%ds, %es, %fs, and %gs. The code segment is used to access instructions.
The stack segment is used in stack related operations (such as push and pop
instructions). The data segments are used in all other operations that access
memory, such as data structure reads and writes.
Each segment has a privilege level associated with it, which is the minimum privilege to access that segment. In project 1, all of our segments were privilege level 0, and we never left kernel mode. In this project however, we will have user processes which cannot access segments that require privilege level 0. Fortunately we have already defined segments for user code and data (user data can be used for both the user's stack segment and data segment). They are specified in seg.h.
When you start a user process, you will need to specify a user-level code
and stack segment. You will also need to load a user-level data segment into
the data segment registers. When a user process takes an interrupt, the
code and stack segment registers will get saved automatically (see
intel-sys.pdf page 153). The data segment registers will not be saved, though.
You will need to save these along with the other general purpose registers,
and then restore them before the user process is run again.
The EFLAGS Register
The EFLAGS register controls some important state like whether or not
interrupts are enabled. You will never need to modify the EFLAGS register
directly, however it may be useful to understand its format. The EFLAGS
register is discussed in section 2.3 of intel-sys.pdf.
Although the EFLAGS register stores a flag controlling whether interrupts
are enabled or disabled, there is a much faster way to disable and enable
interrupts. The sti and cli instructions enable and disable interrupts.
You can find more information on them in intel-isr.pdf.
Control Registers
The control registers contain some of the most important flags on the x86
architecture. A summary of the control registers can be found on page 50 of
intel-sys.pdf.
Control register 0 (CR0) contains the most powerful system flags. The only
flag in CR0 that you should modify is bit 31 (the top bit), which activates
paging. Paging is active when the flag is set, and inactive when the flag is
cleared. Make sure you have set up your page directory (discussed below)
properly before setting this flag. Do not modify the state of any of the
other flags in this register.
Control register 1 (CR1) is reserved and should not be touched.
Control register 2 (CR2) is only used in one instance. When your kernel takes a page fault, CR2 will contain the address that caused the page fault. You will need to get the value of this register inside your page fault handler.
Control register 3 (CR3) is sometimes referred to as the page directory base register, or PDBR. It holds the address of the current page directory in the top 20 bits (these are the top twenty bits of the page directory's address, the bottom 12 must be zero since the page directory must be page-aligned). It also has two flags, present in bits 3 and 4. Bit 3 controls whether the processor's internal caches are write-back or write-through (if the bit is set, write-through caching is enabled, if it is clear, write-back caching is enabled). Bit 4 controls whether the page directory may be cached (if the bit is set, the page directory may not be cached). Both of these flags should be cleared (not set). You will need to update CR3 whenever you wish to switch address spaces. Fortunately the x86 architecture flushes the TLB for you when you update CR3.
Control register 4 (CR4) contains a variety of extension flags, which for the most part we will not be using. The one flag we do care about is the Page Global Enable (PGE) flag, stored in bit 7. We will want to set this flag to enable global pages. The reason for this is made clear below. Do not modify the state of any of the other flags in this register.
We have provided assembly macros to read and write from the control registers.
Be very careful about reading and writing to control registers (think about
interrupts).
The Kernel Stack Pointer
On the x86 architecture, the stacks for user level code and kernel level code
are separate. When an interrupt occurs that transitions the current privilege
level of the processor to kernel mode (refer to the project one handout
for background on privilege levels), the stack pointer is set to the top
of the kernel stack. A small amount of context information is then pushed onto
the stack to allow the previously running process to resume once the interrupt
handler has finished running.
The initial value of the stack pointer when we enter kernel mode is defined by the currently running task. Tasks are a hardware process mechanism provided by the x86 architecture. We will not be using tasks, however. Not only is it faster to manipulate the process abstraction in software, we can also tailor that abstraction to our particular design needs. However it is necessary on the x86 to define at least one task. The OSKit libraries take care of setting that up for you. We have provided a function to go into the processor data structure for this single task and set the initial kernel stack pointer value. It is called set_esp0(void* addr), defined in seg.h.
For more information on hardware tasks, refer to chapter 6 of intel-sys.pdf.
Paging
The x86 architecture uses a two-level paging scheme with 4k pages. It is also
possible to use larger page sizes, however 4k is by far the most commonly used
page size. The top level of the paging structure is the page directory,
while the second level consists of page tables. The format of the
entries in the page directory and page tables are very similar, however, their
fields have slightly different meaning. Here is the format of both a
page directory entry and a page table entry.

Both entry formats use the top 20 bits to specify an address (a page directory entry specifies the address of a page table, a page table entry specifies the address of a frame in memory). Note that both page tables and frames must be page aligned (they are both exactly 4k in size and the bottom 12 bits of their addresses must be zero), which is why we can simply specify the top 20 bits of the address.
The designers of the x86 architecture have kindly left us three bits which we can use as we please. These are the bits in the field marked "Avail" in both types of entries.
Bit 8 is the global flag, which has no meaning in the page directory entry (it can be set to either 0 or 1). In a page table entry however, if the global flag is set the virtual-to-physical mapping for this page will not be flushed from the TLB automatically. This can be useful for pages that are the same for all address spaces. You should use this to prevent the kernel mappings from being flushed on context switches. To use this bit, you also need to set the page global enable flag (PGE) in control register 4.
Bit 7 is the page size flag in the page directory entry, and in the page table entry it is the page attribute table index flag. Since we are only using 4k pages and they are all the same type of page, both of these flags should be set to zero.
Bit 6 is the dirty bit, and is only valid in page table entries. This flag gets set by the hardware when the page pointed to by the page table entry is written to. This can be used in implementing demand paging. It will not be used, however, in this project.
Bit 5 is the accessed bit. It gets set by the hardware when the page pointed to by a page table entry is accessed. The accessed bit gets set in a page directory entry as soon as any of its page tables point to pages have been accessed. You will not need to worry about using this flag.
Bit 4 is the page-level disable caching bit. If the flag is set then caching of the associated page (or page table if the entry is a page directory entry) is disabled. This flag should always be cleared to enable caching.
Bit 3 is the page-level write through bit. If it is set, write-through caching is enabled for that page (or page table if the entry is a page directory entry), otherwise write-back caching is used. This flag should be left clear.
Bits 1 and 2 of the page directory and page table entries interact with each
other to provide page-level protection. We will
be using page-level protection in the kernel to make sure that user processes
cannot clobber the kernel. See the table on page 136 of intel-sys.pdf for
the specific behavior of these two flags:
Bit 2 is the user/supervisor flag. If the flag is set, then the page is
user accessible. This flag has different meaning depending on whether it is
set in the page directory entry or the page table entry. It also interacts
with the user/supervisor flag. See the table on page 136 of intel-sys.pdf for
specific behavior.
Bit 1 is the read/write flag. If the flag is set, the page may be written to. If it is clear, the page is read-only. This flag has different meaning depending on whether it is set in the page directory entry or the page table entry. It also interacts with the user/supervisor flag. See the table on page 136 of intel-sys.pdf for specific behavior.
Bit 0 is the present flag in both the page directory and page table entries. If the page is marked not present, then an attempt to read, write, or execute data stored at an address within that page will generate a page fault. When you install a page into a page frame, you should set this bit. Likewise, if a page table is marked not present, then an attempt to read, write, or execute data stored anywhere at an address spanned by that page table will generate a page fault. When you install a new page table, make sure to set the present bit in its corresponding page directory entry.
Recall from the section on control registers that the address of the active page directory is kept in CR3, and the top bit of CR0 controls whether paging is active or not. You will need to install a page fault handler (this is similar to your installation of timer and keyboard interrupt handlers). That page fault handler will need to look at CR2 for the address of the fault, and decide whether it is a valid request (in which case you will need to allocate a new page and possibly a new page table), or an invalid request (an attempt to access kernel memory or the void between the heap and the stack). In order to get page faults when a user process attempts to access kernel memory, set the proper permission bits in the corresponding page directory and page table entries.
Of course, some accesses between the heap and the stack are valid; in fact, any time the user process grows its stack, it will access memory between the heap and the current stack pointer. You will need to decide how far is "too far" when judging whether a pointer access is an attempt to grow the stack, or a wild read/write into the middle of nowhere.
The Interrupt Descriptor Table
You should remember the interrupt descriptor table from project 1. It is where
we specify the handlers for interrupts, both hardware and software, as well
as exceptions. You will eventually need to install handlers for page faults.
You also need to install a handler for the illegal instruction exception, divide by zero exception, and general protection fault. Details on all of the exceptions (and their index into the IDT) can be found in section 5.12 of intel-sys.pdf. The general protection fault is the x86 catch-all exception that you will probably get a lot of in your kernel development - it would be valuable to get this handler installed early. The need for the other exception handlers should be obvious.
Installing a handler should be familiar from project one; if you're a little
rusty, consult your project one solution or the project one handout.
The Layout of Physical Memory
Although there are many ways to partition physical memory, it is required that
you follow the following model. The bottom 16MB of memory (from 0x0 to
0xffffff) is reserved for the kernel. This kernel memory should appear as the
bottom 16MB of each process' address space. This kernel memory is also direct
mapped; that is, the virtual addresses for the bottom 16MB of memory are the
same as the physical addresses. Note that user processes should not be able
to read or write kernel memory, even though it is resident at the bottom of
each user process' address space.
The rest of physical memory, i.e. from 0x1000000 on up, should be reserved for page frames. The simulated machines have 256MB of memory. Although it is quite possible to determine how much memory there is in the system, for now your kernel should have a #define for the amount of memory.
#define MACHINE_MEMORY_SIZE_MB 256
You should use this memory size in determining the size of your kernel data
structures.
Kernel Programming Environment
The programming environment for this project is quite similar to the one
provided in project one. OSKit will take care of rudimentary machine
initialization and get you to the main() function in kernel.c. If this
main() function ever returns, OSKit will print a message prompting you to
press a key to reboot. Your kernel should never reach this point. We will
provide you a shell which does not exit. If you ever detect that the shell
process has died, you should respawn another one.
While we could make you write your own memory management routines for kernel space, we have decided that you should be focusing on process management, virtual memory and system calls instead. To that end, there is a small collection of malloc functions that are available to you. OSKit supports these malloc functions with a list-based memory manager. In particular, here are the malloc functions that you can use.
To use these functions, #include <oskit/c/malloc.h.> You may also use the str*** and mem*** functions, as well as bzero. To use any of those functions, #include <oskit/c/string.h>. The prototypes of these functions are the same as their Unix counterparts, so you can look up the behavior/prototype using common man pages.
Your console output should be done using your solution to project one. If you
did not get project one to work, or were not satisfied with it, you may
contact us for some simple console and keyboard routines. We would much rather
you use your own code if possible however.
Context Switching
Context switching is probably the messiest part of this project. You will
have to write some assembly to do it, and it may seem a little mind bending
at first. It is a great feeling though when you successfully schedule two
processes for the first time, however!
Context switching involves saving the general purpose registers and the segment selector registers of one process, halting its execution somewhere, and then loading the general purpose registers and segment selector registers of another process and resuming its execution where it previously was stopped. You will also need to switch the active page directory to correspond to the new process' address space.
One way to simplify the burden of context switching is to always do it in the same place. If you have a single function which performs the context switch, then the point of execution for every process not currently running is inside that function. You need not explicitly save the instruction pointer.
Writing the routine to context switch processes that have already run is actually not so bad. The messier part is bootstrapping a process to get it to execute for the first time. Before you attempt to run a process for the first time, you will need to set up the top of its kernel stack so that your context switching code loads meaningful context. You might also have a routine which context switches a little differently if the new process has never run before.
A useful tool in bootstrapping processes is the iret instruction. The iret, which you should be familiar with from project one, is used to return from interrupt handler routines. It is capable of changing the code and stack segments, stack pointer, instruction pointer, and EFLAGS register all in one step. It is particularly useful in starting user processes for the first time. A good place to find more information on the behavior of the iret instruction (and what it expects on the stack) is page 153 of intel-sys.pdf. This gives a diagram of what iret expects on the stack.
Do not forget that you will need to register a handler for the timer interrupt, and that it may need to call your scheduler and context switch. Refer to your solution for project one and the project one handout to see how to use the timer. As in project one, your timer should go off every ten milliseconds (i.e. "ticks" are separated by ten milliseconds). As far as your scheduler is concerned, a simple round robin scheme is sufficient for this project.
A Loader for a.out
It will be a tough task indeed to convince everyone in the world to use your
new operating system if it cannot run any user processes. In order to take
a binary executable and get it running inside a process, you need to load
it from the executable into its address space. However this is not as simple
as just copying the file into some fixed location. There are a variety of
executable image formats. Because we do not want you to dwell on the loader
too much, we have chosen the simple a.out format as the executable format
of choice for the project three kernel. Your task will be to write a loader
for a.out, and use it to load your executables.
Getting the Executable's Bytes
Because we do not have a file system as of yet (you will be working on that
in your next project), you will be loading your executables from large arrays
compiled into the kernel. We are providing to you a utility, exec2obj, which
takes as an argument an a.out executable and creates a .c file. The .c file
simply contains a char array with the name of the a.out executable, and its
contents are the data of that executable. So, if I had an a.out executable
whose filename was hello, and I executed
exec2obj hello
the result would be a new file in that directory called hello_obj.c. Inside that hello_obj.c file would be a char array called hello, and it would be initialized to the data contained within the hello executable. You can then refer to that array in your loader when you wish to load that particular user application. A table of contents is also created which you can use to locate the array (of which there may be more than one) based on the filename of the executable. The definition for the table of contents is in exec2obj.h.
Eventually, you will be reading your executables from a file system. To facilitate an easy switch from exec2obj to a file system, we would like you to create the following interface to the arrays provided by exec2obj:
int getbytes(char* filename, int offset, int size, char* buf)
The getbytes function should find the file 'filename' in the table of contents in exec2obj.h, seek to the specified offset within the array of bytes for that file, and copy size bytes to buf. The function should return the number of bytes succesfully copied (this could be cut short if you run into the end of the array). If the offset is to the end of the file, getbytes should return zero (regardless of size). If offset goes past the end of the file, -1 should be returned.
Your loader should make calls to your getbytes function in order to acquire the actual bytes of the executable.
Details on the a.out Format
For information on the a.out executable format, refer to the FreeBSD a.out
man page, available at:
http://www.freebsd.org/cgi/man.cgi
There are some macros that you will find useful in dealing with the a.out format. They are:
#define N_MAGIC(exec) ((exec).a_info & 0xffff)
#define N_BADMAG(x) (N_MAGIC(x) != NMAGIC)
#define N_TXTOFF(x) (sizeof( exec_t ))
#define N_DATOFF(x) (N_TXTOFF + (x)->a_text)
In addition, we will only be using the NMAGIC variant of a.out. You will need this then as well:
#define NMAGIC 0410 /* 0x108 */
The rest of the information should be available in the man page on a.out.
Linking User Processes
Your user processes require some special care in linking, since they are going
to be run in a very strange environment. In particular, you will want to
compile them with these flags:
Kernel System Calls
The heart of your kernel will be the system call interface. User
processes need to be able to fork, exec, getpid, etc. You used system
calls in project 2 to implement your thread library. Now, in this
project, you will implement those same system calls, as well as many
others.
To receive the system call you will need to install a handler for the int instruction you have been using. You should remember how to do this from project one, if not, consult your project one handout or solution. In short you will need to add an entry into the interrupt descriptor table (IDT).
To ensure that we can test for robustness easily, we are requiring you
to use int 0x40 as your system call software interrupt. Please do not use
a different software interrupt. We also require that you store your
system call number in %edi, and a single argument to your system call
in %esi. For system calls that take more than one parameter, an argument
structure is defined below. The system call should, in this case, take a
pointer to this structure in user space.
System Call Interface
You need to provide the following system calls. Of course, all of
these system calls will use an int instruction. As you no doubt
discovered in project two it is inconvenient for the user to have to
write wrappers around the int instruction. Please write user-level
wrappers around these functions that take care of collecting arguments
into a structure if necessary, and calling the int instruction. This
will make it easier for you to test these system calls as well.
Since this system call takes more than one parameter, you should have it take a pointer to a structure with this format:
struct {
char* execname;
char** argvec;
}
On success, there is no return from this call, since the calling process is terminated. On failure, this call returns an ERROR. When a process exits or is aborted, if it has children, they should continue to run normally, but they will no longer have a parent. When the orphans exit, at a later time, you need not save or report their exit status since there is no longer anybody to care.
Note that character processing on the backspace character should happen before the string is copied into the buffer and returned to the calling process. The readline system call returns the size of the line (not including the NULL terminating character) as its return value.
You will need your keyboard interrupt handler from project one to drive this system call. Since it takes more than one argument, you should have readline take a pointer to a structure formatted like this:
struct {
int size;
char* buf;
}
You will need to use your console driver from project one to print characters to the screen. The same scrolling behavior that was present in project one should also be present in project three. You need not maintain a clock, however.
Also keep in mind that the output of two concurrent print calls should not be intermixed. Since it takes more than one argument, you should have readline take a pointer to a structure formatted like this:
struct {
int size;
char* buf;
}
struct {
int row;
int col;
}
You should implement the raw system call (the kernel side of minclone) and then reuse your minclone wrapper from project two.
oskit_random_t* mr_unpredictable;
oskit_random_create(&mr_unpredictable);
int rand_num;
oskit_random_random(mr_unpredictable,&rand_num);
"I'm getting the OSKit default trap handler telling me I have a general protection fault. What's wrong?"
or
"I installed my illegal instruction handler and now it's telling me I've executed an illegal instruction. What's wrong?"
The problem with both of these messages is that they do not show that you have made any effort to debug the problem yourself. We put a lot of effort into giving you powerful debugging tools. Please use them. Questions like these give the appearance that you would rather place your debugging burden on a TA then solve the problem yourself. Debugging is actually one of the best ways to learn operating system concepts.
Having said that, if you have spent reasonable amount of time trying to solve the problem and are not making any progress, do not hesitate to ask a question. But when you do, provide evidence that you have spent some time trying to solve the problem yourself.
To help you, we have established a recommended plan of attack. Hopefully this will give you an idea of how to start. This also establishes the criteria used for the checkpoints.