Carnegie Mellon
Computer Science Department

15-410 Project 4


Overview

This semester, Project 4 will explore the relationship between virtual memory and file systems. You will implement a stripped-down version of the mmap() system call and a user-space library providing standard file-system I/O calls.

Project 4 is due Wednesday, December 5th, at 23:59. When planning your work, keep in mind that the book report and the final homework assignment will be due on the last day of classes.

Note that P4 grades will probably not be returned before the final exam. Conversely, the exam will not test you on P4 material as such, both for that reason and because not all groups will work on P4.

Memory-mapped files

To begin with, we will ask you to implement memory-mapped files via a single system call.
  • int mmap(char *pathname, void *base)

    Causes the RAM disk file named by pathname to appear in the invoking task's address space starting at the address specified by base, which must be page-aligned. If the call is successful, the return value is the number of bytes contained in the file, and memory reads from the appropriate range of pages will return "file" data from the RAM disk.

    The system call returns an error code less than zero if there is no file by that name, if the base address is invalid, if the kernel is running low on some critical resource necessary for the call to succeed, if it is not possible to contiguously map the entire file contents into the task's address space starting at the base address, etc. The mmap() specification does not include specific values for the various error conditions, which may vary from one kernel implementation to another (see "Restrictions" below).

    The task may use the remove_pages() system call to remove the mmap()'d pages from its address space.
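
The alignment and sizing rules above can be illustrated with a small sketch. It assumes 4 KB pages (typical for 32-bit x86; the real constant comes from the kernel headers), and the names DEMO_PAGE_SIZE, demo_is_page_aligned(), and demo_pages_for() are hypothetical helpers, not part of the Pebbles API.

```c
#include <stdint.h>

/* Assumption: 4 KB pages; the real value comes from the kernel headers. */
#define DEMO_PAGE_SIZE 4096u

/* Nonzero if base sits on a page boundary, as mmap() requires of its base argument. */
static int demo_is_page_aligned(const void *base)
{
    return ((uintptr_t)base % DEMO_PAGE_SIZE) == 0;
}

/* Number of pages needed to hold file_bytes bytes, rounding up to a whole page. */
static unsigned int demo_pages_for(unsigned int file_bytes)
{
    return file_bytes / DEMO_PAGE_SIZE + (file_bytes % DEMO_PAGE_SIZE != 0);
}
```

Since mmap() returns the file length in bytes rather than a page count, a caller that later wants to hand the region to remove_pages(), or to choose a non-overlapping base for the next mapping, may find this kind of rounding useful.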

File-I/O library

We will also ask you to implement the traditional Unix I/O primitives open(), read(), lseek(), and close(). However, in a break with the Unix tradition, these will not be system calls. Instead, you will write a user-level library, libfs.a, which will provide essentially the same effect.

  • int initfs(void)

    Initializes the libfs.a library routines. Programs using this library are expected to #include <fs.h> and call initfs() before calling other functions provided by the library.

  • int open(char *pathname)

    Opens the file named by pathname. If the file exists, sets the file position to the beginning of the file and returns a file descriptor number that can be used for future operations on this open file. If the file does not exist or for some other reason cannot be opened, open() returns an integer error code less than zero.

    The file descriptor returned should be a small integer that uniquely identifies the open file for as long as the file remains open. A single pathname can be opened multiple times simultaneously, and each separate open() should return a separate file descriptor.

  • int close(int fd)

    Closes the open file specified by the file descriptor number fd. If fd is not the descriptor number of a file currently open, close() returns an integer error code less than zero; otherwise, it returns zero.

  • int read(int fd, void *buf, int size)

    Reads data from the current file position of an open file. The argument fd specifies the file descriptor number of the open file to be read. buf specifies the address of the buffer into which to deposit the data from the file, and size is the number of bytes of data to be read. read() returns the number of bytes read, with 0 indicating end-of-file. The number of bytes read will be the minimum of the number of bytes requested and the number of bytes remaining in the file between the current file position and the end-of-file. Any error should cause read() to return an integer error code less than zero.

  • int seek(int fd, int offset, int whence)

    Changes the current file position of the open file specified by fd. The argument offset specifies a byte offset in the file relative to the starting point indicated by whence. If whence is SEEK_START, the resulting position is offset; if SEEK_HERE, the resulting position is the current position plus offset; if SEEK_END, the resulting position is the end of file plus offset (which will most usefully be zero or negative).

    seek() returns the resulting file position or an integer error code less than zero.

  • int dup(int fd)

    Returns a new file descriptor based on the same file-open state as fd. That is, the two descriptors always share the same file position in the same file. dup() should assign the new descriptor the lowest-numbered file descriptor slot not in use by the current task. In other words,
    close(0);
    x = dup(7);

    should result in x having the value 0, but now both 0 and 7 should refer to the same file-open state.
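
The read(), seek(), and dup() rules above can be modeled with a descriptor table whose slots point at shared file-open state. The following is a minimal sketch under stated assumptions, not a reference implementation: every demo_ name is hypothetical, the DEMO_SEEK_* constants stand in for SEEK_START, SEEK_HERE, and SEEK_END (whose real values come from fs.h), the table size is arbitrary, and rejecting positions outside the file is a choice the specification leaves open.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FDS 8   /* assumption: arbitrary table size */
enum { DEMO_SEEK_START, DEMO_SEEK_HERE, DEMO_SEEK_END };

/* File-open state: one per open(), shared by every descriptor dup()'d from it. */
struct demo_open_file {
    const char *data;  /* file contents, e.g. the base of an mmap()'d region */
    int size;          /* file length in bytes */
    int pos;           /* current file position */
    int refs;          /* number of descriptors referring to this state */
};

static struct demo_open_file *demo_fds[DEMO_MAX_FDS];

static struct demo_open_file *demo_lookup(int fd)
{
    if (fd < 0 || fd >= DEMO_MAX_FDS) return NULL;
    return demo_fds[fd];
}

/* Places of in the lowest-numbered free slot; returns the descriptor or -1. */
static int demo_install(struct demo_open_file *of)
{
    for (int fd = 0; fd < DEMO_MAX_FDS; fd++) {
        if (demo_fds[fd] == NULL) {
            demo_fds[fd] = of;
            of->refs++;
            return fd;
        }
    }
    return -1;
}

/* The new descriptor points at the same state, so the position is shared. */
static int demo_dup(int fd)
{
    struct demo_open_file *of = demo_lookup(fd);
    return of ? demo_install(of) : -1;
}

static int demo_close(int fd)
{
    struct demo_open_file *of = demo_lookup(fd);
    if (of == NULL) return -1;
    demo_fds[fd] = NULL;
    of->refs--;  /* a real library would release the state once refs reaches 0 */
    return 0;
}

/* Copies min(size, bytes remaining) bytes and advances the position. */
static int demo_read(int fd, void *buf, int size)
{
    struct demo_open_file *of = demo_lookup(fd);
    if (of == NULL || buf == NULL || size < 0) return -1;
    int remaining = of->size - of->pos;
    int n = size < remaining ? size : remaining;
    memcpy(buf, of->data + of->pos, (size_t)n);
    of->pos += n;
    return n;  /* 0 means end-of-file */
}

static int demo_seek(int fd, int offset, int whence)
{
    struct demo_open_file *of = demo_lookup(fd);
    if (of == NULL) return -1;
    int target;
    switch (whence) {
    case DEMO_SEEK_START: target = offset; break;
    case DEMO_SEEK_HERE:  target = of->pos + offset; break;
    case DEMO_SEEK_END:   target = of->size + offset; break;
    default: return -1;
    }
    /* Assumption: positions outside [0, size] are rejected. */
    if (target < 0 || target > of->size) return -1;
    of->pos = target;
    return target;
}
```

Keeping the position in the shared structure rather than in the descriptor slot is what makes a read through one dup()'d descriptor visible through the other. Two separate open() calls on the same pathname would each get their own structure and therefore independent positions.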

Restrictions

  1. It is acceptable for your file-system library to work only for single-threaded tasks.
  2. In the Unix model, file descriptor state is inherited across exec(). You are not required to support this.
  3. A correct libfs.a should run on any Pebbles kernel which has been extended by the addition of the mmap() system call.
  4. It is entirely reasonable for libfs.a to assume that the read-only RAM disk file system is in fact read-only (no files will be added, deleted, or modified).

Getting Started

  1. Begin with a copy of your p3 directory tree. On top of it, extract the contents of the P4 tar file (the result should be that you gain a README and that update.sh is replaced).
  2. Do an update and marvel at the new files which arrive.
  3. Edit your config.mk to specify:
    • FS_OBJS
    • STUDENTFILES
    • 410FILES

Design considerations

Here are some issues you may wish to consider during your design process. How you address these issues will probably affect your project grade, but not by more than 10%.

  1. What happens if many tasks open the same file? Is this sufficiently likely that it deserves to be addressed?
  2. What happens if a task opens the same file multiple times? Is this sufficiently likely that it deserves to be addressed?
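
As one possible way to think about the second question, a library could keep a small per-task table so that repeated opens of the same pathname reuse a single mapping. This is only a sketch of that idea, not a required or recommended design: the demo_ names and the table size are hypothetical, and the code assumes the pathname string outlives the table entry.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_MAPPINGS 4   /* assumption: arbitrary table size */

struct demo_mapping {
    const char *pathname;  /* NULL when the slot is free */
    int opens;             /* how many open()s currently share this mapping */
};

static struct demo_mapping demo_cache[DEMO_MAX_MAPPINGS];

/* Returns the slot for pathname, reusing an existing entry if one is present;
 * returns -1 if the table is full. */
static int demo_acquire(const char *pathname)
{
    int free_slot = -1;
    for (int i = 0; i < DEMO_MAX_MAPPINGS; i++) {
        if (demo_cache[i].pathname == NULL) {
            if (free_slot < 0) free_slot = i;
        } else if (strcmp(demo_cache[i].pathname, pathname) == 0) {
            demo_cache[i].opens++;
            return i;
        }
    }
    if (free_slot < 0) return -1;
    /* A real library would call mmap() here, once per distinct file. */
    demo_cache[free_slot].pathname = pathname;
    demo_cache[free_slot].opens = 1;
    return free_slot;
}

/* Drops one reference; the slot becomes reusable when the last open goes away. */
static void demo_release(int slot)
{
    if (--demo_cache[slot].opens == 0) {
        /* A real library might remove_pages() the mapping here. */
        demo_cache[slot].pathname = NULL;
    }
}
```

Whether the extra bookkeeping is worthwhile depends on how likely repeated opens are, which is the judgment the question above asks you to make.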

Deliverables

  1. Your modified source tree, in p4, including your modified kernel, the complete source code to your libfs.a, and any test code you wrote.
  2. A config.mk which has been modified to include FS_OBJS so that your fs library builds correctly. You may include text files in your RAM disk by filling in 410FILES, which will include files from 410user/files, and STUDENTFILES, which will include files from user/files.
  3. Include a discussion of your design and implementation in README.dox. Be sure to discuss the key design decisions you made.

[Last modified Monday November 26, 2007]