OpenAFS Adaptation to Linux 2.6 Kernel
For this project, I worked on incompatibilities between OpenAFS and the Linux kernel version 2.6. I began with only a vague description of the problems and a minimal understanding of AFS, and was subsequently heavily aided by expert descriptions and guidance. In the end, many problems were uncovered and addressed, with varying levels of involvement on my part.
System Call Table and PAGs
Linux kernel version 2.6 stopped exporting the system call table because of security concerns. Apparently hijacking system calls is a popular technique with rootkit builders; Linus justified the change technically by claiming that a loadable module shouldn't be able to modify such a core kernel structure. This, however, is unfortunate for OpenAFS, which uses the system call table both to define its own system call and to hijack the setgroups and getgroups calls.
OpenAFS requires its own system call so that user programs can talk to the kernel module. It is used both for path ioctl ("pioctl") and for a set of configuration commands sent by the afsd setup program. Path ioctl implements AFS operations on filesystem paths, such as ACL changes and ticket management.
The group identifier modifications are for the implementation of process authentication groups (PAGs). AFS needs an identifier that propagates down the parent/child relations of processes. This is implemented using an ugly hack that utilizes the upper 16 bits of some of the 32 GIDs a process can have. In order to filter out and maintain the encodings, setgroups and getgroups must be hijacked.
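To make the idea concrete, here is a minimal sketch of how an identifier could be tucked into the upper 16 bits of group slots and hidden again when groups are read back. It is only an illustration: the slot count, the marker value, and every function name are my own assumptions, and this is not OpenAFS's actual encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define PAG_SLOTS 2      /* hypothetical: two GID slots carry the PAG */
#define PAG_MARK  0xAF5u /* hypothetical marker kept in the low 16 bits */

/* Store a 32-bit PAG in the upper 16 bits of two GID slots. */
static void pag_encode(uint32_t pag, uint32_t gids[PAG_SLOTS])
{
    gids[0] = (pag & 0xFFFF0000u) | PAG_MARK;
    gids[1] = ((pag & 0xFFFFu) << 16) | PAG_MARK;
}

/* Recover the PAG; returns 0 on success, -1 if the slots are ordinary GIDs. */
static int pag_decode(const uint32_t gids[PAG_SLOTS], uint32_t *pag)
{
    if ((gids[0] & 0xFFFFu) != PAG_MARK || (gids[1] & 0xFFFFu) != PAG_MARK)
        return -1;
    *pag = (gids[0] & 0xFFFF0000u) | (gids[1] >> 16);
    return 0;
}

/* What a wrapped getgroups might do: copy out only the ordinary groups,
 * skipping the leading slots when they carry an encoded PAG. */
static size_t filter_groups(const uint32_t *in, size_t n, uint32_t *out)
{
    uint32_t pag;
    size_t skip = (n >= PAG_SLOTS && pag_decode(in, &pag) == 0) ? PAG_SLOTS : 0;
    size_t i, k = 0;

    for (i = skip; i < n; i++)
        out[k++] = in[i];
    return k;
}
```

A wrapped setgroups would do the reverse: keep the encoded slots in place and append whatever groups the caller supplied, so that the identifier survives the call and is inherited by child processes.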
Version 2.6 also introduces kernel-level preemption for the first time. This will most likely require some locking changes in OpenAFS.
Make system, timekeeping and inode changes
Two architecture parameter files had to be created for i386_linux26 in the build system, and a few entries for i386_linux26 had to be added to configuration files scattered throughout the source tree. These configuration files were located in src/config, and I was able to follow the style implied in them to infer how to add i386_linux26.
The next thing that went wrong was a change in the definition of yylineno. The reference was in a printf statement in what looked to be a debug function. It may be specific to my particular yacc version but it was easily patched.
Then the timekeeping code in the RPC callbacks failed. AFS can provide RPC latency statistics, and the code that implements this feature uses macros that expand into about 20 lines of text. These macros reference the internal kernel timepiece xtime. 2.6 switched this variable from microseconds to nanoseconds, which caused a type mismatch that was easily fixed with a conversion.
Finally, there were inode changes in 2.6 that required simple initialization changes: the code that initialized fields no longer present in the inode had to be removed.
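The conversion involved looks roughly like the following. This is a user-space sketch under the assumption that the change is the usual one from a struct timeval (seconds and microseconds) to a struct timespec (seconds and nanoseconds); the function name is made up for illustration and is not the name used in the OpenAFS source.

```c
#include <sys/time.h>
#include <time.h>

/* Convert a nanosecond-resolution timestamp to the microsecond-resolution
 * form that older code expects. Sub-microsecond precision is truncated. */
static struct timeval timespec_to_timeval_sketch(struct timespec ts)
{
    struct timeval tv;

    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = ts.tv_nsec / 1000; /* nanoseconds -> microseconds */
    return tv;
}
```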
The patch that I specifically contributed addressed the system call table problem. It did so by creating a file in the /proc file system, using functions from the proc_fs.h header in the kernel tree. The ioctl method was implemented by specifying it in a 'file_operations' structure and then overwriting the proc_fops member of the 'proc_dir_entry' structure whose pointer is returned by the creation routine. Once this was in place, a pointer was set up to be passed across the interface. The user code sets this pointer to point to the five arguments of the original system call. On the kernel side, copy_from_user retrieves the arguments and then the appropriate function is called.
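The argument-passing half of this can be sketched in user space as follows. The structure layout, the stub functions, and all names are hypothetical stand-ins of mine (in particular copy_from_user_stub only imitates the return convention of the kernel's copy_from_user), so this shows the shape of the handler rather than the code in the patch.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical container for the five arguments of the original system call. */
struct afs_call_args {
    long syscall_no;
    long param1, param2, param3, param4;
};

/* Stand-in for the kernel's copy_from_user(): returns the number of bytes
 * that could NOT be copied, so 0 means success. */
static unsigned long copy_from_user_stub(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* Stand-in for the original system call entry point. */
static long afs_dispatch_stub(long no, long a, long b, long c, long d)
{
    return no + a + b + c + d;
}

/* Shape of the ioctl handler: 'arg' is the user-space pointer to the
 * argument block; copy it in, then call the original entry point. */
static long afs_proc_ioctl_sketch(unsigned int cmd, unsigned long arg)
{
    struct afs_call_args args;

    (void)cmd; /* a single command is assumed in this sketch */
    if (copy_from_user_stub(&args, (const void *)arg, sizeof(args)) != 0)
        return -EFAULT;
    return afs_dispatch_stub(args.syscall_no, args.param1, args.param2,
                             args.param3, args.param4);
}
```

On the user side, the caller fills in a struct afs_call_args and passes its address as the ioctl argument. Bundling the arguments this way is needed because an ioctl carries only one pointer-sized value, while the original call took five.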
Other patches submitted
A regular OpenAFS contributor, Chas Williams, appears to have submitted patches that address most of the compile issues and some basic preemption problems (it looks like he's using 'the giant kernel lock' that I read about in articles).
The Andrew Benchmark was created in 1985 by Mahadev Satyanarayanan as a distributed filesystem benchmark. It consists of five stages that test creating subdirectories, copying files, listing many directories, and scanning through many files using grep. The final stage compiles a simple window manager. Because of its age, the complete benchmark now runs in under half a minute on current hardware.
Andrew Benchmark Results
I ran the Andrew Benchmark on a PII 450 MHz Dell Optiplex with 128 MB of RAM against the andrew.cmu.edu cell. As measured by consecutive date commands, the entire benchmark took 11-12 seconds in Linux 2.4 and 12-14 seconds in Linux 2.6. Given the number of confounding variables in the experiment, a difference of a second or two is probably within the margin of error. Based on these results, I conclude there is no measurable performance difference between 2.4 and 2.6.
Linux 2.6 Links
[Last modified Friday March 23, 2007]