15-410 HW2 solutions (Spring, 2006)



Question 1 - Public Key Practicum

If you did this problem correctly, in a couple of hours your hw2 directory should contain a $USER.message.decrypted file. If it doesn't, check whether there is a $USER.ERROR file. If neither file exists, a common problem is that you used file names other than the ones specified in the assignment, so the grading script wasn't able to guess what you meant, and the grader hasn't yet had time to manually intervene.

By the way, your encrypted messages to us were read and, generally, appreciated.


Question 2 - Virtual Memory

Part A

Assuming the kernel has "plenty of memory" (i.e., is not forcing pages out to the swap disk), how many page faults might a single CALL instruction encounter? Explain. You may find it helpful to disassemble some code using simics and/or gdb.

  1. Ignoring exotica such as the "far call", a single CALL instruction can easily be 5 bytes long, so it can cross from one page to another (x86 instructions, unlike RISC instructions, have no alignment requirement). That's two page faults.
  2. Again ignoring "far call", the CALL instruction pushes four bytes of return address on the stack. If the stack pointer isn't word-aligned, those four bytes can cross a page boundary, resulting in two more page faults.
  3. The first instruction of the called function might not be in memory... but it probably does not make sense to "charge" that page fault to the CALL.
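The fault tally above can be checked with a short sketch. This is a minimal illustration, not anything from the grading scripts: the 4096-byte page size is the usual x86 value, and the particular start offsets are chosen only to exhibit the worst case.

```python
PAGE = 4096  # assumed x86 page size in bytes

def pages_touched(start, length):
    """Number of distinct pages covered by `length` bytes starting at address `start`."""
    return (start + length - 1) // PAGE - start // PAGE + 1

# Worst case for one CALL: the 5-byte instruction straddles a page boundary,
# and the 4-byte return-address push lands on an unaligned stack pointer
# that also straddles a boundary.
fetch_faults = pages_touched(PAGE - 1, 5)   # instruction bytes span 2 pages
push_faults  = pages_touched(PAGE - 2, 4)   # return address spans 2 pages
print(fetch_faults + push_faults)           # up to 4 faults chargeable to the CALL
```

With an aligned instruction and an aligned stack, both counts drop to one page each, and in the common case neither page faults at all.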

Part B

How many page faults could a single RET instruction "reasonably" (non-pathologically) encounter?

  1. RET is a 1-byte instruction, so it can't cross a page boundary. But it might (sadly) be on a different page from the other instructions in the function, which would generate one page fault.
  2. Four bytes of return address must be fetched from the stack. As above, this is potentially two more page faults.
  3. The next instruction in the caller could fault... but again we won't charge that to RET.
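The RET tally can be sketched the same way (the 4096-byte page size is an assumption, and the offsets are illustrative):

```python
PAGE = 4096  # assumed x86 page size in bytes

def pages_touched(start, length):
    # distinct pages covered by `length` bytes beginning at byte `start`
    return (start + length - 1) // PAGE - start // PAGE + 1

ret_fetch = pages_touched(123, 1)       # 1-byte RET: always exactly one page
addr_pop  = pages_touched(PAGE - 2, 4)  # worst-case unaligned return address: two pages
print(ret_fetch + addr_pop)             # "reasonable" worst case: 3 faults
```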

If you wish, you may give and explain a number for the pathological case as well as the "reasonable" case.

As was hinted by the text of Part A, it is possible that a page fault could be taken while fetching the RET, then a second fault while fetching the return address, at which point the thread in question could stop running for a while, during which time the instruction could be paged out...

Part C

In general, if you counted the number of page faults encountered by all CALL instructions in one run of a program and compared that to the number of page faults encountered by RET instructions in the same run of the same program, would the CALL number be noticeably greater than, about the same as, or noticeably less than, the RET number? Explain.

  1. Fetching the RET can generate at most one page fault as opposed to two for the CALL. But the chances of a CALL crossing a page boundary are actually slim, something like 1 in 1000, so it's hard to ascribe "noticeably less than" to this effect.
  2. The execution of most functions is much too quick for any pages of the stack to get paged out, let alone the most recently used one (or two). So while there is a sense in which it is "much" more likely for a stack push to generate a fault than it is for a stack pop, it's not all that much. And for the CALL instruction itself to be charged for a fault, it must be the return address (rather than one of the parameters) which crosses the page boundary. Again the likelihood is slim, around 1 in 1000.

So you should expect more page faults on CALL instructions than RET instructions, but actually not many more, something like 0.2%. Most stack-related page faults are probably charged to PUSH instructions, as most functions take more than zero parameters. Most instruction-fetch page faults are probably charged to... the most popular instruction, weighted by instruction size.
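The "about 1 in 1000" figures can be derived directly. The sketch below assumes 4096-byte pages, a 5-byte CALL, and uniformly distributed start addresses; it is a back-of-the-envelope model, not a measurement.

```python
PAGE = 4096  # assumed page size in bytes

def crossing_fraction(n):
    """Fraction of start offsets at which an n-byte access crosses a page boundary."""
    crossing = sum(1 for start in range(PAGE)
                   if (start + n - 1) // PAGE != start // PAGE)
    return crossing / PAGE

print(crossing_fraction(5))  # 5-byte CALL: 4/4096, about 1 in 1000
print(crossing_fraction(4))  # 4-byte return address: 3/4096 (and zero if the
                             # stack pointer is word-aligned, the common case)
```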


Question 3 - File System Performance

Part A

Please explain this seeming paradox. Why is random I/O within this largest-possible file reliably faster than I/O within medium-sized files?

When you ask your kernel to read a given block of a disk partition, the kernel adds the block number you provide to the number of the first block of the partition (found in the partition table, almost certainly in RAM) and asks the disk to fetch the result, costing one seek. However, because each "real file" (meaning a file stored in the file system, as opposed to a device pseudo-file) has its own "address space" mapping file blocks to disk blocks, there must be a mapping function. Because the mapping must persist as long as the file does, the data structure is generally stored on disk, so consulting it sometimes costs one or more disk seeks.
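The contrast can be reduced to a toy seek count. Both functions are hypothetical illustrations of the argument above, not kernel code: the point is only that the raw-partition translation is pure in-RAM arithmetic, while the file's block map itself lives on disk.

```python
def partition_read_seeks():
    # block number + partition start offset (the partition table is in RAM),
    # so the only disk access is the data block itself
    return 1

def file_read_seeks(map_block_cached):
    # consulting the on-disk block map costs a seek unless that piece of the
    # map is cached; then one more seek fetches the data block
    return (0 if map_block_cached else 1) + 1

print(partition_read_seeks())    # raw partition: always 1 seek
print(file_read_seeks(False))    # file, cold map block: 2 seeks
```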

Part B

If you can, explain the odd clustering of seek times within files in /tmp.

  1. Assuming the ext2 file system uses the traditional Unix inode mapping approach (which it does), the first few blocks of a file are mapped by the inode itself. The inode must be fetched when the file is opened, and generally won't leave the RAM cache before randread()'s reading starts. Thus there is no mapping cost for reads hitting the first part of the file, and they will cost only the seek necessary to read the data.

  2. Most files which can't be mapped for free by the inode can be mapped by a single block of block pointers. A read() in this part of the file will cost a seek to read the map block and then a seek to read the data block--sometimes. Much of the time the mapping block will be in the cache. So it is still the case that many random reads will cost only one seek time.

  3. Some randomly chosen blocks will be in a part of the file reached via double indirection, meaning that a block must be read to determine which mapping block must be read to determine which data block must be read. This could cost three seeks, but sometimes will cost only two.

  4. Finally, a 200-gigabyte disk almost certainly doesn't require triple indirection.
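The clustering follows from how much of a file each mapping level covers. Here is a sketch using classic ext2 parameters; the 1 KB block size and 4-byte block pointers are assumed defaults (giving 256 pointers per indirect block), and 12 is the traditional number of direct pointers in the inode.

```python
BLOCK  = 1024          # assumed ext2 block size in bytes
PTR    = 4             # assumed bytes per block pointer
PPB    = BLOCK // PTR  # pointers per indirect block: 256
DIRECT = 12            # direct block pointers in the inode

direct_bytes = DIRECT * BLOCK   # mapped "for free" by the inode
single_bytes = PPB * BLOCK      # one (often cached) map block away
double_bytes = PPB**2 * BLOCK   # up to two extra map reads
triple_bytes = PPB**3 * BLOCK   # a third level, rarely reached

print(direct_bytes)   # 12288 bytes (12 KB)
print(single_bytes)   # 262144 bytes (256 KB)
print(double_bytes)   # 67108864 bytes (64 MB)
print(triple_bytes)   # 17179869184 bytes (16 GB)
```

So reads in the first 12 KB cost one seek, reads in the next 256 KB usually cost one and occasionally two, and reads beyond that can cost up to three, which matches the observed clusters.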

Notes

  1. We're just kidding--there is no Project 5. If you wish there were... you might give serious consideration to 15-412 and/or 15-610.
  2. On a Linux system, you would need superuser privileges to run the timeseek program on /dev/hda7. It is not polite to beg the Wean Hall CCon to tell you the root password for the cluster machines.

Helpful Hint

By the way, if you think you are having AFS permission problems, try running the program located at
% /afs/cs.cmu.edu/academic/class/15410-s07/pub/access_hw2


For further reading...

Here is last semester's homework page.


[Last modified Friday May 04, 2007]