Carnegie Mellon
SCS logo
Computer Science Department

15-410 HW2 solutions (Spring, 2006)

Question 1 - Public Key Practicum

If you did this problem correctly, in a couple of hours your hw2 directory should contain a $USER.message.decrypted file. If not, check to see if there is a $USER.ERROR file. If not, a popular problem is that you used file names other than the ones specified in the assignment, the grading script wasn't able to guess what you meant, and the grader hasn't yet had the time to manually intervene.

By the way, your encrypted messages to us were read and, generally, appreciated.

The best message was Mark Hairgrove's; the second best was Mike Kasick's. The wisest was probably Zoe Ouyang's. No, of course we're not going to reveal the messages--they're secret!

Question 2 - Nuts & Bolts

The ELF file format specifies addresses for the text, data, rodata, and bss regions, but not the stack region.

Part A

Briefly explain why it is ok for this information to be missing from the executable file.

This is acceptable because the compiled instructions forming the program text don't depend on where in the address space the stack is located. Accesses to the stack are relative to the stack pointer (or the base pointer, in stack disciplines which have one)--this includes pushes, pops, fetches, and stores.

Since the program doesn't care where the stack is located as long as it "works" (implements these relative operations), there is no reason for the executable file to (over)specify an address. This means that the kernel has substantial freedom about where the stack is placed.

Part B

The fact that the executable file doesn't specify the address range used by the stack allows interested kernels the freedom to deploy a mild optimization for some programs. Briefly explain, referring to the Pebbles environment if you wish to use a specific example.

Consider slide 45 of the first VM lecture. Observe that the upper page table is required only because the stack is at the far end of the address space from everything else. This is often done so the stack and heap have "enough" room to grow toward each other without colliding, but how much "enough" do we really need? If a kernel decided to fit a whole program (text, rodata, data, bss, heap, and stack) into four megabytes, then mapping all of user space could be done with one page table.

How practical is that? Consider these section sizes for the Pebbles shell (excerpted from the output of size --format=SysV shell):

section bytes
.text 8864
.rodata 996
.data 176
.bss 32
Total 10068

So an important program like the shell, if allowed a megabyte of heap and a megabyte of stack, could easily be mapped by a single page table. In fact, the Minix OS uses a similar approach.

Question 3 - VM

On the IA32 platform, we have two ways to remove entries from the TLB, namely setting %cr3 and INVLPG.

Part A

Explain how to implement INVLPG in terms of setting %cr3.

The job of INVLPG is to remove a particular translation from the TLB. Setting %cr3 will remove all translations, so it will certainly accomplish the job of removing any particular one. In addition to removing the (presumably invalid) translation we no longer wish, it will remove many other translations we (presumably) still believe in. But those will reappear over time--performance will be reduced, but all memory references will have the correct outcome.

Part B

Explain how to implement flushing the whole TLB in terms of INVLPG.

This problem, as stated, is significantly more painful. Without a way to inspect the contents of the TLB, it will be necessary to perform 4,194,304 INVLPG operations to invalidate the entire address space--even if the TLB has only 64 or 128 entries.

However, some processors have special instructions which allow the kernel to read (or even update) particular slots in the TLB, or to request the index of the TLB slot mapping a particular page. On such processors (including certain flavors of PowerPC) a mass-invalidate instruction may not be necessary.

Part C

Now imagine you can purchase a version of the IA32 processor family which implements one TLB-removal method or the other, but not both. Which would you choose, and why?

Strictly speaking, this depends on how often the kernel must invalidate one page versus invalidate an entire address space, and how large address spaces typically are (if you know that the VM range from 1 gigabyte to 3 gigabytes is not used by any process in the system, then there is no need to INVLPG those addresses).

Practically speaking, I believe most people would regard mass-invalidate as a necessity and invalidate-one as a performance enhancement.

For further reading...

Here is last semester's homework page.

[Last modified Saturday May 06, 2006]