Fall 2006 15-712 Midterm Exam Closed book. 80 minutes. Answers should be written in separate blue books. Read the questions carefully and ANSWER ALL PARTS OF EACH QUESTION. Carefully mark your name on each blue book and number the blue books if you use more than one. I will grade the first 8 answers I find in the blue books submitted. All questions have the same weight. ************* Answer 8 questions ************ 1. Contrast Thekkath's RDMA with Birrell's RPC. Give at least two ways they are similar and two ways they are different. 2. Hauser's taxonomy for thread paradigms includes two types employing delayed execution: slack processes and deferring work. Contrast these two paradigms. Explain why Hauser found that slack processes were hard to get to work well. Explain how Hauser's task rejuvenation paradigm can encourage performance problems in deployed code. 3. Lamport uses a space-time physics analogy in his discussion of simultaneity in distributed computing. Why does this analogy apply? How is multi-threading on a single processor equivalent to distributed computing in terms of defining what is and what is not simultaneous? How do logical clocks help a distributed or multi-threaded program developer? 4. Lamport's physical clocks depend on message communication time to bound clock skew. The Network Time Protocol (NTP) uses message round trip time to manage clock skew too. What is the key insight in the basic NTP clock skew detection and correction algorithm? Lamport develops physical clocks in order to compensate for covert channels in his Logical Clocks scheme. Give an example of a covert channel that defeats Lamport logical clocks. Does NTP fix all of the covert channel that Lamport's physical clocks set out to fix and why? 5. Suppose there is a multi-threaded program (or distributed application) that modifies the records of a shared database using an application specific function "application_change()" using the following code in each thread. if (lots_of_threads_in_database()) { for each record in list_of_records { lock( rec_lock[record] ); application_change(record); unlock( rec_lock[record] ); } } else { lock( database ); for each record in list_of_records { application_change(record); } unlock( database ); } What benefit is the code trying to achieve by conditioning on lots_of_threads_in_database? Explain why Eraser's dynamic data race detector would not like this code. How could this be fixed? 6. Data race detection tools which rely on Lamport's happens-before relation have two primary weaknesses when compared to Eraser. First, they require per-thread information about concurrent accesses to each shared memory location. Secondly, their effectiveness is highly dependent on the interleaving of threads by the scheduler. Savage claims that Eraser is not sensitive to differences in thread interleaving and evaluate their claim with a simple test. Explain the intuition behind why the Lockset algorithm is thread interleaving insensitive. 7. Network Appliance's NFS filer server appliances use a "write-anywhere" file system very similar to Rosenblum's LFS layered on top of software-implemented RAID. They have recently implemented a variant of RAID level 6, which is a double failure tolerant RAID code, which in their case is organized like RAID level 4 but with two disks of redundancy codes (check disks). Network Appliance says that by using an LFS-like file filesystem, RAID level 4 is as good as RAID level 5 (and maybe better), and RAID level 6 has no performance penalty relative to RAID level 4 or 5. Explain the general rationale under which these two statements are correct. Give two reasons why this rationale breaks down, and explain the penalty and/or cost that can occur when this rationale breaks down. 8. What kind of workload patterns is the Log-Structured File System well adapted to? What kind of a workload is it's pathological case, when compared to a traditional file system like BSD's Fast File System. A key issue here is when disk fragmentation hurts performance and when it does not. Rosenblum's answer to fragmentation problems has been disputed. What is his answer, why is it disputed and what do other systems do about fragmentation? 9. The NASD interface is most closely related to the inode interface in most modern file systems. Explain this similarity. How does moving the inode interface from a module layer in the file system software (a local procedure call interface) to a wire protocol for storage devices (an RPC interface) provide benefit for both file systems developers and storage device developers? [3 more questions]