Fall 2006 15-712 Second Midterm Exam Closed book. 80 minutes. Answers should be written in separate blue books. Read the questions carefully and ANSWER ALL PARTS OF EACH QUESTION. Carefully mark your name on each blue book and number the blue books if you use more than one. I will grade the first 8 answers I find in the blue books submitted. All questions have the same weight. ************* Answer 8 questions ************ 1. Optimistic methods for concurrency control (Kung81) employs much less locking than more popular concurrency control methods (such as two phase locking). Yet both achieve serializability. (a) What is serializability? (b) How does both Kung's optimistic methods and two phase locking achieve serializability? (c) What benefit does Kung81 seek, relative to two phase locking? (d) Under what circumstances are Kung's optimistic methods not a good idea? 2. A simple implementation of Kung81's methods might fork each transaction into its own thread, use the thread ID provided by the underlying OS for Kung81's ordering and implement its validation and writeback phase as an IPC invocation on an atomic commit_and_exit_thread(). (a) Explain TWO weaknesses Kung would see in this approach. Kung would select the serially equivalent ordering much later. (b) How can late selection of ordering hurt long running transactions? In Kung's more aggressive parallel validation schemes he sketches a scheme where he might repeatedly sample the global transaction number in order to test for conflicting read and write sets outside of the critical section. (c) Why is it safe to do these tests outside the critical section? (d) What is left to parallelize after exhausting this resampling of the global transaction number strategy? 3. The simplest B-tree access methods support atomic transformations of the B-tree. Unfortunately in a system with many concurrent transactions, this simplest method has limitations. (a) Give a limitation experienced by read-only database transaction (b) Give a limitation experienced by read-write database transactions. Lehman81 offers more complex B-link-tree access methods to overcome these limitations. (c) How effective are B-link-trees at over coming the limitations of atomic B-tree transformations? (d) What is the key property of the implementation of B-link-trees that enables this improvement? (e) What does Lehman81 sacrifice for this improvement? 4. Consider a user-level transaction processing system in which each ongoing transaction is consuming a significant amount of the transaction processing system's virtual memory. If such a computing system begins to experience unacceptable memory pressure (more than negligible paging of the transaction processing system's virtual memory), it may signal the transaction processing system for assistance. (a, b, c) Give THREE different actions that the transaction processing system could do to (almost) immediately reduce the memory pressure in the machine. For each action, explain how that action frees up memory and how it depends on and interacts with the log. 5. Haskin88 describes a distributed database system with a broader agenda. Specifically, Haskin88 sets out to offer transactions as a simple-to-use, nearly transparent failure recovery service for all processes in a distributed system. (a) Explain how this might benefit non-database applications. But non-database applications differ from database applications, so a variety of specializations are offered. (b, c, d) Describe THREE such specializations and how they are beneficial. 6. Consider a set of computers communicating through a single shared ethernet cable (tapped or daisy chained), with today's fast processors and very smart network NICs. If cryptographic methods are NOT used but the number of faulty nodes is assumed to be at most 20% of the nodes on the shared ethernet, I assert that Lamport82's key result does not apply and consensus may not be possible. (a) Explain one way that my assertion does not contradict the correctness of Lamport82's key result. (b) Describe one reorganization of the communications network to overcome this assertion (it need not be particularly practical). Now return to the original network and suppose each ethernet NIC contains its own private key and a table of IP addresses on the shared ethernet cable and the corresponding public keys, and it encrypts the ethernet payload of each packet using its own private key. (c) Explain how this solves or does not solve the problems I assert defeat Lamport82's key result in this environment? 7. Castro99 provides a practical Byzantine algorithm for achieving (agreement) on replication in an "asynchronous environment like the Internet". Unless Fischer85 is wrong, and it is not, asynchronous must mean something different to Castro, because Fischer85 proves that even one faulty principal makes consensus impossible in an asynchronous environment. (a) What does asynchronous mean in Castro99? (b) Why is this distinction provide a valuable system property? In fact in its "Views" algorithm Castro99 sidesteps Fischer's asynchronous problem in much the same way as Lamport82. (c) Explain how does Castro use Lamport's sidestep and why it ensures liveness? 8. The introduction of processes (yes, a thread, its address space, open files, etc) provided fault isolation benefits to computing systems. Hive (Chapin95) proposes to partition the hardware and software of a multiprocessor into "cells" to provide a similar benefit of fault isolation in multiprocessor servers. (a) Explain the fault containment benefit Hive hopes to achieve with cells. (b) Give an example of hardware support Hive proposes for fault containment and explain how this hardware provides benefit. Unfortunately, Hive cannot always fully achieve this benefit. (c) How are the operating systems' core functions partitioned for fault containment, and what penalty is caused by this partitioning? 9. Anderson94 says "This is a failure of certification. Staff don't have the knowledge. Product certification should be tied to the skill of the user." (a) Explain what problem that concerns him. (b) Which of Saltzer75's design principles does this refer back to? In class, I asserted that AFS permissions had a similar problem. (c) In what way can permissions definitions exhibit this problem? 10. In "The Protection of Information in Computer Systems", Saltzer75, the authors describe the following attack: "...a masquerader could exactly record the enciphered bits in one communication, and then intercept a later communication and play them back verbatim. (This technique is sometimes called spoofing.) Although the spoofer may learn nothing by this technique, he might succeed in thoroughly confusing the user or the computer system. The general countermeasure for spoofing is to include in each enciphered message something that is unique, yet predictable, such as ___________." There are three common implementation techniques for the blank in this quote. (a) Describe TWO such techniques. (b) Explain how each counters "spoofing". (c) Contrast the costs or limitations of each. 11. Consider the following protocol allowing principals A and B who already have private channels open to a key generating server, S, (with keys Kas and Kbs, respectively). Encryption of data d1 and d2 under key Kxy is represented as {d1, d2}_Kxy. 1. A -> B: A, Na 2. B -> S: B, {A, Na, Nb}_Kbs 3. S -> A: {B, Kab, Na, Nb}_Kas, {A, Kab}_Kbs 4. A -> B: {A, Kab}_Kbs, {Nb}_Kab (a) List at least 8 assumptions that a BAN analysis of this protocol would begin with. (b) What would be the goal of such a BAN analysis? There is at least one flaw in this protocol. (c) Describe such a flaw and explain why it is a flaw. 12. The Terra platform for trusted computing in Garfinkel03 describes a procedure called Attestation that it offers for distributed system trust protocols. (a) What is attestation? (b) What properties does it seek to provide to distributed systems such as trusted gaming? (c) Give at least one limitation in the Terra realization of attestation?