Carnegie Mellon
Computer Science Department

15-410 HW2 solutions (Spring, 2015)


Question 1 - Public Key Practicum

If you did this problem correctly, your hw2 directory should contain a $USER.message.decrypted file. If it does not, check whether there is a $USER.ERROR file. If there is neither, a common cause is that you used file names other than the ones specified in the assignment, the grading script wasn't able to guess what you meant, and the grader hasn't yet had time to intervene manually.

By the way, your encrypted messages to us were read and, generally, appreciated.

The "most surprising encapsulation" award goes to Cary Yang. The best retro political joke was provided by Jack Sorrell. Of course we're not going to reveal the messages--they're secret!


Question 2 - "Logging Mania"

Part A:

Do you think system administrators should disable journaling for file systems that are stored on SSDs? Explain your answer.

It depends on what you're trying to accomplish.

If you have a journaling file system, booting after a crash is fast (and basically constant-time), which is nice: before using the file system you need to replay the log across it, which usually takes well under a second, compared to minutes or hours for running fsck on a large device. After the log has been replayed, the file system has two properties. First, it is safe to use, e.g., you don't have a block marked as free that is also referenced by an inode (ouch!). Second, it reflects the outcome of some series of system calls that user code actually executed. Running fsck provides the first property but not the second (as discussed in lecture). The downside of having a journal is, as pointed out by the homework question, needing to store each metadata change twice.
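
To make the replay step concrete, here is a minimal sketch of a toy metadata journal. It is an illustration only: the record layout, the `replay` function, and the key names are hypothetical and do not correspond to any real file system's on-disk format.

```python
# Toy model: on-disk metadata is a dict and the journal is a list of records.
# Every name here is hypothetical; real journals use binary on-disk formats.

def replay(metadata, journal):
    """Return a copy of metadata with every committed transaction applied.

    Updates belonging to a transaction that never committed are dropped,
    so related changes (bitmap and inode) land together or not at all.
    """
    result = dict(metadata)
    pending = {}  # transaction id -> updates logged but not yet committed
    for record in journal:
        kind, txn = record[0], record[1]
        if kind == "begin":
            pending[txn] = []
        elif kind == "update":
            pending.setdefault(txn, []).append((record[2], record[3]))
        elif kind == "commit":
            for key, value in pending.pop(txn, []):
                result[key] = value
    return result


# State on disk at crash time: neither change had reached its home location.
metadata = {"block7": "free", "inode3": []}

journal = [
    ("begin", 1),
    ("update", 1, "block7", "used"),   # mark block 7 allocated...
    ("update", 1, "inode3", [7]),      # ...and point inode 3 at it
    ("commit", 1),
    ("begin", 2),
    ("update", 2, "block8", "used"),   # crash happened before commit 2
]

recovered = replay(metadata, journal)
print(recovered)
```

Transaction 1 is applied as a unit, so block 7 is never both free and referenced by inode 3. Transaction 2 never committed, so it is ignored. The sketch also shows the cost mentioned above: each committed update is stored once in the journal and once in its home location.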

If you have a fairly small file system, so that running fsck after a crash will be fairly fast, and you don't mind mild semantic inconsistencies, and you typically make a lot of metadata changes as opposed to data changes, it might make sense to disable journaling. Otherwise, probably not.

Part B:

What would you suggest: leaving things as they are because it's the right thing to do even if it sounds a little silly, configuring MySQL so that it doesn't log (e.g., switch from InnoDB to MyISAM), configuring the file system so it doesn't journal, or something else?

The short answer is that the database's logging provides highly useful high-level semantic consistency guarantees (every cell has a defensible answer), so you definitely want it even though it's expensive. Meanwhile, the file system's journaling consumes the same resources (write throughput and SSD lifetime) to provide weaker consistency (it protects the file system's data structures but not the bytes stored in the files). While it might seem tempting to turn off file-system journaling, that would force fsck to be run after a crash, which probably imposes an undesirable amount of downtime for a database. In a sense, this means that you are forced to pay the overhead of the file-system journaling even though it doesn't protect your data well enough to be worth paying for.
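
The trade-off can be put in rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: it assumes that a database log entry is the same size as the data it describes and that a journaled metadata change is written exactly twice. The function name and the byte counts are hypothetical.

```python
def bytes_written(data_bytes, metadata_bytes, db_log, fs_journal):
    """Rough count of bytes reaching the SSD for one batch of updates.

    Assumes (hypothetically) that each logged item is written once to the
    log or journal and once to its home location, with no other overhead.
    """
    data_copies = 2 if db_log else 1      # database log, then table data
    meta_copies = 2 if fs_journal else 1  # file-system journal, then in place
    return data_bytes * data_copies + metadata_bytes * meta_copies


# Hypothetical batch: 1000 bytes of row data, 100 bytes of metadata changes.
both = bytes_written(1000, 100, db_log=True, fs_journal=True)
db_only = bytes_written(1000, 100, db_log=True, fs_journal=False)
neither = bytes_written(1000, 100, db_log=False, fs_journal=False)
print(both, db_only, neither)
```

Under these assumptions, turning off the file-system journal saves only the second copy of the metadata, while the price of turning it off is running fsck after every crash.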

Speaking abstractly, it would be better if you could get rid of the file system entirely and let the database manage the disk itself (i.e., divide the disk space up; use some of it to store the main database and some of it to store a write-ahead log). Real databases can be configured to do exactly that: they can be set up to store information in disk partitions instead of files inside a file system, e.g., data can be stored in /dev/data and the log can be stored in /dev/log, with the database manually defining and managing all of the data structures stored inside those "block device" partitions. Logging/journaling is generally a good idea, but once you have enough of it, adding more tends to bring extra costs without extra benefits.
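
As a minimal sketch of that arrangement, the toy store below owns two raw "devices" and does its own write-ahead logging. In-memory `io.BytesIO` objects stand in for partitions such as /dev/log and /dev/data. The `TinyDB` class, the 16-byte page size, and the record layout are hypothetical, and a real database would also need checksums, commit records, log truncation, and explicit flushes to stable storage.

```python
import io
import struct

PAGE = 16                      # hypothetical page size, tiny for illustration
REC = struct.Struct("<I16s")   # log record: page number, new page contents


class TinyDB:
    """Toy store that manages a log device and a data device directly."""

    def __init__(self, log_dev, data_dev):
        self.log = log_dev
        self.data = data_dev

    def write_page(self, page_no, payload):
        payload = payload.ljust(PAGE, b"\0")[:PAGE]
        # 1. Append the intended change to the log first (write-ahead).
        self.log.seek(0, io.SEEK_END)
        self.log.write(REC.pack(page_no, payload))
        # 2. Only then update the page in its home location.
        self._store(page_no, payload)

    def _store(self, page_no, payload):
        self.data.seek(page_no * PAGE)
        self.data.write(payload)

    def read_page(self, page_no):
        self.data.seek(page_no * PAGE)
        return self.data.read(PAGE)

    def recover(self):
        """Redo every logged write, in order; repeating a write is harmless."""
        self.log.seek(0)
        while True:
            raw = self.log.read(REC.size)
            if len(raw) < REC.size:
                break
            page_no, payload = REC.unpack(raw)
            self._store(page_no, payload)


log_dev, data_dev = io.BytesIO(), io.BytesIO()
db = TinyDB(log_dev, data_dev)
db.write_page(0, b"alice")

# Simulate a crash between steps 1 and 2: the record is logged, the page is not.
log_dev.seek(0, io.SEEK_END)
log_dev.write(REC.pack(1, b"bob"))

db.recover()
page0 = db.read_page(0).rstrip(b"\0")
page1 = db.read_page(1).rstrip(b"\0")
print(page0, page1)
```

No file system is involved here, so nothing is journaled a second time: the database's own log is the only extra copy of each change.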

Discussion

This isn't a terrible example of a "pure design" question: if you understand the principles firmly, you should be able to reason about the costs and benefits of disabling file-system journaling, even if you haven't read enough about databases to know that it's possible to just skip the file-system level entirely. This question is probably a little more abstract than an actual design-focused exam question would be, but different people evaluate the abstractness/concreteness of a question differently, so it's hard to say for sure.

For further reading...

Here is last semester's homework page.


[Last modified Saturday May 02, 2015]