Reliable Software: Testing and Monitoring

Caltech, third term 2009, Monday/Wednesday 2:30-3:55, Jorgensen 287

Alex Groce (agroce@gmail.com)
Klaus Havelund


This is the website for the Testing portion of the class (the first nine lectures). The website for the Monitoring portion is here.

Announcements: Assignment #0 is now due Wednesday, April 15. See the updated tarball! It contains an older version of YAFFS that is cleaner to work with.


YAFFS Website

My YAFFS tarball we'll be using: NOTE THIS IS UPDATED!

yaffsfs_bug1.c
a bug even that tester can find (file replaces yaffsfs.c)

yaffs_guts_bug1.c
and one that it can't (file replaces yaffs_guts.c)

yaffs_guts_bug2.c
and one that hangs the tester (file replaces yaffs_guts.c)

yaffs_guts_bug3.c
another one the dumb tester can find (replaces you-know-what)

yaffs_guts_bug4.c
an easy one the weak tester misses (replaces you-know-what)


Further notes on testing YAFFS:
  1. Due date for project is May 20. Deliverables are:
    1. Test report (document, preferably PDF)
    2. Tester (tarball I can unpack in yaffs2/direct, then do: make clean; make; ./directtest2k)
    3. Two buggy versions of YAFFS (files to replace)
    (see lecture 4 slides for details)
  2. Don't bother with permissions. We're not testing permissions or using the chmod operation. Assume argument 2 of mkdir is always 0, and that file creation always uses mode S_IREAD | S_IWRITE.
  3. UPDATED TIMEOUT VALUE: You can use a 60 second timeout for YAFFS operations. That is, if you wish, you can detect when YAFFS doesn't return from an operation within 60 seconds, and call that a test failure. This isn't required, but you might find it helpful.
  4. You can shrink the size of the /ram2k device by modifying yaffscfg2k.c. Let me know if you do -- and don't call tests failing just because YAFFS runs out of space!
  5. SWIG is a decent way to interface Python and C code.
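
One possible way to apply the optional 60 second timeout from note 3 is sketched below. It assumes each test runs as its own process (for example the ./directtest2k binary) and is driven from Python; the function name run_with_timeout and the "ok"/"fail"/"timeout" labels are hypothetical, not part of YAFFS or of the course tarball. A tester that calls YAFFS in-process would need a different mechanism.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s=60.0):
    """Run one test command and classify the outcome.

    Returns "ok" for exit status 0, "fail" for any other exit status,
    and "timeout" if the command does not finish within timeout_s seconds.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # when the timeout elapses.
        done = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if done.returncode == 0 else "fail"


if __name__ == "__main__":
    # Hypothetical usage: treat a hung tester as a test failure.
    print(run_with_timeout(["./directtest2k"] + sys.argv[1:]))
```

Running the tester in a separate process means a hang inside YAFFS cannot hang the driver itself, at the cost of only detecting hangs per run rather than per operation.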

Lecture 1: Introduction to Testing

Links for further reading:
Yannakakis and Lee, "Principles and Methods of Testing Finite State Machines: A Survey"
Chow, "Testing Software Design Modeled by Finite-State Machines"


Lecture 2: Design for Testability + The Idea of Coverage

Harder, Mellen, and Ernst, "Improving test suites via operational abstraction"
More slides (and a paper) for lecture 2, courtesy Mike Ernst


Lecture 3: Random Testing

Papers for this week:
Hamlet, "Random Testing"
Miller, Fredriksen, and So, "An Empirical Study of the Reliability of UNIX Utilities"
Groce, Holzmann, and Joshi, "Randomized Differential Testing as a Prelude to Formal Verification"
Pacheco, Lahiri, Ernst, and Ball, "Feedback-directed random test generation" (I'll be using slides courtesy of Carlos Pacheco for this bit)
You can download and play with Randoop (random tester for Java unit testing) from this site
Hamlet, "When only random testing will do"


Lecture 4: Random Testing, Continued + YAFFS Project Info + Concolic Testing
(yes, they're short -- we'll spend most of class looking at tester code and then talking about CUTE and using constraint-solvers in testing)

Sen, Marinov, and Agha, "CUTE: A Concolic Unit Testing Engine for C"
I'll be using Koushik's FSE 05 slides (on that page) in class
You can download and play with CUTE here
SPLAT is another constraint-based tester, also available for download. To get it to work, in addition to the instructions given, make sure to go into the directory splat-complete/cil_1_3_6 and do a ./configure; make before anything else.


Lecture 5: Concolic Testing, Continued + Testing and Debugging
(I'll also use Godefroid's RT 07 invited talk slides in class)

Papers for lecture 5:
Godefroid, "Compositional Dynamic Test Generation"
Godefroid, Levin, and Molnar, "Automated Whitebox Fuzz Testing"
Look closely at this one -- nice hybrid of "fuzz" testing and the "concolic" work, plus lots of interesting observations (buckets, distinct inputs to fuzz produce different bugs, etc.)

Zeller and Hildebrandt, "Simplifying and Isolating Failure-Inducing Input"


Lecture 6: Testing and Debugging: Causality and Fault Localization

Papers for lecture 6:
Renieris and Reiss, "Fault Localization with Nearest Neighbor Queries"
Jones and Harrold, "Empirical evaluation of the Tarantula automatic fault-localization technique"
Cleve and Zeller, "Locating causes of program failures"
(optional, not about testing): Groce, Chaki, Kroening, and Strichman, "Error Explanation with Distance Metrics"


Lecture 7: Testing via Explicit-State Model Checking
(and more slides, courtesy Rajeev Joshi)

Papers for lecture 7:
Holzmann and Joshi, "Model-Driven Software Verification"
Groce and Joshi, "Extending Model Checking with Dynamic Analysis"

PROMELA model for testing binary search:
binsearch.pml
Install SPIN. spin -a binsearch.pml; gcc -o pan -DSAFETY pan.c; ./pan
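
For readers without SPIN installed, the general idea of checking a binary search against every input up to a small bound can be sketched in plain Python. This is not the contents of binsearch.pml, which is not reproduced here; the functions binsearch and check_all and the bounds max_len and max_val are assumptions made for illustration only.

```python
from itertools import combinations_with_replacement


def binsearch(items, key):
    """Return an index i with items[i] == key, or -1 if key is absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if items[mid] == key:
            return mid
        if items[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def check_all(max_len=4, max_val=4):
    """Check binsearch on every sorted sequence within the bounds.

    Enumerates all sorted sequences of length 0..max_len over the values
    0..max_val, and for each one every key from -1 to max_val + 1.
    Returns the number of (sequence, key) pairs checked.
    """
    checked = 0
    for n in range(max_len + 1):
        # combinations_with_replacement over a sorted range yields
        # exactly the sorted sequences of length n.
        for items in combinations_with_replacement(range(max_val + 1), n):
            for key in range(-1, max_val + 2):
                got = binsearch(items, key)
                if key in items:
                    assert got != -1 and items[got] == key, (items, key, got)
                else:
                    assert got == -1, (items, key, got)
                checked += 1
    return checked


if __name__ == "__main__":
    print(check_all())
```

Unlike a model checker, this loop does no state matching; it simply enumerates inputs, which is only practical because the bounds are tiny.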

PROMELA model for testing insertion sort:
insertsort.pml
insertsort.c
insertsort.h
Install SPIN. spin -a insertsort.pml; gcc -o pan -DSAFETY pan.c insertsort.c; ./pan


Lecture 8: Testing via Explicit-State Model Checking, Continued
(will also be using slides by Willem Visser, comparing various approaches)

Papers for lecture 8:
Visser, Pasareanu, and Pelanek, "Test-input Generation for Java Containers Using State Matching"


Lecture 9: "Coverage" Revisited: Bounded Exhaustive Testing

Papers for lecture 9:
Coppit, Yang, Khurshid, Le, and Sullivan, "Software Assurance by Bounded Exhaustive Testing"
Musuvathi and Qadeer, "Iterative Context Bounding for Systematic Testing of Multithreaded Programs"