Overview

Spring 2016
TR 3:00-4:20
GHC 4303
12 units
Professor: Claire Le Goues
Email: clegoues at cs dot cmu etc
Office: WEH 5117
Office Hours: Fridays 1-2 pm; also by appointment

This course provides an overview of the state of the art in program analysis as well as recent research in the area. Topics will include: program representations, abstract interpretation, type-based and constraint-based analysis, interprocedural analysis, counterexample-guided abstraction refinement, extended static checking, dynamic analyses (including testing and test input generation) and combinations of dynamic and static analysis. The course will mix theory and practice; students will formalize analyses and prove them correct, but also implement actual analyses for real programs, write a short challenge paper, and complete a capstone course research project.

This is a graduate-level course targeting Ph.D. students, but master's and strong undergraduate students interested in program analysis are also welcome. Please email the instructor if you are such a student and want to discuss your preparation and interest in the course. There is no course prerequisite, but students will benefit from being comfortable with formal definitions.

Acknowledgements: Many thanks to Jonathan Aldrich and Wes Weimer for their pointers and course materials, which I adapt and reuse in various ways throughout the course.

Logistics

We will use Piazza for course materials, questions, and discussion, as applicable. Set up an account and enroll ASAP. Please send messages and questions through the Piazza interface whenever possible.

We will use Blackboard for the submission of homework assignments only.

We hope to have a GitHub account set up for the course for programming assignments and the final project, assuming they grant the course academic status. Stay tuned.

(Ed note: I apologize for the diversity of information sources and websites we will use to coordinate this class; the lack of a single good full-service solution is the bane of my instructional existence.)

Format and grading

Overall, the course will consist of both lectures and seminar-style discussion. The course work will consist of a combined practical analysis and short paper; readings (mostly research papers), for which we may include small reading quizzes, at our discretion; written homework assignments (formalizing and proving properties about analyses); programming assignments (implementing analyses); a midterm; and a project, to be written up and presented to the class. Ph.D. students are welcome to make use of their existing research as part of the project if it is related to program analysis in some way. There will be no final exam. Participation in the seminar-style discussion will be expected and considered in grading.

The approximate grade breakdown will thus be:

Late days

For individual homework assignments: Everyone in the class has 5 late days that may be used throughout the semester. After late days are used, late assignments will be penalized by 10% per day. I will grant additional late days in extenuating circumstances (illness or other such emergencies), once you have used your existing allotment. Contact me via email.

If you do HW1(b) in a pair, late days will be subtracted from both members' allotment.

Late days may not be used for reading questions (since we answer them in class) or for the final project presentation, absent very extenuating circumstances.
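As a rough illustration of how the late-day arithmetic above plays out, here is a minimal Python sketch. The `LateDayLedger` class and its `submit` method are hypothetical names made up for this example. The sketch also assumes that lateness is counted in whole days, that the 10% penalty is taken from the score the assignment would otherwise have earned, and that the penalty is capped at 100%. The policy text does not spell out any of these details, so treat this as one plausible reading rather than the official rule.

```python
from dataclasses import dataclass

LATE_DAY_ALLOTMENT = 5        # free late days per student, from the policy
PENALTY_PERCENT_PER_DAY = 10  # penalty per late day once the allotment is spent


@dataclass
class LateDayLedger:
    """Tracks one student's remaining free late days (hypothetical helper)."""

    remaining: int = LATE_DAY_ALLOTMENT

    def submit(self, raw_score: float, days_late: int) -> float:
        """Spend free late days first, then penalize any days left over."""
        if days_late < 0:
            raise ValueError("days_late must be non-negative")
        free_days = min(days_late, self.remaining)
        self.remaining -= free_days
        penalized_days = days_late - free_days
        # Assumption: the penalty is a percentage of the raw score, capped at 100%.
        penalty_percent = min(100, PENALTY_PERCENT_PER_DAY * penalized_days)
        return raw_score * (100 - penalty_percent) / 100


ledger = LateDayLedger()
first = ledger.submit(raw_score=90, days_late=3)   # covered by free late days
second = ledger.submit(raw_score=80, days_late=4)  # 2 free days, 2 penalized days
print(first, second, ledger.remaining)
```

Under these assumptions, the first submission keeps its full score and the second loses 20%, after which no free late days remain. Extra late days granted for extenuating circumstances, and the shared deduction for pairs on HW1(b), are not modeled here.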

Supplemental course textbook:

Schedule

The schedule is subject to change as the course progresses, though the posted readings/assignments should stay stable. I link the first week's notes and homework below in case of Piazza access issues, but after that, go to Piazza for links to readings, homework assignments, etc.

Date | Topic | Reading assignments | Assignments Due | Optional reading
Jan 12 | Intro & program representation | | |
Jan 14 | Program representation | Ayewah et al., 2008; Lecture notes 1: Program representation | Get on Piazza and bb! | PPA Ch 1
Jan 19 | Dataflow analysis framework | Engler et al., 2001; Dyer et al., 2013 | | PPA, Ch 2
Jan 21 | Dataflow analysis framework | Lecture notes 2: Dataflow analysis Framework | |
Jan 26 | Dataflow analysis examples, intro to correctness | Lecture notes 3: Dataflow analysis examples; Lecture notes 4: Dataflow analysis correctness | HW1(a): written component and challenge proposal | PPA Ch 6
Jan 28 | Dataflow analysis correctness | Cousot and Cousot's Abstract Interpretation; Abramsky's Introduction to Abstract Interpretation | | PPA Ch 4
Feb 2 | Abstract Interpretation (1/N) | | HW1(b): challenge write-up |
Feb 4 | Abstract Interpretation: Widening and Collecting | Lecture Notes 5: Widening and Collecting | |
Feb 9 | No class -- instructor illness | Bonus Notes: While and While3Addr Reference | |
Feb 11 | Interprocedural analysis | Lecture Notes 6: Interprocedural analysis | HW2: Control Flow Analysis | PPA Ch 2.5
Feb 16 | Pointer analysis | Reps et al. 95; Gulwani and Necula 05. Note: we will have a short reading quiz at the start of class | |
Feb 18 | Pointer analysis, part 2 | Lecture Notes 7: Pointer analysis | HW3: Dataflow Correctness |
Feb 23 | Buffer overflow analysis | Hardekopf and Lin 07. Note: we MAY have a short reading quiz at the start of class | |
Feb 25 | Functional Control Flow Analysis | Hackett et al. 06. Note: we MAY have a short reading quiz at the start of class | | PPA Ch. 3
Mar 1 | OO Call Graph Construction | Lecture Notes 8: CFA and Dynamic Dispatch | HW4: interprocedural analysis |
Mar 3 | midterm | | |
Mar 8 | NO CLASS--Spring Break | | |
Mar 10 | NO CLASS--Spring Break | | |
Mar 15 | Hoare Logic/Axiomatic Semantics/Verification Conditions | | |
Mar 17 | Hoare Logic/Axiomatic Semantics/Verification Conditions | Hoare '71; Optional: Hoare '69 | |
Mar 22 | Model Checking (Guest: Arie Gurfinkel, SEI) | | |
Mar 24 | Model Checking (Guest: Arie Gurfinkel, SEI) | | |
Mar 29 | Symbolic Execution/Test Input Generation | | HW 5: Hoare-style verification |
Mar 31 | Concolic Execution | Das et al. 02; Cadar et al. 08 | RQ 1: Symbolic Execution |
Apr 5 | Program synthesis | Lecture Notes 9: Component-Based Program Synthesis | Project Proposal |
Apr 7 | Guest lecture: Nadia Polikarpova | | |
Apr 12 | Dynamic analysis | | |
Apr 14 | NO CLASS: Carnival | | |
Apr 19 | Automatic program repair (heuristic) | Le Goues et al. 12; Mechtaev et al. 16 | RQ 2: program repair; Project milestone 1 |
Apr 21 | Repair part 2: Synthesis Strikes Back | Logozzo and Ball 12 | |
Apr 26 | Boogie, Dafny, and OO verification | | |
Apr 28 | Project presentations | | Project presentations |
May 5 | | | Project deliverables due! |

Academic Integrity

It should go without saying that we expect the utmost integrity and honesty of our students, especially in graduate-level courses. Especially for the research project and project-like components of the class, I expect you will discuss ideas, problems, and solutions with your classmates, as is standard in a healthy research environment.

That said, I expect you to follow standard academic integrity policies (such as the University Policy on Academic Integrity) for the individual problem set-like homework assignments especially. Although you may discuss the homework with your peers, you may not copy any part of a solution to a problem that was written by another student, or was developed together with another student, or was copied from another unauthorized source such as the Internet. You may not look at another student's solution, even if you have completed your own, nor may you knowingly give your solution to another student or leave your solution where another student can see it.

Here are some examples of inappropriate behavior, which I borrow with attribution from the 15-214 and 15-313 guidelines:

If any of your work contains any statement that was not written by you, you must put it in quotes and cite the source. If you are paraphrasing an idea you read elsewhere, you must acknowledge the source. Using existing material without proper citation is plagiarism, a form of cheating, and it is a career-ender for modern academics. If there is any question about whether the material is permitted, get permission in advance.

It is not considered cheating to clarify vague points in the assignments, lectures, or lecture notes; to give help or receive help in using the computer systems, compilers, debuggers, profilers, or other facilities; or to discuss ideas at a very high level, without referring to or producing code.

Any violation of this policy is cheating. The minimum penalty for cheating (including plagiarism) will be a zero grade for the whole assignment. Cheating incidents will also be reported through University channels, with possible additional disciplinary action (see the above-linked University Policy on Academic Integrity); note that the University rarely gives graduate students second chances.

If you have any question about how this policy applies in a particular situation, ask the instructor for clarification.

Note that the instructor respects honesty in these (and indeed most!) situations.