Machine-Checked Proofs about Full-Scale Programming Languages
Friday, November 3rd, 2006 from 12-1 pm in NSH 1507.
Presented by Daniel Lee, CSD
Abstract: Despite the existence of many detailed type safety proofs for small core languages, the full-scale programming languages used for everyday software development have resisted such rigorous analysis. An initial hurdle is the fact that full-scale languages are often defined in terms of compiler implementations or large text documents that are known to contain ambiguities and even errors. Resolving definitional issues is just the beginning: rigorously verifying properties about full-scale languages is a daunting task requiring detailed proofs involving numerous cases. Computer proof assistants are valuable tools in studying formal systems, particularly when human verification would be too error-prone or time-consuming to be practical. This talk will discuss the application of proof assistants to the formalization and verification of programming languages and their metatheory. Computer assistance and a principled approach to language definition make it tractable to formalize and prove properties about large, feature-rich programming languages. The discussion will be motivated by the speaker's work towards a machine-checkable specification and type safety proof for Standard ML using the Twelf proof assistant. This talk is targeted toward a general computer science audience with no prior familiarity with type systems or proof assistants.
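As a toy illustration of the kind of property being machine-checked (this is not the talk's Twelf development, which proves such facts once and for all by induction), the sketch below defines a tiny expression language with a type checker and a small-step evaluator, then checks "preservation" dynamically on one example: a well-typed term never gets stuck, and its type is unchanged at every step.

```python
# Toy sketch of type safety, illustrative only: a tiny language with
# integers, booleans, addition, and conditionals, encoded as tuples.
# Preservation: if an expression has type T and takes a step, the
# resulting expression still has type T.

def typecheck(e):
    """Return 'int' or 'bool', or raise TypeError for ill-typed terms."""
    tag = e[0]
    if tag == 'num':
        return 'int'
    if tag == 'bool':
        return 'bool'
    if tag == 'add':
        if typecheck(e[1]) == 'int' and typecheck(e[2]) == 'int':
            return 'int'
        raise TypeError('add expects two ints')
    if tag == 'if':
        if typecheck(e[1]) != 'bool':
            raise TypeError('if-guard must be bool')
        t1, t2 = typecheck(e[2]), typecheck(e[3])
        if t1 != t2:
            raise TypeError('branches must have the same type')
        return t1
    raise TypeError('unknown form')

def is_value(e):
    return e[0] in ('num', 'bool')

def step(e):
    """One small-step reduction; assumes e is well typed and not a value."""
    tag = e[0]
    if tag == 'add':
        if not is_value(e[1]):
            return ('add', step(e[1]), e[2])
        if not is_value(e[2]):
            return ('add', e[1], step(e[2]))
        return ('num', e[1][1] + e[2][1])
    if tag == 'if':
        if not is_value(e[1]):
            return ('if', step(e[1]), e[2], e[3])
        return e[2] if e[1][1] else e[3]
    raise ValueError('stuck term')

# Evaluate one well-typed term, checking preservation after each step.
e = ('if', ('bool', True), ('add', ('num', 1), ('num', 2)), ('num', 0))
t = typecheck(e)
while not is_value(e):
    e = step(e)
    assert typecheck(e) == t  # preservation holds at every step
```

A proof assistant like Twelf establishes this for all terms at once, which is exactly what becomes hard, and worth mechanizing, at the scale of Standard ML.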
Additional Info: Joint work with Karl Crary and Robert Harper. In partial fulfillment of the Speaking Skills Requirement.
Combining Shape Analysis and Arithmetic Reasoning
Friday, November 10th, 2006 from 12-1 pm in NSH 1507.
Presented by Stephen Magill, CSD
Shape analyses, which statically track the shape of data structures in the heap, are often imprecise in their numerical reasoning, whereas numerical static analyses are often largely unaware of the shape of a program's heap. This talk describes a method for combining a fixed shape analysis with any numerical static analysis. The combination is based on the use of counter-examples as a means of communication between the two analyses and the use of "shadow commands" to eliminate references to the heap when sending queries to the numerical analysis.
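A hypothetical sketch of the division of labor described above (the function names and encoding are mine, not the talk's): the shape analysis establishes that a pointer walks a linked list of some length k, which lets the heap-reading command `n = n->next` be replaced by the heap-free "shadow" arithmetic `k = k - 1`; the resulting program contains only numbers, so a numerical analysis can verify its invariants.

```python
# Illustrative sketch only: heap program vs. its heap-free shadow.

def heap_loop(xs):
    """Original program: count the nodes of a linked list.
    Lists are encoded as nested pairs: ('a', ('b', (..., None)))."""
    i, n = 0, xs
    while n is not None:
        n = n[1]          # n = n->next  -- a heap read
        i += 1
    return i

def shadow_loop(k0):
    """Shadow program: the shape analysis says n points to a list of
    length k, so the heap read becomes arithmetic on k."""
    i, k = 0, k0
    while k > 0:
        k -= 1            # shadow of n = n->next
        i += 1
        assert i + k == k0   # numeric invariant a numerical analysis proves
    return i

lst = ('a', ('b', ('c', None)))   # a heap list of length 3
```

In the combination the talk describes, queries like the invariant above flow to the numerical analysis, and counter-examples flow back when a candidate fact does not hold.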
Making APIs More Usable through Studies of API Design Choices
Friday, November 17th, 2006 from 12-1 pm in NSH 1507.
Presented by Jeffrey Stylos, CSD
Successfully using application programming interfaces (APIs) presents several barriers for novice programmers and professional software engineers alike. To help API designers create APIs that programmers can more easily use, researchers at Microsoft perform usability studies on pre-release versions of APIs. However, these studies are expensive and APIs are too large and numerous to test fully. My research explores developing general design recommendations to guide API designers by conducting usability studies on broadly applicable API design choices, such as which constructors to provide for a given object. This talk describes my approach and the current results of my ongoing research.
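As a hypothetical example of the kind of constructor design choice mentioned above (the class and field names are invented for illustration): an API designer can require configuration in the constructor, or provide a default constructor followed by setters, often called the "create-set-call" pattern. Both reach the same object state, but they differ in discoverability and in whether a partially configured object can escape.

```python
# Two hypothetical designs for the same API object.

class MailMessageA:
    """Required-argument constructor: a usable state is guaranteed up front."""
    def __init__(self, sender, recipient):
        self.sender = sender
        self.recipient = recipient

class MailMessageB:
    """Create-set-call: discoverable incrementally via setters, but the
    object can be observed before it is fully configured."""
    def __init__(self):
        self.sender = None
        self.recipient = None

# Usage: both paths end in the same state.
a = MailMessageA("alice@example.com", "bob@example.com")
b = MailMessageB()
b.sender = "alice@example.com"
b.recipient = "bob@example.com"
```

Usability studies of choices like this one are what produce the general, reusable design recommendations the talk describes.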
In Partial Fulfillment of the Speaking Requirement
Beyond Brain Blobs: machine learning classifiers as instruments for analyzing fMRI data
Friday, December 1st, 2006 from 12-1 pm in NSH 1507.
Presented by Francisco Pereira, CSD
The vast majority of fMRI datasets are analysed using the general linear model approach. This limits the kinds of conclusions that can be drawn and the scope of hypotheses about brain activation and fMRI signal that can be tested. From work in the last few years we know that machine learning classifiers can predict, from fMRI of a subject's brain, which of several cognitive states a subject is in. I will show that we can go further and use them to identify the relationships between brain activation and the cognitive state prediction being made, in ways not feasible or practical with traditional methods. In order to do this, I will go over several experiments with classifiers and what was learnt from them, while also discussing the technical issues involved. I will also describe our efforts to address the problems of predicting and identifying informative patterns of brain activation simultaneously via a new algorithm for learning low-dimensional representations of fMRI data.
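A toy sketch of the basic setup (not the talk's actual pipeline; real analyses involve thousands of voxels, preprocessing, and cross-validation): a nearest-centroid classifier predicts a cognitive state from a voxel activation vector, and its learned per-voxel weights are also interpretable, which is the kind of information the talk argues classifiers can expose beyond mere prediction.

```python
# Illustrative only: nearest-centroid classification of "cognitive states"
# from three hypothetical voxel activations.

def centroid(vectors):
    """Mean activation vector for one class."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def predict(x, centroids):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Training examples: two cognitive states, three voxels each.
train = {
    "faces":  [[0.9, 0.1, 0.2], [1.1, 0.0, 0.3]],
    "houses": [[0.1, 0.8, 0.9], [0.2, 1.0, 1.1]],
}
centroids = {label: centroid(vs) for label, vs in train.items()}

# The centroids themselves show WHICH voxels separate the two states.
guess = predict([1.0, 0.1, 0.2], centroids)
```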
Incremental Pattern Discovery on Tensor Streams
Friday, December 8th, 2006 from 12-1 pm in NSH 1507.
Presented by Jimeng Sun, CSD
How do we find patterns in author-keyword associations, evolving over time? Or in DataCubes, with product-branch-customer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection and rule identification in numerous settings like streaming data, text, graphs, social networks and many more. However, they have only two orders, like author and keyword, in the above example. We propose to envision such higher-order data as tensors, and tap the vast literature on the topic. However, these methods do not necessarily scale up, let alone operate on semi-infinite streams. Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. DTA provides a compact summary for high-order and high-dimensional data, and it also reveals the hidden correlations. Algorithmically, we designed DTA very carefully so that it is (a) scalable, (b) space efficient (it does not need to store the past) and (c) fully automatic with no need for user-defined parameters. Moreover, we propose STA, a streaming tensor analysis method, which provides a fast, streaming approximation to DTA. We implemented all our methods, and applied them in two real settings, namely, anomaly detection and multi-way latent semantic indexing. We used two real, large datasets, one on network flow data (100GB over 1 month) and one from DBLP (200MB over 25 years). Our experiments show that our methods are fast, accurate and that they find interesting patterns and outliers on the real datasets.
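A minimal sketch of one standard step behind tensor methods of this kind (illustrative only; column-ordering conventions vary and this is not the DTA algorithm itself): "matricizing" a 3-way tensor, such as an author x keyword x time array, along one mode. The resulting matrix is where familiar two-order tools like PCA can be applied per mode.

```python
# Illustrative only: mode-1 matricization (unfolding) of a 3-way tensor,
# stored as nested Python lists.

def unfold_mode0(tensor):
    """Rows index mode 0 (e.g. authors); each row flattens the remaining
    modes (e.g. keyword, then time) into columns."""
    return [
        [tensor[i][j][k]
         for j in range(len(tensor[0]))
         for k in range(len(tensor[0][0]))]
        for i in range(len(tensor))
    ]

# A 2 x 2 x 2 tensor: T[author][keyword][time]
T = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
M = unfold_mode0(T)   # a 2 x 4 matrix
```

Methods like DTA and STA go beyond this static step by updating such per-mode summaries incrementally as new tensor slices stream in, without storing the past.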
Web contact: sss+www@cs