Machine-Checked Proofs about Full-Scale Programming Languages
Friday, November 3rd, 2006 from 12-1 pm in NSH 1507.
Presented by Daniel Lee, CSD
Abstract: Despite the existence of many detailed type safety proofs for small core languages, the full-scale programming languages used for everyday software development have resisted such rigorous analysis. An initial hurdle is the fact that full-scale languages are often defined in terms of compiler implementations or large text documents that are known to contain ambiguities and even errors. Resolving definitional issues is just the beginning: rigorously verifying properties about full-scale languages is a daunting task requiring detailed proofs involving numerous cases. Computer proof assistants are valuable tools in studying formal systems, particularly when human verification would be too error-prone or time-consuming to be practical. This talk will discuss the application of proof assistants to the formalization and verification of programming languages and their metatheory. Computer assistance and a principled approach to language definition make it tractable to formalize and prove properties about large, feature-rich programming languages. The discussion will be motivated by the speaker's work towards a machine-checkable specification and type safety proof for Standard ML using the Twelf proof assistant. This talk is targeted toward a general computer science audience with no prior familiarity with type systems or proof assistants.
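As a toy illustration of the kind of property being machine-checked (this is not the talk's Twelf development, which proves such facts once and for all by induction), the sketch below defines a tiny expression language with a type checker and a small-step evaluator, then checks "preservation" dynamically on one example: a well-typed term never gets stuck, and its type is unchanged at every step.

```python
# Toy sketch of type safety, illustrative only: a tiny language with
# integers, booleans, addition, and conditionals, encoded as tuples.
# Preservation: if an expression has type T and takes a step, the
# resulting expression still has type T.

def typecheck(e):
    """Return 'int' or 'bool', or raise TypeError for ill-typed terms."""
    tag = e[0]
    if tag == 'num':
        return 'int'
    if tag == 'bool':
        return 'bool'
    if tag == 'add':
        if typecheck(e[1]) == 'int' and typecheck(e[2]) == 'int':
            return 'int'
        raise TypeError('add expects two ints')
    if tag == 'if':
        if typecheck(e[1]) != 'bool':
            raise TypeError('if-guard must be bool')
        t1, t2 = typecheck(e[2]), typecheck(e[3])
        if t1 != t2:
            raise TypeError('branches must have the same type')
        return t1
    raise TypeError('unknown form')

def is_value(e):
    return e[0] in ('num', 'bool')

def step(e):
    """One small-step reduction; assumes e is well typed and not a value."""
    tag = e[0]
    if tag == 'add':
        if not is_value(e[1]):
            return ('add', step(e[1]), e[2])
        if not is_value(e[2]):
            return ('add', e[1], step(e[2]))
        return ('num', e[1][1] + e[2][1])
    if tag == 'if':
        if not is_value(e[1]):
            return ('if', step(e[1]), e[2], e[3])
        return e[2] if e[1][1] else e[3]
    raise ValueError('stuck term')

# Evaluate one well-typed term, checking preservation after each step.
e = ('if', ('bool', True), ('add', ('num', 1), ('num', 2)), ('num', 0))
t = typecheck(e)
while not is_value(e):
    e = step(e)
    assert typecheck(e) == t  # preservation holds at every step
```

A proof assistant like Twelf establishes this for all terms at once, which is exactly what becomes hard, and worth mechanizing, at the scale of Standard ML.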
Additional Info: Joint work with Karl Crary and Robert Harper. In partial fulfillment of the Speaking Skills Requirement.
Combining Shape Analysis and Arithmetic Reasoning
Friday, November 10th, 2006 from 12-1 pm in NSH 1507.
Presented by Stephen Magill, CSD
Shape analyses, which statically track the shape of data structures in the heap, are often imprecise in their numerical reasoning, whereas numerical static analyses are often largely unaware of the shape of a program's heap. This talk describes a method for combining a fixed shape analysis with any numerical static analysis. The combination is based on the use of counter-examples as a means of communication between the two analyses and the use of "shadow commands" to eliminate references to the heap when sending queries to the numerical analysis.
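A hypothetical sketch of the division of labor described above (the function names and encoding are mine, not the talk's): the shape analysis establishes that a pointer walks a linked list of some length k, which lets the heap-reading command `n = n->next` be replaced by the heap-free "shadow" arithmetic `k = k - 1`; the resulting program contains only numbers, so a numerical analysis can verify its invariants.

```python
# Illustrative sketch only: heap program vs. its heap-free shadow.

def heap_loop(xs):
    """Original program: count the nodes of a linked list.
    Lists are encoded as nested pairs: ('a', ('b', (..., None)))."""
    i, n = 0, xs
    while n is not None:
        n = n[1]          # n = n->next  -- a heap read
        i += 1
    return i

def shadow_loop(k0):
    """Shadow program: the shape analysis says n points to a list of
    length k, so the heap read becomes arithmetic on k."""
    i, k = 0, k0
    while k > 0:
        k -= 1            # shadow of n = n->next
        i += 1
        assert i + k == k0   # numeric invariant a numerical analysis proves
    return i

lst = ('a', ('b', ('c', None)))   # a heap list of length 3
```

In the combination the talk describes, queries like the invariant above flow to the numerical analysis, and counter-examples flow back when a candidate fact does not hold.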
Making APIs More Usable through Studies of API Design Choices
Friday, November 17th, 2006 from 12-1 pm in NSH 1507.
Presented by Jeffrey Stylos, CSD
Successfully using application programming interfaces (APIs) presents several barriers for novice programmers and professional software engineers alike. To help API designers create APIs that programmers can more easily use, researchers at Microsoft perform usability studies on pre-release versions of APIs. However, these studies are expensive and APIs are too large and numerous to test fully. My research explores developing general design recommendations to guide API designers by conducting usability studies on broadly applicable API design choices, such as which constructors to provide for a given object. This talk describes my approach and the current results of my ongoing research.
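As a hypothetical example of the kind of constructor design choice mentioned above (the class and field names are invented for illustration): an API designer can require configuration in the constructor, or provide a default constructor followed by setters, often called the "create-set-call" pattern. Both reach the same object state, but they differ in discoverability and in whether a partially configured object can escape.

```python
# Two hypothetical designs for the same API object.

class MailMessageA:
    """Required-argument constructor: a usable state is guaranteed up front."""
    def __init__(self, sender, recipient):
        self.sender = sender
        self.recipient = recipient

class MailMessageB:
    """Create-set-call: discoverable incrementally via setters, but the
    object can be observed before it is fully configured."""
    def __init__(self):
        self.sender = None
        self.recipient = None

# Usage: both paths end in the same state.
a = MailMessageA("alice@example.com", "bob@example.com")
b = MailMessageB()
b.sender = "alice@example.com"
b.recipient = "bob@example.com"
```

Usability studies of choices like this one are what produce the general, reusable design recommendations the talk describes.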
In Partial Fulfillment of the Speaking Requirement
Beyond Brain Blobs: machine learning classifiers as instruments for analyzing fMRI data
Friday, December 1st, 2006 from 12-1 pm in NSH 1507.
Presented by Francisco Pereira, CSD
The vast majority of fMRI datasets are analysed using the general linear model approach. This limits the kinds of conclusions that can be drawn and the scope of hypotheses about brain activation and fMRI signal that can be tested. From work in the last few years we know that machine learning classifiers can predict, from fMRI of a subject's brain, which of several cognitive states a subject is in. I will show that we can go further and use them to identify the relationships between brain activation and the cognitive state prediction being made, in ways not feasible or practical with traditional methods. In order to do this, I will go over several experiments with classifiers and what was learnt from them, while also discussing the technical issues involved. I will also describe our efforts to address the problems of predicting and identifying informative patterns of brain activation simultaneously via a new algorithm for learning low-dimensional representations of fMRI data.
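A toy sketch of the basic setup (not the talk's actual pipeline; real analyses involve thousands of voxels, preprocessing, and cross-validation): a nearest-centroid classifier predicts a cognitive state from a voxel activation vector, and its learned per-voxel weights are also interpretable, which is the kind of information the talk argues classifiers can expose beyond mere prediction.

```python
# Illustrative only: nearest-centroid classification of "cognitive states"
# from three hypothetical voxel activations.

def centroid(vectors):
    """Mean activation vector for one class."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def predict(x, centroids):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Training examples: two cognitive states, three voxels each.
train = {
    "faces":  [[0.9, 0.1, 0.2], [1.1, 0.0, 0.3]],
    "houses": [[0.1, 0.8, 0.9], [0.2, 1.0, 1.1]],
}
centroids = {label: centroid(vs) for label, vs in train.items()}

# The centroids themselves show WHICH voxels separate the two states.
guess = predict([1.0, 0.1, 0.2], centroids)
```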
Incremental Pattern Discovery on Tensor Streams
Friday, December 8th, 2006 from 12-1 pm in NSH 1507.
Presented by Jimeng Sun, CSD
How do we find patterns in author-keyword associations, evolving over time? Or in DataCubes, with product-branch-customer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection and rule identification in numerous settings like streaming data, text, graphs, social networks and many more. However, they have only two orders, like author and keyword, in the above example. We propose to envision such higher-order data as tensors, and tap the vast literature on the topic. However, these methods do not necessarily scale up, let alone operate on semi-infinite streams. Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. DTA provides a compact summary for high-order and high-dimensional data, and it also reveals the hidden correlations. Algorithmically, we designed DTA very carefully so that it is (a) scalable, (b) space efficient (it does not need to store the past) and (c) fully automatic with no need for user-defined parameters. Moreover, we propose STA, a streaming tensor analysis method, which provides a fast, streaming approximation to DTA. We implemented all our methods, and applied them in two real settings, namely, anomaly detection and multi-way latent semantic indexing. We used two real, large datasets, one on network flow data (100GB over 1 month) and one from DBLP (200MB over 25 years). Our experiments show that our methods are fast, accurate and that they find interesting patterns and outliers on the real datasets.
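A minimal sketch of one standard step behind tensor methods of this kind (illustrative only; column-ordering conventions vary and this is not the DTA algorithm itself): "matricizing" a 3-way tensor, such as an author x keyword x time array, along one mode. The resulting matrix is where familiar two-order tools like PCA can be applied per mode.

```python
# Illustrative only: mode-1 matricization (unfolding) of a 3-way tensor,
# stored as nested Python lists.

def unfold_mode0(tensor):
    """Rows index mode 0 (e.g. authors); each row flattens the remaining
    modes (e.g. keyword, then time) into columns."""
    return [
        [tensor[i][j][k]
         for j in range(len(tensor[0]))
         for k in range(len(tensor[0][0]))]
        for i in range(len(tensor))
    ]

# A 2 x 2 x 2 tensor: T[author][keyword][time]
T = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
M = unfold_mode0(T)   # a 2 x 4 matrix
```

Methods like DTA and STA go beyond this static step by updating such per-mode summaries incrementally as new tensor slices stream in, without storing the past.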
Web contact: sss+www@cs