go back to the main page

 SSS Abstracts 
Spring 2006

go back to the main page

Conserved spatial patterns in genomes: signal or noise?

Friday, January 20th, 2006 from 12-1 pm in NSH 1507.

Presented by Rose Hoberman, CSD

Genome comparison is a powerful technique for understanding evolutionary relationships, functional constraints on genome organization, processes of genome evolution, and how those processes lead to evolution of new function. An essential prerequisite for all these analyses is the identification of chromosomal regions that have arisen from a common ancestor. In closely related genomes, these regions are characterized by identical gene content and order. However, in more distantly related genomes, related regions must be detected by searching for gene clusters, pairs of regions with similar, but not identical, gene content and scrambled gene order. As gene content and order diverge, it becomes more difficult to distinguish related regions from regions that share a few genes by chance. In this case, statistical tests to reject the null hypothesis become essential. In this talk I present the first formal statistical tests for max-gap clusters, the most widely used gene cluster definition. I show how analyzing the significance of gene clusters within a formal hypothesis testing framework reveals limitations of the current methodology, explains differences in the results of recent studies, and suggests improved strategies for identifying gene clusters.

In partial fulfillment of the speaking requirement.

Focused Forward Proof Search in Linear Logic

Friday, February 3rd, 2006 from 12-1 pm in NSH 1507.

Presented by Kaustuv Chaudhuri, CSD

Perhaps the most common method of searching for a proof of a proposition is to start from the desired goal proposition and work backwards, refining the subgoals using the logical rules of inference, until one is left with known facts. The proof search procedure has to make many choices during this search; for example, which path to take when there are multiple ways to prove a subgoal, or which subgoal to attempt first when there are multiple subgoals. Focusing is the name given to an analysis of the logical connectives to determine which of these choices are essential, and which choices are either inessential or can be postponed; in practice, focusing increases the efficiency of the search procedure astronomically.

Backward proof search, however, is not always the best proof search strategy. Linear logic, a logic for reasoning about state, is one example where backward search contains undesirable non-deterministic choices that entirely disappear when reasoning forwards-- starting from known facts and inferring new facts using the logical rules until the goal proposition is inferred. Because focusing is so beneficial in the backward direction, the obvious question to ask is whether the insights of focusing can also be applied to forward search.

In this talk I will sketch the forward search strategy for a fragment of linear logic and answer precisely this question. No familiarity with linear logic is assumed.

Based on joint work with Frank Pfenning.

In partial fulfilment of the speaking requirement.

Adaptable Automatic Evaluation Metrics for Machine Translation

Friday, February 10th, 2006 from 12-1 pm in NSH 1507.

Presented by Lucian Vlad Lita, CSD

Recent machine translation (MT) evaluation methods show promising correlation to human judgments in terms of adequacy and fluency. These advances allow researchers to bypass the expensive and time consuming human assessments when building or using machine translation systems. However, automatic MT evaluation still has considerable room for improvement, especially when sentence-level evaluation is required. In particular, flexible, parametrized models can be learned from past data and can be automatically optimized to correlate well with human judgments for different criteria (e.g. adequacy, fluency) using different correlation measures. We present BLANC, a family of dynamic, trainable evaluation metrics for machine translation and we show that it is a generalization of established BLEU and ROUGE metric families. We compare correlations with human judgments across these three metric families using MT output from recent Arabic to English evaluations and we discuss the benefits of shifting to a paradigm of learning specific evaluation metrics for machine translation.

In partial fulfillment of the speaking requirement

Near-optimal Sensor Placements: Maximizing Information while Minimizing Communication Cost

Friday, February 17th, 2006 from 12-1 pm in NSH 1507.

Presented by Andreas Krause, CSD

When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this talk, I will present our data-driven approach that addresses the three central aspects of this problem: measuring the predictive quality of a set of sensor locations (regardless of whether sensors were ever placed at these locations), predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff. Specifically, we use data from a pilot deployment to build non-parametric probabilistic models called Gaussian Processes (GPs) both for the spatial phenomena of interest and for the spatial variability of link qualities, which allows us to estimate predictive power and communication cost of unsensed locations. Using these models, we present a novel efficient algorithm, pSPIEL, which selects Sensor Placements at Informative and cost-Effective Locations. Exploiting two important properties of this problem -- submodularity and locality -- we prove strong approximation guarantees for our pSPIEL approach. We also provide extensive experimental validation of this practical approach on several real-world placement problems, demonstrating significant advantages over existing methods.

Joint work with Carlos Guestrin, Anupam Gupta, Jon Kleinberg and Ajit Singh. In partial fulfillment of the CSD speaking requirement.

Using Regularity in One-Hop Clustering for Sensor Network Localization

Friday, February 24th, 2006 from 12-1 pm in NSH 1507.

Presented by Haowen Chan, CSD

In large scale sensor networks, it is often necessary for each node to know its approximate geographical location in order to perform such tasks as correlating readings with physical locations, routing, or answering queries for data. However, exhaustively programming each sensor node with its location may be too costly for large deployments, and relying on special hardware (such as GPS) is impractical for low cost sensor nodes. The problem is further complicated by obstacles and irregularities that may limit the accuracy of standard triangulation-based localization algorithms. This talk presents a novel approach to localization that makes use of the geographical regularity produced in solving an unrelated problem: one-hop sensor node clustering. By first performing sensor node clustering, we can derive sufficient information to perform approximate localization without requiring ranging or positioning equipment. We show that the resultant algorithm outperforms other currently known localization techniques under certain conditions. Joint work with Adrian Perrig.

This talk is in partial fulfilment of the speaking requirement.

SPIRIT: Streaming Pattern Discovery : methods and applications

Friday, March 24th, 2006 from 12-1 pm in NSH 1507.

Presented by Jimeng Sun, CSD

Data stream is an important model for many different applications: network traffic analysis, sensor network monitoring, moving object tracking, and financial data analysis. A lot of those are monitoring applications, where huge volume of data streams into the system in real-time for processing and mining. In this talk, I will present SPIRIT (Streaming Pattern dIscoveRy in multIple Timeseries). Given n numerical data streams, all of whose values we observe at each time tick t, SPIRIT can incrementally find correlations and hidden variables, which summarize the key trends in the entire stream collection. It can do this quickly, with no buffering of stream values and without comparing pairs of streams. Moreover, it dynamically detects changes which can be used to immediately spot potential anomalies, to do efficient forecasting. And our experimental evaluation and case studies show that SPIRIT can incrementally capture correlations and discover trends, efficiently and effectively. Finally, I will show a web demo on network clustering monitoring using SPIRIT.

Evaluating Expected Character Performance in Data-Driven Animation

Friday, March 31st, 2006 from 12-1 pm in NSH 1507.

Presented by Paul Reitsma, CSD

Humanlike characters are an important area of computer animation. These characters are vital components of many applications, such as computer games, cinematic special effects, virtual reality training, and artistic expression. There are two main challenges to animating humanlike characters. First, our daily familiarity with human motion makes creating realistic humanlike motion particularly difficult, since viewers have an extremely high sensitivity to artifacts or errors in human motion; this is the problem of producing animation of sufficient quality. Second, animators require a motion generation system that can produce all motions required to animate the character through the full range of tasks in the target application; this is the problem of ensuring the motion generation system has sufficient capability. While several technologies have been developed to address these problems, our understanding of such technologies is still developing. Accordingly, techniques for producing high-quality animations---such as motion capture---and for producing high-capability animation systems---such as motion graphs---typically rely heavily on the intuition and abilities of a skilled animator to achieve acceptable results, and hence are difficult to use or automate. The goal of this work was to augment the animator's intuition by providing ways to more objectively quantify and measure a system's capability to produce the required motion, and the quality to viewers of the motion so produced. Our approach is to foster a more quantitative and objective understanding of the motion graph data structure, including both the visual quality of the resulting animation to users and the task-performance capability of a character driven by the motion graph.

Autonomous Visualization

Friday, April 7th, 2006 from 12-1 pm in NSH 1507.

Presented by Khalid El-Arini, CSD

Many classification algorithms suffer from a lack of human interpretability. Using such classifiers to solve real world problems often requires blind faith in the given model. In this talk, I will present our novel approach to classification that takes into account interpretability and visualization of the results. We attempt to produce the most relevant snapshot of the data, in the form of a two-dimensional scatter plot with easily understandable axes. We define a scoring metric for scatter plots, and design and implement an algorithm for efficiently discovering the most relevant plot. We then use this plot as the basis for a classification algorithm. Furthermore, we investigate the trade-off between classification accuracy and interpretability by comparing the performance of our classifier on real data with that of several traditional classifiers. We find that there is little to be lost in terms of accuracy when additional interpretability is sought.

Joint work with Andrew Moore and Ting Liu. In partial fulfillment of the CSD speaking requirement.

Browsing Simulation Options

Friday, April 14th, 2006 from 12-1 pm in NSH 1507.

Presented by Christopher Twigg, CSD

Physically based simulation is a powerful tool that allows animators to achieve realistic, compelling results without tedious manual keyframing. The trade-off, however, is that animators have less creative control over the results. Several techniques have been developed in recent years that give animators more control without sacrificing physical realism, but these tend to impose limitations on the class of simulators that can be used and may require a lengthy optimization process. I will discuss a simpler approach that involves precomputing a large set of simulation possibilities and allowing the user to select from among these. Our approach relies on a simple and intuitive interface based on using constraints to limit options and then ranking to select from among the remaining option. I will show how our approach can be used for animations involving both large numbers of colliding objects and articulated objects.

Joint work with Doug L. James

In Partial Fulfillment of the Speaking Requirement

Mobility and Locality Types for Distributed Programming

Friday, April 28th, 2006 from 12-1 pm in NSH 1507.

Presented by Jonathan Moody, CSD

Distributed computation may be envisioned in many ways, leading to a wide variety of distributed calculi and programming models. In this talk, we present a simple, statically-typed, functional idiom for distributed programming. The essential novelty of our approach consists of a type theory of mobility types []A (anywhere type A) and locality types <>A (somewhere type A). The properties of mobility and locality are defined in a way that is orthogonal to other type constructors, hence the type system is easily extensible to incorporate structured data and polymorphism. The typing of local effects (e.g. I/O, mutable store) is a notable exception. Through examples, we''ll show how this calculus supports a functional style of distributed programming in which type constructors [] and <> serve to manage the spatial "effect" of switching locations. In contrast to more general process calculi, communication deadlock and non-determinism are not intrinsic features of the language. Finally, we discuss the key type-safety result: the absence of runtime marshalling errors.

with: Frank Pfenning

in partial fulfillment of the Speaking Requirement

Certified Computation

Friday, May 5th, 2006 from 12-1 pm in NSH 1507.

Presented by Susmit Sarkar, CSD

Type systems have been used to specify various properties of programs, which can be verified automatedly. We propose a powerful system, called LF/ML, that can be used to verify partial correctness of programs. This is a refinement type system built on a functional programming language. It can prove correctness of various programs, such as our target application, a type inference engine.

In this talk, I will start with reviewing, for a general computer science audience, of prior work in type theory. I will motivate our work as a natural evolution of these ideas.

This is joint work with Karl Crary. The talk is in partial fulfilment of the CSD speaking requirement.

Consumable Credentials in Logic-Based Access Control

Friday, May 12th, 2006 from 12-1 pm in NSH 1507.

Presented by Kevin Bowers, CSD

Logic-based access control has many advantages. Policies are easy specify, easy to understand, and easy to prove correct. To gain access to a resource, users create a logical proof which is then checked by the reference monitor before access is granted. As long as the proof is valid and meets the access-control policy, which is also specified in logic, the user is allowed access. Unfortunately, once a proof has been constructed, it can be copied and reused at will. This prevents the system from implementing any sort of bounded-use policy. We develop a means to overcome this deficiency in logic-based access control. Because resources can now be consumed, we also develop techniques to prevent resources from being wasted, even in a setting where proofs are generated in a distributed manner.

In Partial Fulfillment of the Speaking Requirement

From Adequate Proofs to Efficient Proofs

Friday, May 26th, 2006 from 12-1 pm in NSH 1507.

Presented by Jason Reed, CSD

In the last few decades the idea of mechanizing mathematics has made a lot of progress. It is becoming routine for mathematicians (and computer scientists) to be able to put definitions, theorems, proofs, and other mathematical gadgets into a form that computers can manipulate. There are already many tools available that can confirm mathematical truths to a much higher degree of confidence than hand-checked human-written proofs.

However, these formal proofs typically blow up in size compared to "paper" proofs, because every last formal detail of the proof must (in principle) be included. This talk describes an attempt to address this problem: it turns out that a more compact representation of proofs comes naturally out of thinking about what it means for a formal system to be an adequate representation of the informal mathematics it’s meant to encode.

(In Partial Fulfillment of the Speaking Requirement)

Web contact: sss+www@cs