
 SSS Abstracts 
Spring 2005


Anytime A* Search with Provable Bounds on Sub-Optimality

Friday, January 14th, 2005 from 12-1 pm in NSH 1507.

Presented by Maxim Likhachev

In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this talk I will propose an anytime version of a widely popular heuristic search, A* search. The proposed algorithm tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, the search reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to theoretical analysis, I will demonstrate the practical utility of the proposed algorithm with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover.
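As a rough illustration of the anytime idea (not the speaker's algorithm, which additionally reuses search effort across iterations), the sketch below simply re-runs weighted A* on a toy grid with a shrinking heuristic inflation factor; each solution found is provably within a factor of epsilon of optimal:

```python
import heapq

def weighted_astar(grid, start, goal, epsilon):
    """A* with the heuristic inflated by epsilon: the returned path cost
    is guaranteed to be at most epsilon times the optimal cost."""
    rows, cols = len(grid), len(grid[0])
    def h(p):  # Manhattan-distance heuristic (admissible when epsilon = 1)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    g = {start: 0}
    frontier = [(epsilon * h(start), start)]
    closed = set()
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur == goal:
            return g[cur]
        if cur in closed:
            continue
        closed.add(cur)
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g[cur] + 1
                if ng < g.get((nr, nc), float("inf")):
                    g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + epsilon * h((nr, nc)), (nr, nc)))
    return None

# Anytime loop: start with a loose bound, tighten it as time allows.
grid = [[0, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 0, 0, 0]]
costs = [weighted_astar(grid, (0, 0), (4, 3), eps) for eps in (3.0, 2.0, 1.0)]
print(costs)
```

With epsilon = 1 the heuristic is admissible, so the last answer is optimal; larger values trade solution quality for search speed.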

This is joint work with Geoff Gordon and Sebastian Thrun.

This talk is in partial fulfillment of the speaking requirement.

Parameter Related Domain Knowledge for Learning in Bayesian Networks

Friday, February 4th, 2005 from 12-1 pm in NSH 1507.

Presented by Stefan Niculescu

Building accurate models from a small amount of available training data can sometimes prove to be a great challenge. Expert domain knowledge can often be used to alleviate this burden. Domain knowledge can come in different forms: knowledge about which variables are relevant, knowledge about conditional independences among groups of variables, or knowledge about relationships among the parameters of a given model structure. In this talk I will discuss how to take advantage of knowledge about relationships among parameters for the purpose of parameter estimation in Bayesian Networks. First, I will argue why it is important to be able to take advantage of this kind of domain knowledge. Then I will discuss parameter learning in several scenarios where expert knowledge is readily available. In the last part of my talk I will discuss the contributions of my research in this area. First, I introduce a General Parameter Sharing Framework that describes models like HMMs, Context Specific Independence and Module Networks, but allows for much finer-grained parameter sharing assumptions. Second, I describe a hierarchical extension of this framework. Finally, I will discuss two other types of domain knowledge: probability mass sharing and probability ratio sharing.
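A toy sketch of one such constraint, parameter equality: if an expert asserts that P(X=1 | Y=0) equals P(X=1 | Y=1), the maximum-likelihood estimate pools counts across parent values, so the shared parameter is estimated from more data. (The variables and data here are hypothetical; this is not the speaker's framework.)

```python
from collections import Counter

def mle_cpt(data, share=False):
    """Maximum-likelihood estimates of P(X=1 | Y=y) from (y, x) samples.
    With share=True, the expert asserts both parameters are equal, so
    counts are pooled across parent values."""
    counts = Counter(data)
    if share:
        p = (counts[(0, 1)] + counts[(1, 1)]) / sum(counts.values())
        return {0: p, 1: p}
    return {y: counts[(y, 1)] / (counts[(y, 0)] + counts[(y, 1)])
            for y in (0, 1)}

# Tiny hypothetical dataset of (y, x) observations.
data = [(0, 1), (0, 0), (0, 1), (1, 1), (1, 0), (1, 1), (1, 1), (1, 0)]
print(mle_cpt(data))               # two separate parameters
print(mle_cpt(data, share=True))   # one shared parameter, pooled counts
```

With only three samples for Y=0, the separate estimate is noisy; sharing lets all eight samples inform the single parameter.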

This talk is in partial fulfillment of the speaking requirement.

Boolean Methods for Arithmetic Reasoning

Friday, February 11th, 2005 from 12-1 pm in NSH 1507.

Presented by Sanjit Seshia

Automatic techniques for analyzing data-intensive and timing-dependent properties of computer systems have lagged behind those for control-intensive properties, such as finite-state model checking. A primary reason is the lack of efficient and automated decision procedures for logics more expressive than Boolean logic, which necessitates trading precision for scalability.

This talk presents a new approach to building scalable and automated decision procedures for first-order logics involving arithmetic. There are two key ideas in this approach. First, decision problems involving arithmetic are transformed to problems in the Boolean domain, such as Boolean satisfiability solving, thereby leveraging recent advances in Boolean methods. Second, the transformation automatically detects and exploits problem structure, using formula analysis and machine learning. Decision procedures based on these problem-specific Boolean methods can outperform other state-of-the-art procedures by two or three orders of magnitude.
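A toy illustration of the first idea, bit-blasting: a fixed-width arithmetic constraint is rewritten as pure Boolean operations and handed to a satisfiability search. Here a brute-force enumeration stands in for a real SAT solver, and the constraint x + y = 7 over 3-bit words is an invented example:

```python
from itertools import product

def to_bits(n, w):
    """Little-endian w-bit encoding of n."""
    return [(n >> i) & 1 for i in range(w)]

def ripple_add(a, b):
    """Bit-blast w-bit addition into Boolean operations (ripple carry)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out

def solve(w, target):
    """Find (x, y) with x + y == target mod 2^w by searching Boolean
    assignments -- a stand-in for calling a SAT solver on the encoding."""
    for bits in product((0, 1), repeat=2 * w):
        a, b = list(bits[:w]), list(bits[w:])
        if ripple_add(a, b) == to_bits(target, w):
            return (sum(v << i for i, v in enumerate(a)),
                    sum(v << i for i, v in enumerate(b)))
    return None

x, y = solve(3, 7)
print(x, y)
```

The structure-exploiting encodings in the talk are far more sophisticated, but the pipeline is the same: arithmetic in, Boolean formula out.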

This approach has been implemented in two systems, UCLID and TMV, and its utility has been demonstrated in a range of applications, including finding API-level security exploits, detecting viruses and worms, and verifying circuits that operate under timing assumptions.

Verification by Network Decomposition

Friday, February 18th, 2005 from 12-1 pm in NSH 1507.

Presented by Murali Talupur

Systems consisting of multiple processes executing asynchronously occur very often in practice and verifying such systems is one of the main challenges facing the verification community. In this talk I will describe a new method to verify networks of homogeneous processes which communicate by token passing. Given an arbitrary network graph and an indexed LTL\X property, we can decompose the network graph into multiple constant size networks, thereby reducing one model checking call on a large network to several calls on small networks. We thus obtain cut-offs for arbitrary classes of networks, adding to previous work on the ring topology. Our results on LTL\X are complemented by a negative result which precludes the existence of such reductions for CTL\X properties on general networks.

This talk is in partial fulfillment of the speaking requirement.

SAT-based Abstraction Refinement

Friday, February 25th, 2005 from 12-1 pm in NSH 1507.

Presented by Anubhav Gupta


A Progressive Register Allocator for Irregular Architectures

Friday, March 11th, 2005 from 12-1 pm in NSH 1507.

Presented by Dave Koes

Register allocation is one of the most important optimizations a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like architectures, but perform poorly on irregular, CISC-like architectures with few registers and non-orthogonal instruction sets. At the other extreme, optimal register allocators based on integer linear programming are capable of fully modeling and exploiting the peculiarities of irregular architectures but do not scale well. We introduce the idea of a progressive allocator, which finds an initial allocation of quality comparable to a conventional allocator but, as more time is allowed for computation, produces allocations whose quality approaches optimal. This paper presents a progressive register allocator which uses a multi-commodity network flow model to elegantly represent the intricacies of irregular architectures. We evaluate our allocator by substituting it for gcc's local register allocation pass.
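For contrast with the optimal formulation, the sketch below shows the kind of fast greedy baseline a progressive allocator starts from: a simplified linear-scan pass over live intervals (not the talk's multi-commodity flow model, and with invented intervals). When registers run out, a value is spilled to memory:

```python
def linear_scan(intervals, num_regs):
    """Greedy local register allocation over live intervals.
    intervals: name -> (start, end). Returns (assignment, spilled)."""
    order = sorted(intervals, key=lambda v: intervals[v][0])
    free = list(range(num_regs))
    active, assign, spilled = [], {}, []
    for v in order:
        start, _ = intervals[v]
        # Free registers whose intervals ended before this one starts.
        for u in [u for u in active if intervals[u][1] < start]:
            active.remove(u)
            free.append(assign[u])
        if free:
            assign[v] = free.pop()
            active.append(v)
        else:
            spilled.append(v)  # no register available: keep v in memory
    return assign, spilled

# Hypothetical live ranges within a basic block.
live = {"a": (0, 4), "b": (1, 3), "c": (2, 6), "d": (5, 7)}
print(linear_scan(live, 2))
```

On an irregular architecture, register constraints and instruction idioms make such greedy choices much harder, which is what motivates the richer flow-based model.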

Scaling Task Management in Space and Time

Friday, March 18th, 2005 from 12-1 pm in WeH 4623.

Presented by Joao Pedro Sousa, CSD

Advances in computing and networking are prompting users to change their expectations about the availability of computing. Instead of making primary use of a single machine, users may want to take advantage of the computing capabilities at different locations, much like people take advantage of furniture today. Furthermore, people carry out computer-supported tasks that span months or years, and that may recur. Today, after a system shutdown, users have to manually reconstruct the computing state of their tasks.

This talk describes a new approach to the scalability of task management in space, across heterogeneous computing environments, and in time. The approach is based on high-level models of what users need from the computing environment for each of their tasks. Such models are exploited at run-time by an infrastructure that automatically configures the computing environment on behalf of users.

We present an architectural framework that grounds our approach, and a utility-theoretic model that enables finding the best match between user needs and the capabilities and resources in the environment.

Multi-Splay Trees

Friday, March 25th, 2005 from 12-1 pm in NSH 1507.

Presented by Chengwen Chris Wang, CSD

The Dynamic Optimality Conjecture states that splay trees are competitive (with a constant competitive factor) among the class of all binary search tree (BST) algorithms. Despite 20 years of research, this conjecture is still unresolved. Recently Demaine et al. suggested searching for alternative algorithms which have small, but non-constant, competitive factors. They proposed tango, a BST algorithm which is nearly dynamically optimal -- its competitive ratio is O(log log n) instead of a constant. Unfortunately, for many access patterns, tango is worse than other BST algorithms by a factor of log log n.

In this paper we introduce multi-splay trees, which can be viewed as a variant of splay trees. We prove the multi-splay access lemma, which resembles the access lemma for splay trees. With different assignment of weights, this lemma allows us to prove various bounds on the performance of multi-splay trees. Specifically, we prove that multi-splay trees are O(log log n)-competitive, and amortized O(log n). This is the first BST data structure to simultaneously achieve these two bounds. In addition, the algorithm is simple enough that we include code for its key parts.
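For background, the sketch below implements classic splaying (rotating each accessed key to the root so recently used keys stay near the top), which multi-splay trees build on; it is not the multi-splay algorithm itself:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(root, key):
    """Standard recursive splay: zig, zig-zig, and zig-zag rotations."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                    # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                  # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right:
                root.left = rotate_left(root.left)
        return rotate_right(root) if root.left else root
    else:
        if root.right is None:
            return root
        if key > root.right.key:                   # zig-zig
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:                 # zig-zag
            root.right.left = splay(root.right.left, key)
            if root.right.left:
                root.right = rotate_right(root.right)
        return rotate_left(root) if root.right else root

def insert(root, key):
    """Naive BST insert followed by a splay of the inserted key."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return splay(root, key)

root = None
for k in [3, 1, 4, 1, 5, 9, 2, 6]:
    root = insert(root, k)
root = splay(root, 4)
print(root.key)  # the accessed key is now at the root
```

Multi-splay trees orchestrate splay-like restructuring over a decomposition of the tree to obtain the O(log log n) competitive bound.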

This talk is in partial fulfillment of the speaking requirement.

Clotho: decoupling memory page layout from storage organization

Friday, April 8th, 2005 from 12-1 pm in NSH 1507.

Presented by Minglong Shao, CSD

As database application performance depends on the utilization of the memory hierarchy, smart data placement plays a central role in increasing locality and in improving memory utilization. Existing techniques, however, do not optimize accesses to all levels of the memory hierarchy and for all the different workloads, because each storage level uses different technology (cache, memory, disks) and each application accesses data using different patterns. Clotho is a new buffer pool and storage management architecture that decouples in-memory page layout from data organization on non-volatile storage devices to enable independent data layout design at each level of the storage hierarchy. Clotho can maximize cache and memory utilization by (a) transparently using appropriate data layouts in memory and non-volatile storage, and (b) dynamically synthesizing data pages to follow application access patterns at each level as needed. Clotho creates in-memory pages individually tailored for compound and dynamically changing workloads, and enables efficient use of different storage technologies (e.g., disk arrays or MEMS-based storage devices). The talk describes the Clotho design and prototype implementation and evaluates its performance under a variety of workloads using both disk arrays and simulated MEMS-based storage devices.
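The decoupling idea can be caricatured in a few lines: rows stored one way on non-volatile storage are dynamically reassembled into an in-memory page shaped for the query's access pattern. This is a toy sketch with an invented schema, not Clotho's actual page format:

```python
# Records as stored on disk (row-major), schema: (id, name, balance).
disk_page = [
    (1, "alice", 100.0),
    (2, "bob",    55.5),
    (3, "carol", 220.0),
]

COLUMNS = {"id": 0, "name": 1, "balance": 2}

def synthesize_page(rows, wanted):
    """Build an in-memory page holding only the requested attributes,
    column-major, independently of the on-disk row layout."""
    return {attr: [row[COLUMNS[attr]] for row in rows] for attr in wanted}

# A scan that only touches 'balance' gets a dense column page,
# so cache lines carry no unused attribute bytes.
page = synthesize_page(disk_page, ["balance"])
print(page)
```

The point of the decoupling is that this choice can differ per level of the memory hierarchy and per workload, without rewriting the data on disk.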

This talk is in partial fulfillment of the speaking requirement.

User Interface Dependability through Goal-Error Prevention

Friday, April 22nd, 2005 from 12-1 pm in NSH 1507.

Presented by Rob Reeder

User interfaces form a critical juncture between humans and computers. When the interface fails, the user fails, and the mission is lost. For example, in computer security applications, human-made configuration errors can expose entire systems to various forms of attack.

To avoid interaction failures, a dependable user interface must facilitate task completion as quickly and as accurately as possible. Defects in the interface cause user errors (e.g., goal, plan, action, and perception errors) that impinge on accuracy goals and can lead to mission failure.

This paper seeks the causes of goal errors, asking what aspects of a user interface contribute to or detract from a user's propensity to commit goal errors. A design procedure (external subgoal support) was formulated for helping prevent goal errors. Implementing this procedure requires identifying the information needed to set and verify completion of subgoals, and presenting the user with a salient representation of this information. Two interfaces for setting user file permissions were tested: Windows XP and an alternative. The alternative was designed using external subgoal support, and Windows was not. Experiments with 24 human subjects demonstrated the increased effectiveness of the alternative interface, obtaining as much as a four-fold increase in accuracy in a representative user task, and a 94% reduction in the number of goal errors committed.

This talk is in partial fulfillment of the speaking requirement.

Data-Driven Question Answering

Friday, April 29th, 2005 from 12-1 pm in NSH 1507.

Presented by Lucian Vlad Lita

Most existing question answering systems combine rule-based, knowledge-based and statistical components, and are highly optimized for a particular style of questions in a given language. Typical question answering approaches depend on specific ontologies, resources, processing tools, document sources, and very often rely on expert knowledge and rule-based components. Furthermore, such systems are very difficult to re-train and optimize for different domains and languages, requiring considerable time and human effort.

We present a data-driven, instance-based approach to question answering (IBQA) in which we learn how to answer new questions from similar training questions with known correct answers. We cluster the training questions according to different granularity, scatter, and similarity metrics. From each cluster we automatically learn an "answering strategy" for finding answers to questions in that cluster. When answering new questions, multiple answering strategies are simultaneously employed.
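A deliberately tiny sketch of the cluster-then-learn idea, with hypothetical questions and answer types. Real IBQA uses multiple granularities and similarity metrics, and each cluster's strategy covers the full answering pipeline rather than just an answer type:

```python
from collections import Counter, defaultdict

# Hypothetical training questions with known answer types.
training = [
    ("when was the company founded", "DATE"),
    ("when did the war end", "DATE"),
    ("who wrote the novel", "PERSON"),
    ("who invented the telephone", "PERSON"),
    ("how many planets are there", "NUMBER"),
]

def cluster_key(question, granularity=1):
    """Coarse clustering: the first `granularity` tokens of the question."""
    return tuple(question.split()[:granularity])

def learn_strategies(training, granularity=1):
    """Group training questions into clusters and derive a per-cluster
    'strategy' -- here simply the cluster's dominant answer type."""
    clusters = defaultdict(Counter)
    for question, answer_type in training:
        clusters[cluster_key(question, granularity)][answer_type] += 1
    return {k: c.most_common(1)[0][0] for k, c in clusters.items()}

strategies = learn_strategies(training)
# A new question is answered with the strategy of its cluster.
print(strategies[cluster_key("who painted the ceiling")])
```

Because strategies are learned from data rather than hand-written, re-training for a new domain amounts to re-clustering a new question set.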

The IBQA approach is resource non-intensive, but can easily be extended to incorporate knowledge resources or rule-based components. Since it does not rely on hand-written rules, expert knowledge, and manually tuned parameters, it is less dependent on specific languages and domains, allowing for fast re-training with minimum human effort. Even with limited data, IBQA achieves good performance, improves with additional training instances, and is easily trainable and adaptable to new types of data.

This talk is in partial fulfillment of the speaking requirement.

Applying Type Refinements

Friday, May 6th, 2005 from 12-1 pm in NSH 3305.

Presented by Joshua Dunfield

Traditional static type systems guarantee type safety, in the form of type preservation and progress theorems, and broadly describe properties of functions (such as taking integers to lists of integers).

Datasort and index refinements allow programmers to specify stronger properties through the static type system: (1) Datasorts refine algebraic datatypes through a finite subsorting relation (for example, empty lists are included in even lists; even-length lists and odd-length lists are included in all lists). (2) Indices refine datatypes by elements of some decidable constraint domain, such as linear inequalities over integers (for example, trees indexed by an integer denoting the number of non-leaf nodes).

Typechecking in a system with datasort and index refinements proves at compile time that functions have the properties specified by the programmer: that the offset given to an array-access primitive is in range, that color and black height invariants of red-black trees are preserved, and so forth.

This talk focuses on how to identify properties expressible through refinements, and how to define and use appropriate refinements.

This work unifies and extends previous work on datasort refinements, index refinements, intersection types, and union types.

This is joint work with Frank Pfenning.

This talk is in partial fulfillment of the speaking requirement.

Boosting Seller's Revenue in Combinatorial Auctions

Friday, May 13th, 2005 from 12-1 pm in NSH 1507.

Presented by Anton Likhodedov

In the past few years auction theory has attracted considerable attention in the Computer Science community. One of the well-recognized open problems in this area is the design of revenue-maximizing combinatorial auctions -- that is, auctions with multiple heterogeneous items on sale, in which participants are allowed to bid on combinations of items. This problem is unsolved even for relatively simple cases, such as two items on sale and two bidders.

In this talk I will review the general setup of the optimal combinatorial auction design problem. I will also introduce one of the promising approaches to the problem, which focuses on designing approximations (suboptimal auction mechanisms which yield high revenue) rather than attempting to characterize the optimal mechanism itself. Finally I will present the approximation results obtained in worst-case and average-case frameworks.

This talk is in partial fulfillment of the speaking requirement.

A Type Theory for Distributed Programming

Friday, May 20th, 2005 from 12-1 pm in NSH 1305.

Presented by Jonathan W. Moody

What makes a programming language "distributed"? Is there a simple recipe like: add communication channels, remote function calls, object migration, etc. and stir vigorously? Smells wonderful, but look closer, and there is a fly in our soup! The localized resources -- I/O devices and mutable store, for example. Should we close our eyes and add more seasoning to cover the flavor, or try to strain them out? As always, there are tradeoffs to be considered: between abstraction, runtime cost (management of proxy values), and safety (marshalling/serialization errors).

We present a minimal functional language with explicit mobility and locality types which safely "strain out" these localized resources. Mobile code and values are given the type []A (meaning "anywhere A") and remote resources the type <>A (meaning "somewhere A"). The calculus allows both spawning mobile code for evaluation anywhere, and non-local control transfers for access to remote resources. This by no means exhausts the rich variety of mechanisms for distributed programming introduced in prior work. Rather, it serves as a natural, simple demonstration of the type-soundness of the approach, and a laboratory in which to consider various language extensions.

The calculus is "low-level" in the sense that location abstraction is sacrificed (somewhat) to obtain safety. The programmer must be aware of whether a resource is here or elsewhere, whether a fragment of code is mobile or not. But in exchange, he/she gets marshalling safety and more predictable performance. A little bit of effort for a tasty result!

Joint work with Frank Pfenning.

This talk is in partial fulfillment of the speaking requirement.

Web contact: sss+www@cs