SSS Abstracts
Fall 2007

Multi-Dimensional Range Query over Encrypted Data

Friday, September 7th, 2007 from 12-1 pm in NSH 1507.

Presented by Elaine Shi, CSD

We design an encryption scheme called Multi-dimensional Range Query over Encrypted Data (MRQED) to address the privacy concerns related to the sharing of network audit logs and various other applications. Our scheme allows a network gateway to encrypt summaries of network flows before submitting them to an untrusted repository. When network intrusions are suspected, an authority can release a key to an auditor, allowing the auditor to decrypt flows whose attributes (e.g., source and destination addresses, port numbers, etc.) fall within specific ranges. However, the privacy of all irrelevant flows is still preserved. We formally define security for MRQED and prove the security of our construction under the decision bilinear Diffie-Hellman and decision linear assumptions in certain bilinear groups. We study the practical performance of our construction in the context of network audit logs. Apart from network audit logs, our scheme also has interesting applications for financial audit logs, medical privacy, untrusted remote storage, etc. In particular, we show that MRQED implies a solution to its dual problem, which enables investors to trade stocks through a broker in a privacy-preserving manner.
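
As a toy mock-up of the MRQED workflow in Python (our own illustration, deliberately insecure: it stores attributes in the clear, which is precisely what the real bilinear-group construction avoids; all names are hypothetical):

    from dataclasses import dataclass

    @dataclass
    class EncryptedFlow:
        attrs: dict    # stands in for attributes bound into the ciphertext
        payload: str   # stands in for the encrypted flow summary

    def encrypt(flow_summary, attrs):
        """Gateway encrypts a flow summary under its attribute values."""
        return EncryptedFlow(attrs=dict(attrs), payload=flow_summary)

    def derive_key(ranges):
        """Authority releases a key scoped to ranges over the attributes."""
        return dict(ranges)

    def decrypt(key, enc):
        """Auditor recovers the flow only if every attribute is in range;
        all other flows remain opaque."""
        ok = all(lo <= enc.attrs[a] <= hi for a, (lo, hi) in key.items())
        return enc.payload if ok else None

    enc = encrypt("flow: 10.0.0.1 -> 10.0.0.2",
                  {"src_port": 443, "dst_port": 80})
    key = derive_key({"src_port": (0, 1023), "dst_port": (0, 1023)})
    print(decrypt(key, enc))   # both ports fall in range, so the flow decrypts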


Face Poser: Interactive Modeling of 3D Facial Expressions Using Model Priors

Friday, September 28th, 2007 from 12-1 pm in NSH 1507.

Presented by Manfred Lau, CSD

I will present our work on the Face Poser system. Face Poser provides an intuitive interface for posing 3D facial expressions. The user can create and edit facial expressions by drawing freeform strokes or controlling points on the 2D screen. We formulate the problem in a maximum a posteriori (MAP) framework by combining a model prior with the user-defined constraints. Our solution therefore produces realistic facial expressions that satisfy the user-defined constraints. Our system is interactive, simple, and easy to use. A first-time user can learn to use the system and start creating a variety of natural facial expressions within minutes.
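
As a rough sketch of the MAP formulation (our notation, not necessarily the paper's), with x the facial-expression parameters and c the user's stroke and point constraints:

    \[
      x^{*} \;=\; \arg\max_{x} \; p(x \mid c)
            \;=\; \arg\max_{x} \; p(c \mid x)\, p(x)
    \]

Equivalently, the system minimizes E(x) = -log p(c | x) - log p(x), trading off satisfaction of the on-screen constraints against the realism encoded in the model prior.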

In Partial Fulfillment of the CS Speaking Requirement


What Can YouTube Learn From Mystery Science Theatre 3000?

Friday, October 5th, 2007 from 12-1 pm in NSH 1507.

Presented by Justin Weisz, CSD

Growing up, I never really liked Mystery Science Theatre 3000. The premise was silly, the movies were awful, and the commentary was inane at best and obtuse at worst. However, the concept of the show fascinated me. Could talking robots really make bad movies entertaining? How did that work, exactly?

With the recent explosion in popularity of online video sites, and their continuing struggle to differentiate themselves in the market, the technology enabling people to engage in "MST3k" experiences is becoming widespread. Sites such as YouTube's Streams, Lycos Cinema, and Justin.tv already enable their audience to chat with one another while watching videos. However, is it a good idea to let people chat while watching? Do they get something out of it, or is it just an annoying distraction? What do they talk about? Can complete strangers have a conversation simply because they are watching the same video?

This talk is all about talking during a movie. I will present results from two lab studies of people watching videos online, while chatting with one another. It turns out the experience is a lot of fun, strangers are able to "break the ice" with one another, and the distraction from chatting while watching does not adversely affect the experience. Further, the "MST3k effect" is real: chat makes bad videos more enjoyable.


Making Local Service Discovery Confidential with Tryst

Friday, October 26th, 2007 from 12-1 pm in NSH 1507.

Presented by Jeff Pang, CSD

Mobile wireless devices, such as Zunes, PSPs, and iPhones, operate in environments where anyone within communication range can overhear conversations. Existing security mechanisms, however, do not conceal identities, locations, and relationships when a client attempts to discover devices and services in close proximity, a process we call local service discovery. Local service discovery is an essential bootstrapping component for Wi-Fi and Bluetooth devices, for Windows, Mac OS X, and Linux networking, and for popular applications such as iTunes and iChat. In this talk, we argue that these devices, operating systems, and applications release sensitive information through their use of service discovery. We present empirical evidence that suggests the release of this information in wireless environments poses a real danger to our privacy. For example, Wi-Fi service discovery can reveal places where you have been, including the location of your home. We then discuss the design and implementation of Tryst, an architecture that enhances the confidentiality of existing service discovery mechanisms, thereby preventing the unintended release of this information to third parties.
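
To make the flavor of confidential discovery concrete, here is a minimal Python sketch (our own illustration, not Tryst's actual wire protocol): a client probes for a service by sending a MAC computed under a key it shares with that service, so bystanders observe only unlinkable, random-looking bytes instead of cleartext names:

    import hmac, hashlib, os

    def make_probe(shared_key):
        """Client announces interest in one service without naming it."""
        nonce = os.urandom(16)
        tag = hmac.new(shared_key, b"discover" + nonce,
                       hashlib.sha256).digest()
        return nonce, tag    # looks like random bytes to everyone else

    def recognizes(shared_key, nonce, tag):
        """Only a service holding the same key can recognize the probe."""
        expected = hmac.new(shared_key, b"discover" + nonce,
                            hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)

    key = os.urandom(32)    # established out of band, e.g. when pairing
    nonce, tag = make_probe(key)
    print(recognizes(key, nonce, tag))             # True: paired service
    print(recognizes(os.urandom(32), nonce, tag))  # False: eavesdropper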

Additional Info: Joint work with Ben Greenstein, Srinivasan Seshan, and David Wetherall. In Partial Fulfillment of the Speaking Requirement.


Combining Structural Subtyping and External Dispatch

Friday, November 2nd, 2007 from 12-1 pm in NSH 1507.

Presented by Donna Malayeri, CSD

By-name subtyping (or user-defined subtyping) and structural subtyping each have their strengths and weaknesses. By-name subtyping allows the programmer to explicitly express design intent and enables external dispatch, whereby methods can be added to a class outside its definition. On the other hand, structural subtyping is flexible and compositional, allowing the programmer to take advantage of subtyping relationships not declared in advance. Current object-oriented languages support only one subtyping paradigm or the other.
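
A rough Python analogy may help (Python stands in for the talk's calculus, which is its own language): typing.Protocol gives structural subtyping, while functools.singledispatch adds an operation to a class outside its definition, i.e., external dispatch:

    from typing import Protocol
    from functools import singledispatch

    class HasArea(Protocol):        # structural: no subtyping declaration
        def area(self) -> float: ...

    class Square:
        def __init__(self, s: float):
            self.s = s
        def area(self) -> float:    # matches HasArea purely by shape
            return self.s * self.s

    def describe(shape: HasArea) -> str:
        return f"area = {shape.area()}"

    @singledispatch                 # external dispatch: an operation added
    def perimeter(shape) -> float:  # outside the class definitions
        raise NotImplementedError

    @perimeter.register
    def _(shape: Square) -> float:
        return 4 * shape.s

    print(describe(Square(2.0)))   # Square never declares it implements HasArea
    print(perimeter(Square(2.0)))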

We describe a core calculus for a language that combines the key aspects of by-name and structural subtyping in a unified framework. Our goal is to provide the flexibility of structural subtyping while still allowing static typechecking of external methods. We illustrate the practical utility of the language through several examples not easily expressed in other languages. Further, since by-name subtyping is the standard in mainstream object-oriented languages, we demonstrate the practical need for structural subtyping through a case study of the Java collections library.

Joint work with Jonathan Aldrich. In Partial Fulfillment of the Speaking Requirement.


Identifying Cycling Genes by Combining Sequence Homology and Expression Data

Friday, November 9th, 2007 from 12-1 pm in WeH 4623.

Presented by Yong Lu, CSD

Motivation: The expression of genes during the cell division process has now been studied in many different species. An important goal of these studies is to identify the set of cycling genes. To date, this has been done independently for each of the species studied. Due to noise and other data analysis problems, accurately deriving a set of cycling genes from expression data is a hard problem. This is especially true for some of the multicellular organisms, including humans.

Results: Here we present the first algorithm that combines microarray expression data from multiple species for identifying cycling genes. Our algorithm represents genes from multiple species as nodes in a graph. Edges between genes represent sequence similarity. Starting with the measured expression values for each species, we use Belief Propagation to determine a posterior score for each gene. This posterior is used to determine a new set of cycling genes for each species.
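
As a toy sketch of the propagation idea in Python (hypothetical gene names and a simplified blending update, not the paper's exact message-passing equations):

    prior = {"yeast/CLB2": 0.90, "human/CCNB1": 0.55, "human/GAPDH": 0.10}
    similar = {                       # graph edges: sequence similarity
        "yeast/CLB2": ["human/CCNB1"],
        "human/CCNB1": ["yeast/CLB2"],
        "human/GAPDH": [],
    }

    def propagate(prior, similar, rounds=10, alpha=0.5):
        """Blend each gene's own expression evidence with the current
        beliefs of its cross-species homologs."""
        belief = dict(prior)
        for _ in range(rounds):
            updated = {}
            for g, p in prior.items():
                nbrs = similar[g]
                if nbrs:
                    nbr_mean = sum(belief[n] for n in nbrs) / len(nbrs)
                    updated[g] = (1 - alpha) * p + alpha * nbr_mean
                else:
                    updated[g] = p   # no homolog: expression evidence alone
            belief = updated
        return belief

    print(propagate(prior, similar))  # CCNB1's posterior rises toward its
                                      # homolog's; GAPDH keeps its own score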

We applied our algorithm to improve the identification of the set of cell cycle genes in budding yeast and humans. As we show, by incorporating sequence similarity information we were able to obtain a more accurate set of genes compared to methods that rely on expression data alone. Our method was especially successful for the human dataset, indicating that it can use a high-quality dataset from one species to overcome noise problems in another.

Joint work with Roni Rosenfeld and Ziv Bar-Joseph. In Partial Fulfillment of the Speaking Requirement.


Characterization of Robust Linear Coding Solutions

Friday, November 16th, 2007 from 12-1 pm in NSH 1507.

Presented by Doru Balcan, CSD

Many linear encoding approaches focus on representing information by approximating the underlying statistical density of the data, like PCA or ICA, or by developing encoding/decoding algorithms with desirable computational and representational properties, such as Fourier and wavelet-based codes. However, if the coefficients' precision is limited, optimality of the representation cannot be guaranteed. The issue of optimality under limited precision is a common practical concern, and it is also relevant to biological neural representations, where the coding precision of individual neurons has been reported to be as low as a few bits per spike. In this talk, I will present a new coding scheme called Robust (Linear) Coding, which makes use of arbitrarily many coding units to minimize reconstruction error. One characteristic of Robust Coding is that it can introduce redundancy in the code to compensate for channel noise, unlike PCA or ICA, which aim to reduce redundancy. We completely analyze the under- and overcomplete cases, and identify necessary and sufficient conditions for the optimal encoder/decoder pair in each case. In the process, we find an exact formula for the lower bound of the error function, as well as an efficient procedure for computing the optima.
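
As a small numerical sketch of the setting (our own toy setup, not the talk's derivation): for a fixed linear encoder W with additive channel noise, the MSE-optimal linear decoder has a closed form, and an overcomplete encoder can use its redundancy to average out the noise:

    import numpy as np

    rng = np.random.default_rng(0)

    def mse(W, C, s2):
        """Reconstruction MSE of y = W x + n using the optimal linear
        decoder D = C W^T (W C W^T + s2 I)^{-1} (linear MMSE estimator)."""
        m = W.shape[0]
        D = C @ W.T @ np.linalg.inv(W @ C @ W.T + s2 * np.eye(m))
        E = np.eye(C.shape[0]) - D @ W
        return np.trace(E @ C @ E.T) + s2 * np.trace(D @ D.T)

    C = np.eye(4)                                # unit-variance 4-d signal
    s2 = 0.5                                     # channel noise per unit
    W_complete = rng.standard_normal((4, 4))     # one unit per dimension
    W_redundant = rng.standard_normal((16, 4))   # 4x more units than dims

    print(mse(W_complete, C, s2))    # the redundant code typically
    print(mse(W_redundant, C, s2))   # reconstructs with lower error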

Joint work with Eizaburo Doi and Mike Lewicki.

In Partial Fulfillment of the Speaking Requirement.


R-NUCA: Data Placement in a Distributed Shared Cache

Friday, November 30th, 2007 from 12-1 pm in NSH 1507.

Presented by Nikos Hardavellas, CSD

Increases in on-chip communication delays and the large working sets of commercial and scientific workloads complicate the design of the on-chip cache for multi-core processors. The large working sets favor a shared L2 cache design that maximizes the aggregate cache capacity and minimizes off-chip memory requests. At the same time, the growing on-chip communication delays favor core-private caches that replicate data to minimize delays on global wires. Recent hybrid proposals promise to offer lower average latencies than conventional designs through intelligent migration and replication of data on chip. However, these proposals either address the placement requirements of only a subset of the data accessed by the application, require complicated lookup and coherence mechanisms that increase latency, or fail to scale to high core counts.

In this work, we observe that the cache access patterns of a range of server and scientific workloads can be classified into distinct categories, where each class is amenable to different data placement, migration, and replication policies. Based on this observation, we propose R-NUCA, a distributed shared L2 cache design that cooperates with the operating system to achieve an efficient, large-capacity, low-average-latency L2 cache. Our design supports intelligent placement, migration, and replication without the overhead of an explicit coherence mechanism for the distributed L2 slices. We evaluate our design on a range of server and scientific workloads and find that it lowers the effective L2 latency by 13% on average compared to prior proposals, and by up to 23% in some cases.
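
As a schematic illustration of classification-driven placement in Python (hypothetical class and policy names; the actual design cooperates with the OS and operates on real hardware structures):

    def classify(page):
        """Each page falls into one access class with its own policy."""
        if page["is_instruction"]:
            return "instructions"
        return "private" if len(page["sharers"]) <= 1 else "shared"

    def place(page, requester, n_slices):
        kind = classify(page)
        if kind == "private":
            return [requester]                 # local slice: minimal latency
        if kind == "shared":
            return [page["addr"] % n_slices]   # one fixed home: no replicas,
                                               # so no coherence among slices
        cluster = requester - (requester % 4)  # replicate read-only
        return [cluster + i for i in range(4)] # instructions nearby

    page = {"addr": 0x2F40, "is_instruction": False, "sharers": {3, 7}}
    print(place(page, requester=3, n_slices=16))   # shared page -> fixed home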

In partial fulfillment of the Speaking Requirement


Compiling Self-Adjusting Programs

Friday, December 7th, 2007 from 12-1 pm in NSH 1507.

Presented by Ruy Ley-Wild, CSD

Self-adjusting programs respond automatically and efficiently to input changes by tracking the dynamic data dependencies of the computation and incrementally updating the output as needed. After a run from scratch, the input can be changed and the output can be updated via change propagation, a mechanism for re-executing the portions of the computation affected by the new values while reusing unaffected parts. Previous research shows that self-adjusting programs can be effective at achieving near-optimal update bounds for various applications.

An ordinary program can be made self-adjusting by making data dependencies explicit to identify when code must be re-executed, and by using memoization to identify when previous work may be reused. While previous proposals provided library support for self-adjusting computation, converting an ordinary program required substantial code restructuring and was error-prone due to various proper-usage restrictions.

We propose a language-based technique for annotating ordinary programs and compiling them into equivalent self-adjusting versions. The annotations serve to infer a conservative approximation of data dependencies automatically, and the translation generates correct and efficient self-adjusting programs. This approach offers a natural programming style that requires minimal changes to existing code and achieves the same asymptotic improvements as the library-based approach.
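
As a minimal Python sketch of the underlying idea (our own toy, far simpler than the compiled approach in the talk): inputs record which computations read them, and change propagation re-executes exactly those readers:

    class Input:
        def __init__(self, value):
            self.value, self.readers = value, set()
        def get(self, reader):
            self.readers.add(reader)    # record the dynamic dependency
            return self.value
        def set(self, value):           # change propagation: rerun only
            self.value = value          # the computations that read this
            for r in list(self.readers):
                r.run()

    class Computation:
        def __init__(self, fn):
            self.fn = fn
            self.run()
        def run(self):
            self.result = self.fn(self)

    xs = [Input(v) for v in [3, 1, 4, 1, 5]]
    total = Computation(lambda c: sum(x.get(c) for x in xs))
    print(total.result)   # 14, computed from scratch
    xs[2].set(10)         # only `total` re-executes
    print(total.result)   # 20, via change propagation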

Joint work with Umut Acar and Matthew Fluet.

This talk is in partial fulfillment of the speaking requirement.


Web contact: sss+www@cs