go back to the main page

SSS Abstracts
Fall 2011

go back to the main page

SCION: Scalable, Control, and Isolation on Next Generation Networks

Tuesday, September 13^th, 2011 from 12-1 pm in GHC 8102.

Presented by Xin Zhang, CSD

We present the first Internet architecture designed to provide route control, failure isolation, and explicit trust information for end-to-end communications. SCION separates ASes into groups of independent routing sub-planes, called trust domains, which then interconnect to form complete routes. Trust domains provide natural isolation of routing failures and human misconfiguration, give endpoints strong control for both inbound and outbound traffic, provide meaningful and enforceable trust, and enable scalable routing updates with high path freshness. As a result, our architecture provides strong resilience and security properties as an intrinsic consequence of good design principles, avoiding piecemeal add-on protocols as security patches. Meanwhile, SCION only assumes that a few top-tier ISPs in the trust domain are trusted for providing reliable end-to-end communications, thus achieving a small Trusted Computing Base. Both our security analysis and evaluation results show that SCION naturally prevents numerous attacks and provides a high level of resilience, scalability, control, and isolation.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Simulating Skin Deformation with Musculoskeletal Structure

Tuesday, October 4^th, 2011 from 12-1 pm in GHC 8102.

Presented by Q Youn Hong,

Skin deformation is one of the critical factors to render a natural and dynamic character animation. In graphics, researchers have proposed several skin deformation techniques such as geometry-based, data-driven and physics-driven methods. In this talk, we present a method to simulate skin deformation from motion capture data. Our skin deformation model is built upon several anatomical layers including a skeleton, muscles and soft tissue. We first compute a skeletal motion given a motion sequence captured in the conventional motion capture. A muscle attached to the skeleton changes its volumetric shape with respect to the change of the length and tension of the muscle which are computed from the current skeletal configuration. Coefficients that determine the shape of each muscle are precomputed from skin samples captured with about 400 densely placed markers. We also simulate passive inertial effects with a mass-spring-damper model that exerts spring forces between a skin surface and nearby muscles and skeletal elements. We evaluate the skin deformation result both qualitatively and quantitatively. Experimental results show that our model can express well-known skin effects such as muscle co-contraction of biceps and skin jiggling after the ground contact.

This is joint work with Akihiko Murai, Katsu Yamane and Jessica Hodgins. Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Usable and Secure Password Management

Tuesday, October 25^th, 2011 from 12-1 pm in GHC 8102.

Presented by Jeremiah Blocki, CSD

Although millions of users use passwords everyday to protect important assets (e.g., online banking, trading, commerce, email, social networks, and enterprise resources) we do not know how to create secure and usable passwords. A typical computer user today has many password protected online accounts: Amazon, eBay, PNC bank, Gmail, etc.. Informally, a password management scheme is any method for creating and retrieving each password. A typical user has to select and remember a password for over one-hundred different accounts. Many sites have vastly different password requirements: minimum length, maximum length, special characters, capitalization, etc. Intimidated by the prospect of remembering so many different passwords many users adopt an insecure password management scheme: writing down passwords, reusing passwords and picking weak (low entropy) passwords. A large scale study of password habits revealed that in 2007 a typical user had no more than 7 unique passwords and reused each password around 4 times on average. While there are many articles (and even several books) on how to generate good passwords, there is still a clear need to develop password management schemes which are usable and secure.

We are interested in password management schemes which can be implemented on “human hardware”. A good password management scheme should be usable and secure. Informally, a password management scheme is usable if a human can create and recall passwords without too much effort. We present a mathematical framework for analyzing the security of a password management scheme. In this framework a secure password management scheme must provide concrete security guarantees even against an adversary who has already learned one or more of the users passwords. Using this framework we introduce a secure password management scheme and present evidence that this password management scheme is usable.

This is joint work with Manuel Blum and Anupam Datta.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

On the Duality of Data-intensive File System Design: Reconciling HDFS and PVFS

Tuesday, November 22^nd, 2011 from 12-1 pm in GHC 8102.

Presented by Wittawat Tantisiriroj, CSD

Data-intensive applications fall into two computing styles: Internet services (cloud computing) or high-performance computing (HPC). In both categories, the underlying file system is a key component for scalable application performance. In this paper, we explore the similarities and differences between PVFS, a parallel file system used in HPC at large scale, and HDFS, the primary storage system used in cloud computing with Hadoop. We integrate PVFS into Hadoop and compare its performance to HDFS using a set of data-intensive computing benchmarks. We study how HDFS-specific optimizations can be matched using PVFS and how consistency, durability, and persistence tradeoffs made by these file systems affect application performance. We show how to embed multiple replicas into a PVFS file, including a mapping with a complete copy local to the writing client, to emulate HDFS's file layout policies. We also highlight implementation issues with HDFS's dependence on disk bandwidth and benefits from pipelined replication.

This is joint work with Swapnil Patil, Garth Gibson, Seung Son, Samuel Lang, and Robert Ross.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Exact Pattern Matching with Feed-Forward Bloom Filters

Tuesday, November 29^th, 2011 from 12-1 pm in GHC 8102.

Presented by Iulian Moraru, CSD

In this talk we present a new fast and memory efficient algorithm for simultaneously searching for a very large number of patterns in a very large corpus. This algorithm builds upon the Rabin-Karp string search algorithm and incorporates a new type of Bloom filter that we call a “feed-forward Bloom filter”. While it retains the asymptotic time complexity of previous multiple pattern matching algorithms, we show that this technique, along with a CPU architecture aware design of the Bloom filter, can provide speedups between 2x and 30x, and memory consumption reductions as large as 50x when compared with grep. We also describe other practical benefits of using this algorithm, particularly in virus scanning.

Joint work with Prof. David G. Andersen.

Small Cache, Big Effect: Provable Load Balancing for Randomly Partitioned Cluster Services

Friday, December 2^nd, 2011 from 12-1 pm in GHC 4303.

Presented by Bin Fan, CSD

Load balancing requests across a cluster of back-end servers is critical for avoiding performance bottlenecks and meeting service level objectives (SLOs) in large-scale cloud computing services. This paper shows how a small, fast popularity-based front-end cache can ensure load balancing for an important class of such services; furthermore, we prove an O(n log n) lower-bound on the necessary cache size and show that this size depends only on the total number of back-end nodes n, not the number of items stored in the system. We validate our analysis through simulation and empirical results running a key-value storage system on an 85-node cluster.

This is a joint work with Hyeontaek Lim, David Andersen and Michael Kaminsky

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Optimal Column-Based Low-Rank Matrix Reconstruction

Tuesday, December 6^th, 2011 from 12-1 pm in GHC 8102.

Presented by Ali Kemal Sinop, CSD

We prove that for any real-valued m-by-n matrix X, and positive integers r, k with r≤k, there is a subset of r columns of X such that projecting X onto their span gives (r+1)/(r-k+1) approximation to best rank-k approximation of X in Frobenius norm.

We show that the trade-off we achieve between the number of columns and the approximation ratio is optimal up to lower order terms. Furthermore, there is a deterministic algorithm to find such a subset of columns that runs in O(r n m^w log m) arithmetic operations where w is the exponent of matrix multiplication. We also give a faster randomized algorithm that runs in O(r n m²) arithmetic operations.

Joint work with Venkatesan Guruswami.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Integrating Representation Learning and Skill Learning in a Human-Like Intelligent Agent

Friday, December 9^th, 2011 from 1-2 pm in GHC 2109.

Presented by Nan Li, CSD

Building an intelligent agent that simulates human learning for math and science could potentially benefit both education, by contributing to the understanding of human learning, and artificial intelligence, by advancing the goal of creating human-level intelligence. However, constructing such a learning agent currently requires manual encoding of prior domain knowledge; in addition to being a poor model of human acquisition of prior knowledge, manual knowledge-encoding is both time-consuming and error-prone. Previous work showed that one of the key factors that differentiates experts and novices is their different representation knowledge. Experts view the world in terms of deep functional features, while novices view it in terms of shallow perceptual features. Moreover, since the performance of many existing learning algorithms is sensitive to representations, the deep features are also important in achieving effective learning. In this talk, I present an efficient algorithm that acquires representation knowledge in the form of ``deep features'' for specific domains. I integrate this algorithm into a machine-learning agent, SimStudent, which learns procedural knowledge by observing a tutor work examples, and by getting feedback while actively solving problems on its own. I show that the integration of knowledge representation acquisition reduces the requirements for knowledge engineering. Moreover, I propose an approach that automatically discovers student models using the extended SimStudent. By fitting the discovered model to real student learning curve data, I show that it is a better student than human-generated models, and demonstrate how the discovered model may be used to improve a tutoring system's instruction strategy.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Web contact: sss+www@cs