SSS Abstracts
Spring 2016


Parallel and Distributed Graph Decomposition with application to Spanners

Monday, February 1st, 2016 from 12-1 pm in GHC 6501.

Presented by Shen Chen Xu, CSD

Graph decompositions are widely used algorithmic routines: they partition a graph into pieces so that divide-and-conquer algorithms can be applied. In this talk we present a parallel and distributed algorithm for one particular graph decomposition, and apply it to the problem of constructing graph spanners.
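
As a concrete illustration, here is a sequential Python sketch of one well-known decomposition of this flavor, low-diameter decomposition via exponential random shifts; this is my own illustrative code under that assumption, not the algorithm from the talk. Each vertex draws a shift from an exponential distribution with rate beta, and every vertex joins the cluster of the center that reaches it first under the shifted start times; beta trades cluster diameter against the number of edges cut.

    import heapq
    import random
    from collections import defaultdict

    def exponential_shift_decomposition(adj, beta=0.5, seed=0):
        """Partition an unweighted graph into low-diameter clusters.

        Each vertex u draws a shift delta_u ~ Exp(beta); vertex v joins
        the cluster of the center u minimizing dist(u, v) - delta_u.
        With high probability clusters have diameter O(log(n)/beta) and
        only an O(beta) fraction of edges cross clusters.
        """
        rng = random.Random(seed)
        shift = {u: rng.expovariate(beta) for u in adj}
        # One Dijkstra pass from all vertices at once, each starting at
        # "time" -shift[u]; the first center to reach a vertex claims it.
        dist, center = {}, {}
        pq = [(-shift[u], u, u) for u in adj]
        heapq.heapify(pq)
        while pq:
            d, u, c = heapq.heappop(pq)
            if u in dist:
                continue
            dist[u], center[u] = d, c
            for v in adj[u]:
                if v not in dist:
                    heapq.heappush(pq, (d + 1, v, c))
        clusters = defaultdict(set)
        for u, c in center.items():
            clusters[c].add(u)
        return clusters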

In partial fulfillment of the CSD Speaking Skills Requirement.


How to learn a quantum state

Friday, March 25th, 2016 from 12-1 pm in GHC 6501.

Presented by John Wright, CSD

In the problem of quantum state learning, one is given a small number of "samples" of a quantum state, and the goal is to output a good estimate of the quantum state. This is a problem which is not only of theoretical interest, but is also commonly used in current-day implementation and verification of quantum technologies. In this talk, I will describe the first optimal quantum algorithm for this problem. In addition, I will describe optimal algorithms for the related problems of testing and learning specific properties of quantum states. My results make use of a new connection between quantum state learning and longest increasing subsequences of random words, a famous topic in combinatorics dating back to a 1935 paper of Erdős and Szekeres. Motivated by this connection, I will show new and optimal bounds on the length of the longest increasing subsequence of a random word.
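
The combinatorial side of this is easy to experiment with. Below is a standard patience-sorting sketch (my own illustration, not part of the talk's quantum algorithm) that computes the length of the longest weakly increasing subsequence of a random word; for n uniform letters over a k-letter alphabet, this length concentrates around n/k with fluctuations of order sqrt(n).

    import bisect
    import random

    def lis_length(word):
        """Length of the longest weakly increasing subsequence,
        via patience sorting in O(n log n) time."""
        piles = []  # piles[i] = smallest tail of a subsequence of length i+1
        for x in word:
            # bisect_right permits repeated letters (weakly increasing),
            # the natural notion for words over a small alphabet.
            i = bisect.bisect_right(piles, x)
            if i == len(piles):
                piles.append(x)
            else:
                piles[i] = x
        return len(piles)

    rng = random.Random(1)
    n, k = 100_000, 10
    print(lis_length([rng.randrange(k) for _ in range(n)]))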

Joint work with Ryan O'Donnell.

In partial fulfillment of the CSD Speaking Skills Requirement.


CFA: A Practical Prediction System for Video QoE Optimization

Monday, April 18th, 2016 from 12-1 pm in GHC 6501.

Presented by Junchen Jiang, CSD

Many prior efforts have suggested that Internet video Quality of Experience (QoE) could be dramatically improved by using data-driven prediction of video quality for different choices (e.g., CDN or bitrate) to make optimal decisions. However, building such a prediction system is challenging on two fronts. First, the relationships between video quality and observed session features can be quite complex. Second, video quality changes dynamically.

Thus, we need a prediction model that is (a) expressive enough to capture these complex relationships and (b) capable of updating quality predictions in near real-time. Unfortunately, several seemingly natural solutions (e.g., simple machine learning approaches and simple network models) fail on one or more fronts. Thus, the potential benefits promised by these prior efforts remain unrealized.

We address these challenges and present the design and implementation of Critical Feature Analytics (CFA). The design of CFA is driven by the domain-specific insight that video quality is typically determined by a small subset of critical features whose criticality persists over several tens of minutes. This enables a scalable and accurate workflow in which we automatically learn critical features for different sessions on coarse-grained timescales, while updating quality predictions in near real-time. Using a combination of a real-world pilot deployment and trace-driven analysis, we demonstrate that CFA leads to significant improvements in video quality; e.g., 32% less buffering time and 12% higher bitrate than a random decision maker.
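
To make the two-timescale workflow concrete, here is a hypothetical Python sketch; the feature names, the spread-based criticality test, and the thresholds are stand-ins I invented, not CFA's actual learning rule:

    from itertools import combinations
    from statistics import median

    FEATURES = ["cdn", "isp", "bitrate", "device"]  # hypothetical session features

    def critical_features(history, session, max_size=2):
        """Coarse-timescale step: pick the feature subset whose matching
        sessions predict quality most tightly (smallest spread)."""
        best, best_spread = list(FEATURES), float("inf")
        for r in range(1, max_size + 1):
            for subset in combinations(FEATURES, r):
                q = [h["quality"] for h in history
                     if all(h[f] == session[f] for f in subset)]
                if len(q) >= 5:  # require enough matching sessions
                    spread = max(q) - min(q)
                    if spread < best_spread:
                        best, best_spread = list(subset), spread
        return best

    def predict_quality(recent, session, crit):
        """Near-real-time step: median quality of recent sessions that
        match the slowly changing critical features."""
        q = [h["quality"] for h in recent
             if all(h[f] == session[f] for f in crit)]
        return median(q) if q else None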

Joint work with Vyas Sekar, Hui Zhang, and Ion Stoica.

In partial fulfillment of the CSD Speaking Skills Requirement.


Monitoring and Blame Assignment for Higher-Order Session Types

Friday, April 22nd, 2016 from 12-1 pm in GHC 6501.

Presented by Hannah Gommerstadt, CSD

In a distributed setting, processes collaborate to complete tasks, with each component of the system playing some role in the computation. These processes may be written in different languages, and some may be compromised by an attacker. We use session types, which evolve with each step of the computation, to prescribe the communication behavior between concurrent message-passing processes. We then use these session types to dynamically monitor the communication between processes and detect undesirable behavior. In this talk, we discuss how to dynamically monitor communication to enforce adherence to session types in a higher-order setting. We also present a system of blame assignment for the case where the monitor detects an undesirable action and raises an alarm.
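
As a toy illustration of the monitoring idea, the following Python sketch checks a message trace against a first-order session type and blames the sender of the first offending message; the protocol encoding and event format are invented for this example, and the talk's higher-order setting is considerably richer:

    # Session type of the provider: receive two ints, send an int, end.
    PROTOCOL = [("recv", int), ("recv", int), ("send", int)]

    class Alarm(Exception):
        """Raised when a message deviates from the session type."""

    def monitor(trace):
        """Check (direction, payload, sender) events against the session
        type, blaming the sender of the first ill-typed message."""
        for (direction, payload, sender), (exp_dir, exp_ty) in zip(trace, PROTOCOL):
            if direction != exp_dir or not isinstance(payload, exp_ty):
                raise Alarm(f"blame {sender}: expected {exp_dir} {exp_ty.__name__}")

    # The client supplies a string where the type demands an int.
    try:
        monitor([("recv", 3, "client"), ("recv", "oops", "client")])
    except Alarm as a:
        print(a)  # blame client: expected recv int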

Joint work with Frank Pfenning and Limin Jia.

In partial fulfillment of the Speaking Requirement.


Bridging the Capacity Gap Between Interactive and One-Way Communication

Monday, April 25th, 2016 from 12-1 pm in GHC 6501.

Presented by Ameya Velingker, CSD

We study the communication rate of coding schemes for interactive communication that transform any two-party interactive protocol into a protocol that is robust to noise.

Recently, Haeupler '14 showed that if an eps > 0 fraction of transmissions are corrupted, adversarially or randomly, then it is possible to achieve a communication rate of 1 - Õ(sqrt(eps)). Furthermore, Haeupler conjectured that this rate is optimal for general input protocols. This stands in contrast to the classical setting of one-way communication, in which error-correcting codes are known to achieve an optimal communication rate of 1 - Θ(H(eps)).

In this work, we show that the quadratically smaller rate loss of the one-way setting can also be achieved in interactive coding schemes for a very natural class of input protocols. We introduce the notion of average message length, or the average number of bits a party sends before receiving a reply, as a natural parameter for measuring the level of interactivity in a protocol. Moreover, we show that any protocol with average message length l = Ω(poly(1/eps)) can be simulated by a protocol with optimal communication rate 1 - Θ(H(eps)) over an oblivious adversarial channel with error fraction eps.

This shows that the capacity gap between one-way and interactive communication can be bridged even for very small average message lengths (constant for any fixed eps), which are likely to be found in many applications.
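
To get a feel for the size of this gap, a quick numeric sketch (my own, ignoring constants and log factors) compares the entropy rate loss with the square-root rate loss:

    import math

    def binary_entropy(p):
        """H(p) = -p log2 p - (1 - p) log2 (1 - p), for 0 < p < 1."""
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Rate loss of one-way coding, Theta(H(eps)), versus the Õ(sqrt(eps))
    # loss of general interactive coding: the former is quadratically
    # smaller, which is the gap this work bridges.
    for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
        print(f"eps={eps:.0e}  H(eps)={binary_entropy(eps):.4f}  "
              f"sqrt(eps)={math.sqrt(eps):.4f}")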

This is based on joint work with Bernhard Haeupler.

In partial fulfillment of the Speaking Requirement.


A General Framework for Frontier-Based Approximate Shortest Paths

Monday, May 2nd, 2016 from 12-1 pm in GHC 2109.

Presented by Aram Ebtekar, CSD

The A* algorithm aims to balance depth-first (greedy) and breadth-first (Dijkstra) search to quickly find an approximate shortest path in a graph. While its runtime depends on choosing a good heuristic suited to the domain's structure, we can, under mild conditions, guarantee that each state is expanded at most once and that a solution will be found within a configurable optimality factor. In this talk, we explore these conditions in great generality, leading to a framework that encapsulates a family of known search algorithms. Some of these algorithms provide additional benefits, such as anytime solution bound improvement, dynamic path repair, multi-core processing, and the use of multiple inadmissible heuristics. Using this framework, we can better understand existing algorithms, mix and match their features in a modular fashion, and design new variants with the same theoretical guarantees. In particular, we present PA*SE, an efficient parallel version of A*.
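
For reference, here is a minimal sketch of the baseline such frameworks generalize: weighted A*, where inflating a consistent heuristic by a factor w >= 1 gives exactly the guarantees mentioned above, each state expanded at most once and a solution within a factor w of optimal. This is my own illustrative code, not the framework from the talk.

    import heapq

    def weighted_astar(start, goal, neighbors, h, w=1.5):
        """neighbors(s) yields (successor, edge_cost); h is a consistent
        heuristic.  Expands by f = g + w*h, never re-expanding a state,
        and returns a path of cost at most w times the optimum."""
        g = {start: 0.0}
        parent = {start: None}
        closed = set()
        frontier = [(w * h(start), start)]
        while frontier:
            _, s = heapq.heappop(frontier)
            if s in closed:
                continue
            closed.add(s)
            if s == goal:
                path = []
                while s is not None:
                    path.append(s)
                    s = parent[s]
                return path[::-1], g[goal]
            for t, c in neighbors(s):
                if t not in closed and g[s] + c < g.get(t, float("inf")):
                    g[t] = g[s] + c
                    parent[t] = s
                    heapq.heappush(frontier, (g[t] + w * h(t), t))
        return None, float("inf")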

Joint work with Maxim Likhachev.

In partial fulfillment of the Speaking Requirement.


Forecasting Seasonal Epidemics of Influenza with an Ensemble of Customized Statistical Methods

Friday, May 6th, 2016 from 12-1 pm in GHC 6501.

Presented by Logan Brooks, CSD

Seasonal epidemics of influenza cause significant morbidity, mortality, and economic loss; the Centers for Disease Control and Prevention (CDC) estimates that they are associated with over 200,000 hospitalizations and 3,000 to 49,000 deaths in the United States each year. Accurate and reliable forecasts of disease prevalence can assist the design of more effective countermeasures and improve hospital and public-health preparedness. Unfortunately, while the number of influenza infections over time within a given season follows a rough pattern, with one or two sharp peaks occurring between December and March, it is not well modeled by simple time-series tools such as linear autoregression, and calls for a more tailored approach. I present one of the CMU Delphi group's systems for forecasting disease prevalence, an ensemble of empirical Bayes, regression, and kernel-based methods, and analyze its cross-validated performance when forecasting doctor-visit data for influenza-like illness (ILI).
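
As a sketch of the ensembling step alone (the component forecasters and data below are stand-ins, not the Delphi group's actual models), one simple scheme weights each component method by its cross-validated accuracy:

    import numpy as np

    def cv_weights(past_preds, truth):
        """past_preds: (n_methods, n_weeks) held-out forecasts;
        truth: (n_weeks,) observed values.  Weight each method
        inversely to its mean squared error, normalized to sum to 1."""
        mse = ((past_preds - truth) ** 2).mean(axis=1)
        w = 1.0 / (mse + 1e-9)
        return w / w.sum()

    def ensemble_forecast(current_preds, weights):
        """Weighted average of the components' current forecasts."""
        return weights @ current_preds

    rng = np.random.default_rng(0)
    truth = 2 + np.sin(np.linspace(0, 3, 30))              # toy prevalence curve
    past = truth + rng.normal(0.0, [[0.1], [0.3], [0.6]], (3, 30))
    w = cv_weights(past, truth)
    print(w, ensemble_forecast(np.array([2.1, 2.4, 1.8]), w))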

Joint work with David Farrow, Sangwon Hyun, Shannon Gallagher, Roni Rosenfeld, and Ryan Tibshirani.

In partial fulfillment of the Speaking Requirement.


Data Retention in Flash-Based SSDs

Monday, May 9th, 2016 from 12-1 pm in GHC 6501.

Presented by Yixin Luo, CSD

Retention errors, caused by charge leakage over time, are the dominant source of flash memory errors. In this talk, we will show how retention errors affect SSD reliability, presenting results from our experimental characterization of state-of-the-art MLC flash chips. We will also introduce two mechanisms to mitigate these retention errors.
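
As a toy model of the effect (illustrative constants of my own, not measured chip parameters), the sketch below shows why one natural mitigation, shifting the read reference voltage to track the drift, recovers most retention errors:

    import numpy as np

    # Cell threshold voltages leak downward over time, so a fixed read
    # reference misreads more and more cells; a reference shifted along
    # with the drift restores the original error rate.
    rng = np.random.default_rng(0)
    v = rng.normal(2.0, 0.1, 100_000)         # programmed threshold voltages
    drift = 0.05 * np.log1p(np.arange(1, 6))  # leakage grows roughly with log(time)

    for t, d in enumerate(drift, start=1):
        aged = v - d
        fixed = (aged < 1.7).mean()            # error rate, original reference
        shifted = (aged < 1.7 - d).mean()      # reference shifted by the drift
        print(f"t={t}: raw {fixed:.4%} -> adjusted {shifted:.4%}")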

This work was presented as "Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery" in the best paper session at HPCA 2015.

Joint work with Yu Cai, Erich F. Haratsch, Ken Mai, and Onur Mutlu

In partial fulfillment of the Speaking Requirement.


Web contact: sss+www@cs