SCSFC

Modern programming languages safeguard developers from many typical errors, yet more subtle errors—such as violations of security policies—still plague software. Program synthesis has the potential to eliminate such errors by generating executable code from concise and intuitive high-level specifications. Traditionally, program synthesis failed to scale to specifications that encode complex behavioral properties of software: these properties are notoriously hard to check even for a given program, and so it’s not surprising that finding the right program within a large space of candidates has been considered very challenging. My work tackles this challenge through the design of synthesis-friendly program verification mechanisms, which are able to check a large set of candidate programs against a complex specification at once, thereby efficiently pruning the search space.

Based on this principle, I developed Synquid, a program synthesizer that accepts specifications in the form of expressive types and uses a specialized type checker as its underlying verification mechanism. Synquid is the first synthesizer powerful enough to automatically discover provably correct implementations of complex data structure manipulations, such as insertion into Red-Black Trees and AVL Trees, and normal-form transformations on propositional formulas. Each of these programs is synthesized in under a minute. Going beyond textbook algorithms, I created a language called Lifty, which uses type-driven synthesis to automatically rewrite programs that violate information flow policies. In our case study, Lifty was able to enforce all required policies in a prototype conference management system.

Nadia Polikarpova is a postdoctoral researcher at the MIT Computer Science and Artificial Intelligence Lab, interested in helping programmers build secure and reliable software. She completed her PhD at ETH Zurich. For her dissertation she developed tools and techniques for automated formal verification of object-oriented libraries, and created the first fully verified general-purpose container library, receiving the Best Paper Award at the International Symposium on Formal Methods. During her doctoral studies, Nadia was an intern at MSR Redmond, where she worked on verifying real-world implementations of security protocols. At MIT, Nadia has been applying formal verification to automate various critical and error-prone programming tasks.

Faculty Host: Jonathan Aldrich
Institute for Software Research

Web applications are affected by a wide array of security vulnerabilities. Of these, logical flaws in particular are good candidates for detection via model checking. However, approaches to analyze potential attacks often rely on manually constructed models. This limits the wider practical applicability of existing formal methods tools.

In this talk, I will present a step toward bridging this gap by analyzing web application code (specifically, Java servlets) and extracting state machine models for use by model checkers that target security properties. I will discuss suitable abstractions, which are essential to limit the complexity of the generated models; these include user-specified abstractions that depend on the provided specifications. We have used this tool within the SPaCIoS EU research project to find vulnerabilities using an approach that can be combined with security testing and model inference. A promising future step is to leverage these models to detect more complex chained attacks. A toy sketch of the kind of model involved follows below.
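
As a toy sketch of the idea, using an invented example application rather than the SPaCIoS tool itself: the extracted model abstracts the session into a few states, transitions correspond to request handlers, and the security property is that the admin page is unreachable without an administrator login.

    # Toy extracted model and a reachability-style security check
    # (invented example; not the actual tool or models from the talk).
    from collections import deque

    # Extracted model: (source state, handler) -> target state.
    transitions = {
        ("anonymous", "login_user"):  "user",
        ("anonymous", "login_admin"): "admin",
        ("user",      "logout"):      "anonymous",
        ("admin",     "view_admin"):  "admin",
        ("admin",     "logout"):      "anonymous",
    }

    def reachable(start, trans):
        """Breadth-first exploration of the states reachable in the model."""
        seen, queue = {start}, deque([start])
        while queue:
            state = queue.popleft()
            for (src, _handler), dst in trans.items():
                if src == state and dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return seen

    # Security property: without the admin login handler, the "admin" state
    # (and hence view_admin) must be unreachable from an anonymous session.
    without_admin_login = {k: v for k, v in transitions.items() if k[1] != "login_admin"}
    print("admin reachable without admin login:",
          "admin" in reachable("anonymous", without_admin_login))   # expect False

Abstracting concrete session data into a handful of such states is exactly the kind of reduction needed to keep the generated models small enough to check.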

Marius Minea is an associate professor at the Politehnica University of Timisoara, Romania. He received his PhD from Carnegie Mellon with a thesis on model checking for timed systems, advised by Ed Clarke, and was then a postdoctoral researcher at the University of California, Berkeley. His research interests are at the intersection of software analysis, testing, and security, where he has led several research projects.

He has taught courses on verification and security, enjoys using live coding in programming classes and has designed a discrete structures course that uses functional programming.

Faculty Host: Tom Cortina

Algorithms and formal verification are two classical areas of computer science. The two fields apply rigorous mathematical proof to seemingly disparate ends---on the one hand, analyzing computational efficiency of algorithms; on the other, designing techniques to mechanically show that programs are correct.

In this talk, I will present a surprising confluence of ideas from these two areas. First, I will show how coupling proofs, used to analyze random walks and Markov chains, correspond to proofs in the program logic pRHL (probabilistic Relational Hoare Logic). This connection enables formal verification of novel probabilistic properties, and provides a structured understanding of proofs by coupling. Then, I will show how an approximate version of pRHL, called apRHL, points to a new, approximate version of couplings closely related to differential privacy, and a new kind of proof by approximate coupling. The corresponding proof technique enables cleaner proofs of differential privacy, both for humans and for formal verification. Finally, I will share some directions towards a possible "Theory AB", blending ideas from both worlds.
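
As standard background for how these couplings connect to privacy: a randomized mechanism M is (ε, δ)-differentially private if, for all adjacent inputs D and D' and every set S of outputs,

$$\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta.$$

Approximate couplings give a compositional way to establish exactly such bounds, which is what apRHL proofs certify.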

Justin Hsu is a final year graduate student in Computer Science at the University of Pennsylvania. He obtained his undergraduate degree in Mathematics from Stanford University. His research interests span formal verification and theoretical computer science, including verification of randomized algorithms, differential privacy, and game theory. He is the recipient of a Simons graduate fellowship in Theoretical Computer Science.

Faculty Host: André Platzer

Computer Science

The goal of my research is to enable robots to perceive objects reliably, to manipulate them effortlessly, and to adapt to new objects and manipulation tasks. Toward this objective, my focus is on developing new hand-eye coordination capabilities for robots, specifically 1) dependable visual perception, 2) compliant soft manipulation, and 3) closing the loop between them. In this talk, I will present my robot vision research, which harnesses a 3D shape prior and a deep learning algorithm to reason about manipulation-relevant information. I will then show a compliant soft hand, with a focus on its compliance and adaptability, and address closing the loop between visual perception and soft manipulation, presenting applications to object assembly and the grasping of previously unseen objects. In the later part of my talk, I will introduce challenging yet exciting future research directions.

Changhyun Choi is a Postdoctoral Associate in the Computer Science & Artificial Intelligence Lab (CSAIL) at the Massachusetts Institute of Technology (MIT), working with Prof. Daniela Rus. He obtained a Ph.D. in Robotics at the School of Interactive Computing, College of Computing, Georgia Institute of Technology, where he was also affiliated with the Institute for Robotics and Intelligent Machines (IRIM). His research interests are in visual perception for robotic manipulation, with a focus on deep learning for object grasping and assembly manipulation, soft manipulation, object pose estimation, visual tracking, active perception, visual verification, and 3D registration.

Robotics

As machine learning applications become larger and more widely used, there is an increasing need for efficient systems solutions. The performance of essentially all machine learning applications is limited by bottlenecks with effects that cut across traditional layers in the software stack.  Because of this, addressing these bottlenecks effectively requires a broad combination of work in theory, algorithms, systems, and hardware. To do this in a principled way, I propose a general approach called mindful relaxation. The approach starts by finding a way to eliminate a bottleneck by changing the algorithm's semantics. It proceeds by identifying structural conditions that let us prove guarantees that the altered algorithm will still work. Finally, it applies this structural knowledge to implement improvements to the performance and accuracy of entire systems.

In this talk, I will describe the mindful relaxation approach, and demonstrate how it can be applied to a specific bottleneck (parallel overheads), problem (inference), and algorithm (asynchronous Gibbs sampling).  I will demonstrate the effectiveness of this approach on a range of problems including CNNs, and finish with a discussion of my future work on methods for fast machine learning.
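
As a rough illustration of the bottleneck and algorithm involved, here is a minimal sketch of asynchronous ("Hogwild"-style) Gibbs sampling on a toy Ising ring, assumed for illustration only rather than taken from the talk: several threads resample spins concurrently, reading neighbor values that may be stale and writing to shared state without locks.

    # Minimal asynchronous Gibbs sampling sketch on a toy Ising ring
    # (illustrative only; not the actual system or benchmarks from the talk).
    import math
    import random
    import threading

    N = 16                                               # spins on a ring
    BETA = 0.5                                           # inverse temperature
    spins = [random.choice([-1, 1]) for _ in range(N)]   # shared, unsynchronized state

    def resample(i):
        # Conditional of spin i given its two ring neighbors (reads may be stale).
        field = spins[(i - 1) % N] + spins[(i + 1) % N]
        p_up = 1.0 / (1.0 + math.exp(-2.0 * BETA * field))
        spins[i] = 1 if random.random() < p_up else -1   # lock-free write

    def worker(updates=20000):
        for _ in range(updates):
            resample(random.randrange(N))

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print("mean magnetization:", sum(spins) / N)

Removing synchronization changes the algorithm's semantics, which is precisely why structural conditions are needed to guarantee that the relaxed sampler still mixes well and has low bias.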

Christopher De Sa is a PhD candidate in Electrical Engineering at Stanford University advised by Christopher Ré and Kunle Olukotun. His research interests include algorithmic, software, and hardware techniques for high-performance machine learning, with a focus on relaxed-consistency variants of stochastic algorithms such as asynchronous stochastic gradient descent (SGD). He is also interested in using these techniques to construct data analytics and machine learning frameworks that are efficient, parallel, and distributed. Chris's work on studying the behavior of asynchronous Gibbs sampling received the Best Paper Award at ICML 2016.

Machine Learning

Today's most popular applications are deployed as massive-scale distributed systems in the datacenter. Keeping data consistent and available despite server failures and concurrent updates is a formidable challenge. Two well-known abstractions, strongly consistent replication and serializable transactions, can free developers from these challenges by transparently masking failures and treating complex updates as atomic units. Yet the conventional wisdom is that these techniques are too expensive to deploy in high-performance systems.

I will demonstrate a new approach to designing distributed systems that allows strongly consistent distributed systems to be built with little to no performance cost. Taking advantage of the properties and capabilities of the datacenter environment, we can co-design distributed protocols and the network layer. Specifically, I will describe two systems for state machine replication, Speculative Paxos and Network-Ordered Paxos, and one for distributed transaction processing, Eris, built using this approach. They are able to achieve 5-17x performance improvements over conventional designs. Moreover, they achieve performance within 2% of their weakly consistent alternatives, demonstrating that strong consistency and high performance are not incompatible.

Dan Ports is a Research Assistant Professor in Computer Science and Engineering at the University of Washington, where he leads the distributed systems research group. His group's research focuses on building practical distributed systems with strong theoretical underpinnings. Prior to joining the faculty at UW in 2015, Dan received his Ph.D. from MIT (2012), where he was advised by Barbara Liskov, and completed a postdoc at UW CSE. His research has been recognized with best paper awards at NSDI and OSDI.

Faculty Host: Andy Pavlo

The rise of new online platforms for capturing many aspects of our daily lives has opened the door for large-scale computational studies of nearly all facets of behavior, such as exercise, collaboration, and community dynamics.  My research aims to understand and predict these kinds of human behaviors by combining techniques from natural language processing and network science to produce holistic models of people and their social interactions.  The first part of this talk focuses on the challenge of inferring the demographics that describe who people are.  Using the example of location inference, I show how we can efficiently and accurately learn these aspects for hundreds of millions of people across the globe.  I also discuss my recent work on algorithmic bias in demographic inference and show how to mitigate inequality for the ubiquitous task of language identification.  The second part of the talk shifts from individuals to their social interactions, and I describe my recent work analyzing how people’s offline behavior and communication strategies change when they join online groups.  I conclude by highlighting future directions in computational social science that I am excited to pursue through the combined lens of language and networks.

David Jurgens is a postdoctoral scholar in the Department of Computer Science at Stanford University and received his PhD from UCLA.  His research combines natural language processing, network science, and data science to discover, explain, and predict human behavior in large social systems. He was recently awarded a Volkswagen Foundation grant for his work on population modeling to measure the influence of international actors on national news topics.  He is currently the Data Co-Chair for ICWSM and his research on demographic inference has been featured in news outlets such as the MIT Technology Review, Forbes, and Business Insider.

Language Technologies

Language is one of the greatest puzzles of both human and artificial intelligence (AI). Language learning happens effortlessly in children; yet, it is a complex process that we do not fully understand. Moreover, although access to more data and computation has resulted in recent advances in AI systems, they are still far from human performance in many language tasks. How do humans learn and represent language? And how can this inform AI?

In this talk, I first explain how computational modeling sheds light on the mechanisms underlying child word learning, which is an early step in language development. I introduce an unsupervised model that learns word meanings and their semantic connections using general cognitive mechanisms; the model processes data that approximates child input and assumes no built-in linguistic knowledge. I show that to account for child word learning, cognitive mechanisms such as memory and attention need to be integrated into the model's statistical learning.  Next, I explain how investigating human semantic processing helps us model semantic representations more accurately. I show that recent neural models of semantics, despite being trained on huge amounts of data, fail to capture important aspects of human similarity judgements. I also show that a probabilistic topic model does not have these problems, suggesting that exploring different representations may be necessary to capture different aspects of human semantic processing.

Aida Nematzadeh is a post-doctoral researcher at the University of California, Berkeley. She received a PhD and an MSc in Computer Science from the University of Toronto in 2015 and 2010, respectively. Aida’s research provides a better understanding of the computational mechanisms underlying the human ability to learn and organize information, with a focus on language learning. Aida has been awarded an NSERC Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada.

Machine Learning

Recent rapid advances in next-generation sequencing (NGS) technologies have dramatically changed our ability to perform genome-scale analyses of human genomes, including variant detection, gene expression analysis, and the study of gene structures.  In this talk, I will describe computational approaches for conducting these analyses, with an emphasis on variant detection.  I will focus on novel methods for representing and searching a significantly expanded model of the human reference genome, in which a comprehensive catalogue of known genomic variants and haplotypes is incorporated into the data structure used for searching and alignment.  This new way of representing a population of genomes, along with a very fast and memory-efficient search capability, allows our algorithm to produce more detailed and accurate variant analyses than previous methods.  I will also show how to use this new method for HLA typing, a critical task in human organ transplantation.
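
As a toy illustration of this idea, assuming made-up sequences rather than the actual index described in the talk: each reference position records the bases known to occur in the population, so a read carrying a catalogued variant still matches exactly instead of being scored as a mismatch.

    # Toy sketch of a variant-aware reference (invented data; not the talk's data structure).
    reference = "ACGTACGTAC"
    known_snps = {3: {"T", "C"}, 7: {"T", "G"}}   # position -> alleles in the catalogue

    # Per-position allowed bases: the reference base plus any catalogued alternatives.
    allowed = [known_snps.get(i, set()) | {base} for i, base in enumerate(reference)]

    def matches(read, offset):
        """True if `read` aligns at `offset` using only reference or catalogued alleles."""
        if offset + len(read) > len(reference):
            return False
        return all(base in allowed[offset + i] for i, base in enumerate(read))

    print(matches("TACG", 3))   # True: identical to the reference
    print(matches("CACG", 3))   # True: 'C' at position 3 is a catalogued variant
    print(matches("GACG", 3))   # False: 'G' at position 3 is not in the catalogue

The method described in the talk combines such a population-aware representation with a fast, memory-efficient search index; the sketch above only shows the matching idea at a single locus.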

Daehwan Kim, Ph.D., is a post-doctoral research fellow in the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University.  He received his Ph.D. in Computer Science from the University of Maryland, College Park in 2013 under the supervision of Steven Salzberg, currently Director of the Center for Computational Biology at JHU.  Dr. Kim’s expertise is in algorithms and high-performance computing approaches for next-generation sequencing analysis, with a focus on sequence alignment and fusion gene/transcript discovery.

Computational Biology
