go back to the main page

 SSS Abstracts 
Spring 2014

go back to the main page

Understanding the Power of Convex Relaxation Hierarchies: Effectiveness and Limitations

Friday, March 7th, 2014 from 12-1 pm in GHC 4303.

Presented by Yuan Zhou, CSD

Optimization is a broad class of problems arising everywhere in computer science, operations research, etc. Efficient algorithms that approximate the optimal solution are desirable for many optimization problems that seem computationally intractable. Convex programming relaxation hierarchies have proven to be the most powerful tool to design approximation algorithms for optimization problems.

In this talk, I will introduce convex relaxation hierarchies and explain how to understand their computational power. Using several fundamental optimization problems like Max-Cut and Sparsest-Cut as examples, I will illustrate: 1) how to design approximation algorithms via hierarchies, 2) how certain problems are resistant to hierarchies, and their significance, and 3) a surprising connection between hierarchies and the theory of algebraic proof complexity.

In the end, I will propose directions to further explore the power of convex relaxation hierarchies, which hopefully will lead to answer the most important question in approximation algorithms research -- whether the simplest semidefinite programming relaxation is optimal for many optimization problems.

Logic Programming with Temporality, Names and List Comprehension

Friday, March 21st, 2014 from 12-1 pm in GHC 4303.

Presented by Kuen-Bang Hou (Favonia), CSD

Logic programming has been known for its ability to execute specifications as programs directly. The one based on linear logic can further model the states in real systems. Unfortunately, some common concepts are rather inconvenient to express in a typical linear logic. For example, maintaining a global counter for the number of data in a system requires careful manual updating in every rule.

Motivated by social networking sites, I have created a new logic for logic programming which addresses several usability issues in a typical linear logic. The extensions include temporality (which can be used to handle global information), name generation and list comprehension. I also showed that there is a sound and complete compilation from this logic to a known linear logic. This talk will primarily focus on the definition and the usage of the new logic.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement

Separation Logic with Heap Predicates

Tuesday, March 25th, 2014 from 12-1 pm in GHC 4303.

Presented by Salil Joshi, CSD

Separation logic has proven to be a succesful technique for reasoning about programs that manipulate pointers. Various extensions have been proposed over the years for dealing with concurrent programs.

In this talk I will present a new form of separation logic which provides a particularly simple and elegant treatment of both concurrency and aliasing by changing how we look at the heap. I move away from the traditional separation logic approach of talking about the precise contents of heap locations, and instead talk about predicates that those contents must satisfy. This shift in viewpoint is realized by a new form of assertion called a heap predicate. Heap predicates enable local reasoning about aliasing and concurrency, including concurrent programs with data races. As an example, I will demonstrate how locks can be programmed using more primitive operations, and specified using heap predicates.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement

Joint Work with Karl Crary and Ligia Nistor

Detecting associations between genetic variants and output traits using prior biological knowledge

Friday, March 28th, 2014 from 12-1 pm in GHC 4303.

Presented by Seunghak Lee, CSD

One of the fundamental problems in computational biology is to detect genetic variants associated with output traits such as disease status, heights, or gene expressions. However, detecting trait-associated genetic variants has been a challenging problem because in practice, we do not have enough statistical power to detect them reliably; we usually have a small number of samples with a large number of genetic variants.

In this work, we present a novel method that uses prior biological knowledge to boost the statistical power of detecting genetic variants associated with traits. Specifically, we use biological knowledge about groups of correlated genetic variants (e.g. genetic variants in linkage disequilibrium) and groups of correlated traits (e.g. co-expressed genes). Given the grouping information, we assume that a group of correlated traits may be affected by the common genetic variants, or a group of genetic variants may affect the common traits. Under such assumptions, we incorporate the biological knowledge into a sparse regression model using L1/L2 penalties. We illustrate our approach with examples, and show how prior biological knowledge helps increase the power to detect associations between genetic variants and traits.

This is joint work with Eric Xing.

This will be presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Assessing the Global Cyber and Bio Threat

Tuesday, April 8th, 2014 from 12-1 pm in GHC 4303.

Presented by Ghita Mezzour, ECE & ISR

Cyber-attacks are an international problem in today’s inter-connected world. Cyber threat from any country is a threat and concern to everyone. Identifying the most threatening countries is an important step towards reducing the global cyber threat. Two types of countries pose great cyber security threat. The first type of countries offers favorable conditions to hosting cyber-criminal infrastructure. In the second type of countries, governments develop cyber warfare capabilities. These governments are even more threatening if they also have a weapon of mass destruction capability such as a bioweapon capability.

In this work, I identify countries that extensively host cyber-criminal infrastructure, and reasons why these countries are attractive for hosting such infrastructure. In my analysis, I use Symantec WINE data which consist of attack reports from more than 10 million Symantec customer computers worldwide. I also present a methodology for assessing countries' cyberwarfare and bioweapon capabilities. My methodology assesses both countries' motivations and latent abilities. In my assessment, I use socio-political data, and measures of countries' cyber and bio expertise and infrastructure.

Reasoning about Declassification in Epistemic Logic

Tuesday, April 22nd, 2014 from 12-1 pm in GHC 4303.

Presented by Arbob Ahmad, CSD

Noninterference (NI) defines a program to be secure if changes to high-security inputs cannot alter low-security outputs. NI indirectly states the epistemic property that the low-security outputs do not include knowledge obtained from the high-security inputs. However, NI is too restrictive a property to incorporate declassification of some high-security inputs and not others. By reasoning about declassification in epistemic logic, we express what knowledge is obtained through the execution of a program by the declassifications. Therefore, as a consequence of the adequacy of the epistemic logic for modeling the program behavior, we can still derive some NI-like properties for those high inputs that have not been declassified. In this talk, I will present our formulation of declassification and explain how it affects the epistemic properties of programs.

Joint work with Robert Harper

In partial fulfillment of the Speaking Requirement

Scheduling and Partitioning Big Models for Parallel Machine Learning

Tuesday, May 6th, 2014 from 12-1 pm in GHC 4303.

Presented by Seunghak Lee, CSD

When training machine learning models with many variables, a single machine is often inadequate since (1) the variables may be too large to fit in memory, and (2) training can take a long time. A natural choice is to turn to distributed computing on a cluster. However, naive parallelization of machine learning algorithms can make inefficient use of distributed memory, while failing to obtain convergence speed-ups or inference correctness. Such inefficiencies are often caused by dependencies between variables.

In this talk, we will explore partitioning and update scheduling of variables in distributed machine learning algorithms, in order to improve their memory efficiency and speed up convergence without compromising correctness. Specifically, I will describe how partitioning and scheduling strategies can scale up machine learning models including Lasso, Matrix Factorization, and Latent Dirichlet Allocation with many variables. The experiments will demonstrate the speed and scalability of our approach in comparison to other parallel machine learning methods.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Web contact: sss+www@cs