On Nash-equilibria of approximation-stable games. With Nina Balcan, Avrim Blum, Or Sheffet and Santosh Vempala. SAGT 2010.
In this paper, we define the notion of games that are approximation-stable, meaning that all $\epsilon$-approximate
equilibria are contained inside a small ball of radius $\delta$
around a true equilibrium, and investigate a number of their
properties. Many natural small games such as matching pennies and
rock-paper-scissors are indeed approximation stable. We show
furthermore that there exist
2-player n-by-n approximation-stable games in which the Nash
equilibrium and all approximate equilibria have support $\Omega(log
n)$. In addition, we also show that all $(\epsilon,\delta)$
approximation-stable games must have an $\epsilon$-approximate equilibrium of
support $O(\frac{\delta^{2-o(1)}}{\epsilon^2}log n)$, yielding a
faster algorithm for computing approximate Nash-equilibria, for games
which are approximation-stable.
Stability yields a PTAS for k-median and k-means clustering. With Avrim Blum and Or Sheffet. FOCS 2010.
Ostrovsky et al. [ORSS'06]
showed that if a clustering instance satisfies a stability condition, namely that the optimal (k-1)-means clustering is more
expensive than the optimal k-means clustering by a factor of
$max{100, 1/\alpha^2}$, then one can achieve a
$(1+f(\alpha))$-approximation to the k-means optimal in time
polynomial in n and k by using a variant of Lloyd's algorithm. In this paper we improve on their result by showing that
if the (k-1)-means optimal
is more expensive than the k-means optimal by a factor $1+\alpha$
for some constant $\alpha>0$, then we can obtain a PTAS for the k-means problem.
Our technique can also be used to obtain a PTAS under the assumption
considered by [Balcan et al.'09] that all $(1+\alpha)$
approximations are $\delta$-close to a desired target clustering, when
all target clusters have size greater than $2\delta n$.
Improved guarantees for agnostic learning of disjunctions. With Avrim Blum and Or Sheffet. COLT 2010.
Given a distribution D over {0,1}^n X {-1,+1}, define OPT to be the error achieved by the best disjunction with respect to D. The problem
is to come up with a polynomial time algorithm which uses random samples from D to construct a prediction rule whose error
is atmost (c.OPT + \epsilon), for any \epsilon > 0. In this paper we show how to achieve c = O(n^{1/3 + \alpha}), for any \alpha > 0. This improves on the previous best
bound of O(\sqrt(n)) by [Peleg'07].
Supervised clustering. With Reza Bosagh Zadeh. NIPS 2010.
This paper studies the clustering framework proposed in [Balcan, Blum'08]. In this
framework we assume that a given set of m points belongs to k target clusters and each cluster is defined by some concept in a concept class C.
The goal of the clustering algorithm is to interact with the user and figure out the target clustering. This paper gives a simple query-efficient
algorithm to cluster any concept class in this model. We also study two natural generalizations of the original model.
Online stochastic optimization in the large: Application to kidney exchange. With Tuomas Sandholm. IJCAI 2009.
This paper studies trajectory-based online algorithms for solving large stochastic programs. We apply these algorithms to match patients
in large kidney exchanges.
Decision trees for entity identification: Approximation algorithms and hardness results. With Venkatesan Chakaravarthy, Vinayaka Pandit, Sambuddha Roy and
Mukesh Mohania. PODS 2007.
Consider the following problem: Given an input table containing information about a set of entities over a fixed set of
attributes and a probability distribution over the entities, construct a decision tree that identifies each entity unambiguously
by testing the attribute values such that the average number of tests is minimized. Building on prior work ([Adler, Heeringa'05]) we study the general version of the problem
involving arbitrary input tables and arbitrary probability distribution over the set of entities. We consider a natural greedy algorithm and prove an approximation guarantee of
O(r_K log(N)), where N is the number of entities, K is the maximum number of distinct values of an attribute, and r_K is a suitably
defined Ramsey number. In addition, our analysis indicates a possible way of resolving a Ramsey theoretic conjecture by Erdos.
Image modeling using tree structured conditional random fields. With Aakanksha Gagrani and Balaraman Ravindran. IJCAI 2007.