Virgi's Research

Virginia Vassilevska's Research

Finding weighted subgraphs or paths:

My IPL 2008 paper concerns space efficient combinatorial algorithms for finding a k-clique in a graph. It gives the first o(n^k) algorithm for the problem which uses O(n^c) space for any constant c>0 independent of k. The exact runtime is O(n^k/log^{k-1} n). The paper also gives an efficient algorithm for finding so called semicliques.
My ICALP 2008 paper develops a data structure which maintains a Boolean matrix under row/column updates, supporting fast multiplication with Boolean vectors, where the product runtime depends on the sparsity of the vector. This data structure allows us to obtain fast combinatorial algorithms for problems like graph transitive closure, all pairs shortest paths in undirected unweighted graphs and DAG least common ancestors. For many of these problems we obtain the first nontrivial combinatorial improvements for some graph sparsities. The paper is joint with Guy Blelloch and Ryan Williams. Preliminary version: [pdf]
My SODA 2008 paper is on finding nondecreasing paths in an edge-weighted graph. The problem is defined as follows - one is given a source and destination vertex pair and is asked to find a path from the source to the destination for which the consecutive edge weights on the path form a nondecreasing sequence, and the last edge weight on the path is minimized over all paths from the source to the destination. The problem was first considered by G. Minty in 1958 in connection to train scheduling. When given a schedule of trains between various source-destination pairs, a trip is desired which takes one from a particular source to a particular destination so that one arrives at the destination as early as possible. It is easy to see that one can model the problem with an edge weighted graph for which the edge weights are times, so that a valid train trip must have nondecreasing times, and the earliest possible arrival is the minimum weight of a nondecreasing path to the destination. In my paper I show that in the RAM model of computation there is a relatively simple linear time algorithm for the single source version of the problem. To obtain linear time I use the atomic heaps data structure of Fredman and Willard. If instead one uses a binary tree structure (e.g. splay trees or treaps) the running time would be O(m log log n). Even though it is asymptotically weaker, this implementation would be more practical. In the paper I also show that the all-pairs version of the problem is solvable in truly subcubic time (roughly O(n^2.9)). The paper appears in the proceedings of SODA 2008 and was invited to the special issue in TALG. Here is a preliminary version: [pdf].
Another recent paper of mine concerns finding the maximum bottleneck paths between all pairs of vertices in a graph. The edge weights are arbitrary real numbers, and a bottlneck edge on a path is the edge of minimum weight. We show how to compute all-pairs-bottleneck-paths in subcubic time, roughly O(n^2.8). Our algorithm is similar to our first weighted triangle algorithm, uses fast matrix multiplication and is strongly polynomial. The result is interesting because the problem is closely related to APSP. This is joint work with Ryan Williams and Raphael Yuster. Our paper appears in the proceedings of STOC'07. (preliminary version: [ps] [pdf])
My paper (joint with Ryan Williams and Raphael Yuster) "Finding the smallest H-subgraph in real weighted graphs and related problems" improves on our previous result. For example, we show how to find a maximum node weighted triangle in O(n^2.6), and moreover, in the same time we can find for every pair of vertices the maximum weighted triangle which goes through them. We generalize the result for any given subgraph type of fixed size. The paper is included in the proceedings of ICALP'06 [Springer] (preliminary version:[pdf]).
In a joint paper with Ryan Williams, I show how to find a maximum node weighted triangle (with arbitrary weights) in O(n^2.7). This is the first algorithm to beat the naiive cubic running time. We show some applications of the result and present some progress towards obtaining a similar result for the edge weighted case. One of our applications is on a stable matching problem in computational economics. Our paper appears in the proceedings of STOC'06. [ACM] (preliminary version: [ps] [pdf])
The above two papers and some more results were combined in a journal version to appear in TALG. Here is a preliminary version: [ps] [pdf].

Approximation, inapproximability and hybrid algorithms:

In 2005, together with Ryan Williams and Maverick Woo I wrote a paper on hybrid algorithms [ACM] (preliminary version: [pdf] [ps]). Our paper appears in the proceedings of SODA'06. Our model of a hybrid algorithm includes a set of heuristics and a selector which on a preliminary scan of the input quickly decides which heuristic to run. The main idea is that by allowing several heuristics each of which works well with respect to a different performance measure, one can obtain overall performance guanrantees which overcome even the known hardness of the problem on each of the respective measures separately. The catch is that you will run different heuristics on different instances. For example, you may solve some exactly in subexponential (but perhaps superpolynomial) time, and you may have to approximate some in poly time but with a better approximation ratio.
My work in 2005 also includes an improvement on the inapproximability of the shortest superstring problem which besides giving a slightly better constant than the one already known, shows that certain restricted instances of the problem are as hard to approximate as the general case. My paper can be found in the proceedings of MFCS'05 ([Springer]). Preliminary version: [ps].
In the summer of 2003 I was involved in a project on finding nonoverlapping dense blocks in a matrix which would improve sparse matrix multiplication. I worked on this with Ali Pinar in the Lawrence Berkeley National Lab. The abstract problem formulation is as follows. Given a boolean matrix A and a smaller boolean matrix P called the pattern, one wishes to cover all nonzeroes of A with as many nonoverlapping subpatterns as possible. The problem is NP-hard for most patterns. For so called dense blocks (pattern matrices of all ones) there is a trivial 2-approximation. For 2x2 dense blocks a PTAS is known. Nevertheless, we give a very efficient 2/3 approximation. For more on this check out the journal paper in the special issue of ETNA on Combinatorial Scientific Computing, or my LBL Tech Report.

Dynamic algorithms and geometric data structures:

Daniel Golovin, Guy Blelloch and I worked on designing uniquely represented data structures for problems in computational geometry. The problem is to have data structures the memory representation of which is history-independent, i.e. it only depends on the current state of the data structure but not on the order in which it was updated. We give some interesting results for problems such as orthogonal range queries and 2D dynamic convex hull. Our bounds match the best known for these problems using not necessarily uniquely represented data structures. Our paper appears in the proceedings of SWAT'08.
Together with my advisor, I investigated a new approach to the incremental 2-dimensional orthogonal range query problem in the so called list model. In this model the x and y-coordinates of the stored points are also kept in lists according to their sorted order, so that when a new point (A,B) is to be added, it is given together with the predecessors A' and B' of A and B in the sorted coordinate lists. In this very general model we give the first doubly logarithmic query runtime for a linear space data structure allowing inserts in polylogarithmic time. Our bounds are as follows: when n points are stored, the storage is O(n), the insertion time is O(log n loglog n) and the query time is O(kloglog n) where k is the size of the output. Our work uses a data structure for a problem called "ordered subsets". In our paper on uniquely represented computational geometry data structures this problem reappears, and we are able to convert the original data structure into a uniquely represented one.
I was also involved on projects on dynamic planar point location and traceable data structures. My collaborators in these two projects are my advisor Guy Blelloch, Srinath Sridhar and Umut Acar. In the first project we show how to automatically dynamize a static algorithm for planar point location using the framework of Acar et al. We obtain relatively good bounds. For the second project, we extend the persistent data structures of Driscoll et al. to support tracking of all node accesses. In particular, we support efficient operations which, given a version of a node v, return the earliest version of the node after v at which the node has been accessed (say, written)([pdf]). Unfortunately, this framework turned out to be equivalent to dynamic fractional cascading.

A democratic addition to web search engines:

Together with Michelle Goodstein I designed a game which lets users rate search results for queries. The game is supposed to tell us whether some pages appearing high in a search engine's ranking is webspam. Preliminary results show that the game is lots of fun. Furthermore, it is designed so that it is strategy-proof, i.e. the best strategy for a player is to answer honestly. If a player scores badly most of the time, then this can be taken as evidence that they are either not honest, or are bad at rating query-result pairs. In any case, we can collect only the answers of "good" players and can be reasonably certain that these answers are truthful. In a conservative but democratic vote processing scheme we strive to eliminate web spam. Check out our tech report: [pdf], and our conference submission: [pdf].

Older projects:

On the crossing number of K_{9, 9} : In the summer of 2002 I worked together with Mark Bilinski and Richard Wilson on verifying the Zarankiewicz conjecture for the special case K_{9, 9}. This is the smallest symmetric unverified case. We wrote a program through which we obtained some interesting results which suggest that the conjecture holds for this graph and possibly for K_{7, 9}. A large amount of the project was devoted to optimizing the program which ran in exponential time. Here is my final progress report.
Coding and Graceful Labeling of Trees : In the summer of 2001 under the guidance of Richard Wilson I attempted to make some progress on the Ringel-Kotzig conjecture which states that every tree has a graceful labeling. I read every single paper I could find on the subject. The problem is very tough. I made some partial progress but unfortunately most of the things I proved were already known. Together with my friend Kaloyan, I coded up 3 programs which labeled trees gracefully - one was an exhaustive search, another was a known algo by Aldred and McKay, and a third used a simulated annealing approach. The simulated annealing program performed quite well. During the same summer I studied several dynamic Huffman coding algorithms, implemented and compared them. Here is my final report.

Go back to the Main Page.