\ David Wajc's Homepage


Oct. '18: My paper with Mohsen Ghaffari, “Simplified and Space-Optimal Semi-Streaming for (2+ε)-Approximate Matching” was accepted to SOSA 2019.
Sep. '18: I am visiting Ola Svensson at EPFL for the Fall Semester.
Apr. '18: My paper with Bernhard Haeupler and Ellis Hershkowitz, “Round- and Message-Optimal Distributed Graph Algorithms” was accepted to PODC 2018.
Apr. '18: My paper with Moab Arar, Shiri Chechik, Sarel Cohen and Cliff Stein, "Dynamic Matching: Reducing Integral Algorithms to Approximately-Maximal Fractional Algorithms", was accepted to ICALP 2018.
Apr. '18: My paper with Anupam Gupta, Guru Guruganesh and Amit Kumar, "Fully-Dynamic Bin Packing with Limited Repacking", (to be merged with "A Tight Approximation for Fully Dynamic Bin Packing without Bundling", by Björn Feldkord, Matthias Feldotto and Sören Riechers) was accepted to ICALP 2018.
Nov. '17: My paper with Ariel Procaccia and Harnui Zhang, "Approximation-Variance Tradeoffs in Facility Location Games", was accepted to AAAI 2018.
Oct. '17: My paper with Ilan R. Cohen, "Randomized Online Matching in Regular Graphs", was accepted to SODA 2018.

Contact Information

GHC 7001, Carnegie Mellon University

Filter by type:

Sort by year:

Simplified and Space-Optimal Semi-Streaming (2+ε)-Approximate Matching

Mohsen Ghaffari, David Wajc
Conference PaperSOSA 2019


In a recent breakthrough, Paz and Schwartzman (SODA'17) presented a single-pass (2+ɛ)-approximation algorithm for the maximum weight matching problem in the semi-streaming model. Their algorithm uses O(n log^2 n) bits of space, for any constant ɛ>0.

We present two simplified and more intuitive analyses, for essentially the same algorithm, which also improve the space complexity to the optimal bound of O(n log n) bits --- this is optimal as the output matching requires \Omega(n log n) bits. Our analyses rely on a simple use of the primal dual method and a simple accounting method.

Round- and Message-Optimal Distributed Graph Algorithms

Bernhard Haeupler, D. Ellis Hershkowitz, David Wajc
Conference PaperPODC 2018


Distributed graph algorithms that separately optimize for either the number of rounds used or the total number of messages sent have been studied extensively. However, algorithms simultaneously efficient with respect to both measures have been elusive. For example, only very recently was it shown that for Minimum Spanning Tree (MST), an optimal message and round complexity is achievable (up to polylog terms) by a single algorithm in the CONGEST model of communication.

In this paper we provide algorithms that are simultaneously round- and message-optimal for a number of well-studied distributed optimization problems. Our main result is such a distributed algorithm for the fundamental primitive of computing simple functions over each part of a graph partition. From this algorithm we derive round- and message-optimal algorithms for multiple problems, including MST, Approximate Min-Cut and Approximate Single Source Shortest Paths, among others. On general graphs all of our algorithms achieve worst-case optimal ˜O(D+sqrt{n}) round complexity and ˜O(m) message complexity. Furthermore, our algorithms require an optimal ˜O(D) rounds and ˜O(n) messages on planar, genus-bounded, treewidth-bounded and pathwidth-bounded graphs.

Fully-Dynamic Bin Packing with Limited Repacking

Anupam Gupta, Guru Guruganesh, Amit Kumar, David Wajc
Conference PaperICALP 2018


We study the classic Bin Packing problem in a fully-dynamic setting, where new items can arrive and old items may depart. We want algorithms with low asymptotic competitive ratio while repacking items sparingly between updates. Formally, each item i has a movement cost ci≥0, and we want to use α·OPT bins and incur a movement cost γ·ci, either in the worst case, or in an amortized sense, for α, γ as small as possible. We call γ the recourse of the algorithm. This is motivated by cloud storage applications, where fully-dynamic Bin Packing models the problem of data backup to minimize the number of disks used, as well as communication incurred in moving file backups between disks. Since the set of files changes over time, we could recompute a solution periodically from scratch, but this would give a high number of disk rewrites, incurring a high energy cost and possible wear and tear of the disks. In this work, we present optimal tradeoffs between number of bins used and number of items repacked, as well as natural extensions of the latter measure.

Dynamic Matching: Reducing Integral Algorithms to Approximately-Maximal Fractional Algorithms

Moab Arar, Shiri Chechik, Sarel Cohen, Cliff Stein, David Wajc
Conference PaperICALP 2018


We present a simple randomized reduction from fully-dynamic integral matching algorithms to fully-dynamic “approximately-maximal” fractional matching algorithms. Applying this reduction to the recent fractional matching algorithm of Bhattacharya, Henzinger, and Nanongkai (SODA 2017), we obtain a novel result for the integral problem. Specifically, our main result is a randomized fully-dynamic (2+ɛ)-approximate integral matching algorithm with small polylog worst-case update time. For the (2+ɛ)-approximation regime only a fractional fullydynamic (2+ɛ)-matching algorithm with worst-case polylog update time was previously known, due to Bhattacharya et al. (SODA 2017). Our algorithm is the first algorithm that maintains approximate matchings with worst-case update time better than polynomial, for any constant approximation ratio. As a consequence, we also obtain the first constant-approximate worst-case polylogarithmic update time maximum weight matching algorithm.

Randomized Online Matching in Regular Graphs

Ilan R. Cohen, David Wajc
Conference PaperSODA 2018


In this paper we study the classic online matching problem, introduced in the seminal work of Karp, Vazirani and Vazirani (STOC 1990), in regular graphs. For such graphs, an optimal deterministic algorithm as well as efficient algorithms under stochastic input assumptions were known. In this work, we present a novel randomized algorithm with competitive ratio tending to one on this family of graphs, under adversarial arrival order. Our main contribution is a novel algorithm which achieves competitive ratio 1-O(sqrt(log d)/ sqrt(d)) in expectation on d-regular graphs. In contrast, we show that all previously-studied online algorithms have competitive ratio strictly bounded away from one. Moreover, we show the convergence rate of our algorithm’s competitive ratio to one is nearly tight, as no algorithm achieves competitive ratio better than 1-O(1/sqrt(d)). Finally, we show that our algorithm yields a similar competitive ratio with high probability, as well as guaranteeing each vertex a probability of being matched tending to one.

Approximation-Variance Tradeoffs in Facility Location Games

Ariel Procaccia, David Wajc, Hanrui Zhang
Conference PaperAAAI 2018


We revisit the well-studied problem of constructing strategyproof approximation mechanisms for facility location games, but offer a fundamentally new perspective by considering risk averse designers. Specifically, we are interested in the tradeoff between a randomized strategyproof mechanism’s approximation ratio, and its variance (which has long served as a proxy for risk). When there is just one facility, we observe that the social cost objective is trivial, and derive the optimal tradeoff with respect to the maximum cost objective. When there are multiple facilities, the main challenge is the social cost objective, and we establish a surprising impossibility result: under mild assumptions, no smooth approximation-variance tradeoff exists. We also discuss the implications of our work for computational mechanism design at large.

A Faster Distributed Radio Broadcast Primitive (Extended Abstract)

Bernhard Haeupler, David Wajc
Conference PaperPODC 2016


We present a faster distributed broadcasting primitive for the classical radio network model.

The most basic distributed radio network broadcasting primitive - called Decay - dates back to PODC'87 result of Bar-Yehuda, Goldreich, and Itai. In any radio network with some informed source nodes, running Decay for O(d log n + log^2 n) rounds informs all nodes at most d hops away from a source with high probability. Since 1987 this primitive has been the most important building block for implementing many other functionalities in radio networks. The only improvements to this decades-old algorithm are slight variations due to [Czumaj, Rytter; FOCS'03] and [Kowalski and Pelc, PODC'03] which achieve the same functionality in O(d log(n/d) + log^2 n) rounds. To obtain a speedup from this, d and thus also the network diameter need to be near linear, i.e., larger than n^(1-eps).

Our new distributed primitive spreads messages for d hops in O(d (log n log log n)/(log d) + log^{O(1)} n) rounds with high probability. This improves over Decay for any super-polylogarithmic d = log^{omega(1)} n and achieves near-optimal O(d log log n) running time for d = n^eps. This also makes progress on an open question of Peleg.

Near-Optimum Online Ad Allocation for Targeted Advertising

Joseph (Seffi) Naor, David Wajc
Conference PaperEC 2015. Invited to TEAC Special Issue for EC 15


Motivated by Internet targeted advertising, we address several ad allocation problems. Prior work has established these problems admit no randomized online algorithm better than (1-1/e)-competitive ([Karp et al. 1990; Mehta et al. 2007]), yet simple heuristics have been observed to perform much better in practice. We explain this phenomenon by studying a generalization of the bounded-degree inputs considered by [Buchbinder et al. 2007), graphs which we call (k,d)-bounded. In such graphs the maximal degree on the online side is at most d and the minimal degree on the offline side is at least k. We prove that for such graphs, these problems' natural greedy algorithms attain competitive ratio 1-(d-1)/(k+d-1), tending to one as d/k tends to zero. We prove this bound is tight for these algorithms.

Next, we develop deterministic primal-dual algorithms for the above problems achieving competitive ratio 1-(1-1/d)k>1-1/e^{k/d}, or exponentially better loss as a function of k/d, and strictly better than 1-1/e whenever k ≥ d. We complement our lower bounds with matching upper bounds for the vertex- weighted problem. Finally, we use our deterministic algorithms to prove by dual-fitting that simple randomized algorithms achieve the same bounds in expectation. Our algorithms and analysis differ from previous ad allocation algorithms, which largely scale bids based on the spent fraction of their bidder's budget, whereas we scale bids according to the number of times the bidder could have spent as much as her current bid. Our algorithms differ from previous online primal-dual algorithms, as they do not maintain dual feasibility, but only primal-to-dual ratio, and only attain dual feasibility upon termination. We believe our techniques could find applications to other well- behaved online packing problems.

You Will Get Mail! Predicting the Arrival of Future Email

Iftah Gamzu, Zohar Shay Karnin, Yoelle Maarek, David Wajc
Conference PaperWWW (Companion Volume) 2015


The majority of Web email is known to be generated by machines even when one excludes spam. Many machine-generated email messages such as invoices or travel itineraries are critical to users. Recent research studies establish that causality relations between certain types of machine- generated email messages exist and can be mined. These relations exhibit a link between a given message to a past message that gave rise to its creation. For example, a shipment notification message can often be linked to a past online purchase message. Instead of studying how an incoming message can be linked to the past, we propose here to focus on predicting future email arrival as implied by causality relations. Such a prediction method has several potential applications, ranging from improved ad targeting in up sell scenarios to reducing false positives in spam detection.

We introduce a novel approach for predicting which types of machine-generated email messages, represented by so-called "email templates", a user should receive in future time windows. Our prediction approach relies on (1) statistically inferring causality relations between email templates, (2) building a generative model that explains the inbox of each user using those causality relations, and (3) combining those results to predict which email templates are likely to appear in future time frames. We present preliminary experimental results and some data insights obtained by analyzing several million inboxes of Yahoo Mail users, who voluntarily opted-in for such research.

Best-response dynamics out of sync: complexity and characterization

Roee Engelberg, Alex Fabrikant, Michael Schapira, David Wajc
Conference PaperEC 2013


In many computational and economic models of multi-agent interaction, each participant repeatedly "best-responds" to the others' actions. Game theory research on the prominent "best-response dynamics" model typically relies on the premise that the interaction between agents is somehow synchronized. However, in many real-life settings, e.g., internet protocols and large-scale markets, the interaction between participants is asynchronous. We tackle the following important questions: (1) When are best-response dynamics guaranteed to converge to an equilibrium even under asynchrony? (2) What is the (computational and communication) complexity of verifying guaranteed convergence? We show that, in general, verifying guaranteed convergence is intractable. In fact, our main negative result establishes that this task is undecidable. We exhibit, in contrast, positive results for several environments of interest, including complete, computationally-tractable, characterizations of convergent systems. We discuss the algorithmic implications of our results, which extend beyond best-response dynamics to applications such as asynchronous Boolean circuits.

On the complexity of vertex-coloring edge-weightings

Andrzej Dudek, David Wajc
Journal Paper Discrete Mathematics & Theoretical Computer Science 2011


Given a graph G=(V,E) and a weight function w:E →R, a coloring of vertices of G, induced by w, is defined by χw(v) = ∑_{e∋v} w(e) for all v∈V. In this paper, we show that determining whether a particular graph has a weighting of the edges from {1,2} that induces a proper vertex coloring is NP-complete.

Negative Association:

Definition, Properties, and Applications


In these notes we present the notion of Negative Association, discuss some of its useful properties, and end with some example applications. The slogan to bear in mind here is “independent, or better”.

I have had the pleasure of teaching various courses, both at the Technion and CMU.

At Carnegie Mellon

Probability and Computing (15-359/659):
Spring 2015

At the Technion:

Data Structures 1 (234218):  

Summer 2011Summer 2010

Introduction to Systems Programming (234122): 
Spring 2009

Introduction to CS (234114): 

``How do you pronounce 'Wajc'?", you ask?
Answer: The second syllable of the words e-vites and invites.
Using the International Phonetic Alphabet, Wajc would be spelled [vajts].

Still unsure? Check out this recording of Kira Radinsky:

Here's a subset of a Polish transcription table to explain how those letters are supposed to represent those sounds.

Ww       v     as in "vat"
  Aa     o     as in "hot"
  Jj     y     as in "yes"
  Cc     ts     as in "bits"

And before you ask: no, I don't speak Polish. Here's a relevant meme by Krzysztof Onak: