15-854 4/19/00
The k-server problem
======================================
- Recall k-server problem: we control k "servers", each located at
some point in a metric space. Get a sequence of requests to points in
the space. Each time we get a request, we need to move one of our
servers there. Generalizes paging, weighted caching, and other problems.
- Will consider on finite metric space of n>>k points for simplicity
(results don't depend on space being finite, but it simplifies the
arguments)
- Can view as MTS problem on {n \choose k} states. In each task
vector, task costs are 0 or infinity. So, only paying movement costs.
- k-server conjecture: can you achieve C.R. of k? O(k)? poly(k)? f(k)?
- note, det MTS results no good since they scale like n^k. Actually,
the randomized MTS alg gets close: poly(k*log(n)).
First result by Fiat-Rabani-Ravid: k^O(k). [complicated alg]
Improvement by Grove: 2^k. [harmonic alg]
Koutsoupias-Papadimitriou: 2k-1 [work-function alg]
We will go through argument in more recent paper by Koutsoupias, that
simplifies things a bit. Big open question: can you use
randomization to get o(k)?
=========================================================================
Before going into WF alg, let's start with some simpler algs and see
why they don't work.
Strawman alg #1: greedy. Always move the closest server. What's a
bad example?
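Here is the classic bad example, sketched in Python (the numbers and the name `greedy_cost` are mine): two nearby points plus one far-away server. Greedy shuttles the nearby server back and forth forever, rather than pay once to bring in the far server.

```python
def greedy_cost(servers, requests):
    """Cost of the greedy k-server algorithm on the line metric
    (points are numbers, distance is absolute difference)."""
    servers = list(servers)
    cost = 0
    for r in requests:
        # always move the server closest to the request
        i = min(range(len(servers)), key=lambda j: abs(servers[j] - r))
        cost += abs(servers[i] - r)
        servers[i] = r
    return cost

# Servers at 0 and 1000; requests alternate between the nearby points
# 1 and 0.  Greedy pays 1 per request forever, while OPT pays 999 once
# to bring the far server over and then serves everything for free.
requests = [1, 0] * 5000
print(greedy_cost([0, 1000], requests))   # 10000, vs. OPT = 999
```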
Strawman alg #2: OPT-chasing. Compute opt cost in hindsight of ending
in each of the k possible configurations you get by moving one
server to the request. Move the server that leads you to the one
that's smallest. This can do badly too.
(k=3, 4 pts like this: **.................**)
Work-function algorithm: Say you're in state X and consider the
configurations you get by moving one server to the request. Go to
state Y that minimizes OPT(Y) + d(X,Y). I.e., you're minimizing the
sum of distance traveled and the OPT cost to end there.
[Note: we will use OPT(X) to denote the optimal cost of ending in
config X. The papers use w(X) and call w the "work-function". I'll
just call it OPT. I will also say "state" and "configuration"
interchangeably]
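As a concrete sketch (my code, brute force and exponential in k, for illustration only): maintain OPT_i(X) for every configuration X via the standard recurrence OPT_i(X) = min_{x in X} [OPT_{i-1}(X - x + r_i) + d(r_i, x)], then make the move described above.

```python
from itertools import combinations_with_replacement, permutations

def matching_dist(X, Y, dist):
    # cheapest way to move the servers from configuration X to Y
    # (min-cost matching, brute force over permutations; fine for small k)
    return min(sum(dist(a, b) for a, b in zip(X, p))
               for p in permutations(Y))

def wfa(points, dist, start, requests):
    """Work-function algorithm on a finite metric, by brute force.
    Configurations are sorted k-tuples (multisets of points);
    requests must be points of the space."""
    k = len(start)
    cur = tuple(sorted(start))
    configs = list(combinations_with_replacement(sorted(points), k))
    # OPT_0(X) = cost of moving from the start configuration to X
    w = {X: matching_dist(cur, X, dist) for X in configs}
    cost = 0
    for r in requests:
        # work-function update:
        #   OPT_i(X) = min_{x in X} OPT_{i-1}(X - x + r) + d(r, x)
        w = {X: min(w[tuple(sorted(X[:j] + X[j+1:] + (r,)))] + dist(r, X[j])
                    for j in range(k))
             for X in configs}
        # WF move: go to Y = cur - x + r minimizing OPT_i(Y) + d(x, r)
        j = min(range(k),
                key=lambda m: w[tuple(sorted(cur[:m] + cur[m+1:] + (r,)))]
                              + dist(cur[m], r))
        cost += dist(cur[j], r)
        cur = tuple(sorted(cur[:j] + cur[j+1:] + (r,)))
    return cur, cost

d = lambda a, b: abs(a - b)                       # line metric
print(wfa([0, 5, 10], d, (0,), [5, 10, 5]))       # -> ((5,), 15)
```

(For k=1 the algorithm has no choice but to chase each request, which is what the example shows.)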
Let's recall MTS setting. There we looked at the WF alg a little
differently. Let's just check that they are the same thing. There we
said "if the current task causes your state to become pinned, move to
a state that's pinning you and is not pinned by anyone else".
X is "pinned" by Y if OPT(X) = OPT(Y) + d(X,Y). Our motivation was
that adversary could penalize X without increasing OPT values, so we'd
better not be there, and if we move in this way, then we can charge
off our distance moved to the increase in OPT.
Why they are the same: point is that since cost of servicing task in
X is infinity, we know X becomes pinned, and in fact it is pinned by
one of those k configurations. [why?] The argmin is the one pinning us.
The MTS view actually gives us a nice fact about the WF alg which will
be the starting point of the analysis. Say at time i-1, the alg is in
state X. Then we get request i, OPT(X) goes up by some amount
up_i, so we move to some new state Y. Let "UP" be the sum of the up_i.
Since every time we move some distance d, we decrease OPT at our state
by d (by def of the algorithm), we know the alg's cost is <= UP. In
fact, we can never get below the global OPT, so this means that
UP >= alg + OPT. So, all we need to do is prove that UP <= 2k*OPT + const:
then alg <= UP - OPT <= (2k-1)*OPT + const, which is the 2k-1 bound.
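Spelled out (writing X_i for the algorithm's state after request i, with X_0 the start and OPT_0(X_0) = 0):

```latex
\begin{align*}
\mathrm{OPT}_i(X_i)
  &= \mathrm{OPT}_i(X_{i-1}) - d(X_{i-1}, X_i)
     && \text{($X_{i-1}$ is pinned by $X_i$)} \\
  &= \mathrm{OPT}_{i-1}(X_{i-1}) + \mathit{up}_i - d(X_{i-1}, X_i)
     && \text{(definition of $\mathit{up}_i$)}
\end{align*}
```

Summing over i telescopes to OPT_T(X_T) = UP - alg, and since OPT_T(X_T) >= OPT, we get UP >= alg + OPT.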
----------
Simplification #1. Instead of keeping track of the state of
the online alg, let's prove something harder, namely that
sum_i max_X(OPT_i(X) - OPT_{i-1}(X)) <= 2k*OPT + const.
I.e., instead of using the increase of OPT at the algorithm's current
state, we're using the largest increase of OPT (the "work function") in
that time step. The LHS here is sometimes called the "pseudocost" of
the algorithm. This simplifies things since we no longer need to reason
about the algorithm's state at all.
----------
Simplification #2. Which state X gives the max in the above
expression? It turns out that by augmenting the space a bit, we can
get a handle on which state it is, as a function of just the current
request.
Idea: we'll augment the metric space so that every point has an
"antipode". Let D be the diameter of the space. The antipode \bar{a}
of a point a has the property that:
1. d(a, \bar{a}) = D
2. for all points b, d(a,b) + d(b,\bar{a}) = D.
[yes, 2 implies 1]
For instance, think of a circle.
How to augment? Just create a new space \bar{M} that looks exactly
like the original space M, and set d(b,\bar{a}) as in the above
definition.
Let's just verify that we haven't violated anything. Distances > 0.
Triangle inequality: d(a, \bar{b}) <= d(a,c) + d(c, \bar{b})?
RHS = d(a,c) + (D - d(c,b)) >= D - d(a,b) = LHS
[since d(a,c) + d(a,b) >= d(c,b), by the triangle inequality in M]
Also, make sure things are symmetric: d(a,b) = d(\bar{a},\bar{b}) by
design. d(a,\bar{b}) = d(\bar{a},b) too.
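A quick brute-force check of these axioms on a small instance (a line metric, where D is realized at the endpoints; `augmented` is my name for the construction):

```python
from itertools import product

def augmented(points, d, D):
    """Augment a finite metric (points, d) of diameter D with an
    antipode bar(a) for every point a: distances within each copy are
    unchanged, and d(a, bar(b)) = D - d(a, b) across copies."""
    P = [(p, False) for p in points] + [(p, True) for p in points]
    def dist(u, v):
        (a, sa), (b, sb) = u, v
        return d(a, b) if sa == sb else D - d(a, b)
    return P, dist

points = [0, 2, 7, 10]                 # a line metric; diameter D = 10
d = lambda a, b: abs(a - b)
P, dist = augmented(points, d, 10)

# symmetry, triangle inequality, and the two antipode properties
assert all(dist(u, v) == dist(v, u) for u, v in product(P, P))
assert all(dist(u, w) <= dist(u, v) + dist(v, w)
           for u, v, w in product(P, P, P))
assert all(dist((a, False), (a, True)) == 10 for a in points)
assert all(dist((a, False), b) + dist(b, (a, True)) == 10
           for a in points for b in P)
print("all axioms hold")
```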
Here's the claim: if the ith request is to some point r, then the max
increase of OPT occurs at the state X consisting of all k servers at
\bar{r}.
[We'll get back to proving or at least verifying some small examples
of this at the end. For now, let's believe it and go on.]
So, let's use r_i to denote the ith request, and define X_i to be the
state consisting of all k servers at the point \bar{r_i}. So, we've
simplified the problem to proving that:
sum_i (OPT_i(X_i) - OPT_{i-1}(X_i)) <= 2k*OPT + const.
We'll prove for const = k^2 * D.
[note: the proof can be made to work without needing a finite number of
points or even a bounded diameter, but finiteness simplifies our lives]
---------
Let's prove it. Easier to view our goal this way:
sum_i OPT_i(X_i) <= sum_i OPT_{i-1}(X_i) + 2k*OPT + (k^2 * D)
It's actually now pretty simple. Look at each OPT_i(X_i) separately.
Let's look at how the optimal in hindsight for the entire sequence
actually serviced requests. Where did the server that served r_i go
next? Two possibilities:
Possibility #1: nowhere. r_i was a terminal point for that server.
In that case, we'll just use the simple fact that OPT_i(X_i) can't be
much more than the overall OPT in the end. At worst, it's OPT + k*D.
Since there are at most k of these terminal values of i, this costs us
a total of k*OPT + k^2 * D. (That's where the additive constant and
half of the 2k come from.)
Possibility #2: it goes to r_j for some j>i. In that case, we'll
bound OPT_i(X_i) by using the fact that
OPT_i(X_i) <= OPT_{j-1}(X_j) + k*d(r_i, r_j)
This is just because one way of ending at X_i after serving the first
i requests is to actually serve even more requests (the first j-1,
where j-1 >= i) and end at X_j, and then move the state from X_j to
X_i. Notice that d(X_j,X_i) = k*d(r_i,r_j).
This actually finishes it off since we've matched each OPT term on the
LHS to a different OPT term on the RHS, and since OPT's server actually
travels from r_i to r_j, the additive portions satisfy
sum_i k*d(r_i,r_j) <= k*OPT.
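In symbols, with T the set of (at most k) terminal indices and j(i) the next request served by r_i's server:

```latex
\sum_i \mathrm{OPT}_i(X_i)
  \;\le\; \sum_{i \in T} \bigl(\mathrm{OPT} + kD\bigr)
  \;+\; \sum_{i \notin T} \Bigl(\mathrm{OPT}_{j(i)-1}\bigl(X_{j(i)}\bigr)
          + k\, d\bigl(r_i, r_{j(i)}\bigr)\Bigr)
  \;\le\; \sum_i \mathrm{OPT}_{i-1}(X_i) + 2k \cdot \mathrm{OPT} + k^2 D,
```

using that the j(i) are distinct (so each RHS term is used once), |T| <= k, and sum_{i not in T} d(r_i, r_{j(i)}) <= OPT (these are moves the optimal schedule actually makes).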
So, we're all done modulo that simplification we made.
[Note: seems like ought to be room for improvement with possibility #1
since we were so sloppy]
----------
Last part: why was simplification #2 OK?
--> do some examples.
--> comes from "quasiconvexity". For any time i, any states X and Y,
we have: for any point x in X, there exists y in Y such that
OPT_i(X) + OPT_i(Y) >= OPT_i(X - x + y) + OPT_i(Y + x - y).
--> why it's true
- find an alternating path
--> how to use quasiconvexity
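As a sanity check, quasiconvexity can be verified by brute force on small instances. This recomputes the work function via the standard recurrence OPT_i(X) = min_{x in X} [OPT_{i-1}(X - x + r_i) + d(r_i, x)]; all names here are mine.

```python
from itertools import combinations_with_replacement, permutations

def work_function(points, dist, start, requests):
    """OPT_i(X) for every configuration X (sorted k-tuples), brute force,
    via OPT_i(X) = min_{x in X} [OPT_{i-1}(X - x + r_i) + d(r_i, x)]."""
    k = len(start)
    configs = list(combinations_with_replacement(sorted(points), k))
    # OPT_0(X) = min-cost matching from the start configuration to X
    w = {X: min(sum(dist(a, b) for a, b in zip(start, p))
                for p in permutations(X))
         for X in configs}
    for r in requests:
        w = {X: min(w[tuple(sorted(X[:j] + X[j+1:] + (r,)))] + dist(r, X[j])
                    for j in range(k))
             for X in configs}
    return w

def quasiconvex(w):
    """Check: for all configs X, Y and every x in X, is there y in Y
    with w(X) + w(Y) >= w(X - x + y) + w(Y - y + x)?"""
    for X in w:
        for Y in w:
            for i, x in enumerate(X):
                if not any(w[X] + w[Y] >=
                           w[tuple(sorted(X[:i] + X[i+1:] + (y,)))] +
                           w[tuple(sorted(Y[:j] + Y[j+1:] + (x,)))]
                           for j, y in enumerate(Y)):
                    return False
    return True

d = lambda a, b: abs(a - b)
w = work_function([0, 3, 7], d, (0, 7), [3, 0, 7])
print(quasiconvex(w))   # True on this instance
```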