15-854 Approximation and Online Algorithms	02/21/00

* "buy at bulk network design" problem
* approximating metrics with tree metrics

===========

Last time: looked at approximate nearest neighbor.  Hard to do in high
dimensions.  Idea was: can project to low dimensional space O((log
n)/epsilon^2) and get distortion only (1+epsilon).  Then solve in this
lower dimensional space.  Use PLEB trick. 

Today we're going to look at a very different kind of "approximating a
complex metric space by a simple one".  Motivate with "buy at bulk
network design problem"
=======

The problem:  Given a weighted graph (e.g., telephone network) and a
collection of pairwise demands for bandwidth:  e.g., nodes A and B
need to communicate with bandwidth of 1mbit/sec, nodes C and D need to
communicate with 2mbit/sec, etc.  We're also given a pricing schedule
for how much it costs to buy a link of a certain length with a certain
amount of bandwidth.  The assumption is that the cost is linear in the
length, but may be sublinear in the bandwidth because of volume discount.
E.g., costs X dollars/mile per 1mbit/sec line, but maybe only 7X dollars/mile
for a 10mbit/sec line.  [Let's say we're given a chart of cost as a
function of demand, and this is sublinear.]  Our goal is to find the
cheapest way to buy lines to satisfy the demands. 

Notice that this problem is easy if the graph is a tree...why?

So, if we could approximate distances in our graph with a tree, then
solving on the tree would give us an approximate solution for our
original problem.  Unfortunately, lots of graphs (e.g., a cycle) can't
be approximated by a tree metric.  

So, this suggests a notion of randomized approximation [Bartal '96]

Let M be a metric space over a set of vertices V.  We say that a set S
of metric spaces (and distribution D over S) probabilistically
"alpha-approximates" M if for every u,v in V: 
(1) for every N in S, d_N(u,v) >= d_M(u,v)
(2) E_{N \leftarrow D}[d_N(u,v)] <= alpha d_M(u,v).

E.g., a cycle can be probabilistically 2-approximated by a line, by
cutting it at a random point.  No distance goes down.  Worst expected
blowup is factor of two between neighbors.

Claim: if can probabilistically alpha-approximate the shortest-path metric
in our given graph with a distrib over trees (with an efficient
construction), then this gives us a randomized alpha-approximation for
the buy-at-bulk problem. 

Why?  Look at mapping of OPT in graph G to new tree T.  Expected
blowup in length of any path is <= alpha, since the length of a path
is just a sum of lengths of edges. So, blowup in cost is <= alpha
(less if get more sharing).  Our cost in T is <= this cost.  Mapping
our solution back to graph G, we replace edge 
(u,v) used with shortest path from u to v in G [a canonical one if
there is more than one].  This only decreases our cost by property
(1).  So, expected value of our approx is <= alpha. 

THEOREM [Bartal '96]: Any n-point metric space can be probabilistically
alpha-approximated by a k-HST space (a special nice kind of tree that we
will define in a second) for alpha = O(k*log^2(n)).  [think of k as 2]

-> Improved to O(k*log(n)*loglog(n)) in [Bartal '98]

-> [Charikar et al]: do this with a distribution over only O(n log n)
trees.  Note: this then gives us a *deterministic* approximation algorithm.
[This would be a nice paper to present.  Uses LPs]

=======================================================================
k-HSTs / k-heirarchical metric spaces
-------------------------------------

A k-HST is a metric space with the property that you can partition the
nodes into clusters, and then recursively partition each cluster into
subclusters, and so on, so that 
(1) the distance between two nodes depends only on the smallest
	cluster that they both belong to, and
(2) the diameters of the clusters decrease by at least a factor of k
	on the way down.

These are really nice for divide-and-conquer types of algorithms, and
actually were motivated from online algorithms.

Can think of a k-HST space as a tree, where the points of the metric
space are at the leaves, internal nodes reprensent clusters [can
annotate with diameter of their space], and edge length is (1/2)(D_parent -
D_child).   

Note: only problem from point of view of buy-at-bulk problem is that
the internal nodes of the tree don't correspond to points of the
underlying metric space.  What we can do is identify each internal
node with an arbitrary point in the cluster when mapping our routes in
the tree back to routes in the original space, and so long as k>=2,
this will only cause a factor of 2 blowup.

=========================================================================
Let's prove the theorem first for the special case of n equally-spaced
points on a line.  [From the point of view of online algorithms, even
this special case had interesting implications].  Lets fix k=2 for
simplicity. 

So, we want to be able to view the line as clusters of diameter < n/2,
s.t. points in different clusters are distance at least n apart, and
then recursively each cluster breaks into two of at most half the
diameter, etc.

Algorithm: break the line at a random point x between 0 and n/2, and
then at x + n/2.  Define clusters to be n apart.  Prob that two
neighbors are cut is 2/n so their expected distance in new space is <
3.  Then recursively do the same for each cluster (the value of "n"
has dropped by at least a factor of 2).  For any two neighbors, their
expected distance in the new graph is O(log n).  So, for any two
nodes, their expected distance has blown up by at most O(log n) too.


Now, what about a general graph?

Let's fix k=2.  We'll prove a bound of O(log(n)*log(D)) where D is the
diameter of the space.  Then we will see how to replace D with n if we
want.

Idea: pick an arbitrary start node, and then a random radius according
to an exponential distribution.  I.e., we will flip a coin of bias p,
--- where let's say p = 8log(n)/D --- until it comes up heads.  The
number of coin flips will be the radius.  This will be one cluster.
Then we do it again with some point from the remaining graph, and so
on, until we have grouped everybody into clusters.  Then we recurse on
each.

First of all, we expect the radius of a cluster to be D/(8 log n).
So, it's unlikely (1/n^2) that it is more than D/4.  Let's assume that
in fact this never happens --- so the diameter of each cluster is at most
D/2. 

Now, what about the blowup of an edge?  Let's pick an edge (u,v) of
length L and ask for the chance it ever gets cut at the top level.
When we pick our start node and grow the radius, either both are
inside the cluster, or one is inside and one is outside, or both are
outside.  Can think of this as "win / lose / play again".  So, what we
really care about is Pr(lose)/Pr(win or lose).  
I.e., what is the probability that this edge is cut GIVEN that the
radius has reached the nearer of {u,v} to the start?  The answer is
that it is at most L*p.  So, the expected new length is at most LpD +
(1-Lp)L, which is at most L(8 log(n) + 1).

The number of iterations is at most log(D).  So, this gets us the
log(n)*log(D) blowup.

To replace the log(D) with a log(n): if the radius we pick ever cuts a
tiny edge (edge length < D/n^2, say), then we pick again.  The chance
we have to pick again is only a constant so only a constant blowup.
This ensures that each edge only has to worry about O(log n)
iterations.