The Greedy Method

PGSS Computer Science Core Slides

Eve: My favorite technique!

Sometimes, the best way to approach a problem is to greedily take the best-looking item until the goal is met.

We examine

Minimum spanning trees
Set covers

Minimum Spanning Trees

Problem class MinimumSpanningTree:

Input: graph (V, E) with weights w: E->positive reals.
Output: subgraph (V, E') connecting all vertices, with minimum total weight.

application: find the cheapest way to connect offices with a fiber optic network.

        500
      o --- o
      |    /| 
  200 | 100 | 2000
      |  /  |
      | /   |
      o --- o
        5000

A Greedy Approach

Algorithm Kruskal
E' <- empty set
while E' does not connect all vertices do
  e <- next lightest edge
  if e connects vertices E' does not connect then
    E' <- E' union { e }
  fi
od
return E'
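
A minimal Python sketch of the pseudocode above, run on the four-office graph. It tests "e connects vertices E' does not connect" with a union-find structure, which is one common choice and not something the slides specify. The corner labels TL, TR, BL, BR and the function name kruskal are hypothetical names chosen for illustration, and the graph is assumed to be connected.

```python
def kruskal(vertices, edges):
    """Return the (weight, u, v) edges kept by the greedy method.

    edges is an iterable of (weight, u, v) tuples; the graph is assumed connected.
    """
    # Union-find: parent[v] points toward the representative of v's component.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the chains short
            v = parent[v]
        return v

    chosen = []
    for weight, u, v in sorted(edges):      # next lightest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # e joins two unconnected groups
            parent[root_u] = root_v
            chosen.append((weight, u, v))
        if len(chosen) == len(parent) - 1:  # E' now connects all vertices
            break
    return chosen


# Hypothetical labels for the four corners of the example graph.
offices = ["TL", "TR", "BL", "BR"]
links = [(500, "TL", "TR"), (200, "TL", "BL"), (100, "TR", "BL"),
         (2000, "TR", "BR"), (5000, "BL", "BR")]
tree = kruskal(offices, links)
print([w for w, _, _ in tree], sum(w for w, _, _ in tree))  # [100, 200, 2000] 2300
```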

example:

        500
      o --- o
      |    /| 
  200 | 100 | 2000
      |  /  |
      | /   |
      o --- o
        5000

We look at the 100-weight edge first. It connects two unconnected vertices, so we include it.

Look at the 200-weight edge. It connects two unconnected vertices, so we include it.

Now try the 500-weight edge. Its endpoints are already connected; ignore it.

We look at the 2000-weight edge, and include it. All vertices are now connected, so we stop; the 5000-weight edge is never examined.

      o     o
      |    /|  
      |   / | 2000
      |  /  |
      | /   |
      o     o

Greed Works

Say Kruskal returns E'. Take any set F connecting all vertices. We repeatedly change F to make it more like E', never increasing its total weight. This implies that no F is cheaper than E', so E' is an optimal solution.

Let e be an edge in F but not in E'. Remove it. If F still connects all vertices, we have made F cheaper. Otherwise, removing e splits the vertices into two groups, A and B:

      /----\
      |A   |
      |  o |
      \--|-/
         |e
      /--|-\
      |  o |
      |B   |
      \----/

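Let e' be the cheapest edge between A and B (assume no two edges have the same weight). Kruskal includes the cheapest edge joining any two unconnected groups, so E' includes e'. Since e is not in E', e is not e', and e' is cheaper than e. So F is better with e replaced by e': it still connects all vertices and costs less.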
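
The optimality claim can also be checked by brute force on the four-office example. The sketch below tries every subset of the five edges and keeps the cheapest one that connects all vertices. The corner labels and the helper names connects_all and cheapest_connecting_weight are hypothetical, and this exhaustive search is only practical for very small graphs.

```python
from itertools import combinations


def connects_all(vertices, edge_subset):
    """Return True if the given edges link every vertex into one component."""
    vertices = list(vertices)
    reached = {vertices[0]}
    frontier = [vertices[0]]
    while frontier:
        x = frontier.pop()
        for _, u, v in edge_subset:
            for a, b in ((u, v), (v, u)):   # edges are undirected
                if a == x and b not in reached:
                    reached.add(b)
                    frontier.append(b)
    return len(reached) == len(vertices)


def cheapest_connecting_weight(vertices, edges):
    """Try every edge subset; return the least total weight that connects all vertices."""
    best = None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if connects_all(vertices, subset):
                total = sum(w for w, _, _ in subset)
                if best is None or total < best:
                    best = total
    return best


# Hypothetical labels for the four corners of the example graph.
offices = ["TL", "TR", "BL", "BR"]
links = [(500, "TL", "TR"), (200, "TL", "BL"), (100, "TR", "BL"),
         (2000, "TR", "BR"), (5000, "BL", "BR")]
print(cheapest_connecting_weight(offices, links))  # 2300, the weight Kruskal found
```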

Set Cover

Problem class SetCover:

Input: base set S, collection C of subsets of S.
Output: subcollection C' with fewest subsets from C whose union is S.

application: build as few fire hydrants as possible so every house is next to one.

      *---o---o---*---o
       \_  \_ | _/| _/
         \   \|/  |/
          o---o---o
           \_____/
* represents the optimal hydrant placement

We take S to be the vertices (houses) of the graph; for each vertex v, we place into C the set of v and its adjacent vertices.
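
A small Python sketch of this construction. The edge list is our reading of the figure above and should be treated as an assumption; the labels t1..t5 (top row) and b1..b3 (bottom row) are hypothetical names for the houses.

```python
def closed_neighborhoods(adjacency):
    """Map each vertex v to the set containing v and its adjacent vertices."""
    return {v: {v} | set(neighbors) for v, neighbors in adjacency.items()}


# Streets as read from the figure (an assumption, not given in the slides).
streets = [("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5"),
           ("t1", "b1"), ("t2", "b2"), ("t3", "b2"), ("t4", "b2"),
           ("t4", "b3"), ("t5", "b3"),
           ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]

adjacency = {}
for u, v in streets:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

S = set(adjacency)                 # base set: the houses
C = closed_neighborhoods(adjacency)  # one subset per possible hydrant site

print(sorted(C["t4"]))             # ['b2', 'b3', 't3', 't4', 't5']
print(C["t1"] | C["t4"] == S)      # True: the two starred houses cover S
```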

fact: No polynomial-time exact algorithm is known, and none exists unless P = NP.

Greedy Approximation

Algorithm Greedy-Set-Cover
C' <- empty set
U <- S
while U != empty set do
  A <- subset from C covering the most of U
  C' <- C' union { A }
  U <- U minus A
od
return C'
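
The pseudocode translates almost line for line into Python. This is a sketch under two assumptions that are not in the slides: C is given as a dictionary from a subset's name to its items, and ties are broken by dictionary order. The small instance at the end is made up for illustration.

```python
def greedy_set_cover(S, C):
    """Return the names of the subsets chosen greedily; C maps a name to a set."""
    uncovered = set(S)                      # U <- S
    chosen = []                             # C' <- empty set
    while uncovered:
        # A <- subset from C covering the most of U
        best = max(C, key=lambda name: len(C[name] & uncovered))
        if not C[best] & uncovered:
            raise ValueError("the subsets in C do not cover S")
        chosen.append(best)                 # C' <- C' union { A }
        uncovered -= C[best]                # U <- U minus A
    return chosen


# A made-up instance for illustration.
S = {1, 2, 3, 4, 5, 6}
C = {"a": {1, 2, 3, 4}, "b": {1, 2, 5}, "c": {3, 4, 6}, "d": {5, 6}}
print(greedy_set_cover(S, C))  # ['a', 'd']
```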

example:

      *---o---o---o---*
       \_  \_ | _/| _/
         \   \|/  |/
          o---*---o
           \_____/
* represents the hydrant placement chosen by the algorithm

Note that this is not optimal! But it is close.
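
The gap between greedy and optimal can be reproduced in a few lines. The sketch below runs the greedy method on the hydrant graph and compares it with an exhaustive search for the smallest cover. The edge list is our reading of the figure (an assumption), the labels t1..t5 and b1..b3 are hypothetical, and ties are broken alphabetically, so the greedy choices after the first may differ from the starred houses even though the count is the same.

```python
from itertools import combinations

# Streets as read from the figure (an assumption): t1..t5 is the top row,
# b1..b3 the bottom row.
streets = [("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5"),
           ("t1", "b1"), ("t2", "b2"), ("t3", "b2"), ("t4", "b2"),
           ("t4", "b3"), ("t5", "b3"),
           ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]

# C maps each house to the set of houses a hydrant there would serve.
C = {}
for u, v in streets:
    C.setdefault(u, {u}).add(v)
    C.setdefault(v, {v}).add(u)
S = set(C)


def greedy_cover(S, C):
    """Repeatedly pick the site serving the most still-uncovered houses."""
    uncovered, chosen = set(S), []
    while uncovered:
        best = max(sorted(C), key=lambda name: len(C[name] & uncovered))
        chosen.append(best)
        uncovered -= C[best]
    return chosen


def smallest_cover(S, C):
    """Exhaustive search: try all site choices of size 1, then 2, and so on."""
    for size in range(1, len(C) + 1):
        for names in combinations(sorted(C), size):
            if set().union(*(C[n] for n in names)) == S:
                return list(names)
    return None


greedy = greedy_cover(S, C)
optimal = smallest_cover(S, C)
print(len(greedy), len(optimal))  # 3 2
```

The first greedy pick is the bottom-middle house, which serves six houses but leaves the two far corners to be covered separately.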

Spot: Arf!

Performance

Theorem: Let n = |S| and k be the size of the optimal solution. Greedy-Set-Cover returns at most k ln n + 1 subsets.

The k sets of the optimal solution cover all n items, so one of them has >= n / k items. The first set added is at least that large, leaving <= n (1 - 1/k) items in U. U is still covered by the optimal solution, so the next set added to C' covers >= |U|/k items of U, leaving <= (1 - 1/k) |U| <= n (1 - 1/k)^2 items in U. Generally, after the ith iteration, U has <= n (1 - 1/k)^i items.

When does this reach 1? Note that

(1 - 1/k)^k <= 1/e.
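
This inequality follows from 1 - x <= e^(-x) with x = 1/k. A quick numerical check in Python, for a handful of arbitrarily chosen values of k, is sketched here; it illustrates the inequality but does not prove it.

```python
import math

# Compare (1 - 1/k)^k with 1/e for a few sample values of k.
samples = (1, 2, 5, 10, 100, 1000)
results = {k: (1 - 1 / k) ** k for k in samples}
for k, value in results.items():
    print(k, round(value, 5), value <= 1 / math.e)
```

The left side climbs toward 1/e (about 0.36788) as k grows but stays below it.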

We use this to simplify our expression.

n (1 - 1/k)^i = n ((1 - 1/k)^k)^(i / k) <= n (1/e)^(i/k)

Now we work backwards to see how large i must be for this to be at most 1.

  n (1/e)^(i/k) <= 1
     n <= e^(i/k)
     ln n <= i/k
     i >= k ln n

So after at most k ln n iterations there will be at most 1 item left. Picking up this item will take one more iteration. So C' has at most

  k ln n + 1

subsets.
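
To see the bound in numbers, the sketch below finds by direct search the first i with n (1 - 1/k)^i <= 1 and prints it next to k ln n. The function name and the sample values of n and k are arbitrary choices for illustration; the first pair matches the hydrant example (n = 8 houses, optimal k = 2).

```python
import math


def iterations_until_at_most_one(n, k):
    """Smallest i with n * (1 - 1/k)**i <= 1, found by repeated multiplication."""
    i = 0
    remaining = float(n)
    while remaining > 1:
        remaining *= 1 - 1 / k   # each iteration leaves at most this fraction
        i += 1
    return i


for n, k in [(8, 2), (100, 5), (1000, 10)]:
    i = iterations_until_at_most_one(n, k)
    print(n, k, i, round(k * math.log(n), 2))
```

For the hydrant example the theorem allows up to 2 ln 8 + 1, a little over 5 subsets; the greedy method used 3.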