# The Greedy Method

PGSS Computer Science Core Slides
Eve: My favorite technique!

Sometimes, the best way to approach a problem is to greedily take the best-looking item until the goal is met.

We examine

• Minimum spanning trees
• Set covers

### Minimum Spanning Trees

Problem class MinimumSpanningTree:

Input: graph (V, E) with weights w: E->positive reals.
Output: subgraph (V, E') connecting all vertices, with minimum total weight.

application: find the cheapest way to connect offices with a fiber optic network.

```
        500
      o --- o
      |    /|
  200 | 100 | 2000
      |  /  |
      | /   |
      o --- o
        5000
```

### A Greedy Approach

```
Algorithm Kruskal
  E' <- empty set
  while E' does not connect all vertices do
    e <- next lightest edge
    if e connects vertices E' does not connect then
      E' <- E' union { e }
    fi
  od
  return E'
```
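The pseudocode can be sketched in Python as follows. This is one possible rendering, not the slides' own code: the test "e connects vertices E' does not connect" is done here with a union-find structure, which is an assumption of this sketch. The vertex labels `TL`, `TR`, `BL`, `BR` (top-left, top-right, and so on) are hypothetical names for the four offices in the picture above.

```python
def kruskal(vertices, edges):
    """Return the edges kept by Kruskal's method, given (weight, u, v) triples."""
    # parent[v] points toward the representative of v's connected group.
    parent = {v: v for v in vertices}

    def find(v):
        # Walk up to the representative, shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    chosen = []
    for weight, u, v in sorted(edges):        # lightest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                  # e joins two unconnected groups
            parent[root_u] = root_v
            chosen.append((weight, u, v))
            if len(chosen) == len(parent) - 1:
                break                         # all vertices are connected
    return chosen


# The four-office graph, with hypothetical vertex labels.
vertices = ["TL", "TR", "BL", "BR"]
edges = [
    (500, "TL", "TR"),
    (200, "TL", "BL"),
    (100, "TR", "BL"),
    (2000, "TR", "BR"),
    (5000, "BL", "BR"),
]
tree = kruskal(vertices, edges)
print(tree)
print(sum(weight for weight, _, _ in tree))
```

Sorting the edges once up front plays the role of "next lightest edge"; on a graph that is not connected, the loop simply runs out of edges instead of looping forever.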

example:

```
        500
      o --- o
      |    /|
  200 | 100 | 2000
      |  /  |
      | /   |
      o --- o
        5000
```
We look at the 100-weight edge first. It connects two unconnected vertices, so we include it.
```
      o     o
           /
        100
         /
        /
      o     o
```
Look at the 200-weight edge. It connects two unconnected vertices, so we include it.
```
      o     o
      |    /
  200 |   /
      |  /
      | /
      o     o
```
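Now try the 500-weight edge. Its endpoints are already connected; ignore it. We look at the 2000-weight edge, and include it. All vertices are now connected, so we stop without examining the 5000-weight edge. The total weight is 100 + 200 + 2000 = 2300.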
```
      o     o
      |    /|
      |   / | 2000
      |  /  |
      | /   |
      o     o
```

### Greed Works

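Say Kruskal returns E'. Take any set F connecting all vertices. We change F step by step to make it more like E' without ever increasing its total weight. Repeating this turns F into E', which implies that E' is an optimal solution.

Let e be an edge in F but not in E'. Remove it. If this disconnects nothing, we have improved the set and we are done with this step. Otherwise removing e splits the vertices into two groups, A and B: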

```
      /----\
      |A   |
      |  o |
      \--|-/
         |e
      /--|-\
      |  o |
      |B   |
      \----/
```
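Let e' be the cheapest edge between A and B (the first one Kruskal examines, if several tie). When Kruskal reaches e', no edge examined so far joins A to B, so e' connects vertices that E' does not yet connect, and Kruskal adds it. Hence E' includes e', so e is not e'. Since e' weighs no more than e, replacing e with e' reconnects F without increasing its weight.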
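The claim can be sanity-checked on the four-office graph by brute force. The sketch below tries every subset of edges, keeps those that connect all vertices, and reports the smallest total weight. It is an illustration on one small graph, not a proof, and the vertex labels (`TL`, `TR`, `BL`, `BR`) and function names are hypothetical.

```python
from itertools import combinations


def connects_all(vertices, edge_subset):
    """True if the given edges link every vertex to every other one."""
    reached = {vertices[0]}
    frontier = [vertices[0]]
    while frontier:
        here = frontier.pop()
        for _, u, v in edge_subset:
            # Each edge can be walked in either direction.
            for a, b in ((u, v), (v, u)):
                if a == here and b not in reached:
                    reached.add(b)
                    frontier.append(b)
    return len(reached) == len(vertices)


def cheapest_connecting_weight(vertices, edges):
    """Minimum total weight over all edge subsets that connect every vertex."""
    best = None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if connects_all(vertices, subset):
                total = sum(weight for weight, _, _ in subset)
                if best is None or total < best:
                    best = total
    return best


vertices = ["TL", "TR", "BL", "BR"]
edges = [
    (500, "TL", "TR"),
    (200, "TL", "BL"),
    (100, "TR", "BL"),
    (2000, "TR", "BR"),
    (5000, "BL", "BR"),
]
best = cheapest_connecting_weight(vertices, edges)
print(best)  # compare with the 100 + 200 + 2000 chosen in the example
```

Exhaustive search is only feasible here because the graph has five edges (32 subsets); the point of the exchange argument is that Kruskal reaches the same weight without it.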

### Set Cover

Problem class SetCover:

Input: base set S, collection C of subsets of S.
Output: subcollection C' with fewest subsets from C whose union is S.

application: build as few fire hydrants as possible so every house is next to one.

```
      *---o---o---*---o
       \_  \_ | _/| _/
         \   \|/  |/
          o---o---o
           \_____/
* represents the optimal hydrant placement
```
We take S to be the vertices (houses) of the graph; for each vertex v, we place into C the set of v and its adjacent vertices.
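A minimal sketch of this construction, assuming one reading of the hydrant figure: the labels `t1`..`t5` (top row, left to right) and `b1`..`b3` (bottom row) are hypothetical, and the edge list is transcribed from the ASCII drawing, so it may differ from the original slide if the drawing was misread.

```python
# Edges as read off the figure (an assumption of this sketch).
edges = [
    ("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5"),   # top row
    ("t1", "b1"), ("t2", "b2"), ("t3", "b2"),                  # links down
    ("t4", "b2"), ("t4", "b3"), ("t5", "b3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),                  # bottom row and arc
]


def closed_neighborhoods(edges):
    """Map each vertex v to the set containing v and its adjacent vertices."""
    C = {}
    for u, v in edges:
        C.setdefault(u, {u}).add(v)
        C.setdefault(v, {v}).add(u)
    return C


C = closed_neighborhoods(edges)
S = set(C)                        # the base set: every house

# The two starred houses in the figure are t1 and t4 under this labeling.
print(C["t1"] | C["t4"] == S)     # do two hydrants reach every house?
print(max(len(A) for A in C.values()), len(S))   # no single hydrant suffices
```

Under this reading, the largest subset has 6 of the 8 houses, so one hydrant is never enough and the starred pair is as small as a cover can be.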

fact: SetCover is NP-hard, so no exact algorithm runs in polynomial time unless P = NP; all known exact algorithms take exponential time.

### Greedy Approximation

```
Algorithm Greedy-Set-Cover
  C' <- empty set
  U <- S
  while U != empty set do
    A <- subset from C covering the most of U
    C' <- C' union { A }
    U <- U minus A
  od
  return C'
```

example:

```
      *---o---o---o---*
       \_  \_ | _/| _/
         \   \|/  |/
          o---*---o
           \_____/
* represents the hydrant placement chosen by the algorithm
```
Note that this is not optimal! But it is close.
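The algorithm and the example above can be sketched in Python as follows. The vertex labels (`t1`..`t5` for the top row, `b1`..`b3` for the bottom row) and the edge list are assumptions read off the ASCII figure. Ties are broken by dictionary order here, so the second and third picks may differ from the starred vertices in the figure while the count stays the same.

```python
def greedy_set_cover(S, C):
    """C maps a label to a subset of S; return the labels in the order chosen."""
    uncovered = set(S)
    chosen = []
    while uncovered:
        # Pick the subset covering the most still-uncovered items.
        best = max(C, key=lambda label: len(C[label] & uncovered))
        if not C[best] & uncovered:
            raise ValueError("the subsets in C do not cover S")
        chosen.append(best)
        uncovered -= C[best]
    return chosen


# Hydrant graph as read from the figure (hypothetical labels).
edges = [
    ("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5"),
    ("t1", "b1"), ("t2", "b2"), ("t3", "b2"),
    ("t4", "b2"), ("t4", "b3"), ("t5", "b3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
]
C = {}
for u, v in edges:
    C.setdefault(u, {u}).add(v)   # each house covers itself ...
    C.setdefault(v, {v}).add(u)   # ... and the houses next to it
S = set(C)

hydrants = greedy_set_cover(S, C)
print(hydrants)
print(len(hydrants))
```

The first pick is the middle bottom house, which reaches six houses at once. That leaves the two corner houses, which no single hydrant reaches together, so greedy ends with three hydrants where two would do.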

Spot: Arf!

### Performance

Theorem: Let n = |S| and k be the size of the optimal solution. Greedy-Set-Cover returns at most k ln n + 1 subsets.

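The k subsets of the optimal solution cover all n items, so at least one of them has >= n / k items. The greedy choice is at least as large, so the first set added has >= n / k items, leaving <= n (1 - 1/k) items in U. U is still covered by the optimal solution, so the next set added to C' covers >= |U|/k items of U, leaving <= (1 - 1/k) |U| <= n (1 - 1/k)^2 items in U. Generally, after the ith iteration, U has <= n (1 - 1/k)^i items.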

When does this reach 1? Since 1 - x <= e^(-x) for every real x, taking x = 1/k gives

```
(1 - 1/k)^k <= 1/e.
```
We use this to simplify our expression.
```
n (1 - 1/k)^i = n ((1 - 1/k)^k)^(i / k) <= n (1/e)^(i/k)
```
Now we work backwards to see how large i must be for this to be at most 1.
```
  n (1/e)^(i/k) <= 1
              n <= e^(i/k)
           ln n <= i/k
              i >= k ln n
```
So after at most k ln n iterations there will be at most 1 item left. Picking up this item will take one more iteration. So C' has at most
```
  k ln n + 1
```
subsets.
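The arithmetic can be checked numerically. The sketch below evaluates the proof's bound n (1 - 1/k)^i directly, finds the first i at which it drops below 1 (so U must be empty), and compares that with k ln n + 1. The function names and the sample (n, k) pairs are hypothetical, chosen only for illustration.

```python
import math


def remaining_upper_bound(n, k, i):
    """The proof's bound on |U| after i iterations: n (1 - 1/k)^i."""
    return n * (1 - 1 / k) ** i


def iterations_until_empty(n, k):
    """Smallest i for which the bound is below 1, which forces U to be empty."""
    i = 0
    while remaining_upper_bound(n, k, i) >= 1:
        i += 1
    return i


# The inequality used to simplify the expression.
for k in range(1, 200):
    assert (1 - 1 / k) ** k <= 1 / math.e

# Compare the iteration count implied by the bound with the theorem's limit.
for n, k in [(8, 2), (100, 5), (1000, 50)]:
    print(n, k, iterations_until_empty(n, k), k * math.log(n) + 1)
```

For the hydrant example, n = 8 and k = 2, so the theorem allows up to 2 ln 8 + 1, about 5.2 subsets; the greedy placement shown earlier uses 3.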