Computational Geometry
15-451 CMU Spring 2003
Gary Miller
Contributors: Danny Sleator
Gary Miller
Boilerplate:
What is Computational Geometry(CG)?
Why CG in algorithms class?
Where is CG used?
Is it fun or boring?
At present CG is the design and analysis of algorithms for geometric
problems that arise in low dimensions, 2 or 3. Most algorithms work
well in 2D, while 3D problems are not as well understood.
CG will allow us to see our old algorithms in use and see new ones.
Some applications of CG are:
Computer Graphics
image creation
hidden surface removal
illumination
Robotics
motion planning
Geographic Information Systems
Height of mountains
vegetation
population
cities, roads, electric lines
CAD/CAM computer aided design/computer aided manufacturing
Computer chip design and simulations
Scientific Computation
Blood flow simulations
Molecular modeling and simulations
----------------------------------------------------------------
Basic approach used by computers to handle complex geometric objects
Decompose the object into a large number of very simple objects
with simple interactions.
Examples:
* An image might be a 2D array of dots.
* An integrated circuit is a planar triangulation.
* Mickey Mouse is a surface of triangles
Basic algorithmic design approaches
Divide-and-Conquer
* There are two different types of Divide-and-Conquer.
e.g. Merge sort
e.g. Quick sort
* Line-Sweep in 2D.
* Random Incremental
------------------------------------------------------------------
I will spend the next two lectures on random incremental algorithms for:
* 2D convex hull of a point set.
* An expected linear time algorithm for 2D linear programming.
------------------------------------------------------------------
First we discuss what abstract objects we will need and their
representations.
Abstract Object | Representation
-------------------------------------
Real Number | Floating Point, Big Number
Point | Pair of Reals
Line | Pair of Points, A set of constraints
Line Segment | Pair of Points, A set of constraints
Triangle | Triple of Points, etc
-----------------------------------------------------------------
Using points to generate objects
P1, ..., Pk in Real^d
Linear Combinations | Subspace = { \sum \alpha_i Pi | \alpha_i \in Real }
Affine Combinations | Plane = { \sum \alpha_i Pi | \sum \alpha_i = 1,
                                \alpha_i \in Real }
    e.g. { \alpha P + \beta Q | \alpha + \beta = 1 } is the line thru
    P and Q
Convex Combinations | Body = { \sum \alpha_i Pi | \sum \alpha_i = 1,
                               \alpha_i >= 0 }
    e.g. { \alpha P + \beta Q + \gamma R | \alpha + \beta + \gamma = 1
           and \alpha, \beta, \gamma >= 0 }
    is the triangle with corners P, Q, and R
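To make the convex-combination definition concrete, here is a small
Python sketch (not from the notes; the function names are ours). It
writes a point S as a combination a*P + b*Q + g*R with a + b + g = 1,
and then checks whether that combination is in fact convex, i.e.
whether S lies in the triangle with corners P, Q, and R.

```python
def barycentric(P, Q, R, S):
    """Solve S = a*P + b*Q + g*R with a + b + g = 1 (Cramer's rule)."""
    # Substituting a = 1 - b - g gives S - P = b*(Q-P) + g*(R-P),
    # a 2x2 linear system in (b, g).
    (px, py), (qx, qy), (rx, ry), (sx, sy) = P, Q, R, S
    d = (qx - px) * (ry - py) - (rx - px) * (qy - py)  # 0 iff P,Q,R collinear
    b = ((sx - px) * (ry - py) - (rx - px) * (sy - py)) / d
    g = ((qx - px) * (sy - py) - (sx - px) * (qy - py)) / d
    return (1 - b - g, b, g)

def in_triangle(P, Q, R, S, eps=1e-12):
    """S is a convex combination of P, Q, R iff all coefficients >= 0."""
    return all(t >= -eps for t in barycentric(P, Q, R, S))
```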
------------------------------------------------------------------
Primitive Operations
1) Equality P=Q? Can be hard!
2) Line segment intersection testing
3) Line side test
Input: (P1,P2,P3)
Output: True: If P3 is right of ray P1 -> P2
4) In circle test
Input: (P1,P2,P3,P4)
Output: True: If P4 in circle thru (P1,P2,P3)
5) Etc
2) Segments L1 = [P1,P2] and L2 = [P3,P4]
Pi = ( xi ) (vector of two reals)
( yi )
Note that a convex combination of two points P1 and P2 is
a new point P1 \alpha_1 + P2 \alpha_2 where
\alpha_1 and \alpha_2 are non-negative and sum to 1.
So L1 intersects L2 iff some convex combination of P1 and P2
equals some convex combination of P3 and P4.
That is, there exist \alpha_1, \alpha_2, \alpha_3, \alpha_4 such that:
    \alpha_1 P1 + \alpha_2 P2 = \alpha_3 P3 + \alpha_4 P4
    \alpha_1 + \alpha_2 = 1
    \alpha_3 + \alpha_4 = 1
    \alpha_i >= 0
This is just a system of four linear equations in four unknowns,
plus the sign constraints. (Eliminate two of the \alpha's using the
two sum constraints, leaving two equations in two unknowns.
One can solve for the formula by hand.)
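A sketch of that computation in Python (helper names are ours):
eliminate \alpha_2 = 1 - \alpha_1 and \alpha_4 = 1 - \alpha_3, which
leaves the 2x2 system \alpha_1 (P1 - P2) - \alpha_3 (P3 - P4) = P4 - P2,
then check that both remaining coefficients land in [0,1]. Parallel
segments would need a special case; here we just return None for them.

```python
def segments_intersect(P1, P2, P3, P4, eps=1e-12):
    """True iff closed segments [P1,P2] and [P3,P4] intersect (generic case)."""
    ax, ay = P1[0] - P2[0], P1[1] - P2[1]        # P1 - P2
    bx, by = -(P3[0] - P4[0]), -(P3[1] - P4[1])  # -(P3 - P4)
    cx, cy = P4[0] - P2[0], P4[1] - P2[1]        # P4 - P2
    d = ax * by - bx * ay                        # determinant of the system
    if abs(d) < eps:
        return None                              # parallel/degenerate case
    a1 = (cx * by - bx * cy) / d                 # Cramer's rule for alpha_1
    a3 = (ax * cy - cx * ay) / d                 # Cramer's rule for alpha_3
    return 0 <= a1 <= 1 and 0 <= a3 <= 1
```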
3) Line side tests
Input: P1,P2,P3
Output: True: If P3 is right of ray P1 -> P2
Subtract P1 from both of the other vectors.
Let V2 = P2-P1 and V3 = P3-P1. Now the cross
product V2 x V3 is the signed area of the parallelogram
formed by V2 and V3. This area is < 0 if and only if
P3 is right of ray P1 -> P2 (and > 0 iff P3 is left of it).
V2
*--------*
/ /
/ /
/ /
*--------*V3
If V2 = ( X'2 , Y'2 ) and V3 = ( X'3 , Y'3 )
then the cross product is (X'2 Y'3) - (Y'2 X'3).
Alternatively, let
            ( X1 Y1 1 )
    D = det ( X2 Y2 1 )
            ( X3 Y3 1 )
Subtracting the first row from the other two rows:
            ( X1  Y1  1 )
    D = det ( X'2 Y'2 0 )
            ( X'3 Y'3 0 )
Expanding along the last column:
    D = det ( X'2 Y'2 )
            ( X'3 Y'3 )  =  (X'2 Y'3) - (Y'2 X'3) = V2 x V3
So LineSideTest(P1,P2,P3) is just the test D < 0.
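A sketch of the test in Python (our names; we follow the usual
convention that a positive determinant means P3 is left of the
directed line P1 -> P2, so the right-side test is D < 0):

```python
def line_side(P1, P2, P3):
    """Signed double area of triangle (P1,P2,P3); the 3x3 determinant above."""
    return (P2[0] - P1[0]) * (P3[1] - P1[1]) - (P2[1] - P1[1]) * (P3[0] - P1[0])

def right_of_ray(P1, P2, P3):
    """True iff P3 is strictly right of the ray P1 -> P2."""
    return line_side(P1, P2, P3) < 0
```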
-------------------------------------------------------------------
The Convex Hull Problem
The sorting problem of CG.
A point set A subset R^d is convex if it is closed under convex
combinations. That is, if we take any convex combination of any
two points in A, the result is a point in A.
DEF: ConvexClosure(A) = CC(A) = smallest convex set containing A
There are two different definitions of the convex hull. The first one
is the one we will use; the second is used by some books.
DEF1: ConvexHull(A) = boundary of CC(A)
DEF2: ConvexHull(A) = CC(A)
A computer representation of a convex hull must include the
combinatorial structure. In two dimensions, this just means a simple
polygon in, say counter clockwise order. In many of our algorithms
we'll need to be able to traverse the hull in either direction
starting from a vertex or an edge. We'll also need to be able to add
and delete points and edges. These operations are all easily done
in constant time with the appropriate representation.
We can characterize when a pair of points forms an edge of the convex
hull as follows:
(a,b) is an edge of CH(A) iff a,b \in A and every a' in A is either
left of or on the line thru (a,b)
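This characterization gives an immediate, very slow hull algorithm
that is handy as a correctness checker: try every ordered pair (a,b)
and keep it iff every other point passes the line side test. A sketch
(our names; O(n^3), for reference only, and assuming no three points
are collinear):

```python
def line_side(a, b, c):
    """> 0 iff c is left of the directed line a -> b, 0 iff on it."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def hull_edges(points):
    """All directed hull edges (a,b), straight from the characterization."""
    edges = []
    for a in points:
        for b in points:
            if a == b:
                continue
            # keep (a,b) iff every point is left of or on the line (a,b)
            if all(line_side(a, b, c) >= 0 for c in points):
                edges.append((a, b))
    return edges
```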
-------------------------------------------------------------------
2D Convex Hulls by divide and conquer
Let the points be P1...Pn, where Pi = (xi, yi)
Preprocess the points by sorting all of them by
their x coordinates. (This step happens just once. In all
recursive calls, we can assume the points are sorted.)
Now recursively compute the convex hull
of {P1...Pn/2}, call it LH, and {Pn/2+1...Pn}
call it RH.
(Picture here)
Now the problem is to stitch the left and right halves together to
form a hull for all the points. Here's how to do this:
Let a be the rightmost point in LH and b be the leftmost point in
RH. Start with the line segment [a,b]. Let a'(b') be the next
point clockwise around the LH(RH) from a(b). Let a"(b") be the
next point counter-clockwise around the LH(RH) from a(b). Consider
the three points [a',a,a] if a' is right of then set a = a' and
continue. If it's on or left, then stop. Now do the analogous
walk around RH starting from b. If at some point neither walk can
proceed, the segment [a,b] is the new segment to be added to the CH
of all the points on the bottom.
We give pseudo code:
Repeat the following:
While a' is right of (a,b) do a=a'
While b" is right of (a,b) do b=b"
An analogous algorithm suffices to compute the new edge to be added
on top of the hull. Once these new edges are found, the new convex
hull is stitched together from the relevant parts.
Preprocessing takes O(n log n) to sort. After that it's a standard
divide and conquer algorithm with the following recurrence:
T(n) = 2T(n/2) + n
This solves to O(n log n).
-------------------------------------------------------------------
This was not presented. GLM Spring 2004
Graham Scan for 2D convex hulls
Here's another algorithm. The input points are P1, P2, ... Pn.
Find the point with the smallest Y coordinate and swap it with P1.
Now for each point Pi, compute the angle of the ray P1-->Pi, and
sort the points by this angle. (These angles are all between 0 and
pi.)
Start with segment P1--->P2. In general there will be a sequence of
segments, starting from P1, P2 .... Each turn in this sequence is a
left turn. Our goal is to process a new point, and add it onto this
sequence.
Here's the algorithm. Add the new point Pi to the end of this
sequence. If the sequence of segments now has a right turn at the
end, then delete the second to last point of this sequence. Repeat
this process until the sequence is restored to one that has only left
turns. The direction of a turn can be determined using a line side test.
After all the points are processed in this way, we can just add the
last segment from Pn to P1, to close the polygon, which will be the
convex hull.
Each point can be added at most once to the sequence of vertices, and
each point can be removed at most once. Thus the running time of the scan
is O(n). But remember we already paid O(n lg n) for sorting the points at
the beginning of the algorithm, which makes the overall running time of the
algorithm be O(n lg n).
(Alternatively, keep a token on each edge of the current edge
sequence. Use the token to pay for a segment's removal. Adding a new
segment is one unit of work, plus another token to keep on it.)
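A runnable sketch of the scan (our names; ties in angle, i.e. points
collinear with P1, are not handled carefully here):

```python
from math import atan2

def left_turn(a, b, c):
    """True iff the chain a -> b -> c turns left."""
    return (b[0]-a[0]) * (c[1]-a[1]) - (b[1]-a[1]) * (c[0]-a[0]) > 0

def graham_scan(points):
    pts = list(points)
    # P1 = point with smallest y (break ties by smallest x)
    p1 = min(pts, key=lambda p: (p[1], p[0]))
    # sort the rest by the angle of the ray P1 --> Pi
    rest = sorted((p for p in pts if p != p1),
                  key=lambda p: atan2(p[1] - p1[1], p[0] - p1[0]))
    stack = [p1]
    for p in rest:
        # pop while the turn at the end of the chain is not a left turn
        while len(stack) >= 2 and not left_turn(stack[-2], stack[-1], p):
            stack.pop()
        stack.append(p)
    return stack  # hull vertices in counter-clockwise order
```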
-------------------------------------------------------------------
Lower bound for computing the convex hull
Suppose the input to a sorting problem is the numbers X1, ..., Xn.
Consider the convex hull problem on the points:
(X1,X1^2), ..., (Xn,Xn^2)
These points all lie on the parabola y = x^2, so every one of them is
on the hull, and the convex hull must return them in sorted order.
Thus:
Any comparison based CH algorithm must make \Omega( n log n )
comparisons.
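A small demonstration of the reduction (our code, not from the notes):
lift each number x to (x, x^2) and walk the hull starting from the
leftmost lifted point. The walk below uses gift wrapping (pick the
next point that keeps all others to its left) rather than any sorting,
and its output is the input in sorted order.

```python
def hull_walk(points):
    """Walk the convex hull CCW from the leftmost point, by gift wrapping."""
    def left_or_on(a, b, c):
        return (b[0]-a[0]) * (c[1]-a[1]) - (b[1]-a[1]) * (c[0]-a[0]) >= 0
    start = min(points)                  # leftmost point
    walk, cur = [start], start
    while True:
        # the next hull vertex is the unique p with all points left of cur->p
        nxt = next(p for p in points
                   if p != cur and all(left_or_on(cur, p, q) for q in points))
        if nxt == start:
            return walk
        walk.append(nxt)
        cur = nxt

xs = [3.0, -1.0, 4.0, 1.0, -5.0, 2.0]
lifted = [(x, x * x) for x in xs]
print([p[0] for p in hull_walk(lifted)])   # the xs in sorted order
```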
-------------------------------------------------------------------
QuickSort and Backwards Analysis
Before we present and analyze a simple randomized CH algorithm, let's
go back and look at another way to analyze QuickSort called Backwards
Analysis.
We assume that the keys are distinct
Recall randomized QuickSort:
QS(M)
1) pick random b in M
2) split M into S and L with S < b < L
3) return [QS(S), b, QS(L)]
The cost of the call QS(M) (Not counting recursive calls) is the size
of the array M minus 1. (This is how many comparisons are done.)
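For concreteness, the three steps above as runnable Python (a sketch;
the pivot is compared against every other element, giving the |M| - 1
comparisons per call noted above; keys are assumed distinct):

```python
import random

def QS(M):
    if len(M) <= 1:
        return list(M)
    b = random.choice(M)             # 1) pick random b in M
    S = [x for x in M if x < b]      # 2) split M into S and L with S < b < L
    L = [x for x in M if x > b]
    return QS(S) + [b] + QS(L)       # 3) return [QS(S), b, QS(L)]
```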
------------------------------------
To do the backwards analysis we propose a simple game with a cost
function that has the same behavior.
Consider a dart board with n empty squares.
+-----------------------+
| | | | | | | | |
+-----------------------+
All n squares start out empty.
The dart game proceeds as follows:
While there exists an empty square do:
Pick an empty square at random. Put a dart into it. The cost
of the step is the number of empty squares in the blocks of
empty squares immediately to the left and right of the new dart.
The expected cost of this game is exactly the same as the expected
cost of randomized quicksort. Why? Let's prove this by induction.
In the base case, n=0 or n=1. In this case QS(M) costs zero and the
dart game costs zero. In the general case, the cost of the first dart
is n-1, and the cost of the top level call to QS(M) is just n-1 (not
counting the costs of the recursive calls). We know that the dart
game eventually fills every cell with a dart. And that the darts to
the right of the first dart have no influence on what happens to the
left of it. So we can rearrange the sequence of darts to put those
that are in the left half first, then those in the right half. This
does not change the cost of the dart game. Now by induction the cost
of the dart game in the left half is the same as quicksorting the left
half. The right half is the same.
The dart game can be played backwards. Here's how to do it.
Layout a dart board with n squares each with a dart.
While there exists a dart on the board do:
Randomly pick a dart. The cost is the number of empty squares
in the blocks immediately to the left and right of this dart.
Remove the dart.
The only random parameter in the forward dart game is the permutation
that is chosen to add the darts. Similarly the only random parameter
in the reverse dart game is the permutation that is chosen to remove
the darts. Thus if we made a movie of the dart game and watched it
backwards, it would be totally indistinguishable (drawn from the same
distribution) from a movie of the backwards dart game.
Thus, if we can analyze the expected cost of the backwards dart game,
it will tell us the expected cost of the forward dart game (and thus
quicksort).
-------------------------------------------------------------------------
Let's analyze the backwards dart game.
Suppose there are i darts left. Define Ti as follows:
Ti = Expected cost to remove a random dart from a
board of i darts.
One way of computing this cost is to take the total cost of each of
the possibilities (the dart to remove) and divide by the number of
darts (i). The total cost of all possibilities can be computed by
allocating the costs to the empty cells. So when considering the
option of removing a dart d, put a token into each of the empty cells
in the blocks of empty cells to the left and right of d. The number
of tokens distributed is the cost of the option of removing dart d.
The sum of these costs over all darts is the number of tokens on all
the empty cells. It's easy to see that each empty cell can get at
most two tokens, one from each of the two darts at the boundaries of
that block of empty cells.
Total cost of all options at step i <= 2*(# empty cells) = 2(n-i)
Ti = Average cost of all options at step i <= 2(n-i)/i = 2n/i - 2
Let E_n be the expected cost of the backwards dart game on a
board of size n.
E_n = Tn + Tn-1 + ... + T1
  <= [2n SUM 1/i] - 2n
          i = 1...n
   = 2n Hn - 2n = O(n log n)
This is the same as the cost of the dart game running in the forward
direction, and the expected cost of quicksort.
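As a sanity check (our experiment, not from the notes), we can
simulate the forward dart game and compare its average cost with the
standard exact formula for expected quicksort comparisons,
2(n+1)Hn - 4n; since the two games cost the same in expectation, the
simulated average should land close to that value.

```python
import random

def dart_game_cost(n, rng):
    """Total cost of one forward dart game on a board of n cells."""
    occupied = [False] * n
    order = list(range(n))
    rng.shuffle(order)                  # random order of dart placements
    total = 0
    for i in order:
        occupied[i] = True
        # cost of this dart = contiguous empty cells to its left and right
        j = i - 1
        while j >= 0 and not occupied[j]:
            total += 1
            j -= 1
        j = i + 1
        while j < n and not occupied[j]:
            total += 1
            j += 1
    return total

n, trials = 30, 4000
rng = random.Random(0)
avg = sum(dart_game_cost(n, rng) for _ in range(trials)) / trials
H = sum(1.0 / i for i in range(1, n + 1))
exact = 2 * (n + 1) * H - 4 * n         # exact expected quicksort comparisons
# avg should be close to exact, since the expectations agree
```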
--------------------------------------------------
2D Random Incremental Convex Hull
Input: P = { P1, ..., Pn}, where Pi is in R^2.
These are called the "sites". We assume no three sites are
collinear.
Output: The convex hull of P.
Recall that the representation of the convex hull is a doubly linked
data structure of the sites and the edges between them. This makes it
possible in O(1) time to remove a site or add a site to the convex
hull. This representation is used internally during the running of
the algorithm.
Procedure: RandomIncrementalCH(P)
0) Pick three sites from P, say P1, P2, and P3. Construct our
standard convex hull representation of the triangle T consisting of
these three points. Also choose a point C interior to T.
1) Construct the ray from C to each other site in P, and continuing
to infinity. This ray must cross one of the edges of the
triangle.
2) Partition the sites in P by which edge of T their ray crosses.
For each edge of T we keep a bit more information in the data
structure. We keep the set of sites whose rays cross this edge of
T. This will be maintained as the algorithm proceeds below.
3) Compute a random permutation of the remaining sites, P4...Pn.
Now process each of the sites Pi in this order. (The invariant
that we maintain is that we have at any point in time the convex
hull of the sites that have been processed.) Here is how site Pi
is processed:
Looking at the ray from C-->Pi and the edge e of the current
convex hull that it crosses, determine if Pi is inside or
outside of the current convex hull. If it's inside continue
with the next site.
Now, if Pi is outside, then update the data structure with
BuildTent(Pi,e).
4) The convex hull we have at the end is the convex hull of all the
sites.
Procedure: BuildTent(p,e)
Let CH denote the current convex hull (convex hull of all the
processed sites). The goal of this step is to process one site.
1) Find all edges of CH visible from p by searching out from e in both
directions. These are called the "visible edges".
2) Replace visible edges with two new edges (a "tent") in the CH.
The new edges will be from p to the left and right end points of
the visible region.
3) For each site in the site sets maintained for each visible edge:
assign it to whichever of the two new edges its ray crosses.
---------------------------------------------------------------------
Note about optimizing this: In an actual implementation you would not
keep around unprocessed sites that are inside the convex hull. So the
sets of sites maintained at the edges would always be outside the
current convex hull. This only makes the algorithm more efficient.
However, we'll analyze the algorithm as stated above.
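A compact sketch of the whole algorithm in Python (our names; assumes
generic position, and uses a plain list for the hull instead of the
doubly linked structure, so edge updates here are not O(1) -- the
point is the ray bucketing, not the data structure):

```python
import random

def cross(o, a, b):
    """> 0 iff o -> a -> b turns left (b left of directed line o -> a)."""
    return (a[0]-o[0]) * (b[1]-o[1]) - (a[1]-o[1]) * (b[0]-o[0])

def ray_hits(C, s, u, v):
    """Does the ray from C through site s cross hull edge (u, v)?"""
    dx, dy = s[0]-C[0], s[1]-C[1]
    ex, ey = v[0]-u[0], v[1]-u[1]
    den = dx*ey - dy*ex
    if den == 0:
        return False
    t = ((u[0]-C[0])*ey - (u[1]-C[1])*ex) / den   # position along the ray
    w = ((u[0]-C[0])*dy - (u[1]-C[1])*dx) / den   # position along the edge
    return t > 0 and 0 <= w < 1                   # w < 1 avoids double-counting

def random_incremental_hull(P, seed=0):
    hull = list(P[:3])                            # step 0: initial triangle
    if cross(hull[0], hull[1], hull[2]) < 0:
        hull.reverse()                            # orient it CCW
    C = (sum(q[0] for q in hull) / 3.0,
         sum(q[1] for q in hull) / 3.0)           # interior point C
    rest = list(P[3:])
    random.Random(seed).shuffle(rest)             # step 3: random order

    def edge_for(s, edges):                       # which edge does s's ray cross?
        return next(e for e in edges if ray_hits(C, s, e[0], e[1]))

    bucket = {s: edge_for(s, [(hull[k], hull[(k+1) % 3]) for k in range(3)])
              for s in rest}                      # steps 1-2: bucket the sites

    for p in rest:
        u, v = bucket.pop(p)
        if cross(u, v, p) > 0:
            continue                              # p is inside: nothing to do
        # BuildTent: walk out from edge (u,v) to find the visible chain
        n, i = len(hull), hull.index(u)
        lo, hi = i, i + 1
        while cross(hull[(lo-1) % n], hull[lo % n], p) <= 0:
            lo -= 1
        while cross(hull[hi % n], hull[(hi+1) % n], p) <= 0:
            hi += 1
        removed = [(hull[k % n], hull[(k+1) % n]) for k in range(lo, hi)]
        tent = [(hull[lo % n], p), (p, hull[hi % n])]   # the two new edges
        hull = [hull[k % n] for k in range(hi, lo + n + 1)] + [p]
        for s, e in list(bucket.items()):         # reassign orphaned sites
            if e in removed:
                bucket[s] = edge_for(s, tent)
    return hull
```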
---------------------------------------------------------------------
Running time analysis.
It's obvious that all work other than that done by BuildTent()
takes O(n) time. What about BuildTent()?
Let's partition the cost of BuildTent() running over the entire
computation, into two parts. The "Edge Cost" is the cost of parts 1
and 2. It's proportional to the number of visible edges. The "Ray
Cost" is the cost of part 3, and it's proportional to the number of
sites in the site sets of visible edges processed there.
Let's analyze these costs separately. It's easy to see that the total
Edge Costs are O(n). Just keep a token on each edge of the current
CH. When we run BuildTent() we remove some visible edges and add two
new edges. Use the tokens on the visible edges that are removed to
pay for their removal. Then put a new token on each of the new edges.
Thus only two new tokens are needed each time, for a total cost of 2n
tokens.
What about the Ray Costs? Here's where we use backward analysis.
Let's write down the sequence of sites (after the random permutation
of P4...Pn).
P1 P2 P3 | P4 P5 P6 ... Pi Pi+1 Pi+2 ... Pn
\_____________/ \______________/
processed unprocessed
In a backward running of the algorithm we pick a random processed
point and unprocess it. There are i-3 points to choose from (P1, P2
and P3 are not part of the randomization). The number of rays stored
in all the edges of the current CH is one for each unprocessed point,
or (n-i). What is the expected Ray Cost of the unstep? There are i-3
processed sites eligible to be removed. The ones that are
interior to the current CH have zero ray cost. The ones on the CH
incur a ray cost equal to the number of rays crossing the edge to the
left of this site plus the number of rays crossing the edge to the
right of this site. (This situation is exactly analogous to the
situation we analyzed in removing darts from the quicksort dart board.
The role of an empty cell on the board is played by a ray to an
unprocessed site.) So among all the options of processed sites to
unprocess, each ray is counted twice. (Actually it may be less
because of the possibility that some of P1, P2 or P3 are still on the
CH, and we're not allowed to choose these.) So the expected ray cost
is:
R_i = Expected ray cost of unstep i <= 2(n-i) / (i-3)
<= 2n/(i-3)
E_n = Expected total Ray Cost <= SUM [ 2n/(i-3) ]
i = 4...n
<= 2nH(n-3) = O(n log n)
(Here H(k) is the kth harmonic number).
------------------------------------------------------
Linear Programming in Fixed Dimension
Our next example of a random incremental computational geometry
algorithm is for solving 2-dimensional linear programming. The algorithm
works in any fixed dimension. But we will only consider 1D and 2D here.
Recall: An LP is Max c^T x
Subject to Ax <= d
We say that the LP is feasible if there exists x s.t. Ax <= d
In 2D this is:
    ( a1 b1 )            ( d1 )
    (  .  . )  ( x )  =< (  . )
    (  .  . )  ( y )     (  . )
    ( an bn )            ( dn )
Each row is a Half-Space(Half-Plane) hi = { (x,y) | ai x + bi y =< di }
One natural problem is to compute the intersection of a collection of n half-spaces.
We can make a table of problems that look like sorting and ones that look like selection.
Sorting | Selection
--------------------------
Half-Space-inter | LP
Convex Hull | ?
Meshing |
---------------------------------------------------
Let's think of the 2D LP geometrically:
Input: Half-Spaces {h1, ..., hn} and a vector c
We make the following simplifications which are not that hard to remove.
1) We assume that no hi is normal to c
2) We assume that we know a bounding box of the feasible region i.e. we
know two half-spaces m1 and m2 containing the feasible region such
that neither is contained in the other.
-------------------------------------------------------
We first show how to solve the 1D problem in linear time.
Input: Constraints ai x =< bi with ai \not= 0
WLOG ai = +-1 (divide each constraint through by |ai|).
We divide the constraints into two sets, those whose half-line
contains +\infinity and those whose half-line contains -\infinity:
C^+ = { i | x =< bi }
C^- = { i | -x =< bi }   i.e.  -bi =< x
The feasible region must be above \alpha = Max{ -bi | i \in C^- }
and below \beta = Min{ bi | i \in C^+ }.
Thus the LP has a feasible solution iff \alpha =< \beta.
If there is a feasible solution, the optimum is either \beta or
\alpha, depending on the sign of c, the objective function.
We will assume that our 1D code returns undef when there is no feasible solution.
Theorem: 1D LP is O(n) time.
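The 1D routine as a sketch in Python (our names; a constraint is a
pair (a, b) meaning a x =< b with a != 0, and c is the scalar
objective):

```python
def lp_1d(constraints, c):
    """Maximize c*x subject to a*x <= b for each (a, b); None if infeasible."""
    alpha, beta = float('-inf'), float('inf')
    for a, b in constraints:
        if a > 0:
            beta = min(beta, b / a)      # x <= b/a   (the C^+ side)
        else:
            alpha = max(alpha, b / a)    # x >= b/a   (the C^- side)
    if alpha > beta:
        return None                      # infeasible ("undef")
    return beta if c > 0 else alpha      # optimum depends on the sign of c
```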
----------------------------------------------------------
We are now ready to give our 2D solution for LP.
Procedure: 2D-LP(m1, m2, h1, ..., hn, c)
1) Set v_0 = 2D-LP(m1, m2, c). (We do this by simply returning the
   vertex CH(m1) \inter CH(m2), the intersection of the two boundary
   lines, which is the optimum under assumption 2.)
2) Randomly order h1, ..., hn
3) For i=1 to n do
     if v_{i-1} \in hi then set v_i = v_{i-1}
     else
       Construct the following 1D LP problem on the line L = CH(hi),
       the boundary of hi:
         h1' = L \inter h1,
         ...
         h_{i-1}' = L \inter h_{i-1}
         c' = projection(c, L)     Note: c' \not= 0 by assumption 1.
       Set v_i = 1D-LP(h1', ..., h_{i-1}', c')
       If v_i is undef report "no solution" and halt
       (*)
       else continue
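A runnable sketch of 2D-LP (our names; as above we assume the boundary
lines of m1 and m2 intersect at the optimum of the two-constraint LP,
and that no constraint is normal to c; a constraint (a, b, d) means
a x + b y =< d):

```python
import random

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def lp_2d(m1, m2, hs, c, seed=0):
    """Maximize dot(c, v) over m1, m2 and the half-planes hs; None if infeasible."""
    # v0 = vertex where the boundary lines of m1 and m2 meet
    (a1, b1, d1), (a2, b2, d2) = m1, m2
    det = a1 * b2 - a2 * b1
    v = ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)
    hs = list(hs)
    random.Random(seed).shuffle(hs)               # random order of constraints
    seen = [m1, m2]
    for h in hs:
        a, b, d = h
        if a * v[0] + b * v[1] <= d + 1e-9:
            seen.append(h)
            continue                              # v still feasible: keep it
        # otherwise the new optimum lies on the boundary line of h;
        # parametrize that line as p0 + t*dr and solve a 1D LP in t
        p0 = (a * d / (a*a + b*b), b * d / (a*a + b*b))  # a point on a*x+b*y=d
        dr = (-b, a)                                     # direction of the line
        lo, hi = float('-inf'), float('inf')
        for (ga, gb, gd) in seen:
            k = ga * dr[0] + gb * dr[1]           # coefficient of t
            r = gd - (ga * p0[0] + gb * p0[1])
            if abs(k) < 1e-12:
                if r < -1e-9:
                    return None                   # line misses this constraint
            elif k > 0:
                hi = min(hi, r / k)
            else:
                lo = max(lo, r / k)
        if lo > hi + 1e-9:
            return None                           # "no solution"
        t = hi if dot(c, dr) > 0 else lo          # optimize along the line
        v = (p0[0] + t * dr[0], p0[1] + t * dr[1])
        seen.append(h)
    return v
```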
----------------------------------------------
CORRECTNESS
We first show that the procedure 2D-LP is correct by induction on i.
That is, at time (*) in the code, v_i is a solution of
LP(m1, m2, h1, ..., hi).
Assume that at time i-1 the algorithm was correct.
We may assume that it did not return "no solution".
We walk through the code and show that it does the correct thing.
1) if v_{i-1} \in hi then v_{i-1} is a solution.
2) if v_{i-1} \not\in hi
Claim: hi is a critical constraint, and therefore v_i lies on
CH(hi), the boundary line of hi.
Thus by solving the 1D problem on that line we get the correct answer.
QED
-------------------------------------------------------------
TIMING:
Random Incremental 2D LP (2D-LP) runs in O(n) expected time.
We use backwards analysis. Suppose we remove one of the i constraints
at random, say hj. We say that a constraint is critical if removing
it changes the optimal solution.
If hj is not critical we charge O(1) cost, say K.
If hj is critical we charge O(i) cost, say K i.
There are at most two critical constraints: only the constraints tight
at the current optimum can be critical, and if three or more
constraints pass through the optimum then removing any one of them
leaves the optimum unchanged, so none of them is critical.
So in the worst case there are exactly two critical constraints.
Thus the expected cost to remove one of the i constraints is
E_i <= (2 K i + (i-2) K) / i <= 3K
Thus the expected cost per backwards step is constant, so the
algorithm runs in expected constant time per constraint:
O(n) expected time.