Recitation 4

Thinking about Contracts (HW2)

Another Look at Amortized Analysis

An example: Multi-dequeue

Let's look at the implementation of linear search as given in homework 2:

    int find(int x, int[] A, int n)
    //@requires 0 <= n && n <= \length(A);
    //@requires is_sorted(A, 0, n);
    {
        int i = n-1;
        while (i >= 0 && A[i] >= x) {
            if (A[i] == x) return i;
            i = i - 1;
        }
        return -1;
    }

There are two loop invariants that we can write for this loop:

    //@loop_invariant -1 <= i && i <= n-1;
    //@loop_invariant i == n-1 || A[i+1] > x;

The first invariant is the obvious one: it bounds the range of the loop variable itself. (Remember to include the final value of the loop variable in the invariant.) The second invariant says that when you're at position i, the element to its right (and all the others further right, since the array is sorted) is greater than the element you're looking for. Of course, this doesn't apply the first time you test the loop condition, since you haven't looked at any elements yet, which is why the first disjunct of this invariant is needed.

Let's see why these invariants are correct for this loop.

Look at the first invariant. Is this invariant true when the loop condition is tested the first time? Clearly it is, since i = n-1 and the precondition gives n ≥ 0, so -1 ≤ n-1. Now, if the invariant is true at the start of an iteration, is it true at the end of that iteration (just before the loop condition is tested again)?

That is, is -1 ≤ i' && i' ≤ n-1 true at the end of the iteration? Since i' = i - 1, we want to show that

-1 ≤ i-1 && i-1 ≤ n-1

which, after adding 1 throughout, is the same as

0 ≤ i && i ≤ n

Using the assumption that the invariant is true at the start of the iteration, we know that i is between -1 and n-1. Can we also state that it is between 0 and n? In this case, yes: 0 ≤ i because the loop body only runs when i ≥ 0, and i ≤ n because the invariant already gives i ≤ n-1. So

0 ≤ i && i ≤ n

is a true statement. (Remember this is stating something about i, not i'.)

Now looking at the second invariant, is i == n-1 || A[i+1] > x true when the loop condition is tested for the first time? Yes, since i = n-1. Now if we assume the invariant is true at the start of the iteration, is the following true at the end of the iteration?

i' == n-1 || A[i'+1] > x

Since i' = i - 1, we need to show the following is true:

i == n || A[i] > x

By our previous argument, i can never be n, since i ≤ n-1 at the start of every iteration. So for the expression to be true, we have to show that A[i] > x. We know that the loop body runs only when A[i] ≥ x, and to reach the end of the iteration without returning, A[i] must not be equal to x. Therefore, A[i] > x must be true at the end of the iteration. (Again, keep in mind you're stating something here about i, not i'.)

Thus, we've shown that both invariants hold at the end of each iteration if they hold at the start of each iteration.
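The argument above can also be checked at runtime. The following is a minimal sketch in plain C (not C0, so it compiles with a standard compiler), where each contract is approximated by an `assert`. The helper `is_sorted` is written here as an assumption about what the homework's function does, and C cannot check `n <= \length(A)`, so that part of the precondition is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed helper: true if A[lo..hi) is in non-decreasing order. */
static bool is_sorted(const int *A, int lo, int hi) {
    for (int j = lo; j + 1 < hi; j++) {
        if (A[j] > A[j + 1]) return false;
    }
    return true;
}

/* Linear search from the right end of a sorted array.
   The caller must guarantee that n does not exceed the array's length. */
int find(int x, const int *A, int n) {
    assert(0 <= n);               /* stands in for the @requires on n */
    assert(is_sorted(A, 0, n));   /* stands in for @requires is_sorted */
    int i = n - 1;
    /* Both invariants, checked before the first test of the loop condition. */
    assert(-1 <= i && i <= n - 1);
    assert(i == n - 1 || A[i + 1] > x);
    while (i >= 0 && A[i] >= x) {
        if (A[i] == x) return i;
        i = i - 1;
        /* Both invariants, checked just before the condition is tested again. */
        assert(-1 <= i && i <= n - 1);
        assert(i == n - 1 || A[i + 1] > x);
    }
    return -1;
}
```

Note that the `||` in the second assertion short-circuits, so `A[i + 1]` is never read when `i == n - 1`, which is exactly the case where that index would be out of range.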

Consider the binary counter example from class. We have a k-bit counter and we want to increment it n times. We assume that flipping a bit from 0 to 1 (or 1 to 0) is a constant-time operation.

Looking at an individual increment operation, you would say that the
worst case would be O(k) for an increment when you increment
`01111...1` to `10000...0`. This requires k bit flips.
You might think that the worst case of n increments is therefore O(nk),
but this upper bound is not very strong since this particular situation
can't occur n times in a row.

Using amortized analysis, we can get a more accurate worst-case analysis for this problem. Whenever we flip a bit from 0 to 1, this is 1 operation, but instead of charging 1 token for this, we charge 2 tokens: one for the flip itself, and one to bank for later when the bit has to flip back to 0 (so we're "prepaying" for a future unit of work). Now look at each increment: every bit that flips from 1 to 0 has a token banked to pay for that flip, and at most one bit flips from 0 to 1, so the increment is charged at most 2 tokens. So each increment has an amortized cost of 2 tokens = O(1), and a sequence of n increments starting from a counter of all 0s costs at most 2n = O(n) in total.
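To see the bound concretely, here is a small sketch in plain C that counts actual bit flips. The function names `increment` and `total_flips`, the least-significant-bit-first layout, and the 64-bit cap are all choices made for this illustration, not something fixed by the class example.

```c
#include <assert.h>

/* Increment a k-bit counter stored least-significant bit first.
   Returns the number of bit flips performed. */
int increment(int *bits, int k) {
    int flips = 0;
    int j = 0;
    /* Trailing 1s flip to 0; each of these was prepaid when the bit was set. */
    while (j < k && bits[j] == 1) {
        bits[j] = 0;
        flips++;
        j++;
    }
    /* At most one bit flips from 0 to 1 (none if the counter wraps around). */
    if (j < k) {
        bits[j] = 1;
        flips++;
    }
    return flips;
}

/* Total flips for n increments of a k-bit counter that starts at zero.
   Assumes 0 <= k <= 64 so a fixed-size array is enough for this sketch. */
int total_flips(int k, int n) {
    int bits[64] = {0};
    assert(0 <= k && k <= 64);
    int total = 0;
    for (int t = 0; t < n; t++) {
        total += increment(bits, k);
        assert(total <= 2 * (t + 1));  /* the amortized bound after t+1 increments */
    }
    return total;
}
```

For example, `total_flips(8, 100)` comes to 197 flips, which is under 2 × 100 = 200 and far below the naive bound of 100 × 8 = 800.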

Recall from class that a queue is a first-in, first-out structure
implemented using a linked list. Each data element is stored
in a struct called a list with two fields, `data` and
`next`. The `next` field points to the next list
element (or is NULL if there is no next list element).
A queue is a struct with two pointers, `front` and `back`.
The `front` pointer points to the first element,
but we set up the list so there is an extra list element after
the last queue element and `back` points to this
"empty" list element.

When we add an element to the back of
the queue, we call this *enqueue* and
this is a constant-time operation.

When we remove an element from the front of
the queue, we call this *dequeue* and this is also a constant-time
operation.

Let's define a new operation called *multi-dequeue* that
removes up to k elements from the queue. (The elements are not
returned; they are just discarded.)

    void multideq(queue Q, int k)
    //@requires k >= 0;
    //@requires is_queue(Q);
    //@ensures is_queue(Q);
    {
        int kk = k;
        // remove elements until k are gone or the queue is empty
        while (kk > 0 && !is_empty(Q)) {
            Q->front = Q->front->next;
            kk = kk - 1;
        }
    }

(Note that if the queue becomes empty, the linked list will still have one list element.)
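For reference, the whole structure can be sketched in plain C as follows. This is only an approximation of the class version: `queue_new` is a name assumed here, the elements are taken to be `int`s, the contracts are reduced to `assert`s, and `deq` frees the removed element because C, unlike C0, has no garbage collector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct list {
    int data;
    struct list *next;
};

struct queue {
    struct list *front;
    struct list *back;   /* always points to the extra "empty" list element */
};

/* Assumed constructor: an empty queue is a single "empty" list element. */
struct queue *queue_new(void) {
    struct queue *Q = malloc(sizeof(struct queue));
    struct list *dummy = malloc(sizeof(struct list));
    assert(Q != NULL && dummy != NULL);
    dummy->next = NULL;
    Q->front = dummy;
    Q->back = dummy;
    return Q;
}

bool is_empty(struct queue *Q) {
    return Q->front == Q->back;
}

/* Constant time: fill the old "empty" element and append a fresh one. */
void enq(struct queue *Q, int x) {
    struct list *dummy = malloc(sizeof(struct list));
    assert(dummy != NULL);
    dummy->next = NULL;
    Q->back->data = x;
    Q->back->next = dummy;
    Q->back = dummy;
}

/* Constant time: unlink and return the first element. */
int deq(struct queue *Q) {
    assert(!is_empty(Q));
    struct list *old = Q->front;
    int x = old->data;
    Q->front = old->next;
    free(old);
    return x;
}

/* Removes up to k elements, discarding them. */
void multideq(struct queue *Q, int k) {
    assert(k >= 0);
    int kk = k;
    while (kk > 0 && !is_empty(Q)) {
        deq(Q);
        kk = kk - 1;
    }
}
```

After `multideq` empties the queue, `front` and `back` both point to the one remaining "empty" element, matching the note above.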

Now suppose we wanted to run some arbitrary sequence of n queue operations
chosen from `enq`, `deq`, and `multideq`. Using worst-case
analysis, we would look at the `multideq`
operation and think that this operation
takes O(k) time assuming the queue has at least k elements, so a sequence of n
operations, each of which could be a `multideq`, would take O(nk) time in the worst case.

However, this is where amortized analysis comes in again. Using amortized analysis, we can show that the worst case is actually much better than O(nk). Much like the binary counter example, the key observation is that every element that gets enqueued can be dequeued at most once. Each enqueue is a constant-time operation, so instead of charging one unit of work (one token) for an enqueue, we charge two tokens (still constant): one to pay for the enqueue itself and one to bank for the eventual dequeue (if needed).

So when we do a dequeue on a non-empty queue, there is a banked token to pay for it. If we do a multi-dequeue and the queue has m elements, there are m tokens banked: if k ≤ m, we remove k elements and have enough tokens, and if k > m, we can only remove m elements, so we're still ok.

Thus, the amortized cost of an enqueue is 2, and the amortized cost of deq and multideq is 0. A sequence of n queue operations (starting from an empty queue) therefore costs at most 2n tokens, which is O(n) in total, or O(1) amortized per operation.
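The token argument can be simulated directly. The sketch below, in plain C, tracks only the size of the queue rather than a real linked list, and counts one unit of work per element added or removed (the constant overhead of each call is ignored). The types `op_kind` and `op` and the function `run_ops` are names invented for this illustration, and a `DEQ` on an empty queue is treated as removing nothing, which is an assumption rather than the class's contract.

```c
#include <assert.h>

enum op_kind { ENQ, DEQ, MULTIDEQ };

struct op {
    enum op_kind kind;
    int k;   /* only meaningful for MULTIDEQ */
};

/* Runs n operations on an initially empty queue, tracking only its size.
   Returns the number of elements actually added or removed. */
int run_ops(const struct op *ops, int n) {
    int size = 0;     /* elements currently in the queue */
    int bank = 0;     /* tokens banked for future removals */
    int charged = 0;  /* tokens charged so far: 2 per enqueue, 0 otherwise */
    int work = 0;     /* actual unit steps performed */
    for (int t = 0; t < n; t++) {
        if (ops[t].kind == ENQ) {
            charged += 2;
            work += 1;   /* one token pays for the enqueue itself */
            bank += 1;   /* the other is banked for the eventual dequeue */
            size += 1;
        } else {
            int want = (ops[t].kind == DEQ) ? 1 : ops[t].k;
            int removed = (want < size) ? want : size;
            work += removed;
            bank -= removed;  /* each removal is paid from the bank */
            size -= removed;
        }
        assert(bank == size);            /* one banked token per queued element */
        assert(work + bank == charged);  /* no token is created or lost */
        assert(work <= 2 * (t + 1));     /* total work is at most 2 per operation */
    }
    return work;
}
```

The first assertion is the invariant behind the whole argument: the bank always holds exactly one token per element in the queue, so a removal can never go unpaid.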

Note that amortized cost is not the same as average cost. When you analyze using amortized analysis, you are still getting a worst-case cost for the whole sequence of operations, but you're showing that it's a tighter bound than the one a naive operation-by-operation worst-case analysis would provide.