15-122 Principles of Imperative Computation
Recitation 3

Using strings in C0
The formal definition of Big O
Reviewing invariants (Homework 1)

Using strings

A string is a sequence of characters. Unlike in C, it is not accessed as an array of characters (although it is stored this way internally). Instead, there is a set of string operations in a library to help you work with strings:

// Returns the length of the given string
int string_length(string s);

// Returns the character at the given index of the string.
// If the index is out of range, aborts.
char string_charat(string a, int idx);

// Returns a new string that is the result of concatenating b to a.
string string_join(string a, string b);

// Returns the substring composed of the characters of s beginning
// at index given by start and up to but not including the index
// given by end
// If end ≤ start, the empty string is returned
// If end < 0 or end > the length of the string, it is treated as
// though it were equal to the length of the string.
// If start < 0 the empty string is returned.
string string_sub(string a, int start, int end);

// Returns true if a and b are exactly the same, character for
// character. Returns false otherwise.
bool string_equal(string a, string b);

// Returns a negative integer if a is lexicographically "less" than
// b, a positive integer if a is lexicographically "greater" than b,
// or 0 if a and b are equal.
int string_compare(string a, string b);

Most of these functions are self-explanatory, but let's look more closely at the last one. Strings are compared using lexicographic ordering. Each character is represented internally by an ASCII code, and these codes are compared starting with the first character of each string. If the pair has the same ASCII code, the compare function moves on to the next character in each string. This continues until one character has a smaller ASCII code than the other (the string containing it is "less"), one string runs out of characters (the shorter string is "less" than the other), or both strings run out of characters at the same time (the strings are equal).
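The procedure just described can be sketched in a few lines. This is written in C rather than C0 so that it compiles with a standard compiler, and `compare_sketch` is a hypothetical name used only for illustration; it is not the library's `string_compare`, and it assumes plain ASCII, NUL-terminated strings.

```c
#include <stddef.h>

// Hypothetical helper (not the C0 library function): compares two
// NUL-terminated ASCII strings character by character.
// Returns -1 if a is "less", 1 if a is "greater", 0 if they are equal.
int compare_sketch(const char *a, const char *b) {
    size_t i = 0;
    // Walk both strings while neither has run out of characters.
    while (a[i] != '\0' && b[i] != '\0') {
        if (a[i] != b[i]) {
            // First position where the ASCII codes differ decides the result.
            return (a[i] < b[i]) ? -1 : 1;
        }
        i++;
    }
    // Both strings ran out together: they are equal.
    if (a[i] == '\0' && b[i] == '\0') {
        return 0;
    }
    // Otherwise the string that ran out first is the "lesser" one.
    return (a[i] == '\0') ? -1 : 1;
}
```

For instance, `compare_sketch("Cat", "Catch")` is negative because "Cat" runs out of characters first, while `compare_sketch("Cheese", "CheX")` is positive because 'e' (101) is greater than 'X' (88).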

In the ASCII system, codes are given in order for uppercase letters (starting from 65 for 'A', 66 for 'B', etc.), lowercase letters (starting from 97 for 'a', 98 for 'b', etc.), and digits (starting from 48 for '0', 49 for '1', etc.). So when letters are compared, the result is alphabetical ordering as long as only one case is used. Be careful when mixing cases: every uppercase letter comes before every lowercase letter. Based on lexicographic ordering, here is a series of strings from "smallest" to "largest":

"Cat" < "Catch" < "ChEEse" < "CheX" < "Cheese" < "Chew"

The compare function returns a negative integer if the first string argument is "less than" the second string argument (lexicographically), a positive integer if the first string is "greater than" the second string (lexicographically) or 0 if the two strings are equal.
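One way to convince yourself of the ordering above is to check it mechanically. The sketch below uses C's standard `strcmp`, which follows the same sign convention for ASCII strings; it stands in for C0's `string_compare` here, and `strictly_increasing` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: returns true if words[0..n) is strictly increasing
// under strcmp, i.e. each string is lexicographically "less" than the next.
bool strictly_increasing(const char *words[], int n) {
    for (int i = 0; i + 1 < n; i++) {
        // A result >= 0 means words[i] is equal to or "greater" than its successor.
        if (strcmp(words[i], words[i + 1]) >= 0) {
            return false;
        }
    }
    return true;
}
```

Calling it on the six strings in the order listed should return true, while swapping "CheX" and "Cheese" should make it return false.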

A look at the formal definition of Big O

When we say that some function is in O(f(n)), what this means is that asymptotically (for larger and larger n), c*f(n) acts as an upper bound on our function, for some positive constant c and for every n greater than or equal to some starting n value, called n0.
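Written out symbolically, the usual statement of this definition looks as follows. The name g(n) for "our function" is introduced here only for illustration.

```latex
\[
  g(n) \in O(f(n))
  \iff
  \exists\, c > 0,\ \exists\, n_0 \ge 0 \ \text{such that}\
  \forall\, n \ge n_0:\ g(n) \le c \cdot f(n)
\]
```

Note that only one pair (c, n0) is needed; different choices of n0 generally allow different values of c.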

Consider the following example: Show that $n^2 + 5n + 25$ is in $O(n^2)$.

For this to be true, we need $cn^2 \ge n^2 + 5n + 25$ for all $n \ge n_0$.
So we need to find values of $c$ and $n_0$ that satisfy this inequality. One way to approach this is to find an intersection point between the two functions and let the n value for this intersection be $n_0$:

$$c\,n_0^2 = n_0^2 + 5n_0 + 25$$

$$c = 1 + \frac{5}{n_0} + \frac{25}{n_0^2}$$

If we choose $n_0 = 5$, then $c = 1 + 5/5 + 25/25 = 3$. This means:

$3n^2 \ge n^2 + 5n + 25$ for all $n \ge 5$. (Think about it: for $n \ge 5$ we have $5n \le n^2$ and $25 \le n^2$, so $n^2 + 5n + 25 \le n^2 + n^2 + n^2 = 3n^2$. Graphically, the two curves meet at $n = 5$, and from there on $3n^2$ grows faster, so they never cross again.)

Since we found a $c$ and an $n_0$ that satisfy the formal definition, $n^2 + 5n + 25$ is in $O(n^2)$.
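As a sanity check on the choice c = 3 and n0 = 5, we can test the inequality numerically. This is a sketch in C with hypothetical helper names; checking a finite range is not a proof, but it catches arithmetic slips quickly.

```c
#include <stdbool.h>

// True when 3n^2 >= n^2 + 5n + 25, i.e. the bound with c = 3 holds at n.
bool bound_holds(long long n) {
    return 3 * n * n >= n * n + 5 * n + 25;
}

// Returns the first n in [lo, hi] where the bound fails, or -1 if it
// holds everywhere in that range.
long long first_failure(long long lo, long long hi) {
    for (long long n = lo; n <= hi; n++) {
        if (!bound_holds(n)) {
            return n;
        }
    }
    return -1;
}
```

Here `first_failure(5, 100000)` finds no counterexample, while `bound_holds(4)` is false (48 versus 61), so with c = 3 the starting point n0 = 5 cannot be lowered to 4.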

Reviewing invariants (Homework 1)

Consider the following function from the first homework:

int log(int n)
//@requires n >= 1;
//@ensures (1 << \result) == n;
{
  int i = 0;
  int k = n;
  while (k > 1)
    //@loop_invariant k >= 1;
    //@loop_invariant (1<<i) * k == n;
    {
      k = k / 2;
      i = i + 1;
    }
  return i;
}

Assuming that n is a power of 2, we will show that the loop invariants hold and that they imply the postcondition.

The invariant is true when the loop condition is tested the first time since i = 0 and k = n:

k ≥ 1 is true since k = n and n ≥ 1.

(1 << i) * k == n is true since (1 << 0) == 1 and k = n, so 1 * k == n.

For each iteration of the loop:

Assume k ≥ 1 and (1 << i)*k == n at the start of the iteration.

During the iteration: k' = k / 2 and i' = i + 1.

We wish to show that k' ≥ 1 and (1 << i') * k' == n at the end of the iteration.

To show: k' ≥ 1, that is, k/2 ≥ 1, which holds whenever k ≥ 2.
Since the loop guard k > 1 must be true for the iteration to run and k is an integer, k ≥ 2 is true.

To show: (1 << i') * k' == n, that is, (1 << (i+1)) * (k/2) == n.
Since 1 << (i+1) is equivalent to $2^{i+1}$ and k must be a power of 2 (it starts as a power of 2 and we only divide by 2 each iteration), k ≥ 2 is even and k/2 is exact, so we can see that
$2^{i+1} \cdot k/2 = 2 \cdot 2^i \cdot k/2 = 2^i \cdot k = n$.

So the loop invariant holds for each iteration.

After the loop terminates (termination is easy to argue, since k strictly decreases while k > 1), the invariant is true and the negation of the loop condition is also true:

k ≥ 1 and (1 << i) * k == n and k ≤ 1 → k = 1 and (1 << i) * k == n → (1 << i) == n.

This means that $2^i = n$, or equivalently $i = \log_2 n$. Since the function returns i, which is exactly the base-2 log of n, the postcondition is ensured.
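The argument above can also be checked at run time. The sketch below is a C translation of the function (renamed `log2_exact_sketch`, a hypothetical name chosen to avoid clashing with C's `log`), with the contract and loop invariants turned into `assert` calls. It assumes n is a power of 2 no larger than $2^{30}$; for any other n the postcondition assertion is expected to fail.

```c
#include <assert.h>

// Hypothetical C translation of the homework function. The C0 contracts
// become run-time assertions. Assumes n is a power of 2 (at most 2^30).
int log2_exact_sketch(int n) {
    assert(n >= 1);                          // precondition
    int i = 0;
    int k = n;
    // Invariants hold before the loop condition is tested the first time.
    assert(k >= 1);
    assert((1 << i) * k == n);
    while (k > 1) {
        k = k / 2;
        i = i + 1;
        // Invariants are re-established at the end of every iteration.
        assert(k >= 1);
        assert((1 << i) * k == n);
    }
    // Here k >= 1 and !(k > 1), so k == 1 and the postcondition follows.
    assert((1 << i) == n);                   // postcondition
    return i;
}
```

For example, `log2_exact_sketch(8)` passes every assertion and returns 3.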

In general, n might not be exactly a power of 2, so the loop invariants need to be generalized much like we did with the isqrt function:

//@loop_invariant k >= 1;
//@loop_invariant (1<<i)*k <= n;
//@loop_invariant (1<<(i+1))*k > n || (1<<(i+1)) < 0;
(Do you see why the last condition needs to be added to test for a negative result?)
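To explore that question, here is a sketch in C (not C0) that checks the mathematical content of the generalized invariants, $2^i \cdot k \le n < 2^{i+1} \cdot k$, using exact 64-bit arithmetic. The function name and the `overflowed` flag are hypothetical and exist only for illustration; the flag reports whether $2^{i+1} \cdot k$ ever exceeds the largest 32-bit signed int, which is the situation where a 32-bit computation would wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns floor(log2(n)) for n >= 1 and checks the
// generalized invariants with 64-bit arithmetic so nothing wraps around.
// If overflowed is non-NULL, *overflowed is set to true when the exact
// value of 2^(i+1) * k does not fit in a 32-bit signed int.
int floor_log2_sketch(int32_t n, bool *overflowed) {
    assert(n >= 1);
    int i = 0;
    int32_t k = n;
    if (overflowed != NULL) {
        *overflowed = false;
    }
    for (;;) {
        int64_t lower = ((int64_t)1 << i) * k;        // exact 2^i * k
        int64_t upper = ((int64_t)1 << (i + 1)) * k;  // exact 2^(i+1) * k
        assert(k >= 1);
        assert(lower <= n);
        assert(upper > n);
        if (overflowed != NULL && upper > INT32_MAX) {
            *overflowed = true;
        }
        if (k <= 1) {
            break;   // same exit condition as while (k > 1)
        }
        k = k / 2;
        i = i + 1;
    }
    return i;
}
```

For small inputs such as n = 5 the flag stays false, but for n ≥ $2^{30}$ the exact upper bound is at least $2^{31}$, which a 32-bit signed int cannot represent.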


written by Tom Cortina, 9/14/10