24 Jan 1996

Outline

Assigning constants to character pointers

You may remember the following on the blackboard yesterday:

  s = "Hello";
Avrim seemed to say that this is OK. I want to point out that this is wrong. C (and C++) will allow you to do this in your code. Don't do it. There are two reasons.

The first reason is that some compilers, to conserve memory space, merge multiple occurrences of the same strings into one. Consider, for instance, the following code:

  s = "Hello";
  strcpy(s, "Bye");
  t = "Hello";
Probably what the programmer expected was:
    /---\
  s:| +-+-->"Bye"
    \---/
    /---\
  t:| +-+-->"Hello"
    \---/
And indeed, on many compilers, this is what happens. On the other hand, with some compilers both "Hello" strings will refer to the same string, so that the final memory state looks like this:
    /---\
  s:| +-+-->"Bye"
    \---/    ^
    /---\    |
  t:| +-+----^
    \---/

A second reason is that a string constant may be overwritten after returning from a function.

So don't do it. Now we can move onto finishing out the lecture.

Defining structures

To resume the example we had last time, we were considering a simple map:

              A street, length 200
                 0 -------> 1
                  \       /
B street length 150\     / C street, length 100
                    \   /
                      2
We want to define an adjacency-list representation of this map. So we want an array of lists, with one list for each intersection, where the list for an intersection will contain all the street segments exiting from that intersection. (I will probably slip into calling intersections vertices, since this is an instance of a graph - we'll study graphs later - and an intersection in this map is called a vertex in a graph.)

The first thing we do is declare the structure representing a single street segment:

  struct street_seg {
    int endpt;
    char street_name[40];
    int length;
    street_seg *next;
  };
Then, to make an array of lists, with one list for each vertex, we just make an array of pointers to street_seg structures:
  street_seg *intersections[3];
or, more readably,
  typedef street_seg *street_list; // street_list MEANS street_seg *
  street_list intersections[3];

Now we've defined it, we need to build the list. Let's do this by writing a function that will insert information into a street list. To do this we will need to dynamically allocate memory.

Dynamically allocating memory

In C++, you use "new" to allocate memory. In this example, we can write the following to allocate a new street segment:

  p = new street_seg;
This allocates a new structure in memory and points p to it. (The new operator returns a pointer to the space it has allocated).

For C programmers, you should start using new rather than calling malloc. You'll want to do this anyway, since malloc calls are so ugly.

Incidentally, new returns NULL when not enough memory is available. Technically, real programs should check to see if it returned NULL each time new is used in order to handle memory problems.

All dynamically allocated memory should be eventually deallocated (in order to avoid these memory-full problems). This is done with the delete operator:

  delete p;
When you program completes, all allocated memory is automatically deallocated, so you need not worry about deallocating memory at this point. Many real programs, though, never terminate until a user instructs it to; for these programs, it is important to be careful about memory allocation.

Now let's write a function to insert a street coming from an intersection with the prototype

  street_list insert_info(street_list l, int ending, char *name, int len);
so that a call to the function might be
  intersections[0] = insert_info(intersections[0], 2, "B", 150);
This should remind you strongly of the homework. Since inserting information in a list is such an important operation, we're repeating the code again in lecture.

So the function might look something like the following:

  street_list
  insert_info(street_list l, int ending, char *name, int len) {
    street_list new_road = new street_seg;
    new_road->endpt = ending;
    strcpy(new_road->street_name, name);
    new_road->length = len;
    // now, need to hook up to list l.  How?
    new_road->next = l;
    return new_road;
  }
Do you see any problems with this code? I think it's all OK.

Now, let's suppose we have filled in the array. We've allocated structures and filled in the values. How does this look?

It looks something like the following:

                 0     1     2
              /-----+-----+-----\
intersections |  +  |  +  |  +  |
              \--+--+--+--+--+--/
                 |     |     |
                 v     v     v 
              /----\ /----\ /----\
              | 1  | | 2  | | 0  |
              | A  | | C  | | B  |
              |200 | |100 | |150 |
              |_+__| |NULL| |_+__|
                v    \----/   v
              /----\        /----\
              | 2  |        | 1  |
              | B  |        | C  |
              |150 |        |100 |
              |NULL|        |NULL|
              \----/        \----/

Now we can look at specific values. Try to determine what the value of the following would be:

  intersections[0]->length;
  intersections[0]->next->length;
  intersections[intersections[0]->endpt]->street_name;
In the first instance, the value would be 1. The second expression is equal to 150, and the final example the value is the string "C".

Consider the following implementation of insert_info:

street_list
insert_info(street_list l, int ending, char *name, int len) {
    street_seg new_road;
    new_road.endpt = ending; 
    // ...
    new_road.next = l;
    return &new_road;
}
Do not write code like this, because local variables (such as new_road in this example) are thrown out on returning from a function.

It would be nice to have a list insertion function that is called like the following:

  insert_info(intersections[0], 2, "B", 150);
One solution is to use call-by-reference on the first parameter:
  void
  insert_info(street_list &l, int ending, char *name, int len) {
    street_list new_road = new street_seg;
    new_road->endpt = ending;
	// ...
    new_road->next = l;
	l = new_road;
  }
To avoid using call-by-reference, however, you can add the street as the second element of the list, as follows:
  void
  insert_info(street_list l, int ending, char *name, int len) {
    street_list new_element = new street_seg;
    new_element->endpt = ending;
    // ...
    new_element->next = l->next;
    l->next = new_element;
  }
Do you see anything wrong with this code? It'll work as long as the list l has at least one thing. In fact, often programmers start the list with a dummy element in order to allow this sort of thing.

Allocation of arrays of a type is another frequently desired operation. In this example, we arbitrarily limited the number of intersections to three; if we want a more useful program, though, we might allow a data file to specify the number of intersections, and allocate the memory in this way:

  struct_list *pittsburgh;
  pittsburgh = new street_list[num_intersections];
We define pittsburgh to be a pointer to a street_list because the adjacency list is an array of lists; we want each element of pittsburgh to be a street_list, so we make pittsburgh itself to be a pointer to street_list.

Deallocation of the same array uses a slightly different operator:

  delete[] pittsburgh;
The reason for this slightly different operator involves class destructors, about which we haven't talked, so I won't go into why this is. Just get into the habit of deallocating arrays in this way.

Deallocation should be done carefully. In particular, in this example presumably several allocated street_seg structures are hanging off the pittsburgh array, none of which will be legally accessible thereafter, so huge chunks of memory are lost if each of the lists in the array are not first deallocated before deallocating the array itself.