10 Apr 1996

Outline

Let's talk about Assignment 6 today.

Input files

The input for this assignment is a set of 3 files containing data that describes a street map of Pittsburgh. The following examples are taken directly from the assignment handout (small map). The comments are mine, and don't actually appear in the files. (You probably don't have to write all of this down.)

places.txt

Places are listed alphabetically.
8                          // number of places
Ali_Baba 404 S_Craig_St    // name of place 0, num. on street, name of street
Baker_Hall 4850 Frew_St
...

intersections.txt

For this assignment, only the first line of this file is needed. We'll look at the rest of the file in assignment seven.

19                         // number of intersections
631912 499709              // x and y coordinates of intersection 0
632883 485117              // x and y coordinates of intersection 1
...

streets.txt

Streets are listed alphabetically.

12                         // number of streets
5th_Ave 2                  // first street, alphabetically, 2 blocks
 7 5000 5099 982 8         // first block on 5th_Ave, from node 7 to 8
 8 5100 5199 419 1         // second block on 5th_ave, from node 8 to 10
Amberson_Ave 1
 9 834 999 356 8
...
The street map can be thought of as a graph, with intersections corresponding to nodes, and blocks corresponding to edges. (See figure on handout.)

Tasks

The program is to perform 4 tasks:

  1. Given a street address (e.g., 450 S._Craig), output the numbers of the two adjacent intersections (0 and 1).

    If the name does not exactly match a street name, look for a partial match. If there is more than one partial match, list them all. If there is one exact match or partial match, check to see if the address lies on any of the blocks on the streets.

  2. Given a street address as in Task 1, report all known places (i.e., items in places.txt) that are on the same block as the given address.
  3. Given an intersection number and a radius R, report the numbers of all intersections within R blocks of the given intersection.
  4. Given a street address and a radius, list all the places in places.txt that are within the given number of blocks of the given address. Let us say that any place on the same block as the given address has distance 0.

In addition to writing the code that solves these four tasks, you will also have to write the code that reads in the input data and builds your internal data structures.

Data structures

Before writing any code to solve any of the tasks, it makes sense to spend some time thinking about what data structures are most appropriate for this assignment. The first of these is easy:

Graph

Adjacency list representation
advantages
space efficient, since graph is known to be sparse.
disadvantages
code is a little more complicated than for adjacency matrix

Now let's brainstorm a little bit about what data structures we could use to represent the places and streets, and what algorithms we might use to solve the tasks. Try to think of some alternative yourself before looking at what we suggest.

Street

a struct, containing name, number of blocks, and a linked list of blocks, sorted by starting address
advantages
  • space efficient
  • blocks are given in sorted order
  • easy access to name and number of blocks,
  • easy to walk through blocks in order
disadvantages
  • linear search for block containing address on street
  • linear search even if address is invalid (easy solutions)
a struct, containing name, number of blocks, and an array of blocks, sorted by starting address
advantages
  • space efficient
  • blocks are given in sorted order
  • easy access to name and number of blocks,
  • easy to walk through blocks in order
  • binary search for block containing an address
disadvantages
  • more difficult to code
  • most streets don't contain that many blocks anyway
Any other reasonable representations?

Streets

array of streets, sorted by street name
advantages
  • streets are given in sorted order
  • no insertions/deletions
  • finding exact and partial matches (Tasks 1, 2, and 4)
  • can be very easily coded using linear search
  • with just a little more effort, matches can be found using binary search
disadvantages
are there any?
hash table
advantages
  • your solution will be unique
disadvantages
  • searching for partial matches will probably be difficult, unless your hash function maps similar names to the same bucket, in which case you're losing the advantages of a hash table
binary search tree / splay tree
advantages
  • supports fast searching for an exact match
  • supports fast searching for partial matches using ``members in range'' function
  • Sleator's ``bulletproof'' code is available for you to use
disadvantages
  • getting the details of partial matching right is a little tricky
  • debugging may be trickier, since code is more complicated

Places

This won't be used until Task 4.

array, sorted by street name
advantages
  • input already given in sorted order
  • no insertions or deletions to be done
  • will support looking up of places by name -- not needed in this assignment, but a natural operation
disadvantages
  • sorted order doesn't help any with task 4!
  • we might have to search the whole array for each block
For each block, associate a linked list containing the places on that block.
advantages
  • makes listing the places on a block in Tasks 2 and 4 easy.
disadvantages
are there any?

Queue

A queue will be needed for Tasks 3 and 4.

array
advantages
  • you might already have working code
  • can be very fast and space efficient
disadvantages
  • have to worry about array overflowing
linked list
advantages
  • you might already have working code
  • don't have to worry about resizing an array
disadvantages
  • have to use pointers

Now let's take a step back and look at the different objects for which we may want to define a struct:

intersection/node (maybe)
block/edge        (probably)
street            (probably)
place             (probably)
We'll want several global structures, too:
graph                 (yes)
list of places        (not globally, but one list per block)
list of streets       (yes)
list of intersections (no, already part of adjacency list rep of graph)

The breadth-first search algorithm

Normally we think of this algorithm as performing a search from a single node to all of the other nodes in the graph. In Task 4, however, there are two important differences. First, we are going to search from a block, or edge, to the rest of the graph. One way to do this is to put both endpoints of the edge in the queue initially. Second, instead of listing the nodes (intersections) as we visit them, we want to list the places that we see, which are located on the edges. This is an important difference, because breadth-first search normally ignores an edge if it leads to an already visited node, which we can't necessarily do.

Example: (nodes listed in order they are removed from queue.)

    0       1
      o----o         4 o
      |               /|
      |              / |
      |           3 /  | /~\
      o------------o   | |_|  Anja & Jason's House
    2      /~\      \  |
           |_|       \ |
         Bruce's      \|
         House       5 o
Even though 5 is already in the queue when the edge from 4 to 5 is examined, we have to list Anja & Jason's House