Program 7

Suite Using Collection Classes

Advanced Programming/Practicum
15-200


Introduction Now that we are using Java 1.5, with generic collection classes, please note the following. For the Pre-Practice Exam, Program #1, and Program #2, write and submit two versions of each program: one without using Java generics and one using Java generics. I'd suggest writing the Pre-Practice Exam and Program #1 first using UNgeneric collections, and then translate them to use generics. For Program #2, I suggest writing it using Java generics first, and then translate back to UNgeneric collections. The actual programs, whether genric or UNgeneric will look very similar. Constantly ask yourself, "Is my solution the simplest it can be?" Toward this end, use the new Java "for" loop whenever appropriate.

For Program #3 and Program #4, use EITHER (whichever you are most comfortable with). For Program #5, DO NOT use Java generics.

This programming assignment is designed to ensure that you know how to use combinations of collection classes to model and solve a wide variety of different programming problems. The kind of abstract thinking that goes into modeling solutions to these programming problems with collection classes is critical to your development as computer scientists. Collection classes are important and useful themselves, and they also rely on almost everything powerful that we have learned about Java: interfaces, classes in an inheritnce hierarchy, casting, and general pupose iterators. OK, I don't mean to oversell this assignment, but I do think that it is important: the first in-class programming exam will be devoted to writing a program, similar to but simpler than these, using collection classes (see the Pre-Practice Exam below).

You will write five different programs for this assignment (plus the Pre-Practice Exam), each requiring a small amount of code, all in a main method (or maybe on static helper method or anonymous class: don't be overly general). The collection classes you will use are queues, sets, lists, and maps. The programs you will write are (the numbers in parentheses are the number of lines in my solution, not counting imports, blank lines, nor lines with just opening or closing braces).

  • Read a file and produce a frequency count of the tokens it contains (28 lines).
  • Read a directed graph and produce the reverse graph (32 lines).
  • Read a directed graph and prompt the user for the name of a node; compute all nodes in the graph reachable from it (34 lines).
  • Read a large text file and produce random text mimicing its style (44 lines).
  • Read a table of values characterizing a function a map from an n argument list to a value; produce a curryed version of the function (32 lines).
All these programs will be described in greater detail below. Instead of starting with empty project folders, download and unzip the Start Project Folders which contains appropriately named folders with all the necessary classes and imports: you just write the main method in each application. Write each program in its own project folder; then put all five folders in another folder whose name is your name and the program number (e.g., pattis-7). Then zip this folder and dropoff that single zip file.

Note that pairing is prohibited on this assignment. Each student should do his/her own work in preparation for the first in-class programming exam, which will be a program similar to one from the early ones in this assignment. I hope that while it might take you a while to complete the first few programs, that once you get accustomed to thinking about and using collections, that you will be able to write the last programs (which are more complicated) at the same speed.

As always, you can check the behavior of your programs against mine by downloading and running my Executable, to help you understand the specification of the problem and observe the programmer/user interaction that you are to implement


General Information For problems in this suite, you must be familiar with the Java interfaces for Queue, Set, Map, List, Iterator, and Comparator. In addition, you must understand the use of the classes SimpleQueue, HashSet, HashMap, ArrayList, and LinkedList, which implement these collections. You must also understand the use of the static sort method defined in the Arrays and/or Collections classes, which use these interfaces, especially Comparator: if you need to use the Arrays class, you must be able to convert between collections and arrays. Finally, you should be familiar with the casting of generic (Object) references, when necessary, and be able to read/use/write small classes with public instance variables that store compound data.

Note: the following line of code prompts the user for a file name and stores into allTokens a String consisting of all the tokens read from the file, separated by single spaces. These tokens can be extracted sequentially by using an object constructed from the StringTokenizer class and simple code. All file input on the test is handled in this way.

  TypedBufferReader tbr = new TypedBufferReader("Enter name of file of words");
  String allTokens      = tbr.readAllTokens();
Qualifications: For full credit on this programming assignment you must
  • Use all the collection classes (and possibly the specified arrays) in the problem statement
  • Use no other collections or arrays.
By following this guideline, your solution should be compact, elegant, and efficient. Work hard at writing beautiful code, letting the collection classes do as much work as possible, automatically. Of course, your program must work not only on the tested data, but on other similar data as well. Read Sun's Javadoc for details on the relevant interfaces and classes used in these problems.

Pre-Practice Exam Although I am listing this program here, you should write it after solving Program #1 and Program #2 below. The Pre-Practice Exam is written in the style and format of the Practice Exam and the real In-Class Programming Exam that you will write. I will go over this problem in an upcoming recitation section, discussing both how to read and solve it (and giving some useful hints). Ideally, you should read this problem before the recitation, and then solve it yourself (either right after you read it, or after the recitation in which I solve it), before the Practice Exam. It has its own form of input that is similar to -but a bit more complicated than- the input forms in this programming assignment. It includes an iterator that returns lines from the input file as a String, whose individual tokens are then returned via a StringTokenizer.

Program #1
Word Frequency
Simple Statement: Read a file of words, building a map (Map[String] -> Integer) from each word to its frequency (the number of times that it occurs in the file), and print the map. Then build a list (List[Map.Entry*]) consisting of the word/frequency pairs in the map. Sort and print this list twice: first with the word/frequency pairs ordered alphabetically; second with the word/frequency pairs ordered from highest to lowest frequency (words with equal frequency can appear in any order; the built-in sort method will take care of these details).

Details: The following details describe more precisely the program that you are to write.

  1. Create the map while processing the input file using the standard call to readAllTokens and the StringTokenizer class.

  2. When printing the map (and later the list, twice) use a single line of Java code each time, calling the standard toString method on the collection being printed (either implicitly or explicitly).

  3. Build the list using the Map.Entry objects stored in the map.

  4. To sort the list, either use two anonymous classes (or write two small named classes) specifying the ways to order the Map.Entry objects stored in the list.
Sample Input: Test your program on the input file input1.txt, which contains the following data
  w x y z
  w x y
  w x w a c
  a b
  c
Sample Output/Interaction: The program will have the following interaction (computer-output appears bold-faced; user-input does not)
  Enter name of file of words: input1.txt

  Map:
  {b=1, x=3, a=2, w=4, z=1, c=2, y=2}

  List (sorted alphabetically):
  [a=2, b=1, c=2, w=4, x=3, y=2, z=1]

  List (sorted by frequency):
  [w=4, x=3, a=2, c=2, y=2, b=1, z=1]

Program #2
Reverse Graph
Simple Statement: Read a file of pairs of node names representing a edges in a directed graph, building a map (Map[String] -> Set[String]) from the first node (source) to a set of the second nodes (destinations). Print all these edges, one source node per line (and all its edges), sorted alphabetically by source node. Build a second map (again Map[String] -> Set[String]) that represents a graph that is the reverse of the first one: if the first graph has and edge a->b then the second one has an edge b->a. Print all these edges, one source node per line (and all its edges), sorted alphabetically by source node. Note that there are multiple data files for this program.

Details: The following details describe more precisely the program that you are to write.

  1. Create the map while processing the input file using the standard call to readAllTokens and the StringTokenizer class.

  2. Write a method to sort/print the map; it will be called for the original map and the map for the reversed graph. When printing the set of destination nodes, use a single line of Java code, calling the standard toString method on the set being printed (either implicitly or explicitly).

  3. To reverse the graph iterate through each of its source nodes; for each source node, iterate through each of its destination nodes; for each source->destination edge in the original graph, put the edge destination->source in the map storing the reverse graph.
Sample Input: Test your program on the input file graph1.txt, which contains the following data
  a b    a c
  b d
  c e    c f
  d g
  e d
  f d    f g
This represents the graph
  Sample Output/Interaction: The program will have the following interaction (computer-output appears bold-faced; user-input does not)
  Enter name of file with graph: graph1.txt

  Graph: source -> {destination} edges
  a -> [b, c]
  b -> [d]
  c -> [f, e]
  d -> [g]
  e -> [d]
  f -> [g, d]

  Reverse Graph: source -> {destination} edges
  b -> [a]
  c -> [a]
  d -> [b, f, e]
  e -> [c]
  f -> [c]
  g -> [f, d]
Note that the source nodes are SORTED alphabetically, but the set of desintation nodes is NOT SORTED. This represents the graph
 

Program #3
Reachable
Simple Statement: Read a file of pairs of node names representing a edges in a directed graph, building a map (Map[String] -> Set[String]) from the first node (source) to a set of the second nodes (destinations). Print all these edges, one source node per line (and all its edges), sorted alphabetically by source node. Prompt the user for the name of a node. Build a set (Set[String]) into which you we will put all nodes reachable from the first, by searching outward from a user-supplied node. Print all the reachable nodes. Note that there are multiple data files for this program; some of them describe circular graphs, which must be correctly processed (avoiding infinite loops in your algorithms).

Details: The following details describe more precisely the program that you are to write.

  1. Create the map while processing the input file using the standard call to readAllTokens and the StringTokenizer class.

  2. When printing the set of destination nodes, use a single line of Java code, calling the standard toString method on the set being printed (either implicitly or explicitly).

  3. To search the graph, build a set (of reachable nodes) and a queue (of nodes to search from), initializing the later with the user-supplied node. While the queue still has nodes, remove one, put it into the set, and for all its destination nodes THAT ARE NOT ALREADY in the set, put them in the queue.

  4. When the queue finally becomes empty (can you prove this always happens -there is no infinite looping), print all nodes in the reachable set.
Sample Input: Test your program on the input file graph1.txt, which contains the following data
  a b    a c
  b d
  c e    c f
  d g
  e d
  f d    f g
See the first picture above.

Sample Output/Interaction: The program will have the following interaction (computer-output appears bold-faced; user-input does not)

  Enter name of file with graph: graph1.txt

  Graph: source -> {destination} edges
  a -> [b, c]
  b -> [d]
  c -> [f, e]
  d -> [g]
  e -> [d]
  f -> [g, d]

  Enter node to start from: c
  Reachable: [g, f, e, d, c]
Note that the source nodes are SORTED alphabetically, but the set of desintation nodes is NOT SORTED.

Program #4
Word Generator
Simple Statement: Prompt the user for the order statistic n: 1, 2, 3, etc. Read a file of tokens, building a map (Map[List[Stringn]] -> List[String*]) from a list of n words to a list of the words in the text following these words: e.g., if n were 2, the map would contain a key for every pair of words in the text, and a value that is a list of all the words following the key (no matter where the pair occurs, with NO DUPLICATES allowed). Print all the associations, one per line, in any order (the n words followed by the list of words that follow them in the text).

Prompt the user for the number of random words to generate, and then prompt for the n words to start with. Build a list (List[String*]) using the words to start with to generate a random next word, then use the previous n words (dropping the oldest word and adding the new word generated) to generate another random word; repeat. Note: you might have to stop prematurely if you generate the last n words in the text, if these words occur nowhere else. That is because in this case, there is no random word to generate following them! Print the list.

Details: The following details describe more precisely the program that you are to write.

  1. Create the map while processing the input file using the standard call to readAllTokens and the StringTokenizer class.

  2. To build the map, start by prereading n words into the list (assume that this is always possible; how might it not be); then read the next word and put it in as a value associated with the list; then, drop the oldest word and add this next word (always keeping the list length at n); repeat until there are no words to read.

  3. Print the map by just iterating over the Map.Entrys, printing the key and value separately (but on the same line).

  4. Generate the list of words by using a technique similar to building the map: keep a list of n words (originally the user is prompted for them), using the list to generate the next word, and altering the list by dropping its oldest word and adding the new one (always keeping the list length at n).

  5. Write the private helper method private static String randomWord(List l) that generates a random number in the range [0,l.size()) and return the word in the list at that location.
Sample Input: Test your program on the input file input1.txt, which contains the following data.
  a b c b a d c b a d c a a b a a d
This isn't real text, but it allows us to exhibit the correct output much more compactly.

Sample Output/Interaction: The program will have the following interaction (computer-output appears bold-faced; user-input does not)

  Enter Order Statistic: 2
  Enter name of sample text file: input1.txt
  
  Allowed Transitions
  [a, d] -> [c]
  [a, b] -> [c, a]
  [a, a] -> [b, d]
  [b, c] -> [b]
  [b, a] -> [d, a]
  [c, b] -> [a]
  [c, a] -> [a]
  [d, c] -> [b, a]
  
  Enter # of words to generate: 10
  Enter prefix word[0]: a
  Enter prefix word[1]: d

  Results = [a, d, c, a, a, b, a, d, c, a, a, d]
So, as you can see, the only allowed transition from [a,d] is c; in the text above, a d appears twice, and it is followed each time by a c. The allowed transitions from [a,b] are c and a; in the text above, a b appears twice, first followed by c and then by a.

In the result we start with a d (specified by the user), we know only c can come next; then using d c we know that either b or a must come next; it randomly chooses a...


Program #5
Currying a Function
Simple Statement: Note, this process -reducing an multi-argument function to a single argument function that returns another function- is named after the logician Haskell Curry.

Prompt the user for the arity, n, of the function being curryed: 2, 3, etc. (must be at least 2). Read a file with each line containing n+1 values (n arguments of the function and the function's value for this arguments) building a map (Map[List[Stringn]] -> String) from the list of arguments to the value. Thus, if the line said a b c it means f(a,b) = c Print the associations in this map. Create a new map of this function completely curryed: repeatedly reduce the arity of the argument list until it is one, mapping each reduced arity list to a value that itself is a map. Print the associations in this map. In the case of an arity of three, the result is Map[String] -> (Map[String] -> (Map[String] -> String))

Note that there are multiple data files for this program.

Details: The following details describe more precisely the program that you are to write.

  1. Create the map while processing the input file using the standard call to readAllTokens and the StringTokenizer class.

  2. To build the map, read n values, appending them to a list, and then put this list into the map associated with the next value read.

  3. When printing the map, use a single line of Java code, calling the standard toString method on the map being printed (either implicitly or explicitly).

  4. To reduce the arity by 1, iterate over the Map.Entrys in the current map, building a curryed map; for each value, translate it as follows: the association [a,b,c]->d would be translated into [a,b]->{c->d}; if another assocation were [a,b,x]->z the result of translating both would be [a,b]->{c->d,x->z}.

  5. Perform the arity reduction n-1 times, so that the arity of the resulting map/function is 1; when going from arity 2 to arity 1 replace the list by the sole value in the list.
Hint: Write aprogram that reduces the arity by 1, until it works on the data shown below. Then, nest that code in a loop. Note that there are multiple data files for this program. Sample Input: Test your program on the input file input2.txt, which contains an arity 2 function that for simplicity catenates its arguments.
  a b ab
  a c ac
  a d ad
  b a ba
  b c bc
  b x bx
This file represents a function of arity 2. Each line is read like f(a,c) = ac.

Sample Output/Interaction: The program will have the following interaction (computer-output appears bold-faced; user-input does not)

  Enter the arity of the function: 2
  Enter name of file of tabulated function: input2.txt

  Original Function
  {[a, d]=ad, [a, c]=ac, [a, b]=ab, [b, c]=bc, [b, a]=ba, [b, x]=bx}

  Curryed Function
  {b={a=ba, x=bx, c=bc}, a={b=ab, d=ad, c=ac}}

  Function is completely curryed
The resulting funcition is of arity 1; f(a) is itself a function, represented by the map {b=ab, d=ad, c=ac}, so f(a)(b) is the value ab, just as in the original function (of arity 2) f(a,b)=ab.

Extra Credit There is one interesting bug in the Word Generator program, causing the map to not be built correctly (and not in any subtle way; you'll see the problem in 3-d). The first person to come upon this bug and describe it on the bboard will get one extra point; I will describe the solution thereafter in response.