Old version
This is the CS 112 site as it appeared on May 8, 2019.
Problem Set 8
due by 11:59 p.m. on Wednesday, May 1, 2018
No late submissions will be accepted after 11:59 p.m. on Thursday, May 2.
Preliminaries
In your work on this assignment, make sure to abide by the collaboration policies of the course.
If you have questions while working on this assignment, please
come to office hours, post them on Piazza, or email
class instructor
.
Make sure to submit your work following the instructions found at the end of the assignment.
Part I
40 points total
Problem 1: Complete trees and arrays
12 pts. total; individual-only
Put all of your work for this problem in a plain-text file named
ps8pr1.txt
.
Assume that you have a complete tree with 200 nodes, and that you represent it in array form.
-
Node A of the tree is in position 50 of the array. What are the indices of A’s left child, right child, and parent? Explain how you got your answers.
-
What is the height of the tree? Explain your answer briefly.
-
The bottom level of the tree contains some number of leaf nodes. Is the rightmost leaf node in the bottom level the left child of its parent or the right child of its parent? Explain your answer briefly.
-
In Problem 6, you will implement a method that determines if the complete tree corresponding to an array of values is a heap. What is the efficiency of determining if an array of length
n
represents a heap in the best case? In the worst case? Use big-O notation, and explain your answers briefly.
Problem 2: Heap basics
6 pts. total; individual-only
Put all of your work for this problem in a PDF file named
ps8pr2.pdf
. See below for details on how to create the PDF file.
Consider the following heap of integers:

-
Show the heap as it will appear:
-
after one removal (i.e., after one call to the
remove()
method discussed in lecture) -
after a second removal.
You should include two separate diagrams, one for how the heap will look after each removal. To do so, you should:
-
Open the template that we have created for this problem in Google Docs.
-
Select File -> Make a copy..., and save the copy to your Google Drive using the name
ps8pr2
. -
Edit the diagram for part 1 by clicking on it and then clicking the Edit link that appears below the diagram.
-
Make the necessary changes to the tree to show the results of the first removal.
-
Click the Save & Close button.
-
Make a copy of your edited diagram within the Google Doc and edit it to show the results of the second removal. Click the Save & Close button for that new diagram.
-
-
Suppose we have the original heap and that 40 is inserted followed by 25. Show the final heap by editing the diagram that we have provided for part 2 in your copy of the Google Doc.
Once you have made all of the necessary edits to your copy of the Google Doc, choose File -> Download as -> PDF Document, and save the PDF file on your machine.
Problem 3: Heaps and heapsort
7 pts. total; individual-only
Put all of your work for this problem in a PDF file named
ps8pr3.pdf
. See below for details on how to create the PDF file.
Consider the following complete tree of integers:

-
Turn this tree into a max-at-top heap using the procedure outlined in lecture, and show the heap that is produced.
-
Open the template that we have created for this problem in Google Docs.
-
Select File -> Make a copy..., and save the copy to your Google Drive using the name
ps8pr3
. -
Edit the diagram for part 1 by clicking on it and then clicking the Edit link that appears below the diagram.
-
Make the necessary changes to the tree to show the final heap.
-
Click the Save & Close button.
-
-
What is the array representation of the max-at-top heap that you obtained in part 1?
-
Heapsort begins by turning the array to be sorted into a heap. Assume that your answer to part 2 is the result of this process of turning the original array into a heap.
What will the array look like after one element is put into its final position by heapsort – i.e., at the end of the first iteration of the loop in
heapSort()
?What will the array look like after the second element is put into its final position?
Once you have made all of the necessary edits to your copy of the Google Doc, choose File -> Download as -> PDF Document, and save the PDF file on your machine.
Problem 4: Hash tables
9 points total; individual-only
Put all of your work for this problem in a plain-text file named
ps8pr4.txt
.
The following sequence of keys is to be inserted into an initially empty hash table of size 8 that employs open addressing:
cat, goat, dog, bird, bison, ant, flea, bat, duck
The hash function assigns to each key the number of characters in the
key. For example, h("cat")
is 3, because "cat"
has 3 characters.
-
Assume that linear probing is used to insert the keys. Determine which key causes overflow, and show the table at that point by copying the following template into your text file and adding the keys at the appropriate positions:
0: 1: 2: 3: 4: 5: 6: 7:
-
Now assume that quadratic probing is used. Determine which key causes overflow, and show the table at that point.
-
Finally, assume that double hashing is used, with the value of the second hash function based on the first character in the key: a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, etc. Determine which key causes the table to overflow, and show the table at the point at which it does so.
Problem 5: Huffman encoding
6 points; individual-only
We will cover the material needed for this problem during the coming week.
Put all of your work for this problem in a PDF file named
ps8pr5.pdf
. See below for details on how to create the PDF file.
Consider the following table of character frequencies:
Character | Frequency |
o | 6 |
f | 9 |
l | 13 |
c | 22 |
a | 30 |
-
(4 points) Show the Huffman tree that would be constructed from these character frequencies.
-
Open the template that we have created for this problem in Google Docs.
-
Select File -> Make a copy..., and save the copy to your Google Drive using the name
ps8pr5
. -
Edit the diagram by clicking on it and then clicking the Edit link that appears below the diagram. It includes the some of the necessary nodes and edges, but you should create more of them as needed, move them into the appropriate positions, and connect them.
-
Click the Save & Close button.
-
-
(2 points) Using the Huffman tree from part 1, what will be the encoding of the string focal?
Once you have made all of the necessary edits to your copy of the Google Doc, choose File -> Download as -> PDF Document, and save the PDF file on your machine.
Part II
60 points total
Preparing for Part II
Begin by downloading the following zip file: ps8.zip
Unzip this archive, and you should find a folder named ps8
, and
within it the files you will need for Part II. Make sure that you
keep all of the files in the same folder.
Problem 6: Determining if an array is a heap
20 points; individual-only
Make sure to begin by following the instructions given above under Preparing for Part II.
In Problem6.java
, we have given you the following pair of methods, one
of which still needs to be implemented:
public static <T extends Comparable<T>> boolean isHeap(T[] arr) { if (arr == null || arr.length == 0) { return false; } return isHeapTree(arr, 0); } private static <T extends Comparable<T>> boolean isHeapTree(T[] arr, int i) { // Implement this method. }
Together, these methods are meant to determine if an array of objects
corresponds to a heap. The methods are generic, and they should work
with an array of any type T
that implement Comparable<T>
.
Implement the second method so that it uses recursion to process
the complete tree/subtree whose root is at position i
in the array
arr
. The method should return true
if that tree/subtree is a heap,
and false
if it is not a heap. A leaf node is the root of a heap with
only one node.
Note that the first method – which is a public wrapper method for
the method that you will write – makes the initial call to your
method with a value of 0
for i
, which means that it will determine
if the complete tree represented by the full array is a heap. You should
not modify this wrapper method.
In implementing your recursive method, you may find it helpful to review:
-
the arithmetic rules for navigating through a complete tree that is stored in an array
-
how the
compareTo()
method can be used to compare two objects.
We have included a main()
method with one unit test. You should
add at least two others.
Problem 7: Implementing a hash table using separate chaining
40 points; individual-only
In lecture, we discussed the following interface for a hash table ADT:
public interface HashTable { boolean insert(Object key, Object value); Queue<Object> search(Object key); Queue<Object> remove(Object key); }
We also examined one implementation of this interface that uses
open addressing (the OpenHashTable
class that we’ve included in
the folder for Part II).
In this problem, you will write your own implementation of this
interface that uses separate chaining. Call your class
ChainedHashTable
, and put it in an appropriately named Java
file. Make sure that the class header indicates that it implements the
HashTable
interface.
Maintaining the state of the hash table
Each ChainedHashTable
object should include at least two private
fields:
-
one called
numKeys
for the number of keys in the hash table -
one called
table
that holds a reference to the array that you will use for the hash table. However, rather than having each element of the array store a single entry as we do in open addressing, each element of the array should serve as a bucket or chain of entries.
More precisely:
-
Each element of the array should hold a reference to the first node in a linked list of nodes. You may not use our
LLList
class or any other linked-list class for this purpose. Rather, you should define a private inner class for the nodes, and you should manipulate these nodes yourself within your hash table methods. -
Each node should store a single key and the collection of values associated with that key, as well as a reference to the next node in the linked list (if any).
-
You should use an
LLQueue
object for the collection of values associated with a given key, just as we do in theOpenHashTable
class.
Other tasks
-
Copy the
h1()
method from ourOpenHashTable
class into your new class, and use it as the hash function. -
Because separate chaining allows all entries whose keys are hashed to a given position in the table to stay in that position, you will not need to perform any probing. As a result, you won’t need the second hash function
h2()
because it is only necessary for double hashing. -
Include a constructor that takes the size of the hash table as its only parameter and initializes the hash table accordingly. It should throw an
IllegalArgumentException
if the size is not positive. -
Implement each method from the
HashTable
interface as efficiently as possible. Make sure that each of these methods has the same basic functionality as the corresponding method from theOpenHashTable
class discussed in lecture. In particular, they should throw exceptions for the same reasons that theOpenHashTable
methods do.As you add or remove items from the table, make sure that you update
numKeys
accordingly.Because a given chain can grow to an arbitrary length, the hash table will never overflow, and thus your
insert
method can always returntrue
. -
Our
OpenHashTable
class doesn’t include atoString()
method, but you should include one in yourChainedHashTable
class. It should return a string of the form{table[0], table[1], ..., table[size - 1]}
where each position of the table is represented by either:
-
the key or keys in that position’s chain of entries, separated by commas and surrounded by square brackets.
-
the word
null
if there are no keys in that position
For example:
> ChainedHashTable table = new ChainedHashTable(5); > table.insert("howdy", 15); > table.insert("goodbye", 10); > table.insert("apple", 5); > table {[apple, howdy], null, null, [goodbye], null}
Note that:
-
Position 0 has a chain with two keys, because both
"howdy"
and"apple"
are assigned a hash code of 0 byh1()
when the table size is 5. -
Position 3 has a chain with one key, because
"goodbye"
is assigned a hash code of 3 byh1()
when the table size is 5.
-
-
Define an accessor method for the number of keys. Call this method
getNumKeys()
. For example:> ChainedHashTable table = new ChainedHashTable(5); > table.insert("howdy", 15); > table.insert("goodbye", 10); > table.insert("apple", 5); > table.getNumKeys() 3 > table.insert("howdy", 25); # insert a duplicate > table.getNumKeys() # duplicates don't change the number of keys 3
-
Although a hash table that uses separate chaining won’t overflow, it can become too full to offer decent performance. To allow clients to measure the degree of fullness, add a method called
load()
that takes no parameters and that returns a value of typedouble
that represents the load factor of the table: the number of keys in the table divided by the size of the table.For example:
> ChainedHashTable table = new ChainedHashTable(5); > table.insert("howdy", 15); > table.insert("goodbye", 10); > table.insert("apple", 5); > table.load() 0.6 > table.insert("pear", 6); > table.load() 0.8
-
To allow clients to obtain all of the keys in the hash table, add a method called
getAllKeys()
that takes no parameters and that returns an array of typeObject
containing all of the keys in the hash table.For example:
> ChainedHashTable table = new ChainedHashTable(5); > table.insert("howdy", 15); > table.insert("goodbye", 10); > table.insert("apple", 5); > table.insert("howdy", 25); # insert a duplicate > Object[] keys = table.getAllKeys(); > keys { apple, howdy, goodbye }
-
To deal with situations in which the table becomes too full, add a method called
resize()
that takes an integer representing the new size, and that grows the table to have that new size. It should not return a value.As discussed in lecture, it is not possible to simply copy every element of the current table into a new, larger table. This is because a given key may belong in a different position in the larger table. As a result, you will need to rehash the current keys in the hash table to ensure that they end up in the correct position in the resized table.
For example:
> ChainedHashTable table = new ChainedHashTable(5); > table.insert("howdy", 15); > table.insert("goodbye", 10); > table.insert("apple", 5); > table {[apple, howdy], null, null, [goodbye], null} > table.resize(7); > table {null, [apple], null, null, null, [howdy], [goodbye]}
Note that in this case, resizing the table causes all three keys to end up in different positions!
Special cases:
-
The method should throw an
IllegalArgumentException
if the specified new size is less than the table’s current size. -
If the specified new size is the same as the table’s current size, the method should return without doing anything.
-
-
Add a
main()
method that performs at least two unit tests for each of the methods in the class. Use the same format as the sample unit test inProblem6.java
.
Optional Extra Credit
Problem 8: Analyzing a text document
20 points; individual-only
Overview
Hash tables can be useful when analyzing textual data.
In the file TextInfo.java
, we’ve given you the beginnings of a
class that will serve two purposes:
-
It will provide a blueprint for
TextInfo
objects that can be used to analyze a text file and to store information about the words that appear in that file. -
It will also serve as program that users can run to analyze a text, and to ask for information about specific words in that text.
Here is the menu of options that the program will provide:
1) Find out where a word appears 2) Lookup a word's frequency 3) List the most common words 4) Quit
Each TextInfo
object will maintain two separate hash tables:
-
one that keeps track of the line numbers on which each word in the text file appears
-
one that keeps track of each word’s frequency (i.e., how often it appears in the text).
In addition, the TextInfo
object will need another data structure
that allows you to efficiently implement option 3: listing the k most
common words in the text for some integer k. You may use any of the
data structures and algorithms that we have studied this semester.
We have included the most recent Java classes from lecture in the ps8
folder, but feel free to use code that we gave you for the other
problem sets as well.
In general, you should try to implement all of the necessary functionality as efficiently as possible.
Your tasks
Here are the steps you should take:
-
Define private fields for the two hash tables and for the additional data structure for option 3. Ideally, you should use your
ChainedHashTable
class from the previous problem for the hash tables. However, if you weren’t able to get that class to work, you may use ourOpenHashTable
class instead, although you will need to give it some of the additional functionality that we asked you to include in theChainedHashTable
class. -
We have given you the beginnings of a constructor for
TextInfo
objects that takes a string specifying the name of a text file. Complete it so that it initializes the fields of the new object based on an analysis on the text file. In particular, it should add the necessary entries to the hash tables, and it should initialize the additional data structure that you are using for option 3.Because we haven’t covered file processing in Java, we’ve given you some skeleton code for reading the file line-by-line. The code that we have provided:
-
creates a
File
object for the file whose name is passed in -
determines the number of bytes in the file, and computes a size for the hash tables that is based on that value
-
creates a
Scanner
object to read from the file -
uses a
while
loop to obtain one line at a time from the file.
You should add whatever code is needed to process each line. You should also add whatever code is needed before and after the
while
loop.Notes:
-
We have included a special clause in the header of the constructor
(throws FileNotFoundException
) that Java requires because the process of creating aScanner
for a file can produce this exception. -
Make sure that you use the
size
variable that we have provided when constructing the two hash tables. -
Don’t forget that you can use the
split()
method to split aString
object into an array of strings. -
You do not need to attempt to remove punctuation from the words.
-
-
Define a method called
getNumWords()
that returns the number of unique words in the text file represented by theTextInfo
object. You should not need to use a separate field for this value. Rather, you should make use of functionality that is already present in the data structures that you created in the constructor.Here’s one example:
> TextInfo text = new TextInfo("romeo.txt"); > text.getNumWords() 2470
-
Define a method called
linesForWord()
that takes a string representing a word and that returns a string specifying the line numbers on which that word appears in the text file represented by theTextInfo
object. If the word does not appear in the file, the method should return the string"none"
.For example:
> TextInfo text = new TextInfo("romeo.txt"); > text.linesForWord("thy") "{102, 109, 152, 161, 163, 279, 325, 463, 520, 596, 605, 612, 630}" > text.linesForWord("Romeo") "{924}" > text.linesForWord("Romeo,") "{343, 472, 701, 1035}" > text.linesForWord("Java") "none"
Hint: When you retrieve the values associated with a key in one of our hash table implementations, you get back a queue of values. Our
LLQueue
class includes atoString()
method that you can use to see all of the values in the queue, so you can – and should! – take advantage of that method! -
Define a method called
frequency()
that takes a string representing a word and that returns an integer specifying the number of times that the word appears in the text file represented by the calledTextInfo
object.For example:
> TextInfo text = new TextInfo("romeo.txt"); > text.frequency("thy") 13 > text.frequency("Romeo") 1 > text.frequency("Romeo,") 4 > text.frequency("Java") 0
-
Define a method called
kMostCommon()
that takes an integerk
as its only parameter, and that returns an array of strings containing thek
most common words in the text file represented by the calledTextInfo
object – beginning with the most common word, then the second most common word, etc.For example:
> TextInfo text = new TextInfo("romeo.txt"); > String[] words = text.kMostCommon(10); > words { the, of, I, and, a, to, in, my, is, you } > text.kMostCommon(20) { the, of, I, and, a, to, in, my, is, you, with, that, ROMEO:, his, not, And, be, BENVOLIO:, it, as }
-
Complete the
main()
method so that it makes the appropriate method calls and displays the appropriate information. We have given you some of the code for this method. You should fill in the rest. A sample run of the program can be found here.
Submitting Your Work
Submission Checklist:
-
You have read the Java Style Guide (linked on Course page)and followed all guidelines regarding format, layout, spaces, blank lines, and comments;
-
You have removed or changed all comments inherited from my templates which do not relate to your solution;
-
ensured that the signature of the methods specified in the assignment have not been changed.
-
You have verified that your programs satisfy all the performance tests in the templates;
Follow the instructions and submit the following files:
- your
ps8pr1.txt
file for Problem 1 - your
ps8pr2.pdf
file for Problem 2 - your
ps8pr3.pdf
file for Problem 3 - your
ps8pr4.txt
file for Problem 4 - your
ps8pr5.pdf
file for Problem 5 - your modified
Problem6.java
file containing your work on Problem 6 - your
ChainedHashTable.java
file containing your work on Problem 7; - your modified
TextInfo.java
file containing your work on Problem 8 (optional)
Warnings
-
Make sure to name the folder ps8.
-
Make sure to use these exact file names as specified or we will not be able to find your files and grade them.
-
Make sure that you do not try to submit a
.class
file or a file with a~
character at the end of its name. -
If you make any last-minute changes to one of your Java files (e.g., adding additional comments), you should compile and run the file after you make the changes to ensure that it still runs correctly. Even seemingly minor changes can cause your code to become unrunnable.
-
Please rememner to NOT create a TAR file.