University of Virginia, Department of Computer Science
CS201J: Engineering Software, Fall 2002

Problem Set 3: Implementing Data Abstractions Comments




Honor Code Reminder: If you are currently taking CS201J, it is a violation of the course pledge to look at Problem Set answers and discussions from previous years. Please don't do it.




Average: 74/100.

Suppose this is the implementation of degree:
   public int degree () {
      // EFFECTS: Returns the degree of this, i.e., the largest exponent
      //    with a non-zero coefficient.  Returns 0 if this is the zero Poly.
      return terms.lastElement ().power;
    }
Question 1: (Average 6.1 / 8) What rep invariant would make the implementation of degree above satisfy its specification?

For the code to execute without any possible run-time exceptions, we need:

terms != null (otherwise calling a method on terms could throw a NullPointerException)
terms.size > 0 (otherwise lastElement () could throw a NoSuchElementException)
terms.containsNull == false (otherwise .power could throw a NullPointerException)
terms.elementType == \type(TermRecord) (the code is actually missing a cast - it should be return ((TermRecord) terms.lastElement ()).power)
But, that's not enough to know the implementation meets its specification. We need to know also that the last element in the vector corresponds to the element with the largest exponent with a non-zero coefficient. A reasonable rep invariant would require that the terms are sorted by their power value, and that all the terms have non-zero coefficients:
for all 0 <= i < terms.size:
   terms[i].power >= 0
   terms[i].coeff != 0
   for all 0 <= j < i, terms[j].power <= terms[i].power
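The invariant above can also be checked at run time. The following is a minimal sketch, not the course's code: the TermRecord fields are assumed from the snippets shown, the PolyRep class and its repOk method are hypothetical names, and generics are used for brevity even though the original code predates them.

```java
import java.util.Vector;

// Hypothetical record type; only the power and coeff fields appear in the original.
class TermRecord {
    final int power;
    final int coeff;

    TermRecord(int power, int coeff) {
        this.power = power;
        this.coeff = coeff;
    }
}

class PolyRep {
    // Returns true if terms is non-null, non-empty, contains no null,
    // has only non-negative powers and non-zero coefficients,
    // and is sorted by power in non-decreasing order.
    static boolean repOk(Vector<TermRecord> terms) {
        if (terms == null || terms.isEmpty()) {
            return false;
        }
        for (int i = 0; i < terms.size(); i++) {
            TermRecord t = terms.elementAt(i);
            if (t == null || t.power < 0 || t.coeff == 0) {
                return false;
            }
            // Comparing adjacent elements is enough to establish sortedness.
            if (i > 0 && terms.elementAt(i - 1).power > t.power) {
                return false;
            }
        }
        return true;
    }

    // Correct only when repOk(terms) holds: the last term has the largest power.
    static int degree(Vector<TermRecord> terms) {
        return terms.lastElement().power;
    }
}
```

Calling repOk at the end of every mutator during development is one way to catch a broken invariant close to the code that broke it.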
In an alternate implementation with the same rep, suppose this is the implementation of coeff:
   public int coeff (int d) {
      // EFFECTS: Returns the coefficient of the term of this whose
      //     exponent is d.  

      int res = 0;
      Enumeration els = terms.elements ();

      while (els.hasMoreElements ()) {
	 TermRecord r = (TermRecord) els.nextElement ();
	 if (r.power == d) { res += r.coeff; }
      }

      return res;
   }
Question 2: (5.5 / 7) What rep invariant would make the implementation of coeff above satisfy its specification? Explain how a stronger rep invariant would make it possible to implement coeff more efficiently.

Note that the loop sums the values of the coefficients for every record with its power matching d. This indicates that the author of this code is not assuming a given power can only exist in the terms once. Hence, the rep invariant could be:

terms != null (otherwise calling a method on terms could throw a NullPointerException)
terms.elementType == \type(TermRecord)
terms.containsNull == false
A stronger rep invariant would make it possible to implement coeff more efficiently. If the implementer of coeff could rely on the terms not containing duplicate powers, then we could implement coeff to return right away after finding the first matching power, instead of having to look through all the terms.
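As a rough illustration of the early return, here is a sketch that assumes the stronger invariant (no two terms share a power). The TermRecord class and the static coeff helper in CoeffSketch are hypothetical stand-ins for the Poly internals, which are not shown in full here.

```java
import java.util.Enumeration;
import java.util.Vector;

// Hypothetical record type mirroring the fields used in the original snippets.
class TermRecord {
    final int power;
    final int coeff;

    TermRecord(int power, int coeff) {
        this.power = power;
        this.coeff = coeff;
    }
}

class CoeffSketch {
    // Assumes: terms != null, contains no null, and no two terms share a power.
    static int coeff(Vector<TermRecord> terms, int d) {
        Enumeration<TermRecord> els = terms.elements();
        while (els.hasMoreElements()) {
            TermRecord r = els.nextElement();
            if (r.power == d) {
                // Under the stronger invariant the first match is the only match.
                return r.coeff;
            }
        }
        return 0; // no term with exponent d
    }
}
```

If the invariant did allow duplicate powers, this version would silently return only part of the coefficient, which is why the invariant has to be written down and maintained by every mutator.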

If we made the rep invariant even stronger:

terms[i].power = i for 0 <= i < terms.size
(that is, the vector contains a term record for every power in order) we could implement coeff with just:
public int coeff (int d) {
   if (d >= terms.size ()) return 0;
   else return ((TermRecord) terms.elementAt (d)).coeff;
}
Question 3: (8.8 / 10) In the example above, we decided to allow null in the els vector. A reasonable alternative design decision would be to not allow null in the els vector. Modify StringSet.java to not allow null in the string set.

Start with inv1/StringSet.java, the version of StringSet after adding the first invariant clause. Instead of adding //@invariant els.containsNull == true you will add //@invariant els.containsNull == false.

Check your implementation with ESC/Java, and add annotations or fix the code so running ESC/Java on your program produces no warnings. (Recall that you can require a parameter to be non-null using //@requires s != null.)

Our code is in StringSet.java. We added //@requires s != null annotations to insert, remove, isIn and getIndex. Note that this allows us to simplify the code for getIndex.
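StringSet.java is not reproduced here, so the following is only a sketch of how a non-null precondition can simplify getIndex. The class name NonNullStringSet, the method bodies, and the use of generics are all assumptions; only the method names and the els field come from the text above.

```java
import java.util.Vector;

// Hypothetical stand-in for StringSet with the "no null elements" invariant.
class NonNullStringSet {
    // Rep: a Vector of non-null Strings with no duplicates.
    //@invariant els != null
    //@invariant els.containsNull == false
    private final Vector<String> els = new Vector<String>();

    //@requires s != null
    private int getIndex(String s) {
        // Because neither s nor any element can be null, a plain equals
        // call is safe and no special case for null is needed.
        for (int i = 0; i < els.size(); i++) {
            if (s.equals(els.elementAt(i))) {
                return i;
            }
        }
        return -1;
    }

    //@requires s != null
    boolean isIn(String s) {
        return getIndex(s) >= 0;
    }

    //@requires s != null
    void insert(String s) {
        if (!isIn(s)) {
            els.addElement(s);
        }
    }

    //@requires s != null
    void remove(String s) {
        int i = getIndex(s);
        if (i >= 0) {
            els.removeElementAt(i);
        }
    }

    int size() {
        return els.size();
    }
}
```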

Question 4: (9.4 / 10) Describe at least two possible representation choices for StringTable. Discuss the advantages and disadvantages of each representation, and explain why you selected the representation you did. Your choice of representation will have a big impact on how difficult it is to implement the StringTable data type, so think carefully about how you will implement all the methods of StringTable in choosing your representation.

There are many reasonable possible representation choices of StringTable. Here are some possibilities:

  1. Two Vectors, one containing the keys, one containing the values.
  2. A Vector containing key/value records.
  3. A Vector containing key/value records, in order by key.
  4. A Vector containing key/value records, in order by value.
  5. A Hashtable mapping keys to values.
  6. A Vector containing key/value records, in order by value, and a Hashtable mapping keys to values.
For each of the above choices, we could also use arrays instead of Vectors. Using an array should be more efficient for accessing elements, but means we would need to worry about writing code to reallocate a bigger array as elements are added.

Option 1 (two Vectors) has the advantage that we can implement it directly: there is no need to implement a record class to store the key/value pairs. It has the disadvantage that we need to maintain two separate Vectors. This is probably more work, and more likely to lead to errors if we forget to maintain both Vectors, so we reject Option 1. Another disadvantage of Option 1 is that double is a primitive type and cannot be stored directly in a Vector. Instead, we would need to use the Double object type.

Option 2 (a Vector of key/value records) has the advantage that we only need to maintain a single Vector. We will need to implement a record class to store the key and value. If we don't put any constraints on the order of elements in the vector, however, it will be difficult to implement getValue efficiently and implementing getNthLowest will be very complicated. So, we consider options 3 and 4. In Option 3, we keep the entries in the Vector in order by their keys. This will make it possible to implement getValue efficiently (for example, by doing a binary search). It won't help us implement getNthLowest though. If we thought getValue was extremely important to implement efficiently, we might consider this further, but since our priority is simplicity of implementation, Option 4 (ordering by the values) is a better choice. Keeping the entries sorted by value will make implementing getNthLowest much easier, since we can just select the element at position index from the Vector. (Compare this to how the stronger rep invariant made implementing coeff easier in Question 2.)

The final two options use Hashtables. The Java API provides a java.util.Hashtable class that provides an efficient way of mapping keys to values. Option 5 would make it easy to implement getValue: it's just a hash table lookup. But implementing getNthLowest would be extremely difficult. Option 6 combines the Hashtable and the Vector sorted by values. This will require extra work since we need to maintain both the Hashtable and the Vector, and each new entry will need to be added to both. But it would have the advantage that both getValue and getNthLowest can be implemented simply and efficiently: getValue is just a Hashtable lookup, and getNthLowest is just a vector index. If performance of both getValue and getNthLowest is important, this option would be the best of the possibilities considered.
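To make the trade-off in Option 6 concrete, here is a minimal sketch of a table that keeps both structures in step. It is not the representation we chose; the class names Entry and IndexedTable are hypothetical, and generics are used for brevity.

```java
import java.util.Hashtable;
import java.util.Vector;

// Hypothetical key/value record.
class Entry {
    final String name;
    final double value;

    Entry(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

class IndexedTable {
    // Rep: every Entry appears exactly once in each structure.
    private final Hashtable<String, Entry> byName = new Hashtable<String, Entry>();
    private final Vector<Entry> byValue = new Vector<Entry>(); // ascending by value

    // Assumes name is not null (Hashtable rejects null keys).
    void addName(String name, double value) {
        Entry old = byName.get(name);
        if (old != null) {
            byValue.removeElement(old); // keep the Vector in step with the Hashtable
        }
        Entry e = new Entry(name, value);
        int index = 0;
        // Insert before the first element whose value is greater than value.
        while (index < byValue.size() && byValue.elementAt(index).value <= value) {
            index++;
        }
        byValue.insertElementAt(e, index);
        byName.put(name, e);
    }

    // A hash table lookup; returns 0 for a missing name, as getValue does in our solution.
    double getValue(String name) {
        Entry e = byName.get(name);
        return e == null ? 0 : e.value;
    }

    // A vector index; assumes 0 <= index < size().
    String getNthLowest(int index) {
        return byValue.elementAt(index).name;
    }

    int size() {
        return byValue.size();
    }
}
```

The extra cost shows up in addName, which must update both structures on every call; forgetting either update breaks the rep invariant that links them.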

Since we think simplicity of implementation is more important, though, we select Option 4 and implement our rep with:

class TableEntry {
    /*@non_null@*/ public String name;
    public double value;
}

public class StringTable 
{
    // Rep:
    private Vector entries; // A Vector of TableEntry objects.
}
Question 5: (8.9 / 10) What is your representation invariant?

As explained above, we keep the Vector sorted by value so this should be in our rep invariant:

    // invariant the entries are sorted in order by entries[i].value
    //    for all 0 <= i < j < entries.size, entries[i].value <= entries[j].value
In addition, the entries must not contain duplicate elements with the same key:
    // invariant the entries do not contain duplicate keys 
    //    for all 0 <= i < entries.size, 0 <= j < entries.size,
    //        entries[i].name.equals (entries[j].name) ==> i == j
The other properties of our rep invariant can be expressed formally with ESC/Java annotations: entries must not be null, every element of entries must be a TableEntry object, and entries must not contain null:
    //@invariant entries != null;
    //@invariant entries.elementType == \type(TableEntry) 
    //@invariant entries.containsNull == false 
In addition, we express the constraint between the number of elements in our vector and the numEntries specification (ghost) variable with:
    //@invariant entries.elementCount == numEntries 
There is also a representation invariant for our TableEntry record type. The name value may not be null. We can express this using /*@non_null@*/:
class TableEntry {
    /*@non_null@*/ public String name;
    public double value;
    ...
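The two informal clauses (sorted by value, no duplicate keys) cannot be checked by ESC/Java, but they can be checked at run time. The following is a sketch only: the StringTableRep class and its repOk method are hypothetical names, and the TableEntry class is redeclared so the example stands on its own.

```java
import java.util.Vector;

// Redeclared here so the sketch is self-contained.
class TableEntry {
    final String name;
    final double value;

    TableEntry(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

class StringTableRep {
    // Returns true if entries satisfies every clause of the rep invariant:
    // non-null, no null elements or names, sorted by value, no duplicate names.
    static boolean repOk(Vector<TableEntry> entries) {
        if (entries == null) {
            return false;
        }
        for (int i = 0; i < entries.size(); i++) {
            TableEntry e = entries.elementAt(i);
            if (e == null || e.name == null) {
                return false;
            }
            // Sorted: each element is at least as large as its predecessor.
            if (i > 0 && entries.elementAt(i - 1).value > e.value) {
                return false;
            }
            // No duplicate keys: compare against every earlier element.
            for (int j = 0; j < i; j++) {
                if (entries.elementAt(j).name.equals(e.name)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

The duplicate check is quadratic in the number of entries, which is acceptable for a debugging aid but is a reason not to leave it enabled in production code.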
Question 6: (8.8 / 10) What is your abstraction function? Your abstraction function should map your concrete representation to the abstract string table.

Our abstraction function needs to map our concrete representation, a Vector of TableEntry objects, to the abstract notion in the StringTable specification: A typical StringTable is {<s0: d0>, <s1: d1>, ... }.

    AF(c) = { <c.entries[i].name: c.entries[i].value> | 0 <= i < c.entries.size }
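One way to read the abstraction function is as a method that builds the abstract value from the rep. The sketch below uses a java.util.Map as a stand-in for the abstract set of pairs; the AbstractionSketch class and abstractValue method are hypothetical, and TableEntry is redeclared so the example is self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Vector;

// Redeclared here so the sketch is self-contained.
class TableEntry {
    final String name;
    final double value;

    TableEntry(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

class AbstractionSketch {
    // Maps the concrete Vector of entries to the abstract table
    // {<s0: d0>, <s1: d1>, ...}. Assumes the rep invariant holds,
    // so no name appears twice and no put overwrites an earlier pair.
    static Map<String, Double> abstractValue(Vector<TableEntry> entries) {
        Map<String, Double> result = new LinkedHashMap<String, Double>();
        for (TableEntry e : entries) {
            result.put(e.name, e.value);
        }
        return result;
    }
}
```

Note that the ordering of the Vector is a property of the rep only: two Vectors holding the same pairs map to equal abstract values.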

Question 7: (18.2 / 25) Implement the StringTable datatype. Check your implementation with ESC/Java. Add annotations to document your invariant and preconditions and postconditions of methods as necessary to remove warnings.

If you had trouble with this, please go through our implementation carefully.


import java.io.*;
import java.util.Vector;
import java.util.Enumeration;

class TableEntry {
    /*@non_null@*/ public String name;
    public double value;
    
    public TableEntry (/*@non_null@*/ String n, double r) {
	name = n;
	value = r;
    }
    
    public String toString () {
	return "<" + name + ": " + value + ">";
    }
}

public class StringTable 
{
    // overview: StringTable is a set of <String, double> entries,
    //    where the String values are unique keys.  A typical StringTable
    //    is {<s0: d0>, <s1: d1>, ... }.
    //

    //@ghost public int numEntries ; // The number of entries in the table

    // Rep:
    private Vector entries ;

    // Rep Invariant:

    //@invariant entries.elementType == \type(TableEntry) 
    //@invariant entries.containsNull == false 
    //@invariant entries.elementCount == numEntries 
    //@invariant entries != null;

    // invariant the entries are sorted in order by entries[i].value
    //    for all 0 <= i < j < entries.size, entries[i].value <= entries[j].value
    // invariant the entries do not contain duplicate keys 
    //    for all 0 <= i < entries.size, 0 <= j < entries.size,
    //        entries[i].name.equals (entries[j].name) ==> i == j
    
    // Abstraction Function:
    //
    // AF(c) = { <c.entries[i].name: c.entries[i].value> | 0 <= i < c.entries.size }
    

    public StringTable () 
    //@ensures numEntries == 0;
    {
	entries = new Vector ();
	//@set entries.elementType = \type(TableEntry)
	//@set entries.containsNull = false
    } /* ESC/Java cannot establish invariant for empty Vector */ 
    
    /* 
    ** This method was used in PS2, but you do not need to implement it for PS3.
    */

    // Read a StringTable structure from a file formatted as the output of toString
    public StringTable (InputStream instream) 
       // requires: The stream instream is a names file containing lines of the form
       //                   <name>: <rate>
       //           where the name is a string of non-space characters and the rate is
       //           a floating point number.
       // modifies: instream
       // effects:  Initializes this as a names table using the data from instream.
    {
	this ();
	
	try {
	    StructuredReader reader = new StructuredReader (instream);
	    while (true) {
		try {
		    String name = reader.readThroughAny (":");
		    reader.skipWhitespace ();
		    double value = reader.readDouble ();
		    reader.skipWhitespace ();

		    if (lookupEntry (name) != null) {
			System.err.println ("Names file contains duplicate entry: " + name);
		    } else {
			addName (name, value);
		    }
		} catch (EOFException e) {
		    break;
		} 
	    } 
	    
	    reader.close ();
	    // It would be better to propagate these exceptions to the caller,
	    // but for PS2, we avoid the use of exceptions.
	} catch (IOException e) {
	    System.err.println ("Error reading names file: " + e);
	} catch (NoNumberException e) {
	    System.err.println ("Error reading number in names file: " + e);
	} catch (NumberFormatException e) {
	    System.err.println ("Number format error reading names file: " + e);
	}
    }

    // To avoid duplicate code, we implement a helper method for looking 
    // up an entry in the table.  Note that it is declared as "private"
    // so is not visible to callers.  Because it is private, we can
    // return a mutable component of the rep here.  If callers were
    // allowed to call lookupEntry, that would expose the rep.

    private TableEntry lookupEntry (String name)
	// effects: If there is an entry in the table where entry.name.equals (name),
	//    returns that entry.  Otherwise, returns null.

    {
	for (Enumeration e = entries.elements (); e.hasMoreElements (); ) {
	    TableEntry entry = (TableEntry) e.nextElement ();
	    if (entry.name.equals (name)) {
		return entry;
	    }
	}
	
	return null;
    }

    public void addName (/*@non_null@*/ String name, double value)
       // requires: The parameter name is not null.  (This is what the
       //    ESC/Java /*@non_null@*/ annotation means.)
       // modifies: this   
       // effects:  If name matches a key in this, replaces the value associated
       //    with that key with value.  Otherwise, inserts <name: value> into this.
       //           e.g., if this_pre = {<s0: d0>, <s1: d1>, <s2: d2>}
       //                     and s0, s1 and s2 are all different from name
       //                 then this_post = {<s0: d0>, <s1: d1>, <s2: d2>, <name: value>}.
       //                 if this_pre = {<s0: d0>, <s1: d1>, <s2: d2>}
       //                     and s1 is the same string as name
       //                 then this_post = {<s0: d0>, <s1: value>, <s2: d2>}
       //                 
       //@modifies numEntries
       //@ensures  numEntries >= \old(numEntries);
    {
	TableEntry oldentry = lookupEntry (name);
	
	if (oldentry != null) {
	    // We can't just replace the value - that would break the rep invariant.
	    // Instead, we remove it and add the entry again.
	    entries.removeElement (oldentry);
	} 
	  
	// Linear search to find the location to insert (before the first
	// element whose value is greater than this one)

	int index;
	
	for (index = 0; index < entries.size (); index++) {
	    TableEntry entry = (TableEntry) entries.elementAt (index);
	    if (entry.value > value) {
		break;
	    }
	}

	entries.insertElementAt (new TableEntry (name, value), index);
    }
    
    public double getValue (String name)
       // effects: Returns the value associated with name in this, or 0 if
       //    name is not a key in this.
    {
	TableEntry entry = lookupEntry (name);
	
	if (entry != null) {
	    return entry.value;
	} else {
	    return 0;
	}
    }
    
    public /*@non_null@*/ String getNthLowest (int index)
       //@requires index >= 0;
       //@requires index < numEntries;
    {
	TableEntry entry = (TableEntry) entries.elementAt (index);
	return entry.name;
    }
    
    public int size ()
       // EFFECTS: Returns the number of entries in this.
       //@ensures \result == numEntries;
    {
	return entries.size ();
    }
    
    public /*@non_null@*/ StringIterator keys () 
       // EFFECTS: Returns a StringIterator that will iterate through all the keys in this in 
       //    order from lowest to highest.
    {
	Vector allNames = new java.util.Vector ();
	//@set allNames.elementType = \type(String) ;
	//@set allNames.containsNull = false ;
	
	for (Enumeration e = entries.elements (); e.hasMoreElements (); ) {
	    allNames.addElement (((TableEntry) (e.nextElement ())).name);
	}
	
	return new StringIterator (allNames.elements ());
    } //@nowarn Invariant
    /* ESC/Java has a bug in checking the invariant here, but we know it is
       preserved since the code doesn't modify the representation at all. 
       The problem is ESC/Java cannot distinguish between a Vector that is
       used to represent a StringTable, and the local Vector variable that
       happens to have the same type.
    */
    
    public String toString ()
    {
	// Using the mutable StringBuffer type instead of the immutable String
	// type will save the trouble of making lots of new String objects.
	StringBuffer res = new StringBuffer ("{ ");
	boolean firstone = true;

	for (Enumeration e = entries.elements (); e.hasMoreElements (); firstone = false) {
	    TableEntry entry = (TableEntry) e.nextElement ();
	    if (!firstone) { res.append (", "); }
	    res.append (entry.toString ());
	}
	
	res.append (" }");
	return new String (res);
    }
}

Question 8: (6.6 / 10) Devise a testing strategy for your implementation. You should add a main method to your StringTable class that runs the test cases. Describe your testing strategy in general, and any problems in your implementation that were discovered in testing.

You can (and should) devise and describe a testing strategy even if you don't get the implementation working. You should always be testing your code as you develop it, instead of waiting until you have written all the code before starting to test.

A good strategy is to consider the boundary and interesting cases for the data abstraction, and test all methods on those cases. Interesting tables include the empty table, a table with one entry, and a table with many entries. We should focus on the most complicated methods: in our implementation, that is addName. We should try adding names to each of the three kinds of tables. Black box testing of addName also demands testing the case where we add a name that already exists. We should test all three cases for the new value: where it is equal to the value of some other element in the table, where it is lower than the lowest value in the table, and where it is higher than the highest value in the table. After each modification, we should check the state of the table using the observers (getValue, getNthLowest and toString).

A simpler implementation will make our testing easier. Many of your programs had fiendishly complicated addName or getNthLowest methods with many different paths. Glass box testing of those methods would require considerable effort to convince yourself they work correctly. Since our implementation has only one path for getNthLowest, there are no new glass box tests to consider. Our addName implementation has separate paths for when the entry already exists and when it doesn't, and then a loop over the entries. Our black box tests already cover all of those paths.
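The strategy above might be turned into a small test driver along the following lines. This is a sketch, not SecretTest.java: MiniTable is a hypothetical, cut-down table that mirrors the sorted-by-value rep so the example is self-contained, and TableTests with its check and runAll helpers are invented names.

```java
import java.util.Vector;

// Hypothetical cut-down table mirroring the sorted-by-value rep.
class MiniTable {
    private static class Entry {
        final String name;
        final double value;
        Entry(String name, double value) { this.name = name; this.value = value; }
    }

    private final Vector<Entry> entries = new Vector<Entry>(); // ascending by value

    void addName(String name, double value) {
        // Remove any existing entry for name, then re-insert in sorted position.
        for (int i = 0; i < entries.size(); i++) {
            if (entries.elementAt(i).name.equals(name)) {
                entries.removeElementAt(i);
                break;
            }
        }
        int index = 0;
        while (index < entries.size() && entries.elementAt(index).value <= value) {
            index++;
        }
        entries.insertElementAt(new Entry(name, value), index);
    }

    double getValue(String name) {
        for (Entry e : entries) {
            if (e.name.equals(name)) return e.value;
        }
        return 0;
    }

    String getNthLowest(int index) { return entries.elementAt(index).name; }

    int size() { return entries.size(); }
}

class TableTests {
    private static int failures;

    private static void check(boolean condition, String description) {
        if (!condition) {
            failures++;
            System.err.println("FAILED: " + description);
        }
    }

    // Runs the boundary cases described above; returns the number of failures.
    static int runAll() {
        failures = 0;
        MiniTable t = new MiniTable();
        check(t.size() == 0, "empty table has size 0");
        check(t.getValue("x") == 0, "lookup in empty table");

        t.addName("b", 2.0);                      // add to the empty table
        check(t.size() == 1, "one entry after first add");
        check(t.getNthLowest(0).equals("b"), "single entry is the lowest");

        t.addName("a", 1.0);                      // lower than the lowest
        t.addName("c", 3.0);                      // higher than the highest
        t.addName("d", 2.0);                      // equal to an existing value
        check(t.size() == 4, "four entries");
        check(t.getNthLowest(0).equals("a"), "a is lowest");
        check(t.getNthLowest(3).equals("c"), "c is highest");
        check(t.getValue("d") == 2.0, "d has value 2.0");

        t.addName("a", 4.0);                      // name that already exists
        check(t.size() == 4, "replacing does not change size");
        check(t.getValue("a") == 4.0, "a has its new value");
        check(t.getNthLowest(3).equals("a"), "a moved to the highest position");
        check(t.getNthLowest(0).equals("b"), "b is now lowest");
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " failure(s)");
    }
}
```

Checking the observers after every mutation, rather than only at the end, makes it easier to tell which call to addName first broke the table.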

Question 9: (7.0 / 10 — 13 out of 25 students were able to pass all the test cases) Submit your code using the Submission Form. Your program will be tested on public test cases, as well as secret test cases (which will not be revealed until after the assignment is due). You receive 10 points if your code passes all the test cases.

Here are the tests we used: SecretTest.java.


Sponsored by the
National Science Foundation
cs201j-staff@cs.virginia.edu