cs230 Lecture Notes
Week 11, Thursday

Homework 9 was due on Tuesday.
Homework 10 is due on Monday.
I hope they are going well.


Today's topics
--------------

0) exam comments

1) hashtable implementation of the Table or Map ADT

2) computer science classes next year


Exam comments
-------------

Median and mean were 77-78, which I consider good performance.

Part One:

For short answer questions, good answers must be factually
correct, but just as importantly, they have to express ideas
clearly, concisely, and in the vocabulary of this class.

Making good design decisions is important, but so is your
ability to talk about designs.

Part Two:

polymorphic, interface, implementations, queueing discipline
(or semantics for half credit), preconditions, tokens,
delimiters.

Grading in this section was mostly digital, since the objective
is to retrieve specific vocabulary words.

Part Three:

Successful answers were the ones that used

1) familiar idioms (traverse the list, remove the elements
   from a PQ, etc.)

2) data structures (PriorityQueue, Stack)

If you tried to implement the solution explicitly, the
result was almost always incomplete and error-prone.


Part Four:

Pretty good.  Most of you are seeing the recursive solution to
problems like this, which is great.

Some confusion about clone.

1) clone() is an object method in the Object class.

2) By default it performs a shallow copy.

3) By convention, classes should override it to perform whatever
   kind of copy is appropriate, usually deep (or recursive) copy.

As an exercise, write an implementation of clone for the Tree
class that does a deep copy:


Simple implementation of the Table ADT
--------------------------------------

Handy to have an object for key-value pairs...

class KeyValuePair {
    Object key, value;

    public KeyValuePair (Object key, Object value) {
        this.key = key;
        this.value = value;
    }

    public String toString () {
        return "{ " + key + ", " + value + " }";
    }
}

If you looked ahead, you might have found this useful for
finding the 20 most common words.

1       { the, 5219 }
2       { a, 2145 }
3       { to, 2037 }
4       { of, 2015 }
5       { is, 1920 }
6       { and, 1369 }
7       { that, 1332 }
8       { in, 1116 }
9       { you, 1048 }
10      { it, 968 }
11      { we, 880 }
12      { verbatim, 861 }    (typesetting artefact)
13      { this, 697 }
14      { for, 688 }
15      { are, 681 }
16      { an, 653 }
17      { if, 544 }
18      { as, 538 }
19      { method, 501 }
20      { can, 442 }

Might want to make KeyValuePair implement Comparable.

Vector implementation of a Hashtable.

public class Table {
    Vector v;

    public void put (Object key, Object value) {
        // incomplete implementation
        KeyValuePair pair = new KeyValuePair (key, value);
        v.add (pair);
    }

    public Object get (Object key) {
        Iterator iterator = v.iterator ();
        while (iterator.hasNext ()) {
            KeyValuePair pair = (KeyValuePair) iterator.next ();
            if (key.equals (pair.key)) {
                return pair.value;
            }
        }
        return null;
    }

    public Enumeration keys () {
        return v.elements ();
    }
}

Actually, put() has to update the KVP if it already exists.


Hashtable performance
---------------------

The problem with this implementation is that both put and get
are linear.

Solution

1) divide the vector into many smaller vectors.

   if the length of the vectors is bounded, then the
   performance of put and get is constant time

2) use a hash function to figure out which vector
   a given key is in.

The Object class provides hashCode

public int hashCode()

     Returns a hash code value for the object.  This method is
     supported for the benefit of hashtables such as those provided by
     java.util.Hashtable.

     The general contract of hashCode is: 

         Whenever it is invoked on the same object more than once
         during an execution of a Java application, the hashCode
         method must consistently return the same integer, provided no
         information used in equals comparisons on the object is
         modified. This integer need not remain consistent from one
         execution of an application to another execution of the same
         application.

         If two objects are equal according to the equals(Object)
         method, then calling the hashCode method on each of the two
         objects must produce the same integer result.

         It is not required that if two objects are unequal according
         to the equals(java.lang.Object) method, then calling the
         hashCode method on each of the two objects must produce
         distinct integer results. However, the programmer should be
         aware that producing distinct integer results for unequal
         objects may improve the performance of hashtables.

     As much as is reasonably practical, the hashCode method defined
     by class Object does return distinct integers for distinct
     objects. (This is typically implemented by converting the
     internal address of the object into an integer, but this
     implementation technique is not required by the Java
     programming language.)


The values returned by hashCode are integers, but they can
be very large integers.

To make them into indices, we need to restrict their range,
for which the modulus operator is handy

    obj.hashCode() % v.size()

Produces an integer that is a legal index into Vector v.

More than one key can map to a given index, so we still might
have to search a vector.

Next time: resizing the hashtable to satisfy performance guarantees.


Computer Science Overview
-------------------------

Hardware

240: Machine Org
340: Architecture
349: Networks

Software

231: Algorithms
251: Programming Languages
301: Compiler Design
341: Operating Systems

Theory

235: Languages and Automata
231: Algorithms
310: Theory of Computation
331: Parallel Machines and Algorithms

Applications

232: Artificial Intelligence
332: Visual Processing
303: Bioinformatics
307: Computer Graphics
349: Databases

Classes you can take with just cs230

235, 231, 232, 251, 332, 307, Databases

Classes you can take with cs230 and cs240

301, 341, 340, 349