CS231 lecture notes, Fall 1999
Week 14, Monday

Uses for a table
----------------

Accumulating information about a set of things, when the
set of things is not know ahead of time.

Associative array.

Compression of big tables.

Ever wonder how the switch statement works?

It's actually better than the corresponding chained conditional.

Since the cases are known at compile time, it can generate a
perfect hash function!


Table ADT implementations
-------------------------

The discussion of this issue on page 362 of the book is
brain dead.  They compare three implementations:

sorted array
AVL tree
hashtable

by looking at 6 operations:

initialization
determine if full
get
put
delete
enumerate in order


Problems with their discussion

1) I have no idea what they mean by initialization, and I
   cannot think of a reason either the sorted array or the
   hashtable would take O(n) time to initialize.

2) "determine if full" is not an operation client code performs

3) it is not obvious that enumerating in order is a more essential
   operation than just enumerating.


If you get rid of all the crap, then basically hashtables win
across the board, and there is no reason you would choose
something else,

	  provided that all you care about is order of growth.


The other big piece of wool in this discussion is that ultimately
order of growth is not a reliable way to choose an implementation.


Performance measurement
-----------------------

Often a good way to choose an implementation is to assemble a
test suite that is characteristic of the workload you expect.

This includes not only

1) the relative frequencies of the various operations

but also things like

2) the dispersion of keys and the order they arrive

3) the tendency of the hash function to disperse keys

etc.


If you write code using the Table ADT, without violating the
wiggly line of abstraction, then it is easy to test multiple
implementations and switch.


Dynamic hybrids
---------------

Using ADTs also facilitates the possibility of dynamic hybrid
data structures.

For example, if you expect to insert a lot of things into
the Table, and then perform many iterations of "enumerate in
order," you might build a hashtable, sort it once, and store
the result in a sorted array.

In theory, an implementation could switch from one form to
another dynamically, all without the knowledge of the client
code.

In practice, that sort of thing is not common, except in
memory-management land

		  (quick plug for Section 7.6)


Performance measurement
-----------------------

If you want to know how the java.util.HashTable is implemented,
how could you find out?

First of all, what hint do we get from the documentation?

Do keys and elements return the keys/elements in order?

What experiment would we perform?