Interfaces: Commentary

Steven Zeil

Last modified: Sep 21, 2016

The Collections framework of the Java API corresponds roughly to what is sometimes called the STL (Standard Template Library) portion of the C++ std. This is the collection of utility data structures and algorithms that form the basic toolkit for almost any kind of programming activity.

Early versions of the Java API had rather limited offerings here: Vector, Stack, Hashtable, and Enumeration were pretty much it. Over time, this limited set has grown to rival, perhaps even exceed, the content of std. Of particular note is the addition of tree-based structures (similar to the ones in std) alongside the earlier hash-based structures. Because hashed structures, though very fast for random access, are useless for sorted access (and range searches), this is a significant addition.

The use of interfaces to describe the basic data abstractions, which are then implemented by specific classes, lends flexibility to the overall framework. It means that you can, for example, design an application to manipulate data in a Map (a lookup table) and later select the specific class (data structure) implementing the Map interface that gives you the performance characteristics you need.
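As a minimal sketch of that flexibility, the method below is written against the Map interface only, and the caller decides which implementation backs the table. The class name WordCounts and the method countWords are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; not part of the Java API.
class WordCounts {
    // Depends only on the Map interface, so any implementation can be passed in.
    static void countWords(String[] words, Map<String, Integer> counts) {
        for (String w : words) {
            Integer old = counts.get(w);
            counts.put(w, old == null ? 1 : old + 1);
        }
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "pear", "fig"};

        Map<String, Integer> hashed = new HashMap<>();  // fast lookup, no useful order
        Map<String, Integer> sorted = new TreeMap<>();  // keys kept in sorted order

        countWords(words, hashed);
        countWords(words, sorted);

        System.out.println(hashed.get("pear"));  // 2
        System.out.println(sorted.keySet());     // [apple, fig, pear]
    }
}
```

Switching data structures means changing one constructor call; countWords itself is untouched.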

But if you are going to provide, as the API does, multiple variants on the same basic data abstraction, you need to decide how to handle the fact that some variants might provide useful operations that others cannot. The “obvious” thing to do would be to group the optional operations into separate interfaces and have the variants declare which combinations of those interfaces are supported. For example, Iterator includes a function remove() that might or might not be supported by different kinds of iterators. My own instinct would have been to split this into two concepts:

package java.util;

public interface Iterator<E> {
   boolean hasNext();
   E next();
}

public interface RemovableIterator<E> extends Iterator<E> {
   void remove();
}

but instead the Java API designers went with

package java.util;

public interface Iterator<E> {
   boolean hasNext();
   E next();
   void remove();
}

with a note that calls to remove() might throw UnsupportedOperationException, which they made an unchecked RuntimeException, in direct contradiction to their earlier recommendations. I find these choices curious, to say the least.
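The practical consequence can be sketched as follows. Both iterators below have the same static type, so the compiler cannot tell them apart, and the difference only shows up at run time. The class RemoveSupport and the helper removeFirst are hypothetical names used for illustration; Collections.unmodifiableList is the standard API call used here to obtain an iterator that refuses remove().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical example class; not part of the Java API.
class RemoveSupport {
    // Tries to remove the first element through the iterator.
    // Returns true if remove() worked, false if the iterator refused.
    static boolean removeFirst(Iterator<String> it) {
        it.next();
        try {
            it.remove();
            return true;
        } catch (UnsupportedOperationException e) {
            // Unchecked, so the compiler never required this handler.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> readOnly = Collections.unmodifiableList(mutable);

        System.out.println(removeFirst(readOnly.iterator()));  // false
        System.out.println(removeFirst(mutable.iterator()));   // true
    }
}
```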

1 The Collection Interface

A Collection is basically something that you can add several elements to and later check to see if they are there. Nothing is said about how those elements might be ordered as you add them. Those details get added in subtypes of Collection.

However, even when you start looking at those subtypes, you as an application writer can be assured that operations like add(E), remove(Object), and contains(Object) will always be available.
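A small sketch of what that assurance buys you: the helper below compiles against Collection alone and accepts a list or a set equally well. The names CollectionBasics and addThenRemove are hypothetical. Note that "available" here means the calls always compile; as with Iterator.remove(), a read-only implementation may still throw UnsupportedOperationException at run time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical example class; not part of the Java API.
class CollectionBasics {
    // Adds an item, checks that it is present, then removes it again.
    // Returns true if the item was found and then successfully removed.
    static boolean addThenRemove(Collection<String> c, String item) {
        c.add(item);
        boolean found = c.contains(item);
        boolean removed = c.remove(item);
        return found && removed;
    }

    public static void main(String[] args) {
        // The same code works for an ordered list and an unordered set.
        System.out.println(addThenRemove(new ArrayList<String>(), "x"));  // true
        System.out.println(addThenRemove(new HashSet<String>(), "x"));    // true
    }
}
```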

Iterators

The Java Iterator is, like its C++ counterpart, an abstraction for a position within a collection. The Java Iterator is slightly enhanced, in that it feeds into the enhanced for-each construct. (A similar range-based for loop was added to the C++ language in the C++11 standard.)

But Java iterators can’t be compared to one another to see if they denote the same position within a container, as C++ iterators can. That is, in my opinion, a significant limitation. In C++, the iterator is often returned as a result from searches to denote the position where something was found. This is possible in Java, but considerably less useful. For example, in C++ we can iterate over an interesting range of positions:

Collection<E>::iterator start = collection.findFirstInterestingItem (x);
Collection<E>::iterator stop = collection.findLastInterestingItem (x);
for (Collection<E>::iterator interesting = start; interesting != stop; ++interesting)
{
   doSomethingTo (*interesting);
}

I can’t think of an elegant way to express the same code in Java, because there’s no real way to compare an iterator used in a loop to a stopping position. The closest that I can come is:

Iterator<E> interesting = collection.findFirstInterestingItem (x);
Iterator<E> stop = collection.findLastInterestingItem (x);
E stopItem = stop.next();
while (true) {
   E interestingItem = interesting.next();
   if (interestingItem == stopItem) { // comparing addresses of the items
      break;
   }
   doSomethingTo (interestingItem);
}

and that only works if collection does not contain any duplicate items.

I also find it awkward that one cannot, in Java, get the value “at” an iterator/position without advancing the position. This makes it hard to write functions like the find… functions above because one cannot actually look at the item at a position before deciding whether the position is something you want to return. For example, in C++, this is a commonly used std function that performs a sequential search:

namespace std {

template <typename Iterator, typename T>
Iterator find (Iterator start, Iterator stop, T key)
{
   for (; start != stop; ++start)
      if (*start == key)
         return start;
   return stop;
}

}

Again, this is all but impossible to write in Java. Not only do we lack a nice way to compare start and stop in the loop condition, but the very essence of the algorithm is that we need to look at the element (*start) in order to determine if start is the position we want. In Java, the only way to look at the element is to call next(), which changes the position when it is called, making it impossible to return that position when we find what we want.

Consequently, it’s fairly rare to see searching functions in Java that return iterators. Iterators are usually employed only to traverse entire collections from beginning to end, making them much less flexible than their C++ counterparts.
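When the collection happens to be a List, positions are usually reported as int indices instead: List.indexOf returns an index rather than an iterator, and subList gives a view over a half-open range of indices. The following is only a sketch of that index-based style, not a general replacement for C++ iterator ranges; the class RangeByIndex and the method sumBetween are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; not part of the Java API.
class RangeByIndex {
    // Sums the elements from the first occurrence of 'first' up to, but not
    // including, the first occurrence of 'last'. Returns 0 if either value
    // is missing or if 'last' occurs before 'first'.
    static int sumBetween(List<Integer> items, Integer first, Integer last) {
        int start = items.indexOf(first);  // -1 if not found
        int stop = items.indexOf(last);    // -1 if not found
        if (start < 0 || stop < start) {
            return 0;
        }
        int sum = 0;
        for (int x : items.subList(start, stop)) {  // half-open range [start, stop)
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 10, 20, 30, 40);
        System.out.println(sumBetween(data, 10, 40));  // 60
    }
}
```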

2 The Set Interface

TreeSet<E> is very similar to the C++ set<E> in function and performance. HashSet<E> is faster for random access, but does not allow fast iteration over the entire set and, if you do iterate over it, visits the elements in an unpredictable order.
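To illustrate the difference, here is a short sketch of a range search, something the tree-based set supports directly through subSet and the hashed set does not. The class SetOrdering and the method inRange are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example class; not part of the Java API.
class SetOrdering {
    // Returns, in ascending order, the distinct values v with low <= v < high.
    static List<Integer> inRange(Collection<Integer> values, int low, int high) {
        TreeSet<Integer> sorted = new TreeSet<>(values);
        // subSet is inclusive of low and exclusive of high.
        return new ArrayList<>(sorted.subSet(low, high));
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(42, 7, 19, 3, 25);

        // Membership tests are what the hashed set is good at.
        Set<Integer> hashed = new HashSet<>(data);
        System.out.println(hashed.contains(19));   // true

        // Sorted traversal and range queries need the tree-based set.
        System.out.println(inRange(data, 5, 26));  // [7, 19, 25]
    }
}
```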

Java has no equivalent to the multiset<E>, a set that can contain multiple copies of an element (a.k.a. a “bag”).

3 The List Interface

List<E> in Java is not the direct equivalent of std::list<E> in C++.

Instead, List<E> is an interface with two commonly used implementations, ArrayList<E> and LinkedList<E>. ArrayList<E> is an array-like structure (that expands its size as necessary) and is therefore the equivalent of std::vector<E> in C++. LinkedList<E> is a doubly linked list and is the equivalent of std::list<E>.

4 The Queue Interface

5 The Map Interface

TreeMap<K,E> is very similar to the C++ map<K,E> in function and performance. HashMap<K,E> is faster for random access, but does not allow fast iteration over the entire set of keys and, if you do iterate over it, visits the elements in an unpredictable order.

Java has no equivalent to the multimap<K,E>, a map where each key can be associated with multiple values.
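A common way to approximate one, sketched here under the assumption that a list of values per key is acceptable, is a Map whose values are themselves Lists. The class SimpleMultimap and its put helper are hypothetical names, not part of the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; not part of the Java API.
class SimpleMultimap {
    // Appends value to the list stored under key, creating the list on first use.
    static void put(Map<String, List<String>> table, String key, String value) {
        List<String> values = table.get(key);
        if (values == null) {
            values = new ArrayList<>();
            table.put(key, values);
        }
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new TreeMap<>();
        put(table, "fruit", "apple");
        put(table, "veg", "kale");
        put(table, "fruit", "pear");

        System.out.println(table);  // {fruit=[apple, pear], veg=[kale]}
    }
}
```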

6 Object Ordering

The Comparable interface is Java’s fix for the fact that one cannot overload the < operator in Java as is commonly done in C++ to affect the order in which data elements would be sorted.
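As a brief sketch, a class states its ordering by implementing compareTo, and sorted containers such as TreeSet then use it where C++ would use operator<. The Version class below is a hypothetical example type invented for illustration.

```java
import java.util.TreeSet;

// Hypothetical example type; compareTo plays the role of operator< in C++.
class Version implements Comparable<Version> {
    final int major;
    final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Negative if this comes before other, zero if equal in order, positive if after.
    @Override
    public int compareTo(Version other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        return Integer.compare(minor, other.minor);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        TreeSet<Version> versions = new TreeSet<>();  // orders elements via compareTo
        versions.add(new Version(1, 10));
        versions.add(new Version(1, 2));
        versions.add(new Version(0, 9));

        System.out.println(versions);  // [0.9, 1.2, 1.10]
    }
}
```

The ordering is numeric on each field, which is why 1.2 sorts before 1.10 here even though the strings would compare the other way.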

7 The SortedSet Interface

8 The SortedMap Interface

9 Summary of Interfaces