Iterators: an ADT for Positions

Steven J. Zeil

Last modified: Apr 29, 2024

Many of the polymorphic classes that we will be looking at this semester are containers of multiple instances of other data types.

A challenge is how to design those class interfaces so that they are as general as possible while remaining efficient.

A difficult problem faced by the designers of many container classes is how to provide access to the data stored inside. In this lesson, we will look at the iterator, a very popular pattern for this purpose.

1 Getting Inside a Container

In our OrderedSequence, we used an insert function to put data into the sequence.

OrderedSequence<String> seq = new OrderedSequence<>(100);
seq.insert("abc");

To get data out, we used a get function:

String first = seq.get(0);
String last = seq.get(seq.size()-1);

This function retrieves values by their integer positions within the sequence. Now that’s a bit of a disconnect. When we insert a string into a sequence, we know that it will go into the appropriate position so that the entire sequence is alphabetically ordered. But without looking at all the data already inside the sequence, we don’t know what numerical position it will end up at. So, except for a couple of special cases (first and last), how would we know what integer position to give to retrieve a particular string? Do we even care? We might argue that the most likely application of an ordered sequence would look like this:

OrderedSequence<String> seq = new OrderedSequence<>(100);
    ⋮
for (int i = 0; i < seq.size(); ++i) {
    doSomethingWith(seq.get(i));
}

So we might ask if get-from-an-integer-position is really the best way to go about this.

A related factor: get-from-an-integer-position is easy to implement and fast to perform if we are using an array or array-like structure to implement the sequence. But we will eventually be looking at some other data structures that can support the idea of “things kept in an ordered sequence” for which no get-from-an-integer-position function is available or, if it is available, for which that function is terribly slow.
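
To make the performance concern concrete, here is a small sketch using java.util.LinkedList from the standard library (a structure not yet introduced in this lesson). Its get(int) must walk the chain of links to reach the requested position, so the indexed loop repeats that walk on every pass, while the iterator version simply steps from one element to the next. The class name IndexedVersusIterator and its methods are hypothetical names chosen for illustration.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IndexedVersusIterator {

    // Joins the elements using get-from-an-integer-position.
    // On a LinkedList, each get(i) walks the links to reach position i.
    static String joinByIndex(List<String> items) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < items.size(); ++i) {
            out.append(items.get(i));
        }
        return out.toString();
    }

    // Joins the elements by stepping an iterator from one element to the next.
    static String joinByIterator(List<String> items) {
        StringBuilder out = new StringBuilder();
        for (Iterator<String> it = items.iterator(); it.hasNext();) {
            out.append(it.next());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> items = new LinkedList<>();
        items.add("a");
        items.add("b");
        items.add("c");
        System.out.println(joinByIndex(items));    // prints abc
        System.out.println(joinByIterator(items)); // prints abc
    }
}
```

Both loops produce the same result; the difference lies only in how much work the container does to reach each element.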

2 Iterators

If our main concern is to support loops that hand out the contents of a container, one at a time, we can generalize the idea of a position within a container. To support looping through the contents of a container, the operations we need are

  1. Get a starting position within the container.
  2. Get the item at a position.
  3. Move a position forward to the next item.
  4. Check to see if a position is at the end of a container.

If we have those operations, we can support code something like this:

for (position pos = start of container; pos is not at the end of container; move pos forward) {
    doSomethingWith(get the data at pos);
}

The Java pattern for this kind of position within a container is called Iterator.

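package java.util;

import java.util.function.Consumer;

public interface Iterator<E> {

    /**
     * Returns true if there are more elements to be visited.
     */
    public boolean hasNext();

    /**
     * Returns the element at this position and moves the position forward one.
     */
    public E next();

    /**
     * Optional: remove the element most recently returned by next().
     * (The library supplies a default version, so implementing classes may omit it.)
     */
    public void remove();

    /**
     * Applies action to each element not yet visited.
     * (The library supplies a default version of this function as well.)
     */
    public void forEachRemaining(Consumer<? super E> action);

}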

What we have just discussed are the functions provided by the iterator. Equally important, the container must also provide some function that returns an iterator denoting the starting position of its data. This is most often done with a function named (surprise!) iterator(). Containers that provide this function normally declare that they implement the interface Iterable.
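
As a rough illustration of how the four operations listed earlier map onto this interface, the sketch below walks a container with its iterator. iterator() supplies the starting position, hasNext() is the end-of-container check, and next() both fetches the item and moves forward. java.util.ArrayList is used only as a convenient existing container, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IteratorBasics {

    // Adds up the lengths of all strings in the container.
    static int totalLength(Iterable<String> container) {
        int total = 0;
        Iterator<String> pos = container.iterator(); // 1. starting position
        while (pos.hasNext()) {                      // 4. not yet at the end?
            String item = pos.next();                // 2 and 3. get item, move forward
            total += item.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("abc");
        words.add("de");
        System.out.println(totalLength(words)); // prints 5
    }
}
```

Because totalLength asks only for an Iterable<String>, it would work unchanged with any container that provides iterator().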

2.1 Using Iterators

2.1.1 Example: Books and Authors

For example, we have earlier used a class Book. Each book has one or more authors. How would we provide access to those authors? We had been doing it through these functions:

public class Book implements Comparable<Book> {
	⋮
	public int numberOfAuthors() { return numAuthors; }

	public Author getAuthor(int i) {
        return authors[i];
    }
	⋮
}

allowing us to write code like this:

for (int i = 0; i < book.numberOfAuthors(); ++i) {
	Author au = book.getAuthor(i);
	doSomethingWith(au);
}

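but that’s actually a bit of a problem: as noted earlier, access by integer position suits arrays but may be unavailable or terribly slow for other data structures.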

A more typical, more Java-styled approach would be to provide an iterator for this purpose:

public class Book implements Comparable<Book>, Iterable<Author> {
	⋮
	public Iterator<Author> iterator() { ... }
	⋮
}

allowing us to write code like this:

for (Iterator<Author> iter = book.iterator(); iter.hasNext();) {
	Author au = iter.next();
	doSomethingWith(au);
}

or

Iterator<Author> iter = book.iterator();
while (iter.hasNext()) {
	Author au = iter.next();
	doSomethingWith(au);
}

I actually prefer the for loop version, because the loop variable iter is limited to the loop body and does not persist into the surrounding code. But some people seem uncomfortable with the empty part of the for loop header after the second semi-colon (‘;’).
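
For readers curious about what might go behind the elided iterator() body, here is one possible sketch, not necessarily the implementation used in this course. It assumes the authors are kept in an array with a separate count, as the earlier numberOfAuthors and getAuthor functions suggest, and it borrows an iterator from a list view of that array. BookSketch, AuthorSketch, addAuthor, and authorNames are hypothetical stand-ins invented for this example.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for the course's Author class.
class AuthorSketch {
    private final String name;
    AuthorSketch(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical stand-in for Book, reduced to the author-related parts.
class BookSketch implements Iterable<AuthorSketch> {
    private final AuthorSketch[] authors = new AuthorSketch[10]; // assumed capacity
    private int numAuthors = 0;

    // Assumed helper; no check for a full array in this sketch.
    void addAuthor(AuthorSketch au) {
        authors[numAuthors] = au;
        ++numAuthors;
    }

    @Override
    public Iterator<AuthorSketch> iterator() {
        // View only the filled portion of the array as a list, then
        // hand out that list's iterator.
        return Arrays.asList(authors).subList(0, numAuthors).iterator();
    }

    // Example client: collects author names using the iterator.
    static String authorNames(BookSketch book) {
        StringBuilder out = new StringBuilder();
        for (Iterator<AuthorSketch> iter = book.iterator(); iter.hasNext();) {
            AuthorSketch au = iter.next();
            out.append(au.getName()).append(';');
        }
        return out.toString();
    }
}
```

Client code such as authorNames never learns that an array sits behind the iterator, so the storage could later change without affecting it.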

2.2 Example: Iterators and OrderedSequence

We had a similar issue providing access to the elements of an OrderedSequence.

Our original approach was

public class OrderedSequence <T extends Comparable<T>> {
     ⋮ 
  public T get(int position) {
    if (position >= theSize) {
      throw new ArrayIndexOutOfBoundsException();
    }
    return (T)data[position];
  }

  /**
   * How big is this sequence?
   * 
   * @return the number of items in the sequence
   */
  public int size() {
    return theSize;
  }
     ⋮

But a more elegant approach is to provide an iterator

public class OrderedSequence <T extends Comparable<T>> implements Iterable<T> {
	⋮

  public Iterator<T> iterator() {
    return ...
  }
}

allowing us to write code like:

OrderedSequence<String> seq = new OrderedSequence<>();
    ⋮
for (Iterator<String> it = seq.iterator(); it.hasNext();) {
	String s = it.next();
	doSomethingWith(s);
}

We’ll look at how to implement this shortly, but first there’s one more refinement to how we can use these iterators.

2.3 The For-Each Loop

The canonical example of looping with iterators is this:

for (Iterator<Author> it = book.iterator(); it.hasNext();) {
	Author au = it.next();
	doSomethingWith(au);
}

Actually, there’s a more elegant way to write this loop. Java allows arrays and any class that implements Iterable to be used in a for-each loop like this one:

for (Author au: book) {
	doSomethingWith(au);
}

In effect, the first two lines of

for (Iterator<Author> it = book.iterator(); it.hasNext();) {
	Author au = it.next();
	doSomethingWith(au);
}

are collapsed into a single abbreviated for header.

For-each loops are wonderful! In effect, you can use iterators without even mentioning them.
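
To see that the for-each loop rests on nothing more than Iterable, here is a small self-contained sketch. Countdown is a hypothetical class invented for this illustration. It stores no data at all, yet because it implements Iterable<Integer> it can appear to the right of the colon. Its iterator is written as an anonymous class purely to keep the example short.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class: "contains" the integers start, start-1, ..., 1.
class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) { this.start = start; }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start; // next value to hand out

            @Override
            public boolean hasNext() { return current > 0; }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current--;
            }
        };
    }

    // Sums the values with a for-each loop; no iterator is mentioned.
    static int sumForEach(Countdown c) {
        int sum = 0;
        for (int value : c) {
            sum += value;
        }
        return sum;
    }

    // The same loop written out with an explicit iterator.
    static int sumExplicit(Countdown c) {
        int sum = 0;
        for (Iterator<Integer> it = c.iterator(); it.hasNext();) {
            int value = it.next();
            sum += value;
        }
        return sum;
    }
}
```

For new Countdown(4), both functions visit 4, 3, 2, 1 and return 10.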

There are a couple of limitations to the for-each loop, however:

  1. You can’t use it if, inside the loop, you are adding or removing elements from the container. However, most containers won’t permit that anyway even when using the longer form of loop – their old iterators become invalid as soon as an element is added to or removed from the container.
  2. We will eventually encounter classes that contain multiple sequences of data and therefore provide multiple iterators to access them. That doesn’t follow the Iterable interface, and so we cannot use the for-each loop with those.
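
The first limitation can be observed with java.util.ArrayList, used here as a stand-in for containers we have not yet written. In this sketch, removing an element through the list while a for-each loop is running makes the loop's hidden iterator fail with a ConcurrentModificationException on its next step. Containers whose iterators support the optional remove() function offer a way around this, but only in the longer form of the loop, where the iterator has a name. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class RemoveWhileLooping {

    // Tries to remove short words inside a for-each loop.
    // Returns true if the loop failed with ConcurrentModificationException.
    static boolean forEachRemovalFails(List<String> words) {
        try {
            for (String w : words) {
                if (w.length() < 3) {
                    words.remove(w); // changes the list behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    // Removes short words through the iterator itself.
    static void removeShortWords(List<String> words) {
        for (Iterator<String> it = words.iterator(); it.hasNext();) {
            String w = it.next();
            if (w.length() < 3) {
                it.remove(); // removes the element most recently returned by next()
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(List.of("to", "alpha", "of", "beta"));
        removeShortWords(words);
        System.out.println(words); // prints [alpha, beta]
    }
}
```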

3 Implementing Iterators

OK, that’s all very nice. But none of it happens if our containers do not provide an iterator.

When a container says that it provides an iterator through a function like

public Iterator<T> iterator() { ... }

it does not really return an object of the exact type Iterator<T>. Iterator<T> is an interface, so what the container actually does is declare a class that implements Iterator<T> and return an instance of that class. By the rule of subtyping, such an object can be used any place where we would expect an Iterator<T>.

This class could be declared in a separate file, but because its implementation will usually require direct access to the private data members of the container, it is more common to see this done as a nested class within the container class.
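
One quick way to check this for yourself is to ask an iterator for its run-time class. The sketch below uses java.util.ArrayList only as a convenient existing container, and the class name WhatIsAnIterator is hypothetical. The exact class name printed depends on the library implementation, so none is shown here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class WhatIsAnIterator {

    // Returns the run-time class of the iterator a container hands out.
    static Class<?> iteratorClassOf(Iterable<?> container) {
        Iterator<?> it = container.iterator();
        return it.getClass();
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("abc");
        Class<?> c = iteratorClassOf(words);
        System.out.println(c.getName());                        // implementation-dependent name
        System.out.println(c.isInterface());                    // prints false
        System.out.println(Iterator.class.isAssignableFrom(c)); // prints true
    }
}
```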

3.1 Example: Adding an Iterator to OrderedSequence

If we want to support

public class OrderedSequence <T extends Comparable<T>> implements Iterable<T> {
  private Comparable[] data; // Array holding elements
  private static final int DEFAULT_SIZE = 10; // Default max size
  private int maxSize; // Maximum size of seq
  private int theSize; // Current # of seq items
	⋮

  public Iterator<T> iterator() {
    return ...
  }
}

we can declare a new class, say, OrdSeqIterator, within the OrderedSequence class.

OrdSeqIterator will need to

  1. Be declared as a subtype of Iterator<T>.
  2. Provide the hasNext() and next() functions expected of any iterator.
  3. Have, as its own data member, some indicator of a position within the sequence.

In this case, because the container (OrderedSequence) stores its data in an array, the position can be saved as a simple integer.

public class OrderedSequence <T extends Comparable<T>> implements Iterable<T> {
  private Comparable[] data; // Array holding elements
  private static final int DEFAULT_SIZE = 10; // Default max size
  private int maxSize; // Maximum size of seq
  private int theSize; // Current # of seq items
	⋮

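  private class OrdSeqIterator implements Iterator<T> { // inner class: reuses the container's T

    private int pos; // index of the next element to return

    public OrdSeqIterator() {
      pos = 0; // start at the beginning of the sequence
    }

    @Override
    public boolean hasNext() {
      return pos < theSize;
    }

    @Override
    public T next() {
      if (pos >= theSize) {
        throw new java.util.NoSuchElementException();
      }
      ++pos; // move forward
      return (T)data[pos-1]; // return the element we just moved past
    }

  }

  public Iterator<T> iterator() {
    return new OrdSeqIterator();
  }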
}
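
Putting the pieces together, here is a compact, runnable sketch of the same idea. It is not the course's OrderedSequence: OrderedSequenceSketch is a hypothetical, simplified stand-in whose insert function (shift larger elements right, then drop the new one into place), fixed capacity, and join helper are assumptions made only so that the example is self-contained.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class OrderedSequenceSketch<T extends Comparable<T>> implements Iterable<T> {
    private final Object[] data; // array holding elements in ascending order
    private int theSize = 0;     // current number of items

    OrderedSequenceSketch(int maxSize) {
        data = new Object[maxSize];
    }

    // Assumed insertion strategy: shift larger elements one slot right.
    void insert(T value) {
        if (theSize == data.length) {
            throw new IllegalStateException("sequence is full");
        }
        int i = theSize;
        while (i > 0 && value.compareTo(elementAt(i - 1)) < 0) {
            data[i] = data[i - 1];
            --i;
        }
        data[i] = value;
        ++theSize;
    }

    @SuppressWarnings("unchecked")
    private T elementAt(int position) {
        return (T) data[position]; // safe: only insert() stores into data
    }

    // Inner class: shares the enclosing class's T and private fields.
    private class SketchIterator implements Iterator<T> {
        private int pos = 0; // index of the next element to hand out

        @Override
        public boolean hasNext() {
            return pos < theSize;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return elementAt(pos++);
        }
    }

    @Override
    public Iterator<T> iterator() {
        return new SketchIterator();
    }

    // Example client: joins the contents with a for-each loop.
    static String join(OrderedSequenceSketch<String> seq) {
        StringBuilder out = new StringBuilder();
        for (String s : seq) {
            out.append(s).append(' ');
        }
        return out.toString().trim();
    }
}
```

Inserting "pear", "apple", and "mango" in that order and then calling join yields "apple mango pear", with the client code never touching an integer position.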

4 Iterators are Universal

As we move forward in this course, we will begin collecting data in containers other than arrays. Nearly all of these will provide iterators following the guidelines we have discussed in this lesson. Those iterators might not be the primary way that we get data out of the container, and might not be the fastest way to do so, but they are there.

It is pretty much accepted practice in the Java programming community that, when you design your own classes that hold multiple instances of other data, you should provide an iterator for that data. An example of this would be keeping track of Authors in a Book.

Overall, iterators are just a pervasive part of the Java programming style.

5 java.util.Collections

Because iterators are so pervasive, it is possible to write some functions that operate on most containers, including ones that we haven’t written yet.

Look at java.util.Collections. You’ll see versions of some of the functions that we saw earlier in java.util.Arrays, including binarySearch, fill, sort, and a variety of functions that we aren’t ready to deal with yet but that copy data from one container into another.

As we move on to containers other than arrays, keep java.util.Collections in mind: it provides similar utility functions for non-array collections of data.
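
As a brief illustration, the sketch below applies three of those java.util.Collections functions to a java.util.ArrayList, a standard list container not covered in this lesson. Note that binarySearch only gives meaningful answers on a list that is already sorted. The class name CollectionsDemo and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CollectionsDemo {

    // Sorts the list in place, then returns the index at which key is found
    // (negative if key is absent).
    static int sortThenFind(List<String> words, String key) {
        Collections.sort(words);
        return Collections.binarySearch(words, key);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("pear");
        words.add("apple");
        words.add("mango");

        System.out.println(sortThenFind(words, "mango")); // prints 1
        System.out.println(words);                        // prints [apple, mango, pear]

        Collections.fill(words, "?");                     // overwrite every element
        System.out.println(words);                        // prints [?, ?, ?]
    }
}
```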