Sunday, March 7, 2010

Java's Iterators and Iterables

What is an Iterator?

From Wikipedia:


In computer science, an iterator is an object that allows a programmer to traverse through all the elements of a collection, regardless of its specific implementation.


The point of using an iterator is to allow the the client to iterate over the collection of objects, without exposing implementation details. This gives you the benefit to change what collection the Iterator iterates over and how each element is served, without making any change the client's code.

Making custom collections iterable

Let's say we have our own custom Vector-based collection, GameCollection, that stores Games and we want to make this collection iterable (over its stored Games) by making it implement the Iterator inteface:

public class GameCollection implements Iterable<Game> {
 private Vector<Game> games;
 
 public GameCollection() {
  games = new Vector<Game>();
 }
 
 public void add(Game game) {
  games.add(game);
 }

 @Override
 public Iterator<Game> iterator() {
  return games.iterator();
 }
}

The client can then use the above collection as follows:

GameCollection gc = new GameCollection();
Iterator<game> gameIterator = gc.iterator();

while (gameIterator.hasNext()) {
 Game g = gameIterator.next();
 System.out.println(g.getName());
}

Or better yet, make use of Java's for-each statement:

GameCollection gc = new GameCollection();
for (Game g : gc) {
 System.out.println(g.getName());
}

As you can see from the above code, the client doesn't know that we are using a Vector to store our Games in the GameCollection. Infact, we can later change the GameCollection to store the Games in a LinkedList or an Array, and yet the client wouldn't need to change his iteration code (code above) to suit for our changes.

Multiple iterators for a custom collection

What if we now want our GameCollection to offer two iterable solutions? Say we want our GameCollection to iterate over both Games and also GameConsoles?

The first thing that comes to mind is to try and make the GameCollection implement two Iterator interfaces using different generic arguments:

public class GameCollection implements Iterable<Game>, Iterable<GameConsole>

but unfortunately, the above doesn't compile. Therefore, as a workaround to such an issue, we can make use of Java's ability to allow for inner classes:

public class GameCollection  {
 private Vector<Game> games;
 private Vector<GameConsole> consoles;
 
 private class Games implements Iterable<Game> {
  @Override
  public Iterator<Game> iterator() {
   return games.iterator();
  }
 }
 
 private class Consoles implements Iterable<GameConsole> {
  @Override
  public Iterator<GameConsole> iterator() {
   return consoles.iterator();
  }
  
 }
 
 public GameCollection() {
  games = new Vector<Game>();
  consoles = new Vector<GameConsole>();
 }
 
 public void add(Game game) {
  games.add(game);
 }
 
 public void add(GameConsole console) {
  consoles.add(console);
 }

 public Games games() {
  return new Games();
 }
 
 public Consoles consoles() {
  return new Consoles();
 }
}

With the above two inner classes and the public methods games() and consoles(), the client can iterate over the collections like such:

GameCollection gc = new GameCollection();

//Add games and consoles with gc.add()

for (Game g : gc.games()) {
 System.out.println(g.getName());
}

for (GameConsole g : gc.consoles()) {
 System.out.println(g.getName());
}

Creating custom iterators

Up till this point, we have only used iterators that are in built with Java's existing collections; but what if we want our own custom iterator? We can use Java's java.util.Iterator interface to build our own iterator.

In the following example, I have created a Circular Iterator:

public class CircularGamesIterator implements Iterator<Game> {

 private Vector<Game> list;
 private int currentPosition;
 
 public CircularGamesIterator(Vector<Game> games) {
  list = games;
  currentPosition = 0;
 }
 
 @Override
 public boolean hasNext() {
  return currentPosition < list.size();
 }

 @Override
 public Game next() {
  Game el = list.elementAt(currentPosition);
  currentPosition = (currentPosition + 1) % list.size(); 
  return el;
 }

 @Override
 public void remove() { }
}

The iterator() method of the GameCollection class can then be modified to return an instance of the CircularGamesIterator:

public class GameCollection implements Iterable<Game> {
 private Vector<Game> games;
 
 public GameCollection() {
  games = new Vector<Game>();
 }
 
 public void add(Game game) {
  games.add(game);
 }

 @Override
 public Iterator<Game> iterator() {
  return new CircularGamesIterator(games);
 }
}

An Iterator should not be Iterable!

I have seen some examples on the internet where people make Iterators iterable by making their Iterators implement the Iterator interface and returning this in the iterator() method:
I have removed the implementation in the methods for the following code for brevity.

public class CustomIterator implements Iterator<Game>, Iterable<Game> {

 public CustomIterator(Vector<Game> games) {
  list = games;
 }

 @Override
 public Iterator<Game> iterator() {
  return this;
 }

 @Override
 public boolean hasNext() {
   //Implementation
 }

 @Override
 public Game next() {
   //Implementation
 }

 @Override
 public void remove() { }
}

The problem with the above code can be clearly illustrated with such a method:

public static void iterateTwice(Iterable<Game> games) {
 for (Game g : games) {
  System.out.println(g.getName());
 }
 
 for (Game g : games) {
  System.out.println(g.getName());
 }
 System.out.println();
}

If you directly pass in your iterable iterator to the above method, the Games will only be printed once.

To illustrate this, let's assume that we will use the custom iterator mentioned above, CustomIterator, that implements both Iterator<Game> and Iterable<Game>. Our GameCollection implements Iterable<Game>, and its iterator() method returns new CustomIterator(games);. Now, let's also say that GameCollection also provides a method games() that also returns new CustomIterator(games);.

Here's an example that illustrates this:
GameCollection gc = new GameCollection();

gc.add(new Game("Broken Sword"));
gc.add(new Game("Grim Fandango"));
gc.add(new Game("Maniac Mansion"));

System.out.println("Printing the iterable collection, GameCollection");
iterateTwice(gc);

System.out.println("Printing the iterable iterator, CustomIterator");
CustomIterator gamesIterator = gc.games();
iterateTwice(gamesIterator);

The output of the above code is as follows:
Printing the iterable collection, GameCollection
Broken Sword
Grim Fandango
Maniac Mansion
Broken Sword
Grim Fandango
Maniac  Mansion

Printing the iterable iterator, CustomIterator
Broken Sword
Grim Fandango
Maniac Mansion

Notice how when passing the iterable CustomIterator iterator to the method, the Games were only printed once, whereas when we passed in the GameCollection that implements Iterable, the Games were printed twice.

This is because when CustomIterator was passed in, the second for loop terminated immediately.