Data Structures and Algorithms using Python
Copyright © 2023 by Rance Necaise
Table Of Contents
Data Structures and Algorithms using Python
Copyright © 2023
by Rance Necaise

2.4 Iterators

Traversals are very common operations, especially on containers. A traversal iterates over the entire collection, providing access to each individual element. Traversals can be used for a number of operations, including searching for a specific item or printing an entire collection.

Python's container types---strings, tuples, lists, and dictionaries---can be traversed using the for loop construct. For our user-defined abstract data types, we can add methods that perform specific traversal operations when necessary. For example, if we wanted to save every item contained in a bag to a text file, we could add a saveElements method that traverses over the vector and writes each value to a file. But this would limit the format of the resulting text file to that specified in the new method. In addition to saving the items, perhaps we would like to simply print the items to the screen in a specific way. To perform the latter, we would have to add yet another operation to our ADT.

Not all abstract data types should provide a traversal operation, but it is appropriate for most container types. Thus, we need a way to allow generic traversals to be performed. One way would be to provide the user with access to the underlying data structure used to implement the ADT. But this would violate the abstraction principle and defeat the purpose of defining new abstract data types.

Python, like many of today's object-oriented languages, provides a built-in iterator construct that can be used to perform traversals on user-defined AD Ts. An iterator is an object that provides a mechanism for performing generic traversals through a container without having to expose the underlying implementation. Iterators are used with Python's for loop construct to provide a traversal mechanism for both built-in and user-defined containers. Consider the code segment from the checktimes2.py program in Section 2.1 that uses the for loop to traverse the collection of time

 # Iterate over the bag and check the ages.
for time in bag :
  if time >= startTime and time <= endTime :
    numValid = numValid + 1

Designing an Iterator

To use Python's traversal mechanism with our own abstract data types, we must define an iterator class, which is a class in Python containing two special methods, __iter__ and __next__. Iterator classes are commonly defined in the same module as the corresponding container class.

The constructor defines two data fields. One is an alias to the list used to store the items in the bag, and the other is a loop index variable that will be used to iterate over that list. The loop variable is initialized to zero in order to start from the beginning of the list.

class BagIterator :
  def __init__(self, theList) :
    self._bagItems = theList
    self._curItem = 0

The __iter__ method simply returns a reference to the object itself and is always implemented to do so.

def __iter__(self) :
  return self

The __next__ method is called by Python's iterator mechanism to return the next item in the container.

def __next__(self) :
  if self._curItem < len(self._bagItems) :
    item = self._bagItems[ self._curItem ]
    self._curItem += 1
    return item
  else :
    raise StopIteration

The method first saves a reference to the current item indicated by the loop variable. The loop variable is then incremented by one to prepare it for the next invocation of the __next__ method. If there are no additional items, the method must raise a StopIteration exception that flags the for loop to terminate.

Finally, we must add an __iter__ method to our __Bag__ class, as shown here:

def __iter__(self) :
  return BagIterator(self._theItems)

This method, which is responsible for creating and returning an instance of the BagIterator class, is automatically called at the beginning of the for loop to create an iterator object for use with the loop construct.

The complete implementation of the BagIterator class is shown in the listing below.

Program Listing
Program: itbag.py
  1. # An iterator for the Bag ADT implemented as a Python list.
  2. class BagIterator :
  3.    # Create an interator instance.
  4.   def __init__(self, theList) :
  5.     self._bagItems = theList
  6.     self._curItem = 0
  7.    
  8.    # Return a reference to the iterator object itself.
  9.   def __iter__(self) :
  10.     return self
  11.    
  12.    # Return a reference to the next item in the container.
  13.   def __next__(self) :
  14.     if self._curItem < len(self._bagItems) :
  15.       item = self._bagItems[ self._curItem ]
  16.       self._curItem += 1
  17.       return item
  18.     else :
  19.       raise StopIteration

Using Iterators

With the definition of the BagIterator class and the modifications to the Bag class, we can now use Python's for loop with a Bag instance. When the for loop

for item in bag :
  print(item)

is executed, Python automatically calls the __iter__ method on the bag object to create an iterator object. Figure 2.4.1 illustrates the state of the BagIterator object immediately after being created. Notice the _bagItems field of the iterator object references _theItems field of the bag object. This reference was assigned by the constructor when the BagIterator object was created.

Figure 2.4.1: The Bag and BagIterator objects before the first loop iteration.

The for loop then automatically calls the __next__ method on the iterator object to access the next item in the container. The state of the iterator object changes with the _curItem field having been incremented by one. This process continues until a StopIteration exception is raised by the _next method when the items have been exhausted as indicated by the value in _curItem. After all of the items have been processed, the iteration is terminated and execution continues with the next statement following the loop.

The following code segment illustrates how Python actually performs the iteration when a for loop is used with an instance of the Bag class:

 # Create a BagIterator object for myBag.
iterator = iter(myBag)

 # Repeat the while loop until break is called.
while True :
  try:
     # Get the next item from the bag. If there are no
     # more items, the StopIteration exception is raised.
    item = next(iterator)

     # Perform the body of the for loop.
    print(item)

   # Catch the exception and break from the loop when we are done.
  except StopIteration:
    break
Page last modified on July 30, 2023, at 08:21 PM