Traversals are very common operations, especially on containers. A traversal iterates over the entire collection, providing access to each individual element. Traversals can be used for a number of operations, including searching for a specific item or printing an entire collection.
Python's container types---strings, tuples, lists, and dictionaries---can be traversed using the for
loop construct. For our user-defined abstract data types, we can add methods that perform specific traversal operations when necessary. For example, if we wanted to save every item contained in a bag to a text file, we could add a saveElements
method that traverses over the vector and writes each value to a file. But this would limit the format of the resulting text file to that specified in the new method. In addition to saving the items, perhaps we would like to simply print the items to the screen in a specific way. To perform the latter, we would have to add yet another operation to our ADT.
Not all abstract data types should provide a traversal operation, but it is appropriate for most container types. Thus, we need a way to allow generic traversals to be performed. One way would be to provide the user with access to the underlying data structure used to implement the ADT. But this would violate the abstraction principle and defeat the purpose of defining new abstract data types.
Python, like many of today's object-oriented languages, provides a built-in iterator construct that can be used to perform traversals on user-defined AD Ts. An iterator is an object that provides a mechanism for performing generic traversals through a container without having to expose the underlying implementation. Iterators are used with Python's for
loop construct to provide a traversal mechanism for both built-in and user-defined containers. Consider the code segment from the checktimes2.py
program in Section 2.1 that uses the for
loop to traverse the collection of time
# Iterate over the bag and check the ages.
for time in bag :
if time >= startTime and time <= endTime :
numValid = numValid + 1
Designing an Iterator
To use Python's traversal mechanism with our own abstract data types, we must define an iterator class, which is a class in Python containing two special methods, __iter__
and __next__
. Iterator classes are commonly defined in the same module as the corresponding container class.
The constructor defines two data fields. One is an alias to the list used to store the items in the bag, and the other is a loop index variable that will be used to iterate over that list. The loop variable is initialized to zero in order to start from the beginning of the list.
class BagIterator :
def __init__(self, theList) :
self._bagItems = theList
self._curItem = 0
The __iter__
method simply returns a reference to the object itself and is always implemented to do so.
def __iter__(self) :
return self
The __next__
method is called by Python's iterator mechanism to return the next item in the container.
def __next__(self) :
if self._curItem < len(self._bagItems) :
item = self._bagItems[ self._curItem ]
self._curItem += 1
return item
else :
raise StopIteration
The method first saves a reference to the current item indicated by the loop variable. The loop variable is then incremented by one to prepare it for the next invocation of the __next__
method. If there are no additional items, the method must raise a StopIteration
exception that flags the for
loop to terminate.
Finally, we must add an __iter__
method to our __Bag__
class, as shown here:
def __iter__(self) :
return BagIterator(self._theItems)
This method, which is responsible for creating and returning an instance of the BagIterator
class, is automatically called at the beginning of the for
loop to create an iterator object for use with the loop construct.
The complete implementation of the BagIterator
class is shown in the listing below.
Program Listing
# An iterator for the Bag ADT implemented as a Python list.
class BagIterator :
# Create an interator instance.
def __init__(self, theList) :
self._bagItems = theList
self._curItem = 0
# Return a reference to the iterator object itself.
def __iter__(self) :
return self
# Return a reference to the next item in the container.
def __next__(self) :
if self._curItem < len(self._bagItems) :
item = self._bagItems[ self._curItem ]
self._curItem += 1
return item
else :
raise StopIteration
|
Using Iterators
With the definition of the BagIterator
class and the modifications to the Bag
class, we can now use Python's for
loop with a Bag
instance. When the for
loop
for item in bag :
print(item)
is executed, Python automatically calls the __iter__
method on the bag
object to create an iterator object. Figure 2.4.1 illustrates the state of the BagIterator
object immediately after being created. Notice the _bagItems
field of the iterator object references _theItems
field of the bag
object. This reference was assigned by the constructor when the BagIterator
object was created.
Figure 2.4.1: The Bag
and BagIterator
objects before the first loop iteration.
The for
loop then automatically calls the __next__
method on the iterator object to access the next item in the container. The state of the iterator object changes with the _curItem
field having been incremented by one. This process continues until a StopIteration
exception is raised by the _next
method when the items have been exhausted as indicated by the value in _curItem
. After all of the items have been processed, the iteration is terminated and execution continues with the next statement following the loop.
The following code segment illustrates how Python actually performs the iteration when a for
loop is used with an instance of the Bag
class:
# Create a BagIterator object for myBag.
iterator = iter(myBag)
# Repeat the while loop until break is called.
while True :
try:
# Get the next item from the bag. If there are no
# more items, the StopIteration exception is raised.
item = next(iterator)
# Perform the body of the for loop.
print(item)
# Catch the exception and break from the loop when we are done.
except StopIteration:
break