Data Structures and Algorithms using Python
Copyright © 2023 by Rance Necaise
Table Of Contents
Data Structures and Algorithms using Python
Copyright © 2023
by Rance Necaise

2.1 Bags

The Time and Line Segment ADTs from Chapter 1 provided examples of simple abstract data types. To illustrate the design and implementation of a complex abstract data type, we define the Bag ADT. A bag is a simple container like a shopping bag that can be used to store a collection of items. The bag container restricts access to the individual items by only defining operations for adding and removing individual items, for determining if an item is in the bag, and for traversing over the collection of items.

The Bag ADT

There are several variations of the Bag ADT with the one described here being a simple bag. A grab bag is similar to the simple bag but the items are removed from the bag at random. Another common variation is the counting bag, which includes an operation that returns the number of occurrences in the bag of a given item. Implementations of the grab bag and counting bag are left as exercises.

ADT Definition

The Bag ADT

A bag is a container that stores a collection in which duplicate values are allowed. The items, each of which is individually stored, have no particular order but they must be comparable.

  • Bag()

    Creates a bag that is initially empty.

  • length()

    Returns the number of items stored in the bag. Accessed using the len function.

  • contains()

    Determines if the given target item is stored in the bag and returns the appropriate Boolean value. Accessed using the in operator.

  • add(item)

    Adds the given item to the bag.

  • remove(item)

    Removes and returns an occurrence of item from the bag. An exception is raised if the element is not in the bag.

  • iterator()

    Creates and returns an iterator that can be used to iterate over the collection of items.

Why do we need the Bag ADT when we could simply use the list to store the items? For a small program and a small collection of data, using a list would be appropriate. When working with large programs and multiple team members, however, abstract data types provide several advantages as described earlier in Section 1.3. By working with the abstraction of a bag, we can: a) focus on solving the problem at hand instead of worrying about the implementation of the container, b) reduce the chance of introducing errors from misuse of the list since it provides additional operations that are not appropriate for a bag, c) provide better coordination between different modules and designers, and d) easily swap out our current implementation of the Bag ADT for a different, possibly more efficient, version later.

PROGRAMMING TIP
PROGRAMMING TIP

You may have noticed our definition of the Bag ADT does not include an operation to convert the container to a string. We could include such an operation, but creating a string for a large collection is time consuming and requires a large amount of memory. Such an operation can be beneficial when debugging a program that uses an instance of the Bag ADT. Thus, it's not uncommon to include the __repr__ special method for debugging purposes, but it would not typically be used in production software. We will usually omit the inclusion of a __repr__ special method in the definition of our abstract data types, except in those cases where it's meaningful, but you may want to include one temporarily for debugging purposes.

Using a Bag

Given the abstract definition of the Bag ADT, we can create and use a bag without knowing how it is actually implemented. Consider the following simple example, which creates a bag and asks the user to guess one of the values it contains.

myBag = Bag()
myBag.add( 19 )
myBag.add( 74 )
myBag.add( 23 )
myBag.add( 19 )
myBag.add( 12 )

value = int( input("Guess a value contained in the bag.") )
if value in myBag:
  print( "The bag contains the value", value )
else :
  print( "The bag does not contain the value", value )

Next, consider the checktimes sample program from the last chapter where we extracted times from the user and determined how many occurred during working hours. Suppose we want to keep the collection of times for later use. It wouldn't make sense to require the user to re-enter the times, when they are needed again. Instead we can store the times in a bag as they are entered and access them later. The Bag ADT is a perfect container for storing items when the position or order of a specific item does not matter. The following is a new version of the main routine for our time checking program from Section 1.4:

# checktimes2.py
# A new version of the checktime.py from Chapter 1 that uses a Bag
# to store the times for later processing.

from mytime import Time
from listbag import Bag

def main():
  startTime = Time(8, 0, 0)   # 8am
  endTime = Time(17, 0, 0)    # 5pm

   # Read times from the user and place them in a bag.
  bag = Bag()
  time = promptAndReadTime()    # same function from Chapter 1
  while time is not None :
    bag.add(time)
    time = promptAndReadTime()

   # Iterate over the bag and check the time.
  numValid = 0  
  for time in bag :
    if time >= startTime and time <= endTime :
      numValid = numValid + 1

   # Print the results.
  print(numValid, "of the times occurred during the working day.")

# The promptandReadTime() function has the same definition as
# provided in Listing 1.1. It is omitted here to save space.
Page last modified on August 19, 2023, at 01:15 PM