CSci 556 (Multiparadigm Programming) Lecture Notes on the Sequence ADT, Spring 2012

CSci 556 Multiparadigm Programming
Spring 2012
Lecture Notes on the Sequence ADT

Ranked Sequence Abstract Data Type

Design notes
Javadoc documentation
Source code
- Interface to the Ranked Sequence ADT
  [ RankedSequence interface]
- Dynamic array-based implementation
  [ ArrayRankedSeq class] [ ArrayRankedSeqEnum iterator class]
- Doubly linked list implementation
  [ DoubleLinkRankedSeq class] [ DoubleLinkRankedSeqEnum iterator class]
  [ DoubleLink linked list class]
- Java Vector adapter implementation
  [ VectorRankedSeq class]
- Assertion checking mechanism
  [ Assert class] [ AssertionViolation exception base class]
  [ PreconditionViolation exception class] [ PostconditionViolation exception class]
  [ InvariantViolation exception class]

Design Notes on the Ranked Sequence ADT

General Issues

The ranked sequence abstract data type in this case study was inspired, in part, by Chapter 4 of the book Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia (Wiley, 1998). The design and implementation discussed here was done independently by your instructor in May-June 1998.

The very useful Vector class from the Java API implements an ADT similar to the concept of the ranked sequence discussed in this case study.

A ranked sequence is a sequence of elements, each of which has an associated integer rank. The rank is the offset of the element from the beginning of the sequence. The head (i.e., first) element of the sequence has rank 0, the next element has rank 1, and so forth. If there are N elements in the sequence, then the indices are 0 through N-1; there are no "holes" in the sequence.

This ADT must provide operations for accessing, replacing, inserting, and deleting elements at any valid rank in the sequence. If an element is inserted at some rank, then the indices for existing elements at that rank and higher are all increased by one. Deletion is handled in a similar fashion.

The ranked sequence ADT is an unbounded structure. There is no conceptual limit on the length of the sequence. An implementation must extend the length of the sequence as needed.

In an implementation, each element is a Java object; primitive data must be wrapped with instances of the appropriate wrapper classes (e.g., Integer).

ADT Specification

See also the javadoc documentation for the RankedSequence interface or the source code.

Interface Invariant

Once created and initialized properly, each instance contains a valid representation of a ranked sequence.

Mutators

void insertAtRank(int r, Object e) Preconditon: 0 <= r <= length(). Postcondition: Object e added as element at rank r. Ranks of all elements (if any) with old ranks >= r are increased by one.
void deleteAtRank(int r) Preconditon: 0 <= r < length() Postcondition: Element at rank r is removed. Ranks of all elements (if any) with old ranks > r are decreased by one.
void replaceAtRank(int r, Object e) Preconditon: 0 <= r < length() Postcondition: Element at rank r is replaced by object e.

Accessors

Object elementAtRank(int r) Preconditon: 0 <= r < length() Postcondition: return = the element at rank r.
int length() Preconditon: true Postcondition: return = the number of elements in the sequence.
boolean isEmpty() Preconditon: true Postcondition: return = true if and only if the sequence contains no elements.
Enumeration elements() Preconditon: true Postcondition: return = an enumeration iterator for the sequence. Note: The elements() method should be changed to return an object that implements the java.util.Iterator interface to be consistent with the Collections framework of Java 1.2 and beyond. The work described in this document was done with Java 1.1.

Use of Java Features for Specification and Implementation

In contrast to the simpler ADT specifications and implementations we gave for the Stack, Day, and Queue ADTs, in this design we make better use of the facilities of Java. The design and implementation of the Java structures for the ranked sequence ADT:

Use the Java interface construct to specify the set of methods (i.e., operations) that are part of the ADT interface.
A Java interface gives the signatures of the methods that must be implemented in any realization of the ADT. Each concrete class then implements this interface and provides appropriate data components and code for each method.
The interface for the ranked sequences is named RankedSequence (as you might expect).
Use the Java Exception mechanism to handle errors occurring during method calls.
If an error (e.g., the violation of a precondition input assumption) is detected, then the method throws an appropriate instance from the Exception class hierarchy. It is the responsibility of the method's user (i.e., client) to respond appropriately to the error condition---for example, by using a try construct to trap an exception resulting from the call of a method.
In particular, the methods in the ranked sequence ADT implementations can test for violations of preconditions, postconditions, and invariants and throw appropriate exceptions from the provided AssertionViolation hierarchy. This hierarchy includes the PreconditionViolation, PostconditionViolation, and InvariantViolation exception classes.
Provide a utility class Assert as a means for implementations to check preconditions, postconditions, invariants, and other assertions in a standard way.
An implementation can call the methods pre, post, inv, and cond from the Assert class to check preconditions, postconditions, invariants, and general conditions. The arguments of these methods are the boolean expressions that must evaluate to True; if they do not, then the matching exception is thrown.
Put the group of classes in a Java package separate from the code that uses the ADT.
In this case a package SoftwareInterfaces is used.
Specify the implementation of an iterator class associated with the sequence---in particular, an iterator class that extends the java.util.Enumeration interface.
An iterator is a class provides a systematic and safe way to step through the elements of a data structure (represented by a different class).
The elements() method of the RankedSequence ADT returns an object that implements the Enumeration interface. The Enumeration methods, nextElement() and hasMoreElements(), allow a user of the sequence ADT to step through the elements of the sequence one by one while still preserving the encapsulation of the ADT implementation.
Document the Java interfaces, classes, and methods using the javadoc annotations and documentation-generation tool.
In this approach, we can insert annotations and HTML markup into special comments in the Java source. Using the javadoc tool from Sun's JDK, we can then generate a hyperlinked hierarchy of HTML files giving information about the classes, interfaces, methods, and data elements of a collection of Java source files (e.g., a Java package).
The special annotations include @param to document a method's parameters, @return to document the return value of a method, @throws to document exceptions potentially thrown by a method, @author to document the author of a module, and so forth.
The javadoc tool does not support documentation of preconditions, postconditions, and invariants. In addition to the standard javadoc annotations, your instructor has "hacked in" HTML markup to generate documentation of these aspects of the specification.
An example of the javadoc markup can be found in the *-boxed comments in the RankedSequence interface. The documentation generated can be found into the javadoc documentation structure, specifically in the documentation for the RankedSequence interface.

Alternative Implementations of the ADT

We provide three separate implementations of the RankedSequence ADT:

ArrayRankedSeq, which implements the sequence by a dynamically allocated array of objects.
This implementation behaves much like the built-in Vector class. Whenever the array used to represent the sequence overflows, it allocates a larger one and copies the previous array into it.
DoubleLinkRankedSeq, which implements the sequence by a doubly linked list.
VectorRankedSeq, which implements the sequence as an adapter of the Vector class from the Java API.
An adapter is an abstraction (e.g., a class) that adapts one abstraction to behave like another. It translates the operations on the interface of one interface into the appropriate operations on a second abstraction.
VectorRankedSeq is a class that implements the RankedSequence interface, that is, its instances are instances of the ranked sequence abstraction. However, these operations are implemented directly in terms of similar operations on an instance of the built-in Vector class in the Java API. Thus VectorRankedSeq adapts a Vector to be a RankedSequence.

Iterator Design Issues

In general, it is a dangerous practice to use an iterator while allowing independent modification of the data structure by other parts of the program.

In a sequence iterator, what happens if the base sequence is modified by other parts of the code while an iterator (enumerator) is in use? In particular, what happens if an element of the sequence is inserted, deleted or replaced?

The answer, of course, depends upon how both the base sequence and the iterator are implemented.

Depending upon the implementation details, insertions and deletions of elements might cause some elements to be seem multiple times, others skipped. Some insertions and deletions might be visible to the iterator, others might cause a reorganization of the underlying data structure that makes all subsequent changes invisible. Some insertions and deletions might leave the iterator positioned to an invalid position, outside the sequence or to an element that has been "deleted" from the sequence.

There seems to be several different ways to approach the problem of changes to a data structure while an iterator is in use on that data structure. (We are just considering singly threaded programs here.) These include:

Ignore the problem. Assume the data structure is not modified in any undesirable way while the iterator is in use.
Create a copy of the data structure for use by the iterator. The iterator will then work on a "snapshot" of the data structure at the point it was created.
Give the enumerator direct access to the internal state of the classes that implement the structure. Make it adapt to changes in the internal state as much as possible.
Cause the iterator to abort if an undesirable modification occurs.
Lock the data structure to prevent any undesirable modification.

In the design of the ranked sequence classes, we experimented with two different approaches.

The design of the ArrayRankedSeq enumerator experimented with the second approach. It makes a copy of the array that represents the sequence. After an elements() call, any subsequent insertions, deletions, or replacements of elements will not be seen by the user of the enumerator---the "snapshot" sequence will be seen.
The elements() method does a relatively "shallow copy" of the sequence; it copies the array of references but does not create copies of the elements themselves. Thus, any changes to the elements themselves will, however, be seen by a user of the iterator.
The "snapshot" approach to the enumerator design was not completely satisfying. The approach is safe. However, the creation of the array copies incurs considerable overhead if the sequence is large and/or many iterators are needed. The approach also does not work if we wish to implement a more general iterator that allows modification of the sequence by iterator methods.
Approach three might be better solution. In this approach the internal data of the sequence class could be made visible to the enumerator class. The enumerator could then directly access those data fields and adapt to any changes appropriately.
The data fields could be made visible to the enumerator by giving them friendly (i.e., default or package-scope) access rather than private access. The disadvantage to this implementation technique is that it breaks the encapsulation of the sequence ADT.
A better way to implement approach three might be to define the enumerator class as an inner class within the sequence class. The enumerator could have access to the internal state of the sequence without causing that state to be exposed to other classes in the package.
However, in the presence of insertions and deletions independently of an iterator, it is possible a simple implementation of approach three might have the problems described above (e.g., elements skipped or returned multiple times by the enumerator).
The design of the DoubleLinkRankedSeq takes the first approach. It simply ignores the problem.
For the linked list implementation we chose, ignoring the problem seems to be safe but perhaps a bit confusing. From the standpoint of the enumerator, insertions or deletions at a lower index than the current enumerator position will be treated as if they did not occur; those at a higher index will be seen as the enumerator advances. An insertion at the current index will also be ignored.
A deletion at the current index will cause the biggest problem. Because the implementation disconnects a deleted element from the sequence, the enumerator positioned at a deleted element will detect that the sequence ends at that element. Perhaps that aspect of the linked list implementation should be reconsidered.
The design of the VectorRankedSeq also takes the first approach. The adapter for this class is directly implemented in terms of the enumerator for the Vector class.
The elements() method of VectorRankedSeq simply returns the enumerator returned by the elements() method of the underlying Vector. The implementation of the Vector enumerator seems to assume that the data structure is not modified while an enumerator is in use, but this has not been investigated thoroughly.

The Java JDK 1.2 includes new kinds of iterators that take approach four above. The data structures in its new Collection hierarchy includes iterators that are fail-fast. If a structure has been modified in an unacceptable way, subsequent calls of the iterator methods will cause an immediate abort.

The Vector has been retrofit to be a member of this hierarchy. Its elements() method, however, continues to return an enumerator with the old semantics.

UP to CSci 556 Lecture Notes root document?

CSci 556 Multiparadigm Programming Spring 2012 Lecture Notes on the Sequence ADT