The ranked sequence abstract data type in this case study was inspired, in part, by Chapter 4 of the book Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia (Wiley, 1998) [ go to textbook Web site?]. The design and implementation discussed here was done independently by your instructor.
The very useful Vector
class from the Java API implements
an ADT similar to the concept of the ranked sequence discussed in
this case study.
A ranked sequence is a sequence of elements, each of which has an
associated integer rank. The rank is the offset of the
element from the beginning of the sequence. The head (i.e., first)
element of the sequence has rank 0, the next element has rank 1, and
so forth. If there are N
elements in the sequence, then
the indices are 0
through N-1
; there are no
"holes" in the sequence.
This ADT must provide operations for accessing, replacing, inserting, and deleting elements at any valid rank in the sequence. If an element is inserted at some rank, then the indices for existing elements at that rank and higher are all increased by one. Deletion is handled in a similar fashion.
The ranked sequence ADT is an unbounded structure. There is no conceptual limit on the length of the sequence. An implementation must extend the length of the sequence as needed.
In an implementation, each element is a Java object; primitive data
must be wrapped with instances of the appropriate wrapper classes
(e.g., Integer
).
Instructor's Note: In a future enhancement of these notes, I should
discuss the ADT interface here. For now, see the
javadoc documentation for the RankedSequence
interface or the source code.
In contrast to the simpler ADT specifications and implementations we gave for the Stack, Day, and Queue ADTs, in this design we make better use of the facilities of Java. The design and implementation of the Java structures for the ranked sequence ADT:
interface
construct to specify the set
of methods (i.e., operations) that are part of the ADT interface.
A Java interface
gives the signatures of the methods that
must be implemented in any realization of the ADT. Each concrete
class then implements
this interface and provides
appropriate data components and code for each method.
The interface for the ranked sequences is named
RankedSequence
(as you might expect).
Exception
mechanism to handle errors
occurring during method calls.
If an error (e.g., the violation of a precondition input assumption)
is detected, then the method throws
an appropriate
instance from the Exception
class hierarchy. It is the
responsibility of the method's user (i.e., client) to respond
appropriately to the error condition---for example, by using a
try
construct to trap an exception resulting from the
call of a method.
In particular, the methods in the ranked sequence ADT implementations
can test for violations of preconditions, postconditions, and
invariants and throw appropriate exceptions from the provided
AssertionViolation
hierarchy. This hierarchy
includes the
PreconditionViolation
,
PostconditionViolation
, and
InvariantViolation
exception classes.
Assert
as a
means for implementations to check preconditions, postconditions,
invariants, and other assertions in a standard way.
An implementation can call the methods pre
,
post
, inv
, and cond
from the Assert
class to
check preconditions, postconditions, invariants, and general
conditions. The arguments of these methods are the boolean
expressions that must evaluate to True
; if they do not,
then the matching exception is thrown.
package
separate
from the code that uses the ADT.
In this case a package SoftwareInterfaces
is used.
java.util.Enumeration
interface.
An iterator is a class provides a systematic and safe way to step through the elements of a data structure (represented by a different class).
The elements()
method of the RankedSequence
ADT returns an object that implements the Enumeration
interface. The Enumeration
methods,
nextElement()
and hasMoreElements()
, allow
a user of the sequence ADT to step through the elements of the
sequence one by one while still preserving the encapsulation of the
ADT implementation.
javadoc
annotations and documentation-generation tool.
In this approach, we can insert annotations and HTML markup into
special comments in the Java source. Using the javadoc
tool from Sun's JDK, we can then generate a hyperlinked hierarchy of
HTML files giving information about the classes, interfaces, methods,
and data elements of a collection of Java source files (e.g., a Java
package).
The special annotations include @param
to document a
method's parameters, @return
to document the return value
of a method, @throws
to document exceptions potentially
thrown by a method, @author
to document the author of
a module, and so forth.
The javadoc tool does not support documentation of preconditions, postconditions, and invariants. In addition to the standard javadoc annotations, your instructor has "hacked in" HTML markup to generate documentation of these aspects of the specification.
An example of the javadoc markup can be found in the
*
-boxed comments in the
RankedSequence
interface. The documentation
generated can be found into the
javadoc documentation structure, specifically in the
documentation for the
RankedSequence
interface.
We provide three separate implementations of the
RankedSequence
ADT:
ArrayRankedSeq
, which implements the sequence by a
dynamically allocated array of objects.
This implementation behaves much like the built-in Vector
class. Whenever the array used to represent the sequence overflows,
it allocates a larger one and copies the previous array into it.
DoubleLinkRankedSeq
, which implements the sequence by
a doubly linked list.
VectorRankedSeq
, which implements the sequence as an
adapter of the Vector
class from the Java API.
An adapter is an abstraction (e.g., a class) that adapts one abstraction to behave like another. It translates the operations on the interface of one interface into the appropriate operations on a second abstraction.
VectorRankedSeq
is a class that implements the
RankedSequence
interface, that is, its instances are
instances of the ranked sequence abstraction. However, these
operations are implemented directly in terms of similar operations on
an instance of the built-in Vector
class in the Java API.
Thus VectorRankedSeq
adapts a Vector
to be a
RankedSequence
.
In general, it is a dangerous practice to use an iterator while allowing independent modification of the data structure by other parts of the program.
In a sequence iterator, what happens if the base sequence is modified by other parts of the code while an iterator (enumerator) is in use? In particular, what happens if an element of the sequence is inserted, deleted or replaced?
The answer, of course, depends upon how both the base sequence and the iterator are implemented.
Depending upon the implementation details, insertions and deletions of elements might cause some elements to be seem multiple times, others skipped. Some insertions and deletions might be visible to the iterator, others might cause a reorganization of the underlying data structure that makes all subsequent changes invisible. Some insertions and deletions might leave the iterator positioned to an invalid position, outside the sequence or to an element that has been "deleted" from the sequence.
There seems to be several different ways to approach the problem of changes to a data structure while an iterator is in use on that data structure. (We are just considering singly threaded programs here.) These include:
In the design of the ranked sequence classes, we experimented with two different approaches.
ArrayRankedSeq
enumerator experimented with the
second approach. It makes a copy of the array that represents the
sequence. After an elements()
call, any subsequent
insertions, deletions, or replacements of elements will not be seen by
the user of the enumerator---the "snapshot" sequence will be seen.
The elements()
method does a relatively "shallow copy" of
the sequence; it copies the array of references but does not create
copies of the elements themselves. Thus, any changes to the elements
themselves will, however, be seen by a user of the iterator.
The "snapshot" approach to the enumerator design was not completely satisfying. The approach is safe. However, the creation of the array copies incurs considerable overhead if the sequence is large and/or many iterators are needed. The approach also does not work if we wish to implement a more general iterator that allows modification of the sequence by iterator methods.
Approach three might be better solution. In this approach the internal data of the sequence class could be made visible to the enumerator class. The enumerator could then directly access those data fields and adapt to any changes appropriately.
The data fields could be made visible to the enumerator by giving them friendly (i.e., default or package-scope) access rather than private access. The disadvantage to this implementation technique is that it breaks the encapsulation of the sequence ADT.
A better way to implement approach three might be to define the enumerator class as an inner class within the sequence class. The enumerator could have access to the internal state of the sequence without causing that state to be exposed to other classes in the package.
However, in the presence of insertions and deletions independently of an iterator, it is possible a simple implementation of approach three might have the problems described above (e.g., elements skipped or returned multiple times by the enumerator).
DoubleLinkRankedSeq
takes the first approach. It simply
ignores the problem.
For the linked list implementation we chose, ignoring the problem seems to be safe but perhaps a bit confusing. From the standpoint of the enumerator, insertions or deletions at a lower index than the current enumerator position will be treated as if they did not occur; those at a higher index will be seen as the enumerator advances. An insertion at the current index will also be ignored.
A deletion at the current index will cause the biggest problem. Because the implementation disconnects a deleted element from the sequence, the enumerator positioned at a deleted element will detect that the sequence ends at that element. Perhaps that aspect of the linked list implementation should be reconsidered.
VectorRankedSeq
also takes the first approach. The adapter
for this class is directly implemented in terms of the enumerator for
the Vector
class.
The elements()
method of VectorRankedSeq
simply
returns the enumerator returned by the elements()
method
of the underlying Vector
. The implementation of the
Vector
enumerator seems to assume that the data structure
is not modified while an enumerator is in use, but this has not been
investigated thoroughly.
The Java JDK 1.2 includes new kinds of iterators that take approach
four above. The data structures in its new Collection
hierarchy includes iterators that are fail-fast. If a
structure has been modified in an unacceptable way, subsequent calls
of the iterator methods will cause an immediate abort.
The Vector
has been retrofit to be a member of this
hierarchy. Its elements()
method, however, continues to
return an enumerator with the old semantics.
UP to CSCI 581 Lecture Notes root document?