CSci 450-01: Organization of Programming Languages
CSci 503-01: Fundamental Concepts in Languages
Fall 2014


Introduction to Object Orientation

Simplified Software Development Lifecycle

For the purposes of this discussion, assume that software development proceeds through the four lifecycle phases shown in the following diagram.

  .__________________.
  |                  |
  |     Analysis     |<--:
  |__________________|   |
           |             |
           |             ^
  .________V_________.   |
  |                  |-->:
  |      Design      |   |
  |__________________|<--.
           |             |
           |             ^
  .________V_________.   |
  |                  |-->:
  |  Implementation  |   |
  |__________________|<--:
           |             |
           |             |
  .________V_________.   ^
  |                  |   |
  |   Maintenance    |-->:
  |__________________|

Analysis

In the analysis phase we move from a vague description of the problem to be solved to a precise and unambiguous requirements specification.

The requirements specification might be a precise, but informal description written in careful natural language text and diagrams. Alternatively, the specification might be a formal description written in a mathematically precise language. Or the requirements specification might be something in between these.

The requirements specification should be:

Design

In the design phase, we move from a requirements specification to a design specification. The design specification gives the system structure. The design tasks are to:

Implementation

In the implementation phase, we move from a design specification to a tested executable system. The implementation tasks are to:

Maintenance

In the maintenance phase, we move from a complete "working" system to a modified system. The maintenance tasks are to:

Observations:

Conclusions:

"Programming in the Small" and "Programming in the Large"

The type of software development projects familiar to most students can be described as programming in the small. Such projects have the following attributes:

Programming in the large characterizes projects with the following attributes:

The techniques of object-oriented design and programming are useful in both programming in the small and programming in the large situations. However, some of the techniques are best appreciated when the difficulties of programming in the large are understood.

Object Orientation

In contemporary practice, most software engineers approach the design of programs from an object-oriented perspective.

Key idea (notion?) in object orientation:
The real world can be accurately described as a collection of objects that interact.

Assumptions:

  1. Describing large, complex systems as interacting objects make them easier to understand than otherwise.
  2. The behaviors of real world objects tend to be stable over time.
  3. The different kinds of real world objects tend to be stable. (That is, new kinds appear slowly; old kinds disappear slowly.)
  4. Changes tend to be localized to a few objects.

Assumption 1 simplifies analysis, design, and implementation--makes them more reliable.

Assumptions 2 and 3 support reuse of code, prototyping, and incremental development.

Assumption 4 supports design for change.

The object-oriented approach:

An Object Model

Our object model includes four basic components:

  1. objects (i.e., abstract data structures)
  2. classes (i.e., abstract data types)
  3. inheritance (hierarchical relationships among ADTs)
  4. polymorphism by inheritance

Objects

Objects are characterized by:

An object is a separately identifiable entity that has a set of operations and a state that records the effects of the operations. That is, an object is essentially the same as an abstract data structure as discussed in this author's notes on Data Abstraction.

state:
the collection of information held (i.e., stored) by the object. The state is encapsulated within the object--is not directly visible.

The various components of the state are sometimes called the attributes of the object.

operation:
a procedure that takes the state of the object and zero or more arguments and changes the state and/or returns one or more values. Objects permit certain operations and not others.

identity:
a way to distinguish between two distinct objects (even if they have the same state and operations).

As an example, consider an object for a student desk in a simulation of a classroom. The relevant state might be attributes like location, orientation, person using, items in basket, items on top, etc. The relevant operations might be state-changing operations (mutators) such as "move the desk", "seat student", or "remove from basket" or might be state-observing operations (accessors) such as "is occupied" or "report items on desktop".

Note: This section describes objects that are mutable, that is, whose states may change. Objects may be immutable, in which case the state cannot change after the objects are initialized. However, some operations may "change" the state by creating a copy of the object whose state is modified in some way.

A language is object-based if it supports objects as a language feature.

Object-based languages include Ada, Modula, Clu, C++, Java, Scala, C#, and Smalltalk. Pascal (without module extensions), Algol, Fortran. and C are not inherently object-based.

Classes

A class is a template for creating objects.

A class description includes definitions of

As an example, again consider a simulation of a classroom. There might be a class of StudentDesk objects from which specific instances can be created as needed.

An object-based language is class-based if the concept of class occurs as a language feature and every object has a class.

Class-based languages include Clu, C++, Java, Scala, C#, Smalltalk, and Ada 95. Ada 83 and Modula are not class-based.

Inheritance

A class C inherits from class P if C's objects form a subset of P's objects.

The importance of inheritance is that it encourages sharing and reuse of both design information and program code. The shared state and operations can be described and implemented in base classes and shared among the subclasses.

As an example, again consider the student desks in a simulation of a classroom. The StudentDesk class might be derived (i.e., inherit) from a class Desk, which in turn might be derived from a class Furniture. In diagrams, it is the convention to draw arrows from the subclass to the superclass.

    Furniture <-- Desk <-- StudentDesk

The simulation might also include a ComputerDesk class that also derives from Desk.

    Furniture <-- Desk <-- ComputerDesk

In Java and Scala, we can express the above inheritance relationships using the extends keyword as follows.

    class Furniture  // extends cosmic root class for references by default
    {   ...          // (i.e., java.lang.Object or scala.AnyRef)
    }

    class Desk extends Furniture
    {   ...
    }

    class StudentDesk extends Desk
    {   ...
    }

    class ComputerDesk extends Desk
    {   ...
    }

Both StudentDesk and ComputerDesk objects will need operations to simulate a move of the entity in physical space. The move operation can thus be implemented in the Desk class and shared by objects of both classes. Invocation of operations to move either a StudentDesk or a ComputerDesk will be bound to the general move in the Desk class.

The StudentDesk class might inherit from a Chair class as well as the Desk class.

    Furniture <-- Chair <-- StudentDesk

Some languages support multiple inheritance as shown above for StudentDesk (e.g., C++, Eiffel). Other languages only support a single inheritance hierarchy.

Because multiple inheritance is both difficult to use correctly and to implement in a compiler, the designers of Java and Scala did not include multiple inheritance of classes as features. Java has a single inheritance hierarchy with a top-level class named Object from which all other classes derive (directly or indirectly). Scala is similar, with the corresponding top-level class named AnyRef.

    class StudentDesk extends Desk, Chair  // NOT VALID in Java or Scala
    {   ...
    }

To see some of the problems in implementing multiple inheritance, consider the above example. Class StudentDesk inherits from class Furniture through two different paths. Do the data fields of the class Furniture occur once or twice? What happens if the intermediate classes Desk and Chair have conflicting definitions for a data field or operation with the same name?

The difficulties with multiple inheritance are greatly decreased if we restrict ourselves to inheritance of class interfaces (i.e., the signatures of a set of operations) rather than a supporting the inheritance of the class implementations (i.e., the instance data fields and operation implementations). Since interface inheritance can be very useful in design and programming, the Java designers introduced a separate mechanism for that type of inheritance.

The Java interface construct can be used to define an interface for classes separately from the classes themselves. A Java interface may inherit from (i.e., extend) zero or more other interface definitions.

    interface Location3D
    {   ...
    }

    interface HumanHolder
    {   ...
    }

    interface Seat extends Location3D, HumanHolder
    {   ...
    }

A Java class may inherit from (i.e., implement) zero or more interfaces as well as inherit from (i.e., extend) exactly one other class.

    interface BookHolder
    {   ...
    }

    interface BookBasket extends Location3D, BookHolder
    {   ...
    }

    class StudentDesk extends Desk implements Seat, BookBasket
    { ...  
    }

This definition requires the StudentDesk class to provide actual implementations for all the operations from the Location3D, HumanHolder, BookHolder, Seat, and BookBasket interfaces. The Location3D operations will, of course, need to be implemented in such a way that they make sense as part of both the HumanHolder and BookHolder abstractions.

The Scala trait provides a more powerful, and more complex, mechanism than Java's interface. In addition to signatures, a trait can define method implementations and data fields. These traits can be added to a class in a controlled, linearized manner to avoid the semantic and implementation problems associated with multiple inheritance of classes. This is called mixin inheritance.

Polymorphism

The concept of polymorphism (literally "many forms") means the ability to hide different implementations behind a common interface.

Polymorphism appears in several forms.

Overloading (or ad hoc polymorphism)
means using the same name (or symbol) for several different procedures or functions (i.e., operations) and allowing the compiler to choose the correct one based on the signatures (i.e., on the number, types, and order of the parameters).

Consider the following Java definitions in the same class:

    void move () { ... }

    void move (Location l) { ... }

    void move (Desk d) { ... }

The choice among the alternatives can be done statically at the time the program is compiled.

Java supports overloading of method calls, but does not support user-defined overloading of operator symbols. (C++ supports both.)

In Scala, the usual operator symbols are just identifiers that denote methods defined on the various classes. Hence, they can be overloaded and overridden in a manner similar to other method names.

Parametric polymorphism
denotes the use of type (or class) definitions that are parameterized by other types.

In the context of an object-oriented language, this kind of polymorphishm enables the definition of generic classes, that is, classes in which methods and attributes can be defined in terms of type parameters of the class. When a class is instantiated into an object, the programmer must supply specific types for these parameters.

The original versions of Java did not support parameterized class definitions. However, Java 1.5 added a generic class mechanims. Scala enhanced this concept further. The template facility in C++ and the generic facility in Ada are other examples of parametric polymorphism.

Polymorphism by inheritance (sometimes called pure polymorphism, but a subset of what Timothy Budd's textbook calls pure polymorphism)
means the association at runtime (or the dynamic binding) of an operation invocation (i.e., procedure or function call) with the appropriate operation implementation in an inheritance hierarchy.

This form of polymorphism is carried out at run time. Given an object (i.e., class instance) to which an operation is applied, the system will first search for an implementation of the operation associated with the object's class. If no implementation is found in that class, the system will check the superclass, and so forth up the hierarchy until an appropriate implementation is found. Implementations of the operation may appear at several levels of the hierarchy.

The combination of dynamic binding with a well-chosen inheritance hierarchy allows the possibility of an instance of one subclass being substituted for an instance of a different subclass during execution. Of course, this can only be done when none of the extended operations of the subclass are being used.

As an example, again consider the simulation of a classroom. As in our discussion of inheritance, suppose that the StudentDesk and ComputerDesk classes are derived from the Desk class and that a general move operation is implemented as a part of the Desk class. This could be expressed in Java as follows:

    class Desk extends Furniture
    {   ...
        public void move(...)
        ...
    }

    class StudentDesk extends Desk
    {   ...
        // no move(...) operation here
        ...
    }

    class ComputerDesk extends Desk
    {   ...
        // no move(...) operation here
        ...
    }

As we noted before, invocation of operations to move either a StudentDesk or a ComputerDesk instance will be bound to the general move in the Desk class.

Extending the example, suppose that we need a special version of the move operation for ComputerDesk objects. For instance, we need to make sure that the computer is shut down and the power is disconnected before the entity is moved.

To do this, we can define this special version of the move operation and associate it with the ComputerDesk class. Now a call to move a ComputerDesk object will be bound to the special move operation, but a call to move a StudentDesk object will still be bound to the general move operation in the Desk class.

The definition of move in the ComputerDesk class is said to override the definition in the Desk class.

In Java, this can be expressed as follows:

    class Desk extends Furniture
    {   ...
        public void move(...)
        ...
    }

    class StudentDesk extends Desk
    {   ...
        // no move(...) operation here
        ...
    }

    class ComputerDesk extends Desk
    {   ...
        public void move(...)
        ...
    }

A class-based language is object-oriented if class hierarchies can be incrementally defined by an inheritance mechanism and the language supports polymorphism by inheritance along these class hierarchies.

Object-oriented languages include C++, Java, Scala, C#, Smalltalk, and Ada 95. The language Clu is class-based, but it does not include an inheritance facility.

Other object-oriented languages include Objective C, Object Pascal, Eiffel, and Oberon 2.

Requirements Analysis

The task of the analysis phase is to define the problem and write a clear, consistent, complete, and precise description of the system to be constructed. The analysts must elicit the requirements for the system from the clients and communicate the requirements to the system designers, who have the task of designing the new system.

The analysts must determine the scope of the system: What must the system accomplish? And, perhaps just as importantly, what behaviors are clearly outside the system in its environment?

In approaching the requirements analysis for a new system, the analysts should:

Good analysts are good detectives! Among the mass of detail, the analysts must find the clues that allow them to solve the mystery of what system the client needs. It is important that analysts keep a complete record of the information they have gathered and their reasoning on any "conclusions" they reach about that information.

The result of the analysts' work is a document called the requirements specification. Typically this will be a natural language (e.g., English) document. The writers of this document should use great care, using the language in a clear, consistent, and precise manner. The writers are establishing the vocabulary for communication with the clients and among the designers, implementers, and testers of the system.

Object-Oriented Design

The goals of the design phase are to:

The actual design process is iterative. Elaboration of a class may lead to identification of additional classes or changes to those already identified.

Classes should be crisply defined. It should be as easy as possible to determine what class an object belongs in. Coming up with a good categorization is often difficult in complex situations.

The responsibilities of a class are of two types:

Responsibilities should be defined precisely. Ambiguity and imprecision will likely cause problems later in the design and implementation.

The collaboration relationships among classes should not be excessively complex. A collaboration between two classes means that one class depends upon the other in some way. A change in one class may necessitate a change in the other.

The information gathered in the design phase is the basis for the implementation. A good design makes the implementation easier and faster and the resulting product more reliable.

The basic methods for identifying classes and responsibilities are as follows.

Method to find objects and their classes:
Begin with the nouns in the requirements specification.
Method to find responsibilities:
Begin with the verbs in the requirements specification.

Object-Oriented Design: Finding Classes and Responsibilities

As an example, consider the following telephone book example:
You are to build a computerized telephone book system for a university. The telephone book should contain entries for each person in the university community--student, professor, and staff member. Users of the directory can look up entries. In addition, the administrator of the telephone book can, after supplying a password, insert new entries, delete existing entries, modify existing entries, print the telephone book to a printer, and print a listing of all students or of all faculty. The entries in a listing are to be arranged in alphabetical order by family name.

To develop an object-oriented design model for this application, as designers we can carry out the following steps:

  1. Identify the candidate classes.

    Begin by listing the nouns, noun phrases, and pronoun antecedents from the requirements specification, changing all the plurals to singular.

    In the telephone book example, these include:
    you (the designer), computerized telephone book system, university, telephone book, entry, person, university community, student, professor, staff member, employee, user, directory, administrator, password, printer, listing, faculty, alphabetical order, family name

    Other classes may be be implicit in the specification or may emerge as the design of individual classes proceed and the designer's knowledge of the application domain increases. For example, the design may need classes corresponding to:

    Several, but not necessarily all, of these will become classes in our design.

  2. Identify the candidate responsibilities.

    Begin by listing the verbs from the specification. Listing the objects of transitive verbs may also be helpful.

    In the telephone book example, these include:
    build, contain (entry), look (up entry), supply (password), insert (new entry), delete (existing entry), modify (existing entry, print (telephone book), print (all students), print (all employees), set (telephone number field), get (telephone number field), compare (entries), to be arranged

    Transitive verbs become operations that respond to inputs to the object. All of the above verbs except "to be arranged" are transitive and, hence, will likely give rise to operations.

    Intransitive verbs typically specify attributes of classes. For example, "to be arranged in alphabetical order" in the above example denotes an ordering property of the entries in a listing. It is not an operation. Of course, in some cases, this kind of attribute might require a "sort" operation.

  3. Eliminate classes and responsibilities that are outside of the system scope.

    At this point, we may want to narrow the list of candidate classes by quickly dividing them into three categories based on their relevance to the system scope:

    1. critical classes (i.e., the "winners"), which we will definitely continue to consider. These are items that directly relate to the main entities of the application.

      In the telephone book example, these might include telephone book, entry, student, faculty, staff member, and so forth.

    2. irrelevant candidates (i.e., the "losers"), which we will definitely eliminate at this point. These are items that are clearly outside the system scope.

      In the telephone book example, these might include university which we do not need to explicitly model in this application) and printer (which is handled by other software/hardware outside of this application).

    3. undecided candidates (i.e., the "maybes"), which we will review further for categorization. These are items that we may not be able to categorize without first clarifying the system boundaries and definition

      In the telephone book example, these might include the computerized telephone book system, an item that definitely will be built (it is the whole system) but for which we may not need an explicit class.

  4. Combine synonym classes and synonym operations into single abstractions.

    For example, computerized telephone book, telephone book, and directory can be combined into a single PhoneBook class.

    Similarly, entry and person can be combined into a single Person class.

    In addition, all the print operations probably could be combined into a single print operation with the differing functionalities specified by the parameter.

  5. Distinguish attributes from classes.

    Some candidate classes may turn out to represent information held by other classes instead of being classes themselves.

    A candidate class may be an attribute (i.e., a responsibility) of another class rather than itself a class if:

    For example, we might choose to make the entity name an immutable string and make it an attribute of a class Person rather than have it a separate class.

  6. Be wary of adjectives.

    Adjectives modify nouns. In our technique, nouns give rise to classes or objects.

    Prepositional phrases may modify nouns in the requirements specification. Such phrases may lead to subclasses, objects, or instances as described above for adjectives.

    Adverbs may modify adjectives in the requirements specification. They provide other information about the adjective that should be considered in the analysis. For example, a reference to a "brilliantly red truck" might signal the need for another attribute, subclass, or instance that is different from a plain "red truck" or from a "dull red truck".

  7. Consider architectural design issues.

  8. Associate the operations with the appropriate classes.

    For example, in the telephone book example associate lookup, insert, delete, and modify entry operations with the PhoneBook class, associate compare, setPhoneNumber and getPhoneNumber with the Person class, etc.

    Sometimes there might be a choice on where to associate an operation. For example, the "insert person into telephone book" operation could be a operation of Person (that is, insert this person into the argument telephone book) or a operation of PhoneBook (that is, insert the argument person in this telephone book).

    Which is better? Associating the operation with Person would require the Person class to have access to the internal representation details of the PhoneBook. However, associating the operation with PhoneBook would not require the PhoneBook to know about the internal details of the Person class--except to be able to compare two entries. Thus making insert an operation of PhoneBook and compare an operation of Person would best maintain the encapsulation of data.

Principle:
An object cannot manipulate the internal data of another object directly; it must use an operation of that object.

During design we should not be concerned with the minute details of the implementation. However, it is appropriate to consider whether there is a "reasonable" implementation. In fact, it is better to make sure there are two different possible implementations to ensure flexibility and adaptability.

It is good to have more than one person involved in a design. A second designer can review the first's work more objectively and ask difficult questions -- and vice versa.

Coming Up with Names

The selection of names for classes and operations is an important task. Give it sufficient time and thought.

Object-Oriented Design: Finding Relationships Among Classes

Two classes in a design may be related in one of several ways. The three relationships that are common are:

Use Relationship

Class A uses B when:

If objects of A can carry out all operations without any awareness of B, then A does not use B.

From the telephone book example, PhoneBook uses Person objects (that is, it inserts, deletes, modifies, etc., the entries). Person objects do not use PhoneBook objects.

If class A uses class B, then a change to class B (particularly to its public interface) may necessitate changes to class A. The following principle will make modification of a design and implementation easier.

Principle:
Minimize the coupling between classes (that is, the number of classes used by a class.)

Aggregation Relationship

Class A uses class B for aggregation if objects of A contain objects of B.

Note: Object A contains (has an) object B if A has an instance variable that somehow designates object B--a Java reference to B, the index of B, the key of B, etc.

Aggregation is a one-to-N relationship; one object might contain several others.

Note: Aggregation is a special case of the use relationship. If asked to identify the aggregation and use relationships, identify an aggregation relationship as such rather than as a use relationship.

For example, objects of the Person class in the telephone book example contain (has) name, address, and phoneNumber objects.

The Pascal record and C structure are aggregations; they contain other data fields. They are not, however, objects.

Inheritance Relationship

Class D inherits from class B if all objects of D are also objects of B.

As noted previously,

In the telephone book example, we design Professor to inherit from Employee; similarly, we design Staff to inherit from Employee. We make Employee itself inherit from Person; similarly, we design Student to inherit from Person.

Inheritance can lead to powerful and extensible designs. However, use and aggregation are more common than inheritance.

Object-Oriented Implementation

The outputs of the object-oriented design phase include:

The purpose of the implementation phase is to code, test, and integrate the classes into the desired application system.

Object-oriented development involves an integration of testing with implementation:

Note: If we seek to evolve a prototype into a full program, we should never hesitate to reopen the analysis or design if new insight is discovered.

Acknowledgments

Some of the material here is based on the presentation in the following books:

The first version of these lecture notes were written for use in the CSci 490 section on object-oriented design and programming using C++ in the 1996 spring semester. The notes were subsequently expanded and put on the Web for the first Java-based version of CSci 211 (File Systems) during the 1996 fall semester. They were further modified for use in the CSci 581 (Object-Oriented Design and Programming) classes in the 1997 fall and 1999 spring semesters, in the CSci 405 (Computer Simulation) class in the 2000 spring semester, and the Engr 691 (Software Architecture) class in the 2000 fall and 2002 spring semesters.

The notes were updated for use in the Scala-based Multiparadigm Programming in Scala class in Fall 2008 and Spring 2012 and Software Families class in Fall 2011.

This version was updated for the Lua-based Fall 2013 offering of CSci 658 Software Language Engineering and the Lua component in the Fall 2014 offering of CSci 450 Organization of Programming Languages.


Copyright © 2014, H. Conrad Cunningham
Last modified: Sun Nov 9 16:22:01 CST 2014