11 September 2018
Browser Advisory: The HTML version of this textbook requires a browser that supports the display of MathML. A good choice as of September 2018 is a recent version of Firefox from Mozilla.
Chapter 2 introduced the concepts of procedural and data abstraction. Chapter 6 focuses on procedural abstraction and modular design and programming. This chapter focuses on data abstraction.
The goals of this chapter are to:
illustrate use of data abstraction
reinforce and extend the concepts of modular design and programming using Haskell modules
Data abstraction can help make a program robust with respect to change in the data. As in the previous chapter, let's begin the study of this design technique with an example.
For this example, let's implement a group of Haskell functions to perform rational number arithmetic, assuming that the Haskell library does not contain such a data type.
In mathematics we usually write rational numbers in the form $\frac{x}{y}$ where $x$ and $y$ are integers and $y \neq 0$.
For now, let us assume we have a special type Rat to represent rational numbers and a constructor function makeRat :: Int -> Int -> Rat to create a Haskell rational number instance from a numerator x and a denominator y. That is, makeRat x y constructs a Haskell rational number with mathematical value $\frac{x}{y}$, where $y \neq 0$.
Let us also assume we have selector functions numer and denom, each with the signature Rat -> Int.
Functions numer and denom take a valid Haskell rational number and return its numerator and denominator, respectively.
For any Int values x and y where $y \neq 0$, there exists a Haskell rational number r such that makeRat x y == r and the rational number values satisfy $\frac{\texttt{numer r}}{\texttt{denom r}} = \frac{x}{y}$.
Note: In this example, we use fraction notation like $\frac{x}{y}$ to denote the mathematical value of the rational number. In contrast, r above denotes a Haskell value representing a rational number.
We consider how to implement rational numbers in Haskell later, but for now let's look at rational arithmetic implemented using the constructor and selector functions specified above.
Given our knowledge of rational arithmetic from mathematics, we can define the operations for unary negation, addition, subtraction, multiplication, division, and equality as follows. We assume that the operands x and y are values created by the constructor makeRat.
    negRat :: Rat -> Rat 
    negRat x = makeRat (- numer x) (denom x)
    addRat, subRat, mulRat, divRat :: Rat -> Rat -> Rat  -- (1)
    addRat x y = makeRat (numer x * denom y + numer y * denom x)
                         (denom x * denom y) 
    subRat x y = makeRat (numer x * denom y - numer y * denom x)
                         (denom x * denom y) 
    mulRat x y = makeRat (numer x * numer y) (denom x * denom y) 
    divRat x y  -- (2) (3)
        | eqRat y zeroRat = error "Attempt to divide by 0"
        | otherwise       = makeRat (numer x * denom y)
                                    (denom x * numer y)
    eqRat :: Rat -> Rat -> Bool 
    eqRat x y = (numer x) * (denom y) == (numer y) * (denom x)
The above code:
(1) combines the type signatures for all four arithmetic operations into a single declaration by listing the names separated by commas
(2) introduces the parameterless function zeroRat to abstract the constant rational number value 0
Note: We could represent zero as makeRat 0 1 but choose to introduce a separate abstraction.
(3) calls the error function for an attempt to divide by zero
These arithmetic functions do not depend upon any specific representation for rational numbers. Instead, they use rational numbers as a data abstraction defined by the type Rat, constant zeroRat, constructor function makeRat, and selector functions numer and denom.
The goal of a data abstraction is to separate the logical properties of data from the details of how the data are represented.
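As a minimal sketch of what this separation buys, the block below writes client code against the interface only. The pair representation and the Prelude calls inside makeRat are stand-in assumptions so that the sketch runs on its own; sumRats is a hypothetical helper, not part of the package.

```haskell
-- Stand-in representation (an assumption for this sketch only).
type Rat = (Int, Int)

zeroRat :: Rat
zeroRat = (0, 1)

makeRat :: Int -> Int -> Rat
makeRat _ 0 = error "zero denominator"
makeRat 0 _ = zeroRat
makeRat x y = (x' `div` d, y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

numer, denom :: Rat -> Int
numer (x, _) = x
denom (_, y) = y

-- Client code: uses only the interface above, never the tuple structure.
addRat :: Rat -> Rat -> Rat
addRat x y = makeRat (numer x * denom y + numer y * denom x)
                     (denom x * denom y)

eqRat :: Rat -> Rat -> Bool
eqRat x y = numer x * denom y == numer y * denom x

-- Hypothetical helper: sum a list of rational numbers.
sumRats :: [Rat] -> Rat
sumRats = foldr addRat zeroRat
```

For instance, sumRats [makeRat 1 2, makeRat 1 3, makeRat 1 6] should compare equal, under eqRat, to makeRat 1 1. Because addRat, eqRat, and sumRats never pattern match on the pair, they would not need to change if the stand-in representation were swapped for another one.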
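Now, how can we represent rational numbers?
For this package, we represent a rational number as a pair (x,y) of Int values, with x the numerator and y the denominator, and define the type synonym Rat to denote this type, that is, type Rat = (Int,Int).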
For example, (1,7), (-1,-7), (3,21), and (168,1176) all represent the value $\frac{1}{7}$.
As with any value that can be expressed in many different ways, it is useful to define a single canonical (or normal) form for representing values in the rational number type Rat.
It is convenient for us to choose a Haskell rational number representation (x,y) that satisfies all parts of the following Rational Representation Property:
(x,y) $\in$ (Int,Int)
y > 0
if x == 0, then y == 1
x and y are relatively prime
rational number value is $\frac{x}{y}$
By relatively prime, we mean that the two integers have no common divisors except 1.
This representation keeps the magnitudes of the numerator x and denominator y small, thus reducing problems with overflow arising during arithmetic operations.
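The following sketch, which assumes nonzero denominators, checks that the four example pairs given earlier denote the same value and that only one of them is in canonical form. The names sameValue, isCanonical, and examples are hypothetical helpers introduced for illustration; gcd is the Prelude function.

```haskell
-- Two pairs denote the same rational value when cross-multiplication agrees
-- (assumes both denominators are nonzero).
sameValue :: (Int, Int) -> (Int, Int) -> Bool
sameValue (a, b) (c, d) = a * d == c * b

-- The Rational Representation Property read as a Boolean check.
isCanonical :: (Int, Int) -> Bool
isCanonical (x, y) = y > 0 && (x /= 0 || y == 1) && gcd x y == 1

-- The example pairs from the text.
examples :: [(Int, Int)]
examples = [(1, 7), (-1, -7), (3, 21), (168, 1176)]
```

Under these definitions, all (sameValue (1, 7)) examples holds, while filter isCanonical examples keeps only (1, 7).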
This representation also gives a unique representation for zero. For convenience, we define the name zeroRat to represent this constant, the pair (0,1).
We can now define constructor function makeRat x y that takes two Int values (for the numerator and the denominator) and returns the corresponding Haskell rational number in this canonical form.
    makeRat :: Int -> Int -> Rat
    makeRat x 0 = error ( "Cannot construct a rational number "
                          ++ show x ++ "/0" )         -- (1)
    makeRat 0 _ = zeroRat
    makeRat x y = (x' `div` d, y' `div` d)            -- (2)
        where x' = (signum' y) * x                    -- (3,4)
              y' = abs' y 
              d  = gcd' x' y'
In the definition of makeRat, we use features of Haskell we have not used in the previous examples. The above code:
(1) uses the infix ++ (read "append") operator to concatenate two strings
We discuss ++ in the chapter on infix operations.
(2) puts backticks (`) around an alphanumeric function name to use that function as an infix operator
The function div denotes integer division. Above, div written in backticks applies that function in an infix manner.
(3) uses a where clause to introduce x', y', and d as local definitions within the body of makeRat
These local definitions can be accessed from within makeRat but not from outside the function. In contrast, sqrtIter in the Square Root example is at the same level as sqrt', so it can be called by other functions (in the same Haskell module at least).
The where feature allows us to introduce new definitions in a top-down manner: first using a symbol and then defining it.
(4) uses type inference for local variables x', y', and d instead of giving explicit type declarations
These parameterless functions could be declared with the signature x', y', d :: Int, but that was not necessary because Haskell can infer the types from the types involved in their defining expressions.
Type inference can be used more broadly in Haskell, but explicit type declarations should be used for any function called from outside.
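To illustrate the point about local type declarations, here is a sketch of makeRat with an explicit signature added inside the where clause. It assumes the pair representation and uses the Prelude functions signum, abs, and gcd in place of the primed utility functions; removing the local signature line leaves the behavior unchanged.

```haskell
type Rat = (Int, Int)

zeroRat :: Rat
zeroRat = (0, 1)

makeRat :: Int -> Int -> Rat
makeRat x 0 = error ("Cannot construct a rational number " ++ show x ++ "/0")
makeRat 0 _ = zeroRat
makeRat x y = (x' `div` d, y' `div` d)
  where
    -- Optional explicit signature for the local definitions;
    -- without it, the compiler infers Int for all three.
    x', y', d :: Int
    x' = signum y * x
    y' = abs y
    d  = gcd x' y'
```

With these definitions, makeRat 3 (-21) evaluates to (-1,7) and makeRat (-4) (-6) to (2,3).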
We require that makeRat x y satisfy the precondition $y \neq 0$.
The function generates an explicit error exception if it does not.
As a postcondition, we require makeRat x y to return a result (x',y') such that:
(x',y') satisfies the Rational Representation Property
rational number value is $\frac{x}{y}$
Note: Together the two postcondition requirements imply that $\frac{x'}{y'} = \frac{x}{y}$.
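The precondition and postcondition can also be written as an executable check. The sketch below assumes the pair representation and Prelude gcd, signum, and abs; satisfiesProperty and postconditionHolds are hypothetical names, and equality of rational values is tested by cross-multiplication rather than by division.

```haskell
type Rat = (Int, Int)

makeRat :: Int -> Int -> Rat
makeRat _ 0 = error "zero denominator"
makeRat 0 _ = (0, 1)
makeRat x y = (x' `div` d, y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

-- Rational Representation Property as a Boolean check.
satisfiesProperty :: Rat -> Bool
satisfiesProperty (x, y) = y > 0 && (x /= 0 || y == 1) && gcd x y == 1

-- Postcondition of makeRat x y, to be called only when y /= 0:
-- the result is canonical and x'/y' equals x/y.
postconditionHolds :: Int -> Int -> Bool
postconditionHolds x y = satisfiesProperty (x', y') && x' * y == x * y'
  where (x', y') = makeRat x y
```

A check such as and [postconditionHolds x y | x <- [-12..12], y <- [-12..12], y /= 0] exercises the postcondition over a small range of inputs; it is a spot check, not a proof.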
The function signum' (similar to the more general function signum in the Prelude) takes an integer and returns the integer -1, 0, or 1 when the number is negative, zero, or positive, respectively.
The function abs' (similar to the more general function abs in the Prelude) takes an integer and returns its absolute value.
The function gcd' (similar to the more general function gcd in the Prelude) takes two integers and returns their greatest common divisor.
    gcd' :: Int -> Int -> Int 
    gcd' x y = gcd'' (abs' x) (abs' y) 
        where gcd'' x 0 = x 
              gcd'' x y = gcd'' y (x `rem` y)
Prelude operation rem returns the remainder from dividing its first operand by its second.
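The text describes signum' and abs' without showing their definitions. One plausible way to write them, offered as a sketch based on the descriptions above rather than as the package's actual code, is:

```haskell
-- Sign of an integer: -1 for negative, 0 for zero, 1 for positive.
signum' :: Int -> Int
signum' n
  | n < 0     = -1
  | n == 0    = 0
  | otherwise = 1

-- Absolute value of an integer.
abs' :: Int -> Int
abs' n
  | n < 0     = -n
  | otherwise = n
```

On ordinary Int inputs these agree with the Prelude functions signum and abs.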
Given a tuple (x,y) constructed by makeRat as defined above, we can define the selectors directly: numer (x,y) = x and denom (x,y) = y, both of type Rat -> Int.
The preconditions of both numer (x,y) and denom (x,y) are that their arguments (x,y) satisfy the Rational Representation Property.
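The postcondition of numer (x,y) = x is that the rational number values $\frac{x}{\texttt{denom (x,y)}}$ and $\frac{x}{y}$ are equal.
Similarly, the postcondition of denom (x,y) = y is that the rational number values $\frac{\texttt{numer (x,y)}}{y}$ and $\frac{x}{y}$ are equal.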
Finally, to allow rational numbers to be displayed in the normal fractional representation, we include function showRat in the package. We use function show, found in the Prelude, here to convert an integer to the usual string format and use the list operator ++ to concatenate the two strings into one.
Unlike Rat, zeroRat, makeRat, numer, and denom, function showRat (as implemented) does not use knowledge of the data representation. We could optimize it slightly by allowing it to access the structure of the tuple directly.
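The two options can be sketched side by side, assuming the pair representation with trivial selectors. The name showRatDirect is hypothetical and stands for the "optimized" variant that reads the tuple directly.

```haskell
type Rat = (Int, Int)

numer, denom :: Rat -> Int
numer (x, _) = x
denom (_, y) = y

-- Goes through the selectors, so it does not depend on the representation.
showRat :: Rat -> String
showRat r = show (numer r) ++ "/" ++ show (denom r)

-- Hypothetical variant that pattern matches on the tuple directly.
-- It is tied to the pair representation.
showRatDirect :: Rat -> String
showRatDirect (x, y) = show x ++ "/" ++ show y
```

Both functions render (-1,7) as "-1/7". The direct variant saves two selector calls, but it is correct only while the stored pair is already in canonical form; a representation that stores unreduced pairs would need the selector-based version.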
There are three groups of functions in this package:
the six public rational arithmetic functions negRat, addRat, subRat, mulRat, divRat, and eqRat
the public type Rat, constant zeroRat, public constructor function makeRat, public selector functions numer and denom, and string conversion function showRat
the private utility functions signum', abs', and gcd', which are called only by the second group and are just reimplementations of Prelude functions anyway
RationalCore
As we have seen, data type Rat; constant zeroRat; functions makeRat, numer, denom, and showRat; and the functions' preconditions and postconditions form the interface to the data abstraction.
The data abstraction hides the information about the representation of the data. We can encapsulate this group of functions in a Haskell module as follows. This source code must also be in a file named RationalCore.hs.
    module RationalCore
        (Rat, makeRat, zeroRat, numer, denom, showRat)
    where
        -- Rat,makeRat,zeroRat,numer,denom,showRat definitions
In terms of the information-hiding approach, the secret of the RationalCore module is the rational number data representation used.
We can encapsulate the utility functions in a separate module, which would enable them to be used by several other modules.
However, given that the only use of the utility functions is within the data representation module, we choose not to separate them at this time. We leave them as local functions in the data abstraction module. Of course, we could also eliminate them and use the corresponding Prelude functions directly.
Rational
Similarly, functions negRat, addRat, subRat, mulRat, divRat, and eqRat use the core data abstraction and, in turn, extend the interface to include rational number arithmetic operations.
We can encapsulate these in another Haskell module that imports the module giving the data representation. This module must be in a file named Rational1.hs.
    module Rational1
      ( Rat, zeroRat, makeRat, numer, denom, showRat,
        negRat, addRat, subRat, mulRat, divRat, eqRat )
    where
        import RationalCore
        -- negRat,addRat,subRat,mulRat,divRat,eqRat definitions
Other modules that use the rational number package can import module Rational1.
The modularization described above (potentially):
enables a module to be reused in several different programs
offers robustness with respect to change
The data representation and arithmetic algorithms can change independently.
allows multiple implementations of each module as long as the public (abstract) interface is kept stable
enables understanding of one module without understanding the internal details of modules it uses
costs some in terms of extra code and execution efficiency
But that probably does not matter given the benefits above and the code optimizations carried out by the compiler.
In the rational number data representation above, constructor makeRat creates pairs in which the two integers are relatively prime and the sign is on the numerator. Selector functions numer and denom just return these stored values.
An alternative representation is to reverse this approach, as shown in the following module (in file RationalDeferGCD.hs).
    module RationalDeferGCD
        (Rat, zeroRat, makeRat, numer, denom, showRat)
    where
    type Rat = (Int,Int)
    zeroRat :: (Int,Int)
    zeroRat = (0,1)
    makeRat :: Int -> Int -> Rat
    makeRat x 0 = error ( "Cannot construct a rational number "
                          ++ show x ++ "/0" )
    makeRat 0 y = zeroRat 
    makeRat x y = (x,y)
    numer :: Rat -> Int
    numer (x,y) = x' `div` d
        where x' = (signum' y) * x 
              y' = abs' y 
              d  = gcd' x' y'
    denom :: Rat -> Int
    denom (x,y) = y' `div` d
        where x' = (signum' y) * x 
              y' = abs' y 
              d  = gcd' x' y'
    showRat :: Rat -> String
    showRat x = show (numer x) ++ "/" ++ show (denom x)
This approach defers the calculation of the greatest common divisor until a selector is called.
In this alternative representation, a rational number (x,y) must satisfy all parts of the following Deferred Representation Property:
(x,y) $\in$ (Int,Int)
y /= 0
if x == 0, then y == 1
rational number value is $\frac{x}{y}$
We require that makeRat x y satisfy the precondition $y \neq 0$.
The function generates an explicit error condition if it does not.
As a postcondition, we require makeRat x y to return a result (x',y') such that:
(x',y') satisfies the Deferred Representation Property
rational number value is $\frac{x}{y}$
The preconditions of both numer (x,y) and denom (x,y) are that (x,y) satisfies the Deferred Representation Property.
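The postcondition of numer (x,y) = x' is that the rational number values $\frac{x'}{\texttt{denom (x,y)}}$ and $\frac{x}{y}$ are equal.
Similarly, the postcondition of denom (x,y) = y' is that the rational number values $\frac{\texttt{numer (x,y)}}{y'}$ and $\frac{x}{y}$ are equal.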
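The effect of deferring the reduction can be seen in a small sketch. It assumes the pair representation and Prelude signum, abs, and gcd, and it factors the where clauses shared by numer and denom into a hypothetical helper named normalize; the module in the text repeats them in each selector instead.

```haskell
type Rat = (Int, Int)

makeRat :: Int -> Int -> Rat
makeRat _ 0 = error "zero denominator"
makeRat 0 _ = (0, 1)
makeRat x y = (x, y)          -- stored as given, not reduced

-- Hypothetical helper: reduce to lowest terms with a positive denominator.
normalize :: Rat -> (Int, Int)
normalize (x, y) = (x' `div` d, y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

-- The selectors do the reduction each time they are called.
numer, denom :: Rat -> Int
numer = fst . normalize
denom = snd . normalize
```

Here makeRat 3 (-21) is stored as (3,-21), yet numer and denom applied to it yield -1 and 7, the same results a client would observe with the canonical representation.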
Question:
What are the advantages and disadvantages of the two data representations?
Like module RationalCore, the design secret for this module, RationalDeferGCD, is the rational number data representation.
Regardless of which approach is used, the definitions of the arithmetic and comparison functions do not change. Thus the Rational module can import data representation module RationalCore or RationalDeferGCD.
Figure 7-1 shows the dependencies among the modules we have examined in the rational arithmetic example.
We can consider the RationalCore and RationalDeferGCD modules as two concrete instances (Haskell modules) of a more abstract module we call RationalRep in the diagram.
The module Rational relies on the abstract module RationalRep for an implementation of rational numbers. In the Haskell code above, there are really two versions of the Haskell module Rational that differ only in whether they import RationalCore or RationalDeferGCD.
We could also replace alias Rat by a user-defined type to get another alternative definition of RationalRep, as long as the interface functions do not have to work with types other than Int.
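As a sketch of that alternative, the block below replaces the tuple alias with an algebraic data type. The data constructor name Rat and the use of Prelude signum, abs, and gcd are assumptions made for illustration; the interface functions keep the same names and types as before.

```haskell
-- Hypothetical user-defined type replacing the tuple alias.
data Rat = Rat Int Int

zeroRat :: Rat
zeroRat = Rat 0 1

makeRat :: Int -> Int -> Rat
makeRat x 0 = error ("Cannot construct a rational number " ++ show x ++ "/0")
makeRat 0 _ = zeroRat
makeRat x y = Rat (x' `div` d) (y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

numer, denom :: Rat -> Int
numer (Rat x _) = x
denom (Rat _ y) = y

showRat :: Rat -> String
showRat r = show (numer r) ++ "/" ++ show (denom r)
```

One design consequence: if a module exports the type name Rat without its data constructor, clients can build values only through makeRat, whereas with the tuple alias any client can write down an arbitrary pair.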
In the Rational Arithmetic example, we defined two information-hiding modules:
"RationalRep", whose secret is how to represent the rational number data and whose interface consists of the data type Rat, constant zeroRat, operations (functions) makeRat, numer, denom, and showRat, and the constraints on these types and functions
"Rational", whose secret is how to implement the rational number arithmetic and whose interface consists of operations (functions) negRat, addRat, subRat, mulRat, divRat, and eqRat, the other module's interface, and the constraints on these types and functions
We developed two distinct Haskell modules, RationalCore and RationalDeferGCD, to implement the "RationalRep" information-hiding module.
We developed one distinct Haskell module, Rational, to implement the "Rational" information-hiding module. This module can be paired (i.e., by changing the import statement) with either of the other two variants of the "RationalRep" module. (Source file Rational1.hs imports module RationalCore; source file Rational2.hs imports RationalDeferGCD.)
Unfortunately, Haskell 2010 has a relatively weak module system that does not support multiple implementations as well as we might like. There is no way to declare that multiple Haskell modules have the same interface other than copying the common code into each module and documenting the interface carefully. We must also have multiple versions of Rational that differ only in which other module is imported.
Together the Glasgow Haskell Compiler (GHC) release 8.2 (July 2017) and the Cabal-Install package manager release 2.0 (August 2017) support a new extension, the Backpack mixin package system. This new system remedies the above shortcoming. In this new approach, we would define the abstract module "RationalRep" as a signature file and require that RationalCore and RationalDeferGCD conform to it.
Further discussion of this new module system is beyond the scope of this chapter.
Chapter 12 discusses testing of the Rational modules designed in this chapter. The test scripts are as follows:
Module RationalRep
TestRatRepCore.hs for RationalCore
TestRatRepDefer.hs for RationalDeferGCD
Module Rational
TestRational1.hs for Rational using RationalCore.
TestRational2.hs for Rational using RationalDeferGCD.
As we see in the rational arithmetic example, a module that provides a data abstraction must ensure that the objects it creates and manipulates maintain their integrity, that is, always have a valid structure and state.
The RationalCore rational number representation satisfies the Rational Representation Property.
The RationalDeferGCD rational number representation satisfies the Deferred Representation Property.
These properties are invariants for those modules. An invariant for the data abstraction can help us design and implement such objects.
Invariant: A logical assertion that must always be true for every "object" created by the public constructors and manipulated only by the public operations of the data abstraction.
Often, we separate an invariant into two parts.
Interface invariant: An invariant stated in terms of the public features and abstract properties of the "object".
Implementation (representation) invariant: A detailed invariant giving the required relationships among the internal features of the implementation of an "object".
An interface invariant is a key aspect of the abstract interface of a module. It is useful to the users of the module, as well as to the developers.
In the Rational Arithmetic example, the interface invariant for the "RationalRep" abstract module is the following.
For any valid Haskell rational number r, all the following hold:
r $\in$ Rat
denom r > 0
if numer r == 0, then denom r == 1
numer r and denom r are relatively prime
the (mathematical) rational number value is $\frac{\texttt{numer r}}{\texttt{denom r}}$
We note that the precondition for makeRat x y is defined above without any dependence upon the concrete representation.
We can restate the postcondition for makeRat x y = r generically to require both of the following to hold:
r satisfies the RationalRep Interface Invariant
rational number r's value is $\frac{x}{y}$
The preconditions of both numer r and denom r are that their argument r satisfies the RationalRep Interface Invariant.
The postcondition of numer r = x' is that the rational number value $\frac{x'}{\texttt{denom r}}$ is equal to the rational number value of r.
Similarly, the postcondition of denom r = y' is that the rational number value $\frac{\texttt{numer r}}{y'}$ is equal to the rational number value of r.
An implementation invariant guides the developers in the design and implementation of the internal details of a module. It relates the internal details to the interface invariant.
RationalCore
We can state an implementation invariant for the RationalCore module.
For any valid Haskell rational number r, all the following hold:
r == (x,y) for some (x,y) $\in$ Rat
y > 0
if x == 0, then y == 1
x and y are relatively prime
rational number value is $\frac{x}{y}$
The implementation invariant implies the interface invariant given the definitions of data type Rat and selector functions numer and denom. Constructor function makeRat does the work to establish the invariant initially.
RationalDeferGCD
We can state an implementation invariant for the RationalDeferGCD module.
For any valid Haskell rational number r, all the following hold:
r == (x,y) for some (x,y) $\in$ Rat
y /= 0
if x == 0, then y == 1
rational number value is $\frac{x}{y}$
The implementation invariant implies the interface invariant given the definitions of Rat and of the selector functions numer and denom. Constructor function makeRat is simple, but the selector functions numer and denom do quite a bit of work to establish the interface invariant.
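One way to see that both representations meet the same interface invariant is to write the invariant once, in terms of the selectors, and apply it to each representation. The sketch below does this under the assumption of the pair representation and Prelude gcd, signum, and abs; all names in it (interfaceInvariant, reduce, and the Core and Defer suffixed functions) are hypothetical and exist only so the two variants can sit in one file.

```haskell
-- Interface invariant written against the selectors only, so the same
-- check applies to any representation type r.
interfaceInvariant :: (r -> Int) -> (r -> Int) -> r -> Bool
interfaceInvariant numer denom r =
     denom r > 0
  && (numer r /= 0 || denom r == 1)
  && gcd (numer r) (denom r) == 1

-- Shared reduction to lowest terms with a positive denominator.
reduce :: (Int, Int) -> (Int, Int)
reduce (x, y) = (x' `div` d, y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

-- RationalCore style: reduce in the constructor, trivial selectors.
makeCore :: Int -> Int -> (Int, Int)
makeCore _ 0 = error "zero denominator"
makeCore x y = reduce (x, y)

numerCore, denomCore :: (Int, Int) -> Int
numerCore = fst
denomCore = snd

-- RationalDeferGCD style: trivial constructor, reduce in the selectors.
makeDefer :: Int -> Int -> (Int, Int)
makeDefer _ 0 = error "zero denominator"
makeDefer 0 _ = (0, 1)
makeDefer x y = (x, y)

numerDefer, denomDefer :: (Int, Int) -> Int
numerDefer = fst . reduce
denomDefer = snd . reduce
```

For any x and nonzero y, interfaceInvariant numerCore denomCore (makeCore x y) and interfaceInvariant numerDefer denomDefer (makeDefer x y) should both hold, and the two pairs of selectors should return the same numerator and denominator even though the stored pairs differ.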
The Rational abstract module extends the RationalRep abstract module with new functionality.
It imports the public interface of the RationalRep abstract module and exports those features in its own public interface. Thus it must maintain the interface invariant for the RationalRep module it uses.
It does not add any new data types or constructor (or destructor) functions. So it does not need any new invariant components for new data abstractions.
It adds one unary and four binary arithmetic functions that take rational numbers and return a rational number. It does so by using the data abstraction provided by the RationalRep module. These must preserve the RationalRep interface invariant.
It adds an equality comparison function that takes two rational numbers and returns a Bool.
The previous chapter examined procedural abstraction and stepwise refinement for development of a square root package.
This chapter examined data abstraction for development of a rational number arithmetic package. The chapters explored concepts and methods for modular design and programming using Haskell, including preconditions, postconditions, and invariants.
The next chapter examines the substitution model for evaluation of Haskell programs and explores efficiency and termination in the context of that model.
A later chapter examines how to test the modules developed in this example.
For each of the following exercises, develop and test a Haskell function or set of functions.
Develop a Haskell module (or modules) for line segments on the two-dimensional coordinate plane using the rectangular coordinate system.
We can represent a line segment with two points: the starting point and the ending point. Develop the following Haskell functions:
constructor newSeg that takes two points and returns a new line segment
selectors startPt and endPt that each take a segment and return its starting and ending points, respectively
We normally represent the plane with a rectangular coordinate system. That is, we use two axes, an x axis and a y axis, intersecting at a right angle. We call the intersection point the origin and label it with 0 on both axes. We normally draw the x axis horizontally and label it with increasing numbers to the right and decreasing numbers to the left. We also draw the y axis vertically with increasing numbers upward and decreasing numbers downward. Any point in the plane is uniquely identified by its x-coordinate and y-coordinate.
Define a data representation for points in the rectangular coordinate system and develop the following Haskell functions:
constructor newPtFromRect that takes the x and y coordinates of a point and returns a new point
selectors getx and gety that each take a point and return the x and y coordinates, respectively
display function showPt that takes a point and returns an appropriate String representation for the point
Now, using the various constructors and selectors, also develop the Haskell functions for line segments:
midPt that takes a line segment and returns the point at the middle of the segment
display function showSeg that takes a line segment and returns an appropriate String representation
Note that newSeg, startPt, endPt, midPt, and showSeg can be implemented independently from how the points are represented.
Develop a Haskell module (or modules) for line segments that represents points using the polar coordinate system instead of the rectangular coordinate system used in the previous exercise.
A polar coordinate system represents a point in the plane by its radial coordinate r (i.e., the distance from the pole) and its angular coordinate t (i.e., the angle from the polar axis in the reference direction). We sometimes call r the magnitude and t the angle.
By convention, we align the rectangular and polar coordinate systems by making the origin the pole, the positive portion of the x axis the polar axis, and let the first quadrant (where both x and y are positive) be the smallest positive angles in the reference direction. That is, with a traditional drawing of the coordinate systems, we measure the radial coordinate r as the distance from the origin and measure the angular coordinate t counterclockwise from the positive x axis.
Using knowledge of trigonometry, we can convert among rectangular coordinates (x,y) and polar coordinates (r,t) using the equations $x = r \cos(t)$, $y = r \sin(t)$, $r = \sqrt{x^2 + y^2}$, and $t = \operatorname{arctan2}(y, x)$.
Define a data representation for points in the polar coordinate system and develop the following Haskell functions:
constructor newPtFromPolar that takes the magnitude r and angle t as the polar coordinates of a point and returns a new point
selectors getMag and getAng that each take a point and return the magnitude r and angle t coordinates, respectively
selectors getx and gety that return the x and y components of the points (represented here in polar coordinates)
display functions showPtAsRect and showPtAsPolar to convert the points to strings using rectangular and polar coordinates, respectively
Functions newSeg, startPt, endPt, midPt, and showSeg should work as in the previous exercise.
Modify the solutions to the previous two line-segment module exercises to enable the line segment functions to be in one module that works properly if composed with either of the two data representation modules. (The solutions may have already done this.)
Modify the solution to the previous line-segment exercise to use the Backpack module system.
Modify the modules in the previous exercise to enable the line segment module to work with both data representations in the same program.
Modify the solution to the Rational Arithmetic example to use the Backpack module system.
State preconditions and postconditions for the functions in abstract module Rational.
In Summer and Fall 2016, I adapted and revised much of this work from my previous materials:
Discussion of the Rational Arithmetic modules mostly from chapter 5 of my Notes on Functional Programming with Haskell [Cunningham 2014], from my Lua-based implementations, and from section 2.1 of Abelson and Sussman's Structure and Interpretation of Computer Programs [Abelson 1996]
Discussion of modular design and programming issues from my Data Abstraction [Cunningham 2018a] and Modular Design [Cunningham 2018b] notes, which draw from the ideas of several of the references listed below
In 2017, I continued to develop this work as Sections 2.6-2.7 in Chapter 2, Basic Haskell Functional Programming, of my 2017 Haskell-based programming languages textbook.
In Spring and Summer 2018, I divided the previous Basic Haskell Functional Programming chapter into four chapters in the 2018 version of the textbook, now titled Exploring Languages with Interpreters and Functional Programming. Previous sections 2.1-2.3 became the basis for new Chapter 4, First Haskell Programs; previous Section 2.4 became Section 5.3 in the new Chapter 5, Types; and previous sections 2.5-2.7 were reorganized into new Chapter 6, Procedural Abstraction, and Chapter 7, Data Abstraction (this chapter).
I maintain this chapter as text in Pandoc's dialect of Markdown using embedded LaTeX markup for the mathematical formulas and then translate the document to HTML, PDF, and other forms as needed.
Haskell module, module exports and imports, module dependencies, rational number arithmetic, data abstraction, properties of data, data representation, precondition, postcondition, invariant, interface invariant, implementation or representation invariant, canonical or normal forms, relatively prime, information hiding, module secret, encapsulation, interface, abstract interface, type inference.