11 September 2018
Chapter 2 introduced the concepts of procedural and data abstraction. Chapter 6 focuses on procedural abstraction and modular design and programming. This chapter focuses on data abstraction.
The goals of this chapter are to:
illustrate use of data abstraction
reinforce and extend the concepts of modular design and programming using Haskell modules
Data abstraction can help make a program robust with respect to change in the data. As in the previous chapter, let's begin the study of this design technique with an example.
For this example, let's implement a group of Haskell functions to perform rational number arithmetic, assuming that the Haskell library does not contain such a data type.
In mathematics we usually write rational numbers in the form $\frac{x}{y}$ where $x$ and $y$ are integers and $y \neq 0$.
For now, let us assume we have a special type `Rat` to represent rational numbers and a constructor function
    makeRat :: Int -> Int -> Rat
to create a Haskell rational number instance from a numerator `x` and a denominator `y`. That is, `makeRat x y` constructs a Haskell rational number with mathematical value $\frac{x}{y}$, where $y \neq 0$.
Let us also assume we have selector functions `numer` and `denom` with the signatures:
    numer, denom :: Rat -> Int
Functions `numer` and `denom` take a valid Haskell rational number and return its numerator and denominator, respectively.
For any `Int` values `x` and `y` where $y \neq 0$, there exists a Haskell rational number `r` such that `makeRat x y == r` and rational number values $\frac{\texttt{numer r}}{\texttt{denom r}} = \frac{x}{y}$.
Note: In this example, we use fraction notation like $\frac{x}{y}$ to denote the mathematical value of the rational number. In contrast, `r` above denotes a Haskell value representing a rational number.
We consider how to implement rational numbers in Haskell later, but for now let's look at rational arithmetic implemented using the constructor and selector functions specified above.
Given our knowledge of rational arithmetic from mathematics, we can define the operations for unary negation, addition, subtraction, multiplication, division, and equality as follows. We assume that the operands `x` and `y` are values created by the constructor `makeRat`.
    negRat :: Rat -> Rat
    negRat x = makeRat (- numer x) (denom x)

    addRat, subRat, mulRat, divRat :: Rat -> Rat -> Rat  -- (1)
    addRat x y = makeRat (numer x * denom y + numer y * denom x)
                         (denom x * denom y)
    subRat x y = makeRat (numer x * denom y - numer y * denom x)
                         (denom x * denom y)
    mulRat x y = makeRat (numer x * numer y) (denom x * denom y)
    divRat x y                                           -- (2) (3)
        | eqRat y zeroRat = error "Attempt to divide by 0"
        | otherwise       = makeRat (numer x * denom y)
                                    (denom x * numer y)

    eqRat :: Rat -> Rat -> Bool
    eqRat x y = (numer x) * (denom y) == (numer y) * (denom x)
The above code:
1. combines the type signatures for all four arithmetic operations into a single declaration by listing the names separated by commas
2. introduces the parameterless function `zeroRat` to abstract the constant rational number value 0
   Note: We could represent zero as `makeRat 0 1` but choose to introduce a separate abstraction.
3. calls the `error` function for an attempt to divide by zero
These arithmetic functions do not depend upon any specific representation for rational numbers. Instead, they use rational numbers as a data abstraction defined by the type `Rat`, constant `zeroRat`, constructor function `makeRat`, and selector functions `numer` and `denom`.
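For reference, the arithmetic functions above appear to transcribe the usual identities for fractions. Writing the two operands as $\frac{x_1}{y_1}$ and $\frac{x_2}{y_2}$ (the subscripted names are introduced here only for illustration), the identities are:

```latex
\begin{align*}
-\frac{x_1}{y_1} &= \frac{-x_1}{y_1} \\
\frac{x_1}{y_1} + \frac{x_2}{y_2} &= \frac{x_1 y_2 + x_2 y_1}{y_1 y_2} \\
\frac{x_1}{y_1} - \frac{x_2}{y_2} &= \frac{x_1 y_2 - x_2 y_1}{y_1 y_2} \\
\frac{x_1}{y_1} \cdot \frac{x_2}{y_2} &= \frac{x_1 x_2}{y_1 y_2} \\
\frac{x_1}{y_1} \div \frac{x_2}{y_2} &= \frac{x_1 y_2}{y_1 x_2} \qquad (x_2 \neq 0) \\
\frac{x_1}{y_1} = \frac{x_2}{y_2} &\iff x_1 y_2 = x_2 y_1
\end{align*}
```

For instance, $\frac{1}{2} + \frac{1}{3} = \frac{1 \cdot 3 + 1 \cdot 2}{2 \cdot 3} = \frac{5}{6}$.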
The goal of a data abstraction is to separate the logical properties of data from the details of how the data are represented.
Now, how can we represent rational numbers?
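For this package, we represent a rational number as a pair of `Int` values (numerator and denominator) and define type synonym `Rat` to denote this type:
    type Rat = (Int,Int)
For example, `(1,7)`, `(-1,-7)`, `(3,21)`, and `(168,1176)` all represent the value $\frac{1}{7}$.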
As with any value that can be expressed in many different ways, it is useful to define a single canonical (or normal) form for representing values in the rational number type `Rat`.
It is convenient for us to choose a Haskell rational number representation `(x,y)` that satisfies all parts of the following Rational Representation Property:
`(x,y)` $\in$ `(Int,Int)`
`y > 0`
if `x == 0`, then `y == 1`
`x` and `y` are relatively prime
rational number value is $\frac{x}{y}$
By relatively prime, we mean that the two integers have no common divisors except 1.
This representation keeps the magnitudes of the numerator `x` and denominator `y` small, thus reducing problems with overflow arising during arithmetic operations.
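The Rational Representation Property can also be written as an executable check. The following is a minimal sketch, not part of the chapter's package: the names `isCanonical`, `candidates`, and `canonicalOnes` are hypothetical, and the Prelude's `gcd` is used to test for relative primality.

```haskell
-- Hypothetical helper (not part of the chapter's package): tests whether
-- a pair of Ints satisfies the Rational Representation Property.
isCanonical :: (Int, Int) -> Bool
isCanonical (x, y) =
    y > 0                    -- the denominator is positive
    && (x /= 0 || y == 1)    -- zero is represented only as (0,1)
    && gcd x y == 1          -- x and y are relatively prime (Prelude gcd)

-- The four pairs from the text that all denote the value 1/7.
candidates :: [(Int, Int)]
candidates = [(1,7), (-1,-7), (3,21), (168,1176)]

-- The candidates that are in canonical form.
canonicalOnes :: [(Int, Int)]
canonicalOnes = filter isCanonical candidates
```

Of the four pairs, only `(1,7)` passes the check, which is the sense in which the canonical form gives each rational value exactly one representation.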
This representation also gives a unique representation for zero. For convenience, we define the name `zeroRat` to represent this constant:
    zeroRat :: (Int,Int)
    zeroRat = (0,1)
We can now define constructor function `makeRat x y` that takes two `Int` values (for the numerator and the denominator) and returns the corresponding Haskell rational number in this canonical form.
    makeRat :: Int -> Int -> Rat
    makeRat x 0 = error ( "Cannot construct a rational number "
                          ++ show x ++ "/0" )          -- (1)
    makeRat 0 _ = zeroRat
    makeRat x y = (x' `div` d, y' `div` d)             -- (2)
        where x' = (signum' y) * x                     -- (3,4)
              y' = abs' y
              d  = gcd' x' y'
In the definition of `makeRat`, we use features of Haskell we have not used in the previous examples. The above code:
1. uses the infix `++` (read "append") operator to concatenate two strings
   We discuss `++` in the chapter on infix operations.
2. puts backticks (`` ` ``) around an alphanumeric function name to use that function as an infix operator
   The function `div` denotes integer division. Above, `` `div` `` applies that function in an infix manner.
3. uses a `where` clause to introduce `x'`, `y'`, and `d` as local definitions within the body of `makeRat`
   These local definitions can be accessed from within `makeRat` but not from outside the function. In contrast, `sqrtIter` in the Square Root example is at the same level as `sqrt'`, so it can be called by other functions (in the same Haskell module at least).
   The `where` feature allows us to introduce new definitions in a top-down manner: first using a symbol and then defining it.
4. uses type inference for local variables `x'`, `y'`, and `d` instead of giving explicit type definitions
   These parameterless functions could be declared
       x', y', d :: Int
   but it was not necessary because Haskell can infer the types from the types involved in their defining expressions.
   Type inference can be used more broadly in Haskell, but explicit type declarations should be used for any function called from outside.
We require that `makeRat x y` satisfy the precondition:
    y /= 0
The function generates an explicit error exception if the precondition does not hold.
As a postcondition, we require `makeRat x y` to return a result `(x',y')` such that:
`(x',y')` satisfies the Rational Representation Property
rational number value is $\frac{x}{y}$
Note: Together the two postcondition requirements imply that $\frac{x'}{y'} = \frac{x}{y}$.
The function `signum'` (similar to the more general function `signum` in the Prelude) takes an integer and returns the integer `-1`, `0`, or `1` when the number is negative, zero, or positive, respectively.
The function `abs'` (similar to the more general function `abs` in the Prelude) takes an integer and returns its absolute value.
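The listings for `signum'` and `abs'` do not appear in this text. A minimal sketch consistent with the descriptions above, written with guards, might look like the following. These definitions are assumptions made for illustration, not necessarily the original code.

```haskell
-- Sketch of the two utility functions described above, restricted to Int.
-- These bodies are assumed; only the behavior is taken from the text.

-- Returns -1, 0, or 1 for a negative, zero, or positive argument.
signum' :: Int -> Int
signum' n
    | n == 0    = 0
    | n > 0     = 1
    | otherwise = -1

-- Returns the absolute value of its argument.
abs' :: Int -> Int
abs' n
    | n >= 0    = n
    | otherwise = -n
```

On `Int` arguments these should agree with the Prelude's `signum` and `abs`.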
The function `gcd'` (similar to the more general function `gcd` in the Prelude) takes two integers and returns their greatest common divisor.
    gcd' :: Int -> Int -> Int
    gcd' x y = gcd'' (abs' x) (abs' y)
        where gcd'' x 0 = x
              gcd'' x y = gcd'' y (x `rem` y)
Prelude operation `rem` returns the remainder from dividing its first operand by its second.
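To see how the recursion in `gcd'` proceeds, it may help to trace one call by hand. The sketch below repeats the definition so that it stands alone, with two assumptions: the Prelude's `abs` stands in for `abs'`, and the local parameters are renamed `a` and `b`. The name `traceExample` is hypothetical.

```haskell
-- gcd' as defined in the text, with Prelude abs standing in for abs'
-- and the local parameters renamed to a and b for readability.
gcd' :: Int -> Int -> Int
gcd' x y = gcd'' (abs x) (abs y)
  where gcd'' a 0 = a
        gcd'' a b = gcd'' b (a `rem` b)

-- Hand trace for the pair (168,1176) used earlier in the text:
--   gcd' 168 1176
--     = gcd'' 168 1176
--     = gcd'' 1176 (168 `rem` 1176)   -- 168 `rem` 1176 is 168
--     = gcd'' 1176 168
--     = gcd'' 168 (1176 `rem` 168)    -- 1176 is 7 * 168, remainder 0
--     = gcd'' 168 0
--     = 168
traceExample :: Int
traceExample = gcd' 168 1176
```

Dividing both components of `(168,1176)` by this result gives `(1,7)`, the canonical pair for $\frac{1}{7}$.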
Given a tuple `(x,y)` constructed by `makeRat` as defined above, we can define `numer (x,y)` and `denom (x,y)` as follows:
    numer, denom :: Rat -> Int
    numer (x,_) = x
    denom (_,y) = y
The preconditions of both `numer (x,y)` and `denom (x,y)` are that their arguments `(x,y)` satisfy the Rational Representation Property.
The postcondition of `numer (x,y) = x` is that the rational number values $\frac{x}{\texttt{denom (x,y)}} = \frac{x}{y}$.
Similarly, the postcondition of `denom (x,y) = y` is that the rational number values $\frac{\texttt{numer (x,y)}}{y} = \frac{x}{y}$.
Finally, to allow rational numbers to be displayed in the normal fractional representation, we include function `showRat` in the package. We use function `show`, found in the Prelude, here to convert an integer to the usual string format and use the list operator `++` to concatenate the strings into one.
    showRat :: Rat -> String
    showRat x = show (numer x) ++ "/" ++ show (denom x)
Unlike `Rat`, `zeroRat`, `makeRat`, `numer`, and `denom`, function `showRat` (as implemented) does not use knowledge of the data representation. We could optimize it slightly by allowing it to access the structure of the tuple directly.
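The pieces defined so far can be assembled into one small program. The following is a sketch under stated assumptions: it uses the Prelude's `signum`, `abs`, and `gcd` in place of the primed utility functions, includes only `addRat` from the arithmetic group, and introduces the hypothetical names `sumExample` and `normalized` for the two sample values.

```haskell
-- Representation in canonical form, using the Prelude's signum, abs, and
-- gcd in place of the primed utility functions (an assumption of this sketch).
type Rat = (Int, Int)

zeroRat :: Rat
zeroRat = (0, 1)

makeRat :: Int -> Int -> Rat
makeRat x 0 = error ("Cannot construct a rational number " ++ show x ++ "/0")
makeRat 0 _ = zeroRat
makeRat x y = (x' `div` d, y' `div` d)
  where x' = signum y * x      -- move the sign to the numerator
        y' = abs y
        d  = gcd x' y'

numer, denom :: Rat -> Int
numer (x, _) = x
denom (_, y) = y

showRat :: Rat -> String
showRat r = show (numer r) ++ "/" ++ show (denom r)

-- One arithmetic function from the text, written only in terms of the
-- constructor and selectors.
addRat :: Rat -> Rat -> Rat
addRat x y = makeRat (numer x * denom y + numer y * denom x)
                     (denom x * denom y)

-- Hypothetical sample values: 1/2 + 1/3, and a non-canonical input pair.
sumExample, normalized :: String
sumExample = showRat (addRat (makeRat 1 2) (makeRat 1 3))
normalized = showRat (makeRat 168 (-1176))
```

Here `sumExample` evaluates to `"5/6"` and `normalized` to `"-1/7"`: the constructor reduces the pair and moves the sign to the numerator, while `addRat` never inspects the tuple itself.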
There are three groups of functions in this package:
1. the six public rational arithmetic functions `negRat`, `addRat`, `subRat`, `mulRat`, `divRat`, and `eqRat`
2. the public type `Rat`, constant `zeroRat`, public constructor function `makeRat`, public selector functions `numer` and `denom`, and string conversion function `showRat`
3. the private utility functions called only by the second group, but just reimplementations of Prelude functions anyway
RationalCore
As we have seen, data type `Rat`; constant `zeroRat`; functions `makeRat`, `numer`, `denom`, and `showRat`; and the functions' preconditions and postconditions form the interface to the data abstraction.
The data abstraction hides the information about the representation of the data. We can encapsulate this group of functions in a Haskell module as follows. This source code must also be in a file named `RationalCore.hs`.
    module RationalCore
      (Rat, makeRat, zeroRat, numer, denom, showRat)
    where
    -- Rat,makeRat,zeroRat,numer,denom,showRat definitions
In terms of the information-hiding approach, the secret of the `RationalCore` module is the rational number data representation used.
We can encapsulate the utility functions in a separate module, which would enable them to be used by several other modules.
However, given that the only use of the utility functions is within the data representation module, we choose not to separate them at this time. We leave them as local functions in the data abstraction module. Of course, we could also eliminate them and use the corresponding Prelude functions directly.
Rational
Similarly, functions `negRat`, `addRat`, `subRat`, `mulRat`, `divRat`, and `eqRat` use the core data abstraction and, in turn, extend the interface to include rational number arithmetic operations.
We can encapsulate these in another Haskell module that imports the module giving the data representation. This module must be in a file named `Rational1.hs`.
    module Rational1
      ( Rat, zeroRat, makeRat, numer, denom, showRat,
        negRat, addRat, subRat, mulRat, divRat, eqRat )
    where
    import RationalCore
    -- negRat,addRat,subRat,mulRat,divRat,eqRat definitions
Other modules that use the rational number package can import module `Rational1`.
The modularization described above (potentially):
enables a module to be reused in several different programs
offers robustness with respect to change
The data representation and arithmetic algorithms can change independently.
allows multiple implementations of each module as long as the public (abstract) interface is kept stable
enables understanding of one module without understanding the internal details of modules it uses
costs some in terms of extra code and execution efficiency
But that probably does not matter given the benefits above and the code optimizations carried out by the compiler.
In the rational number data representation above, constructor `makeRat` creates pairs in which the two integers are relatively prime and the sign is on the numerator. Selector functions `numer` and `denom` just return these stored values.
An alternative representation is to reverse this approach, as shown in the following module (in file `RationalDeferGCD.hs`).
    module RationalDeferGCD
      (Rat, zeroRat, makeRat, numer, denom, showRat)
    where

    type Rat = (Int,Int)

    zeroRat :: (Int,Int)
    zeroRat = (0,1)

    makeRat :: Int -> Int -> Rat
    makeRat x 0 = error ( "Cannot construct a rational number "
                          ++ show x ++ "/0" )
    makeRat 0 y = zeroRat
    makeRat x y = (x,y)

    numer :: Rat -> Int
    numer (x,y) = x' `div` d
        where x' = (signum' y) * x
              y' = abs' y
              d  = gcd' x' y'

    denom :: Rat -> Int
    denom (x,y) = y' `div` d
        where x' = (signum' y) * x
              y' = abs' y
              d  = gcd' x' y'

    showRat :: Rat -> String
    showRat x = show (numer x) ++ "/" ++ show (denom x)
This approach defers the calculation of the greatest common divisor until a selector is called.
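The effect of deferring the reduction can be observed by comparing the stored pair with what the selectors report. The following sketch restates the module under two assumptions: the Prelude's `signum`, `abs`, and `gcd` stand in for the primed utilities, and a hypothetical helper `reduce` factors out the `where` clause shared by the two selectors. The names `stored` and `observed` are also hypothetical.

```haskell
-- Deferred-GCD representation, following the module above but with the
-- Prelude's signum, abs, and gcd standing in for the primed utilities.
type Rat = (Int, Int)

makeRat :: Int -> Int -> Rat
makeRat x 0 = error ("Cannot construct a rational number " ++ show x ++ "/0")
makeRat 0 _ = (0, 1)
makeRat x y = (x, y)            -- stored as given; no reduction yet

-- Hypothetical helper: computes the canonical numerator and denominator
-- of a stored pair (the work both selectors perform in the module above).
reduce :: Rat -> (Int, Int)
reduce (x, y) = (x' `div` d, y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

numer, denom :: Rat -> Int
numer = fst . reduce
denom = snd . reduce

-- The pair held inside the abstraction for 3/(-21) ...
stored :: Rat
stored = makeRat 3 (-21)

-- ... and the numerator and denominator that clients see.
observed :: (Int, Int)
observed = (numer stored, denom stored)
```

Here `stored` is `(3,-21)` while `observed` is `(-1,7)`. Clients that use only the selectors cannot tell this representation apart from the canonical one, but the reduction is recomputed on every selector call.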
In this alternative representation, a rational number `(x,y)` must satisfy all parts of the following Deferred Representation Property:
`(x,y)` $\in$ `(Int,Int)`
`y /= 0`
if `x == 0`, then `y == 1`
rational number value is $\frac{x}{y}$
We require that `makeRat x y` satisfies the precondition:
    y /= 0
The function generates an explicit error condition if the precondition does not hold.
As a postcondition, we require `makeRat x y` to return a result `(x',y')` such that:
`(x',y')` satisfies the Deferred Representation Property
rational number value is $\frac{x}{y}$
The preconditions of both `numer (x,y)` and `denom (x,y)` are that `(x,y)` satisfies the Deferred Representation Property.
The postcondition of `numer (x,y) = x'` is that the rational number values $\frac{x'}{\texttt{denom (x,y)}} = \frac{x}{y}$.
Similarly, the postcondition of `denom (x,y) = y'` is that the rational number values $\frac{\texttt{numer (x,y)}}{y'} = \frac{x}{y}$.
Question:
What are the advantages and disadvantages of the two data representations?
Like module `RationalCore`, the design secret for this module, `RationalDeferGCD`, is the rational number data representation.
Regardless of which approach is used, the definitions of the arithmetic and comparison functions do not change. Thus the `Rational` module can import data representation module `RationalCore` or `RationalDeferGCD`.
Figure 7-1 shows the dependencies among the modules we have examined in the rational arithmetic example.
We can consider the `RationalCore` and `RationalDeferGCD` modules as two concrete instances (Haskell `module`s) of a more abstract module we call `RationalRep` in the diagram.
The module `Rational` relies on the abstract module `RationalRep` for an implementation of rational numbers. In the Haskell code above, there are really two versions of the Haskell module `Rational` that differ only in whether they import `RationalCore` or `RationalDeferGCD`.
We could also replace alias `Rat` by a user-defined type to get another alternative definition of `RationalRep`, as long as the interface functions do not have to work with types other than `Int`.
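As one possible illustration of a user-defined type, the sketch below replaces the type synonym with an algebraic data type whose constructor holds the canonical numerator and denominator. This is an assumption about what such a variant might look like, not code from the chapter. The constructor name `Rat` is chosen here for illustration, the Prelude's `signum`, `abs`, and `gcd` replace the primed utilities, and `example` is a hypothetical sample value.

```haskell
-- Hypothetical third representation: an algebraic data type instead of a
-- type synonym for a pair. The fields are numerator and denominator,
-- kept in canonical form by makeRat.
data Rat = Rat Int Int

zeroRat :: Rat
zeroRat = Rat 0 1

makeRat :: Int -> Int -> Rat
makeRat x 0 = error ("Cannot construct a rational number " ++ show x ++ "/0")
makeRat 0 _ = zeroRat
makeRat x y = Rat (x' `div` d) (y' `div` d)
  where x' = signum y * x
        y' = abs y
        d  = gcd x' y'

numer, denom :: Rat -> Int
numer (Rat x _) = x
denom (Rat _ y) = y

showRat :: Rat -> String
showRat r = show (numer r) ++ "/" ++ show (denom r)

-- Hypothetical sample value built through the public constructor.
example :: String
example = showRat (makeRat 6 (-8))
```

If such a module exported the type name `Rat` without its data constructor, clients could build values only through `makeRat`, so the arithmetic functions written against the interface would not need to change.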
In the Rational Arithmetic example, we defined two information-hiding modules:
"RationalRep", whose secret is how to represent the rational number data and whose interface consists of the data type `Rat`, constant `zeroRat`, operations (functions) `makeRat`, `numer`, `denom`, and `showRat`, and the constraints on these types and functions
"Rational", whose secret is how to implement the rational number arithmetic and whose interface consists of operations (functions) `negRat`, `addRat`, `subRat`, `mulRat`, `divRat`, and `eqRat`, the other module's interface, and the constraints on these types and functions
We developed two distinct Haskell modules, `RationalCore` and `RationalDeferGCD`, to implement the "RationalRep" information-hiding module.
We developed one distinct Haskell module, `Rational`, to implement the "Rational" information-hiding module. This module can be paired (i.e., by changing the `import` statement) with either of the two variants of the "RationalRep" module. (Source file `Rational1.hs` imports module `RationalCore`; source file `Rational2.hs` imports `RationalDeferGCD`.)
Unfortunately, Haskell 2010 has a relatively weak module system that does not support multiple implementations as well as we might like. There is no way to declare that multiple Haskell modules have the same interface other than copying the common code into each module and documenting the interface carefully. We must also have multiple versions of `Rational` that differ only in which other module is imported.
Together the Glasgow Haskell Compiler (GHC) release 8.2 (July 2017) and the Cabal-Install package manager release 2.0 (August 2017) support a new extension, the Backpack mixin package system. This new system remedies the above shortcoming. In this new approach, we would define the abstract module "RationalRep" as a signature file and require that `RationalCore` and `RationalDeferGCD` conform to it.
Further discussion of this new module system is beyond the scope of this chapter.
Chapter 12 discusses testing of the Rational modules designed in this chapter. The test scripts are:
Module `RationalRep`
`TestRatRepCore.hs` for `RationalCore`
`TestRatRepDefer.hs` for `RationalDeferGCD`
Module `Rational`
`TestRational1.hs` for `Rational` using `RationalCore`
`TestRational2.hs` for `Rational` using `RationalDeferGCD`
As we see in the rational arithmetic example, a module that provides a data abstraction must ensure that the objects it creates and manipulates maintain their integrity, that is, always have a valid structure and state.
The `RationalCore` rational number representation satisfies the Rational Representation Property.
The `RationalDeferGCD` rational number representation satisfies the Deferred Representation Property.
These properties are invariants for those modules. An invariant for the data abstraction can help us design and implement such objects.
Invariant: A logical assertion that must always be true for every "object" created by the public constructors and manipulated only by the public operations of the data abstraction.
Often, we separate an invariant into two parts.
Interface invariant: An invariant stated in terms of the public features and abstract properties of the "object".
Implementation (representation) invariant: A detailed invariant giving the required relationships among the internal features of the implementation of an "object".
An interface invariant is a key aspect of the abstract interface of a module. It is useful to the users of the module, as well as to the developers.
In the Rational Arithmetic example, the interface invariant for the "RationalRep" abstract module is the following.
For any valid Haskell rational number `r`, all the following hold:
`r` $\in$ `Rat`
`denom r > 0`
if `numer r == 0`, then `denom r == 1`
`numer r` and `denom r` are relatively prime
the (mathematical) rational number value is $\frac{\texttt{numer r}}{\texttt{denom r}}$
We note that the precondition for `makeRat x y` is defined above without any dependence upon the concrete representation.
We can restate the postcondition for `makeRat x y = r` generically to require both of the following to hold:
`r` satisfies the RationalRep Interface Invariant
rational number `r`'s value is $\frac{x}{y}$
The preconditions of both `numer r` and `denom r` are that their argument `r` satisfies the RationalRep Interface Invariant.
The postcondition of `numer r = x'` is that the rational number value $\frac{x'}{\texttt{denom r}}$ is equal to the rational number value of `r`.
Similarly, the postcondition of `denom r = y'` is that the rational number value $\frac{\texttt{numer r}}{y'}$ is equal to the rational number value of `r`.
An implementation invariant guides the developers in the design and implementation of the internal details of a module. It relates the internal details to the interface invariant.
RationalCore
We can state an implementation invariant for the `RationalCore` module.
For any valid Haskell rational number `r`, all the following hold:
`r == (x,y)` for some `(x,y)` $\in$ `Rat`
`y > 0`
if `x == 0`, then `y == 1`
`x` and `y` are relatively prime
rational number value is $\frac{x}{y}$
The implementation invariant implies the interface invariant given the definitions of data type `Rat` and selector functions `numer` and `denom`. Constructor function `makeRat` does the work to establish the invariant initially.
RationalDeferGCD
We can state an implementation invariant for the `RationalDeferGCD` module.
For any valid Haskell rational number `r`, all the following hold:
`r == (x,y)` for some `(x,y)` $\in$ `Rat`
`y /= 0`
if `x == 0`, then `y == 1`
rational number value is $\frac{x}{y}$
The implementation invariant implies the interface invariant given the definitions of `Rat` and of the selector functions `numer` and `denom`. Constructor function `makeRat` is simple, but the selector functions `numer` and `denom` do quite a bit of work to establish the interface invariant.
The `Rational` abstract module extends the `RationalRep` abstract module with new functionality.
It imports the public interface of the `RationalRep` abstract module and exports those features in its own public interface. Thus it must maintain the interface invariant for the `RationalRep` module it uses.
It does not add any new data types or constructor (or destructor) functions. So it does not need any new invariant components for new data abstractions.
It adds one unary and four binary arithmetic functions that take rational numbers and return a rational number. It does so by using the data abstraction provided by the `RationalRep` module. These must preserve the `RationalRep` interface invariant.
It adds an equality comparison function that takes two rational numbers and returns a `Bool`.
The previous chapter examined procedural abstraction and stepwise refinement for development of a square root package.
This chapter examined data abstraction for development of a rational number arithmetic package. The chapters explored concepts and methods for modular design and programming using Haskell, including preconditions, postconditions, and invariants.
The next chapter examines the substitution model for evaluation of Haskell programs and explores efficiency and termination in the context of that model.
A later chapter examines how to test the modules developed in this example.
For each of the following exercises, develop and test a Haskell function or set of functions.
Develop a Haskell module (or modules) for line segments on the two-dimensional coordinate plane using the rectangular coordinate system.
We can represent a line segment with two points: the starting point and the ending point. Develop the following Haskell functions:
constructor `newSeg` that takes two points and returns a new line segment
selectors `startPt` and `endPt` that each take a segment and return its starting and ending points, respectively
We normally represent the plane with a rectangular coordinate system. That is, we use two axes, an `x` axis and a `y` axis, intersecting at a right angle. We call the intersection point the origin and label it with 0 on both axes. We normally draw the `x` axis horizontally and label it with increasing numbers to the right and decreasing numbers to the left. We also draw the `y` axis vertically with increasing numbers upward and decreasing numbers downward. Any point in the plane is uniquely identified by its `x`-coordinate and `y`-coordinate.
Define a data representation for points in the rectangular coordinate system and develop the following Haskell functions:
constructor `newPtFromRect` that takes the `x` and `y` coordinates of a point and returns a new point
selectors `getx` and `gety` that each take a point and return the `x` and `y` coordinates, respectively
display function `showPt` that takes a point and returns an appropriate `String` representation for the point
Now, using the various constructors and selectors, also develop the Haskell functions for line segments:
`midPt` that takes a line segment and returns the point at the middle of the segment
display function `showSeg` that takes a line segment and returns an appropriate `String` representation
Note that `newSeg`, `startPt`, `endPt`, `midPt`, and `showSeg` can be implemented independently from how the points are represented.
Develop a Haskell module (or modules) for line segments that represents points using the polar coordinate system instead of the rectangular coordinate system used in the previous exercise.
A polar coordinate system represents a point in the plane by its radial coordinate `r` (i.e., the distance from the pole) and its angular coordinate `t` (i.e., the angle from the polar axis in the reference direction). We sometimes call `r` the magnitude and `t` the angle.
By convention, we align the rectangular and polar coordinate systems by making the origin the pole, the positive portion of the `x` axis the polar axis, and let the first quadrant (where both `x` and `y` are positive) be the smallest positive angles in the reference direction. That is, with a traditional drawing of the coordinate systems, we measure the radial coordinate `r` as the distance from the origin and measure the angular coordinate `t` counterclockwise from the positive `x` axis.
Using knowledge of trigonometry, we can convert among rectangular coordinates `(x,y)` and polar coordinates `(r,t)` using the equations:
$x = r \cos t$
$y = r \sin t$
$r = \sqrt{x^2 + y^2}$
$t = \operatorname{arctan2}(y, x)$
Define a data representation for points in the polar coordinate system and develop the following Haskell functions:
constructor `newPtFromPolar` that takes the magnitude `r` and angle `t` as the polar coordinates of a point and returns a new point
selectors `getMag` and `getAng` that each take a point and return the magnitude `r` and angle `t` coordinates, respectively
selectors `getx` and `gety` that return the `x` and `y` components of the points (represented here in polar coordinates)
display functions `showPtAsRect` and `showPtAsPolar` to convert the points to strings using rectangular and polar coordinates, respectively
Functions `newSeg`, `startPt`, `endPt`, `midPt`, and `showSeg` should work as in the previous exercise.
Modify the solutions to the previous two line-segment module exercises to enable the line segment functions to be in one module that works properly if composed with either of the two data representation modules. (The solutions may have already done this.)
Modify the solution to the previous line-segment exercise to use the Backpack module system.
Modify the modules in the previous exercise to enable the line segment module to work with both data representations in the same program.
Modify the solution to the Rational Arithmetic example to use the Backpack module system.
State preconditions and postconditions for the functions in abstract module `Rational`.
In Summer and Fall 2016, I adapted and revised much of this work from my previous materials:
Discussion of the Rational Arithmetic modules mostly from chapter 5 of my Notes on Functional Programming with Haskell [Cunningham 2014], from my Lua-based implementations, and from section 2.1 of Abelson and Sussman's Structure and Interpretation of Computer Programs [Abelson 1996]
Discussion of modular design and programming issues from my Data Abstraction [Cunningham 2018a] and Modular Design [Cunningham 2018b] notes, which draw from the ideas of several of the references listed below
In 2017, I continued to develop this work as Sections 2.6-2.7 in Chapter 2, Basic Haskell Functional Programming, of my 2017 Haskell-based programming languages textbook.
In Spring and Summer 2018, I divided the previous Basic Haskell Functional Programming chapter into four chapters in the 2018 version of the textbook, now titled Exploring Languages with Interpreters and Functional Programming. Previous sections 2.1-2.3 became the basis for new Chapter 4, First Haskell Programs; previous Section 2.4 became Section 5.3 in the new Chapter 5, Types; and previous sections 2.5-2.7 were reorganized into new Chapter 6, Procedural Abstraction, and Chapter 7, Data Abstraction (this chapter).
I maintain this chapter as text in Pandoc's dialect of Markdown using embedded LaTeX markup for the mathematical formulas and then translate the document to HTML, PDF, and other forms as needed.
Haskell `module`, module exports and imports, module dependencies, rational number arithmetic, data abstraction, properties of data, data representation, precondition, postcondition, invariant, interface invariant, implementation or representation invariant, canonical or normal forms, relatively prime, information hiding, module secret, encapsulation, interface, abstract interface, type inference.