Exploring Languages

with Interpreters

and Functional Programming

Chapter 22

**H. Conrad Cunningham**

**21 April 2022**

Copyright (C) 2017, 2018, 2022, H. Conrad Cunningham

Professor of Computer and Information Science

University of Mississippi

214 Weir Hall

P.O. Box 1848

University, MS 38677

(662) 915-7396 (dept. office)

Professor of Computer and Information Science

University of Mississippi

214 Weir Hall

P.O. Box 1848

University, MS 38677

(662) 915-7396 (dept. office)

**Browser Advisory:** The HTML version of this textbook
requires a browser that supports the display of MathML. A good choice as
of April 2022 is a recent version of Firefox from Mozilla.

This chapter (22) revisits the specification, design, and implementation of data abstraction modules in Haskell. It follows the general approach introduced in Chapter 7 but uses algebraic data types introduced in Chapter 21 to represent the data. An algebraic data enables the Haskell module implementing the abstraction to encapsulate the details of the data structure.

The goals of this chapter are to:

reinforce the methods for specification and design of data abstractions

illustrate how to use Haskell modules and algebraic data types to enforce the encapsulation of a module’s implementation secrets

introduce additional concepts and terminology for data abstractions

The concepts and terminology in this chapter are mostly general. They are applicable to most any language. Here we look specifically at Haskell and focus on the details of one application. (I have implemented basically the same data abstraction module in Scala and Elixir.)

Chapter 7 used the term *data
abstraction*.

This chapter uses the related term *abstract data type* [4] to refer to a data
abstraction encapsulated in an information-hiding module. The data
abstraction module defines and exports a user-defined type (i.e., an
algebraic data type) and a set of operations (i.e., functions) on that
type. The type is abstract in the sense that its concrete representation
is hidden; only the module’s operations may manipulate the
representation directly.

For convenience, this chapter sometimes uses acronym *ADT* to
refer to an abstract data type.

In Chapters 6 and
7, we explored the concepts of
contracts, which include preconditions and postconditions for the
functions in the module and interface and implementation) invariants for
the data created and manipulated by the module. For convenience, this
chapter refers to these as the *abstract model* for the ADT.

In this chapter, we develop a family of *doubly labelled
digraph* data structures.

As a *graph*, the data structure consists of a finite set of
*vertices (nodes)* and a set of *edges*. Each edge
connects two vertices. (Some writers require that the set of vertices be
nonempty, but here we prefer to allow an empty graph to have no
vertices. But the question remains whether such a graph with no vertices
is pointless concept.)

As a *directed graph* (or *digraph*), each pair of
vertices has at most one edge connecting them; the edge has a direction
from one of the edges to the other.

As a *doubly labelled* graph, each vertex and each edge has
some user-defined data (i.e., labels) attached.

This chapter draws on the discussion of digraphs and their
specification in Chapters 1 and 10 of the Dale and Walker book
*Abstract Data Types* [4].

For what purpose can we use a doubly labelled digraph data structure?

One concrete use case is to represent the game world in an implementation of an adventure game.

For example, in the Wizard’s Adventure Game from Chapter 5 of
*Land of Lisp: Learn to Program in Lisp, One Game at a Time*
[1], the game’s
rooms become vertices, passages between rooms become edges, and
descriptions associated with rooms or passages become labels on the
associated vertex or edge (as shown in Figure 22.1).

Aside: By using a digraph to model the game world, we disallow
multiple passages directly from one room to another. By changing the
graph to a *multigraph*, we can allow multiple directed edges
from one vertex to another.

The Adventure game must create and populate the game world initially, but it does not typically modify the game world during play. It maintains the game state (e.g., player location) separately from the game world. A player moves from room to room during play; the labelled digraph provides the static structure and descriptions of the game world.

How can we define an abstract data type?

The behavior of an ADT is defined by a set of operations that can be
applied to an *instance* of the ADT.

Each operation of an ADT can have inputs (i.e., parameters) and
outputs (i.e., results). The collection of information about the names
of the operations and their inputs and outputs is the *interface*
of the ADT.

To specify an ADT, we need to give:

the

*name*of the ADTthe

*sets*(or domains) upon which the ADT is builtThese include the type being defined and the auxiliary types (e.g., primitive data types and other ADTs) used as parameters or return values of the operations.

the

*signatures*(syntax or structure) of the operations- name
- input sets (i.e., the types, number, and order of the parameters)
- output set (i.e., the type of the return value)

the

*semantics*(or meaning) of the operations

Note: In this chapter, we more state the specification of the data abstraction more systematically than in Chapter 7. But we are doing essentially the same things we did for the Rational Arithmetic modules in Chapter 7.

We categorize an ADT’s operations into four groups depending upon their functionality:

A

*constructor*(sometimes called a creator, factory, or producer function) constructs and initializes an instance of the ADT.A

*mutator*(sometimes called a modifier, command, or setter function) returns the instance with its state changed.An

*accessor*(sometimes called an observer, query, or getter function) returns information from the state of an instance without changing the state.A

*destructor*destroys an instance of the ADT.

We normally list the operations in that order.

For a language with immutable data structures like Haskell, a mutator returns a distinct new instance of the ADT with a state that is a modified version of the original instance’s state. That is, we are taking an applicative (or functional or referentially transparent) approach to ADT specifications.

Note: Of course, in an imperative language, a mutator can change the state of an instance in place. That may be more efficient, but it tends to be less safe. It also tends to make concurrent use of an abstract data type more problematic.

Technically speaking, a destructor is not an operation of the ADT. We can represent the other types of operations as functions on the sets in the specification. However, we cannot define a destructor in that way. But destructors are of pragmatic importance in the implementation of ADTs, particularly in languages that do not have automatic storage reclamation (i.e., garbage collection).

There are two primary approaches for specifying the semantics of the operations:

The

*axiomatic*(or*algebraic*) approach gives a set of logical rules (properties or axioms) that relate the operations to one another. The meanings of the operations are defined implicitly in terms of each other.The

*constructive*(or*abstract model*) approach describes the meaning of the operations explicitly in terms of operations on other abstract data types. The underlying*model*may be any well-defined mathematical model or a previously defined ADT.

In some ways, the axiomatic approach is the more elegant of the two approaches. It is based in the well-established mathematical fields of abstract algebra and category theory. Furthermore, it defines the new ADT independently of other ADTs. To understand the definition of the new ADT it is only necessary to understand its axioms, not the semantics of a model.

However, in practice, the axiomatic approach to specification becomes very difficult to apply in complex situations. The constructive approach, which builds a new ADT from existing ADTs, is the more useful methodology for most practical software development situations.

In this chapter, we use the constructive approach.

Now let’s look at a constructive specification of the doubly labelled digraph.

First, we specify the ADT as an implementation-independent
abstraction. The *secret* of the ADT module is the data structure
used internally to implement the doubly labelled digraph.

Then, we examine two implementations of the abstraction:

using Haskell lists to represent the vertex and edge sets

using a Haskell

`Map`

to map a vertex to the set of outgoing edges from that vertex

Before we specify the ADT, let’s define the mathematical notation we use. We choose notation that can readily be used in comments in program.

We use the the plain-text specification notation to describe the abstract data type’s model and its semantics. The following document summarizes this notation:

TODO:

- Make sure the specification in this chapter and source code use the PTSN.
- For the full ELIFP book, it might be better to make the PTSN document an appendix chapter (81?).
- Perhaps use more traditional logic and math notation [2] in the ELIFP book and PTSN in the code.

We *name* the abstract data type being defined to be `Digraph`

.

We specify that this abstract data type be represented by a Haskell
algebraic data type `Digraph a b c`

,
which has three *type* parameters (i.e., sets):

`VertexType`

, the set of possible vertices (i.e., vertex identifiers) in the`Digraph`

`VertexLabelType`

, the set of possible labels on vertices in the`Digraph`

`EdgeLabelType`

, the set of possible labels on edges in the`Digraph`

Given this ADT defines a digraph, edges can be identified by ordered pairs (tuples) of vertices. Values from the above types, in particular the labels, may have several components.

We define the following operations on the Labelled Digraph ADT (shown below as Haskell function signatures).

Given the primary use case described above, we specify a constructor
to create an empty graph (`new_graph`

), a mutator to add a
new vertex (`add_vertex`

), and mutator to add a new edge
between existing vertices (`add_edge`

).

We also specify mutators to remove vertices
(`remove_vertex`

) and edges (`remove_edge`

) and to
update the labels on vertices (`update_vertex`

) and edges
(`update_edge`

). (Note: In the identified use case, these are
likely used less often than the mutators that add new vertices and
edges.)

We specify a single constructor with the following signature:

` new_graph :: Digraph a b c`

We specify six mutators with the following signatures:

```
add_vertex :: Digraph a b c -> a -> b -> Digraph a b c
remove_vertex :: Digraph a b c -> a -> Digraph a b c
update_vertex :: Digraph a b c -> a -> b -> Digraph a b c
add_edge :: Digraph a b c -> a -> a -> c -> Digraph a b c
remove_edge :: Digraph a b c -> a -> a -> Digraph a b c
update_edge :: Digraph a b c -> a -> a -> c -> Digraph a b c
```

We specify query functions to check whether the labelled digraph is
empty (`is_empty`

), has a given vertex
(`has_vertex`

), and has an edge between two vertices
(`has_edge`

).

We also specify accessors to retrieve the label associated with a
given vertex (`get_vertex`

) and edge
(`get_edge`

).

Given the identified use case, we also specify accessors to return
lists of all vertices in the graph (`all_vertices`

) and of
just their labels (`all_vertices_labels`

) and to return lists
of all outgoing edges from a vertex (`from_edges`

) and of
just their labels (`from_edges_labels`

).

We thus specify nine accessors with the following signatures:

```
is_empty :: Digraph a b c -> Bool
get_vertex :: Digraph a b c -> a -> b
has_vertex :: Digraph a b c -> a -> Bool
get_edge :: Digraph a b c -> a -> a -> c
has_edge :: Digraph a b c -> a -> a -> Bool
all_vertices :: Digraph a b c -> [a]
from_edges :: Digraph a b c -> a -> [a]
all_vertices_labels :: Digraph a b c -> [(a,b)]
from_edges_labels :: Digraph a b c -> a -> [(a,c)]
```

TODO: Consider changing `get_vertex`

and
`get_edge`

to return `Maybe b`

and
`Maybe c`

, respectively.

Given the identified use case and that Haskell uses garbage collection, no destructor seems to be needed in most cases.

We *model* the state of the instance of the Labelled Digraph
ADT with an abstract value `G`

such that
`G = (V,E,VL,EL)`

with `G`

’s components satisfying
the following **Labelled Digraph Properties**.

`V`

is a finite subset of values from the set`VertexType`

.`V`

denotes the vertices (or nodes) of the digraph.Any two elements of

`V`

can be compared for*equality*.`E`

is a binary relation on the set`V`

. A pair`(v1,v2) IN E`

denotes that there is a directed edge from`v1`

to`v2`

in the digraph.Note that this model allows at most one (directed) edge from a vertex

`v1`

to vertex`v2`

. It allows a directed edge from a vertex to itself.Also, because vertices can be compared for equality, any two edges can also be compared for equality.

`VL`

is a total function from set`V`

to the set`VertexLabelType`

.`EL`

is a total function from set`E`

to the set`EdgeLabelType`

.

We define the following interface invariant for the Labelled Digraph ADT:

Any valid labelled digraph instance`G`

, appearing in either the arguments or return value of a public ADT operation, must satisfy the Labelled Digraph Properties.

We specify the various ADT operations below using their type signatures, preconditions, and postconditions. Along with the interface invariant, these comprise the (implementation-independent) specification of the ADT (i.e., its abstract interface).

In these assertions, for a digraph `g`

that satisfies the
invariants, `G(g)`

denotes its abstract
model`(V,E,VL,EL)`

as described above. The value
`Result`

denotes the return value of function.

TODO: Consider in what order these should appear.

Constructor

`new_graph`

creates and returns a new empty instance of the graph ADT.Precondition:

`True`

Postcondition:

`G(Result) == ({},{},{},{})`

Accessor

`is_empty g`

returns`True`

if and only if graph`g`

is empty.Precondition:

`G(g) = (V,E,VL,EL)`

Postcondition:

`Result == (V == {} && E == {})`

Mutator

`add_vertex g nv nl`

inserts vertex`nv`

with label`nl`

into graph`g`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) && nv NOT_IN V`

Postcondition:

`G(Result) == (V UNION {nv}, E, VL UNION {(nv,nl)}, EL)`

Mutator

`remove_vertex g ov`

deletes vertex`ov`

from graph`g`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) && ov IN V`

Postcondition:

`G(Result) == (V', E', VL', EL') where V' = V - {ov} E' = E - {(ov,*),(*,ov)} VL' = VL - {(ov,*)} EL' = EL - {((ov,*),*),((*,ov),*)}`

Mutator

`update_vertex g ov nl`

changes the label on vertex`ov`

in graph`g`

to be`nl`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) && ov IN V`

Postcondition:

`G(Result) == (V - {ov}, E, VL', EL) where VL' = (VL - {(ov,VL(ov))}) UNION {(ov,nl)}`

Accessor

`get_vertex g ov`

returns the label from vertex`ov`

in graph`g`

TODO: If signature changed to return

`Maybe`

, change precondition and postcondition appropriately.Precondition:

`G(g) = (V,E,VL,EL) && ov IN V`

Postcondition:

`Result == VL(ov)`

Accessor

`has_vertex g ov`

returns`True`

if and only if`ov`

is a vertex of graph`g`

.Precondition:

`G(g) = (V,E,VL,EL)`

Postcondition:

`G(Result) == ov IN V`

Mutator

`add_edge g v1 v2 nl`

inserts an edge from vertex`v1`

to vertex`v2`

in graph`g`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) && v1 IN V && v2 IN V && (v1,v2) NOT_IN E`

Postcondition:

`G(Result) == (V, E', VL, EL') where E' = E UNION {(v1,v2)} EL' = EL UNION {((v1,v2),nl)}`

Mutator

`remove_edge g v1 v2`

deletes the edge from vertex`v1`

to vertex`v2`

from graph`g`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) V - {ov} && (v1,v2) IN E`

Postcondition:

`G(Result) == (V, E - {(v1,v2)}, VL, EL - { ((v1,v2),*) }`

Mutator

`update_edge g v1 v2 nl`

changes the label on the edge from vertex`v1`

to vertex`v2`

in graph`g`

to have label`nl`

and returns the resulting graph.Precondition:

`G(g) = (V,E,VL,EL) && (v1,v2) IN E`

Postcondition:

`G(Result) == (V, E, VL, EL') where EL' == (EL - {((v1,v2),*)}) UNION {((v2,v2),nl)`

Accessor

`get_edge g v1 v2`

returns the label on the edge from vertex`v1`

to vertex`v2`

in graph`g`

.TODO: If signature changed to return

`Maybe`

, change precondition and postcondition appropriately.Precondition:

`G(g) = (V,E,VL,EL) && (v1,v2) IN E`

Postcondition:

`Result == EL((v1,v2))`

Accessor

`has_edge g v1 v2`

returns`True`

if and only if there is an edge from a vertex`v1`

to a vertex`v2`

in graph`g`

.Precondition:

`G(g) = (V,E,VL,EL)`

Postcondition:

`Result == (v1,v2) IN E`

Accessor

`all_vertices g`

returns a sequence of all the vertices in graph`g`

. The returned sequence is represented by a builtin Haskell list.Precondition:

`G(g) = (V,E,VL,EL)`

Postcondition:

`(ForAll ov: ov IN Result <=> ov IN V) && length(Result) == size(V)`

Accessor

`from_edges g v1`

returns a sequence of all vertices`v2`

such that there is an edge from vertex`v1`

to vertex`v2`

in graph`g`

. The returned sequence is represented by a builtin Haskell list.Precondition:

`G(g) = (V,E,VL,EL) && v1 IN V`

Postcondition:

`(ForAll v2: v2 IN Result <=> (v1,v2) IN E) && length(Result) == (# v2 :: (v1,v2) IN E)`

TODO: Function

`from_edges g v1`

should return`[]`

when`v1`

does not appear in`g`

, so that it can work well with the Wizard’s Adventure game. We should redefine the precondition and postcondition to specify this behavior.Accessor

`all_vertices_labels g`

returns a sequence of all pairs`(v,l)`

such that`v`

is a vertex and`l`

is it’s label in graph`g`

. The returned sequence is represented by a builtin Haskell list.Precondition:

`G(g) = (V,E,VL,EL)`

Postcondition:

`(ForAll v, l: (v,l) IN Result <=> (v,l) IN VL) && length(Result) == size(VL)`

Accessor

`from_edges_labels g v1`

returns a sequence of all pairs`(v2,l)`

such that there is an edge`(v1,v2)`

labelled with`l`

in graph`g`

.Precondition:

`G(g) = (V,E,VL,EL) && v1 IN V`

Postcondition:

`(ForAll v2, l :: (v2,l) IN Result <=> ((v1,v2),l) IN EL) && length(Result) == (# v2 :: (v1,v2 ) IN E)`

TODO: Function

`from_edges_labels g v1`

should return`[]`

when`v1`

does not appear in`g`

, so that it can work well with the Wizard’s Adventure game. We should redefine the precondition and postcondition to specify this behavior.

Below we state the header for a Haskell module `Digraph_XXX`

that implements the Labelled Digraph ADT. The module name suffix `XXX`

denotes
the particular implementation for a data representation, but the
signatures and semantics of the operations are the same regardless of
representation.

The module *exports* data type `Digraph`

, but
its constructors are not exported. This allows modules that import `Digraph_XXX`

to
use the data type without knowing how the data type is implemented.

If we had `Digraph(..)`

in the export list, then the data type and all its constructors would be
exported.

The intention of this interface is to constrain the type parameters
of `Digraph a b c`

so that:

Type

`a`

(i.e., type`VertexType`

) must be in Haskell class`Eq`

. This is essentially required by the interface invariant (i.e., the Labelled Digraph Properties).Types

`a`

,`b`

, and`c`

(i.e., types`VertexType`

,`VertexLabelType`

, and`EdgeLabelType`

) must be in Haskell class`Show`

. This contraint enables the vertices and labels to be displayed as text.

It may be desirable (or necessary) for an implementation to further
constrain the type parameters. For example, some implementations may
need to constrain `VertexType`

to
be from class `Ord`

(i.e.,
totally ordered). It does not seem to restrict the generality of the ADT
significantly to require that vertices be drawn from a totally ordered
set such as integers, strings, or enumerated types.

```
module DigraphADT_XXX
Digraph --constraints (Eq a, Show a, Show b, Show c)
( --Digraph a b c
, new_graph --Digraph a b c -> Bool
, is_empty --Digraph a b c -> a -> b -> Digraph a b c
, add_vertex --Digraph a b c -> a -> Digraph a b c
, remove_vertex--Digraph a b c -> a -> b -> Digraph a b c
, update_vertex--Digraph a b c -> a -> b
, get_vertex --Digraph a b c -> a -> Bool
, has_vertex --Digraph a b c -> a -> a -> c -> Digraph a b c
, add_edge --Digraph a b c -> a -> a -> Digraph a b c
, remove_edge --Digraph a b c -> a -> a -> c -> Digraph a b c
, update_edge --Digraph a b c -> a -> a -> c
, get_edge --Digraph a b c -> a -> a -> Bool
, has_edge --Digraph a b c -> [a]
, all_vertices --Digraph a b c -> a -> [a]
, from_edges --Digraph a b c -> [(a,b)]
, all_vertices_labels--Digraph a b c -> a -> [(a,c)]
, from_edges_labels
)where -- definitions for the types and functions
```

Note: The Glasgow Haskell Compiler (GHC) release 8.2 (July 2017) and the Cabal-Install package manager release 2.0 (August 2017) support a new mixin package system called Backpack. This extension would enable us to define an abstract module “DigraphADT” as a signature file with the above interface. Other modules can then implement this abstract interface thus giving a more explicit and flexible definition of this abstract data type.

This section gives an implementation of the ADT that uses Haskell lists to represent the vertex and edge sets.

We represent the List implementation of the Labelled Digraph ADT as
an instance of the Haskell algebraic data type `Digraph`

as
shown below. (Remember that type variable `a`

is `VertexType`

,
`b`

is `VertexLabelType`

,
and `c`

is `EdgeLabelType`

.)

`data Digraph a b c = Graph [(a,b)] [(a,a,c)] `

In an instance `(Graph vs es)`

:

`vs`

is a list of tuples`(v,vl)`

where`v`

has`VertexType`

and represents a vertex of the digraph`vl`

has`VertexLabelType`

and is the unique label associated with vertex`v`

- a vertex
`v`

occurs at most once in`vs`

(i.e.,`vs`

encodes a function from vertices to vertex labels)

`es`

is a list of tuples`((v1,v2),el)`

where`v1`

and`v2`

are vertices occurring in`vs`

, representing a directed edge from`v1`

to`v2`

`el`

has`EdgeLabelType`

and is the unique label associated with edge`(v1,v2)`

- an edge
`(v1,v2)`

occurs at most once in`es`

(i.e.,`es`

encodes a function from edges to edge labels)

In terms of the abstract model, `vs`

encodes
`VL`

directly and, because `VL`

is a total
function on `V`

, it encodes `V`

indirectly.
Similarly, `es`

encodes `EL`

directly and
`E`

indirectly.

Of course, there are many other ways to represent the graph as lists. This representation is biased for a context where, once built, the labelled digraph is relatively static and the most frequent operations are the retrieval of labels attached to vertices or edges. That is, it is biased toward the Adventure game use case.

Given that all the type parameters must be of class `Show`

, we also
define `Digraph`

to
also be of class `Show`

as
defined below.

```
instance (Show a, Show b, Show c) =>
Show (Digraph a b c) where
show (Graph vs es) =
"(Digraph " ++ show vs ++ ", " ++ show es ++ ")"
```

Given the above description, we then define the following implementation (representation) invariant for the list-based version of the Labelled Digraph ADT:

Any Haskell`Digraph`

value`(Graph vs es)`

with abstract model`G = (V,E,VL,EL)`

, appearing in either the arguments or return value of an operation, must also satisfy the following:

```
(ForAll v, l :: (v,l) IN vs <=> (v,l) IN VL ) &&
(ForAll v1, v2, m :: (v1,v2,m) IN es <=> ((v1,v2),m) IN EL )
```

The code in this section shows a list-based implementation for several of the operations related to vertices.

The Haskell module for the list representation of the Labelled
Digraph ADT is in source file `DigraphADT_List.hs`

. A simple smoke
test driver module is in source file `DigraphADT_TestList.hs`

.

The implementations of constructor `new_graph`

and accessor
`is_empty`

are straightforward.

```
new_graph :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c
= Graph [] []
new_graph
is_empty :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> Bool
Graph [] _ ) = True
is_empty (= False is_empty _
```

Function `has_vertex`

just needs to
search through the list of vertices to determine whether or not the
vertex occurs. It relies upon `VertexType`

being in class
`Eq`

.

```
has_vertex :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> a -> Bool
Graph vs _) ov =
has_vertex (not (null [ n | (n,_) <- vs, n == ov])
```

Because of lazy evaluation, the list comprehension only needs to evaluate far enough to find the occurrence of the vertex in the list.

To add a new vertex and its label to the graph, `add_vertex`

must return a new graph
with the new vertex-label pair added to the head of the vertex list. To
meet the specification, it must not allow a vertex to be added if the
vertex already occurs in the list.

```
add_vertex :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b -> Digraph a b c
@(Graph vs es) nv nl
add_vertex g| not (has_vertex g nv) = Graph ((nv,nl):vs) es
| otherwise = error has_nv
where has_nv =
"Vertex " ++ show nv ++ " already in digraph"
```

Function `remove_vertex`

is a
bit trickier with this representation. To remove an existing vertex and
its label from the graph, `remove_vertex`

must return a new graph
with that vertex’s tuple removed from the list of vertices and with any
outgoing edges also removed from the list of edges.

```
remove_vertex :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> a -> Digraph a b c
@(Graph vs es) ov
remove_vertex g| has_vertex g ov = Graph ws fs
| otherwise = error no_ov
where ws = [ (w,m) | (w,m) <- vs, w /= ov ]
= [ (v1,v2,m) |
fs <- es, v1 /= ov, v2 /= ov ]
(v1,v2,m) = "Vertex " ++ show ov ++ " not in digraph" no_ov
```

The implementation of `remove_vertex`

filters all occurrences
of the vertex from the list of vertices. Given the implementation
invariant, this is not necessary. However, this potentially adds some
safety to the implementation at the possible expense of execution
time.

For an existing vertex in the list of vertices, function `update_vertex`

replaces the old label
with the new label. Like `remove_vertex`

, it potentially
processes the entire list of vertices and makes the change to all
occurrences, when the implementation invariant would allow it to stop on
the first (and only) occurrence.

```
update_vertex :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b -> Digraph a b c
@(Graph vs es) ov nl
update_vertex g| has_vertex g ov = Graph (map chg vs) es
| otherwise = error no_ov
where chg (w,m) = (if w == ov then (ov,nl) else (w,m))
= "Vertex " ++ show ov ++ " not in digraph" no_ov
```

For an existing vertex, function `get_vertex`

retrieves the label.
Because of lazy evaluation, the search of the list of vertices stops
with the first occurrence.

TODO: Modify appropriately if changed to `Maybe`

return.

```
get_vertex :: (Eq a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b
Graph vs _) ov
get_vertex (| not (null ls) = head ls
| otherwise = error no_ov
where ls = [ l | (w,l) <- vs, w == ov]
= "Vertex " ++ show ov ++ " not in digraph" no_ov
```

TODO: Modify source file appropriately if changed to
`Maybe`

return.

The remainder of the functions are defined in file `DigraphADT_List.hs`

..

We can create an empty labelled digraph `g0`

having `Int`

identifiers for vertices, `Int`

labels for
vertices, and `Int`

labels for
edges as follows:

`= (new_graph :: Digraph Int Int Int) g0 `

Then we can add a new vertex with identifier `1`

and vertex
label `101`

as
follows:

`= add_vertex g0 1 101 g1 `

TODO: Consider whether to make any of the following changes to the specification and implementation above.

Based on the list-based design and implementation above, what improvements should we consider? Here are some possibilities.

As described above, the current list implementations of functions such as

`remove_vertex`

and`update_vertex`

do some unnecessary work with respect to the implementation invariant. This could be eliminated.The data representation (i.e., implementation invariant) could be changed to allow, for example, multiple occurrences of vertices in the vertex list. This would avoid the checks of

`has_vertex`

in`add_vertex`

and`update_vertex`

. Then, as it does above,`remove_vertex`

needs to remove all occurrences of the vertex.Other functions would need to be modified accordingly so that they only access the first occurrence of a vertex (especially the

`all_vertices`

and`all_vertices_labels`

functions).A similar change could be made to the list of edges.

Note: The Labelled Diagraph ADT specification does not specify what the behavior should be when the referenced vertex or edge is not defined. The change suggested in this item gives non-error behavior to those situations. Perhaps a better alternative would be to change the general ADT specification to require specific behaviors in those cases.

Most of the functions throw an

`error`

exception when the vertex they reference does not exist. A better Haskell design would redefine these functions to return a`Maybe`

or`Either`

value. This would eliminate most of the`has_vertex`

checks and make the functions defined on all possible inputs.This would require changes to the overall Labelled Digraph ADT specification and its abstract interface.

New functions could be added to the Labelled Digraph ADT—such as an equality check on graphs, a constructor that creates a copy of an existing graph, or functions to apply various graph algorithms.

Existing functions could be eliminated. For example, if the graph is only constructed and used for retrieval, then the remove and update functions could be eliminated.

This section gives an implementation of the ADT that uses a Haskell
`Map`

to
map a vertex to the set of outgoing edges from that vertex

We represent the Map implementation of the Labelled Digraph ADT as an
instance of the Haskell algebraic data type `Digraph`

as
shown below. (Remember that type variable `a`

is `VertexType`

,
`b`

is `VertexLabelType`

,
and `c`

is `EdgeLabelType`

.)

```
import qualified Data.Map.Strict as M
data Digraph a b c = Graph (M.Map a (b,[(a,c)]))
```

In the data constructor `(Graph m)`

,
`m`

is an instance of `Data.Map.Strict`

.
This collection is set of key-value pairs implemented as a balanced
tree, giving logarithmic access time.

An instance of `(Graph m)`

corresponds to the abstract model as follows:

The keys for the

`Map m`

collection are of`VertexLabelType`

.The interface invariant requires that

`VertexType`

be in class`Eq`

. The implementation based on`Data.Map.Strict`

further constrains vertices to be in subclass`Ord`

because the vertices are the keys of the`Map`

.TODO: Consider restricting the Digraph spec to require

`Ord`

.`Map m`

is defined for all keys`v1`

in vertex set`V`

and undefined for all other keys.For some vertex

`v1`

, the value of`m`

at key`v1`

is a pair`(l,es)`

where`l`

is an element of`VertexLabelType`

and is the unique label associated with`v1`

, that is,`l = VL(v1)`

.`es`

is the list of all tuples`(v2,el)`

such that`(v1,v2) IN E`

,`el IN`

`EdgeLabelType`

, and`el = EL((v1,v2))`

. That is,`(v1,v2)`

is an edge and`el`

is its unique label.

Given that all the type parameters must be of class `Show`

, we also
define `Digraph`

to
also be of class `Show`

as
defined below.

```
instance (Show a, Show b, Show c) => Show (Digraph a b c) where
show (Graph m) = "(Digraph " ++ show (M.toAscList m) ++ ")"
```

Given the above description, we then define the following implementation (representation) invariant for the list-based version of the Labelled Digraph ADT:

Any Haskell`Digraph`

value`(Graph m)`

with abstract model`G = (V,E,VL,EL)`

, appearing in either the arguments or return value of an operation, must also satisfy the following:

```
(ForAll v1, l, es ::
( m(v1) defined && m(v1) == (l,es) ) <=>
( VL(v1) == l &&
(ForAll v2, el :: (v2,el) IN es <=>
EL((v1,v2)) == el) ) )
```

The code in this section shows a map-based implementation for the same operations we examined for the list-based implementation.

The Haskell module for the map representation of the Labelled Digraph
ADT is in source file `DigraphADT_Map.hs`

.. A simple smoke
test driver module is in source file `DigraphADT_TestMap.hs`

..

Constructor `new_graph`

and
accessors `is_empty`

and
`has_vertex`

are just wrappers for functions from `Data.Map.Strict`

.

```
new_graph :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c
= Graph M.empty
new_graph
is_empty :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> Bool
Graph m) = M.null m
is_empty (
has_vertex :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> a -> Bool
Graph m) ov = M.member ov m has_vertex (
```

To add a new vertex and label to the graph, `add_vertex`

must return a graph with
the new key-value pair inserted into the existing graph’s
`Map`

. The value consists of the label paired with a nil list
of adjacent edges. To meet the specification, it must not allow a vertex
to be added if the vertex already occurs in the list.

```
add_vertex :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b -> Digraph a b c
@(Graph m) nv nl
add_vertex g| not (has_vertex g nv) = Graph (M.insert nv (nl,[]) m)
| otherwise = error has_nv
where has_nv =
"Vertex " ++ show nv ++ " already in digraph"
```

Except for making sure the vertex to be deleted is the graph,
function `remove_vertex`

is just a
wrapper for the `Data.Map.Strict.delete`

function.

```
remove_vertex :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> a -> Digraph a b c
@(Graph m) ov
remove_vertex g| has_vertex g ov = Graph (M.delete ov m)
| otherwise = error no_ov
where no_ov = "Vertex " ++ show ov ++ " not in digraph"
```

If the argument vertex is in the graph, then function `update_vertex`

retrieves its old label
and edge list and then reinserts the new label paired with the same edge
list.

```
update_vertex :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b -> Digraph a b c
@(Graph m) ov nl
update_vertex g| has_vertex g ov =
Graph (M.insert ov (upd (M.lookup ov m)) m)
| otherwise = error no_ov
where upd (Just (ol,edges)) = (nl,edges)
= error no_entry
upd _ = "Vertex " ++ show ov ++ " not in digraph"
no_ov =
no_entry "Missing/malformed value for vertex " ++ show ov
```

For an existing vertex, function `get_vertex`

retrieves the associated
value and extracts the label.

```
get_vertex :: (Ord a, Show a, Show b, Show c) =>
Digraph a b c -> a -> b
@(Graph m) ov
get_vertex g| has_vertex g ov = getlabel (M.lookup ov m)
| otherwise = error no_ov
where getlabel (Just (ol,_)) = ol
= "Vertex " ++ show ov ++ " not in digraph" no_ov
```

The remainder of the functions are defined in file `DigraphADT_Map.hs`

.

The Map-based functions can be called in the same manner as the
List-based function, except that the vertices must be in class `Ord`

.

All the improvements suggested for the list-based implementation apply to the map-based implementation except for the first.

For large graphs, the map-based implementation should perform better than the list-based implementation.

For large graphs with many outgoing edges on each vertex, it might be
useful to implement the edge-list itself with a `Map`

.

This chapter (22) revisited the issues of specification, design, and implementation of data abstractions as modules in Haskell. It used a labelled digraph data structure as the example.

Although we may not specify all subsequent Haskell modules as systematically as we did in this chapter, we do use the modular style of programming in the various interpreters developed in Chapter 41 and following.

In the future, we plan to implement a Adventure game on top of the ADT implemented in this chapter.

TODO

TODO: If the `Maybe`

improvement is not done above for
errors, put that here as an exercise.

Restate the preconditions and postconditions for functions

`from_edges`

and`from_edges`

so that they must return empty lists when the argument vertex`v1`

is not in the vertex set. (See the notes on these operations in the semantic specification above.)Develop a comprehensive test script for the Labelled Digraph ADT implementations using blackbox, module-level, functional testing as described in Chapters 11 and 12.

Adapt the Haskell Labelled Digraph ADT interface and it two implementations to use GHC’s Backpack module system.

Specify a similar Labelled Digraph ADT as a Java interface.

Give two different implementations of the Labelled Digraph ADT in Java using the specification from the previous exercise.

Specify a similar Labelled Digraph ADT as a Python 3 module.

Give two different implementations of the Labelled Digraph ADT in Python using the specification from the previous exercise.

Choose one of the improvements described in the “Improvements in the list implementation” subsection and change the specification and list implementation as needed for the improvement.

Choose one of the improvements and change the specification and map implementation as needed for the improvement.

Give a full specification (similar to the one for the Labeled Digraph ADT) in this chapter) for the Carrie’s Candy Bowl project in Chapter 21<!-22—>. That is, give the name, set, signatures, and constructive semantics. If helpful, you may use the mathematical concept of bag.

Specify a doubly labelled directed multigraph data structure to replace the doubled labelled digraph. (That is, allow multiple directed edges from one vertex to another.)

Give an implementation of the doubly labelled directed multigraph specified in the previous exercise.

In this project, you are asked to design and implement Haskell modules to represent Mealy Machines and to simulate their execution.

This kind of machine is a useful abstraction for simple controllers that listen for input events and respond by generating output events. For example in an automobile application, the input might be an event such as “fuel level low” and the output might be command to “display low-fuel warning message”.

In the theory of computation, a *Mealy Machine* is a
*finite-state automaton* whose output values are determined both
by its current state and the current input. It is a *deterministic
finite state transducer* such that, for each state and input, at
most one transition is possible.

Appendix A of the Linz textbook [5] defines a Mealy Machine mathematically by a tuple

$M = (Q,\Sigma,\Gamma,\delta,\theta,q_{0})$

where

$Q$ is a finite set of internal states

$\Sigma$ is the input alphabet (a finite set of values)

$\Gamma$ is the output alphabet (a finite set of values)

$\delta: Q \times \Sigma \longrightarrow Q$ is the transition function

$\theta: Q \times \Sigma \longrightarrow \Gamma$ is the output function

$q_{0}$ is the initial state of $M$ (an element of $Q$)

In an alternative formulation, the transition and output functions can be combined into a single function:

$\delta: Q \times \Sigma \longrightarrow Q \times \Gamma$

We often find it useful to picture a finite state machine as a
*transition graph* where the states are mapped to vertices and
the transition function represented by directed edges between vertices
labelled with the input and output symbols.

Specify, design, and implement a general representation for a Mealy Machine as a Haskell module implementing an abstract data type. It should hide the representation of the machine and should have, at least, the following public operations.

`newMachine s`

creates a new machine with initial (and current) state`s`

and no transitions.Note: This assumes that the state, input, and output sets are exactly those added with the mutator operations below. An alternative would be to change this function to take the allowed state, input, and output sets.

`addState m s`

adds a new state`s`

to machine`m`

and returns an`Either`

wrapping the modified machine or an error message.`addTransition m s1 in out s2`

adds a new transition to machine`m`

and returns an`Either`

wrapping the modified machine or an error message. From state`s1`

with input`in`

the modified machine outputs`out`

and transitions to state`s2`

.`addResets m`

adds all reset transitions to machine`m`

and returns the modified machine. From state`s1`

on input`in`

the modified machine outputs`out`

and transitions to state`s2`

. This operation makes the transition function a total function by adding any missing transitions from a state back to the initial state.`setCurrent m s`

sets the current state of machine`m`

to`s`

and returns an`Either`

wrapping the modified machine or an error message.`getCurrent m`

returns the current state of machine`m`

.`getStates m`

returns a list of the elements of the state set of machine`m`

.`getInputs m`

returns a list of the input set of machine`m`

.`getOutputs m`

returns a list of the output set of machine`m`

.`getTransitions m`

returns a list of the transition set of machine`m`

. Tuple`(s1,in,out,s2)`

occurs in the returned list if and only if, from state`s1`

with input`in`

, the machine outputs`out`

and moves to state`s2`

.`getTransitionsFrom m s`

returns an`Either`

wrapping a list of the set of transitions enabled from state`s`

of machine`m`

or an error message.

Given the above implementation for a Mealy Machine, design and implement a separate Haskell module that simulates the execution of a Mealy Machine. It should have, at least, the following new public operations.

`move m in`

moves machine`m`

from the current state given input`in`

and returns an`Either`

wrapping a tuple`(m',out)`

or an error message. The tuple gives the modified machine`m'`

and the output`out`

.`simulate m ins`

simulates execution of machine`m`

from its current state through a sequence of moves for the inputs in list`ins`

and returns an`Either`

wrapping a tuple`(m',outs)`

or an error message. The tuple gives the modified machine`m'`

after the sequence of moves and the output list`outs`

.

Note: It is possible to use a Labelled Digraph ADT module in the implementation of the Mealy Machine.

Implement a Haskell module that uses a different representation for the Mealy Machine. Make sure the simulator module still works correctly.

In Spring 2017, I created a Labelled Digraph ADT document by adapting and revising comments from the Haskell implementations of the Labelled Digraph abstract data type. I had specified the ADT and developed the implementations as my solution for Assignment #1 in CSci 556 (Multiparadigm Programming) in Spring 2015. I also included some content from my notes on Data Abstraction [3].

(In addition to the list- and map-based Haskell implementations of the Labelled Digraph ADT, I developed a list-based implementation in Elixir in Spring 2015 and two Scala-based implementations in Spring 2016.)

In Spring 2017, I also created a Mealy Machine Simulator Exercise document by adapting and revising a project I had assigned in the Scala-based offering of CSci 555 (Functional Programming) in Spring 2016.

In 2018, I merged and revised these documents to become new Chapter
22, Data Abstraction Revisited, in the textbook *Exploring Languages
with Interpreters and Functional Programming*.

I retired from the full-time faculty in May 2019. As one of my post-retirement projects, I am continuing work on this textbook. In January 2022, I began refining the existing content, integrating additional separately developed materials, reformatting the document (e.g., using CSS), constructing a bibliography (e.g., using citeproc), and improving the build workflow and use of Pandoc.

I maintain this chapter as text in Pandoc’s dialect of Markdown using embedded LaTeX markup for the mathematical formulas and then translate the document to HTML, PDF, and other forms as needed.

Data abstraction; abstract data type (ADT), instance; specification of ADTs using name, sets, signatures, and semantics; constructor, accessor, mutator, and destructor operations; axiomatic and constructive semantics; abstract model (contract, precondition, postcondition, interface and implementation invariant, abstract interface); use of Haskell module hiding features to implement the abstract data type’s interface; using mathematical concepts to model the data abstraction (graph, digraph, labelled graph, multigraph, set, sequence, bag, total and partial functions, relation); graph data structure; adventure game.

Mealy Machine, simulator, finite-state automaton (machine), deterministic finite state transducer, state, transition, transition graph.

[1]

Conrad Barski. 2011. *Land of
Lisp: Learn to program in Lisp, one game at a
time*. No Starch Press, San Francisco, California, USA.

[2]

H.
Conrad Cunningham. 2006. *A programmer’s introduction to predicate
logic*. University of Mississippi, Department of Computer
and Information Science, University, Mississippi, USA.
Retrieved from https://john.cs.olemiss.edu/~hcc/docs/PredicateLogicNotes/Programmers_Introduction_to_Predicate-Logic.pdf

[3]

H.
Conrad Cunningham. 2019. *Notes on data abstraction*. University
of Mississippi, Department of Computer and Information
Science, University, Mississippi, USA. Retrieved from https://john.cs.olemiss.edu/~hcc/docs/DataAbstraction/DataAbstraction.html

[4]

Nell Dale and Henry M. Walker. 1996.
*Abstract data types: Specifications, implementations, and
applications*. D. C. Heath, Lexington, Massachusetts, USA.

[5]

Peter Linz. 2017. *Formal languages and
automata* (Sixth ed.). Jones & Bartlett, Burlington,
Massachusetts, USA.