Two main approaches to functional arrays may be discerned: incremental and monolithic definition. In the incremental case, we have a function that produces an empty array of a given size and another that takes an array, an index, and a value, producing a new array that differs from the old one only at the given index. Obviously, a naive implementation of such an array semantics would be intolerably inefficient, either requiring a new copy of an array for each incremental redefinition, or taking linear time for array lookup; thus, serious attempts at using this approach employ sophisticated static analysis and clever run-time devices to avoid excessive copying. The monolithic approach, on the other hand, constructs an array all at once, without reference to intermediate array values. Although Haskell has an incremental array update operator, the main thrust of the array facility is monolithic.
Arrays are not part of the Standard Prelude---the standard library contains the array operators. Any module using arrays must import the Array module.
The range operation takes a bounds pair and produces the list of indices lying between those bounds, in index order. For example,
range (0,4) => [0,1,2,3,4]
range ((0,0),(1,2)) => [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2)]
The inRange predicate determines whether an index lies between a given pair of bounds. (For a tuple type, this test is performed component-wise.) Finally, the index operation allows a particular element of an array to be addressed: given a bounds pair and an in-range index, the operation yields the zero-origin ordinal of the index within the range; for example:
index (1,9) 2 => 1
index ((0,0),(1,2)) (1,1) => 4
Array subscripting is performed with the infix operator !, and the bounds of an array can be extracted with the function bounds:
squares!7 => 49
bounds squares => (1,100)
We might generalize this example by parameterizing the bounds and the
function to be applied to each index:
mkArray :: (Ix a) => (a -> b) -> (a,a) -> Array a b
mkArray f bnds = array bnds [(i, f i) | i <- range bnds]
Thus, we could define squares as mkArray (\i -> i * i) (1,100).
Many arrays are defined recursively; that is, with the values of some
elements depending on the values of others. Here, for example, we
have a function returning an array of Fibonacci numbers:
fibs :: Int -> Array Int Int
fibs n = a where a = array (0,n) ([(0, 1), (1, 1)] ++
[(i, a!(i-2) + a!(i-1)) | i <- [2..n]])
Another example of such a recurrence is the n by n wavefront
matrix, in which elements of the first row and first column all have
the value 1 and other elements are sums of their neighbors to the
west, northwest, and north:
wavefront :: Int -> Array (Int,Int) Int
wavefront n = a where
a = array ((1,1),(n,n))
([((1,j), 1) | j <- [1..n]] ++
[((i,1), 1) | i <- [2..n]] ++
[((i,j), a!(i,j-1) + a!(i-1,j-1) + a!(i-1,j))
| i <- [2..n], j <- [2..n]])
The wavefront matrix is so called because in a parallel
implementation, the recurrence dictates that the computation can begin
with the first row and column in parallel and proceed as a
wedge-shaped wave, traveling from northwest to southeast. It is
important to note, however, that no order of computation is specified
by the association list.
In each of our examples so far, we have given a unique association for each index of the array and only for the indices within the bounds of the array, and indeed, we must do this in general for an array be fully defined. An association with an out-of-bounds index results in an error; if an index is missing or appears more than once, however, there is no immediate error, but the value of the array at that index is then undefined, so that subscripting the array with such an index yields an error.
We complete our introduction to Haskell arrays with the familiar
example of matrix multiplication, taking advantage of overloading
to define a fairly general function. Since only multiplication and
addition on the element type of the matrices is involved, we get
a function that multiplies matrices of any numeric type unless we
try hard not to. Additionally, if we are careful to apply only
(!) and the operations of Ix to indices, we get genericity over
index types, and in fact, the four row and column index types need
not all be the same. For simplicity, however, we require that
the left column indices and right row indices be of the same type, and
moreover, that the bounds be equal:
matMult :: (Ix a, Ix b, Ix c, Num d) =>
Array (a,b) d -> Array (b,c) d -> Array (a,c) d
matMult x y = array resultBounds
[((i,j), sum [x!(i,k) * y!(k,j) | k <- range (lj,uj)])
| i <- range (li,ui),
j <- range (lj',uj') ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
As an aside, we can also define matMult using accumArray,
resulting in a presentation that more closely resembles the
usual formulation in an imperative language:
matMult x y = accumArray (+) 0 resultBounds
[((i,j), x!(i,k) * y!(k,j))
| i <- range (li,ui),
j <- range (lj',uj')
k <- range (lj,uj) ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
We can generalize further by making the function higher-order,
simply replacing sum and (*) by functional parameters:
genMatMult :: (Ix a, Ix b, Ix c) =>
([f] -> g) -> (d -> e -> f) ->
Array (a,b) d -> Array (b,c) e -> Array (a,c) g
genMatMult sum' star x y =
array resultBounds
[((i,j), sum' [x!(i,k) `star` y!(k,j) | k <- range (lj,uj)])
| i <- range (li,ui),
j <- range (lj',uj') ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
APL fans will recognize the usefulness of functions like the following:
genMatMult maximum (-)
genMatMult and (==)
With the first of these, the arguments are numeric matrices, and the
(i,j)-th element of the result is the maximum difference between
corresponding elements of the i-th row and j-th column of the
inputs. In the second case, the arguments are matrices of any equality
type, and the result is a Boolean matrix in which element (i,j)
is True if and only if the i-th row of the first argument and
j-th column of the second are equal as vectors.
Notice that the element types of genMatMult need not be the same, but merely appropriate for the function parameter star. We could generalize still further by dropping the requirement that the first column index and second row index types be the same; clearly, two matrices could be considered conformable as long as the lengths of the columns of the first and the rows of the second are equal. The reader may wish to derive this still more general version. (Hint: Use the index operation to determine the lengths.)