The previous chapter illustrated how to state and prove Haskell “laws” about already defined functions.
This chapter illustrates how to use similar reasoning methods to synthesize (i.e. derive or calculate) function definitions from their specifications.
The next chapter applies these program synthesis techniques to a larger set of examples on text processing.
This chapter deals with program synthesis.
In the proof of a property, we take an existing program and then demonstrate that it satisfies some property.
In the synthesis of a program, we take a property called a specification and then synthesize a program that satisfies it [Bird 1988]. (Program synthesis is called program derivation in other contexts, such as in the Gries textbook [Gries 1981].)
Both proof and synthesis require essentially the same reasoning. Often a proof can be turned into a synthesis by simply reversing a few of the steps, and vice versa.
This section is based on Sections 5.4.5 and 5.5 of the Bird and Wadler textbook [Bird 1988] and Section 4.5 of [Hoogerwoord 1989].
A (second-order) Fibonacci sequence is the sequence in which the first two elements are 0 and 1 and each successive element is the sum of the two immediately preceding elements:
0, 1, 1, 2, 3, 5, 8, 13, ...
As we have seen in Chapter 9, we can take the above informal description and define a function to compute the nth element of the Fibonacci sequence. The definition is straightforward. Unfortunately, this algorithm is quite inefficient, requiring O(fib n) steps.
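For reference, a definition consistent with the equations fib.1, fib.2, and fib.3 cited in the calculations below is:

fib :: Int -> Int
fib 0 = 0                               -- fib.1
fib 1 = 1                               -- fib.2
fib n | n >= 2 = fib (n-1) + fib (n-2)  -- fib.3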
In Chapter 9 we also developed a more efficient, but less straightforward, version by using two accumulating parameters. This definition seemed to be “pulled out of thin air”. Can we synthesize a definition that uses the more efficient algorithm from the simpler definition above?
Yes, but we use a slightly different approach than we did before. We can improve the performance of the Fibonacci computation by using a technique called tupling [Hoogerwoord 1989] as we saw in Chapter 20.
The tupling technique can be applied to a set of functions with the same domain and the same recursive pattern. First, we define a new function whose value is a tuple, the components of which are the values of the original functions. Then, we proceed to calculate a recursive definition for the new function.
This technique is similar to the technique of adding accumulating parameters to define a new function.
Given the definition of fib above, we begin with the specification [Bird 1988]:

twofib n = (fib n, fib (n+1))

and synthesize a recursive definition by using induction on the natural number n.
Base case n = 0:

twofib 0
=   { specification }
(fib 0, fib 1)
=   { arithmetic, fib.1, fib.2 }
(0,1)

This gives us a definition for the base case: twofib 0 = (0,1).
Inductive case n = m+1:

Given that there is a definition for twofib m that satisfies the specification

twofib m = (fib m, fib (m+1))

calculate a definition for twofib (m+1) that satisfies the specification.

twofib (m+1)
=   { specification }
(fib (m+1), fib (m+2))
=   { arithmetic, fib.3 }
(fib (m+1), fib m + fib (m+1))
=   { abstraction }
(b, a+b) where (a,b) = (fib m, fib (m+1))
=   { induction hypothesis }
(b, a+b) where (a,b) = twofib m
This gives us a definition for the inductive case.
Bringing the cases together and rewriting twofib (m+1) to get a valid pattern, we synthesize the following definition:
twofib :: Int -> (Int,Int)
twofib 0 = (0,1)
twofib n | n > 0 = (b,a+b)
  where (a,b) = twofib (n-1)

fastfib :: Int -> Int
fastfib n = fst (twofib n)
Above, fst is the standard Prelude function that extracts the first component of a pair (i.e. a 2-tuple).
The key to the performance improvement is solving a “harder” problem: computing fib n and fib (n+1) at the same time. This allows the values needed to be “passed forward” to the “next iteration”.
In general, we can approach the synthesis of a function using the following method.

1. Devise a specification for the function in terms of defined functions, data, etc.
2. Assume the specification holds.
3. Using proof techniques (as if proving the specification), calculate an appropriate definition for the function.
4. As needed, break the synthesis calculation into cases motivated by the induction “proof” over an appropriate (well-founded) set (e.g. over natural numbers or finite lists). The inductive cases usually correspond to recursive legs of the definition.
Now let’s consider a function to generate a list of the elements fib 0 through fib n for some natural number n. A simple backward recursive definition follows:
allfibs :: Int -> [Int]
allfibs 0 = [0] -- allfibs.1
allfibs n | n > 0 = allfibs (n-1) ++ [fib n] -- allfibs.2
Using fastfib, each fib n calculation is O(n). Each ++ call is also O(n). The fib and the ++ are “in sequence”, so each call of allfibs is just O(n). However, there are O(n) recursive calls of allfibs, so the overall complexity is O(n^2).
We again attempt to improve the efficiency by tupling. We begin with the following specification for fibs:

fibs n = (fib n, fib (n+1), allfibs n)
We already have definitions for the functions on the right-hand side, fib and allfibs. Our task now is to synthesize a definition for the left-hand side, fibs.
We proceed by induction on the natural number n and consider two cases.
Base case n = 0:

fibs 0
=   { fibs specification }
(fib 0, fib 1, allfibs 0)
=   { fib.1, fib.2, allfibs.1 }
(0,1,[0])
This gives us a definition for the base case.
Inductive case n = m+1:

Given that there is a definition for fibs m that satisfies the specification

fibs m = (fib m, fib (m+1), allfibs m)

calculate a definition for fibs (m+1) that satisfies the specification.

fibs (m+1)
=   { fibs specification }
(fib (m+1), fib (m+2), allfibs (m+1))
=   { fib.3, allfibs.2 }
(fib (m+1), fib m + fib (m+1), allfibs m ++ [fib (m+1)])
=   { abstraction }
(b, a+b, c ++ [b]) where (a,b,c) = (fib m, fib (m+1), allfibs m)
=   { induction hypothesis }
(b, a+b, c ++ [b]) where (a,b,c) = fibs m
This gives us a definition for the inductive case.
Bringing the cases together, we get the following definitions:
fibs :: Int -> (Int,Int,[Int])
fibs 0 = (0,1,[0])
fibs n | n > 0 = (b,a+b,c++[b])
  where (a,b,c) = fibs (n-1)

allfibs1 :: Int -> [Int]
allfibs1 n = thd3 (fibs n)
Above, thd3 extracts the third component of a 3-tuple. (It is predefined in some prelude variants; if not, it can be defined as thd3 (_,_,z) = z.)
We have eliminated the O(n) fib calculations, but still have an O(n) append (++) within each of the O(n) recursive calls of fibs. This program is better, but is still O(n^2).
Note that in the c ++ [b] expression there is a single element on the right. Perhaps we could build this term backwards using cons, an O(1) operation, and then reverse the final result.
We again attempt to improve the efficiency by tupling. We begin with the following specification for fibs':

fibs' n = (fib n, fib (n+1), reverse (allfibs n))

For convenience in calculation, we replace reverse by its backward recursive equivalent rev.
We again proceed by induction on n and consider two cases.
Base case n = 0:

fibs' 0
=   { fibs' specification }
(fib 0, fib 1, rev (allfibs 0))
=   { fib.1, fib.2, allfibs.1 }
(0, 1, rev [0])
=   { rev.2 }
(0, 1, rev [] ++ [0])
=   { rev.1, append.1 }
(0,1,[0])
This gives us a definition for the base case.
Inductive case n = m+1:

Given that there is a definition for fibs' m that satisfies the specification

fibs' m = (fib m, fib (m+1), rev (allfibs m))

calculate a definition for fibs' (m+1) that satisfies the specification.

fibs' (m+1)
=   { fibs' specification }
(fib (m+1), fib (m+2), rev (allfibs (m+1)))
=   { fib.3, allfibs.2 }
(fib (m+1), fib m + fib (m+1), rev (allfibs m ++ [fib (m+1)]))
=   { abstraction }
(b, a+b, rev (allfibs m ++ [b])) where (a,b,c) = (fib m, fib (m+1), rev (allfibs m))
=   { induction hypothesis }
(b, a+b, rev (allfibs m ++ [b])) where (a,b,c) = fibs' m
=   { rev (xs ++ [x]) = x : rev xs, Note 1 }
(b, a+b, b : rev (allfibs m)) where (a,b,c) = fibs' m
=   { substitution }
(b, a+b, b:c) where (a,b,c) = fibs' m
This gives us a definition for the inductive case.
Note 1: The proof of rev (xs ++ [x]) = x : rev xs is left as an exercise.
Bringing the cases together, we get the following definition:
fibs' :: Int -> (Int,Int,[Int])
fibs' 0 = (0,1,[0])
fibs' n | n > 0 = (b,a+b,b:c)
  where (a,b,c) = fibs' (n-1)

allfibs2 :: Int -> [Int]
allfibs2 n = reverse (thd3 (fibs' n))
Function fibs' is O(n). Hence, allfibs2 is also O(n).
Are further improvements possible?
Clearly, function fibs' must generate an element of the sequence for each integer in the range [0..n]. Thus no complexity order improvement is possible.
However, from our previous experience, we know that it should be possible to avoid doing a reverse by using a tail recursive auxiliary function to compute the Fibonacci sequence. The investigation of this possible improvement is left to the reader.
For an O(log2(n)) algorithm to compute fib n, see Section 5.2 of Kaldewaij’s textbook on program derivation [Kaldewaij 1990].
drop from take

Suppose that we have the following definition for the list function take, but no definition for drop.
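A definition consistent with the equations take.1, take.2, and take.3 cited in the calculations below is (this take shadows the Prelude function of the same name):

take :: Int -> [a] -> [a]
take n _ | n <= 0 = []             -- take.1
take _ []         = []             -- take.2
take n (x:xs) = x : take (n-1) xs  -- take.3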
Further suppose that we wish to synthesize a definition for drop that satisfies the following specification for any natural number n and finite list xs:

take n xs ++ drop n xs = xs
We proved this as a property earlier, given definitions for both take and drop. The synthesis uses induction on both n and xs and the same cases we used in the proof.
Base case n = 0:

By the specification, substituting for this case:

take 0 xs ++ drop 0 xs = xs
≡   { take.1 }
[] ++ drop 0 xs = xs
≡   { ++ identity }
drop 0 xs = xs

This gives the equation drop 0 xs = xs.
Base case n = m+1, xs = []:

By the specification, substituting for this case:

take (m+1) [] ++ drop (m+1) [] = []
≡   { take.2 }
[] ++ drop (m+1) [] = []
≡   { ++ identity }
drop (m+1) [] = []

This gives the defining equation drop (m+1) [] = []. Since the value of the argument (m+1) is not used in the above calculation, we can generalize the definition to drop _ [] = [].
Inductive case n = m+1, xs = (a:as):

Given that there is a definition for drop m as that satisfies the specification

take m as ++ drop m as = as

calculate an appropriate definition for drop (m+1) (a:as) that satisfies the specification.

By the specification, substituting for this case:

take (m+1) (a:as) ++ drop (m+1) (a:as) = (a:as)
≡   { take.3 }
(a : take m as) ++ drop (m+1) (a:as) = (a:as)
≡   { append.2 }
a : (take m as ++ drop (m+1) (a:as)) = (a:as)
≡   { axiom of equality of lists (Note 1) }
take m as ++ drop (m+1) (a:as) = as
≡   { m >= 0, specification }
take m as ++ drop (m+1) (a:as) = take m as ++ drop m as
≡   { equality of lists (Note 2) }
drop (m+1) (a:as) = drop m as

Because of the induction hypothesis, we know that drop m as is defined. This gives a definition for this case: drop (m+1) (a:as) = drop m as.
Notes:

The symbol ≡ denotes logical equivalence (i.e. if and only if) and is pronounced “equivales”.

Note 1: Axiom of equality of lists: (x:xs) = (y:ys) ≡ x = y && xs = ys. In this case x and y both equal a.

Note 2: The property xs ++ ys = xs ++ zs ≡ ys = zs can be proved by induction on xs using the Note 1 property.
Bringing the cases together, we get the definition that we saw earlier.
drop :: Int -> [a] -> [a]
drop n xs | n <= 0 = xs -- drop.1
drop _ [] = [] -- drop.2
drop n (_:xs) = drop (n-1) xs -- drop.3
In Chapter 14, we looked at two different definitions of a function to reverse the elements of a list. Function rev uses a straightforward backward linear recursive technique and reverse uses a tail recursive auxiliary function. We proved that these definitions are equivalent.
rev :: [a] -> [a]
rev [] = [] -- rev.1
rev (x:xs) = rev xs ++ [x] -- rev.2

reverse :: [a] -> [a]
reverse xs = rev' xs [] -- reverse.1
  where rev' [] ys = ys -- reverse.2
        rev' (x:xs) ys = rev' xs (x:ys) -- reverse.3
Function rev' is a generalization of rev. Is there a way to calculate rev' from rev?
Yes, by using the Tail Recursion Theorem for lists. We develop this theorem in a more general setting than rev.
The following is based on Section 4.7 of [Hoogerwoord 1989].
For some types X and Y, let function fun be defined as follows:

fun :: X -> Y
fun x | not (b x) = f x -- fun.1
      | b x = h x *** fun (g x) -- fun.2
Functions b, f, g, h, and *** are not defined in terms of fun. They satisfy the following constraints:

- b :: X -> Bool such that, for any x, b x is defined whenever fun x is defined.
- g :: X -> X such that, for any x, g x is defined whenever fun x is defined and b x holds.
- h :: X -> Y such that, for any x, h x is defined whenever fun x is defined and b x holds.
- (***) :: Y -> Y -> Y such that operation *** is defined for all elements of Y and is an associative operation with left identity e.
- f :: X -> Y such that, for any x, f x is defined whenever fun x is defined and not (b x) holds.
- X with relation ≺ admits induction (i.e. ≺ is a well-founded ordering on X).
- For any x, if fun x is defined and b x holds, then g x ≺ x.
Note that both fun x and the recursive leg h x *** fun (g x) have the general structure y *** fun z for some expressions y and z (i.e. fun x = e *** fun x). Thus we specify a more general function fun' such that

fun' y x = y *** fun x

and such that fun' y x is defined for any x in X for which fun x is defined.
Given the above specification, we note that:

fun' e x
=   { fun' specification }
e *** fun x
=   { e is the left identity for *** }
fun x
We proceed by induction on the type X with relation ≺. (We are using well-founded induction, a more general form of induction than we have used before.)
We have two cases. The base case is when not (b x) holds for argument x of fun'. The inductive case is when b x holds (i.e. g x ≺ x).
Base case not (b x): (That is, x is a minimal element of X under ≺.)

fun' y x
=   { fun' specification }
y *** fun x
=   { fun.1 }
y *** f x
Inductive case b x: (That is, g x ≺ x.)

Given that there is a definition for fun' y (g x) that satisfies the specification for any y, calculate a definition for fun' y x that satisfies the specification.

fun' y x
=   { fun' specification }
y *** fun x
=   { fun.2 }
y *** (h x *** fun (g x))
=   { *** associativity }
(y *** h x) *** fun (g x)
=   { g x ≺ x, induction hypothesis }
fun' (y *** h x) (g x)
Thus we have synthesized the following tail recursive definition for function fun' and essentially proved the Tail Recursion Theorem shown below.
fun' :: Y -> X -> Y
fun' y x | not (b x) = y *** f x -- fun'.1
         | b x = fun' (y *** h x) (g x) -- fun'.2
Note that the first parameter of fun' is an accumulating parameter.
Tail Recursion Theorem: If fun, fun', and e are defined as given above, then fun x = fun' e x.
Now let’s consider the rev and rev' functions again. First, let’s rewrite the definition of rev in a form similar to the definition of fun.
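A sketch of such a rewriting follows (the Eq constraint is an artifact of testing xs == [] with guards rather than pattern matching):

rev :: Eq a => [a] -> [a]
rev xs | xs == [] = []                          -- not (b x): result is f x
       | xs /= [] = rev (tail xs) ++ [head xs]  -- b x: fun (g x) combined with h x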
For rev we substitute the following for the components of the fun definition:

fun x      ↦  rev xs
b x        ↦  xs /= []
g x        ↦  tail xs
h x        ↦  [head xs]
l *** r    ↦  r ++ l                    (note the flipped operands)
f x        ↦  []
l ≺ r      ↦  (length l) < (length r)
e          ↦  []
fun' y x   ↦  rev' xs ys                (note the flipped arguments)
Thus, by applying the Tail Recursion Theorem, fun' becomes the following:
rev' :: Eq a => [a] -> [a] -> [a]
rev' xs ys
  | xs == [] = ys -- rev'.1
  | xs /= [] = rev' (tail xs) ([head xs]++ys) -- rev'.2
From the Tail Recursion Theorem, we conclude that rev xs = rev' xs [].
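For example, unfolding the synthesized definition step by step:

rev' [1,2,3] []
= rev' [2,3] [1]     -- rev'.2: head moves to the accumulator
= rev' [3] [2,1]     -- rev'.2
= rev' [] [3,2,1]    -- rev'.2
= [3,2,1]            -- rev'.1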
Why would we want to convert a backward linear recursive function to a tail recursive form?
- A tail recursive definition is sometimes more space efficient (as we saw in Chapter 9). This is especially the case if the strictness of an accumulating parameter can be exploited (as we saw in Chapters 9 and 15).
- A tail recursive definition sometimes allows the replacement of an “expensive” operation (requiring many steps) by a less “expensive” one. (For example, ++ is replaced by cons in the transformation from rev to rev'.)
- A tail recursive definition can be transformed (either by hand or by a compiler) into an efficient loop.
- A tail recursive definition is usually more general than its backward linear recursive counterpart. Sometimes we can exploit this generality to synthesize a more efficient definition. (We see an example of this in the next subsection.)
This section is adapted from Section 11.3 of Cohen’s textbook [Cohen 1990].
Although the Tail Recursion Theorem is important, the technique we used to develop it is perhaps even more important. We can sometimes use the technique to transform one tail recursive definition into another that is more efficient [Hoogerwoord 1989].
Consider exponentiation by a natural number power. The operation ** can be defined recursively in terms of multiplication as follows:
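A definition consistent with the substitutions below is the following sketch. (In standard Haskell, the Prelude’s floating-point (**) must be hidden before the name can be reused.)

import Prelude hiding ((**))

(**) :: Int -> Int -> Int
m ** 0 = 1                        -- **.1
m ** n | n > 0 = m * (m ** (n-1)) -- **.2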
For (**) we substitute the following for the components of the fun definition of the previous subsection:

fun x      ↦  m ** n
b x        ↦  n > 0          (applied only to natural numbers)
g x        ↦  n - 1
h x        ↦  m
l *** r    ↦  l * r
f x        ↦  1
l ≺ r      ↦  l < r
e          ↦  1
fun' y x   ↦  exp a m n
Thus, by applying the Tail Recursion Theorem, we define the function exp such that

exp a m n = a * (m ** n)

and, in particular:

exp 1 m n = m ** n
The resulting function exp is defined as follows (for n >= 0):
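Instantiating fun'.1 and fun'.2 with the substitutions above yields a sketch like the following (the name exp shadows the Prelude’s exponential function):

exp :: Int -> Int -> Int -> Int
exp a m 0 = a                         -- exp.1: from fun'.1, a * 1 = a
exp a m n | n > 0 = exp (a*m) m (n-1) -- exp.2: from fun'.2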
In terms of time, this function is no more efficient than the original version; both require O(n) multiplies. (However, by exploiting the strictness of the first parameter, exp can be made more space efficient than **.)
Note that the exp algorithm converges upon the final result in steps of one. Can we take advantage of the generality of exp and the arithmetic properties of exponentiation to find an algorithm that converges in larger steps?
Yes, we can by using the technique that we used to develop the Tail Recursion Theorem. In particular, let’s try to synthesize an algorithm that converges logarithmically (in steps of half the distance) instead of linearly.
Speaking operationally, we are looking for a “short cut” to the result. To find this short cut, we use the “maps” that we have of the “terrain”. That is, we take advantage of the properties we know about the exponentiation operator.
We thus attempt to find expressions x and y such that

exp x y (n/2) = exp a m n

where “/” represents division on integers.
For the base case where n = 0, this is trivial. We proceed with a calculation to discover values for x and y that make

exp x y (n/2) = exp a m n

when n > 0 (i.e. in the inductive case). In doing this we can use the specification for exp (i.e. exp a m n = a * (m ** n)).
exp x y (n/2)
=   { exp specification }
x * (y ** (n/2))
=   { choose y = m ** 2 (Note 1) }
x * ((m ** 2) ** (n/2))
Note 1: The strategy is to make choices for x and y that make exp x y (n/2) and exp a m n equal. This choice for y is a step toward getting the m ** n term.
Because we are dealing with integer division, we need to consider two cases because of truncation.
Subcase even n (for n > 0):

x * ((m ** 2) ** (n/2))
=   { arithmetic properties of exponentiation, n even }
x * (m ** n)
=   { choose x = a, toward getting a * (m ** n) }
a * (m ** n)
=   { exp specification }
exp a m n

Thus, for even n, we derive:

exp a m n = exp a (m ** 2) (n/2)

We optimize and replace m ** 2 by m * m.
Subcase odd n (for n > 0): That is, n/2 = (n-1)/2.

x * ((m ** 2) ** ((n-1)/2))
=   { arithmetic properties of exponentiation }
x * (m ** (n-1))
=   { choose x = a * m, toward getting a * (m ** n) }
(a * m) * (m ** (n-1))
=   { arithmetic properties of exponentiation }
a * (m ** n)
=   { exp specification }
exp a m n

Thus, for odd n, we derive:

exp a m n = exp (a*m) (m*m) ((n-1)/2)
To differentiate the logarithmic definition for exponentiation from the linear one, we rename the former to exp'. We have thus defined exp' as follows (for n >= 0):
exp' :: Int -> Int -> Int -> Int
exp' a m 0 = a -- exp'.1
exp' a m n
  | even n = exp' a (m*m) (n `div` 2) -- exp'.2
  | odd n = exp' (a*m) (m*m) ((n-1) `div` 2) -- exp'.3
Above we showed that exp a m n = exp' a m n. However, execution of exp' converges faster upon the result: O(log2(n)) steps rather than O(n).
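For example, computing 2 ** 10 = 1024 converges in four recursive steps rather than ten:

exp' 1 2 10
= exp' 1 4 5        -- exp'.2 (10 is even)
= exp' 4 16 2       -- exp'.3 (5 is odd)
= exp' 4 256 1      -- exp'.2 (2 is even)
= exp' 1024 65536 0 -- exp'.3 (1 is odd)
= 1024              -- exp'.1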
Note: Multiplication and division of integers by natural number powers of 2 (in particular, by 2 itself) can be implemented on most current computers by arithmetic left and right shifts, respectively, which are faster than general multiplication and division.
The next chapter applies the program synthesis techniques developed in this chapter to a larger set of examples on text processing.
No subsequent chapter depends explicitly upon the program synthesis content from these chapters. However, if practiced regularly, the techniques explored in this chapter can enhance a programmer’s ability to solve problems and construct correct functional programming solutions.
The following function computes the integer base 2 logarithm of a positive integer:
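One backward recursive definition of this kind (assumed here for concreteness) is:

lg :: Int -> Int
lg x | x == 1 = 0                   -- lg.1
     | x > 1  = 1 + lg (x `div` 2)  -- lg.2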
Using the Tail Recursion Theorem, write a definition for lg that is tail recursive.
Synthesize the recursive definition for ++ from the following specification:
Using tupling and function fact5 from Chapter 4, synthesize an efficient function allfacts to generate a list of factorials for natural numbers 0 through parameter n, inclusive.
Consider the following recursive definition for natural number multiplication:
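One such definition (assumed here for concreteness; the hypothetical name mul avoids clashing with the Prelude’s *) is:

mul :: Int -> Int -> Int
mul m 0 = 0                        -- mul.1
mul m n | n > 0 = mul m (n-1) + m  -- mul.2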
This is an O(n) algorithm for computing m * n. Synthesize an alternative operation that is O(log2(n)). Doubling (i.e. n*2) and halving (i.e. n/2 with truncation) operations may be used, but not multiplication (*) in general.
Derive a “more general” version of the Tail Recursion Theorem for functions of a shape like that of fun but also involving a function d and a second operation +++, where functions b, f, g, and h are constrained as in the definition of fun in the Tail Recursion Theorem. Be sure to identify the appropriate constraints on d, ***, and +++, including the necessary properties of *** and +++.
In Summer 2018, I adapted and revised this chapter and the next from my previous notes on functional programming with Haskell.
These previous notes drew on the presentations in the first edition of the Bird and Wadler textbook [Bird 1988], Hoogerwoord’s dissertation [Hoogerwoord 1989], Kaldewaij’s textbook [Kaldewaij 1990], Cohen’s textbook [Cohen 1990], and other sources.
I incorporated this work as new Chapter 26, Program Synthesis (this chapter), and new Chapter 27, Text Processing, in the 2018 version of the textbook Exploring Languages with Interpreters and Functional Programming and continue to revise it.
I maintain this chapter as text in Pandoc’s dialect of Markdown using embedded LaTeX markup for the mathematical formulas and then translate the document to HTML, PDF, and other forms as needed.