I was recently reading the excellent article How to design co-programs by Jeremy Gibbons, which discusses structural corecursion—an operation which builds up a data structure layer-by-layer—as a missed design recipe for dealing with recursive types in the book How to Design Programs. Gibbons states that the book discusses structural recursion:
When the input is of a recursive type, the skeleton encapsulates structural recursion—a case analysis between the base case and the inductive case, the latter case entailing recursive calls.
And generative recursion:
A secondary, more advanced, strategy is to use generative recursion, otherwise known as divide-and-conquer. The skeleton in this design recipe incorporates a test for triviality; in the non-trivial cases, it splits the problem into subproblems, recursively solves the subproblems, and assembles the subresults into an overall result.
But mentions that How to Design Programs missed an important design recipe. The conquer part of divide-and-conquer is structural recursion—recursively analysing the subproblems to assemble the overall result. What if we look at just the division part? Well, that’s structural corecursion—building up the subproblems, layer by layer, as we unfold from the input.
I won’t go any further into structural corecursion—Gibbons provides a great
explanation, read the article!—but he does show the reader divide-and-conquer
implementations of both QuickSort and MergeSort. It’s the latter that I want to
draw attention to as there’s a bit of jank in his implementation—note the two
calls to merge in mergeAll:
data Tree = Empty | Node Tree Integer Tree
mergesort :: [Integer] -> [Integer]
mergeSort = mergeAll . splitUp
splitUp :: [Integer] -> Tree
splitUp [] = Empty
splitUp (a:x) = Node (splitUp y) a (splitUp z)
where (y, z) = halve x
halve :: [Integer] -> ([Integer], [Integer])
halve [] = ([], [])
halve [a] = ([a], [])
halve (a:b:rest) = (a:y, b:z)
where (y, z) = halve rest
mergeAll :: Tree -> [Integer]
mergeAll Empty = []
mergeAll (Node t a u) =
merge (mergeAll t) (merge [a] (mergeAll u))
merge :: [Integer] -> [Integer] -> [Integer]
merge [] y = y
merge x [] = x
merge (a:x) (b:y) = if a <= b
then a:merge x (b:y)
else b:merge (a:x) yMind you, Gibbons didn’t miss this, he brought up this issue himself, and he even suggests a path to a solution:
Choosing this particular tree type is a bit clunky for Merge Sort, because of the two calls to
mergerequired inmergeAll. It would be neater to use non-empty externally labelled binary trees, with elements at the leaves and none at the branches:
data Tree = Tip Integer | Bin Tree TreeThen you define the main function to work only for non-empty lists, and provide a separate case for sorting the empty list. This clunkiness is a learning opportunity: to realise the problem, come up with a fix (no node labels in the tree), rearrange the furniture accordingly, then replay the development and compare the results.)
Learning opportunity for the reader? Yeah, I got nerd sniped. Here’s how I did it.
Here’s the first issue: if we’re supposed to corecurse to build up these
non-empty trees, how can we convert an empty list into a Tree? We can’t! Here
came the first epiphany: I need to ensure that, at all times while building up
the tree, the program is only ever working on non-empty lists, which is a job
for the NonEmpty data structure:
data NonEmpty a = a :> [a]Which implies that splitUp has a type signature of NonEmpty Integer -> Tree.
I’ll do a case analysis on the input to MergeSort which ensures we only call this if we get a non-empty list as input—if it’s an empty list, we can just return the empty list.
Now, we need to work out when we want splitUp to construct a Tip or a
Bin.
The shape of the program is determined by the shape of
the output. Structural corecursion!
The second epiphany came when I figured out that when there’s only one item in
the NonEmpty, we need to construct a Tip, but if there’s at least two, then
we can cleave the NonEmpty in twain
Does using sufficiently
archaic language absolve an author of the “you could be AI” allegations? and corecurse, building up the Tree from the partitioned input:
splitUp :: NonEmpty Integer -> Tree
splitUp (a :> []) )= Tip a
splitUp (a :> (b:rest)) =
Bin (splitUp (a :> y)) (splitUp (b :> z))
where (y, z) = halve restI want to note an interesting similarity between halve and splitUp—they
both look for the case where there’s at least two elements in a data-structure,
so that they can split it apart into two non-empty data structures (halve by
producing a tuple of two lists, and splitUp by producing a Bin that contains
two sub-trees).
Anyways, with that figured out, we can now put together the fixed MergeSort implementation:
-- ===== Data structures
data NonEmpty a = a :> [a]
data Tree = Tip Integer | Bin Tree Tree
-- ===== Sort
mergeSort :: [Integer] -> [Integer]
mergeSort [] = []
mergeSort (a:as) = mergeAll . splitUp $ a :> as
-- ----- Recursion / conquer
mergeAll :: Tree -> [Integer]
mergeAll (Tip i) = [i]
mergeAll (Bin a b) = merge (mergeAll a) (mergeAll b)
merge :: [Integer] -> [Integer] -> [Integer]
merge [] y = y
merge x [] = x
merge (a:x) (b:y)
| a <= b = a : merge x (b:y)
| otherwise = b : merge (a:x) y
-- ----- Corecursion / divide
splitUp :: NonEmpty Integer -> Tree
splitUp (a :> []) )= Tip a
splitUp (a :> (b:rest)) =
Bin (splitUp (a :> y)) (splitUp (b :> z))
where (y, z) = halve rest
halve :: [Integer] -> ([Integer], [Integer])
halve [] = ([], [])
halve [a] = ([a], [])
halve (a:b:rest) = (a:y, b:z)
where (y, z) = halve restAnd thus concludes my implementation of MergeSort.
I also want to briefly compare my MergeSort to Gibbons’ QuickSort:
quicksort :: [Integer] -> [Integer]
quicksort = flatten . build
data Tree = Empty
| Node Tree Integer Tree
build :: [Integer] -> Tree
build [] = Empty
build (a:as) = Node (build lte) a (build gt)
where (lte, gt) = partition (a <=) as
flatten :: Tree -> [Integer]
flatten Empty = []
flatten (Node t a u) = flatten t ++ [a] ++ flatten uBecause there’s an interesting difference between QuickSort and
MergeSort—MergeSort does the sorting during recursion (while analysing
the intermediate tree with mergeAll), while QuickSort does the sorting during
corecursion (while synthesising the intermediate tree with build).