Fixed Point Datatypes

July 7, 2008

This is more Haskell translation of my reading of “Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire”, this time concerning fixed points of data types.

First, what is a fixed point (aka. fixpoint)? In calculus, functions can have fixed points. These are values x such that f(x) = x. As an example, f(x) = x^2 has a fixed point at both 0 and 1. For f(x) = x, every x is a fixpoint!

The notion of fixpoints in programming is somewhat different, though related. In programming fixed points are about self-application, rather than the calculus definition above. I hope the distinction will become clear with some examples.

First, consider the following definition of a simple pair-like type:

data Pair a b = Pair a b

I can use any type I like for a and b, including, say, Pair a b:

example1 = Pair 1 (Pair 2 'a')

I could go further:

example2 = Pair 1 (Pair 2 (Pair 3 (Pair 4 (Pair 5 'a'))))

Is that starting to look familiar? This is an approximation of lists, though not exactly a pleasant one to work with.
Note that Pair a is a Functor. This idea of applying a Functor to itself is exactly fixpoints of data types. With the Pair a Functor, there is nothing to stop the self-application, so we have something isomorphic to an infinite [a].
Augmenting Pair to include a second, nullary, constructor, thus:

data Pair a b = Stop | Pair a b

We can then take the fixpoint of the Functor Pair a, which is isomorphic to:

data PairFix a = Stop | Pair a (PairFix a)

And that is clearly isomorphic to desugared lists:

data List a = Empty | Cons a (List a)

We can even abstract this notion of self-application of a Functor in a curious newtype:

newtype Functor f => Mu f = In (f (Mu f))

Such that PairFix a = Mu (Pair a), assuming we’ve declared an instance of Functor for Pair a.

Lists are actually only one example. Mu can be applied to any Functor. Using the binary tree implementation from a previous post ,

data Tree a = Empty | Tree a (Tree a) (Tree a)

This is the fixpoint of the following type:

data Cell a b = Empty | Cell a b b

Since Cell a forms a Functor, Mu (Cell a) == Tree a.

This is all merely an intriguing bit of type theory until we make the realization that all of these structures have something in common. That something is that they can all be recursed over. This can be seen most clearly by looking at lists, which as I demonstrated above are isomorphic to Mu (Pair a) using the second definition of Pair. The main thrust of Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire is that recursion patterns (ie. higher-order functions) we generally think of as being for lists are in fact applicable to any fixpoint data structure.

The paper goes on to describe four core “foomorphisms”: catamorphisms, anamorphisms, hylomorphisms and paramorphisms. The names are from Greek, and like “monad” and “functor” are far scarier than the things they name. These are generalizations of foldr, unfoldr, primitive recursion and primitive recursion with an accumulating parameter, respectively.

I had originally planned to describe these in more detail here, but Edward Kmett (edwardk) has begun writing a Field Guide to all of these recursion schemes. His posts on each of the foomorphisms include both category theory and concrete examples, and are an excellent read. As you can see from the front page of his Field Guide, there are many, many more than the four described in the original paper. Implementations of all of these and much more are available in category-extras on Hackage.

I will still describe some of the core examples and definitions from the paper in later posts. In particular my explorations of the standard examples (Peano numbers, factorial, trees) from Edward Kmett’s Guide and elsewhere. My goal is to learn to recognize them well “in the wild”, since they aid understanding even when the function doesn’t explicitly use the foomorphism.


Equational Reasoning Shopping List

April 10, 2008

Pure functional programming has the advantage that one can reason with it very like one does in mathematics. Specifically, if you know two expressions to be equal, they can be freely swapped. This underlies the use of inlining, sharing and the ability to replace any linear recursion with a fold, unfold, zip, etc. as appropriate.

As a first Haskell example, consider the following very-well-known optimization. Given code like

map f $ map g $ map h list

or in the more common style:

map f . map g . map h $ list

it appears naively that this will traverse the list three times: once to apply h, again to apply g, and finally a third time to apply f.

But because map, f, g and h are all pure functions, they can be replaced with something equivalent without changing the result of the code.

Consider an element of list, say x. After the three maps, it will have been replaced in the final list with f(g(h x))), or equivalently f . g . h $ x. So we should be able to replace the three maps with one that does the same thing:

map (f . g . h) list

And in fact GHC does precisely this.

Equational Rules
Following is a collection (and more are most welcome!) of rules for transforming your programs. I have written their equality with ==, though of course Haskell functions are not in Eq. I have written implication as –> and equivalence/double-implication as <–>.

map f . map g == map (f . g)
filter f . filter g     == filter (\x -> f x && g x)
                        == filter (liftM2 (&&) f g)
f . f == id --> filter (p . f) == map f . filter p . map f
filter (all p) . cartesianProduct == cartesianProduct . map (filter p)
filter p . concat == concat . map (filter p) == concatMap (filter p)

And now some concerning pairs and arrow combinators, Haskell translations of ones from Meijer et al.’s seminal paper:

fst . (f *** g) == f . fst
fst . (f &&& g) == f
-- and the obvious equivalents for snd and g.
fst . h &&& snd . h == id
fst &&& snd == id
(f *** g) . (h &&& j) == (f . h) &&& (g . j)
(f &&& g) . h == (f . h) &&& (g . h)
(f *** g) == (h *** j) <--> f == h && g == j
(f &&& g) == (h &&& j) <--> f == h && g == j

And more, translated from the same paper, for Either and the arrow combinators using it:

(f +++ g) . Left == Left . f
(f ||| g) . Left == f
-- and the obvious equivalents for g and Right
{- h strict -} --> (h . Left) ||| (h . Right) == h
Left ||| Right == id
(f ||| g) . (h +++ j) == (f . h) ||| (g . j)
{- f strict -} --> f . (g ||| h) == (f . g) ||| (f . h)
(f ||| g) == (h ||| j)  f == h && g == j
(f +++ g) == (h +++ j)  f == h && g == j

And finally the “Abides Law” tying the two together:

(f &&& g) ||| (h &&& j) == (f ||| h) &&& (g ||| j)

There are many more such rules, and if you have more to suggest I’d love to add them to the list.


Follow

Get every new post delivered to your Inbox.