J.N. Oliveira

PROGRAM DESIGN BY CALCULATION

(DRAFT)

University of Minho
(in preparation)




CONTENTS

2.19 Exercises
2.20 Bibliography notes

3 Recursion in the Pointfree Style
  3.1 Motivation
  3.2 Introducing inductive datatypes
  3.3 Observing an inductive datatype
  3.4 Synthesizing an inductive datatype
  3.5 Introducing (list) catas, anas and hylos
  3.6 Inductive types more generally
  3.7 Functors
  3.8 Polynomial functors
  3.9 Polynomial inductive types
  3.10 F-algebras and F-homomorphisms
  3.11 F-catamorphisms
  3.12 Parameterization and type functors
  3.13 A catalogue of standard polynomial inductive types
  3.14 Functors and type functors in HASKELL
  3.15 The mutual-recursion law
  3.16 “Banana-split”: a corollary of the mutual-recursion law
  3.17 Inductive datatype isomorphism
  3.18 Bibliography notes

4 Why Monads Matter
  4.1 Partial functions
  4.2 Putting partial functions together
  4.3 Lists
  4.4 Monads
    4.4.1 Properties involving (Kleisli) composition
  4.5 Monadic application (binding)
  4.6 Sequencing and the do-notation
  4.7 Generators and comprehensions
  4.8 Monads in HASKELL
    4.8.1 Monadic I/O
  4.9 The state monad
  4.10 Bibliography notes


II Moving Away From (Pure) Functions

5 Quasi-inductive datatypes
  5.1 Introducing Non-inductive Datatypes
  5.2 Structural Induction
  5.3 Well-founded coalgebras and induction
  5.4 Datatype invariants and proof obligations
  5.5 Binary relations and finite mappings
  5.6 Finite mapping induction principle
  5.7 An overview of the powerset and finite mapping algebras
  5.8 Exercises
  5.9 Bibliography notes

III Calculating with Relations

6 Introduction to Relational Calculation
  6.1 Functions are not enough
  6.2 Relational composition and converse
  6.3 Relational equality
  6.4 Specifications as “properties”
  6.5 Relational approach
  6.6 Pre/post specification style
  6.7 From predicates to relations
  6.8 Basic relational combinators
  6.9 Converse
  6.10 Meet
  6.11 Pointwise vs pointfree notation
  6.12 Orders and their taxonomy
  6.13 Derived combinators
  6.14 Entireness and simplicity
  6.15 Surjectiveness and injectiveness
  6.16 Binary relation taxonomy
  6.17 Reasoning about functions
  6.18 Galois connections
  6.19 Converse in a Galois connection
  6.20 Functions in a Galois connection
  6.21 Relational division


  6.22 Meet and join
  6.23 Relational split
  6.24 Relational either
  6.25 Meaning of VDM-SL specs
  6.26 Relational semantics of VDM-SL
  6.27 Relational McCarthy conditional
  6.28 Reasoning about VDM-SL
  6.29 Bibliography notes

7 An Introduction to Relational Hylomorphisms
  7.1 “How” does one specify?
  7.2 Divide-and-conquer (formally)
  7.3 Relators
  7.4 Properties of relators
  7.5 Equations and fixpoints
  7.6 Solving (Fixpoint) Equations
  7.7 Solving (Fixpoint) Equations III
  7.8 Solving relational equations
  7.9 Laws of the Fixpoint Calculus
  7.10 µ-fusion theorem
  7.11 Hylo(cata)-fusion
  7.12 Hylo(ana)-fusion
  7.13 Examples: VDM collective types
  7.14 Relational cata(ana)morphisms
  7.15 Inductive coreflexives
  7.16 Hylos as unique solutions
  7.17 Accessibility and membership
  7.18 Hylo-factorization Theorem
  7.19 Virtual data-structuring
  7.20 Final note on inductive relation ≺
  7.21 Bibliography notes

8 Theorems for Free
  8.1 Parametric polymorphism: why?
  8.2 Free theorem of type t
  8.3 Reynolds arrow operator
  8.4 Pointwise version of FT
  8.5 Second example: FT of (| |)


  8.6 Bibliography notes

IV Data Refinement by Calculation

9 On Data Representation
  9.1 Introduction
  9.2 Data refinement
  9.3 Representation relations
  9.4 Right invertibility
  9.5 Refinement inequations
  9.6 Relational representation
  9.7 Functional representation
  9.8 Concrete invariants
  9.9 A fundamental iso abstraction
  9.10 Pointfree untot = (i ◦ 1·)
  9.11 Properties of ≤
  9.12 Structural data refinement
  9.13 The finite map bifunctor
  9.14 Transposing relations
  9.15 Transposing finite relations
  9.16 Recursive data refinement
  9.17 Recursion “removal”
  9.18 Closure and wellfoundedness
  9.19 Object oriented Data Implementation
  9.20 Multiple inheritance
  9.21 Bibliography notes

10 On Algorithmic Refinement
  10.1 Implicit/explicit refinement
  10.2 Handling refinement equations
  10.3 Solving refinement equations
  10.4 Properties of ⊢
  10.5 Stepwise refinement
  10.6 Refinement is a partial order
  10.7 Main refinement strategies
  10.8 Data refinement in full
  10.9 Analysis of refinement equation


  10.10 Solving refinement equations
  10.11 Functional solutions
  10.12 Calculation of while/for loops
  10.13 Bibliography notes

A Haskell Support Library
  A.0.1 Set.hs

B Solutions to Selected Exercises


List of Exercises

Chapter 2: Exercises 2.1–2.25
Chapter 3: Exercises 3.1–3.3


Chapter 3 (continued): Exercises 3.4–3.15
Chapter 4: Exercises 4.1–4.5
Chapter 5: Exercises 5.1–5.9


Preamble

This textbook in preparation has arisen from the author's research and teaching experience. Its main aim is to provide software practitioners with a calculational approach to the design of software artifacts, ranging from simple algorithms and functions to the specification and realization of information systems. Put in other words, the book invites software designers to raise their standards and adopt mature development techniques found in other engineering disciplines, which (as a rule) are rooted in a sound mathematical basis so as to enable algebraic reasoning.

It is interesting to note that, while coining the phrase software engineering in the 1960s, our colleagues of the time were already promising such high quality standards. The terminology seems to date from the Garmisch NATO conference in 1968, from whose report [NR69] the following excerpt is quoted:

    In late 1967 the Study Group recommended the holding of a working conference on Software Engineering. The phrase ‘software engineering’ was deliberately chosen as being provocative, in implying the need for software manufacture to be based on the types of theoretical foundations and practical disciplines, that are traditional in the established branches of engineering.

Provocative or not, the need for sound theoretical foundations has clearly been under concern since the very beginning of the discipline. However, how “scientific” do such foundations turn out to be, now that four decades have since elapsed?

Ten years ago, Richard Bird and Oege de Moor published a textbook [BdM97] in whose preface C.A.R. Hoare writes:

    Programming notation can be expressed by “formulæ and equations (...) which share the elegance of those which underlie physics and chemistry or any other branch of basic science”.

The formulæ and equations mentioned in this quotation are those of the discipline known as the Algebra of Programming. Many others have contributed to this body


of knowledge, notably Roland Backhouse and his colleagues at Eindhoven and Nottingham, see e.g. [ABH+92] and [Bac04], among many others. Unfortunately, both of these references are still unpublished.

When the author of this draft textbook decided to teach the Algebra of Programming to 2nd year students of the Minho degrees in computer science, back in 1998, he found [BdM97] too difficult for the students to follow, mainly because of its explicit categorial (allegorical) flavour. So he decided to start writing slides and notes to help the students read the book. Eventually, such notes became chapters 2 to 4 of the current version of the monograph. The same procedure was followed when teaching the relational approach of [BdM97] to 4th and 5th year students (today at master level).

This draft book is by and large incomplete, most chapters being still in slide form. Such chapters are omitted from the current print-out.

Braga, University of Minho, December 2008
José N. Oliveira


Part I

Calculating with Functions


Chapter 2

An Introduction to Pointfree Programming

Everybody has been familiar with the concept of a function since the school desk. The functional intuition traverses mathematics from end to end because it has a solid semantics rooted in a well-known mathematical system — the class of “all” sets and set-theoretical functions.

Functional programming literally means “programming with functions”. Programming languages such as LISP or HASKELL allow us to program with functions. However, the functional intuition reaches much further than producing code which runs on a computer. Since the pioneering work of John McCarthy — the inventor of LISP — in the early 1960s, one knows that other branches of programming can be structured, or expressed, functionally. The idea of producing programs by calculation, that is to say, of calculating efficient programs out of abstract, inefficient ones, has a long tradition in functional programming.

This book is structured around the idea that functional programming can be used as a basis for teaching programming as a whole, from the successor function n ↦ n + 1 to large information system design.

This chapter provides a lightweight introduction to the theory of functional programming. Its emphasis is on explaining how to construct new functions out of other functions using a minimal set of predefined functional combinators. This leads to a programming style which is point free in the sense that function descriptions dispense with variables (definition points).

Many technical issues are deliberately ignored and deferred to later chapters. Most programming examples will be provided in the HASKELL functional programming language. Appendix A includes the listings of some HASKELL modules which complement the HUGS Standard Prelude (which is based very closely on the Standard Prelude for HASKELL 1.4) and help to “animate” the main concepts introduced in this chapter.
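As a first taste of what “dispensing with variables” means in HASKELL, compare a pointwise definition with its pointfree equivalent. The example function and its names (sumOddsPw, sumOddsPf) are ours, chosen just for illustration:

```haskell
-- Pointwise: the argument xs is named explicitly.
sumOddsPw :: [Int] -> Int
sumOddsPw xs = sum (filter odd xs)

-- Pointfree: the argument disappears; the function is assembled
-- from predefined combinators using composition (.).
sumOddsPf :: [Int] -> Int
sumOddsPf = sum . filter odd
```

Both definitions denote the same function; for instance, both map [1 .. 5] to 9.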


2.1 Introducing functions and types

The definition of a function

    f : A → B    (2.1)

can be regarded as a kind of “process” abstraction: it is a “black box” which produces an output once it is supplied with an input:

    x (∈ A) --> [ f ] --> f x (∈ B)

From another viewpoint, f can be regarded as a kind of “contract”: it commits itself to producing a B-value provided it is supplied with an A-value. How is such a value produced? In many situations one wishes to ignore this, because one is just using function f. In others, however, one may want to inspect the internals of the “black box” in order to know the function's computation rule. For instance,

    succ : IN → IN
    succ n  def=  n + 1

expresses the computation rule of the successor function — the function succ which finds “the next natural number” — in terms of natural number addition and of natural number 1. What we meant above by a “contract” corresponds to the signature of the function, which is expressed by arrow IN → IN in the case of succ and which, by the way, can be shared by other functions, e.g. sq n def= n².

In programming terminology one says that succ and sq have the same “type”. Types play a prominent rôle in functional programming (as they do in other programming paradigms). Informally, they provide the “glue”, or interfacing material, for putting functions together to obtain more complex functions. Formally, a “type checking” discipline can be expressed in terms of compositional rules which check functional expressions for well-formedness.

It has become standard to use arrows to denote function signatures or function types, recall (2.1). In this book the following variants will be used interchangeably to denote the fact that function f accepts arguments of type A and produces results of type B: f : A → B, f : B ← A, B ←f− A or A −f→ B. This corresponds to writing f :: a -> b in the HASKELL functional programming language, where type variables are denoted by lowercase letters. A will be referred to as the domain of f and B will be referred to as the codomain of f. Both A and B are symbols which denote sets of values, very often called types.

2.2 Functional application

What do we want functions for? If we ask this question of a physicist or an engineer the answer is very likely to be: one wants functions for modelling and reasoning about the behaviour of real things.

For instance, function distance t = 60 × t could be written by a school physics student to model the distance (in, say, kilometers) a car driving at average speed 60 km/hour covers in t hours. When questioned about how far the car has gone in 2.5 hours, such a model provides an immediate answer: just evaluate distance 2.5 to obtain 150 km.

So we get a naïve purpose of functions: we want them to be applied to arguments in order to obtain results. Functional application is denoted by juxtaposition, e.g. f a for f : B ← A and a ∈ A, and associates to the left: f x y denotes (f x) y rather than f (x y).

2.3 Functional equality and composition

Application is not everything we want to do with functions. Very soon our physics student will be able to talk about properties of the distance model, for instance that property

    distance (2 × t) = 2 × (distance t)    (2.2)

holds. Later on, we could learn from her or him that the same property can be restated as distance (twice t) = twice (distance t), by introducing function twice x def= 2 × x. Or even simply as

    distance · twice = twice · distance    (2.3)

where “·” denotes function-arrow chaining, as suggested by drawing

                twice
           IR --------> IR
            |            |
    distance|            |distance    (2.4)
            v            v
           IR --------> IR
                twice


where both space and time are modelled by real numbers.

This trivial example illustrates some relevant facets of the functional programming paradigm. Which version of the property presented above is “better”: the version explicitly mentioning variable t and requiring parentheses (2.2)? the version hiding variable t but resorting to function twice (2.3)? or even drawing (2.4)?

Expression (2.3) is clearly more compact than (2.2). The trend for notation economy and compactness is well known throughout the history of mathematics. In the 16th century, for instance, algebrists would write 12.cu.p̃.18.ce.p̃.27.co.p̃.17 for what is nowadays written as 12x³ + 18x² + 27x + 17. We may find such syncopated notation odd, but should not forget that at its time it was replacing even more obscure expression denotations.

Why do people look for compact notations? A compact notation leads to shorter documents (fewer lines of code in programming) in which patterns are easier to identify and to reason about. Properties can be stated in clear-cut, one-line-long equations which are easy to memorize. And diagrams such as (2.4) can easily be drawn to enable us to visualize maths in a graphical format.

Some people will argue that such compact “pointfree” notation (that is, the notation which hides variables, or function “definition points”) is too cryptic to be useful as a practical programming medium. In fact, pointfree programming languages such as Iverson's APL or Backus' FP have been more respected than loved by the programming community. Virtually all commercial programming languages require variables and so implement the more traditional “pointwise” notation.

Throughout this book we will adopt both, depending upon the context. Our chosen programming medium — HASKELL — blends the pointwise and pointfree programming styles in a quite successful way. In order to switch from one to the other, we need two “bridges”: one lifting equality to the functional level and the other lifting application.

Concerning equality, note that the “=” sign in (2.2) differs from that in (2.3): while the former states that two real numbers are the same number, the latter states that two IR ← IR functions are the same function. Formally, we will say that two functions f, g : B ← A are equal if they agree at pointwise level, that is

    f = g  iff  ⟨∀ a : a ∈ A : f a =_B g a⟩    (2.5)

where =_B denotes equality at B-level.

Concerning application, the pointfree style replaces it by the more generic concept of functional composition suggested by function-arrow chaining: wherever two functions are such that the target type of one of them, say g : B ← A, is the same as the source type of the other, say f : C ← B, then another function f · g : C ← A can be defined — called the composition of f and g, or “f after g” — which “glues” f and g together:

    (f · g) a  def=  f (g a)    (2.6)


This situation is pictured by the following arrow-diagram

             g
        B <------ A
        |        /
       f|       / f·g    (2.7)
        v      /
        C <---+

or by the block-diagram

    a --> [ g ] --> g a --> [ f ] --> f (g a)

Therefore, the type-rule associated with functional composition can be expressed as follows:

    f : C ← B
    g : B ← A
    -------------
    f · g : C ← A

Composition is certainly the most basic of all functional combinators. It is the first kind of “glue” which comes to mind when programmers need to combine, or chain, functions (or processes) to obtain more elaborate functions (or processes)¹. This is because of one of its most relevant properties,

    (f · g) · h = f · (g · h)    (2.8)

which shares the pattern of, for instance,

    (a + b) + c = a + (b + c)

and so is called the associative property of composition. This enables us to move parentheses around in pointfree expressions involving functional compositions, or even to omit them, for instance by writing f · g · h · i as an abbreviation of ((f · g) · h) · i, or of (f · (g · h)) · i, or of f · ((g · h) · i), etc. For a chain of n-many function compositions the notation ○ⁿᵢ₌₁ fᵢ will be acceptable as an abbreviation of f₁ · ... · fₙ.

¹It even has a place in script languages such as UNIX's, where f | g is the shell counterpart of g · f, for appropriate “processes” f and g.
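These laws are easy to animate in HASKELL, where “·” is written (.). Below, distance and twice are the functions of the text, with Double standing in for IR; the Bool-valued property checks are ours and, of course, only test law (2.3) and associativity (2.8) at one point each, not universally as (2.5) requires:

```haskell
distance :: Double -> Double
distance t = 60 * t          -- distance covered in t hours at 60 km/h

twice :: Double -> Double
twice x = 2 * x

-- Law (2.3): distance . twice = twice . distance, tested at a point.
propCommute :: Double -> Bool
propCommute t = (distance . twice) t == (twice . distance) t

-- Law (2.8): composition is associative, tested at a point.
propAssoc :: Double -> Bool
propAssoc t = ((distance . twice) . twice) t == (distance . (twice . twice)) t
```

For instance, distance 2.5 yields 150.0, and both properties hold at any argument we try.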


2.4 Identity functions

How free are we to fulfill the “give me an A and I will give you a B” contract of equation (2.1)? In general, the choice of f is not unique. Some fs will do as little as possible while others will laboriously compute non-trivial outputs. At one of the extremes, we find functions which “do nothing” for us, that is, the added value of their output when compared to their input amounts to nothing:

    f a = a

In this case B = A, of course, and f is said to be the identity function on A:

    id_A : A → A
    id_A a  def=  a    (2.9)

Note that every type X “has” its identity id_X. Subscripts will be omitted wherever implicit in the context. For instance, the arrow notation IN −id→ IN saves us from writing id_IN, etc. So, we will often refer to “the” identity function rather than to “an” identity function.

How useful are identity functions? At first sight, they look fairly uninteresting. But the interplay between composition and identity, captured by the following equation,

    f · id = id · f = f    (2.10)

will be appreciated later on. This property shares the pattern of, for instance,

    a + 0 = 0 + a = a

This is why we say that id is the unit of composition. In a diagram, (2.10) looks like this:

             id
        A --------> A
        |           |
       f|           |f    (2.11)
        v           v
        B --------> B
             id

Note the graphical analogy of diagrams (2.4) and (2.11). Diagrams of this kind are very common and express important properties of functions, as we shall see further on.

2.5 Constant functions

Opposite to the identity functions, which do not lose any information, we find functions which lose all (or almost all) information. Regardless of their input, the output of these functions is always the same value.
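Both extremes of this spectrum already exist in the HASKELL Prelude, as id and const (the latter anticipates the constant functions defined next). A small sketch, with property names of our own choosing, checks the unit law (2.10) at sample points:

```haskell
-- Prelude definitions (shown for reference):
--   id x      = x      -- the identity function
--   const c x = c      -- the everywhere-c function
-- Law (2.10): f . id = id . f = f, tested pointwise.
propUnit :: Eq b => (a -> b) -> a -> Bool
propUnit f x = (f . id) x == f x && (id . f) x == f x

-- A constant function maps all its inputs to the same output:
propConfuse :: Int -> Int -> Bool
propConfuse x y = const 'c' x == const 'c' y
```

For example, propUnit (+ 1) 41 and propConfuse 0 99 both evaluate to True.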


Let C be a nonempty data domain and let c ∈ C. Then we define the everywhere-c function as follows, for arbitrary A:

    c : A → C
    c a  def=  c    (2.12)

The following property defines constant functions at pointfree level,

    c · f = c    (2.13)

and is depicted by a diagram similar to (2.11):

             c
        A --------> C
        |           |
       f|           |id    (2.14)
        v           v
        B --------> C
             c

Note that, strictly speaking, symbol c denotes two different functions in diagram (2.14): one, which we should have written c_A, accepts inputs from A while the other, which we should have written c_B, accepts inputs from B:

    c_B · f = c_A    (2.15)

This property will be referred to as the constant-fusion property.

As with identity functions, subscripts will be omitted wherever implicit in the context.

Exercise 2.1. The HUGS Standard Prelude provides for constant functions: you write const c for c. Check that HUGS assigns the same type to the expressions f . const c and const (f c), for every f and c. What else can you say about these functional expressions? Justify.
✷

2.6 Monics and epics

Identity functions and constant functions are the limit points of the functional spectrum with respect to information preservation. All the other functions are in between: they lose “some” information, which is regarded as uninteresting for some reason. This remark supports the following aphorism about a facet of functional programming: it is the art


of transforming or losing information in a controlled and precise way. That is to say, the art of constructing the exact observation of data which fits in a particular context or requirement.

How do functions lose information? Basically in two different ways: they may be "blind" enough to confuse different inputs, by mapping them onto the same output, or they may ignore values of their codomain. For instance, c confuses all inputs by mapping them all onto c. Moreover, it ignores all values of its codomain apart from c.

Functions which do not confuse inputs are called monics (or injective functions) and obey the following property: f : A → B is monic if, for every pair of functions h, k : C → A, if f · h = f · k then h = k (f is "cancellable on the left").

It is easy to check that "the" identity function is monic,

  id · h = id · k ⇒ h = k
≡    { by (2.10) }
  h = k ⇒ h = k
≡    { predicate logic }
  TRUE

and that any constant function c is not monic:

  c · h = c · k ⇒ h = k
≡    { by (2.15) }
  c = c ⇒ h = k
≡    { function equality is reflexive }
  TRUE ⇒ h = k
≡    { predicate logic }
  h = k

So the implication does not hold in general (only if h = k).

Functions which do not ignore values of their codomain are called epics (or surjective functions) and obey the following property: f : B → A is epic if, for every pair of


functions h, k : A → C, if h · f = k · f then h = k (f is "cancellable on the right").

As expected, identity functions are epic:

  h · id = k · id ⇒ h = k
≡    { by (2.10) }
  h = k ⇒ h = k
≡    { predicate logic }
  TRUE

Exercise 2.2. Under what circumstances is a constant function epic? Justify.
✷

2.7 Isos

A function f : A → B which is both monic and epic is said to be iso (an isomorphism, or a bijective function). In this situation, f always has a converse (or inverse) f° : B → A, which is such that

  f · f° = id_B  ∧  f° · f = id_A    (2.16)

(i.e. f is invertible).

Isomorphisms are very important functions because they convert data from one "format", say A, to another format, say B, without losing information. So f and f° are faithful protocols between the two formats A and B. Of course, these formats contain the same "amount" of information, although the same data adopts a different "shape" in each of them. In mathematics, one says that A is isomorphic to B, and one writes A ≅ B to express this fact.

Isomorphic data domains are regarded as "abstractly" the same. Note that, in general, there is a wide range of isos between two isomorphic data domains. For instance, let


Weekday be the set of weekdays,

  Weekday = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}

and let symbol 7 denote the set {1, 2, 3, 4, 5, 6, 7}, which is the initial segment of IN containing exactly seven elements. The following function f, which associates each weekday with its "ordinal" number,

  f : Weekday → 7
  f Monday    = 1
  f Tuesday   = 2
  f Wednesday = 3
  f Thursday  = 4
  f Friday    = 5
  f Saturday  = 6
  f Sunday    = 7

is iso (guess f°). Clearly, f d = i means "d is the i-th day of the week". But note that function g d ≝ rem(f d, 7) + 1 is also an iso between Weekday and 7. While f regards Monday as the first day of the week, g places Sunday in that position. Both f and g are witnesses of isomorphism

  Weekday ≅ 7    (2.17)

Finally, note that all classes of functions referred to so far — constants, identities, epics, monics and isos — are closed under composition, that is, the composition of two constants is a constant, the composition of two epics is epic, etc.

2.8 Gluing functions which do not compose — products

Function composition has been presented above as the basis for gluing functions together in order to build more complex functions. However, not every two functions can be glued together by composition. For instance, functions f : C → A and g : C → B do not compose with each other because the domain of one of them is not the codomain of the other. However, both f and g share the same domain C. So, something we can do


about gluing f and g together is to draw a diagram expressing this fact, something like

  A ←f− C −g→ B

Because f and g share the same domain, their outputs can be paired, that is, we may write the ordered pair (f c, g c) for each c ∈ C. Such pairs belong to the Cartesian product of A and B, that is, to the set

  A × B ≝ {(a, b) | a ∈ A ∧ b ∈ B}

So we may think of the operation which pairs the outputs of f and g as a new function combinator 〈f,g〉 defined as follows:

  〈f,g〉 : C → A × B
  〈f,g〉 c ≝ (f c, g c)    (2.18)

Function combinator 〈f,g〉 is pronounced "f split g" (or "pair f and g") and can be depicted by the following "block", or "data flow", diagram:

  [diagram: input c fans out to boxes f and g, producing outputs f c and g c]

Function 〈f,g〉 keeps the information of both f and g in the same way Cartesian product A × B keeps the information of A and B. So, in the same way A-data or B-data can be retrieved from A × B data via the implicit projections π1 and π2,

  A ←π1− A × B −π2→ B    (2.19)

defined by

  π1 (a, b) = a  and  π2 (a, b) = b

f and g can be retrieved from 〈f,g〉 via the same projections:

  π1 · 〈f,g〉 = f  and  π2 · 〈f,g〉 = g    (2.20)
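The split combinator is easily coded; a minimal sketch, with sample functions of our own choosing and a pointwise check of ×-cancellation (2.20), follows (fst and snd play the rôle of π1 and π2):

```haskell
-- Sketch of the split combinator (2.18); `split` is also the spelling
-- adopted later by the Set.hs module accompanying the text.
split :: (c -> a) -> (c -> b) -> c -> (a, b)
split f g c = (f c, g c)

f :: Int -> Int
f = (+ 1)

g :: Int -> Bool
g = even

main :: IO ()
main = print (all ok [0 .. 10])
  where ok c = fst (split f g c) == f c   -- π1 · ⟨f,g⟩ = f
            && snd (split f g c) == g c   -- π2 · ⟨f,g⟩ = g
```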


This fact (or pair of facts) will be referred to as the ×-cancellation property and is illustrated in the following diagram, which puts things together:

  [diagram (2.21): f and g from C into A and B, factored as 〈f,g〉 : C → A × B followed by π1 and π2]

In summary, the type-rule associated to the "split" combinator is expressed by

    f : C → A        g : C → B
  ───────────────────────────────
       〈f,g〉 : C → A × B

A split arises wherever two functions do not compose but share the same domain. What about gluing two functions which fail such a requisite, e.g.

  f : C → A  and  g : D → B  ...?

The 〈f,g〉 split combination does not work any more. But a way to "approach" the domains of f and g, C and D respectively, is to regard them as targets of the projections π1 and π2 of C × D:

  [diagram: f : C → A and g : D → B pre-composed with π1 : C × D → C and π2 : C × D → D]

From this diagram, 〈f · π1, g · π2〉 arises,

  [diagram: 〈f · π1, g · π2〉 : C × D → A × B]

mapping C × D to A × B. It corresponds to the "parallel" application of f and g which is suggested by the following data-flow diagram:


  [diagram: inputs c and d feed boxes f and g in parallel, producing outputs f c and g d]

The functional combination 〈f · π1, g · π2〉 appears very often and deserves special notation — it will be expressed by f × g. So, by definition, we have

  f × g ≝ 〈f · π1, g · π2〉    (2.22)

which is pronounced "product of f and g" and has typing-rule

    f : C → A        g : D → B
  ───────────────────────────────
      f × g : C × D → A × B        (2.23)

Note the overloading of symbol "×", which is used to denote both Cartesian product and functional product. This choice of notation will be fully justified later on.

What is the interplay among the functional combinators f · g (composition), 〈f,g〉 (split) and f × g (product)? Composition and split relate to each other via the following property, known as ×-fusion:

  〈g,h〉 · f = 〈g · f, h · f〉    (2.24)

This shows that split is right-distributive with respect to composition. Left-distributivity does not hold, but there is something we can say about f · 〈g,h〉 in case f = i × j:

  (i × j) · 〈g,h〉
=    { by (2.22) }
  〈i · π1, j · π2〉 · 〈g,h〉
=    { by ×-fusion (2.24) }


  〈(i · π1) · 〈g,h〉, (j · π2) · 〈g,h〉〉
=    { by (2.8) }
  〈i · (π1 · 〈g,h〉), j · (π2 · 〈g,h〉)〉
=    { by ×-cancellation (2.20) }
  〈i · g, j · h〉

The law we have just derived is known as ×-absorption. (The intuition behind this terminology is that "split absorbs ×", as a special kind of fusion.) It is a consequence of ×-fusion and ×-cancellation and is depicted as follows:

  (i × j) · 〈g,h〉 = 〈i · g, j · h〉    (2.25)

This diagram provides us with two further results about products and projections, which can be easily justified:

  i · π1 = π1 · (i × j)    (2.26)
  j · π2 = π2 · (i × j)    (2.27)

Two special properties of f × g are presented next. The first one expresses a kind of "bi-distribution" of × with respect to composition:

  (g · h) × (i · j) = (g × i) · (h × j)    (2.28)

We will refer to this property as the ×-functor property. The other property, which we will refer to as the ×-functor-id property, has to do with identity functions:

  id_A × id_B = id_{A×B}    (2.29)

These two properties will be identified as the functorial properties of product. This choice of terminology will be explained later on.

Let us finally analyse the particular situation in which a split is built involving projections π1 and π2 only. These exhibit interesting properties, for instance 〈π1,π2〉 = id. This property is known as ×-reflexion and is depicted as follows:

  〈π1,π2〉 = id_{A×B}    (2.30)
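The product combinator and the laws just stated can be checked pointwise over samples; a sketch (sample functions are our own, and `><` is the spelling later adopted by the text's Set.hs module):

```haskell
-- Sketch: f × g defined by (2.22), plus pointwise checks of
-- ×-fusion (2.24) and ×-absorption (2.25) on sample functions.
split :: (c -> a) -> (c -> b) -> c -> (a, b)
split f g c = (f c, g c)

(><) :: (c -> a) -> (d -> b) -> (c, d) -> (a, b)
f >< g = split (f . fst) (g . snd)      -- f × g ≝ ⟨f · π1, g · π2⟩

main :: IO ()
main = print (all ok [0 .. 10 :: Int])
  where
    -- ⟨g,h⟩ · f = ⟨g · f, h · f⟩
    fusion x = (split (* 2) (+ 3) . (+ 1)) x
            == split ((* 2) . (+ 1)) ((+ 3) . (+ 1)) x
    -- (i × j) · ⟨g,h⟩ = ⟨i · g, j · h⟩
    absorb x = (((* 2) >< (+ 3)) . split negate abs) x
            == split ((* 2) . negate) ((+ 3) . abs) x
    ok x = fusion x && absorb x
```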


What about 〈π2,π1〉? This corresponds to a diagram

  [diagram: 〈π2,π1〉 : A × B → B × A, with the projections exchanged]

which looks very much the same if submitted to a 180° rotation (thus A and B swap with each other). This suggests that swap (the name we adopt for 〈π2,π1〉) is its own inverse, as can be checked easily as follows:

  swap · swap
=    { by definition swap ≝ 〈π2,π1〉 }
  〈π2,π1〉 · swap
=    { by ×-fusion (2.24) }
  〈π2 · swap, π1 · swap〉
=    { definition of swap, twice }
  〈π2 · 〈π2,π1〉, π1 · 〈π2,π1〉〉
=    { by ×-cancellation (2.20) }
  〈π1,π2〉
=    { by ×-reflexion (2.30) }
  id

Therefore, swap is iso and establishes the following isomorphism,

  A × B ≅ B × A    (2.31)

which is known as the commutative property of product.

The "product datatype" A × B is essential to information processing and is available in virtually every programming language. In HASKELL one writes (A,B) to denote A × B, for A and B two predefined datatypes, fst to denote π1 and snd to denote π2. In the C programming language this datatype is called the "struct datatype",

struct {
    A first;
    B second;
};


while in PASCAL it is called the "record datatype":

record
    first: A;
    second: B
end;

Isomorphism (2.31) can be re-interpreted in this context as a guarantee that one does not lose (or gain) anything in swapping fields in record datatypes. C or PASCAL programmers know also that record-field nesting has the same status, that is to say that, for instance, datatype

record
    F: A;
    S: record
        F: B;
        S: C;
    end
end;

is abstractly the same as

record
    F: record
        F: A;
        S: B
    end;
    S: C;
end;

In fact, this is another well-known isomorphism, known as the associative property of product:

  A × (B × C) ≅ (A × B) × C    (2.32)

This is established by assocr : (A × B) × C → A × (B × C), which is pronounced "associate to the right" and is defined by

  assocr ≝ 〈π1 · π1, 〈π2 · π1, π2〉〉    (2.33)

Section A.0.1 in the appendix lists an extension to the HUGS Standard Prelude, called Set.hs, which makes isomorphisms such as swap and assocr available. In this module, the concrete syntax chosen for 〈f,g〉 is split f g and the one chosen for f × g is f >< g.

Exercise 2.3. Show that assocr is iso by conjecturing its inverse assocl and proving that functional equality assocr · assocl = id holds.
✷

Exercise 2.4. Use (2.22) to prove properties (2.28) and (2.29).
✷
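In Haskell the two isomorphisms just discussed can be written directly over pairs; a sketch, with a pointwise sanity check of swap's self-inverse property and of the assocl conjectured in exercise 2.3 (the definition of assocl is ours, as the exercise asks for):

```haskell
-- Sketch: swap = ⟨π2,π1⟩ and assocr (2.33), with a candidate inverse assocl.
swap :: (a, b) -> (b, a)
swap (a, b) = (b, a)

assocr :: ((a, b), c) -> (a, (b, c))
assocr ((a, b), c) = (a, (b, c))      -- ⟨π1 · π1, ⟨π2 · π1, π2⟩⟩, pointwise

assocl :: (a, (b, c)) -> ((a, b), c)
assocl (a, (b, c)) = ((a, b), c)

main :: IO ()
main = do
  print (swap (swap (1 :: Int, 'x')) == (1, 'x'))
  print (assocr (assocl (1 :: Int, (True, 'c'))) == (1, (True, 'c')))
```

The sample checks succeed, but only the calculation requested in exercise 2.3 proves the equality for all inputs.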


2.9 Gluing functions which do not compose — coproducts

The split functional combinator arose in the previous section as a kind of glue for combining two functions which do not compose but share the same domain. The "dual" situation of two non-composable functions f : A → C and g : B → C, which however share the same codomain, is depicted in

  A −f→ C ←g− B

It is clear that the kind of glue we need in this case should make it possible to apply f in case we are on the "A-side", or to apply g in case we are on the "B-side", of the diagram. Let us write [f,g] to denote the new kind of combinator. Its codomain will be C. What about its domain?

We need to describe the datatype which is "either an A or a B". Since A and B are sets, we may think of A ∪ B as such a datatype. This works in case A and B are disjoint sets, but wherever the intersection A ∩ B is non-empty it is undecidable whether a value x ∈ A ∩ B is an "A-value" or a "B-value". In the limit, if A = B then A ∪ B = A = B, that is to say, we have not invented a new datatype at all. These difficulties can be circumvented by resorting to disjoint union:

  A −i1→ A + B ←i2− B

The values of A + B can be thought of as "copies" of A or B values which are "stamped" with different tags in order to guarantee that values which are simultaneously in A and B do not get mixed up. The tagging functions i1 and i2 are called injections:

  i1 a = (t1, a) ,  i2 b = (t2, b)    (2.34)

Knowing the exact values of tags t1 and t2 is not essential to understanding the concept of a disjoint union. It suffices to know that i1 and i2 tag differently and consistently. For instance, the following realization of A + B in the C programming language,

struct {
    int tag; /* 1,2 */
    union {
        A ifA;
        B ifB;
    } data;
};
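In Haskell the disjoint union A + B is the predefined Either a b, whose constructors Left and Right play the rôle of the injections i1 and i2 (the constructors themselves are the tags). A brief sketch:

```haskell
-- Sketch: A + B as Either, with i1/i2 as the constructors Left/Right.
i1 :: a -> Either a b
i1 = Left

i2 :: b -> Either a b
i2 = Right

main :: IO ()
main = do
  print (i1 3    :: Either Int Bool)   -- an "A-value", tagged
  print (i2 True :: Either Int Bool)   -- a "B-value", tagged
  -- even when A = B, the two copies of the same value stay apart:
  print ((i1 3 :: Either Int Int) /= i2 3)
```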


+-cancellation :

  [diagram: i1 and i2 into A + B, factored through [g,h] : A + B → C]

  [g,h] · i1 = g ,  [g,h] · i2 = h    (2.38)

+-reflexion :

  [i1, i2] = id_{A+B}    (2.39)

+-fusion :

  f · [g,h] = [f · g, f · h]    (2.40)

+-absorption :

  [g,h] · (i + j) = [g · i, h · j]    (2.41)

+-functor :

  (g · h) + (i · j) = (g + i) · (h + j)    (2.42)

+-functor-id :

  id_A + id_B = id_{A+B}    (2.43)
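These coproduct laws can be exercised in Haskell, where [g,h] is the Prelude's either and the sum f + g can be coded as follows (the spelling -|- is our own choice here):

```haskell
-- Sketch: the sum combinator f + g over Either, plus pointwise checks of
-- +-cancellation (2.38) and +-absorption (2.41) on sample functions.
(-|-) :: (a -> c) -> (b -> d) -> Either a b -> Either c d
f -|- g = either (Left . f) (Right . g)

main :: IO ()
main = do
  -- [g,h] · i1 = g  and  [g,h] · i2 = h
  print (either negate fromEnum (Left 3 :: Either Int Bool) == negate 3)
  print (either negate fromEnum (Right True :: Either Int Bool) == fromEnum True)
  -- [g,h] · (i + j) = [g · i, h · j]
  print ((either negate fromEnum . ((+ 1) -|- not)) (Left 3 :: Either Int Bool)
           == (negate . (+ 1)) 3)
```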


In summary, the typing-rules of the either and sum combinators are as follows:

    f : A → C        g : B → C
  ───────────────────────────────
       [f,g] : A + B → C

    f : A → C        g : B → D
  ───────────────────────────────
      f + g : A + B → C + D        (2.44)

Exercise 2.5. By analogy (duality) with swap, show that [i2, i1] is its own inverse, so that fact

  A + B ≅ B + A    (2.45)

holds.
✷

Exercise 2.6. Dualize (2.33), that is, write the iso which witnesses fact

  A + (B + C) ≅ (A + B) + C    (2.46)

from right to left. Use the either syntax available from the HUGS Standard Prelude to encode this iso in HASKELL.
✷

2.10 Mixing products and coproducts

Datatype constructions A × B and A + B have been introduced above as devices required for expressing the codomain of splits (A × B) or the domain of eithers (A + B). Therefore, a function mapping values of a coproduct (say A + B) to values of a product (say A′ × B′) can be expressed alternatively as an either or as a split. In the first case, both components of the either combinator are splits. In the latter, both components of the split combinator are eithers.

This exchange of format in defining such functions is known as the exchange law. It states the functional equality which follows:

  [〈f,g〉, 〈h,k〉] = 〈[f,h], [g,k]〉    (2.47)


It can be checked by type-inference that both the left-hand side and the right-hand side expressions of this equality have type A + C → B × D, for f : A → B, g : A → D, h : C → B and k : C → D.

An example of a function which is in the exchange-law format is isomorphism

  undistr : (A × B) + (A × C) → A × (B + C)    (2.48)

(pronounce undistr as "un-distribute-right"), which is defined by

  undistr ≝ [id × i1, id × i2]    (2.49)

and witnesses the fact that product distributes through coproduct:

  A × (B + C) ≅ (A × B) + (A × C)    (2.50)

In this context, suppose that we know of three functions f : A → D, g : B → E and h : C → F. By (2.44) we infer g + h : B + C → E + F. Then, by (2.23), we infer

  f × (g + h) : A × (B + C) → D × (E + F)    (2.51)

So, it makes sense to combine products and sums of functions, and the expressions which denote such combinations have the same "shape" (or symbolic pattern) as the expressions which denote their domain and range — the … × (⋯ + ⋯) "shape" in this example. In fact, if we abstract such a pattern via some symbol, say F — that is, if we define

  F(α, β, γ) ≝ α × (β + γ)

— then we can write F(f,g,h) : F(A,B,C) → F(D,E,F) for (2.51).

This kind of abstraction works for every combination of products and coproducts. For instance, if we now abstract the right-hand side of (2.48) via pattern

  G(α, β, γ) ≝ (α × β) + (α × γ)

we have G(f,g,h) = (f × g) + (f × h), a function which maps G(A,B,C) = (A × B) + (A × C) onto G(D,E,F) = (D × E) + (D × F). All this can be put in a diagram

  [diagram: undistr : G(A,B,C) → F(A,B,C), with F(f,g,h) and G(f,g,h) as vertical arrows]


which unfolds to

  [diagram (2.52): undistr : (A × B) + (A × C) → A × (B + C) at the top, with f × (g + h) and (f × g) + (f × h) as vertical arrows into D × (E + F) and (D × E) + (D × F)]

once the F and G patterns are instantiated. An interesting topic which stems from (completing) this diagram will be discussed in the next section.

Exercise 2.7. Apply the exchange law to undistr.
✷

Exercise 2.8. Complete the "?"s in diagram

  [diagram: arrows [x,y] and [k,g] between unknown objects "?", with id + id × f in between]

and then solve the implicit equation for x and y.
✷

Exercise 2.9. Repeat exercise 2.8 with respect to diagram

  [diagram: arrows h + 〈i,j〉 and x + y between unknown objects "?", with id + id × f]
✷
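The undistr isomorphism discussed above has a direct Haskell rendering, with × as pairing and + as Either; a sketch (the `><` spelling follows the text's Set.hs):

```haskell
-- Sketch of undistr (2.49): [ id × i1 , id × i2 ], with Left/Right as i1/i2.
(><) :: (c -> a) -> (d -> b) -> (c, d) -> (a, b)
f >< g = \(c, d) -> (f c, g d)

undistr :: Either (a, b) (a, c) -> (a, Either b c)
undistr = either (id >< Left) (id >< Right)

main :: IO ()
main = do
  print (undistr (Left  (1 :: Int, 'x')))   -- (1,Left 'x')
  print (undistr (Right (1 :: Int, True)))  -- (1,Right True)
```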


2.11 Natural properties

Let us resume the discussion about undistr and the two other functions in diagram (2.52). What about using undistr itself to close this diagram, at the bottom? Note that definition (2.49) works for D, E and F in the same way it does for A, B and C. (Indeed, the particular choice of symbols A, B and C in (2.48) was rather arbitrary.) Therefore, we get:

  [diagram: the square of (2.52) closed at the bottom by a second undistr arrow]

which expresses a very important property of undistr:

  (f × (g + h)) · undistr = undistr · ((f × g) + (f × h))    (2.53)

This is called the natural property of undistr. This kind of property (often called free instead of natural) is not a privilege of undistr. As a matter of fact, every function interfacing patterns such as F or G above will exhibit its own natural property. Furthermore, we have already quoted natural properties without mentioning it. Recall (2.10), for instance. This property (establishing id as the unit of composition) is, after all, the natural property of id. In this case we have F α = G α = α, as can be easily observed in diagram (2.11).

In general, natural properties are described by diagrams in which two "copies" of the operator of interest are drawn as horizontal arrows:

  [diagram (2.54): φ : G A → F A and φ : G B → F B as horizontal arrows, with G f and F f as vertical arrows]

  (F f) · φ = φ · (G f)    (2.54)

Note that f is universally quantified, that is to say, the natural property holds for every f : A → B.

Diagram (2.54) corresponds to unary patterns F and G. As we have seen with undistr, other functions (g, h, etc.) come into play for multiary patterns. A very important rôle will be assigned throughout this book to these F, G, etc. "shapes" or patterns, which are shared by pointfree functional expressions and by their domain and codomain expressions.
From chapter 3 onwards we will refer to them by their proper name — "functor" — which is standard in mathematics and computer science. Then we will also explain the names assigned to properties such as, for instance, (2.28) or (2.42).


Exercise 2.10. Show that (2.26) and (2.27) are natural properties. Dualize these properties. Hint: recall diagram (2.41).
✷

Exercise 2.11. Establish the natural properties of the swap (2.31) and assocr (2.33) isomorphisms.
✷

2.12 Universal properties

Functional constructs 〈f,g〉 and [f,g] (and their derivatives f × g and f + g) provide a good illustration of what is meant by a program combinator in a compositional approach to programming: the combinator is put forward equipped with a concise set of properties which enable programmers to transform programs, reason about them and perform useful calculations. This raises a programming methodology which is scientific and stable.

Such properties bear standard names such as cancellation, reflexion, fusion, absorption, etc. Where do these come from? As a rule, for each combinator to be defined one has to define suitable constructions at "interface"-level², e.g. A × B and A + B. These are not chosen or invented at random: each is defined in a way such that the associated combinator is uniquely defined. This is assured by a so-called universal property, from which the others can be derived.

² In the current context, programs "are" functions and program-interfaces "are" the datatypes involved in functional signatures.

Take product A × B, for instance. Its universal property states that, for each pair of arrows f : C → A and g : C → B, there exists an arrow 〈f,g〉 : C → A × B such that

  k = 〈f,g〉  ⇔  π1 · k = f  ∧  π2 · k = g    (2.55)

holds — recall diagram (2.21) — for all k : C → A × B. This equivalence states that 〈f,g〉 is the unique arrow satisfying the property on the right. In fact, read (2.55) in the ⇒ direction and let k be 〈f,g〉. Then π1 · 〈f,g〉 = f and π2 · 〈f,g〉 = g will hold, meaning that 〈f,g〉 effectively obeys the property on the right. In other words, we have derived


×-cancellation (2.20). Reading (2.55) in the ⇐ direction, we understand that, if some k satisfies such properties, then it "has to be" the same arrow as 〈f,g〉.

It is easy to see other properties of 〈f,g〉 arising from (2.55). For instance, for k = id we get ×-reflexion (2.30):

  id = 〈f,g〉 ⇔ π1 · id = f ∧ π2 · id = g
≡    { by (2.10) }
  id = 〈f,g〉 ⇔ π1 = f ∧ π2 = g
≡    { by substitution of f and g }
  id = 〈π1,π2〉

and for k = 〈i,j〉 · h we get ×-fusion (2.24):

  〈i,j〉 · h = 〈f,g〉 ⇔ π1 · (〈i,j〉 · h) = f ∧ π2 · (〈i,j〉 · h) = g
≡    { composition is associative (2.8) }
  〈i,j〉 · h = 〈f,g〉 ⇔ (π1 · 〈i,j〉) · h = f ∧ (π2 · 〈i,j〉) · h = g
≡    { by ×-cancellation (just derived) }
  〈i,j〉 · h = 〈f,g〉 ⇔ i · h = f ∧ j · h = g
≡    { by substitution of f and g }
  〈i,j〉 · h = 〈i · h, j · h〉

It will take about the same effort to derive split structural equality

  〈i,j〉 = 〈f,g〉 ⇔ i = f ∧ j = g    (2.56)

from universal property (2.55) — just let k = 〈i,j〉.

Similar arguments can be built around the coproduct's universal property,

  k = [f,g] ⇔ k · i1 = f ∧ k · i2 = g    (2.57)


from which structural equality of eithers can be inferred,

  [i,j] = [f,g] ⇔ i = f ∧ j = g    (2.58)

as well as the other properties we know about this combinator.

Exercise 2.12. Derive +-cancellation (2.38), +-reflexion (2.39) and +-fusion (2.40) from universal property (2.57). Then derive the exchange law (2.47) from the universal property of product (2.55) or coproduct (2.57).
✷

2.13 Guards and McCarthy's conditional

Most functional programming languages and notations cater for pointwise conditional expressions of the form

  if (p x) then (g x) else (h x)

meaning

  p x    ⇒ g x
  ¬(p x) ⇒ h x

for some given predicate p : A → Bool, some "then"-function g : A → B and some "else"-function h : A → B. Bool is the primitive datatype containing truth values FALSE and TRUE.

Can such expressions be written in the pointfree style? They can, provided we introduce the so-called "McCarthy conditional" functional form

  p → g, h

which is defined by

  p → g, h ≝ [g,h] · p?    (2.59)

a definition we can understand provided we know the meaning of the "p?" construct. We call p? : A → A + A a guard — or, better, the guard associated to a given predicate


p : A → Bool. Every predicate p gives birth to its own guard p?, which, at point-level, is defined as follows:

  (p?) a = i1 a,  if p a
  (p?) a = i2 a,  if ¬(p a)    (2.60)

In a sense, guard p? is more "informative" than p alone: it provides information about the outcome of testing p on some input a, encoded in terms of the coproduct injections (i1 for a true outcome and i2 for a false outcome, respectively), without losing the input a itself.

The following fact, which we will refer to as McCarthy's conditional fusion law, is a consequence of +-fusion (2.40):

  f · (p → g, h) = p → f · g, f · h    (2.61)

We shall introduce and define instances of predicate p as long as they are needed. A particularly important assumption of our notation should, however, be mentioned at this point: we assume that, for every datatype A, the equality predicate =_A : A × A → Bool is defined in a way which guarantees three basic properties: reflexivity (a =_A a for every a), transitivity (a =_A b and b =_A c implies a =_A c) and symmetry (a =_A b iff b =_A a). Subscript A in =_A will be dropped wherever implicit in the context.

In HASKELL programming, the equality predicate for a type becomes available by declaring the type as an instance of class Eq, which exports equality predicate (==). This does not, however, guarantee the reflexivity, transitivity and symmetry properties, which need to be proved by dedicated mathematical arguments.

Exercise 2.13. Prove that the following equality between two conditional expressions,

  k (if p x then f x else h x, if p x then g x else i x)
    = if p x then k (f x, g x) else k (h x, i x)

holds by rewriting it in the pointfree style (using the McCarthy conditional combinator) and applying the exchange law (2.47), among others.
✷

Exercise 2.14. Prove law (2.61).
✷
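The guard and the McCarthy conditional are easily coded over Either; a sketch, in which grd and cond are our own names for p? and p → g,h:

```haskell
-- Sketch: the guard p? (2.60) and McCarthy's conditional (2.59).
grd :: (a -> Bool) -> a -> Either a a
grd p a = if p a then Left a else Right a    -- i1 on true, i2 on false

cond :: (a -> Bool) -> (a -> b) -> (a -> b) -> a -> b
cond p g h = either g h . grd p              -- p → g,h  ≝  [ g , h ] · p?

main :: IO ()
main = print (map (cond even (`div` 2) (* 3)) [1 .. 6])  -- [3,1,9,2,15,3]
```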


Exercise 2.15. From (2.59) and property

  p? · f = (f + f) · (p · f)?    (2.62)

infer

  (p → f, g) · h = (p · h) → (f · h), (g · h)    (2.63)
✷

2.14 Gluing functions which do not compose — exponentials

Now that we have made the distinction between the pointfree and pointwise functional notations reasonably clear, it is instructive to revisit section 2.2 and identify functional application as the "bridge" between the pointfree and pointwise worlds. However, we should say "a bridge" rather than "the bridge", for in this section we enrich such an interface with another "bridge" which is very relevant to programming.

Suppose we are given the task to combine two functions f : C × A → B and g : A → D. It is clear that none of the combinations f · g, 〈f,g〉 or [f,g] is well-typed. So, f and g cannot be put together directly — they require some extra interfacing.

Note that 〈f,g〉 would be well-defined in case the C component of f's domain could be somehow "ignored". Suppose, in fact, that in some particular context the first argument of f happens to be "irrelevant", or to be frozen to some c ∈ C. It is easy to derive a new function

  f_c : A → B
  f_c a ≝ f(c, a)

from f which combines nicely with g via the split combinator: 〈f_c, g〉 is well-defined and bears type A → B × D. For instance, suppose that C = A and f is the equality predicate = on A. Then =_c : A → Bool is the "equal to c" predicate on A-values:

  =_c a ≝ a = c    (2.64)
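In Haskell, "freezing" the first argument of a binary function is ordinary partial application of its curried form (the Prelude's curry, which the text introduces formally further on); f and the frozen value are sample choices of our own:

```haskell
-- Sketch: f_c a = f (c, a), obtained by partially applying curry f to c.
f :: (Int, Int) -> Int
f (c, a) = 10 * c + a

fc :: Int -> Int       -- f_c for the frozen value c = 3
fc = curry f 3

main :: IO ()
main = print (map fc [0 .. 4])   -- [30,31,32,33,34]
```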


As another example, recall function twice (2.3), which could be defined as ×_2 using the new notation.

However, we need to be more careful about what is meant by f_c. Like functional application, the expression f_c interfaces the pointfree and the pointwise levels — it involves a function (f) and a value (c). But, for f : C × A → B, there is a major distinction between f c and f_c — while the former denotes a value of type B, i.e. f c ∈ B, f_c denotes a function of type A → B. We will say that f_c ∈ B^A, by introducing a new datatype construct which we will refer to as the exponential:

  B^A ≝ {g | g : A → B}    (2.65)

There are strong reasons to adopt the B^A notation to the detriment of the more obvious B ← A or A → B alternatives, as we shall see shortly.

The B^A exponential datatype is therefore inhabited by functions from A to B, that is to say, functional declaration g : A → B means the same as g ∈ B^A. And what do we want functions for? We want to apply them. So it is natural to introduce the apply operator

  ap : B^A × A → B
  ap(f, a) ≝ f a

which applies a function f to an argument a.

Back to the generic binary function f : C × A → B, let us now think of the operation which, for every c ∈ C, produces f_c ∈ B^A. This can be regarded as a function of signature C → B^A which expresses f as a kind of C-indexed family of functions of signature A → B. We will denote such a function by f̄ (read f̄ as "f transposed"). Intuitively, we want f and f̄ to be related to each other by the following property:

  f(c, a) = (f̄ c) a    (2.66)

Given c and a, both expressions denote the same value.
But, in a sense, f̄ is more tolerant than f: while the latter is binary and requires both arguments (c, a) to become available before application, the former is happy to be provided with c first and with a later on, if actually required by the evaluation process.

Similarly to A × B and A + B, exponential B^A involves a universal property,

  k = f̄ ⇔ f = ap · (k × id)    (2.67)

from which laws for cancellation, reflexion and fusion can be derived:


Exponentials cancellation :

  f = ap · (f̄ × id)    (2.68)

Exponentials reflexion :

  (ap)¯ = id_{B^A}    (2.69)

Exponentials fusion :

  (g · (f × id))¯ = ḡ · f    (2.70)

Note that the cancellation law is nothing but fact (2.66) written in the pointfree style.

Is there an absorption law for exponentials? The answer is affirmative, but first we need to introduce a new functional combinator, which arises as the transpose of f · ap in the following diagram:

  [diagram: ap : B^A × A → B followed by f : B → D, with (f · ap)¯ : B^A → D^A]

We shall denote this combinator by f^A, and its type-rule is as follows:

        f : B → C
  ─────────────────────
    f^A : B^A → C^A


It can be shown that, once A and f : B → C are fixed, f^A is the function which accepts some input function g : A → B as argument and produces function f · g as result (see exercise 2.23). So f^A is the "compose with f" functional combinator:

  (f^A) g ≝ f · g    (2.71)

Now we are ready to understand the laws which follow:

Exponentials absorption :

  (f · g)¯ = f^A · ḡ    (2.72)

Exponentials-functor :

  (g · h)^A = g^A · h^A    (2.73)

Exponentials-functor-id :

  id^A = id    (2.74)

To conclude this section, we need to explain why we have adopted the apparently esoteric B^A notation for the "function from A to B" datatype. Let us introduce the following operator,

  curry f ≝ f̄    (2.75)

which maps a function f to its transpose f̄. This operator, which is very familiar to functional programmers, maps functions in some function space B^{C×A} to functions in (B^A)^C. Its inverse (the uncurry function) also exists. In the HUGS Standard Prelude we find them declared as follows:

curry :: ((a,b) -> c) -> (a -> b -> c)
curry f x y = f (x,y)

uncurry :: (a -> b -> c) -> ((a,b) -> c)
uncurry f p = f (fst p) (snd p)
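The combinators of this section all have one-line Haskell renderings; a sketch, where expA is our own name for f^A and the sample function f is ours:

```haskell
-- Sketch: the transpose is Prelude's curry, ap is uncurried application,
-- and f^A (2.71) is just a section of function composition.
expA :: (b -> c) -> (a -> b) -> (a -> c)
expA f = (f .)                    -- (f^A) g = f · g

main :: IO ()
main = do
  let f (c, a) = 10 * c + a :: Int
  print (curry f 3 4 == f (3, 4))              -- cancellation, cf. (2.66)
  print (uncurry (curry f) (3, 4) == f (3, 4)) -- uncurry undoes curry
  print (expA (+ 1) (* 2) 5)                   -- ((+1) · (*2)) 5 = 11
```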


From (2.75) it is obvious that writing f̄ or curry f is a matter of taste, the latter being more in the tradition of functional programming. For instance, the fusion law (2.70) can be re-written as

  curry (g · (f × id)) = curry g · f

and so on.

It is known from mathematics that curry and uncurry are isos witnessing the following isomorphism, which is at the core of the theory of functional programming:

  B^{C×A} ≅ (B^A)^C    (2.76)

Fact (2.76) clearly resembles a well-known equality concerning numeric exponentials, b^{c×a} = (b^a)^c. But other known facts about numeric exponentials, e.g. a^{b+c} = a^b × a^c or (b × c)^a = b^a × c^a, find their counterpart in functional exponentials. The counterpart of the former,

  A^{B+C} ≅ A^B × A^C    (2.77)

arises from the uniqueness of the either combination: every pair of functions (f, g) ∈ A^B × A^C leads to a unique function [f,g] ∈ A^{B+C} and, vice-versa, every function in A^{B+C} is the either of some function in A^B and of another in A^C.

The functional-exponentials counterpart of the second fact about numeric exponentials above is

  (B × C)^A ≅ B^A × C^A    (2.78)

This can be justified by a similar argument concerning the uniqueness of the split combinator 〈f,g〉.

What about other facts valid for numeric exponentials, such as a^0 = 1 and 1^a = 1? We need to know what 0 and 1 mean as datatypes. Such elementary datatypes are presented in the section which follows.

Exercise 2.16. Load module Set.hs (cf. section A.0.1) into the HUGS interpreter and check the types assigned to the following functional expressions:

curry ap
\f -> ap . (f >< id)
uncurry . curry

Which of these is functionally equivalent to the uncurry function and why?
Which of these are functionally equivalent to identity functions? Justify.
✷


2.15. ELEMENTARY DATATYPES    43

2.15 Elementary datatypes

So far we have talked mostly about arbitrary datatypes represented by capital letters A, B, etc. (lowercase a, b, etc. in the HASKELL illustrations). We also mentioned IR, Bool and IN and, in particular, the fact that we can associate to each natural number n its initial segment n = {1, 2, ..., n}. We extend this to IN_0 by stating 0 = {} and, for n > 0, n + 1 = {n + 1} ∪ n.

Initial segments can be identified with enumerated types and are regarded as primitive datatypes in our notation. We adopt the convention that primitive datatypes are written in the sans serif font and so, strictly speaking, n is distinct from n: the latter denotes a natural number while the former denotes a datatype.

Datatype 0

Among such enumerated types, 0 is the smallest because it is empty. This is the Void datatype in HASKELL, which has no constructor at all. Datatype 0 (which we tend to write simply as 0) may not seem very "useful" in practice but it is of theoretical interest. For instance, it is easy to check that the following "obvious" properties hold:

    A + 0 ≅ A    (2.79)
    A × 0 ≅ 0    (2.80)

Datatype 1

Next in the sequence of initial segments we find 1, which is the singleton set {1}. How useful is this datatype? Note that every datatype A containing exactly one element is isomorphic to {1}, e.g. A = {NIL}, A = {0}, A = {1}, A = {FALSE}, etc. We represent this class of singleton types by 1.

Recall that isomorphic datatypes have the same expressive power and so are "abstractly identical". So, the actual choice of inhabitant for datatype 1 is irrelevant, and we can replace any particular singleton set by another without losing information. This is evident from the following relevant facts involving 1:

    A × 1 ≅ A    (2.81)
    A^0 ≅ 1    (2.82)

We can read (2.81) informally as follows: if the second component of a record ("struct") cannot change, then it is useless and can be ignored.
Selector π_1 is, in this context, an iso mapping the left-hand side of (2.81) to its right-hand side. Its inverse is ⟨id, c⟩, where c is a particular choice of inhabitant for datatype 1. Concerning (2.82), A^0 denotes the set of all functions from the empty set to some A. What does (2.82) mean? It simply tells


44    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

us that there is only one function in such a set — the empty function, mapping "no" value at all. This fact confirms our choice of notation once again (compare with a^0 = 1 in a numeric context).

Next, we may wonder about facts

    1^A ≅ 1    (2.83)
    A^1 ≅ A    (2.84)

which are the functional-exponentiation counterparts of 1^a = 1 and a^1 = a. Fact (2.83) is valid: it means that there is only one function mapping A to some singleton set {c} — the constant function c. There is no room for another function in 1^A because only c is available as output value. Fact (2.84) is also valid: all functions in A^1 are (single-valued) constant functions and there are as many constant functions in such a set as there are elements in A.

In summary, when referring to datatype 1 we will mean an arbitrary singleton type, and there is a unique iso (and its inverse) between two such singleton types. The HASKELL representative of 1 is datatype (), called the unit type, which contains exactly one constructor, also written (). It may seem confusing to denote the type and its unique inhabitant by the same symbol, but it is not, since HASKELL keeps track of types and constructors in separate symbol sets.

Finally, what can we say about 1 + A? Every function f : 1 + A → B observing this type is bound to be an either [ b_0, g ] for b_0 ∈ B and g : A → B. This is very similar to the handling of a pointer in C or PASCAL: we "pull a rope" and either we get nothing (1) or we get something useful of type B. In such a programming context "nothing" above means a predefined value NIL. This analogy supports our preference in the sequel for NIL as canonical inhabitant of datatype 1. In fact, we will refer to 1 + A (or A + 1) as the "pointer to A" datatype.
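The "pointer to A" construction can be sketched in HASKELL as follows; this is an encoding of ours (names toMaybe and fromM are hypothetical), with Either () a playing 1 + A:

```haskell
-- The iso between 1 + A (encoded as Either () a) and the optional-value type.
toMaybe :: Either () a -> Maybe a
toMaybe (Left ()) = Nothing   -- the NIL case: "nothing useful"
toMaybe (Right a) = Just a    -- "something useful of type A"

fromM :: Maybe a -> Either () a
fromM Nothing  = Left ()
fromM (Just a) = Right a

main :: IO ()
main =
  if map (toMaybe . fromM) [Nothing, Just 1, Just (2 :: Int)]
       == [Nothing, Just 1, Just 2]
     && map (fromM . toMaybe) [Left (), Right (3 :: Int)] == [Left (), Right 3]
    then putStrLn "1 + A and Maybe are isomorphic on samples"
    else error "not an iso"
```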
This corresponds to the Maybe type constructor of the HUGS Standard Prelude.

Datatype 2

Let us inspect the 1 + 1 instance of the "pointer" construction just mentioned above. Any observation f : 1 + 1 → B can be decomposed in two constant functions: f = [ b_1, b_2 ]. Now suppose that B = {b_1, b_2} (for b_1 ≠ b_2). Then 1 + 1 ≅ B will hold, for whatever choice of inhabitants b_1 and b_2. So we are in a situation similar to 1: we will use symbol 2 to represent the abstract class of all such Bs containing exactly two elements. Therefore, we can write:

    1 + 1 ≅ 2
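Choosing Bool as the representative of datatype 2, the isomorphism 1 + 1 ≅ 2 can be sketched as follows (an encoding of ours, not from the book, with Either () () playing 1 + 1):

```haskell
-- b1 = True and b2 = False are our chosen inhabitants of B.
toBool :: Either () () -> Bool
toBool = either (const True) (const False)   -- the either [b1, b2]

fromBool :: Bool -> Either () ()
fromBool b = if b then Left () else Right ()

main :: IO ()
main =
  if map (toBool . fromBool) [True, False] == [True, False]
     && map (fromBool . toBool) [Left (), Right ()] == [Left (), Right ()]
    then putStrLn "1 + 1 and Bool are isomorphic on samples"
    else error "not an iso"
```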


2.16. FINITARY PRODUCTS AND COPRODUCTS    45

Of course, Bool = {TRUE, FALSE} and initial segment 2 = {1, 2} are in this abstract class. In the sequel we will show some preference for the particular choice of inhabitants b_1 = TRUE and b_2 = FALSE, which enables us to use symbol 2 in places where Bool is expected.

Exercise 2.17. Relate HASKELL expressions

    either (split (const True) id) (split (const False) id)

and

    \f -> (f True, f False)

to the following isomorphisms involving generic elementary type 2:

    2 × A ≅ A + A    (2.85)
    A × A ≅ A^2    (2.86)

Apply the exchange law (2.47) to the first expression above.
✷

2.16 Finitary products and coproducts

In section 2.8 it was suggested that product could be regarded as the abstraction behind data-structuring primitives such as struct in C or record in PASCAL. Similarly, coproducts were suggested in section 2.9 as abstract counterparts of C unions or PASCAL variant records. For a finite A, exponential B^A could be realized as an array in any of these languages. These analogies are captured in table 2.1.

In the same way C structs and unions may contain finitely many entries, as may PASCAL (variant) records, product A × B extends to finitary product A_1 × ... × A_n, for n ∈ IN, also denoted by Π_{i=1}^{n} A_i, to which as many projections π_i are associated as the number n of factors involved. Of course, splits become n-ary as well:

    ⟨f_1, ..., f_n⟩ : B → A_1 × ... × A_n

for f_i : B → A_i, i = 1, n.

Dually, coproduct A + B is extensible to the finitary sum A_1 + ... + A_n, for n ∈ IN, also denoted by Σ_{j=1}^{n} A_j, to which as many injections i_j are assigned as the number n of terms involved. Similarly, eithers become n-ary:

    [ f_1, ..., f_n ] : A_1 + ... + A_n → B

for f_i : A_i → B, i = 1, n.


46    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

    Abstract notation | PASCAL                | C/C++                    | Description
    ------------------+-----------------------+--------------------------+-----------------
    A × B             | record                | struct {                 | Records
                      |   P: A;               |   A first;               |
                      |   S: B                |   B second;              |
                      | end;                  | };                       |
    ------------------+-----------------------+--------------------------+-----------------
    A + B             | record                | struct {                 | Variant records
                      |   case                |   int tag; /* 1,2 */     |
                      |     tag: integer of   |   union {                |
                      |       1: (P: A);      |     A ifA;               |
                      |       2: (S: B)       |     B ifB;               |
                      | end;                  |   } data;                |
                      |                       | };                       |
    ------------------+-----------------------+--------------------------+-----------------
    B^A               | array[A] of B         | B ...[A]                 | Arrays
    ------------------+-----------------------+--------------------------+-----------------
    1 + A             | ^A                    | A *...                   | Pointers

Table 2.1: Abstract notation versus programming language data-structures.

Datatype n

Next after 2, we may think of 3 as representing the abstract class of all datatypes containing exactly three elements. Generalizing, we may think of n as representing the abstract class of all datatypes containing exactly n elements. Of course, initial segment n will be in this abstract class. (Recall (2.17), for instance: both Weekday and 7 are abstractly represented by 7.) Therefore,

    n ≅ 1 + ··· + 1    (n times)

and

    A × ... × A (n times) ≅ A^n    (2.87)
    A + ... + A (n times) ≅ n × A    (2.88)

hold.

Exercise 2.18. On the basis of table 2.1, encode undistr (2.49) in C or PASCAL. Compare your code with the HASKELL pointfree and pointwise equivalents.
✷


2.17. INITIAL AND TERMINAL DATATYPES    47

2.17 Initial and terminal datatypes

All properties studied for binary splits and binary eithers extend to the finitary case. For the particular situation n = 1, we will have ⟨f⟩ = [ f ] = f and π_1 = i_1 = id, of course. For the particular situation n = 0, finitary products "degenerate" to 1 and finitary coproducts "degenerate" to 0. So diagrams (2.21) and (2.36) are reduced to

    C --⟨⟩--> 1        0 --[ ]--> C

The standard notation for the empty split ⟨⟩ is !_C, where subscript C can be omitted if implicit in the context. By the way, this is precisely the only function in 1^C, recall (2.83). Dually, the standard notation for the empty either [ ] is ?_C, where subscript C can also be omitted. By the way, this is precisely the only function in C^0, recall (2.82).

In summary, we may think of 0 and 1 as, in a sense, the "extremes" of the whole datatype spectrum. For this reason they are called initial and terminal, respectively. We conclude this subject with the presentation of their main properties which, as we have said, are instances of properties we have stated for products and coproducts.

Initial datatype reflexion:

    ?_0 = id_0    (2.89)

Initial datatype fusion (for any f : A → B):

    f · ?_A = ?_B    (2.90)

Terminal datatype reflexion:

    !_1 = id_1    (2.91)
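The unique arrows !_C : C → 1 and ?_C : 0 → C can be sketched in HASKELL as follows. This is an encoding of ours using modern GHC Haskell (the EmptyCase extension postdates the HUGS era assumed by the book), with () playing 1 and an empty data declaration playing 0:

```haskell
{-# LANGUAGE EmptyCase #-}

data Zero                -- datatype 0: no constructors at all

bang :: c -> ()          -- !_C, the empty split: the only function in 1^C
bang _ = ()

absurd :: Zero -> c      -- ?_C, the empty either: the only function in C^0
absurd z = case z of {}  -- nothing to match: 0 has no inhabitants

main :: IO ()
main = if map bang "abc" == [(), (), ()]
         then putStrLn "bang is constant ()"
         else error "impossible"
```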


48    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Terminal datatype fusion (for any f : B → A):

    !_A · f = !_B    (2.92)

Exercise 2.19. Particularize the exchange law (2.47) to empty products and empty coproducts, i.e. 1 and 0.
✷

2.18 Sums and products in HASKELL

We conclude this chapter with an analysis of the main primitive available in HASKELL for creating datatypes: the data declaration. Suppose we declare

    data CostumerId = P Int | CC Int

meaning to say that, for some company, a client is identified either by its passport number or by its credit card number, if any. What does this piece of syntax precisely mean?

If we enquire the HUGS interpreter about what it knows about CostumerId, the reply will contain the following information:

    Main> :i CostumerId
    -- type constructor
    data CostumerId

    -- constructors:
    P :: Int -> CostumerId
    CC :: Int -> CostumerId

In general, let A and B be two known datatypes. Via declaration

    data C = C1 A | C2 B    (2.93)
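Such a declaration packages the two constructors into a single arrow from A + B to C, together with an inverse that takes C-values apart again. The following is a sketch of ours, instantiating A = Int and B = String for concreteness:

```haskell
data C = C1 Int | C2 String deriving (Eq, Show)

alg :: Either Int String -> C    -- the arrow [C1, C2] built from the constructors
alg = either C1 C2

inv :: C -> Either Int String    -- its inverse, taking C-values apart
inv (C1 a) = Left a              -- inv . C1 = i1
inv (C2 b) = Right b             -- inv . C2 = i2

main :: IO ()
main =
  if map (alg . inv) [C1 7, C2 "hi"] == [C1 7, C2 "hi"]
     && map (inv . alg) [Left 7, Right "hi"] == [Left 7, Right "hi"]
    then putStrLn "alg and inv are mutually inverse on samples"
    else error "not inverse"
```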


50    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Let us use this in calculating the inverse inv of [ C1, C2 ]:

        inv · [ C1, C2 ] = id
    ≡        { by +-fusion (2.40) }
        [ inv · C1, inv · C2 ] = id
    ≡        { by +-reflexion (2.39) }
        [ inv · C1, inv · C2 ] = [ i_1, i_2 ]
    ≡        { either structural equality (2.58) }
        inv · C1 = i_1  ∧  inv · C2 = i_2

Therefore:

    inv : C → A + B
    inv (C1 a) = i_1 a
    inv (C2 b) = i_2 b

In summary, C1 is a "renaming" of injection i_1, C2 is a "renaming" of injection i_2, and C is a "renamed" replica of A + B:

    [ C1, C2 ] : A + B → C

[ C1, C2 ] is called the algebra of datatype C and its inverse inv is called the coalgebra of C. The algebra contains the constructors C1 and C2 of type C, that is, it is used to "build" C-values. In the opposite direction, coalgebra inv enables us to "destroy" or observe values of C:

    C ≅ A + B,  witnessed by coalgebra inv and algebra [ C1, C2 ]

Algebras/coalgebras also arise about product datatypes. For instance, suppose that one wishes to describe datatype Point, inhabited by pairs (x_0, y_0), (x_1, y_1), etc. of Cartesian coordinates of a given type, say A. Although A × A equipped with projections π_1, π_2 "is" such a datatype, one may be interested in a suitably named replica of A × A in which points are built explicitly by some constructor (say Point) and observed by dedicated selectors (say x and y), such that

    x · Point = π_1  and  y · Point = π_2    (2.94)
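Before looking at how the book encodes this, here is a sketch of ours of the situation captured by (2.94), instantiating A = Double and pairing the constructor with the split of the two selectors:

```haskell
data Point = Point { x :: Double, y :: Double } deriving (Eq, Show)

-- The constructor in uncurried form, i.e. the algebra A × A -> Point:
mkPoint :: (Double, Double) -> Point
mkPoint (a, b) = Point a b

-- The coalgebra <x, y>, observing a Point as a pair of coordinates:
coalg :: Point -> (Double, Double)
coalg p = (x p, y p)

main :: IO ()
main =
  if coalg (mkPoint (1.0, 2.0)) == (1.0, 2.0)           -- x.Point = pi1, y.Point = pi2
     && mkPoint (coalg (Point 3.0 4.0)) == Point 3.0 4.0
    then putStrLn "Point behaves as a named replica of A x A"
    else error "not inverse"
```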


2.19. EXERCISES    51

This gives rise to an algebra (Point) and a coalgebra (⟨x, y⟩) for datatype Point:

    Point ≅ A × A,  witnessed by coalgebra ⟨x, y⟩ and algebra Point

In HASKELL one writes

    data Point a = Point { x :: a, y :: a }

but be warned that HASKELL delivers Point in curried form:

    Point :: a -> a -> Point a

Finally, what is the "pointer"-equivalent in HASKELL? This corresponds to A = 1 in (2.93) and to the following HASKELL declaration:

    data C = C1 () | C2 B

Note that HASKELL allows for a more programming-oriented alternative in this case, in which the unit type () is eliminated:

    data C = C1 | C2 B

The difference is that here C1 denotes an inhabitant of C (and so a clause f (C1 a) = ... is rewritten to f C1 = ...), while above C1 denotes a (constant) function C1 : 1 → C. Isomorphism (2.84) helps in comparing these two alternative situations.

2.19 Exercises

Exercise 2.20. Let A and B be two disjoint datatypes, that is, A ∩ B = ∅ holds. Show that isomorphism

    A ∪ B ≅ A + B    (2.95)

holds. Hint: define i : A + B → A ∪ B as i = [ emb_A, emb_B ], for emb_A a = a and emb_B b = b, and find its inverse. By the way, why didn't we define i simply as i def= [ id_A, id_B ]?
✷


52    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Exercise 2.21. Let distr (read: 'distribute right') be the bijection which witnesses isomorphism A × (B + C) ≅ A × B + A × C. Fill in the "···" in the diagram which follows so that it describes bijection distl (read: 'distribute left'), which witnesses isomorphism (B + C) × A ≅ B × A + C × A:

    (B + C) × A --swap--> ··· --distr--> ··· --···--> B × A + C × A
         \________________________distl_________________________/

✷

Exercise 2.22. In the context of exercise 2.21, prove

    [ g, h ] × f = [ g × f, h × f ] · distl    (2.96)

knowing that

    f × [ g, h ] = [ f × g, f × h ] · distr

holds.
✷

Exercise 2.23. Show that \overline{f · ap} g = f · g holds, cf. (2.71).
✷

Exercise 2.24. Let const : C → C^A be the function of exercise 2.1, that is, const c = the everywhere-c constant function. Which fact is expressed by the following diagram featuring const?

    C --const--> C^A
    |             |
    f            f^A
    v             v
    B --const--> B^A

Write it at point-level and describe it by your own words.
✷


2.20. BIBLIOGRAPHY NOTES    53

Exercise 2.25. Establish the difference between the following two declarations in HASKELL,

    data D = D1 A | D2 B C
    data E = E1 A | E2 (B,C)

for A, B and C any three predefined types. Are D and E isomorphic? If so, can you specify and encode the corresponding isomorphism?
✷

2.20 Bibliography notes

A few decades ago John Backus read, in his Turing Award Lecture, a revolutionary paper [Bac78]. This paper proclaimed conventional command-oriented programming languages obsolete because of their inefficiency, arising from retaining, at a high level, the so-called "memory access bottleneck" of the underlying computation model — the well-known von Neumann architecture. Alternatively, the (at the time already mature) functional programming style was put forward for two main reasons. Firstly, because of its potential for concurrent and parallel computation. Secondly — and Backus' emphasis was really put on this — because of its strong algebraic basis.

Backus' algebra of (functional) programs was providential in alerting computer programmers that computer languages alone are insufficient, and that only languages which exhibit an algebra for reasoning about the objects they purport to describe will be useful in the long run.

The impact of Backus' first argument in the computing science and computer architecture communities was considerable, in particular if assessed in quality rather than quantity and in addition to the almost contemporary structured programming trend³.
By contrast, his second argument for changing computer programming was by and large ignored, and only the so-called algebra-of-programming research minorities pursued this direction. However, the advances in this area throughout the last two decades are impressive and can be fully appreciated by reading a textbook written relatively recently by Bird and de Moor

³ Even the C programming language and the UNIX operating system, with their implicit functional flavour, may be regarded as subtle outcomes of the "going functional" trend.


54    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

[BdM97]. A comprehensive review of the voluminous literature available in this area can also be found in this book.

Although the need for a pointfree algebra of programming was first identified by Backus, perhaps influenced by the growing popularity of Iverson's APL in the USA at that time, the idea of reasoning and using mathematics to transform programs is much older and can be traced to the times of McCarthy's work on the foundations of computer programming [McC63], of Floyd's work on program meaning [Flo67] and of Paterson and Hewitt's comparative schematology [PH70]. Work of the so-called program transformation school was already very expressive in the mid 1970s; see for instance reference [BD77].

The mathematics adequate for the effective integration of these related but independent lines of thought was provided by the categorial approach of Manes and Arbib, compiled in a textbook [MA86] which has very strongly influenced the last decade of 20th-century theoretical computer science.

A so-called MPC ("Mathematics of Program Construction") community has been among the most active in producing an integrated body of knowledge on the algebra of programming, which has found in functional programming an eloquent and paradigmatic medium. Functional programming has a tradition of absorbing fresh results from theoretical computer science, algebra and category theory. Languages such as HASKELL [Bir98] have been competing to integrate the most recent developments and therefore are excellent prototyping vehicles in courses on program calculation, as happens with this book.


Chapter 3

Recursion in the Pointfree Style

How useful from a programmer's point of view are the abstract concepts presented in the previous chapter? Recall that a table was presented — table 2.1 — which records an analogy between abstract type notation and the corresponding data-structures available in common, imperative languages.

This analogy is precisely our point of departure for extending the abstract notation towards a most important field of programming: recursion.

3.1 Motivation

Let us consider a very common data-structure in programming: "linked-lists". In PASCAL one will write

    L = ^N;
    N = record
          first: A;
          next: ^N
        end;

to specify such a data-structure L. This consists of a pointer to a node (N), where a node is a record structure which puts some predefined type A together with a pointer to another node, and so on. In the C programming language, every x ∈ L will be declared as

    L x;

in the context of datatype definition

    typedef struct N {
        A first;


56    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        struct N *next;
    } *L;

and so on.

What interests us in such "first year programming course" datatype declarations? Records and pointers have already been dealt with in table 2.1. So we can use this table to find the abstract version of datatype L, by replacing pointers by the "1 + ···" notation and records (structs) by the "... × ..." notation:

    { L = 1 + N
    { N = A × (1 + N)    (3.1)

We obtain a system of two equations on unknowns L and N, in which L's dependence on N can be removed by substitution:

        { L = 1 + N
        { N = A × (1 + N)
    ≡        { substituting L for 1 + N in the second equation }
        { L = 1 + N
        { N = A × L
    ≡        { substituting A × L for N in the first equation }
        { L = 1 + A × L
        { N = A × L

System (3.1) is thus equivalent to:

    { L = 1 + A × L
    { N = A × (1 + N)    (3.2)

Intuitively, L abstracts the "possibly empty" linked-list of elements of type A, while N abstracts the "non-empty" linked-list of elements of type A. Note that L and N are independent of each other, but also that each depends on itself. Can we solve these equations in a way such that we obtain "solutions" for L and N, in the same way we do with school equations such as, for instance,

    x = 1 + x/2 ?    (3.3)

Concerning this equation, let us recall how we would go about it in school mathematics:

        x = 1 + x/2
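System (3.1) transliterates directly into a pair of mutually recursive HASKELL declarations. The following sketch is ours (names and the instantiation A = Int are hypothetical), included only to make the two unknowns L and N concrete:

```haskell
-- L = 1 + N : a possibly empty linked list.
data L = Empty | NonEmpty N deriving (Eq, Show)

-- N = A × (1 + N), i.e. A × L after the substitution: a non-empty list.
data N = Node Int L deriving (Eq, Show)

-- The linked list 2 -> 1 -> NIL:
sample :: L
sample = NonEmpty (Node 2 (NonEmpty (Node 1 Empty)))

-- Converting to a native HASKELL list:
toList :: L -> [Int]
toList Empty                 = []
toList (NonEmpty (Node a l)) = a : toList l

main :: IO ()
main = if toList sample == [2, 1]
         then putStrLn "linked list 2->1->NIL converts to [2,1]"
         else error "mismatch"
```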


3.1. MOTIVATION    57

    ≡        { adding −x/2 to both sides of the equation }
        x − x/2 = 1 + x/2 − x/2
    ≡        { −x/2 cancels x/2 }
        x − x/2 = 1
    ≡        { multiplying both sides of the equation by 2 etc. }
        2 × x − x = 2
    ≡        { subtraction }
        x = 2

We very quickly get solution x = 2. However, many steps were omitted from the actual calculation. This unfolds into the longer sequence of more elementary steps which follows, in which notation a − b abbreviates a + (−b) and a/b abbreviates a × 1/b, for b ≠ 0:

        x = 1 + x/2
    ≡        { adding −x/2 to both sides of the equation }
        x − x/2 = (1 + x/2) − x/2
    ≡        { + is associative }
        x − x/2 = 1 + (x/2 − x/2)
    ≡        { −x/2 is the additive inverse of x/2 }
        x − x/2 = 1 + 0
    ≡        { 0 is the unit of addition }
        x − x/2 = 1
    ≡        { multiplying both sides of the equation by 2 }
        2 × (x − x/2) = 2 × 1
    ≡        { 1 is the unit of multiplication }
        2 × (x − x/2) = 2
    ≡        { multiplication distributes over addition }


58    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        2 × x − 2 × x/2 = 2
    ≡        { 2 cancels its inverse 1/2 }
        2 × x − 1 × x = 2
    ≡        { multiplication distributes over addition }
        (2 − 1) × x = 2
    ≡        { 2 − 1 = 1 and 1 is the unit of multiplication }
        x = 2

Back to (3.2), we would like to submit each of the equations, e.g.

    L = 1 + A × L    (3.4)

to a similar reasoning. Can we do it? The analogy which can be found between this equation and (3.3) goes beyond pattern similarity. From chapter 2 we know that many properties required in the reasoning above hold in the context of (3.4), provided the "=" sign is replaced by the "≅" sign, that of set-theoretical isomorphism. Recall that, for instance, + is associative (2.46), 0 is the unit of addition (2.79), 1 is the unit of multiplication (2.81), multiplication distributes over addition (2.50), etc. Moreover, the first step above assumed that addition is compatible (monotonic) with respect to equality,

    a = b    c = d
    --------------
     a + c = b + d

a fact which still holds when numeric equality gives place to isomorphism and numeric addition gives place to coproduct:

    A ≅ B    C ≅ D
    --------------
     A + C ≅ B + D

— recall (2.44) for isos f and g.

Unfortunately, the main steps in the reasoning above are concerned with two basic cancellation properties

    x + b = c  ≡  x = c − b
    x × b = c  ≡  x = c/b    (b ≠ 0)

which hold about numbers but do not hold about datatypes. In fact, neither products nor


3.1. MOTIVATION    59

coproducts have arbitrary inverses¹, and so we cannot "calculate by cancellation". How do we circumvent this limitation?

Just think of how we would have gone about (3.3) in case we didn't know about the cancellation properties: we would be bound to the substitution of 1 + x/2 for x, plus the other properties. By performing such a substitution over and over again we would obtain:

        x = 1 + x/2
    ≡        { substitution of 1 + x/2 for x, followed by simplification }
        x = 1 + (1 + x/2)/2 = 1 + 1/2 + x/4
    ≡        { the same as above }
        x = 1 + 1/2 + (1 + x/2)/4 = 1 + 1/2 + 1/4 + x/8
    ≡        { over and over again, n times }
        ···
    ≡        { simplification }
        x = Σ_{i=0}^{n} 1/2^i + x/2^(n+1)
    ≡        { sum of the first n terms of a geometric progression }
        x = (2 − 1/2^n) + x/2^(n+1)
    ≡        { let n → ∞ }
        x = (2 − 0) + 0
    ≡        { simplification }
        x = 2

Clearly, this is a much more complicated way of finding solution x = 2 for equation (3.3). But we would have loved it in case it were the only known way, and this is precisely what happens with respect to (3.4). In this case we have:

        L = 1 + A × L
    ≡        { substitution of 1 + A × L for L }

¹ The initial and terminal datatypes do have inverses — 0 is its own "additive inverse" and 1 is its own "multiplicative inverse" — but not all the others.


60    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        L = 1 + A × (1 + A × L)
    ≡        { distributive property (2.50) }
        L ≅ 1 + A × 1 + A × (A × L)
    ≡        { unit of product (2.81) and associativity of product (2.32) }
        L ≅ 1 + A + (A × A) × L
    ≡        { by (2.82), (2.84) and (2.87) }
        L ≅ A^0 + A^1 + A^2 × L
    ≡        { another substitution as above and similar simplifications }
        L ≅ A^0 + A^1 + A^2 + A^3 × L
    ≡        { after (n + 1)-many similar steps }
        L ≅ Σ_{i=0}^{n} A^i + A^(n+1) × L

Bearing a large n in mind, let us deliberately (but temporarily) ignore term A^(n+1) × L. Then L will be isomorphic to the sum of n-many contributions A^i,

    L ≅ Σ_{i=0}^{n} A^i

each of them consisting of i-long tuples, or sequences, of values of A. (Number i is said to be the length of any sequence in A^i.) Such sequences will be denoted by enumerating their elements between square brackets, for instance the empty sequence [ ], which is the only inhabitant in A^0, the two-element sequence [a_1, a_2], which belongs to A^2 provided a_1, a_2 ∈ A, and so on. Note that all such contributions are mutually disjoint, that is, A^i ∩ A^j = ∅ wherever i ≠ j. (In other words, a sequence of length i is never a sequence of length j, for i ≠ j.) If we join all contributions A^i into a single set, we obtain the set of all finite sequences on A, denoted by A* and defined as follows:

    A*  def=  ∪_{i≥0} A^i    (3.5)

The intuition behind taking the limit in the numeric calculation above was that term x/2^(n+1) was getting smaller and smaller as n went larger and larger and, "in the limit", it could be ignored. By analogy, taking a similar limit in the calculation just sketched above will mean that, for a "sufficiently large" n, the sequences in A^n are so long that it is very unlikely that we will ever use them! So, for n → ∞ we obtain

    L ≅ Σ_{i=0}^{∞} A^i


3.2. INTRODUCING INDUCTIVE DATATYPES    61

Because Σ_{i=0}^{∞} A^i is isomorphic to ∪_{i=0}^{∞} A^i (see exercise 2.20), we finally have:

    L ≅ A*

All in all, we have obtained A* as a solution to equation (3.4). In other words, datatype L is isomorphic to the datatype which contains all finite sequences of some predefined datatype A. This corresponds to the HASKELL [a] datatype, in general. Recall that we started from the "linked-list datatype" expressed in PASCAL or C. In fact, wherever the C programmer thinks of linked-lists, the HASKELL programmer will think of finite sequences.

But, what does equation (3.4) mean in fact? Is A* the only solution to this equation? Back to the numeric field, we know of equations which have more than one solution — for instance x = (x² + 3)/4, which admits two solutions, 1 and 3 —, which have no solution at all — for instance x = x + 1 —, or which admit an infinite number of solutions — for instance x = x.

We will address these topics in the next section about inductive datatypes and in chapter 7, where the formal semantics of recursion will be made explicit. This is where the "limit" constructions used informally in this section will be shown to make sense.

3.2 Introducing inductive datatypes

Datatype L as defined by (3.4) is said to be recursive because L "recurs" in the definition of L itself². From the discussion above, it is clear that set-theoretical equality "=" in this equation should give place to set-theoretical isomorphism ("≅"):

    L ≅ 1 + A × L    (3.6)

Which isomorphism in : 1 + A × L → L do we expect to witness (3.4)? This will depend on which particular solution to (3.4) we are thinking of. So far we have seen only one, A*.
By recalling the notion of algebra of a datatype (section 2.18), we may rephrase the question as: which algebra

    in : 1 + A × A* → A*

do we expect to witness the tautology which arises from (3.4) by replacing unknown L with solution A*, that is

    A* ≅ 1 + A × A* ?

² By analogy, we may regard (3.3) as a "recursive definition" of number 2.
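One possible encoding of this algebra and its inverse can be sketched in HASKELL over native lists, which here stand in for A*; the names inL and outL are ours:

```haskell
-- The list algebra in : 1 + A × A* -> A* and its inverse coalgebra,
-- with Either () (a, [a]) playing 1 + A × A*.
inL :: Either () (a, [a]) -> [a]
inL = either (const []) (uncurry (:))   -- the either [nil, cons]

outL :: [a] -> Either () (a, [a])       -- taking a sequence apart
outL []      = Left ()
outL (a : l) = Right (a, l)

main :: IO ()
main =
  if inL (outL [1, 2, 3 :: Int]) == [1, 2, 3]
     && outL (inL (Left ())) == (Left () :: Either () (Int, [Int]))
    then putStrLn "inL and outL are mutually inverse on samples"
    else error "not inverse"
```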


3.2. INTRODUCING INDUCTIVE DATATYPES    63

        [ [ ], cons · ⟨hd, tl⟩ ] · (=[ ] ?) = id
    ≡        { property of sequences: cons(hd s, tl s) = s }
        [ [ ], id ] · (=[ ] ?) = id
    ≡        { going pointwise (2.60) }
        { =[ ] a    ⇒ [ [ ], id ] (i_1 a) = a
        { ¬(=[ ] a) ⇒ [ [ ], id ] (i_2 a) = a
    ≡        { +-cancellation (2.38) }
        { =[ ] a    ⇒ [ ] a = a
        { ¬(=[ ] a) ⇒ id a = a
    ≡        { a = [ ] in one case and identity function (2.9) in the other }
        { a = [ ]    ⇒ a = a
        { ¬(a = [ ]) ⇒ a = a
    ≡        { property (p → f, f) = f holds }
        a = a

A comment on the particular choice of terminology above: symbol in suggests that we are going inside, or constructing (synthesizing), values of A*; symbol out suggests that we are going out, or destructing (analyzing), values of A*. We shall often resort to this duality in the sequel.

Are there more solutions to equation (3.6)? In trying to implement this equation, a HASKELL programmer could have written, after the declaration of type A, the following datatype declaration:

    data L = Nil () | Cons (A,L)

which, as we have seen in section 2.18, can be written simply as

    data L = Nil | Cons (A,L)    (3.13)

and generates the diagram

    1 --i_1--> 1 + A × L <--i_2-- A × L
        \          |in'          /
        Nil        v         Cons
          `------> L <-------'           (3.14)

leading to algebra in' = [ Nil, Cons ].


64    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

HASKELL seems to have generated another solution for the equation, which it calls L. To avoid the inevitable confusion between this symbol denoting the newly created datatype and symbol L in equation (3.6), which denotes a mathematical variable, let us use symbol T to denote the former (T stands for "type"). This can be coped with very simply by writing T instead of L in (3.13):

    data T = Nil | Cons (A,T)    (3.15)

In order to make T more explicit, we will write in_T instead of in'.

Some questions are on demand at this point. First of all, what is datatype T? What are its inhabitants? Next, is in_T : 1 + A × T → T an iso or not?

HASKELL will help us to answer these questions. Suppose that A is a primitive numeric datatype, and that we add deriving Show to (3.15) so that we can "see" the inhabitants of the T datatype. The information associated to T is thus:

    Main> :i T
    -- type constructor
    data T

    -- constructors:
    Nil :: T
    Cons :: (A,T) -> T

    -- instances:
    instance Show T
    instance Eval T

By typing Nil

    Main> Nil
    Nil :: T

we confirm that Nil is itself an inhabitant of T, and by typing Cons

    Main> Cons :: (A,T) -> T

we realize that Cons is not so (as expected), but it can be used to build such inhabitants, for instance:

    Main> Cons(1,Nil)
    Cons (1,Nil) :: T


3.2. INTRODUCING INDUCTIVE DATATYPES    65

or

    Main> Cons(2,Cons(1,Nil))
    Cons (2,Cons (1,Nil)) :: T

etc. We conclude that expressions involving Nil and Cons are inhabitants of type T. Are these the only ones? The answer is yes because, by design of the HASKELL language, the constructors of type T will remain fixed once its declaration is interpreted, that is, no further constructor can be added to T. Does in_T have an inverse? Yes, its inverse is coalgebra

    out_T : T → 1 + A × T
    out_T Nil = i_1 NIL
    out_T (Cons(a, l)) = i_2 (a, l)    (3.16)

which can be straightforwardly encoded in HASKELL using the Either realization of + (recall sections 2.9 and 2.18):

    outT :: T -> Either () (A,T)
    outT Nil = Left ()
    outT (Cons(a,l)) = Right(a,l)

In summary, isomorphism

    T ≅ 1 + A × T,  witnessed by out_T and in_T    (3.17)

holds, where datatype T is inhabited by symbolic expressions which we may visualize very conveniently as trees, for instance

        Cons
        /  \
       2   Cons
           /  \
          1   Nil

picturing expression Cons(2,Cons(1,Nil)). Nil is the empty tree and Cons may be


66    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

regarded as the operation which adds a new root and a new branch, say a, to a tree t:

    Cons(a, t)  =      Cons
                       /  \
                      a    t

The choice of symbols T, Nil and Cons was rather arbitrary in (3.15). Therefore, an alternative declaration such as, for instance,

    data U = Stop | Join (A,U)    (3.18)

would have been perfectly acceptable, generating another solution for the equation under algebra [ Stop, Join ]. It is easy to check that (3.18) is but a renaming of Nil to Stop and of Cons to Join. Therefore, both datatypes are isomorphic, or "abstractly the same".

Indeed, any other datatype X inductively defined by a constant and a binary constructor accepting A and X as parameters will be a solution to the equation. Because we are just renaming symbols in a consistent way, all such solutions are abstractly the same. All of them capture the abstract notion of a list of symbols.

We wrote "inductively" above because the set of all expressions (trees) which inhabit the type is defined by induction. Such types are called inductive and we shall have a lot more to say about them in chapter 7.

Exercise 3.1. Obviously,

    either (const []) (:)

does not work as a HASKELL realization of the mediating arrow in diagram (3.9). What do you need to write instead?
✷

3.3 Observing an inductive datatype

Suppose that one is asked to express a particular observation of an inductive type such as T (3.15), that is, a function f : T → B for some target type B. Suppose, for


instance, that A is IN0 (the set of all non-negative integers) and that we want to add all elements which occur in a T-list. Of course, we have to ensure that addition is available in IN0,

  add : IN0 × IN0 → IN0
  add(x,y) def= x + y

and that 0 ∈ IN0 is a value denoting "the addition of nothing". So constant arrow 0 : 1 → IN0 is available. Of course, add(0,x) = add(x,0) = x holds, for all x ∈ IN0. This property means that IN0, together with operator add and constant 0, forms a monoid, a very important algebraic structure in computing which will be exploited intensively later in this book. The following arrow "packaging" IN0, add and 0,

  [0, add] : 1 + IN0 × IN0 → IN0        (3.19)

is a convenient way to express such a structure. Combining this arrow with algebra

  in_T : 1 + IN0 × T → T        (3.20)

which defines T, and the function f we want to define, the target of which is B = IN0, we get the almost closed diagram which follows, in which only the dashed arrow is yet to be filled in:

          in_T
  T <----------- 1 + IN0 × T        (3.21)
  |                   :
  f                   :
  v                   v
  IN0 <--------- 1 + IN0 × IN0
       [0, add]

We know that in_T = [Nil, Cons]. A pattern for the missing arrow is not difficult to guess: in the same way f bridges T and IN0 on the lefthand side, it will do the same job on the righthand side. So pattern ··· + ··· × f comes to mind (recall section 2.10), where the "···" are very naturally filled in by identity functions. All in all, we obtain diagram

       [Nil, Cons]
  T <-------------- 1 + IN0 × T        (3.22)
  |                     |
  f                     | id + id × f
  v                     v
  IN0 <------------ 1 + IN0 × IN0
       [0, add]

which pictures the following property of f

  f · [Nil, Cons] = [0, add] · (id + id × f)        (3.23)


and is easy to convert to pointwise notation:

  f · [Nil, Cons] = [0, add] · (id + id × f)
≡    { (2.40) on the lefthand side, (2.41) and identity id on the righthand side }
  [f · Nil, f · Cons] = [0, add · (id × f)]
≡    { either structural equality (2.58) }
  f · Nil = 0
  f · Cons = add · (id × f)
≡    { going pointwise }
  (f · Nil) x = 0 x
  (f · Cons)(a,x) = (add · (id × f))(a,x)
≡    { composition (2.6), constant (2.12), product (2.22) and definition of add }
  f Nil = 0
  f (Cons(a,x)) = a + f x

Note that we could have used out_T in diagram (3.21),

        out_T
  T -----------> 1 + IN0 × T        (3.24)
  |                   |
  f                   | id + id × f
  v                   v
  IN0 <--------- 1 + IN0 × IN0
       [0, add]

obtaining another version of the definition of f,

  f = [0, add] · (id + id × f) · out_T        (3.25)

which would lead to exactly the same pointwise recursive definition:

  f = [0, add] · (id + id × f) · out_T
≡    { (2.41) and identity id on the righthand side }
  f = [0, add · (id × f)] · out_T
≡    { going pointwise on out_T (3.16) }
  f Nil = ([0, add · (id × f)] · out_T) Nil
  f (Cons(a,x)) = ([0, add · (id × f)] · out_T)(Cons(a,x))
≡    { definition of out_T (3.16) }


  f Nil = ([0, add · (id × f)] · i1) Nil
  f (Cons(a,x)) = ([0, add · (id × f)] · i2)(a,x)
≡    { +-cancellation (2.38) }
  f Nil = 0 Nil
  f (Cons(a,x)) = (add · (id × f))(a,x)
≡    { simplification }
  f Nil = 0
  f (Cons(a,x)) = a + f x

Pointwise f mirrors the structure of type T in having as many definition clauses as constructors in T. Such functions are said to be defined by induction on the structure of their input type. If we repeat this calculation for IN0* instead of T, that is, for

  out = (! + 〈hd, tl〉) · (=[ ] ?)

— recall (3.10) — taking the place of out_T, we get a "more algorithmic" version of f:

  f = [0, add] · (id + id × f) · (! + 〈hd, tl〉) · (=[ ] ?)
≡    { +-functor (2.42), identity and ×-absorption (2.25) }
  f = [0, add] · (! + 〈hd, f · tl〉) · (=[ ] ?)
≡    { +-absorption (2.41) and constant 0 }
  f = [0, add · 〈hd, f · tl〉] · (=[ ] ?)
≡    { going pointwise on guard =[ ] ? (2.60) and simplifying }
  f l = l = [ ]    ⇒ 0 l
        ¬(l = [ ]) ⇒ (add · 〈hd, f · tl〉) l
≡    { simplification }
  f l = l = [ ]    ⇒ 0
        ¬(l = [ ]) ⇒ hd l + f (tl l)

The outcome of this calculation can be encoded in HASKELL syntax as

  f l | l == [] = 0
      | otherwise = head l + f (tail l)

or


  f l = if l == []
        then 0
        else head l + f (tail l)

both requiring the equality predicate "==" and destructors "head" and "tail".

3.4 Synthesizing an inductive datatype

The issue which concerns us in this section dualizes what we have just dealt with: instead of analyzing or observing an inductive type such as T (3.15), we want to be able to synthesize (generate) particular inhabitants of T. In other words, we want to be able to specify functions with signature f : B → T for some given source type B. Let B = IN0 and suppose we want f to generate, for a given natural number n > 0, the list containing all numbers less than or equal to n in decreasing order

  Cons(n, Cons(n − 1, Cons(..., Nil)))

or the empty list Nil, in case n = 0.

Let us try and draw a diagram similar to (3.24) applicable to the new situation. In trying to "re-use" this diagram, it is immediate that arrow f should be reversed. Bearing duality in mind, we may feel tempted to reverse all arrows just to see what happens. Identity functions are their own inverses, and in_T takes the place of out_T:

          in_T
  T <----------- 1 + IN0 × T
  ^                   ^
  f                   | id + id × f
  |                   |
  IN0 ...........> 1 + IN0 × IN0

Interestingly enough, the bottom arrow is the one which is not obvious to reverse, meaning that we have to "invent" a particular destructor of IN0, say

  g : IN0 → 1 + IN0 × IN0

fitting in the diagram and generating the particular computational effect we have in mind. Once we do this, a recursive definition for f will pop out immediately,

  f = in_T · (id + id × f) · g        (3.26)

which is equivalent to:

  f = [Nil, Cons · (id × f)] · g        (3.27)


Because we want f 0 = Nil to hold, g (the actual generator of the computation) should distinguish input 0 from all the others. One thus decomposes g as follows,

        =0 ?             ! + h
  IN0 --------> IN0 + IN0 --------> 1 + IN0 × IN0
   \_______________________________/
                  g

leaving h to fill in. This will be a split providing, on the lefthand side, for the value to be Cons'ed to the output and, on the righthand side, for the "seed" to the next recursive call. Since we want the output values to be produced contiguously and in decreasing order, we may define h = 〈id, pred〉 where, for n > 0,

  pred n def= n − 1        (3.28)

computes the predecessor of n. Altogether, we have synthesized

  g = (! + 〈id, pred〉) · (=0 ?)        (3.29)

Filling this in (3.27) we get

  f = [Nil, Cons · (id × f)] · (! + 〈id, pred〉) · (=0 ?)
≡    { +-absorption (2.41) followed by ×-absorption (2.25) etc. }
  f = [Nil, Cons · 〈id, f · pred〉] · (=0 ?)
≡    { going pointwise on guard =0 ? (2.60) and simplifying }
  f n = n = 0    ⇒ Nil
        ¬(n = 0) ⇒ Cons(n, f(n − 1))

which matches the function we had in mind:

  f n | n == 0 = Nil
      | otherwise = Cons(n, f(n-1))

We shall see briefly that the constructions of the f function adding up a list of numbers in the previous section and, in this section, of the f function generating a list of numbers are very standard in algorithm design and can be broadly generalized. Let us first introduce some standard terminology.

3.5 Introducing (list) catas, anas and hylos

Suppose that, back in section 3.3, we want to multiply, rather than add, the elements occurring in lists of type T (3.15). How much of the program synthesis effort presented there can be reused in the design of the new function?


It is intuitive that only the bottom arrow [0, add] : 1 + IN0 × IN0 → IN0 of diagram (3.24) needs to be replaced, because this is the only place where we can specify that target datatype IN0 is now regarded as the carrier of another (multiplicative rather than additive) monoidal structure,

  [1, mul] : 1 + IN0 × IN0 → IN0        (3.30)

for mul(x,y) def= x y. We are saying that the argument list is now to be reduced by the multiplication operator and that output value 1 is expected as the result of "nothing left to multiply".

Moreover, in the previous section we might have wanted our number-list generator to produce the list of even numbers smaller than a given number, in decreasing order (see exercise 3.4). Intuition will once again help us in deciding that only arrow g in (3.26) needs to be updated.

The following diagrams generalize both constructions by leaving such bottom arrows unspecified,

        out_T                               in_T
  T ----------> 1 + IN0 × T        T <----------- 1 + IN0 × T        (3.31)
  |                  |             ^                    ^
  f                  | id+id×f     f                    | id+id×f
  v                  v             |                    |
  B <---------- 1 + IN0 × B        B ..............> 1 + IN0 × B
        g                                   g

and express their duality (cf. the directions of the arrows). It so happens that, for each of these diagrams, f is uniquely dependent on the g arrow, that is to say, each particular instantiation of g will determine the corresponding f. So both gs can be regarded as "seeds" or "genetic material" of the f functions they uniquely define.³

Following the standard terminology, we express these facts by writing f = (|g|) with respect to the lefthand side diagram and by writing f = [(g)] with respect to the righthand side diagram. Read (|g|) as "the T-catamorphism induced by g" and [(g)] as "the T-anamorphism induced by g". This terminology is derived from the Greek words κατα (cata) and ανα (ana) meaning, respectively, "downwards" and "upwards" (compare with the direction of the f arrow in each diagram).
³ The theory which supports the statements of this paragraph will not be dealt with until chapter 7.

The exchange of parentheses "( )" and "[ ]" in the double parentheses "(| |)" and "[( )]" is aimed at expressing the duality of both concepts. We shall have a lot to say about catamorphisms and anamorphisms of a given type such as T. For the moment, it suffices to say that

• the T-catamorphism induced by g : 1 + IN0 × B → B is the unique function (|g|) : T → B


which obeys property (or is defined by)

  (|g|) = g · (id + id × (|g|)) · out_T        (3.32)

which is the same as

  (|g|) · in_T = g · (id + id × (|g|))        (3.33)

• given g : B → 1 + IN0 × B, the T-anamorphism induced by g is the unique function [(g)] : B → T which obeys property (or is defined by)

  [(g)] = in_T · (id + id × [(g)]) · g        (3.34)

From (3.31) it can be observed that T can act as a mediator between any T-anamorphism and any T-catamorphism, that is to say, (|g|) : T → B composes with [(h)] : C → T, for some h : C → 1 + IN0 × C. In other words, a T-catamorphism can always observe (consume) the output of a T-anamorphism. The latter produces a list of IN0s which is consumed by the former. This is depicted in the diagram which follows:

          g
  B <----------- 1 + IN0 × B        (3.35)
  ^                   ^
  (|g|)               | id + id × (|g|)
  |                   |
  T <----------- 1 + IN0 × T
  ^     in_T          ^
  [(h)]               | id + id × [(h)]
  |                   |
  C -----------> 1 + IN0 × C
          h

What can we say about the (|g|) · [(h)] composition? It is a function of signature C → B which resorts to T as an intermediate data-structure and can be subject to the following calculation (cf. outermost rectangle in (3.35)):

  (|g|) · [(h)] = g · (id + id × (|g|)) · (id + id × [(h)]) · h
≡    { +-functor (2.42) }
  (|g|) · [(h)] = g · ((id · id) + (id × (|g|)) · (id × [(h)])) · h
≡    { identity and ×-functor (2.28) }
  (|g|) · [(h)] = g · (id + id × ((|g|) · [(h)])) · h


This calculation shows how to define (|g|) · [(h)] : C → B in one go, that is to say, doing without any intermediate data-structure:

          g
  B <----------- 1 + IN0 × B        (3.36)
  ^                   ^
  (|g|)·[(h)]         | id + id × (|g|)·[(h)]
  |                   |
  C -----------> 1 + IN0 × C
          h

As an example, let us see what comes out of (|g|) · [(h)] for h and g respectively given by (3.29) and (3.30):

  (|g|) · [(h)] = g · (id + id × (|g|) · [(h)]) · h
≡    { (|g|) · [(h)] abbreviated to f and instantiating h and g }
  f = [1, mul] · (id + id × f) · (! + 〈id, pred〉) · (=0 ?)
≡    { +-functor (2.42) and identity }
  f = [1, mul] · (! + (id × f) · 〈id, pred〉) · (=0 ?)
≡    { ×-absorption (2.25) and identity }
  f = [1, mul] · (! + 〈id, f · pred〉) · (=0 ?)
≡    { +-absorption (2.41) and constant 1 (2.15) }
  f = [1, mul · 〈id, f · pred〉] · (=0 ?)
≡    { McCarthy conditional (2.59) }
  f = (=0 ?) → 1, mul · 〈id, f · pred〉

Going pointwise, we get — via (2.59) —

  f 0 = [1, mul · 〈id, f · pred〉] (i1 0)
      = { +-cancellation (2.38) }
        1 0
      = { constant function (2.12) }
        1

and

  f(n + 1) = [1, mul · 〈id, f · pred〉] (i2 (n + 1))
           = { +-cancellation (2.38) }


             mul · 〈id, f · pred〉 (n + 1)
           = { pointwise definitions of split, identity, predecessor and mul }
             (n + 1) × f n

In summary, f is but the well-known factorial function:

  f 0 = 1
  f(n + 1) = (n + 1) × f n

This result comes as no surprise if we look at diagram (3.35) for the particular g and h we have considered above and recall a popular "definition" of factorial:

  n! = n × (n − 1) × ... × 1        (n times)        (3.37)

In fact, [(h)] n produces T-list

  Cons(n, Cons(n − 1, ... Cons(1, Nil)))

as an intermediate data-structure which is consumed by (|g|), the effect of which is but the "replacement" of Cons by × and Nil by 1, therefore accomplishing (3.37) and realizing the computation of factorial.

The moral of this example is that a function as simple as factorial can be decomposed into two components (producer/consumer functions) which share a common intermediate inductive datatype. The producer function is an anamorphism which "represents" or produces a "view" of its input argument as a value of the intermediate datatype. The consumer function is a catamorphism which reduces this intermediate data-structure and produces the final result. Like factorial, many functions can be handsomely expressed by a (|g|) · [(h)] composition for a suitable choice of the intermediate type, and of g and h. The intermediate data-structure is said to be virtual in the sense that it only exists as a means to induce the associated pattern of recursion and disappears by calculation.

The composition (|g|) · [(h)] of a T-catamorphism with a T-anamorphism is called a T-hylomorphism⁴ and is denoted by [g,h]. Because g and h fully determine the behaviour of the [g,h] function, they can be regarded as the "genes" of the function they define.
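The factorial hylomorphism just calculated can be transcribed into HASKELL. The sketch below is ours, not the book's: it uses Either for +, Integer standing in for IN0, and ad-hoc names fF (the arrow part of the pattern 1 + IN0 × X) and hylo:

```haskell
-- F on arrows for F X = 1 + IN0 × X, i.e. F f = id + id × f
fF :: (a -> b) -> Either () (Integer, a) -> Either () (Integer, b)
fF f = either Left (\(n, x) -> Right (n, f x))

-- the hylomorphism recursion scheme of diagram (3.36): f = g · F f · h
hylo :: (Either () (Integer, b) -> b) -> (a -> Either () (Integer, a)) -> a -> b
hylo g h = g . fF (hylo g h) . h

-- g = [1, mul] as in (3.30); h = (! + <id, pred>) · (=0 ?) as in (3.29)
fact :: Integer -> Integer
fact = hylo (either (const 1) (uncurry (*)))
            (\n -> if n == 0 then Left () else Right (n, n - 1))
```

Running `fact 5` unfolds the virtual list 5,4,3,2,1 and folds it with multiplication, yielding 120.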
⁴ This terminology is derived from the Greek word υλος (hylos) meaning "matter".

As we shall see, this analogy with biology will prove specially useful for algorithm analysis and classification.

Exercise 3.2. A way of computing n², the square of a given natural number n, is to sum up the n first odd numbers. In fact, 1² = 1, 2² = 1 + 3, 3² = 1 + 3 + 5, etc., n² = (2n − 1) + (n − 1)². Following this hint, express function

  sq n def= n²        (3.38)


as a T-hylomorphism and encode it in HASKELL.
□

Exercise 3.3. Write function x^n as a T-hylomorphism and encode it in HASKELL.
□

Exercise 3.4. The following function in HASKELL computes the T-sequence of all even numbers less or equal to n:

  f n = if n


Although a generic theory of arbitrary datatypes requires a theoretical elaboration which cannot be explained at once, we can move a step further by taking the two observations above as starting points. We shall start from the latter in order to talk generically about inductive types. Then we introduce parameterization and functorial behaviour.

Suppose that, as a mere notational convention, we abbreviate every expression of the form "1 + IN0 × ..." occurring in the previous section by "F ...", e.g. 1 + IN0 × B by F B, e.g. 1 + IN0 × T by F T,

  T ≅ F T        (3.39)

(witnessed by in_T and out_T), etc. This is the same as introducing a datatype-level operator

  F X def= 1 + IN0 × X        (3.40)

which maps every datatype A into datatype 1 + IN0 × A. Operator F captures the pattern of recursion which is associated to so-called "right" lists (of non-negative integers), that is, lists which grow to the right. The slightly different pattern G X def= 1 + X × IN0 will generate a different, although related, inductive type

  X ≅ 1 + X × IN0        (3.41)

— that of so-called "left" lists (of non-negative integers). And it is not difficult to think of the pattern which merges both right and left lists and gives rise to bi-linear lists, better known as binary trees:

  X ≅ 1 + X × IN0 × X        (3.42)

One may think of many other expressions F X and guess the inductive datatype they generate, for instance H X def= IN0 + IN0 × X generating non-empty lists of non-negative integers (IN0⁺). The general rule is that, given an inductive datatype definition of the form

  X ≅ F X        (3.43)

(also called a domain equation), its pattern of recursion is captured by a so-called functor F.

3.7 Functors

The concept of a functor F, borrowed from category theory, is a most generic and useful device in programming.⁵
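The recursion patterns just listed can be transcribed into HASKELL declarations. The sketch below is ours; constructor names and the use of Int for IN0 are our own choices:

```haskell
-- "Right" lists, F X = 1 + IN0 × X (the pattern of T)
data R = RNil | RCons (Int, R)

-- "Left" lists, G X = 1 + X × IN0, cf. (3.41)
data L = LNil | LCons (L, Int)

-- Bi-linear lists (binary trees), X ≅ 1 + X × IN0 × X, cf. (3.42)
data BTree = Empty | Node (BTree, Int, BTree)

-- Non-empty lists, H X = IN0 + IN0 × X
data NE = One Int | More (Int, NE)
```

Each declaration is a solution of the corresponding domain equation, with the two (or more) summands of the pattern realized by the alternative constructors.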
⁵ The category theory practitioner must be warned of the fact that the word functor is used here in a too restrictive way. A proper (generic) definition of a functor will be provided later in this book.

As we have seen, F can be regarded as a datatype constructor


which, given datatype A, builds a more elaborate datatype F A; given another datatype B, builds a similarly elaborate datatype F B; and so on. But what is more important and has the most beneficial consequences is that, if F is regarded as a functor, then its data-structuring effect extends smoothly to functions in the following way: suppose that f : A → B is a function which observes A into B, where A and B are parameters of F A and F B, respectively. By definition, if F is a functor then F f : F A → F B exists for every such f:

  A          F A
  |            |
  f          F f
  v            v
  B          F B

F f extends f to F-structures and will, by definition, obey two very basic properties: it commutes with identity

  F id_A = id_(F A)        (3.44)

and with composition

  F (g · h) = (F g) · (F h)        (3.45)

Two simple examples of a functor follow:

• Identity functor: define F X = X, for every datatype X, and F f = f. Properties (3.44) and (3.45) hold trivially just by removing symbol F wherever it occurs.

• Constant functors: for a given C, define F X = C (for all datatypes X) and F f = id_C, as expressed in the following diagram:

  A          C
  |          |
  f        id_C
  v          v
  B          C

Properties (3.44) and (3.45) hold trivially again.

In the same way functions can be unary, binary, etc., we can have functors with more than one argument. So we get binary functors (also called bifunctors), ternary functors, etc. Of course, properties (3.44) and (3.45) have to hold for every parameter of an n-ary functor. For a binary functor B, for instance, equation (3.44) becomes

  B (id_A, id_B) = id_(B (A,B))        (3.46)


  Data construction | Universal construct | Functor | Description
  ------------------+---------------------+---------+------------
  A × B             | 〈f, g〉             | f × g   | Product
  A + B             | [f, g]              | f + g   | Coproduct
  B^A               | transpose of f      | f^A     | Exponential

  Table 3.1: Datatype constructions and associated operators.

and equation (3.45) becomes

  B (g · h, i · j) = B (g, i) · B (h, j)        (3.47)

Product and coproduct are typical examples of bifunctors. In the former case one has B (A,B) = A × B and B (f,g) = f × g — recall (2.22). Properties (2.29) and (2.28) instantiate (3.46) and (3.47), respectively, and this explains why we called them the functorial properties of product. In the latter case, one has B (A,B) = A + B and B (f,g) = f + g — recall (2.37) — and functorial properties (2.43) and (2.42). Finally, exponentiation is a functorial construction too: assuming A, one has F X def= X^A and F f def= the transpose of f · ap, and functorial properties (2.73) and (2.74). All this is summarized in table 3.1.

Just as functions, functors may compose with each other in the obvious way: the composition of F and G, denoted F · G, is defined by

  (F · G) X def= F (G X)        (3.48)
  (F · G) f def= F (G f)        (3.49)

3.8 Polynomial functors

We may put constant, product, coproduct and identity functors together to obtain so-called polynomial functors, which are described by polynomial expressions, for instance

  F X = 1 + A × X

— recall (3.6). A polynomial functor is either

• a constant functor or the identity functor, or
• the (finitary) product or coproduct (sum) of other polynomial functors, or
• the composition of other polynomial functors.
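The arrow-level actions of the product and coproduct bifunctors can be sketched in HASKELL, using Either for + as earlier in the text (function names prodB and coprodB are ours):

```haskell
-- B(f,g) = f × g : the product bifunctor acting on a pair of arrows
prodB :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
prodB f g (x, y) = (f x, g y)

-- B(f,g) = f + g : the coproduct bifunctor acting on a pair of arrows
coprodB :: (a -> c) -> (b -> d) -> Either a b -> Either c d
coprodB f g = either (Left . f) (Right . g)
```

Properties (3.46) and (3.47) then amount to the familiar facts that mapping identities over a pair (or an Either value) is the identity, and that mapping component-wise distributes over composition.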


So the effect on arrows of a polynomial functor is computed in an easy and structured way, for instance:

  F f = (1 + A × X) f
      = { sum of two functors, where A is a constant and X is a variable }
        (1) f + (A × X) f
      = { constant functor and product of two functors }
        id_1 + (A) f × (X) f
      = { constant functor and identity functor }
        id_1 + id_A × f
      = { subscripts dropped for simplicity }
        id + id × f

So, 1 + A × f denotes the same as id_1 + id_A × f, or even the same as id + id × f if one drops the subscripts.

It should be clear at this point that what was referred to in section 2.10 as a "symbolic pattern" applicable to both datatypes and arrows is after all a functor in the mathematical sense. The fact that the same polynomial expression is used to denote both the data and the operators which structurally transform such data is of great conceptual economy and practical application. For instance, once polynomial functor (3.40) is assumed, the diagrams in (3.31) can be written as simply as

        out_T                      in_T
  T ----------> F T        T <----------- F T        (3.50)
  |              |         ^                ^
  f            F f         f              F f
  v              v         |                |
  B <---------- F B        B ...........> F B
        g                          g

It is useful to know that, thanks to the isomorphism laws studied in chapter 2, every polynomial functor F may be put into the canonical form

  F X ≅ C0 + (C1 × X) + (C2 × X²) + · · · + (Cn × Xⁿ)
      = Σ_{i=0..n} Ci × Xⁱ        (3.51)

and that Newton's binomial formula

  (A + B)ⁿ ≅ Σ_{p=0..n} ⁿCp × A^(n−p) × B^p        (3.52)


can be used in such conversions. These are performed up to isomorphism, that is to say, after the conversion one gets a different but isomorphic datatype. Consider, for instance, functor

  F X def= A × (1 + X)²

(where A is a constant datatype) and check the following reasoning:

  F X = A × (1 + X)²
≅    { law (2.87) }
  A × ((1 + X) × (1 + X))
≅    { law (2.50) }
  A × ((1 + X) × 1 + (1 + X) × X)
≅    { laws (2.81), (2.31) and (2.50) }
  A × ((1 + X) + (1 × X + X × X))
≅    { laws (2.81) and (2.87) }
  A × ((1 + X) + (X + X²))
≅    { law (2.46) }
  A × (1 + (X + X) + X²)
≅    { canonical form obtained via laws (2.50) and (2.88) }
  A + (A × 2) × X + A × X²
  with C0 = A, C1 = A × 2 and C2 = A

Exercise 3.5. Synthesize the isomorphism ν : A + A × 2 × X + A × X² → A × (1 + X)² implicit in the above reasoning.
□

3.9 Polynomial inductive types

An inductive datatype is said to be polynomial wherever its pattern of recursion is described by a polynomial functor, that is to say, wherever F in equation (3.43) is polynomial. For instance, datatype T (3.20) is polynomial (n = 1) and its associated polynomial


functor is canonically defined with coefficients C0 = 1 and C1 = IN0. For reasons that will become apparent later on, we shall always impose C0 ≠ 0 to hold in a polynomial datatype expressed in canonical form.

Polynomial types are easy to encode in HASKELL wherever the associated functor is in canonical polynomial form, that is, wherever one has

  T ≅ Σ_{i=0..n} Ci × Tⁱ        (3.53)

Then we have

  in_T def= [f1, ..., fn]

where, for i = 1, ..., n, fi is an arrow of type Ci × Tⁱ → T. Since n is finite, one may expand exponentials according to (2.87) and encode this in HASKELL as follows:

  data T = C0
         | C1 (C1, T)
         | C2 (C2, (T, T))
         | ...
         | Cn (Cn, (T, ..., T))

Of course the choice of symbol Ci to realize each fi is arbitrary.⁶ Several instances of polynomial inductive types (in canonical form) will be mentioned in section 3.13. Section 3.17 will address the conversion between inductive datatypes induced by so-called natural transformations.

The concepts of catamorphism, anamorphism and hylomorphism introduced in section 3.5 can be extended to arbitrary polynomial types. We devote the following sections to explaining catamorphisms in the polynomial setting. Polynomial anamorphisms and hylomorphisms will not be dealt with until chapter 7.

⁶ A more traditional (but less close to (3.53)) encoding will be

  data T = C0 | C1 C1 T | C2 C2 T T | ... | Cn Cn T ... T        (3.54)

delivering every constructor in curried form.
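As a concrete instance of this encoding scheme (our own example, with names of our choosing), the binary-tree pattern X ≅ 1 + IN0 × X² of (3.42) has canonical form C0 + C2 × X² with C0 = 1 and C2 = IN0, and is encoded in the style of (3.53) as:

```haskell
-- Canonical-form encoding of X ≅ 1 + IN0 × X²: C0 = 1, C2 = IN0
data T2 = Leaf | Fork (Int, (T2, T2))

-- a simple observation defined by induction on the structure of T2
size :: T2 -> Int
size Leaf                = 0
size (Fork (_, (l, r)))  = 1 + size l + size r
```

Note how the single non-constant constructor Fork takes its C2 coefficient paired with the expanded exponential (T2, T2), exactly as prescribed above.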


In order to explain this, we need two notions which are easy to understand: first, that of an F-algebra, which simply is any function α of signature F A → A. A is called the carrier of F-algebra α and contains the values which α manipulates by computing new A-values out of existing ones, according to the F-pattern (the "type" of the algebra). As examples, consider [0, add] (3.19) and in_T (3.20), which are both algebras of type F X = 1 + IN0 × X. The type of an algebra clearly determines its form. For instance, any algebra α of type F X = 1 + X × X will be of form [α1, α2], where α1 is a constant and α2 is a binary operator. So monoids are algebras of this type.⁷

Secondly, we introduce the notion of an F-homomorphism, which is but a function observing a particular F-algebra α into another F-algebra β:

        α
  A <------- F A        f · α = β · (F f)        (3.55)
  |            |
  f          F f
  v            v
  B <------- F B
        β

Clearly, f can be regarded as a structural translation between A and B, that is, A and B have a similar structure.⁸ Note that — thanks to (3.44) — identity functions are always (trivial) F-homomorphisms and that — thanks to (3.45) — these homomorphisms compose, that is, the composition of two F-homomorphisms is an F-homomorphism.

3.11 F-catamorphisms

An F-algebra can be epic, monic or both, that is, iso. Iso F-algebras are particularly relevant to our discussion because they describe solutions to the X ≅ F X equation (3.43). Moreover, for polynomial F a particular iso F-algebra always exists, which is denoted by in : F μF → μF and has special properties. First, its carrier is the smallest among the carriers of other iso F-algebras, and this is why it is denoted by μF — μ for "minimal".⁹ Second, it is the so-called initial F-algebra.
What does this mean?

It means that, for every F-algebra α there exists one and only one F-homomorphism between in and α. This unique arrow mediating in and α is therefore determined by α itself, and is called the F-catamorphism generated by α. This construct, which was introduced in 3.5, is in general denoted by (|α|)_F:

⁷ But not every algebra of this type is a monoid, since the type of an algebra only fixes its syntax and does not impose any properties such as associativity, etc.
⁸ Cf. homomorphism = homo (the same) + morphos (structure, shape).
⁹ μF means the least fixpoint solution of equation X ≅ F X, as will be described in chapter 7.


         in
  μF <-------- F μF        (3.56)
  |               |
  f=(|α|)_F       | F (|α|)_F
  v               v
  A <--------- F A
         α

We will drop the F subscript in (|α|)_F wherever deducible from the context, and often call α the "gene" of (|α|)_F.

As happens with splits, eithers and transposes, the uniqueness of the catamorphism construct is captured by a universal property established in the class of all F-homomorphisms:

  k = (|α|) ⇔ k · in = α · F k        (3.57)

According to the experience gathered from section 2.12 onwards, a few properties can be expected as consequences of (3.57). For instance, one may wonder about the "gene" of the identity catamorphism. Just let k = id in (3.57) and see what happens:

  id = (|α|) ⇔ id · in = α · F id
≡    { identity (2.10) and F is a functor (3.44) }
  id = (|α|) ⇔ in = α · id
≡    { identity (2.10) once again }
  id = (|α|) ⇔ in = α
≡    { α replaced by in and simplifying }
  id = (|in|)

Thus one finds out that the genetic material of the identity catamorphism is the initial algebra in. Which is the same as establishing the reflection property of catamorphisms:

Cata-reflection:

         in
  μF <-------- F μF        (|in|) = id_μF        (3.58)
  |               |
  (|in|)          | F (|in|)
  v               v
  μF <-------- F μF
         in

In a more intuitive way, one might have observed that (|in|) is, by definition of in, the unique arrow mediating μF and itself. But another arrow of the same type is already known: the identity id_μF. So these two arrows must be the same.

Another property following immediately from (3.57), for k = (|α|), is


Cata-cancellation:

  (|α|) · in = α · F (|α|)        (3.59)

Because in is iso, this law can be rephrased as follows,

  (|α|) = α · F (|α|) · out        (3.60)

where out : μF → F μF denotes the inverse of in, witnessing isomorphism μF ≅ F μF.

Now, let f be F-homomorphism (3.55) between F-algebras α and β. How does it relate to (|α|) and (|β|)? Note that f · (|α|) is an arrow mediating μF and B. But B is the carrier of β and (|β|) is the unique arrow mediating μF and B. So the two arrows are the same:

Cata-fusion:

         in
  μF <-------- F μF        f · (|α|) = (|β|)  if  f · α = β · F f        (3.61)
  |               |
  (|α|)           | F (|α|)
  v               v
  A <--------- F A
  |      α        |
  f             F f
  v               v
  B <--------- F B
         β

Of course, this law is also a consequence of the universal property, for k = f · (|α|):

  f · (|α|) = (|β|)
⇔ (f · (|α|)) · in = β · F (f · (|α|))
⇔    { composition is associative and F is a functor (3.45) }
  f · ((|α|) · in) = β · (F f) · (F (|α|))
⇔    { cata-cancellation (3.59) }
  f · α · F (|α|) = β · F f · F (|α|)
⇔    { require f to be an F-homomorphism (3.55) }
  f · α · F (|α|) = f · α · F (|α|) ∧ f · α = β · F f
⇐    { simplify }
  f · α = β · F f

The presentation of the absorption property of catamorphisms entails the very important issue of parameterization and deserves to be treated in a separate section, as follows.
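For the list pattern F X = 1 + IN0 × X, the initial algebra and the catamorphism combinator can be sketched in HASKELL as follows. This is our own transcription: Either realizes +, Int stands in for IN0, and the names Mu, In, fF, cata and sumMu are ours:

```haskell
-- μF for F X = 1 + Int × X, with the initial algebra as constructor In
newtype Mu = In (Either () (Int, Mu))

-- F on arrows: F f = id + id × f
fF :: (a -> b) -> Either () (Int, a) -> Either () (Int, b)
fF f = either Left (\(n, x) -> Right (n, f x))

-- (|α|) = α · F (|α|) · out, cf. cata-cancellation in the form (3.60)
cata :: (Either () (Int, b) -> b) -> Mu -> b
cata alpha (In x) = alpha (fF (cata alpha) x)

-- example: the "add up a list" catamorphism of section 3.3, gene [0, add]
sumMu :: Mu -> Int
sumMu = cata (either (const 0) (uncurry (+)))
```

Pattern-matching on In plays the role of out; the universal property (3.57) then says that cata alpha is the only function satisfying this equation.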


3.12 Parameterization and type functors

By analogy with what we have done about splits (product), eithers (coproduct) and transposes (exponential), we now look forward to identifying F-catamorphisms which exhibit functorial behaviour.

Suppose that one wishes to square all numbers which are members of lists of type T (3.20). It can be checked that

  (|[Nil, Cons · (sq × id)]|)        (3.62)

will do this for us, where sq : IN0 → IN0 is given by (3.38). This catamorphism, which converted to pointwise notation is nothing but function h which follows

  h Nil = Nil
  h (Cons(a,l)) = Cons(sq a, h l)

maps type T to itself. This is because sq maps IN0 to IN0. Now suppose that, instead of sq, one would like to apply a given function f : IN0 → B (for some B other than IN0) to all elements of the argument list. It is easy to see that it suffices to substitute f for sq in (3.62). However, the output type is no longer T, but rather type T′ ≅ 1 + B × T′.

Types T and T′ are very close to each other. They share the same "shape" (recursive pattern) and only differ with respect to the type of elements — IN0 in T and B in T′. This suggests that these two types can be regarded as instances of a more generic list datatype List

  List X ≅ 1 + X × List X        (3.63)
  (with in = [Nil, Cons])

in which the type of elements X is allowed to vary. Thus one has T = List IN0 and T′ = List B.

By inspection, it can be checked that, for every f : A → B,

  (|[Nil, Cons · (f × id)]|)        (3.64)

maps List A to List B. Moreover, for f = id one has:

  (|[Nil, Cons · (id × id)]|)
=    { by the ×-functor-id property (2.29) and identity }
  (|[Nil, Cons]|)
=    { cata-reflection (3.58) }
  id


3.12. PARAMETERIZATION AND TYPE FUNCTORS 87

Therefore, by defining

List f def= (|[ Nil, Cons · (f × id) ]|)

what we have just seen can be written thus:

List id_A = id_(List A)

This is nothing but law (3.44) for F replaced by List. Moreover, it will not be too difficult to check that

List (g · f) = List g · List f

also holds — cf. (3.45). Altogether, this means that List can be regarded as a functor.

In programming terminology one says that List X (the “lists of Xs datatype”) is parametric and that, by instantiating parameter X, one gets ground lists such as lists of integers, booleans, etc. The illustration above deepens one’s understanding of parameterization by identifying the functorial behaviour of the parametric datatype along with its parameter instantiations.

All this can be broadly generalized and leads to what is commonly known as a type functor. First of all, it should be clear that the generic format

T ∼= F T

adopted so far for the definition of an inductive type is not sufficiently detailed because it does not provide a parametric view of T. For simplicity, let us suppose (for the moment) that only one parameter is identified in T. Then we may factor this out via type variable X and write (overloading symbol T)

T X ∼= B(X, T X)

where B is called the type’s base functor. Binary functor B(X,Y) is given this name because it is the basis of the whole inductive type definition. By instantiation of X one obtains F. In the example above, B(X,Y) = 1 + X × Y and in fact F Y = B(IN₀, Y) = 1 + IN₀ × Y, recall (3.40). Moreover, one has

F f = B(id, f)    (3.65)

and so every F-homomorphism can be written in terms of the base-functor of F, e.g.

f · α = β · B(id, f)

instead of (3.55).


88 CHAPTER 3. RECURSION IN THE POINTFREE STYLETX will be referred to as the type functor generated by B:TX }{{}type functor∼= B(X,TX)} {{ }base functorWe proceed to the description <strong>of</strong> its functorial behaviour — Tf — for a given B A .As far as typing rules are concerned, we shall havefBTBfT fATASo we should be able to express Tf as a B (A, )-catamorphism (|g|):ATAin T AB (A,TA)fT f=(|g|)B (id,T f)B TB B (A,TB)gAs we know that in TB is the standard constructor <strong>of</strong> values <strong>of</strong> type TB, we may put itinto the diagram too:ATAin T AB (A,TA)fT f=(|g|)B (id,T f)B TB gB (A,TB)in T BB (B,TB)The catamorphism’s gene g will be synthesized by filling the dashed arrow in the diagramwith the “obvious” B (f,id), whereby one getsTfdef= (|in TB · B (f,id)|) (3.66)and a final diagram, where in T A is abbreviated by in A (ibid. in T B by in B ):ATAin AB (A,TA)fT f=(|in B·B (f,id)|)B TB B (B,TB)in BB (f,id)B (id,T f)B (A,TB)
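Definition (3.66) can be sketched in Haskell for the list base functor B(X,Y) = 1 + X × Y, encoding 1 + X × Y as Either () (x, y). The names below (inList, cataList, bmapB, tmap) are ours, not the text’s:

```haskell
-- T f = (| in_TB . B(f,id) |), instantiated to lists (3.66).
data List a = Nil | Cons (a, List a) deriving (Eq, Show)

-- in = [Nil, Cons], the initial algebra of (3.63)
inList :: Either () (a, List a) -> List a
inList = either (const Nil) Cons

-- (|g|): the unique arrow out of the initial algebra
cataList :: (Either () (a, b) -> b) -> List a -> b
cataList g Nil           = g (Left ())
cataList g (Cons (a, l)) = g (Right (a, cataList g l))

-- B(f,g) acting on 1 + X x Y
bmapB :: (a -> c) -> (b -> d) -> Either () (a, b) -> Either () (c, d)
bmapB f g = either Left (\(a, b) -> Right (f a, g b))

-- the type functor T f of (3.66)
tmap :: (a -> b) -> List a -> List b
tmap f = cataList (inList . bmapB f id)
```

For instance, tmap sq squares every element of a list, recovering (3.62), and tmap id is the identity, as calculated above.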


3.12. PARAMETERIZATION AND TYPE FUNCTORS 89Next, we proceed to derive the useful law <strong>of</strong> cata-absorption(|g|) · Tf = (|g · B (f,id)|) (3.67)as a consequence <strong>of</strong> the laws studied in section 3.11. Our target is to show that, fork = (|g|) · Tf in (3.57), one gets α = g · B (f,id):(|g|) · Tf = (|α|)⇔ { type-functor definition (3.66) }(|g|) · (|in B · B (f,id)|) = (|α|)⇐ { cata-fusion (3.61) }(|g|) · in B · B (f,id) = α · B (id,(|g|))⇔ { cata-cancellation (3.59) }g · B (id,(|g|)) · B (f,id) = α · B (id,(|g|))⇔ { B is a bi-functor (3.47) }g · B (id · f,(|g|) · id) = α · B (id,(|g|))⇔ { id is natural (2.11) }g · B (f · id,id · (|g|)) = α · B (id,(|g|))⇔ { (3.47) again, this time from left to right }g · B (f,id) · B (id,(|g|)) = α · B (id,(|g|))⇐ { obvious }g · B (f,id) = αThe following diagram pictures this property <strong>of</strong> catamorphisms:ATAin AB (A,TA)fCT f(|g|)TCDB (C,TC)in CgB (C,D)B (f,id)B (id,(|g|))B (f,id)B(id,T f)B (A,TC)B (A,D)B(id,(|g|))It remains to show that (3.66) indeed defines a functor. This can be verified by checkingproperties (3.44) and (3.45) for F = T :


90 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

• Property type-functor-id, cf. (3.44):

T id
= { by definition (3.66) }
(|in_B · B(id, id)|)
= { B is a bi-functor (3.46) }
(|in_B · id|)
= { identity and cata-reflection (3.58) }
id

• Property type-functor, cf. (3.45):

T(f · g)
= { by definition (3.66) }
(|in_B · B(f · g, id)|)
= { id · id = id and B is a bi-functor (3.47) }
(|in_B · B(f, id) · B(g, id)|)
= { cata-absorption (3.67) }
(|in_B · B(f, id)|) · T g
= { again cata-absorption (3.67) }
(|in_B|) · T f · T g
= { cata-reflection (3.58) followed by identity }
T f · T g

3.13 A catalogue of standard polynomial inductive types

The following table contains a collection of standard polynomial inductive types and associated base type bi-functors, which are in canonical form (3.53). The table contains two extra columns which may be used as bookmarks for equations (3.65) and (3.66), respectively¹⁰:

3.13. A CATALOGUE OF STANDARD POLYNOMIAL INDUCTIVE TYPES 91

Description      | T X      | B(X,Y)      | B(id,f)       | B(f,id)
“Right” Lists    | List X   | 1 + X × Y   | id + id × f   | id + f × id
“Left” Lists     | LList X  | 1 + Y × X   | id + f × id   | id + id × f
Non-empty Lists  | NList X  | X + X × Y   | id + id × f   | f + f × id
Binary Trees     | BTree X  | 1 + X × Y²  | id + id × f²  | id + f × id
“Leaf” Trees     | LTree X  | X + Y²      | id + f²       | f + id
    (3.68)

All type functors T in this table are unary. In general, one may think of inductive datatypes which exhibit more than one type parameter. Should n parameters be identified in T, then this will be based on an (n+1)-ary base functor B, cf.

T(X₁, ..., Xₙ) ∼= B(X₁, ..., Xₙ, T(X₁, ..., Xₙ))

So, every (n+1)-ary polynomial functor B(X₁, ..., Xₙ, Xₙ₊₁) can be identified as the basis of an inductive n-ary type functor (the convention is to stick to the canonical form and reserve the last variable Xₙ₊₁ for the “recursive call”). While type bi-functors (n = 2) are often found in programming, the situation in which n > 2 is relatively rare. For instance, the combination of leaf-trees with binary-trees in (3.68) leads to the so-called “full tree” type bi-functor

Description    | T(X₁,X₂)      | B(X₁,X₂,Y)    | B(id,id,f)    | B(f,g,id)
“Full” Trees   | FTree(X₁,X₂)  | X₁ + X₂ × Y²  | id + id × f²  | f + g × id
    (3.69)

As we shall see later on, these types are widely used in programming. In the actual encoding of these types in HASKELL, exponentials are normally expanded to products according to (2.87), see for instance

data BTree a = Empty | Node (a, (BTree a, BTree a))

Moreover, one may choose to curry the type constructors as in, e.g.

data BTree a = Empty | Node a (BTree a) (BTree a)

Exercise 3.6. Write as catamorphisms

• the function which counts the number of elements of a non-empty list (type NList in (3.68)).

¹⁰ Since (id_A)² = id_(A²), one writes id² for id in this table.


92 CHAPTER 3. RECURSION IN THE POINTFREE STYLE• the function which computes the maximum element <strong>of</strong> a binary-tree <strong>of</strong> natural numbers.✷Exercise 3.7. Characterize the function which is defined by (|[ [],h ]|) for each <strong>of</strong> thefollowing definitions <strong>of</strong> h:h(x,(y 1 ,y 2 )) = y 1 + [x] + y 2 (3.70)h = + · (singl × +) (3.71)h = + · ( + × singl) · swap (3.72)assuming singl a = [a]. Identify in (3.68) which datatypes are involved as base functors.✷Exercise 3.8. Write as a catamorphism the function which computes the frontier <strong>of</strong> a tree<strong>of</strong> type LTree (3.68), listed from left to right.✷3.14 Functors and type functors in HASKELLThe concept <strong>of</strong> a (unary) functor is provided in HASKELL in the form <strong>of</strong> a particular classexporting the fmap operator:class Functor f wherefmap :: (a -> b) -> (f a -> f b)So fmap g encodes F g once we declare F as an instance <strong>of</strong> class Functor. The mostpopular use <strong>of</strong> fmap has to do with HASKELL lists, as allowed by declarationinstance Functor [] wherefmap f [] = []fmap f (x:xs) = f x : fmap f xs


3.15. THE MUTUAL-RECURSION LAW 93in language’s Standard Prelude.In order to encode the type functors we have seen so far we have to do the sameconcerning their declaration. For instance, should we writeinstance Functor BTreewhere fmap f =cataBTree ( inBTree . (id -|- (f >< id)) )concerning the binary-tree datatype <strong>of</strong> (3.68) and assuming appropriate declarations <strong>of</strong>cataBTree and inBTree, then fmap is overloaded and used across such binary-trees.Bi-functors can be added easily by writingclass BiFunctor f wherebmap :: (a -> b) -> (c -> d) -> (f a c -> f b d)Exercise 3.9. Declare all datatypes in (3.68) in HASKELL notation and turn them intoHASKELL type functors, that is, define fmap in each case.✷Exercise 3.10. Declare datatype (3.69) in HASKELL notation and turn it into an instance<strong>of</strong> class BiFunctor.✷3.15 The mutual-recursion lawThe theory developed so far for building (and reasoning about) recursive functions doesn’tcope with mutual recursion. As a matter <strong>of</strong> fact, the pattern <strong>of</strong> recursion <strong>of</strong> a givencata(ana,hylo)morphism involves only the recursive function being defined, even thoughmore than once, in general, as dictated by the relevant base functor.It turns out that rules for handling mutual recursion are surprisingly simple to calculate.As motivation, recall section 2.10 where, by mixing products with coproducts, weobtained a result — the exchange rule (2.47) — which stemmed from putting together thetwo universal properties <strong>of</strong> product and coproduct, (2.55) and (2.57), respectively.


94 CHAPTER 3. RECURSION IN THE POINTFREE STYLEThe question we want to address in this section is <strong>of</strong> the same brand: what can onetell about catamorphisms which output pairs <strong>of</strong> values? By (2.55), such catamorphismsare bound to be splits, as are the corresponding genes:TinF T(|〈h,k〉|)F(|〈h,k〉|)A × B F (A × B)〈h,k〉As we did for the exchange rule, we put (2.55) and the universal property <strong>of</strong> catamorphisms(3.57) against each other and calculate:〈f,g〉 = (|〈h,k〉|)≡ { cata-universal (3.57) }〈f,g〉 · in = 〈h,k〉 · F 〈f,g〉≡ { ×-fusion (2.24) twice }〈f · in,g · in〉 = 〈h · F 〈f,g〉,k · F 〈f,g〉〉≡ { (2.56) }f · in = h · F 〈f,g〉 ∧ g · in = k · F 〈f,g〉The rule thus obtained,{f · in = h · F 〈f,g〉g · in = k · F 〈f,g〉≡ 〈f,g〉 = (|〈h,k〉|) (3.73)is referred to as the mutual recursion law (or as “Fokkinga’s law”) and is useful in combiningtwo mutually recursive functions f and gTinF TTinF TfF 〈f,g〉A F (A × B)hgF 〈f,g〉B F (A × B)kinto a single catamorphism.When applied from left to right, law (3.73) is surprisingly useful in optimizing recursivefunctions in a way which saves redundant traversals <strong>of</strong> the input inductive type T.Let us take the Fibonacci function as example:fib 0 = 1fib 1 = 1fib(n + 2) = fib(n + 1) + fib n


3.15. THE MUTUAL-RECURSION LAW 95

It can be shown that fib is a hylomorphism of type LTree (3.68), fib = [[count, fibd]], for count = [1, add], add(x,y) = x + y and fibd n = if n < 2 then i₁ Nil else i₂(n − 1, n − 2). This hylo-factorization of fib tells its internal algorithmic structure: the divide step [(fibd)] builds a tree whose number of leaves is a Fibonacci number; the conquer step (|count|) just counts such leaves.

There is, of course, much re-calculation in this hylomorphism. Can we improve its performance? The clue is to regard the two instances of fib in the recursive branch as mutually recursive over the natural numbers. This clue is suggested not only by fib having two base cases (so, perhaps it hides two functions) but also by the lookahead n + 2 in the recursive clause.

We start by defining a function which reduces such a lookahead by 1,

f n = fib(n + 1)

Clearly, f(n + 1) = fib(n + 2) = f n + fib n and f 0 = fib 1 = 1. Putting f and fib together,

f 0 = 1
f(n + 1) = f n + fib n
fib 0 = 1
fib(n + 1) = f n

we obtain two mutually recursive functions over the natural numbers (IN₀) which transform into pointfree equalities

f · [0, suc] = [1, add · 〈f, fib〉]
fib · [0, suc] = [1, f]

over

IN₀ ∼= 1 + IN₀,  in = [0, suc]    (3.74)

Reverse +-absorption (2.41) will further enable us to rewrite the above into

f · in = [1, add] · F 〈f, fib〉
fib · in = [1, π₁] · F 〈f, fib〉

thus making functor F f = id + f explicit and preparing for mutual recursion removal:

f · in = [1, add] · F 〈f, fib〉
fib · in = [1, π₁] · F 〈f, fib〉


96 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

≡ { (3.73) }
〈f, fib〉 = (|〈[1, add], [1, π₁]〉|)
≡ { exchange law (2.47) }
〈f, fib〉 = (|[ 〈1, 1〉, 〈add, π₁〉 ]|)
≡ { going pointwise and denoting 〈f, fib〉 by fib′ }
fib′ 0 = (1, 1)
fib′(n + 1) = (x + y, x) where (x, y) = fib′ n

Since fib = π₂ · fib′ we easily recover fib from fib′ and obtain the intended linear version of Fibonacci (encoded in Haskell):

fib n = y where (x,y) = fib' n
fib' 0 = (1,1)
fib' (n+1) = (x+y,x)
  where (x,y) = fib' n

This version of fib is actually the semantics of the “for-loop” one would write in an imperative language, which would initialize two global variables x,y := 1,1, loop over assignment x,y := x + y,x and yield the result in y. In the C programming language, one would write

int fib(int n)
{
  int x=1; int y=1; int i;
  for (i=1;i<=n;i++) { int a=x; x=x+y; y=a; }
  return y;
}
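The calculated fib′ can be checked against the original doubly recursive definition; the sketch below (ours, not the text’s) transcribes both and recovers fib by the second projection:

```haskell
-- The naive Fibonacci, as specified at the start of this section.
fibNaive :: Integer -> Integer
fibNaive 0 = 1
fibNaive 1 = 1
fibNaive n = fibNaive (n - 1) + fibNaive (n - 2)

-- The linear version obtained by the mutual-recursion law:
-- fib' n = (f n, fib n), so fib = snd . fib'.
fib' :: Integer -> (Integer, Integer)
fib' 0 = (1, 1)
fib' n = let (x, y) = fib' (n - 1) in (x + y, x)

fibLinear :: Integer -> Integer
fibLinear = snd . fib'
```

Both agree on all inputs, but fibLinear runs in time linear in n.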


3.15. THE MUTUAL-RECURSION LAW 97

The answer is the sum of the first n squares, given by function

ns n def= ∑ i²  (i = 1, ..., n)

that is,

ns 0 = 0
ns(n + 1) = (n + 1)² + ns n

in Haskell. However, this hylomorphism is inefficient because each iteration involves another hylomorphism computing square numbers.

One way of improving ns is to introduce function bnm n def= (n + 1)² and express this over (3.74),

bnm 0 = 1
bnm(n + 1) = 2n + 3 + bnm n

hoping to blend ns with bnm using the mutual recursion law. However, the same problem arises in bnm itself, which now depends on term 2n + 3. We invent lin n def= 2n + 3 and repeat the process, thus obtaining:

lin 0 = 3
lin(n + 1) = 2 + lin n
bnm′ 0 = 1
bnm′(n + 1) = lin n + bnm′ n

By redefining

ns′ 0 = 0
ns′(n + 1) = bnm′ n + ns′ n

we obtain three functions — ns′, bnm′ and lin — mutually recursive over the polynomial base F g = id + g of the natural numbers.

Exercise 3.11 below shows how to extend (3.73) to three mutually recursive functions (3.75). (From this it is easy to extend it further to the n-ary case.) It is routine work to show that, by application of (3.75) to the above three functions, one obtains the linear version of ns which follows:


98 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

ns'' n = let (a,b,c) = aux n in a
  where
    aux 0 = (0,1,3)
    aux(n+1) = let (a,b,c) = aux n
               in (a+b,b+c,2+c)

In retrospect, note that (in general) not every system of n mutually recursive functions

f₁ = φ₁(f₁, ..., fₙ)
...
fₙ = φₙ(f₁, ..., fₙ)

involving n functions and n functional combinators φ₁, ..., φₙ can be handled by a suitably extended version of (3.73). This only happens if all fᵢ have the same “shape”, that is, if they share the same base functor F.

Exercise 3.11. Show that law (3.73) generalizes to more than two mutually recursive functions, in this case three:

f · in = h · F 〈f, 〈g, j〉〉
g · in = k · F 〈f, 〈g, j〉〉    ≡    〈f, 〈g, j〉〉 = (|〈h, 〈k, l〉〉|)    (3.75)
j · in = l · F 〈f, 〈g, j〉〉

✷

Exercise 3.12. The exponential function eˣ : IR → IR (where “e” denotes Euler’s number) can be defined in several ways, one being the calculation of its Taylor series:

eˣ = ∑ xⁿ / n!  (n = 0, 1, ...)    (3.76)

The following function, in Haskell,

exp :: Double -> Integer -> Double
exp x 0 = 1
exp x (n+1) = x^(n+1) / fac (n+1) + (exp x n)

computes an approximation of eˣ, where the second parameter tells how many terms to compute. For instance, while exp 1 1 = 2.0, exp 1 10 yields 2.7182818011463845.


3.15. THE MUTUAL-RECURSION LAW 99

Function exp x n performs badly for larger and larger n: while exp 1 100 runs instantaneously, exp 1 1000 takes around 9 seconds, exp 1 2000 takes circa 33 seconds, and so on.

Decompose exp into mutually recursive functions so as to apply (3.75) and obtain the following linear version:

exp x n = let (e,b,c) = aux x n
          in e where
  aux x 0 = (1,2,x)
  aux x (n+1) = let (e,s,h) = aux x n
                in (e+h,s+1,(x/s)*h)

✷

Exercise 3.13. From the following basic properties of addition and multiplication,

a ∗ 0 = 0    (3.77)
a ∗ 1 = a    (3.78)
a ∗ (b + c) = a ∗ b + a ∗ c    (3.79)

show that a ∗ n is the “for-loop” (a+)ⁿ 0.

✷

Exercise 3.14. Show that, for all n ∈ IN₀, n = sucⁿ 0. Hint: use cata-reflection (3.58).

✷

As example of application of (3.73) for T other than IN₀, consider the following recursive predicate which checks whether a (non-empty) list is ordered,

ord : A⁺ → 2
ord [a] = TRUE
ord (cons(a, l)) = a ≥ (listMax l) ∧ (ord l)


100 CHAPTER 3. RECURSION IN THE POINTFREE STYLEwhere ≥ is assumed to be a total order on datatype A andcomputes the greatest element <strong>of</strong> a given list <strong>of</strong> As:listMax = (|[ id,max ]|) (3.80)A +[ singl,cons ]A + A × A +listMaxA[ id,max ]A + A × Aid+id×listMax(In the diagram, singl a = [a].)Predicate ord is not a catamorphism because <strong>of</strong> the presence <strong>of</strong> listMax l in therecursive branch. However, the following diagram depicting ordA +[ singl,cons ]A + A × A +ord2 A + A × (A × 2)[ TRUE,α ]id+id×〈listMax,ord〉(where α(a,(m,b)) def = a ≥ m ∧ b) suggests the possibility <strong>of</strong> using the mutual recursionlaw. One only has to find a way <strong>of</strong> letting listMax depend also on ord, which isn’tdifficult: for any A + g B , one hasA +[ singl,cons ]A + A × A +listMaxid+id×〈listMax,g〉A A + A × (A × B)[ id,max·(id×π 1 ) ]where the extra presence <strong>of</strong> g is cancelled by projection π 1 .For B = 2 and g = ord we are in position to apply Fokkinga’s law and obtain:〈listMax,ord〉 = (|〈[ id,max · (id × π 1 ) ],[ TRUE,α ]〉|)= { exchange law (2.47) }(|[ 〈id, TRUE〉, 〈max · (id × π 1 ),α〉 ]|)Of course, ord = π 2 · 〈listMax,ord〉. By denoting the above synthesized catamorphismby aux, we end up with the following version <strong>of</strong> ord:ordl = let (a,b) = auxlin b


3.16. “BANANA-SPLIT”: A COROLLARY OF THE MUTUAL-RECURSION LAW 101

where

aux : A⁺ → A × 2
aux [a] = (a, TRUE)
aux (cons(a, l)) = let (m, b) = aux l
                   in (max(a, m), (a ≥ m ∧ b))

3.16 “Banana-split”: a corollary of the mutual-recursion law

Let h = i · F π₁ and k = j · F π₂ in (3.73). Then

f · in = (i · F π₁) · F 〈f, g〉
≡ { composition is associative and F is a functor }
f · in = i · F (π₁ · 〈f, g〉)
≡ { by ×-cancellation (2.20) }
f · in = i · F f
≡ { by cata-cancellation }
f = (|i|)

Similarly, from k = j · F π₂ we get

g = (|j|)

Then, from (3.73), we get

〈(|i|), (|j|)〉 = (|〈i · F π₁, j · F π₂〉|)

that is

〈(|i|), (|j|)〉 = (|(i × j) · 〈F π₁, F π₂〉|)    (3.81)

by (reverse) ×-absorption (2.25).

This law provides us with a very useful tool for “parallel loop” inter-combination: “loops” (|i|) and (|j|) are fused together into a single “loop” (|(i × j) · 〈F π₁, F π₂〉|). The need for this kind of calculation arises very often. Consider, for instance, the function which computes the average of a non-empty list of natural numbers,

average def= (/) · 〈sum, length〉    (3.82)


102 CHAPTER 3. RECURSION IN THE POINTFREE STYLEwhere sum and length are the expected IN + catamorphisms:sum = (|[ id,+ ]|)length = (|[ 1,succ · π 2 ]|)As defined by (3.82), function average performs two independent traversals <strong>of</strong> the argumentlist before division (/) takes place. Banana-split will fuse such two traversals into asingle one (see function aux below), thus leading to a function which will run ”twice asfast”:average l =x/ywhere(x,y) = aux laux[a] = (a,1)aux(cons(a,l)) = (a + x,y + 1)where (x,y) = aux l(3.83)Exercise 3.15. Calculate (3.83) from (3.82). Which <strong>of</strong> these two versions <strong>of</strong> the samefunction is easier to understand?✷3.17 Inductive datatype isomorphismnot yet available3.18 Bibliography notesIt is <strong>of</strong>ten the case that the expressive power <strong>of</strong> a particular programming language orparadigm is counter-productive in the sense that too much freedom is given to programmers.Sooner or later, these will end up writing unintelligible (authorship dependent) codewhich will become a burden to whom has to maintain it. Such has been the case <strong>of</strong> imperativeprogramming in the past (inc. assembly code), where the unrestricted use <strong>of</strong> gotoinstructions eventually gave place to if-then-else, while and repeat structuredprogramming constructs.A similar trend has been observed over the last decades at a higher programminglevel: arbitrary recursion and/or (side) effects have been considered harmful in functionalprogramming. Instead, programmers have been invited to structure their code around


3.18. BIBLIOGRAPHY NOTES 103

generic program devices such as e.g. fold/unfold combinators, which bring discipline to recursion. One witnesses progress in the sense that the loss of freedom is balanced by the increase of formal semantics and the availability of program calculi.

Such disciplined programming combinators have been extended from list-processing to other inductive structures thanks to one of the most significant advances in programming theory over the last decade: the so-called functorial approach to datatypes, which originated mainly from [MA86], was popularized by [Mal90] and reached textbook format in [BdM97]. A comfortable basis for exploiting polymorphism [Wad89], the “datatypes as functors” motto has proved beneficial at a higher level of abstraction, giving birth to polytypism [JJ96].

The literature on anas, catas and hylos is vast (see e.g. [MH95], [JJ98], [GHA01]) and it is part of a broader discipline which has become known as the mathematics of program construction [Bac04]. This chapter provides an introduction to such a discipline. Only the calculus of catamorphisms is presented. The corresponding theory of anamorphisms and hylomorphisms demands further mathematical machinery and will not be dealt with before chapters 5 and 7. The results on mutual recursion presented in this chapter were pioneered by Maarten Fokkinga [Fok92].




Chapter 4Why Monads MatterIn this chapter we present a powerful device in state-<strong>of</strong>-the-art programming, that <strong>of</strong> amonad. The monad concept is nowadays <strong>of</strong> primary importance in computing sciencebecause it makes it possible to describe computational effects as disparate as input/output,comprehension notation, state variable updating, context dependence, partial behaviouretc. in an elegant and uniform way.Our motivation to this concept will start from a well-known problem in functionalprogramming (and computing as a whole) — that <strong>of</strong> coping with undefined computations.4.1 Partial functionsConsider the IR to IR functiong x def = 1/xClearly, g is undefined for x = 0 because g 0 = 1/0 is so big a real number that it cannotbe properly evaluated. In fact, the HASKELL output for g 0 = 1/0 is just “panic”:Main> g 0Program error: {primDivDouble 1.0 0.0}Main>Functions such as g above are called partial functions because they cannot be appliedto all <strong>of</strong> their inputs (i.e., they diverge for some <strong>of</strong> their inputs). Partial functions arevery common in mathematics or programming — for other examples think <strong>of</strong> e.g. listprocessingfunctions head and tail.105
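The head and tail examples just mentioned can likewise be made total by extending their output type. In Haskell’s standard library the rôle of the Ext type introduced below is played by Maybe; the following sketch (names safeHead/safeTail are ours) anticipates that move:

```haskell
-- head and tail made total: the exceptional case becomes Nothing.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

safeTail :: [a] -> Maybe [a]
safeTail []       = Nothing
safeTail (_ : xs) = Just xs
```

No input can now make these functions “panic”: callers are forced to inspect the result.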


106 CHAPTER 4. WHY MONADS MATTERPanic is very dangerous in programming. In order to avoid this kind <strong>of</strong> behaviourone has two alternatives, either ensuring that every call to g x is protected — i.e., thecontexts which wrap up such calls ensure pre-condition x ≠ 0, or one raises exceptions,i.e. explicit error values. In the former case, mathematical pro<strong>of</strong>s need to be carried outin order to guarantee safety (that is, pre-condition compliance). The overall effect is torestrict the domain <strong>of</strong> the partial function. In the latter case one goes the other wayround, by extending the co-domain (vulg. range) <strong>of</strong> the function so that it accommodatesexceptional outputs. In this way one might define, in HASKELL:data ExtReal = Ok Real | Errorand then redefineg :: Real -> ExtRealg 0 = Errorg n = Ok 1/nIn general, one might define parametric typedata Ext a = Ok a | Errorin order to extend an arbitrary data type a with its (polymorphic) exception (or errorvalue). Clearly, one hasExt A ∼ = MaybeA ∼ = 1 + ASo, in abstract terms, one may regard as partial every function <strong>of</strong> signaturefor some A and B 1 .g1 + A B4.2 Putting partial functions togetherDo partial functions compose? Their types won’t match in general:g1 + B Af1 + C B1 In conventional programming, every function delivering a pointer as result — as in e.g. the Cprogramming language — can be regarded as one <strong>of</strong> these functions.


4.2. PUTTING PARTIAL FUNCTIONS TOGETHER 107Clearly, we have to extend f — which is itself a partial function — to some f ′ able toaccept arguments from 1 + B:1i 1... i 2f ′1 + B1 + C BfgAThe most “obvious” instance <strong>of</strong> the ellipsis (...) in the diagram above is i 1 and thiscorresponds to what is called strict composition: an exception produced by the producerfunction g is propagated to the output <strong>of</strong> the consumer function f:f • g def = [ i 1 ,f ] · g (4.1)Expressed in terms <strong>of</strong> Ext, composite function f • g works as follows:where(f • g)a = f ′ (g a)f ′ Error= Errorf ′ (Ok b) = f bAltogether, we have the following Haskell expression for the meaning <strong>of</strong> f • g:\a -> f’ (g a)where f’ Error = Errorf’ (Ok b) = f bNote that the adopted extension <strong>of</strong> f can be decomposed — by reverse +-absorption(2.41) — intoas displayed in diagramf ′ = [ i 1 ,id ] · (id + f)1 + (1 + C)id+f1 + BgA[ i 1 ,id ]f1 + C B
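Using the standard Maybe type in place of Ext, the strict composition scheme f • g = [i₁, f] · g can be transcribed as follows (kcomp and safeInv are our names, not the text’s):

```haskell
-- f • g: an exception raised by the producer g propagates;
-- otherwise the consumer f takes over, cf. (4.1).
kcomp :: (b -> Maybe c) -> (a -> Maybe b) -> a -> Maybe c
kcomp f g a = case g a of
  Nothing -> Nothing   -- i1: propagate the exception
  Just b  -> f b       -- [i1, f] applied to a normal value

-- The motivating partial function g x = 1/x, made total:
safeInv :: Double -> Maybe Double
safeInv 0 = Nothing
safeInv x = Just (1 / x)
```

For instance, kcomp safeInv safeInv inverts twice when possible and yields Nothing on input 0.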


108 CHAPTER 4. WHY MONADS MATTERAll in all, we have the following version <strong>of</strong> (4.1):f • g def = [ i 1 ,id ] · (id + f) · gDoes this functional composition scheme have a unit, that is, is there a u such thatf • u = f = u • f (4.2)holds? Clearly, if it exists, it must bear type 1 + A A . Let us solve (4.2) for u:uf • u = f = u • f≡ { substitution }[ i 1 ,f ] · u = f = [ i 1 ,u ] · f⇐ { let u = i 2 }[ i 1 ,f ] · i 2 = f = [ i 1 ,i 2 ] · f ∧ u = i 2≡ { by +-cancellation (2.38) and +-reflection (2.39) }f = f = id · f ∧ u = i 2⇐ { identity }u = i 24.3 ListsIn contrast to partial functions, which can produce no output, let us now consider functionswhich deliver too many outputs, for instance, lists <strong>of</strong> output values:B ⋆gAC ⋆fBFunctions f and g do not compose but once again one can think <strong>of</strong> extending the consumerfunction (f) by mapping it along the output <strong>of</strong> the producer function (g):(C ⋆ ) ⋆ f⋆B ⋆C ⋆ fB


4.4. MONADS 109To complete the process, one has to flatten the nested-sequence output in (C ⋆ ) ⋆ via the obviouslist-catamorphism C ⋆ concat (C ⋆ ) ⋆ , where concat def = (|[ [ ], + ]|). In summary:as captured in the following diagram:f • g def = concat · f ⋆ · g (4.3)(C ⋆ ) ⋆f ⋆B ⋆gAconcatC ⋆fBExercise 4.1.(4.3).✷Show that singl (recall exercise 3.7) is the unit u <strong>of</strong> • in the context <strong>of</strong>Exercise 4.2. Encode in HASKELL a pointwise version <strong>of</strong> (4.3). Hint: first apply (list)cata-absorption (3.67).✷4.4 MonadsBoth function composition schemes (4.1) and (4.3) above share the same polytypic pattern:the output <strong>of</strong> the producer function is “F-times” more elaborate than the input <strong>of</strong> theconsumer function, where F is some parametric datatype — F X = 1+X in case <strong>of</strong> (4.1),and F X = X ⋆ in case <strong>of</strong> (4.3). Then a composition scheme is devised for such functions,which is displayed inF(F C)F fF BgAµF C fB


110 CHAPTER 4. WHY MONADS MATTERand is given byµf • g def = µ · F f · g (4.4)where F A F 2 A is a suitable polymorphic function. Together with a unit functionF A uA and µ, datatype F will form a so-called monad type, <strong>of</strong> which 1 + and( ) ⋆ are the two examples seen above.Arrow µ · F f is called the extension <strong>of</strong> f. Functions µ and u are referred to as themonad’s multiplication and unit, respectively. The monadic composition scheme (4.4) iscalled Kleisli composition.fA monadic arrow FB A conveys the idea <strong>of</strong> a function which produces anoutput <strong>of</strong> “type” B “wrapped by F”, where datatype F describes some kind <strong>of</strong> (computational)“effect”. The monad’s unit FB B is a primitive monadic arrow whichuproduces (i.e. promotes, injects, wraps) data together with such an effect.The monad concept is nowadays <strong>of</strong> primary importance in computing science becauseit makes it possible to describe computational effects as disparate as input/output, statevariable updating, context dependence, partial behaviour (seen above) etc. in an elegantand uniform way. Moreover, the monad’s operators exhibit notable properties which makeit possible to reason about such computational effects.The remainder <strong>of</strong> this section is devoted to such properties. First <strong>of</strong> all, the propertiesimplicit in the following diagrams will be required for F to be regarded as a monad:Multiplication :F 2 AµF 3 Aµ · µ = µ · F µ (4.5)µF AµF µF 2 AUnit :F 2 AµF Auid µF AF uF 2 Aµ · u = µ · F u = id (4.6)Simple but beautiful symmetries apparent in these diagrams make it easy to memorizetheir laws and check them for particular cases. For instance, for the (1 + ) monad, law(4.6) will read as follows:[ i 1 ,id ] · i 2 = [ i 1 ,id ] · (id + i 2 ) = id


4.4. MONADS 111

These equalities are easy to check.

In laws (4.5) and (4.6), the different instances of µ and u are differently typed, as these are polymorphic and exhibit natural properties:

µ-natural:

F f · µ = µ · F² f    (4.7)

u-natural:

F f · u = u · f    (4.8)

The simplest of all monads is the identity monad F X def= X, which is such that µ = id, u = id and f • g = f · g. So — in a sense — one may think of all the functional discipline studied so far as a particular case of a wider discipline in which an arbitrary monad is present.

4.4.1 Properties involving (Kleisli) composition

The following properties arise from the definitions and monadic properties presented above:

f • (g • h) = (f • g) • h    (4.9)
u • f = f = f • u    (4.10)
(f • g) · h = f • (g · h)    (4.11)
(f · g) • h = f • (F g · h)    (4.12)
id • id = µ    (4.13)

Properties (4.9) and (4.10) are the monadic counterparts of, respectively, (2.8) and (2.10), meaning that monadic composition preserves the properties of normal functional composition. In fact, for the identity monad, these properties coincide with each other.

Above we have shown that property (4.10) holds for the list monad, recall (4.2). A general proof can be produced similarly. We select property (4.9) as an illustration of the


112 CHAPTER 4. WHY MONADS MATTERrôle <strong>of</strong> the monadic properties:f • (g • h)= { definition (4.4) twice }µ · F f · (µ · F g · h)= { µ is natural (4.7) }µ · µ · F(F f) · F g · h= { functor F }µ · µ · F(F f · g) · h= { definition (4.4) }µ · (F f · g) • h= { definition (4.4) }(f • g) • hExercise 4.3. Check the other laws above.✷4.5 Monadic application (binding)The monadic extension <strong>of</strong> functional application ap (2.67) is another operator ap ′ whichis intended to be “tolerant” in face <strong>of</strong> any F’ed argument x:(F B) A × F A ap′ Bap ′ (f,x) = f ′ x = (µ · F f)x(4.14)If in curry/flipped format, monadic application is called binding and denoted by symbol“>>=”, looking very much like postfix functional application,((F B) A ) F A >= F B (4.15)that is:x >>= fdef= (µ · F f)x (4.16)


This operator will exhibit properties arising from its definition and the basic monadic properties, e.g.

   x >>= u
≡    { definition (4.16) }
   (µ · F u) x
≡    { law (4.6) }
   id x
≡    { identity function }
   x

At pointwise level, one may chain monadic compositions from left to right, e.g.

   (((x >>= f₁) >>= f₂) >>= ... fₙ₋₁) >>= fₙ

for functions f₁ : A → F B₁, f₂ : B₁ → F B₂, ..., fₙ : Bₙ₋₁ → F Bₙ.

4.6 Sequencing and the do-notation

Given two monadic values x and y, it becomes possible to "sequence" them, thus obtaining another such value, by defining the following operator:

   x >> y def= x >>= y̲    (4.17)

where y̲ denotes the "everywhere y" constant function. For instance, within the finite-list monad, one has

   [1,2] >> [3,4] = (concat · ([3,4]̲)⋆)[1,2] = concat [[3,4],[3,4]] = [3,4,3,4]

Because this operator is associative (prove this as an exercise), one may iterate it to more than two arguments and write, for instance,

   x₁ >> x₂ >> ... >> xₙ

This leads to the popular do notation, which is another piece of (pointwise) notation which makes sense in a monadic context:

   do x₁; x₂; ...; xₙ  def=  x₁ >> do x₂; ...; xₙ

for n ≥ 1. For n = 1 one trivially has

   do x₁  def=  x₁
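The sequencing operator (4.17) for the list monad can be sketched and its associativity spot-checked as follows; seqL is our own name (the Prelude already reserves (>>)), and const y plays the role of the constant function:

```haskell
-- Sequencing (4.17) in the list monad: x >> y = x >>= const y,
-- i.e. map the "everywhere y" function over x and flatten.
seqL :: [a] -> [b] -> [b]
seqL x y = concat (map (const y) x)
```

For instance, `seqL [1,2] [3,4]` yields `[3,4,3,4]`, matching the worked example above.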


4.7 Generators and comprehensions

The do-notation accepts a variant in which the arguments of the >> operator are "generators" of the form

   a ← x    (4.18)

where, for a of type A, x is an inhabitant of monadic type F A. One may regard a ← x as meaning "let a be taken from x". Then the do-notation extends as follows:

   do a ← x₁; x₂; ...; xₙ  def=  x₁ >>= λa.(do x₂; ...; xₙ)    (4.19)

Of course, we should now allow for the xᵢ to range over terms involving variable a. For instance (again in the list monad), by writing

   do a ← [1,2,3]; [a²]    (4.20)

we mean

   [1,2,3] >>= λa.[a²]
=  concat((λa.[a²])⋆ [1,2,3])
=  concat [[1],[4],[9]]
=  [1,4,9]

The analogy with classical set-theoretic ZF-notation, whereby one might write {a² | a ∈ {1,2,3}} to describe the set of the first three perfect squares, calls for the following notation,

   [ a² | a ← [1,2,3] ]    (4.21)

as a "shorthand" of (4.20). This is an instance of the so-called comprehension notation, which can be defined in general as follows:

   [ e | a₁ ← x₁, ..., aₙ ← xₙ ] = do a₁ ← x₁; ...; aₙ ← xₙ; u(e)    (4.22)

Alternatively, comprehensions can be defined as follows, where p, q stand for arbitrary generators:

   [ t ] = u t    (4.23)
   [ f x | x ← l ] = (F f) l    (4.24)
   [ t | p, q ] = µ [ [ t | q ] | p ]    (4.25)

Note, however, that comprehensions are not restricted to lists or sets — they can be defined for any monad F.
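The example (4.20, 4.21) and the nesting rule (4.25) can be checked in Haskell, where list comprehensions are built in. The names below are ours; the primed versions spell out the monadic unfoldings:

```haskell
-- (4.21) and its unfolding (4.20): unit is the singleton list, F f = map f,
-- mu = concat.
squares, squares' :: [Integer]
squares  = [ a ^ 2 | a <- [1,2,3] ]
squares' = concat (map (\a -> [a ^ 2]) [1,2,3])

-- Two generators, following (4.25): [ t | p, q ] = mu [ [ t | q ] | p ].
pairs, pairs' :: [(Integer, Char)]
pairs  = [ (a, b) | a <- [1,2], b <- "xy" ]
pairs' = concat [ [ (a, b) | b <- "xy" ] | a <- [1,2] ]
```

Both pairs of definitions agree: squares is [1,4,9] and pairs enumerates the cartesian product in the expected order.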


4.8 Monads in HASKELL

In the Standard Prelude for HASKELL, one finds the following minimal definition of the Monad class,

   class Monad m where
     return :: a -> m a
     (>>=)  :: m a -> (a -> m b) -> m b

where return refers to the unit of m, on top of which the "sequence" operator

   (>>)  :: m a -> m b -> m b
   fail  :: String -> m a

is defined by

   p >> q = p >>= \ _ -> q

as expected. This class is instantiated for finite sequences ([]), Maybe and IO.

The µ multiplication operator is function join in module Monad.hs:

   join :: (Monad m) => m (m a) -> m a
   join x = x >>= id

This is easily justified:

   join x = x >>= id    (4.26)
=    { definition (4.16) }
   (µ · F id) x
=    { functors commute with identity (3.44) }
   (µ · id) x
=    { law (2.10) }
   µ x

In Mpi.hs we define (Kleisli) monadic composition in terms of the binding operator:

   (.!) :: Monad a => (b -> a c) -> (d -> a b) -> d -> a c
   (f .! g) a = (g a) >>= f
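The Kleisli combinator above can be exercised on the Maybe instance. Below it is restated with conventional type-variable names; safeHead and safePred are hypothetical partial functions of our own, introduced only for illustration:

```haskell
-- (Kleisli) monadic composition, as in Mpi.hs, defined via binding.
(.!) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
(f .! g) a = g a >>= f

-- Hypothetical examples: two partial functions made total via Maybe.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

safePred :: Integer -> Maybe Integer
safePred n = if n > 0 then Just (n - 1) else Nothing
```

Then `(safePred .! safeHead) [3,7]` yields `Just 2`, while both `(safePred .! safeHead) []` and `(safePred .! safeHead) [0]` yield `Nothing`: failure at either stage propagates, as the theory prescribes.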


4.8.1 Monadic I/O

IO, a parametric datatype whose inhabitants are special values called actions or commands, is a most relevant monad. Actions perform the interconnection between HASKELL and the environment (file system, operating system). For instance, getLine :: IO String is a particular action. Parameter String refers to the fact that this action "delivers" — or extracts — a string from the environment. This meaning is clearly conveyed by the type String assigned to symbol l in

   do l ← getLine; ... l ...

which is consistent with the typing rule for generators (4.18). Sequencing corresponds to the ";" syntax in most programming languages (e.g. C) and the do-notation is particularly intuitive in the IO-context.

Examples of functions delivering actions are

   readFile : FilePath → IO String
   putChar  : Char → IO ()

— both produce I/O commands as result.

As is to be expected, the implementation of the IO monad in HASKELL — available from the Standard Prelude — is not totally visible, for it is bound to deal with the intricacies of the underlying machine:

   instance Monad IO where
     (>>=)  = primbindIO
     return = primretIO

Rather interesting is the way IO is regarded as a functor:

   fmap f x = x >>= (return . f)

This goes the other way round, the monadic structure "helping" in defining the functor structure, everything consistent with the underlying theory:

   x >>= (u · f)
=    { definition (4.16) }
   (µ · IO(u · f)) x
=    { functors commute with composition }
   (µ · IO u · IO f) x
=    { law (4.6) for F = IO }
   (IO f) x
=    { definition of fmap }
   (fmap f) x
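The monadic definition of fmap just justified is not specific to IO; it works for every monad. A sketch, under our own name fmapM to avoid clashing with the Prelude, tested on monads whose values admit equality (which IO values do not):

```haskell
-- fmap obtained from the monadic structure, as in the IO instance above:
-- map f inside the monad by binding and re-injecting with return.
fmapM :: Monad m => (a -> b) -> m a -> m b
fmapM f x = x >>= (return . f)
```

For instance, `fmapM (+1) [1,2,3]` yields `[2,3,4]` and `fmapM (*2) (Just 5)` yields `Just 10`, agreeing with the Prelude's fmap in both cases.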


For enjoyable reading on monadic input/output in HASKELL see [Hud00], chapter 18.

Exercise 4.4. Use the do-notation and the comprehension notation to output the following truth-table, in HASKELL:

   p / q   False  True
   False   False  False
   True    False  True

✷

Exercise 4.5. Extend the Maybe monad to the following "error message" exception-handling datatype:

   data Error a = Err String | Ok a deriving Show

In case of several error messages issued in a do sequence, how many turn up on the screen? Which ones? ✷

4.9 The state monad

NB: this section is still very drafty.

The so-called state monad is a monad whose inhabitants are state-transitions encoding a particular brand of state-based automaton known as the Mealy machine. Given a set A (input alphabet), a set B (output alphabet) and a set of states S, a deterministic Mealy machine (DMM) is specified by a transition function of type

   δ : A × S → B × S    (4.27)

Wherever (b, s′) = δ(a, s), we say that the machine has transition s --a|b--> s′


and refer to s as the before state, and to s′ as the after state.

It is clear from (4.27) that δ can be expressed as the split of two functions f and g, δ = ⟨f, g⟩, where b = f(a, s) and s′ = g(a, s)    (4.28)

The information recorded in the state of a DMM is either meaningless to the user of the machine (as in e.g. the case of states represented by numbers) or too complex to be handled explicitly (as is the case of e.g. the data kept in a large database). So, it is convenient to abstract from it. Such an abstraction leads to the state monad in the following way: recalling (2.75), we simply curry δ,

   δ̄ : A → (B × S)^S    (4.29)

the codomain (B × S)^S being abbreviated to (St S) B, thus "shifting" the input state to the output. In this way, δ̄ a is a function capturing all state-transitions (and corresponding outputs) for input a. For instance, the function which appends a new element to the back of a queue,

   enq(a, s) def= s ++ [a]

can be converted into a DMM by adding to it a dummy output of type 1 and then transposing:

   enqueue : A → (1 × S)^S
   enqueue a def= ⟨!, (++ [a])⟩    (4.30)

Action enqueue performs enq on the state while acknowledging it by issuing an output of type 1.
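The curried transition type (4.29) and the enqueue example (4.30) transliterate directly into Haskell. In the sketch below (our own names), () plays the role of the singleton type 1 and (++ [a]) is the "append at the back" function:

```haskell
-- A curried state transition, as in (4.29): take the before state,
-- deliver an output paired with the after state.
type St s b = s -> (b, s)

-- Append at the back of a queue, uncurried as in the text.
enq :: (a, [a]) -> [a]
enq (a, s) = s ++ [a]

-- enqueue (4.30): perform enq on the state, with a dummy output of type ().
enqueue :: a -> St [a] ()
enqueue a = \s -> ((), enq (a, s))
```

For instance, `enqueue 3 [1,2]` yields `((), [1,2,3])`: the output is the dummy acknowledgement and the state has grown at the back.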


Unit and multiplication. Let us show that

   (St S) A ≅ (A × S)^S    (4.31)

forms a monad. As we shall see, the fact that the values of this monad are functions brings the theory of exponentiation to the forefront. (Thus a review of section 2.14 is recommended.) Notation f̂ will be used to abbreviate uncurry f, and ‾f to denote the transpose (curry) of f. Thus the following variant of universal law (2.67),

   k̂ = f ⇔ f = ap · (k × id)    (4.32)

whose cancellation

   k̂ = ap · (k × id)    (4.33)

is written pointwise as follows:

   k̂(c, a) = (k c) a    (4.34)

First of all, what is the functor behind (4.31)? Fixing the state space S, we obtain

   F X def= (X × S)^S    (4.35)

on objects and

   F f def= (f × id)^S    (4.36)

on functions, where ( )^S is the exponential functor (2.71).

The unit of this monad is the transpose of the simplest of all Mealy machines — the identity:

   u : A → (A × S)^S
   u = ‾id    (4.37)

Let us see what this means:

   u = ‾id
≡    { (2.67) }
   ap · (u × id) = id
≡    { introducing variables }
   ap(u a, s) = (a, s)
≡    { definition of ap }
   (u a) s = (a, s)
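Pointwise, the derivation just concluded says that the unit pairs its argument with the incoming state, leaving the state untouched. A one-line Haskell sketch (unitSt is our own name, kept distinct from the Prelude's return):

```haskell
-- A curried state transition, as in (4.29)/(4.31).
type St s a = s -> (a, s)

-- Unit (4.37), pointwise: (u a) s = (a, s).
unitSt :: a -> St s a
unitSt a = \s -> (a, s)
```

For instance, `unitSt 'x' 0` yields `('x', 0)`: output 'x', state unchanged.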


From the type of µ for this monad,

   µ : ((A × S)^S × S)^S → (A × S)^S

one figures out µ = x^S (recalling the exponential functor as defined by (2.71)) for x of type (A × S)^S × S → A × S. This, in its turn, is easily recognized as an instance of the polymorphic function ap (2.67), which is such that ‾ap = id, recall (2.69). Altogether, we define

   µ = ap^S    (4.38)

Let us check the meaning of µ by applying it to an action expressed as in diagram (2.75):

   µ⟨f, g⟩ = ap^S ⟨f, g⟩
≡    { (2.71) }
   µ⟨f, g⟩ = ap · ⟨f, g⟩
≡    { extensional equality (2.5) }
   µ⟨f, g⟩ s = ap(f s, g s)
≡    { definition of ap }
   µ⟨f, g⟩ s = (f s)(g s)

We find out that µ "unnests" the action inside f by applying it to the state delivered by g.

Checking the monadic laws. The calculation of (4.6) is made in two parts, checking µ · u = id first,

   µ · u
=    { definitions }
   ap^S · ‾id
=    { exponentials absorption (2.72) }
   ‾(ap · id)
=    { reflection (2.69) }
   id


and then checking µ · (F u) = id:

   µ · (F u)
=    { (4.38, 4.36) }
   ap^S · (‾id × id)^S
=    { functor }
   (ap · (‾id × id))^S
=    { cancellation (2.68) }
   id^S
=    { functor }
   id

The proof of (4.5) is also not difficult once supported by the laws of exponentials.

Kleisli composition. Let us calculate f • g for this monad:

   f • g
=    { (4.4) }
   µ · F f · g
=    { (4.38) ; (4.36) }
   ap^S · (f × id)^S · g
=    { ( )^S is a functor }
   (ap · (f × id))^S · g
=    { (4.32) }
   f̂^S · g
=    { cancellation }
   f̂^S · ‾ĝ
=    { absorption (2.72) }
   ‾(f̂ · ĝ)

In summary, we have:

   f • g = ‾(f̂ · ĝ)    (4.39)
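Pointwise, (4.38) says that µ runs the outer action and then runs the action it delivers on the resulting state, and (4.39) says that f • g feeds g's output and after-state to f. A sketch in Haskell (our own names throughout), including a quick check of the stack law pop • push = u calculated next, with push and pop reading their state as a list built with (:):

```haskell
type St s a = s -> (a, s)

-- Multiplication (4.38), pointwise: mu <f,g> s = (f s)(g s).
muSt :: St s (St s a) -> St s a
muSt x = \s -> let (f, s') = x s in f s'

-- Kleisli composition (4.39), pointwise: run g, then f on its results.
kleisliSt :: (b -> St s c) -> (a -> St s b) -> a -> St s c
kleisliSt f g a = \s -> let (b, s') = g a s in f b s'

-- Unit (4.37), for comparison with pop . push below.
unitSt :: a -> St s a
unitSt a = \s -> (a, s)

-- Stack actions in the style of (4.41, 4.42).
push :: a -> St [a] ()
push a = \s -> ((), a : s)

pop :: () -> St [a] a
pop () = \s -> (head s, tail s)
```

For instance, `kleisliSt pop push 5 [1,2]` yields `(5, [1,2])`, which is exactly `unitSt 5 [1,2]`: pushing and then popping is the unit.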


Let us use this in calculating law

   pop • push = u    (4.40)

where push and pop are such that

   push : A → (1 × S)^S
   p̂ush def= ⟨!, (:)̂⟩    (4.41)

   pop : 1 → (A × S)^S
   p̂op def= ⟨head, tail⟩ · π₂    (4.42)

for S the datatype of finite lists. We reason:

   pop • push
=    { (4.39) }
   ‾(p̂op · p̂ush)
=    { (4.41, 4.42) }
   ‾(⟨head, tail⟩ · π₂ · ⟨!, (:)̂⟩)
=    { (2.20, 2.24) }
   ‾(⟨head, tail⟩ · (:)̂)
=    { out · in = id (lists) }
   ‾id
=    { (4.37) }
   u

Bind. The effect of binding a state transition x to a state-monadic function h is calculated in a similar way:

   x >>= h
=    { (4.16) }
   (µ · F h) x
=    { (4.38) and (4.36) }
   (ap^S · (h × id)^S) x
=    { ( )^S is a functor }


   (ap · (h × id))^S x
=    { cancellation (4.33) }
   ĥ^S x
=    { exponential functor (2.71) }
   ĥ · x

Let us unfold ĥ · x by splitting x into its two components f and g:

   ⟨f, g⟩ >>= h = ĥ · ⟨f, g⟩
≡    { go pointwise }
   (⟨f, g⟩ >>= h) s = ĥ(⟨f, g⟩ s)
≡    { (2.18) }
   (⟨f, g⟩ >>= h) s = ĥ(f s, g s)
≡    { (4.34) }
   (⟨f, g⟩ >>= h) s = h(f s)(g s)

In summary, for a given "before state" s, g s is the intermediate state upon which f s runs and yields the output and (final) "after state".

Two prototypical inhabitants of the state monad: get and put. These generic actions are defined as follows, in the PF-style:

   get def= ⟨id, id⟩    (4.43)
   put def= ‾⟨!, π₁⟩    (4.44)

Action get retrieves the data stored in the state while put (which can also be written

   put s = modify s̲    (4.45)

where

   modify f def= ⟨!, f⟩    (4.46)

updates the state via state-to-state function f) stores a particular value in the state.

The following is an example, in Haskell, of the standard use of get/put in managing context data, in this case a counter. The function decorates each node of a BTree (recall this datatype from page 91) with its position in the tree:


   decBTree Empty = return Empty
   decBTree (Node (a, (t1, t2))) =
     do n <- get
        put (n + 1)
        x <- decBTree t1
        y <- decBTree t2
        return (Node ((a, n), (x, y)))
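The example above can be made self-contained and runnable as follows. The St newtype, its instances and the BTree declaration are our own scaffolding; the book's BTree of page 91 is assumed to have this shape:

```haskell
-- Minimal state monad with get/put, supporting the decBTree example.
newtype St s a = St { runSt :: s -> (a, s) }

instance Functor (St s) where
  fmap f (St t) = St (\s -> let (a, s') = t s in (f a, s'))

instance Applicative (St s) where
  pure a = St (\s -> (a, s))
  St tf <*> St ta = St (\s -> let (f, s')  = tf s
                                  (a, s'') = ta s'
                              in  (f a, s''))

instance Monad (St s) where
  St t >>= h = St (\s -> let (a, s') = t s in runSt (h a) s')

get :: St s s                      -- (4.43): deliver the state, keep it
get = St (\s -> (s, s))

put :: s -> St s ()                -- (4.44): overwrite the state
put s = St (\_ -> ((), s))

data BTree a = Empty | Node (a, (BTree a, BTree a)) deriving (Eq, Show)

-- Decorate each node with a visit counter kept in the state.
decBTree :: BTree a -> St Int (BTree (a, Int))
decBTree Empty = return Empty
decBTree (Node (a, (t1, t2))) =
  do n <- get
     put (n + 1)
     x <- decBTree t1
     y <- decBTree t2
     return (Node ((a, n), (x, y)))
```

Running `runSt (decBTree t) 0` decorates t with counters starting at 0 and also returns the total number of nodes as the final state.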


Part II

Moving Away From (Pure) Functions


Bibliography

[ABH+92] C. Aarts, R.C. Backhouse, P. Hoogendijk, E. Voermans, and J. van der Woude. A relational theory of datatypes, December 1992. Available from www.cs.nott.ac.uk/~rcb.

[Bac78] J. Backus. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. CACM, 21(8):613–639, August 1978.

[Bac04] R.C. Backhouse. Mathematics of Program Construction. Univ. of Nottingham, 2004. Draft of book in preparation. 608 pages.

[Bar01] L.S. Barbosa. Components as Coalgebras. University of Minho, December 2001. Ph.D. thesis.

[BD77] R.M. Burstall and J. Darlington. A transformation system for developing recursive programs. JACM, 24(1):44–67, January 1977.

[BdM97] R. Bird and O. de Moor. Algebra of Programming. Series in Computer Science. Prentice-Hall International, 1997. C.A.R. Hoare, series editor.

[Bir98] R. Bird. Introduction to Functional Programming. Series in Computer Science. Prentice-Hall International, 2nd edition, 1998. C.A.R. Hoare, series editor.

[Flo67] R.W. Floyd. Assigning meanings to programs. In J.T. Schwartz, editor, Mathematical Aspects of Computer Science, volume 19, pages 19–32. American Mathematical Society, 1967. Proc. Symposia in Applied Mathematics.

[Fok92] M.M. Fokkinga. Law and Order in Algorithmics. PhD thesis, University of Twente, Dept INF, Enschede, The Netherlands, 1992.

[GHA01] Jeremy Gibbons, Graham Hutton, and Thorsten Altenkirch. When is a function a fold or an unfold?, 2001. WGP, July 2001 (slides).


[Gof84] J. Le Goff. Calendário, volume I, chapter 8, pages 260–292. I.N.-C.M., 1984. EINAUDI Encyclopedia (Portuguese translation).

[HM93] Graham Hutton and Erik Meijer. Monadic parsing in Haskell. Journal of Functional Programming, 8(4), 1998.

[Hud00] P. Hudak. The Haskell School of Expression - Learning Functional Programming Through Multimedia. Cambridge University Press, 1st edition, 2000. ISBN 0-521-64408-9.

[JJ96] J. Jeuring and P. Jansson. Polytypic programming. In Advanced Functional Programming, number 1129 in LNCS, pages 68–114. Springer, 1996.

[JJ98] P. Jansson and J. Jeuring. Polylib — a library of polytypic functions. In Workshop on Generic Programming (WGP'98), Marstrand, Sweden, 1998.

[Jon03] S.L. Peyton Jones. Haskell 98 Language and Libraries. Cambridge University Press, Cambridge, UK, 2003. Also published as a Special Issue of the Journal of Functional Programming, 13(1), Jan. 2003.

[LV03] R. Lämmel and J. Visser. A Strafunski Application Letter. In V. Dahl and P.L. Wadler, editors, Proc. of Practical Aspects of Declarative Programming (PADL'03), volume 2562 of LNCS, pages 357–375. Springer-Verlag, January 2003.

[MA86] E.G. Manes and M.A. Arbib. Algebraic Approaches to Program Semantics. Texts and Monographs in Computer Science. Springer-Verlag, 1986. D. Gries, series editor.

[Mal90] G. Malcolm. Data structures and program transformation. Science of Computer Programming, 14:255–279, 1990.

[McC63] J. McCarthy. Towards a mathematical science of computation. In C.M. Popplewell, editor, Proc. of IFIP 62, pages 21–28, Amsterdam-London, 1963. North-Holland Pub. Company.

[MH95] E. Meijer and G. Hutton. Bananas in space: Extending fold and unfold to exponential types. In S. Peyton Jones, editor, Proceedings of Functional Programming Languages and Computer Architecture (FPCA95), 1995.

[Mog89] Eugenio Moggi. Computational lambda-calculus and monads. In Proceedings 4th Annual IEEE Symp. on Logic in Computer Science, LICS'89, Pacific Grove, CA, USA, 5–8 June 1989, pages 14–23. IEEE Computer Society Press, Washington, DC, 1989.


[NR69] P. Naur and B. Randell, editors. Software Engineering: Report on a conference sponsored by the NATO SCIENCE COMMITTEE, Garmisch, Germany, 7th to 11th October 1968. Scientific Affairs Division, NATO, 1969.

[Oli08a] J.N. Oliveira. Extended static checking by calculation using the pointfree transform, 2008. Tutorial paper (56 p.) accepted for publication by Springer-Verlag, LNCS series.

[Oli08b] J.N. Oliveira. Transforming Data by Calculation. In GTTSE'07, volume 5235 of LNCS, pages 134–195. Springer, 2008.

[OR06] J.N. Oliveira and C.J. Rodrigues. Pointfree factorization of operation refinement. In FM'06, volume 4085 of LNCS, pages 236–251. Springer-Verlag, 2006.

[PH70] M.S. Paterson and C.E. Hewitt. Comparative schematology. In Project MAC Conference on Concurrent Systems and Parallel Computation, pages 119–127, August 1970.

[Wad89] P.L. Wadler. Theorems for free! In 4th International Symposium on Functional Programming Languages and Computer Architecture, pages 347–359, London, Sep. 1989. ACM.


