J.N. Oliveira

PROGRAM DESIGN BY CALCULATION

(DRAFT)

University of Minho
(in preparation)




CONTENTS

2.19 Exercises
2.20 Bibliography notes

3 Recursion in the Pointfree Style
  3.1 Motivation
  3.2 Introducing inductive datatypes
  3.3 Observing an inductive datatype
  3.4 Synthesizing an inductive datatype
  3.5 Introducing (list) catas, anas and hylos
  3.6 Inductive types more generally
  3.7 Functors
  3.8 Polynomial functors
  3.9 Polynomial inductive types
  3.10 F-algebras and F-homomorphisms
  3.11 F-catamorphisms
  3.12 Parameterization and type functors
  3.13 A catalogue of standard polynomial inductive types
  3.14 Functors and type functors in HASKELL
  3.15 The mutual-recursion law
  3.16 “Banana-split”: a corollary of the mutual-recursion law
  3.17 Inductive datatype isomorphism
  3.18 Bibliography notes

4 Why Monads Matter
  4.1 Partial functions
  4.2 Putting partial functions together
  4.3 Lists
  4.4 Monads
    4.4.1 Properties involving (Kleisli) composition
  4.5 Monadic application (binding)
  4.6 Sequencing and the do-notation
  4.7 Generators and comprehensions
  4.8 Monads in HASKELL
    4.8.1 Monadic I/O
  4.9 The state monad
  4.10 Bibliography notes


II Moving Away From (Pure) Functions

5 Quasi-inductive datatypes
  5.1 Introducing Non-inductive Datatypes
  5.2 Structural Induction
  5.3 Well-founded coalgebras and induction
  5.4 Datatype invariants and proof obligations
  5.5 Binary relations and finite mappings
  5.6 Finite mapping induction principle
  5.7 An overview of the powerset and finite mapping algebras
  5.8 Exercises
  5.9 Bibliography notes

III Calculating with Relations

6 Introduction to Relational Calculation
  6.1 Functions are not enough
  6.2 Relational composition and converse
  6.3 Relational equality
  6.4 Specifications as “properties”
  6.5 Relational approach
  6.6 Pre/post specification style
  6.7 From predicates to relations
  6.8 Basic relational combinators
  6.9 Converse
  6.10 Meet
  6.11 Pointwise vs pointfree notation
  6.12 Orders and their taxonomy
  6.13 Derived combinators
  6.14 Entireness and simplicity
  6.15 Surjectiveness and injectiveness
  6.16 Binary relation taxonomy
  6.17 Reasoning about functions
  6.18 Galois connections
  6.19 Converse in a Galois connection
  6.20 Functions in a Galois connection
  6.21 Relational division


  6.22 Meet and join
  6.23 Relational split
  6.24 Relational either
  6.25 Meaning of VDM-SL specs
  6.26 Relational semantics of VDM-SL
  6.27 Relational McCarthy conditional
  6.28 Reasoning about VDM-SL
  6.29 Bibliography notes

7 An Introduction to Relational Hylomorphisms
  7.1 “How” does one specify?
  7.2 Divide-and-conquer (formally)
  7.3 Relators
  7.4 Properties of relators
  7.5 Equations and fixpoints
  7.6 Solving (Fixpoint) Equations
  7.7 Solving (Fixpoint) Equations III
  7.8 Solving relational equations
  7.9 Laws of the Fixpoint Calculus
  7.10 µ-fusion theorem
  7.11 Hylo(cata)-fusion
  7.12 Hylo(ana)-fusion
  7.13 Examples: VDM collective types
  7.14 Relational cata(ana)morphisms
  7.15 Inductive coreflexives
  7.16 Hylos as unique solutions
  7.17 Accessibility and membership
  7.18 Hylo-factorization Theorem
  7.19 Virtual data-structuring
  7.20 Final note on inductive relation ≺
  7.21 Bibliography notes

8 Theorems for Free
  8.1 Parametric polymorphism: why?
  8.2 Free theorem of type t
  8.3 Reynolds arrow operator
  8.4 Pointwise version of FT
  8.5 Second example: FT of (| |)


  8.6 Bibliography notes

IV Data Refinement by Calculation

9 On Data Representation
  9.1 Introduction
  9.2 Data refinement
  9.3 Representation relations
  9.4 Right invertibility
  9.5 Refinement inequations
  9.6 Relational representation
  9.7 Functional representation
  9.8 Concrete invariants
  9.9 A fundamental iso abstraction
  9.10 Pointfree untot = (i ◦ 1·)
  9.11 Properties of ≤
  9.12 Structural data refinement
  9.13 The finite map bifunctor
  9.14 Transposing relations
  9.15 Transposing finite relations
  9.16 Recursive data refinement
  9.17 Recursion “removal”
  9.18 Closure and wellfoundedness
  9.19 Object oriented Data Implementation
  9.20 Multiple inheritance
  9.21 Bibliography notes

10 On Algorithmic Refinement
  10.1 Implicit/explicit refinement
  10.2 Handling refinement equations
  10.3 Solving refinement equations
  10.4 Properties of ⊢
  10.5 Stepwise refinement
  10.6 Refinement is a partial order
  10.7 Main refinement strategies
  10.8 Data refinement in full
  10.9 Analysis of refinement equation


  10.10 Solving refinement equations
  10.11 Functional solutions
  10.12 Calculation of while/for loops
  10.13 Bibliography notes

A Haskell Support Library
  A.0.1 Set.hs

B Solutions to Selected Exercises


List of Exercises

Chapter 2: Exercises 2.1–2.25
Chapter 3: Exercises 3.1–3.3


Chapter 3 (continued): Exercises 3.4–3.15
Chapter 4: Exercises 4.1–4.5
Chapter 5: Exercises 5.1–5.9


Preamble

This textbook in preparation has arisen from the author's research and teaching experience. Its main aim is to provide software practitioners with a calculational approach to the design of software artifacts, ranging from simple algorithms and functions to the specification and realization of information systems. Put in other words, the book invites software designers to raise their standards and adopt mature development techniques found in other engineering disciplines, which (as a rule) are rooted in a sound mathematical basis so as to enable algebraic reasoning.

It is interesting to note that, while coining the phrase software engineering in the 1960s, our colleagues of the time were already promising such high quality standards. The terminology seems to date from the Garmisch NATO conference in 1968, from whose report [NR69] the following excerpt is quoted:

    In late 1967 the Study Group recommended the holding of a working conference on Software Engineering. The phrase ‘software engineering’ was deliberately chosen as being provocative, in implying the need for software manufacture to be based on the types of theoretical foundations and practical disciplines, that are traditional in the established branches of engineering.

Provocative or not, the need for sound theoretical foundations has clearly been under concern since the very beginning of the discipline. However, how “scientific” do such foundations turn out to be, now that four decades have since elapsed?

Ten years ago, Richard Bird and Oege de Moor published a textbook [BdM97] in whose preface C.A.R. Hoare writes:

    Programming notation can be expressed by “formulæ and equations (...) which share the elegance of those which underlie physics and chemistry or any other branch of basic science”.

The formulæ and equations mentioned in this quotation are those of the discipline known as the Algebra of Programming. Many others have contributed to this body


of knowledge, notably Roland Backhouse and his colleagues at Eindhoven and Nottingham, see e.g. [ABH+92] and [Bac04], among many others. Unfortunately, both of these references are still unpublished.

When the author of this draft textbook decided to teach the Algebra of Programming to 2nd year students of the Minho degrees in computer science, back in 1998, he found [BdM97] too difficult for the students to follow, mainly because of its explicit categorial (allegorical) flavour. So he decided to start writing slides and notes to help the students read the book. Eventually, such notes became chapters 2 to 4 of the current version of the monograph. The same procedure was followed when teaching the relational approach of [BdM97] to 4th and 5th year students (today at master level).

This draft book is by and large incomplete, most chapters being still in slide form. Such chapters are omitted from the current print-out.

Braga, University of Minho, December 2008
José N. Oliveira


Part I

Calculating with Functions


Chapter 2

An Introduction to Pointfree Programming

Everybody has been familiar with the concept of a function since the school desk. The functional intuition traverses mathematics from end to end because it has a solid semantics rooted in a well-known mathematical system — the class of “all” sets and set-theoretical functions.

Functional programming literally means “programming with functions”. Programming languages such as LISP or HASKELL allow us to program with functions. However, the functional intuition reaches much further than producing code which runs on a computer. Since the pioneering work of John McCarthy — the inventor of LISP — in the early 1960s, one knows that other branches of programming can be structured, or expressed, functionally. The idea of producing programs by calculation, that is to say, of calculating efficient programs out of abstract, inefficient ones, has a long tradition in functional programming.

This book is structured around the idea that functional programming can be used as a basis for teaching programming as a whole, from the successor function n ↦ n + 1 to large information system design.

This chapter provides a lightweight introduction to the theory of functional programming. Its emphasis is on explaining how to construct new functions out of other functions using a minimal set of predefined functional combinators. This leads to a programming style which is point free in the sense that function descriptions dispense with variables (definition points).

Many technical issues are deliberately ignored and deferred to later chapters. Most programming examples will be provided in the HASKELL functional programming language. Appendix A includes the listings of some HASKELL modules which complement the HUGS Standard Prelude (which is based very closely on the Standard Prelude for HASKELL 1.4) and help to “animate” the main concepts introduced in this chapter.
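As a first taste of what “dispensing with variables” means in HASKELL, compare a pointwise definition with its pointfree equivalent. The example function and its names (sumOddsPw, sumOddsPf) are ours, chosen just for illustration:

```haskell
-- Pointwise: the argument xs is named explicitly.
sumOddsPw :: [Int] -> Int
sumOddsPw xs = sum (filter odd xs)

-- Pointfree: the argument disappears; the function is assembled
-- from predefined combinators using composition (.).
sumOddsPf :: [Int] -> Int
sumOddsPf = sum . filter odd
```

Both definitions denote the same function; for instance, both map [1 .. 5] to 9.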


2.1 Introducing functions and types

The definition of a function

    f : A → B    (2.1)

can be regarded as a kind of “process” abstraction: it is a “black box” which produces an output once it is supplied with an input:

    x (∈ A) --> [ f ] --> f x (∈ B)

From another viewpoint, f can be regarded as a kind of “contract”: it commits itself to producing a B-value provided it is supplied with an A-value. How is such a value produced? In many situations one wishes to ignore this, because one is just using function f. In others, however, one may want to inspect the internals of the “black box” in order to know the function's computation rule. For instance,

    succ : IN → IN
    succ n  def=  n + 1

expresses the computation rule of the successor function — the function succ which finds “the next natural number” — in terms of natural number addition and of natural number 1. What we meant above by a “contract” corresponds to the signature of the function, which is expressed by arrow IN → IN in the case of succ and which, by the way, can be shared by other functions, e.g. sq n def= n².

In programming terminology one says that succ and sq have the same “type”. Types play a prominent rôle in functional programming (as they do in other programming paradigms). Informally, they provide the “glue”, or interfacing material, for putting functions together to obtain more complex functions. Formally, a “type checking” discipline can be expressed in terms of compositional rules which check functional expressions for well-formedness.

It has become standard to use arrows to denote function signatures or function types, recall (2.1). In this book the following variants will be used interchangeably to denote the fact that function f accepts arguments of type A and produces results of type B: f : A → B, f : B ← A, B ←f− A or A −f→ B. This corresponds to writing f :: a -> b in the HASKELL functional programming language, where type variables are denoted by lowercase letters. A will be referred to as the domain of f and B will be referred to as the codomain of f. Both A and B are symbols which denote sets of values, very often called types.

2.2 Functional application

What do we want functions for? If we ask this question of a physicist or an engineer the answer is very likely to be: one wants functions for modelling and reasoning about the behaviour of real things.

For instance, function distance t = 60 × t could be written by a school physics student to model the distance (in, say, kilometers) a car driving at average speed 60 km/hour covers in t hours. When questioned about how far the car has gone in 2.5 hours, such a model provides an immediate answer: just evaluate distance 2.5 to obtain 150 km.

So we get a naïve purpose of functions: we want them to be applied to arguments in order to obtain results. Functional application is denoted by juxtaposition, e.g. f a for f : B ← A and a ∈ A, and associates to the left: f x y denotes (f x) y rather than f (x y).

2.3 Functional equality and composition

Application is not everything we want to do with functions. Very soon our physics student will be able to talk about properties of the distance model, for instance that property

    distance (2 × t) = 2 × (distance t)    (2.2)

holds. Later on, we could learn from her or him that the same property can be restated as distance (twice t) = twice (distance t), by introducing function twice x def= 2 × x. Or even simply as

    distance · twice = twice · distance    (2.3)

where “·” denotes function-arrow chaining, as suggested by drawing

                twice
           IR --------> IR
            |            |
    distance|            |distance    (2.4)
            v            v
           IR --------> IR
                twice


where both space and time are modelled by real numbers.

This trivial example illustrates some relevant facets of the functional programming paradigm. Which version of the property presented above is “better”: the version explicitly mentioning variable t and requiring parentheses (2.2)? the version hiding variable t but resorting to function twice (2.3)? or even drawing (2.4)?

Expression (2.3) is clearly more compact than (2.2). The trend for notation economy and compactness is well known throughout the history of mathematics. In the 16th century, for instance, algebrists would write 12.cu.p̃.18.ce.p̃.27.co.p̃.17 for what is nowadays written as 12x³ + 18x² + 27x + 17. We may find such syncopated notation odd, but should not forget that at its time it was replacing even more obscure expression denotations.

Why do people look for compact notations? A compact notation leads to shorter documents (fewer lines of code in programming) in which patterns are easier to identify and to reason about. Properties can be stated in clear-cut, one-line-long equations which are easy to memorize. And diagrams such as (2.4) can easily be drawn to enable us to visualize maths in a graphical format.

Some people will argue that such compact “pointfree” notation (that is, the notation which hides variables, or function “definition points”) is too cryptic to be useful as a practical programming medium. In fact, pointfree programming languages such as Iverson's APL or Backus' FP have been more respected than loved by the programming community. Virtually all commercial programming languages require variables and so implement the more traditional “pointwise” notation.

Throughout this book we will adopt both, depending upon the context. Our chosen programming medium — HASKELL — blends the pointwise and pointfree programming styles in a quite successful way. In order to switch from one to the other, we need two “bridges”: one lifting equality to the functional level and the other lifting application.

Concerning equality, note that the “=” sign in (2.2) differs from that in (2.3): while the former states that two real numbers are the same number, the latter states that two IR ← IR functions are the same function. Formally, we will say that two functions f, g : B ← A are equal if they agree at pointwise level, that is

    f = g  iff  ⟨∀ a : a ∈ A : f a =_B g a⟩    (2.5)

where =_B denotes equality at B-level.

Concerning application, the pointfree style replaces it by the more generic concept of functional composition suggested by function-arrow chaining: wherever two functions are such that the target type of one of them, say g : B ← A, is the same as the source type of the other, say f : C ← B, then another function f · g : C ← A can be defined — called the composition of f and g, or “f after g” — which “glues” f and g together:

    (f · g) a  def=  f (g a)    (2.6)


This situation is pictured by the following arrow-diagram

             g
        B <------ A
        |        /
       f|       / f·g    (2.7)
        v      /
        C <---+

or by the block-diagram

    a --> [ g ] --> g a --> [ f ] --> f (g a)

Therefore, the type-rule associated with functional composition can be expressed as follows:

    f : C ← B
    g : B ← A
    -------------
    f · g : C ← A

Composition is certainly the most basic of all functional combinators. It is the first kind of “glue” which comes to mind when programmers need to combine, or chain, functions (or processes) to obtain more elaborate functions (or processes)¹. This is because of one of its most relevant properties,

    (f · g) · h = f · (g · h)    (2.8)

which shares the pattern of, for instance,

    (a + b) + c = a + (b + c)

and so is called the associative property of composition. This enables us to move parentheses around in pointfree expressions involving functional compositions, or even to omit them, for instance by writing f · g · h · i as an abbreviation of ((f · g) · h) · i, or of (f · (g · h)) · i, or of f · ((g · h) · i), etc. For a chain of n-many function compositions the notation ○ⁿᵢ₌₁ fᵢ will be acceptable as an abbreviation of f₁ · ... · fₙ.

¹It even has a place in script languages such as UNIX's, where f | g is the shell counterpart of g · f, for appropriate “processes” f and g.
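These laws are easy to animate in HASKELL, where “·” is written (.). Below, distance and twice are the functions of the text, with Double standing in for IR; the Bool-valued property checks are ours and, of course, only test law (2.3) and associativity (2.8) at one point each, not universally as (2.5) requires:

```haskell
distance :: Double -> Double
distance t = 60 * t          -- distance covered in t hours at 60 km/h

twice :: Double -> Double
twice x = 2 * x

-- Law (2.3): distance . twice = twice . distance, tested at a point.
propCommute :: Double -> Bool
propCommute t = (distance . twice) t == (twice . distance) t

-- Law (2.8): composition is associative, tested at a point.
propAssoc :: Double -> Bool
propAssoc t = ((distance . twice) . twice) t == (distance . (twice . twice)) t
```

For instance, distance 2.5 yields 150.0, and both properties hold at any argument we try.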


2.4 Identity functions

How free are we to fulfill the “give me an A and I will give you a B” contract of equation (2.1)? In general, the choice of f is not unique. Some fs will do as little as possible while others will laboriously compute non-trivial outputs. At one of the extremes, we find functions which “do nothing” for us, that is, the added value of their output when compared to their input amounts to nothing:

    f a = a

In this case B = A, of course, and f is said to be the identity function on A:

    id_A : A → A
    id_A a  def=  a    (2.9)

Note that every type X “has” its identity id_X. Subscripts will be omitted wherever implicit in the context. For instance, the arrow notation IN −id→ IN saves us from writing id_IN, etc. So, we will often refer to “the” identity function rather than to “an” identity function.

How useful are identity functions? At first sight, they look fairly uninteresting. But the interplay between composition and identity, captured by the following equation,

    f · id = id · f = f    (2.10)

will be appreciated later on. This property shares the pattern of, for instance,

    a + 0 = 0 + a = a

This is why we say that id is the unit of composition. In a diagram, (2.10) looks like this:

             id
        A --------> A
        |           |
       f|           |f    (2.11)
        v           v
        B --------> B
             id

Note the graphical analogy of diagrams (2.4) and (2.11). Diagrams of this kind are very common and express important properties of functions, as we shall see further on.

2.5 Constant functions

Opposite to the identity functions, which do not lose any information, we find functions which lose all (or almost all) information. Regardless of their input, the output of these functions is always the same value.
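Both extremes of this spectrum already exist in the HASKELL Prelude, as id and const (the latter anticipates the constant functions defined next). A small sketch, with property names of our own choosing, checks the unit law (2.10) at sample points:

```haskell
-- Prelude definitions (shown for reference):
--   id x      = x      -- the identity function
--   const c x = c      -- the everywhere-c function
-- Law (2.10): f . id = id . f = f, tested pointwise.
propUnit :: Eq b => (a -> b) -> a -> Bool
propUnit f x = (f . id) x == f x && (id . f) x == f x

-- A constant function maps all its inputs to the same output:
propConfuse :: Int -> Int -> Bool
propConfuse x y = const 'c' x == const 'c' y
```

For example, propUnit (+ 1) 41 and propConfuse 0 99 both evaluate to True.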


Let C be a nonempty data domain and let c ∈ C. Then we define the everywhere-c function as follows, for arbitrary A:

    c : A → C
    c a  def=  c    (2.12)

The following property defines constant functions at pointfree level,

    c · f = c    (2.13)

and is depicted by a diagram similar to (2.11):

             c
        A --------> C
        |           |
       f|           |id    (2.14)
        v           v
        B --------> C
             c

Note that, strictly speaking, symbol c denotes two different functions in diagram (2.14): one, which we should have written c_A, accepts inputs from A while the other, which we should have written c_B, accepts inputs from B:

    c_B · f = c_A    (2.15)

This property will be referred to as the constant-fusion property.

As with identity functions, subscripts will be omitted wherever implicit in the context.

Exercise 2.1. The HUGS Standard Prelude provides for constant functions: you write const c for c. Check that HUGS assigns the same type to the expressions f . const c and const (f c), for every f and c. What else can you say about these functional expressions? Justify.
✷

2.6 Monics and epics

Identity functions and constant functions are the limit points of the functional spectrum with respect to information preservation. All the other functions are in between: they lose “some” information, which is regarded as uninteresting for some reason. This remark supports the following aphorism about a facet of functional programming: it is the art


of transforming or losing information in a controlled and precise way. That is to say, the art of constructing the exact observation of data which fits in a particular context or requirement.

How do functions lose information? Basically in two different ways: they may be "blind" enough to confuse different inputs, by mapping them onto the same output, or they may ignore values of their codomain. For instance, c confuses all inputs by mapping them all onto c. Moreover, it ignores all values of its codomain apart from c.

Functions which do not confuse inputs are called monics (or injective functions) and obey the following property: f : A → B is monic if, for every pair of functions h, k : C → A, if f · h = f · k then h = k (f is "cancellable on the left").

It is easy to check that "the" identity function is monic,

  id · h = id · k ⇒ h = k
≡    { by (2.10) }
  h = k ⇒ h = k
≡    { predicate logic }
  TRUE

and that any constant function c is not monic:

  c · h = c · k ⇒ h = k
≡    { by (2.15) }
  c = c ⇒ h = k
≡    { function equality is reflexive }
  TRUE ⇒ h = k
≡    { predicate logic }
  h = k

So the implication does not hold in general (only if h = k).

Functions which do not ignore values of their codomain are called epics (or surjective functions) and obey the following property: f : B → A is epic if, for every pair of


functions h, k : A → C, if h · f = k · f then h = k (f is "cancellable on the right").

As expected, identity functions are epic:

  h · id = k · id ⇒ h = k
≡    { by (2.10) }
  h = k ⇒ h = k
≡    { predicate logic }
  TRUE

Exercise 2.2. Under what circumstances is a constant function epic? Justify.
✷

2.7 Isos

A function f : A → B which is both monic and epic is said to be iso (an isomorphism, or a bijective function). In this situation, f always has a converse (or inverse) f° : B → A, which is such that

  f · f° = id_B  ∧  f° · f = id_A    (2.16)

(i.e. f is invertible).

Isomorphisms are very important functions because they convert data from one "format", say A, to another format, say B, without losing information. So f and f° are faithful protocols between the two formats A and B. Of course, these formats contain the same "amount" of information, although the same data adopts a different "shape" in each of them. In mathematics, one says that A is isomorphic to B, and one writes A ≅ B to express this fact.

Isomorphic data domains are regarded as "abstractly" the same. Note that, in general, there is a wide range of isos between two isomorphic data domains. For instance, let


Weekday be the set of weekdays,

  Weekday = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}

and let symbol 7 denote the set {1, 2, 3, 4, 5, 6, 7}, which is the initial segment of IN containing exactly seven elements. The following function f, which associates each weekday with its "ordinal" number,

  f : Weekday → 7
  f Monday    = 1
  f Tuesday   = 2
  f Wednesday = 3
  f Thursday  = 4
  f Friday    = 5
  f Saturday  = 6
  f Sunday    = 7

is iso (guess f°). Clearly, f d = i means "d is the i-th day of the week". But note that function g d ≝ rem(f d, 7) + 1 is also an iso between Weekday and 7. While f regards Monday as the first day of the week, g places Sunday in that position. Both f and g are witnesses of isomorphism

  Weekday ≅ 7    (2.17)

Finally, note that all classes of functions referred to so far — constants, identities, epics, monics and isos — are closed under composition, that is, the composition of two constants is a constant, the composition of two epics is epic, etc.

2.8 Gluing functions which do not compose — products

Function composition has been presented above as the basis for gluing functions together in order to build more complex functions. However, not every two functions can be glued together by composition. For instance, functions f : C → A and g : C → B do not compose with each other because the domain of one of them is not the codomain of the other. However, both f and g share the same domain C. So, something we can do


about gluing f and g together is to draw a diagram expressing this fact, something like

  A ←f− C −g→ B

Because f and g share the same domain, their outputs can be paired, that is, we may write the ordered pair (f c, g c) for each c ∈ C. Such pairs belong to the Cartesian product of A and B, that is, to the set

  A × B ≝ {(a, b) | a ∈ A ∧ b ∈ B}

So we may think of the operation which pairs the outputs of f and g as a new function combinator 〈f,g〉 defined as follows:

  〈f,g〉 : C → A × B
  〈f,g〉 c ≝ (f c, g c)    (2.18)

Function combinator 〈f,g〉 is pronounced "f split g" (or "pair f and g") and can be depicted by the following "block", or "data flow", diagram:

  [diagram: input c fans out to boxes f and g, producing outputs f c and g c]

Function 〈f,g〉 keeps the information of both f and g in the same way Cartesian product A × B keeps the information of A and B. So, in the same way A-data or B-data can be retrieved from A × B data via the implicit projections π1 and π2,

  A ←π1− A × B −π2→ B    (2.19)

defined by

  π1 (a, b) = a  and  π2 (a, b) = b

f and g can be retrieved from 〈f,g〉 via the same projections:

  π1 · 〈f,g〉 = f  and  π2 · 〈f,g〉 = g    (2.20)
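The split combinator is easily coded; a minimal sketch, with sample functions of our own choosing and a pointwise check of ×-cancellation (2.20), follows (fst and snd play the rôle of π1 and π2):

```haskell
-- Sketch of the split combinator (2.18); `split` is also the spelling
-- adopted later by the Set.hs module accompanying the text.
split :: (c -> a) -> (c -> b) -> c -> (a, b)
split f g c = (f c, g c)

f :: Int -> Int
f = (+ 1)

g :: Int -> Bool
g = even

main :: IO ()
main = print (all ok [0 .. 10])
  where ok c = fst (split f g c) == f c   -- π1 · ⟨f,g⟩ = f
            && snd (split f g c) == g c   -- π2 · ⟨f,g⟩ = g
```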


This fact (or pair of facts) will be referred to as the ×-cancellation property and is illustrated in the following diagram, which puts things together:

  [diagram (2.21): f and g from C into A and B, factored as 〈f,g〉 : C → A × B followed by π1 and π2]

In summary, the type-rule associated to the "split" combinator is expressed by

    f : C → A        g : C → B
  ───────────────────────────────
       〈f,g〉 : C → A × B

A split arises wherever two functions do not compose but share the same domain. What about gluing two functions which fail such a requisite, e.g.

  f : C → A  and  g : D → B  ...?

The 〈f,g〉 split combination does not work any more. But a way to "approach" the domains of f and g, C and D respectively, is to regard them as targets of the projections π1 and π2 of C × D:

  [diagram: f : C → A and g : D → B pre-composed with π1 : C × D → C and π2 : C × D → D]

From this diagram, 〈f · π1, g · π2〉 arises,

  [diagram: 〈f · π1, g · π2〉 : C × D → A × B]

mapping C × D to A × B. It corresponds to the "parallel" application of f and g which is suggested by the following data-flow diagram:


  [diagram: inputs c and d feed boxes f and g in parallel, producing outputs f c and g d]

The functional combination 〈f · π1, g · π2〉 appears very often and deserves special notation — it will be expressed by f × g. So, by definition, we have

  f × g ≝ 〈f · π1, g · π2〉    (2.22)

which is pronounced "product of f and g" and has typing-rule

    f : C → A        g : D → B
  ───────────────────────────────
      f × g : C × D → A × B        (2.23)

Note the overloading of symbol "×", which is used to denote both Cartesian product and functional product. This choice of notation will be fully justified later on.

What is the interplay among the functional combinators f · g (composition), 〈f,g〉 (split) and f × g (product)? Composition and split relate to each other via the following property, known as ×-fusion:

  〈g,h〉 · f = 〈g · f, h · f〉    (2.24)

This shows that split is right-distributive with respect to composition. Left-distributivity does not hold, but there is something we can say about f · 〈g,h〉 in case f = i × j:

  (i × j) · 〈g,h〉
=    { by (2.22) }
  〈i · π1, j · π2〉 · 〈g,h〉
=    { by ×-fusion (2.24) }


  〈(i · π1) · 〈g,h〉, (j · π2) · 〈g,h〉〉
=    { by (2.8) }
  〈i · (π1 · 〈g,h〉), j · (π2 · 〈g,h〉)〉
=    { by ×-cancellation (2.20) }
  〈i · g, j · h〉

The law we have just derived is known as ×-absorption. (The intuition behind this terminology is that "split absorbs ×", as a special kind of fusion.) It is a consequence of ×-fusion and ×-cancellation and is depicted as follows:

  (i × j) · 〈g,h〉 = 〈i · g, j · h〉    (2.25)

This diagram provides us with two further results about products and projections, which can be easily justified:

  i · π1 = π1 · (i × j)    (2.26)
  j · π2 = π2 · (i × j)    (2.27)

Two special properties of f × g are presented next. The first one expresses a kind of "bi-distribution" of × with respect to composition:

  (g · h) × (i · j) = (g × i) · (h × j)    (2.28)

We will refer to this property as the ×-functor property. The other property, which we will refer to as the ×-functor-id property, has to do with identity functions:

  id_A × id_B = id_{A×B}    (2.29)

These two properties will be identified as the functorial properties of product. This choice of terminology will be explained later on.

Let us finally analyse the particular situation in which a split is built involving projections π1 and π2 only. These exhibit interesting properties, for instance 〈π1,π2〉 = id. This property is known as ×-reflexion and is depicted as follows:

  〈π1,π2〉 = id_{A×B}    (2.30)
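The product combinator and the laws just stated can be checked pointwise over samples; a sketch (sample functions are our own, and `><` is the spelling later adopted by the text's Set.hs module):

```haskell
-- Sketch: f × g defined by (2.22), plus pointwise checks of
-- ×-fusion (2.24) and ×-absorption (2.25) on sample functions.
split :: (c -> a) -> (c -> b) -> c -> (a, b)
split f g c = (f c, g c)

(><) :: (c -> a) -> (d -> b) -> (c, d) -> (a, b)
f >< g = split (f . fst) (g . snd)      -- f × g ≝ ⟨f · π1, g · π2⟩

main :: IO ()
main = print (all ok [0 .. 10 :: Int])
  where
    -- ⟨g,h⟩ · f = ⟨g · f, h · f⟩
    fusion x = (split (* 2) (+ 3) . (+ 1)) x
            == split ((* 2) . (+ 1)) ((+ 3) . (+ 1)) x
    -- (i × j) · ⟨g,h⟩ = ⟨i · g, j · h⟩
    absorb x = (((* 2) >< (+ 3)) . split negate abs) x
            == split ((* 2) . negate) ((+ 3) . abs) x
    ok x = fusion x && absorb x
```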


What about 〈π2,π1〉? This corresponds to a diagram

  [diagram: 〈π2,π1〉 : A × B → B × A, with the projections exchanged]

which looks very much the same if submitted to a 180° rotation (thus A and B swap with each other). This suggests that swap (the name we adopt for 〈π2,π1〉) is its own inverse, as can be checked easily as follows:

  swap · swap
=    { by definition swap ≝ 〈π2,π1〉 }
  〈π2,π1〉 · swap
=    { by ×-fusion (2.24) }
  〈π2 · swap, π1 · swap〉
=    { definition of swap, twice }
  〈π2 · 〈π2,π1〉, π1 · 〈π2,π1〉〉
=    { by ×-cancellation (2.20) }
  〈π1,π2〉
=    { by ×-reflexion (2.30) }
  id

Therefore, swap is iso and establishes the following isomorphism,

  A × B ≅ B × A    (2.31)

which is known as the commutative property of product.

The "product datatype" A × B is essential to information processing and is available in virtually every programming language. In HASKELL one writes (A,B) to denote A × B, for A and B two predefined datatypes, fst to denote π1 and snd to denote π2. In the C programming language this datatype is called the "struct datatype",

struct {
    A first;
    B second;
};


while in PASCAL it is called the "record datatype":

record
    first: A;
    second: B
end;

Isomorphism (2.31) can be re-interpreted in this context as a guarantee that one does not lose (or gain) anything in swapping fields in record datatypes. C or PASCAL programmers know also that record-field nesting has the same status, that is to say that, for instance, datatype

record
    F: A;
    S: record
        F: B;
        S: C;
    end
end;

is abstractly the same as

record
    F: record
        F: A;
        S: B
    end;
    S: C;
end;

In fact, this is another well-known isomorphism, known as the associative property of product:

  A × (B × C) ≅ (A × B) × C    (2.32)

This is established by assocr : (A × B) × C → A × (B × C), which is pronounced "associate to the right" and is defined by

  assocr ≝ 〈π1 · π1, 〈π2 · π1, π2〉〉    (2.33)

Section A.0.1 in the appendix lists an extension to the HUGS Standard Prelude, called Set.hs, which makes isomorphisms such as swap and assocr available. In this module, the concrete syntax chosen for 〈f,g〉 is split f g and the one chosen for f × g is f >< g.

Exercise 2.3. Show that assocr is iso by conjecturing its inverse assocl and proving that functional equality assocr · assocl = id holds.
✷

Exercise 2.4. Use (2.22) to prove properties (2.28) and (2.29).
✷
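In Haskell the two isomorphisms just discussed can be written directly over pairs; a sketch, with a pointwise sanity check of swap's self-inverse property and of the assocl conjectured in exercise 2.3 (the definition of assocl is ours, as the exercise asks for):

```haskell
-- Sketch: swap = ⟨π2,π1⟩ and assocr (2.33), with a candidate inverse assocl.
swap :: (a, b) -> (b, a)
swap (a, b) = (b, a)

assocr :: ((a, b), c) -> (a, (b, c))
assocr ((a, b), c) = (a, (b, c))      -- ⟨π1 · π1, ⟨π2 · π1, π2⟩⟩, pointwise

assocl :: (a, (b, c)) -> ((a, b), c)
assocl (a, (b, c)) = ((a, b), c)

main :: IO ()
main = do
  print (swap (swap (1 :: Int, 'x')) == (1, 'x'))
  print (assocr (assocl (1 :: Int, (True, 'c'))) == (1, (True, 'c')))
```

The sample checks succeed, but only the calculation requested in exercise 2.3 proves the equality for all inputs.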


2.9 Gluing functions which do not compose — coproducts

The split functional combinator arose in the previous section as a kind of glue for combining two functions which do not compose but share the same domain. The "dual" situation of two non-composable functions f : A → C and g : B → C, which however share the same codomain, is depicted in

  A −f→ C ←g− B

It is clear that the kind of glue we need in this case should make it possible to apply f in case we are on the "A-side", or to apply g in case we are on the "B-side", of the diagram. Let us write [f,g] to denote the new kind of combinator. Its codomain will be C. What about its domain?

We need to describe the datatype which is "either an A or a B". Since A and B are sets, we may think of A ∪ B as such a datatype. This works in case A and B are disjoint sets, but wherever the intersection A ∩ B is non-empty it is undecidable whether a value x ∈ A ∩ B is an "A-value" or a "B-value". In the limit, if A = B then A ∪ B = A = B, that is to say, we have not invented a new datatype at all. These difficulties can be circumvented by resorting to disjoint union:

  A −i1→ A + B ←i2− B

The values of A + B can be thought of as "copies" of A or B values which are "stamped" with different tags in order to guarantee that values which are simultaneously in A and B do not get mixed up. The tagging functions i1 and i2 are called injections:

  i1 a = (t1, a) ,  i2 b = (t2, b)    (2.34)

Knowing the exact values of tags t1 and t2 is not essential to understanding the concept of a disjoint union. It suffices to know that i1 and i2 tag differently and consistently. For instance, the following realization of A + B in the C programming language,

struct {
    int tag; /* 1,2 */
    union {
        A ifA;
        B ifB;
    } data;
};
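In Haskell the disjoint union A + B is the predefined Either a b, whose constructors Left and Right play the rôle of the injections i1 and i2 (the constructors themselves are the tags). A brief sketch:

```haskell
-- Sketch: A + B as Either, with i1/i2 as the constructors Left/Right.
i1 :: a -> Either a b
i1 = Left

i2 :: b -> Either a b
i2 = Right

main :: IO ()
main = do
  print (i1 3    :: Either Int Bool)   -- an "A-value", tagged
  print (i2 True :: Either Int Bool)   -- a "B-value", tagged
  -- even when A = B, the two copies of the same value stay apart:
  print ((i1 3 :: Either Int Int) /= i2 3)
```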


+-cancellation :

  [diagram: i1 and i2 into A + B, factored through [g,h] : A + B → C]

  [g,h] · i1 = g ,  [g,h] · i2 = h    (2.38)

+-reflexion :

  [i1, i2] = id_{A+B}    (2.39)

+-fusion :

  f · [g,h] = [f · g, f · h]    (2.40)

+-absorption :

  [g,h] · (i + j) = [g · i, h · j]    (2.41)

+-functor :

  (g · h) + (i · j) = (g + i) · (h + j)    (2.42)

+-functor-id :

  id_A + id_B = id_{A+B}    (2.43)
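These coproduct laws can be exercised in Haskell, where [g,h] is the Prelude's either and the sum f + g can be coded as follows (the spelling -|- is our own choice here):

```haskell
-- Sketch: the sum combinator f + g over Either, plus pointwise checks of
-- +-cancellation (2.38) and +-absorption (2.41) on sample functions.
(-|-) :: (a -> c) -> (b -> d) -> Either a b -> Either c d
f -|- g = either (Left . f) (Right . g)

main :: IO ()
main = do
  -- [g,h] · i1 = g  and  [g,h] · i2 = h
  print (either negate fromEnum (Left 3 :: Either Int Bool) == negate 3)
  print (either negate fromEnum (Right True :: Either Int Bool) == fromEnum True)
  -- [g,h] · (i + j) = [g · i, h · j]
  print ((either negate fromEnum . ((+ 1) -|- not)) (Left 3 :: Either Int Bool)
           == (negate . (+ 1)) 3)
```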


In summary, the typing-rules of the either and sum combinators are as follows:

    f : A → C        g : B → C
  ───────────────────────────────
       [f,g] : A + B → C

    f : A → C        g : B → D
  ───────────────────────────────
      f + g : A + B → C + D        (2.44)

Exercise 2.5. By analogy (duality) with swap, show that [i2, i1] is its own inverse, so that fact

  A + B ≅ B + A    (2.45)

holds.
✷

Exercise 2.6. Dualize (2.33), that is, write the iso which witnesses fact

  A + (B + C) ≅ (A + B) + C    (2.46)

from right to left. Use the either syntax available from the HUGS Standard Prelude to encode this iso in HASKELL.
✷

2.10 Mixing products and coproducts

Datatype constructions A × B and A + B have been introduced above as devices required for expressing the codomain of splits (A × B) or the domain of eithers (A + B). Therefore, a function mapping values of a coproduct (say A + B) to values of a product (say A′ × B′) can be expressed alternatively as an either or as a split. In the first case, both components of the either combinator are splits. In the latter, both components of the split combinator are eithers.

This exchange of format in defining such functions is known as the exchange law. It states the functional equality which follows:

  [〈f,g〉, 〈h,k〉] = 〈[f,h], [g,k]〉    (2.47)


It can be checked by type-inference that both the left-hand side and the right-hand side expressions of this equality have type A + C → B × D, for f : A → B, g : A → D, h : C → B and k : C → D.

An example of a function which is in the exchange-law format is isomorphism

  undistr : (A × B) + (A × C) → A × (B + C)    (2.48)

(pronounce undistr as "un-distribute-right"), which is defined by

  undistr ≝ [id × i1, id × i2]    (2.49)

and witnesses the fact that product distributes through coproduct:

  A × (B + C) ≅ (A × B) + (A × C)    (2.50)

In this context, suppose that we know of three functions f : A → D, g : B → E and h : C → F. By (2.44) we infer g + h : B + C → E + F. Then, by (2.23), we infer

  f × (g + h) : A × (B + C) → D × (E + F)    (2.51)

So, it makes sense to combine products and sums of functions, and the expressions which denote such combinations have the same "shape" (or symbolic pattern) as the expressions which denote their domain and range — the … × (⋯ + ⋯) "shape" in this example. In fact, if we abstract such a pattern via some symbol, say F — that is, if we define

  F(α, β, γ) ≝ α × (β + γ)

— then we can write F(f,g,h) : F(A,B,C) → F(D,E,F) for (2.51).

This kind of abstraction works for every combination of products and coproducts. For instance, if we now abstract the right-hand side of (2.48) via pattern

  G(α, β, γ) ≝ (α × β) + (α × γ)

we have G(f,g,h) = (f × g) + (f × h), a function which maps G(A,B,C) = (A × B) + (A × C) onto G(D,E,F) = (D × E) + (D × F). All this can be put in a diagram

  [diagram: undistr : G(A,B,C) → F(A,B,C), with F(f,g,h) and G(f,g,h) as vertical arrows]


which unfolds to

  [diagram (2.52): undistr : (A × B) + (A × C) → A × (B + C) at the top, with f × (g + h) and (f × g) + (f × h) as vertical arrows into D × (E + F) and (D × E) + (D × F)]

once the F and G patterns are instantiated. An interesting topic which stems from (completing) this diagram will be discussed in the next section.

Exercise 2.7. Apply the exchange law to undistr.
✷

Exercise 2.8. Complete the "?"s in diagram

  [diagram: arrows [x,y] and [k,g] between unknown objects "?", with id + id × f in between]

and then solve the implicit equation for x and y.
✷

Exercise 2.9. Repeat exercise 2.8 with respect to diagram

  [diagram: arrows h + 〈i,j〉 and x + y between unknown objects "?", with id + id × f]
✷
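The undistr isomorphism discussed above has a direct Haskell rendering, with × as pairing and + as Either; a sketch (the `><` spelling follows the text's Set.hs):

```haskell
-- Sketch of undistr (2.49): [ id × i1 , id × i2 ], with Left/Right as i1/i2.
(><) :: (c -> a) -> (d -> b) -> (c, d) -> (a, b)
f >< g = \(c, d) -> (f c, g d)

undistr :: Either (a, b) (a, c) -> (a, Either b c)
undistr = either (id >< Left) (id >< Right)

main :: IO ()
main = do
  print (undistr (Left  (1 :: Int, 'x')))   -- (1,Left 'x')
  print (undistr (Right (1 :: Int, True)))  -- (1,Right True)
```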


2.11 Natural properties

Let us resume the discussion about undistr and the two other functions in diagram (2.52). What about using undistr itself to close this diagram, at the bottom? Note that definition (2.49) works for D, E and F in the same way it does for A, B and C. (Indeed, the particular choice of symbols A, B and C in (2.48) was rather arbitrary.) Therefore, we get:

  [diagram: the square of (2.52) closed at the bottom by a second undistr arrow]

which expresses a very important property of undistr:

  (f × (g + h)) · undistr = undistr · ((f × g) + (f × h))    (2.53)

This is called the natural property of undistr. This kind of property (often called free instead of natural) is not a privilege of undistr. As a matter of fact, every function interfacing patterns such as F or G above will exhibit its own natural property. Furthermore, we have already quoted natural properties without mentioning it. Recall (2.10), for instance. This property (establishing id as the unit of composition) is, after all, the natural property of id. In this case we have F α = G α = α, as can be easily observed in diagram (2.11).

In general, natural properties are described by diagrams in which two "copies" of the operator of interest are drawn as horizontal arrows:

  [diagram (2.54): φ : G A → F A and φ : G B → F B as horizontal arrows, with G f and F f as vertical arrows]

  (F f) · φ = φ · (G f)    (2.54)

Note that f is universally quantified, that is to say, the natural property holds for every f : A → B.

Diagram (2.54) corresponds to unary patterns F and G. As we have seen with undistr, other functions (g, h, etc.) come into play for multiary patterns. A very important rôle will be assigned throughout this book to these F, G, etc. "shapes" or patterns, which are shared by pointfree functional expressions and by their domain and codomain expressions.
From chapter 3 onwards we will refer to them by their proper name — "functor" — which is standard in mathematics and computer science. Then we will also explain the names assigned to properties such as, for instance, (2.28) or (2.42).


Exercise 2.10. Show that (2.26) and (2.27) are natural properties. Dualize these properties. Hint: recall diagram (2.41).
✷

Exercise 2.11. Establish the natural properties of the swap (2.31) and assocr (2.33) isomorphisms.
✷

2.12 Universal properties

Functional constructs 〈f,g〉 and [f,g] (and their derivatives f × g and f + g) provide a good illustration of what is meant by a program combinator in a compositional approach to programming: the combinator is put forward equipped with a concise set of properties which enable programmers to transform programs, reason about them and perform useful calculations. This raises a programming methodology which is scientific and stable.

Such properties bear standard names such as cancellation, reflexion, fusion, absorption, etc. Where do these come from? As a rule, for each combinator to be defined one has to define suitable constructions at "interface"-level², e.g. A × B and A + B. These are not chosen or invented at random: each is defined in a way such that the associated combinator is uniquely defined. This is assured by a so-called universal property, from which the others can be derived.

² In the current context, programs "are" functions and program-interfaces "are" the datatypes involved in functional signatures.

Take product A × B, for instance. Its universal property states that, for each pair of arrows f : C → A and g : C → B, there exists an arrow 〈f,g〉 : C → A × B such that

  k = 〈f,g〉  ⇔  π1 · k = f  ∧  π2 · k = g    (2.55)

holds — recall diagram (2.21) — for all k : C → A × B. This equivalence states that 〈f,g〉 is the unique arrow satisfying the property on the right. In fact, read (2.55) in the ⇒ direction and let k be 〈f,g〉. Then π1 · 〈f,g〉 = f and π2 · 〈f,g〉 = g will hold, meaning that 〈f,g〉 effectively obeys the property on the right. In other words, we have derived


×-cancellation (2.20). Reading (2.55) in the ⇐ direction, we understand that, if some k satisfies such properties, then it "has to be" the same arrow as 〈f,g〉.

It is easy to see other properties of 〈f,g〉 arising from (2.55). For instance, for k = id we get ×-reflexion (2.30):

  id = 〈f,g〉 ⇔ π1 · id = f ∧ π2 · id = g
≡    { by (2.10) }
  id = 〈f,g〉 ⇔ π1 = f ∧ π2 = g
≡    { by substitution of f and g }
  id = 〈π1,π2〉

and for k = 〈i,j〉 · h we get ×-fusion (2.24):

  〈i,j〉 · h = 〈f,g〉 ⇔ π1 · (〈i,j〉 · h) = f ∧ π2 · (〈i,j〉 · h) = g
≡    { composition is associative (2.8) }
  〈i,j〉 · h = 〈f,g〉 ⇔ (π1 · 〈i,j〉) · h = f ∧ (π2 · 〈i,j〉) · h = g
≡    { by ×-cancellation (just derived) }
  〈i,j〉 · h = 〈f,g〉 ⇔ i · h = f ∧ j · h = g
≡    { by substitution of f and g }
  〈i,j〉 · h = 〈i · h, j · h〉

It will take about the same effort to derive split structural equality

  〈i,j〉 = 〈f,g〉 ⇔ i = f ∧ j = g    (2.56)

from universal property (2.55) — just let k = 〈i,j〉.

Similar arguments can be built around the coproduct's universal property,

  k = [f,g] ⇔ k · i1 = f ∧ k · i2 = g    (2.57)


from which structural equality of eithers can be inferred,

  [i,j] = [f,g] ⇔ i = f ∧ j = g    (2.58)

as well as the other properties we know about this combinator.

Exercise 2.12. Derive +-cancellation (2.38), +-reflexion (2.39) and +-fusion (2.40) from universal property (2.57). Then derive the exchange law (2.47) from the universal property of product (2.55) or coproduct (2.57).
✷

2.13 Guards and McCarthy's conditional

Most functional programming languages and notations cater for pointwise conditional expressions of the form

  if (p x) then (g x) else (h x)

meaning

  p x    ⇒ g x
  ¬(p x) ⇒ h x

for some given predicate p : A → Bool, some "then"-function g : A → B and some "else"-function h : A → B. Bool is the primitive datatype containing truth values FALSE and TRUE.

Can such expressions be written in the pointfree style? They can, provided we introduce the so-called "McCarthy conditional" functional form

  p → g, h

which is defined by

  p → g, h ≝ [g,h] · p?    (2.59)

a definition we can understand provided we know the meaning of the "p?" construct. We call p? : A → A + A a guard — or, better, the guard associated to a given predicate


p : A → Bool. Every predicate p gives birth to its own guard p?, which, at point-level, is defined as follows:

  (p?) a = i1 a,  if p a
  (p?) a = i2 a,  if ¬(p a)    (2.60)

In a sense, guard p? is more "informative" than p alone: it provides information about the outcome of testing p on some input a, encoded in terms of the coproduct injections (i1 for a true outcome and i2 for a false outcome, respectively), without losing the input a itself.

The following fact, which we will refer to as McCarthy's conditional fusion law, is a consequence of +-fusion (2.40):

  f · (p → g, h) = p → f · g, f · h    (2.61)

We shall introduce and define instances of predicate p as long as they are needed. A particularly important assumption of our notation should, however, be mentioned at this point: we assume that, for every datatype A, the equality predicate =_A : A × A → Bool is defined in a way which guarantees three basic properties: reflexivity (a =_A a for every a), transitivity (a =_A b and b =_A c implies a =_A c) and symmetry (a =_A b iff b =_A a). Subscript A in =_A will be dropped wherever implicit in the context.

In HASKELL programming, the equality predicate for a type becomes available by declaring the type as an instance of class Eq, which exports equality predicate (==). This does not, however, guarantee the reflexivity, transitivity and symmetry properties, which need to be proved by dedicated mathematical arguments.

Exercise 2.13. Prove that the following equality between two conditional expressions,

  k (if p x then f x else h x, if p x then g x else i x)
    = if p x then k (f x, g x) else k (h x, i x)

holds by rewriting it in the pointfree style (using the McCarthy conditional combinator) and applying the exchange law (2.47), among others.
✷

Exercise 2.14. Prove law (2.61).
✷
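The guard and the McCarthy conditional are easily coded over Either; a sketch, in which grd and cond are our own names for p? and p → g,h:

```haskell
-- Sketch: the guard p? (2.60) and McCarthy's conditional (2.59).
grd :: (a -> Bool) -> a -> Either a a
grd p a = if p a then Left a else Right a    -- i1 on true, i2 on false

cond :: (a -> Bool) -> (a -> b) -> (a -> b) -> a -> b
cond p g h = either g h . grd p              -- p → g,h  ≝  [ g , h ] · p?

main :: IO ()
main = print (map (cond even (`div` 2) (* 3)) [1 .. 6])  -- [3,1,9,2,15,3]
```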


Exercise 2.15. From (2.59) and property

  p? · f = (f + f) · (p · f)?    (2.62)

infer

  (p → f, g) · h = (p · h) → (f · h), (g · h)    (2.63)
✷

2.14 Gluing functions which do not compose — exponentials

Now that we have made the distinction between the pointfree and pointwise functional notations reasonably clear, it is instructive to revisit section 2.2 and identify functional application as the "bridge" between the pointfree and pointwise worlds. However, we should say "a bridge" rather than "the bridge", for in this section we enrich such an interface with another "bridge" which is very relevant to programming.

Suppose we are given the task to combine two functions f : C × A → B and g : A → D. It is clear that none of the combinations f · g, 〈f,g〉 or [f,g] is well-typed. So, f and g cannot be put together directly — they require some extra interfacing.

Note that 〈f,g〉 would be well-defined in case the C component of f's domain could be somehow "ignored". Suppose, in fact, that in some particular context the first argument of f happens to be "irrelevant", or to be frozen to some c ∈ C. It is easy to derive a new function

  f_c : A → B
  f_c a ≝ f(c, a)

from f which combines nicely with g via the split combinator: 〈f_c, g〉 is well-defined and bears type A → B × D. For instance, suppose that C = A and f is the equality predicate = on A. Then =_c : A → Bool is the "equal to c" predicate on A-values:

  =_c a ≝ a = c    (2.64)
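In Haskell, "freezing" the first argument of a binary function is ordinary partial application of its curried form (the Prelude's curry, which the text introduces formally further on); f and the frozen value are sample choices of our own:

```haskell
-- Sketch: f_c a = f (c, a), obtained by partially applying curry f to c.
f :: (Int, Int) -> Int
f (c, a) = 10 * c + a

fc :: Int -> Int       -- f_c for the frozen value c = 3
fc = curry f 3

main :: IO ()
main = print (map fc [0 .. 4])   -- [30,31,32,33,34]
```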


As another example, recall function twice (2.3), which could be defined as ×_2 using the new notation.

However, we need to be more careful about what is meant by f_c. Like functional application, the expression f_c interfaces the pointfree and the pointwise levels — it involves a function (f) and a value (c). But, for f : C × A → B, there is a major distinction between f c and f_c — while the former denotes a value of type B, i.e. f c ∈ B, f_c denotes a function of type A → B. We will say that f_c ∈ B^A, by introducing a new datatype construct which we will refer to as the exponential:

  B^A ≝ {g | g : A → B}    (2.65)

There are strong reasons to adopt the B^A notation to the detriment of the more obvious B ← A or A → B alternatives, as we shall see shortly.

The B^A exponential datatype is therefore inhabited by functions from A to B, that is to say, functional declaration g : A → B means the same as g ∈ B^A. And what do we want functions for? We want to apply them. So it is natural to introduce the apply operator

  ap : B^A × A → B
  ap(f, a) ≝ f a

which applies a function f to an argument a.

Back to the generic binary function f : C × A → B, let us now think of the operation which, for every c ∈ C, produces f_c ∈ B^A. This can be regarded as a function of signature C → B^A which expresses f as a kind of C-indexed family of functions of signature A → B. We will denote such a function by f̄ (read f̄ as "f transposed"). Intuitively, we want f and f̄ to be related to each other by the following property:

  f(c, a) = (f̄ c) a    (2.66)

Given c and a, both expressions denote the same value.
But, in a sense, f̄ is more tolerant than f: while the latter is binary and requires both arguments (c, a) to become available before application, the former is happy to be provided with c first and with a later on, if actually required by the evaluation process.

Similarly to A × B and A + B, exponential B^A involves a universal property,

  k = f̄ ⇔ f = ap · (k × id)    (2.67)

from which laws for cancellation, reflexion and fusion can be derived:


Exponentials cancellation :

  f = ap · (f̄ × id)    (2.68)

Exponentials reflexion :

  (ap)¯ = id_{B^A}    (2.69)

Exponentials fusion :

  (g · (f × id))¯ = ḡ · f    (2.70)

Note that the cancellation law is nothing but fact (2.66) written in the pointfree style.

Is there an absorption law for exponentials? The answer is affirmative, but first we need to introduce a new functional combinator, which arises as the transpose of f · ap in the following diagram:

  [diagram: ap : B^A × A → B followed by f : B → D, with (f · ap)¯ : B^A → D^A]

We shall denote this combinator by f^A, and its type-rule is as follows:

        f : B → C
  ─────────────────────
    f^A : B^A → C^A


It can be shown that, once A and f : B → C are fixed, f^A is the function which accepts some input function g : A → B as argument and produces function f · g as result (see exercise 2.23). So f^A is the "compose with f" functional combinator:

  (f^A) g ≝ f · g    (2.71)

Now we are ready to understand the laws which follow:

Exponentials absorption :

  (f · g)¯ = f^A · ḡ    (2.72)

Exponentials-functor :

  (g · h)^A = g^A · h^A    (2.73)

Exponentials-functor-id :

  id^A = id    (2.74)

To conclude this section, we need to explain why we have adopted the apparently esoteric B^A notation for the "function from A to B" datatype. Let us introduce the following operator,

  curry f ≝ f̄    (2.75)

which maps a function f to its transpose f̄. This operator, which is very familiar to functional programmers, maps functions in some function space B^{C×A} to functions in (B^A)^C. Its inverse (the uncurry function) also exists. In the HUGS Standard Prelude we find them declared as follows:

curry :: ((a,b) -> c) -> (a -> b -> c)
curry f x y = f (x,y)

uncurry :: (a -> b -> c) -> ((a,b) -> c)
uncurry f p = f (fst p) (snd p)
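The combinators of this section all have one-line Haskell renderings; a sketch, where expA is our own name for f^A and the sample function f is ours:

```haskell
-- Sketch: the transpose is Prelude's curry, ap is uncurried application,
-- and f^A (2.71) is just a section of function composition.
expA :: (b -> c) -> (a -> b) -> (a -> c)
expA f = (f .)                    -- (f^A) g = f · g

main :: IO ()
main = do
  let f (c, a) = 10 * c + a :: Int
  print (curry f 3 4 == f (3, 4))              -- cancellation, cf. (2.66)
  print (uncurry (curry f) (3, 4) == f (3, 4)) -- uncurry undoes curry
  print (expA (+ 1) (* 2) 5)                   -- ((+1) · (*2)) 5 = 11
```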


From (2.75) it is obvious that writing f̄ or curry f is a matter of taste, the latter being more in the tradition of functional programming. For instance, the fusion law (2.70) can be re-written as

  curry (g · (f × id)) = curry g · f

and so on.

It is known from mathematics that curry and uncurry are isos witnessing the following isomorphism, which is at the core of the theory of functional programming:

  B^{C×A} ≅ (B^A)^C    (2.76)

Fact (2.76) clearly resembles a well-known equality concerning numeric exponentials, b^{c×a} = (b^a)^c. But other known facts about numeric exponentials, e.g. a^{b+c} = a^b × a^c or (b × c)^a = b^a × c^a, find their counterpart in functional exponentials. The counterpart of the former,

  A^{B+C} ≅ A^B × A^C    (2.77)

arises from the uniqueness of the either combination: every pair of functions (f, g) ∈ A^B × A^C leads to a unique function [f,g] ∈ A^{B+C} and, vice-versa, every function in A^{B+C} is the either of some function in A^B and of another in A^C.

The functional-exponentials counterpart of the second fact about numeric exponentials above is

  (B × C)^A ≅ B^A × C^A    (2.78)

This can be justified by a similar argument concerning the uniqueness of the split combinator 〈f,g〉.

What about other facts valid for numeric exponentials, such as a^0 = 1 and 1^a = 1? We need to know what 0 and 1 mean as datatypes. Such elementary datatypes are presented in the section which follows.

Exercise 2.16. Load module Set.hs (cf. section A.0.1) into the HUGS interpreter and check the types assigned to the following functional expressions:

curry ap
\f -> ap . (f >< id)
uncurry . curry

Which of these is functionally equivalent to the uncurry function and why?
Which of these are functionally equivalent to identity functions? Justify.
✷


2.15. ELEMENTARY DATATYPES    43

2.15 Elementary datatypes

So far we have talked mostly about arbitrary datatypes represented by capital letters A, B, etc. (lowercase a, b, etc. in the HASKELL illustrations). We also mentioned IR, Bool and IN and, in particular, the fact that we can associate to each natural number n its initial segment n = {1, 2, ..., n}. We extend this to IN_0 by stating 0 = {} and, for n > 0, n + 1 = {n + 1} ∪ n.

Initial segments can be identified with enumerated types and are regarded as primitive datatypes in our notation. We adopt the convention that primitive datatypes are written in the sans serif font and so, strictly speaking, n is distinct from n: the latter denotes a natural number while the former denotes a datatype.

Datatype 0

Among such enumerated types, 0 is the smallest because it is empty. This is the Void datatype in HASKELL, which has no constructor at all. Datatype 0 (which we tend to write simply as 0) may not seem very "useful" in practice but it is of theoretical interest. For instance, it is easy to check that the following "obvious" properties hold:

    A + 0 ≅ A    (2.79)
    A × 0 ≅ 0    (2.80)

Datatype 1

Next in the sequence of initial segments we find 1, which is the singleton set {1}. How useful is this datatype? Note that every datatype A containing exactly one element is isomorphic to {1}, e.g. A = {NIL}, A = {0}, A = {1}, A = {FALSE}, etc. We represent this class of singleton types by 1.

Recall that isomorphic datatypes have the same expressive power and so are "abstractly identical". So, the actual choice of inhabitant for datatype 1 is irrelevant, and we can replace any particular singleton set by another without losing information. This is evident from the following relevant facts involving 1:

    A × 1 ≅ A    (2.81)
    A^0 ≅ 1    (2.82)

We can read (2.81) informally as follows: if the second component of a record ("struct") cannot change, then it is useless and can be ignored.
Selector π_1 is, in this context, an iso mapping the left-hand side of (2.81) to its right-hand side. Its inverse is ⟨id, c⟩, where c is a particular choice of inhabitant for datatype 1. Concerning (2.82), A^0 denotes the set of all functions from the empty set to some A. What does (2.82) mean? It simply tells


44    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

us that there is only one function in such a set — the empty function, mapping "no" value at all. This fact confirms our choice of notation once again (compare with a^0 = 1 in a numeric context).

Next, we may wonder about facts

    1^A ≅ 1    (2.83)
    A^1 ≅ A    (2.84)

which are the functional-exponentiation counterparts of 1^a = 1 and a^1 = a. Fact (2.83) is valid: it means that there is only one function mapping A to some singleton set {c} — the constant function c. There is no room for another function in 1^A because only c is available as output value. Fact (2.84) is also valid: all functions in A^1 are (single-valued) constant functions and there are as many constant functions in such a set as there are elements in A.

In summary, when referring to datatype 1 we will mean an arbitrary singleton type, and there is a unique iso (and its inverse) between two such singleton types. The HASKELL representative of 1 is datatype (), called the unit type, which contains exactly one constructor, also written (). It may seem confusing to denote the type and its unique inhabitant by the same symbol, but it is not, since HASKELL keeps track of types and constructors in separate symbol sets.

Finally, what can we say about 1 + A? Every function f : 1 + A → B observing this type is bound to be an either [ b_0, g ] for b_0 ∈ B and g : A → B. This is very similar to the handling of a pointer in C or PASCAL: we "pull a rope" and either we get nothing (1) or we get something useful of type B. In such a programming context "nothing" above means a predefined value NIL. This analogy supports our preference in the sequel for NIL as canonical inhabitant of datatype 1. In fact, we will refer to 1 + A (or A + 1) as the "pointer to A" datatype.
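The "pointer to A" construction can be sketched in HASKELL as follows; this is an encoding of ours (names toMaybe and fromM are hypothetical), with Either () a playing 1 + A:

```haskell
-- The iso between 1 + A (encoded as Either () a) and the optional-value type.
toMaybe :: Either () a -> Maybe a
toMaybe (Left ()) = Nothing   -- the NIL case: "nothing useful"
toMaybe (Right a) = Just a    -- "something useful of type A"

fromM :: Maybe a -> Either () a
fromM Nothing  = Left ()
fromM (Just a) = Right a

main :: IO ()
main =
  if map (toMaybe . fromM) [Nothing, Just 1, Just (2 :: Int)]
       == [Nothing, Just 1, Just 2]
     && map (fromM . toMaybe) [Left (), Right (3 :: Int)] == [Left (), Right 3]
    then putStrLn "1 + A and Maybe are isomorphic on samples"
    else error "not an iso"
```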
This corresponds to the Maybe type constructor of the HUGS Standard Prelude.

Datatype 2

Let us inspect the 1 + 1 instance of the "pointer" construction just mentioned above. Any observation f : 1 + 1 → B can be decomposed in two constant functions: f = [ b_1, b_2 ]. Now suppose that B = {b_1, b_2} (for b_1 ≠ b_2). Then 1 + 1 ≅ B will hold, for whatever choice of inhabitants b_1 and b_2. So we are in a situation similar to 1: we will use symbol 2 to represent the abstract class of all such Bs containing exactly two elements. Therefore, we can write:

    1 + 1 ≅ 2
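Choosing Bool as the representative of datatype 2, the isomorphism 1 + 1 ≅ 2 can be sketched as follows (an encoding of ours, not from the book, with Either () () playing 1 + 1):

```haskell
-- b1 = True and b2 = False are our chosen inhabitants of B.
toBool :: Either () () -> Bool
toBool = either (const True) (const False)   -- the either [b1, b2]

fromBool :: Bool -> Either () ()
fromBool b = if b then Left () else Right ()

main :: IO ()
main =
  if map (toBool . fromBool) [True, False] == [True, False]
     && map (fromBool . toBool) [Left (), Right ()] == [Left (), Right ()]
    then putStrLn "1 + 1 and Bool are isomorphic on samples"
    else error "not an iso"
```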


2.16. FINITARY PRODUCTS AND COPRODUCTS    45

Of course, Bool = {TRUE, FALSE} and initial segment 2 = {1, 2} are in this abstract class. In the sequel we will show some preference for the particular choice of inhabitants b_1 = TRUE and b_2 = FALSE, which enables us to use symbol 2 in places where Bool is expected.

Exercise 2.17. Relate HASKELL expressions

    either (split (const True) id) (split (const False) id)

and

    \f -> (f True, f False)

to the following isomorphisms involving generic elementary type 2:

    2 × A ≅ A + A    (2.85)
    A × A ≅ A^2    (2.86)

Apply the exchange law (2.47) to the first expression above.
✷

2.16 Finitary products and coproducts

In section 2.8 it was suggested that product could be regarded as the abstraction behind data-structuring primitives such as struct in C or record in PASCAL. Similarly, coproducts were suggested in section 2.9 as abstract counterparts of C unions or PASCAL variant records. For a finite A, exponential B^A could be realized as an array in any of these languages. These analogies are captured in table 2.1.

In the same way C structs and unions may contain finitely many entries, as may PASCAL (variant) records, product A × B extends to finitary product A_1 × ... × A_n, for n ∈ IN, also denoted by Π_{i=1}^{n} A_i, to which as many projections π_i are associated as the number n of factors involved. Of course, splits become n-ary as well:

    ⟨f_1, ..., f_n⟩ : B → A_1 × ... × A_n

for f_i : B → A_i, i = 1, n.

Dually, coproduct A + B is extensible to the finitary sum A_1 + ... + A_n, for n ∈ IN, also denoted by Σ_{j=1}^{n} A_j, to which as many injections i_j are assigned as the number n of terms involved. Similarly, eithers become n-ary:

    [ f_1, ..., f_n ] : A_1 + ... + A_n → B

for f_i : A_i → B, i = 1, n.


46    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

    Abstract notation | PASCAL                | C/C++                    | Description
    ------------------+-----------------------+--------------------------+-----------------
    A × B             | record                | struct {                 | Records
                      |   P: A;               |   A first;               |
                      |   S: B                |   B second;              |
                      | end;                  | };                       |
    ------------------+-----------------------+--------------------------+-----------------
    A + B             | record                | struct {                 | Variant records
                      |   case                |   int tag; /* 1,2 */     |
                      |     tag: integer of   |   union {                |
                      |       1: (P: A);      |     A ifA;               |
                      |       2: (S: B)       |     B ifB;               |
                      | end;                  |   } data;                |
                      |                       | };                       |
    ------------------+-----------------------+--------------------------+-----------------
    B^A               | array[A] of B         | B ...[A]                 | Arrays
    ------------------+-----------------------+--------------------------+-----------------
    1 + A             | ^A                    | A *...                   | Pointers

Table 2.1: Abstract notation versus programming language data-structures.

Datatype n

Next after 2, we may think of 3 as representing the abstract class of all datatypes containing exactly three elements. Generalizing, we may think of n as representing the abstract class of all datatypes containing exactly n elements. Of course, initial segment n will be in this abstract class. (Recall (2.17), for instance: both Weekday and 7 are abstractly represented by 7.) Therefore,

    n ≅ 1 + ··· + 1    (n times)

and

    A × ... × A (n times) ≅ A^n    (2.87)
    A + ... + A (n times) ≅ n × A    (2.88)

hold.

Exercise 2.18. On the basis of table 2.1, encode undistr (2.49) in C or PASCAL. Compare your code with the HASKELL pointfree and pointwise equivalents.
✷


2.17. INITIAL AND TERMINAL DATATYPES    47

2.17 Initial and terminal datatypes

All properties studied for binary splits and binary eithers extend to the finitary case. For the particular situation n = 1, we will have ⟨f⟩ = [ f ] = f and π_1 = i_1 = id, of course. For the particular situation n = 0, finitary products "degenerate" to 1 and finitary coproducts "degenerate" to 0. So diagrams (2.21) and (2.36) are reduced to

    C --⟨⟩--> 1        0 --[ ]--> C

The standard notation for the empty split ⟨⟩ is !_C, where subscript C can be omitted if implicit in the context. By the way, this is precisely the only function in 1^C, recall (2.83). Dually, the standard notation for the empty either [ ] is ?_C, where subscript C can also be omitted. By the way, this is precisely the only function in C^0, recall (2.82).

In summary, we may think of 0 and 1 as, in a sense, the "extremes" of the whole datatype spectrum. For this reason they are called initial and terminal, respectively. We conclude this subject with the presentation of their main properties which, as we have said, are instances of properties we have stated for products and coproducts.

Initial datatype reflexion:

    ?_0 = id_0    (2.89)

Initial datatype fusion (for any f : A → B):

    f · ?_A = ?_B    (2.90)

Terminal datatype reflexion:

    !_1 = id_1    (2.91)
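The unique arrows !_C : C → 1 and ?_C : 0 → C can be sketched in HASKELL as follows. This is an encoding of ours using modern GHC Haskell (the EmptyCase extension postdates the HUGS era assumed by the book), with () playing 1 and an empty data declaration playing 0:

```haskell
{-# LANGUAGE EmptyCase #-}

data Zero                -- datatype 0: no constructors at all

bang :: c -> ()          -- !_C, the empty split: the only function in 1^C
bang _ = ()

absurd :: Zero -> c      -- ?_C, the empty either: the only function in C^0
absurd z = case z of {}  -- nothing to match: 0 has no inhabitants

main :: IO ()
main = if map bang "abc" == [(), (), ()]
         then putStrLn "bang is constant ()"
         else error "impossible"
```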


48    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Terminal datatype fusion (for any f : B → A):

    !_A · f = !_B    (2.92)

Exercise 2.19. Particularize the exchange law (2.47) to empty products and empty coproducts, i.e. 1 and 0.
✷

2.18 Sums and products in HASKELL

We conclude this chapter with an analysis of the main primitive available in HASKELL for creating datatypes: the data declaration. Suppose we declare

    data CostumerId = P Int | CC Int

meaning to say that, for some company, a client is identified either by its passport number or by its credit card number, if any. What does this piece of syntax precisely mean?

If we enquire the HUGS interpreter about what it knows about CostumerId, the reply will contain the following information:

    Main> :i CostumerId
    -- type constructor
    data CostumerId

    -- constructors:
    P :: Int -> CostumerId
    CC :: Int -> CostumerId

In general, let A and B be two known datatypes. Via declaration

    data C = C1 A | C2 B    (2.93)
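Such a declaration packages the two constructors into a single arrow from A + B to C, together with an inverse that takes C-values apart again. The following is a sketch of ours, instantiating A = Int and B = String for concreteness:

```haskell
data C = C1 Int | C2 String deriving (Eq, Show)

alg :: Either Int String -> C    -- the arrow [C1, C2] built from the constructors
alg = either C1 C2

inv :: C -> Either Int String    -- its inverse, taking C-values apart
inv (C1 a) = Left a              -- inv . C1 = i1
inv (C2 b) = Right b             -- inv . C2 = i2

main :: IO ()
main =
  if map (alg . inv) [C1 7, C2 "hi"] == [C1 7, C2 "hi"]
     && map (inv . alg) [Left 7, Right "hi"] == [Left 7, Right "hi"]
    then putStrLn "alg and inv are mutually inverse on samples"
    else error "not inverse"
```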


50    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Let us use this in calculating the inverse inv of [ C1, C2 ]:

        inv · [ C1, C2 ] = id
    ≡        { by +-fusion (2.40) }
        [ inv · C1, inv · C2 ] = id
    ≡        { by +-reflexion (2.39) }
        [ inv · C1, inv · C2 ] = [ i_1, i_2 ]
    ≡        { either structural equality (2.58) }
        inv · C1 = i_1  ∧  inv · C2 = i_2

Therefore:

    inv : C → A + B
    inv (C1 a) = i_1 a
    inv (C2 b) = i_2 b

In summary, C1 is a "renaming" of injection i_1, C2 is a "renaming" of injection i_2, and C is a "renamed" replica of A + B:

    [ C1, C2 ] : A + B → C

[ C1, C2 ] is called the algebra of datatype C and its inverse inv is called the coalgebra of C. The algebra contains the constructors C1 and C2 of type C, that is, it is used to "build" C-values. In the opposite direction, coalgebra inv enables us to "destroy" or observe values of C:

    C ≅ A + B,  witnessed by coalgebra inv and algebra [ C1, C2 ]

Algebras/coalgebras also arise about product datatypes. For instance, suppose that one wishes to describe datatype Point, inhabited by pairs (x_0, y_0), (x_1, y_1), etc. of Cartesian coordinates of a given type, say A. Although A × A equipped with projections π_1, π_2 "is" such a datatype, one may be interested in a suitably named replica of A × A in which points are built explicitly by some constructor (say Point) and observed by dedicated selectors (say x and y), such that

    x · Point = π_1  and  y · Point = π_2    (2.94)
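Before looking at how the book encodes this, here is a sketch of ours of the situation captured by (2.94), instantiating A = Double and pairing the constructor with the split of the two selectors:

```haskell
data Point = Point { x :: Double, y :: Double } deriving (Eq, Show)

-- The constructor in uncurried form, i.e. the algebra A × A -> Point:
mkPoint :: (Double, Double) -> Point
mkPoint (a, b) = Point a b

-- The coalgebra <x, y>, observing a Point as a pair of coordinates:
coalg :: Point -> (Double, Double)
coalg p = (x p, y p)

main :: IO ()
main =
  if coalg (mkPoint (1.0, 2.0)) == (1.0, 2.0)           -- x.Point = pi1, y.Point = pi2
     && mkPoint (coalg (Point 3.0 4.0)) == Point 3.0 4.0
    then putStrLn "Point behaves as a named replica of A x A"
    else error "not inverse"
```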


2.19. EXERCISES    51

This gives rise to an algebra (Point) and a coalgebra (⟨x, y⟩) for datatype Point:

    Point ≅ A × A,  witnessed by coalgebra ⟨x, y⟩ and algebra Point

In HASKELL one writes

    data Point a = Point { x :: a, y :: a }

but be warned that HASKELL delivers Point in curried form:

    Point :: a -> a -> Point a

Finally, what is the "pointer"-equivalent in HASKELL? This corresponds to A = 1 in (2.93) and to the following HASKELL declaration:

    data C = C1 () | C2 B

Note that HASKELL allows for a more programming-oriented alternative in this case, in which the unit type () is eliminated:

    data C = C1 | C2 B

The difference is that here C1 denotes an inhabitant of C (and so a clause f (C1 a) = ... is rewritten to f C1 = ...), while above C1 denotes a (constant) function C1 : 1 → C. Isomorphism (2.84) helps in comparing these two alternative situations.

2.19 Exercises

Exercise 2.20. Let A and B be two disjoint datatypes, that is, A ∩ B = ∅ holds. Show that isomorphism

    A ∪ B ≅ A + B    (2.95)

holds. Hint: define i : A + B → A ∪ B as i = [ emb_A, emb_B ], for emb_A a = a and emb_B b = b, and find its inverse. By the way, why didn't we define i simply as i def= [ id_A, id_B ]?
✷


52    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

Exercise 2.21. Let distr (read: 'distribute right') be the bijection which witnesses isomorphism A × (B + C) ≅ A × B + A × C. Fill in the "···" in the diagram which follows so that it describes bijection distl (read: 'distribute left'), which witnesses isomorphism (B + C) × A ≅ B × A + C × A:

    (B + C) × A --swap--> ··· --distr--> ··· --···--> B × A + C × A
         \________________________distl_________________________/

✷

Exercise 2.22. In the context of exercise 2.21, prove

    [ g, h ] × f = [ g × f, h × f ] · distl    (2.96)

knowing that

    f × [ g, h ] = [ f × g, f × h ] · distr

holds.
✷

Exercise 2.23. Show that \overline{f · ap} g = f · g holds, cf. (2.71).
✷

Exercise 2.24. Let const : C → C^A be the function of exercise 2.1, that is, const c = the everywhere-c constant function. Which fact is expressed by the following diagram featuring const?

    C --const--> C^A
    |             |
    f            f^A
    v             v
    B --const--> B^A

Write it at point-level and describe it by your own words.
✷


2.20. BIBLIOGRAPHY NOTES    53

Exercise 2.25. Establish the difference between the following two declarations in HASKELL,

    data D = D1 A | D2 B C
    data E = E1 A | E2 (B,C)

for A, B and C any three predefined types. Are D and E isomorphic? If so, can you specify and encode the corresponding isomorphism?
✷

2.20 Bibliography notes

A few decades ago John Backus read, in his Turing Award Lecture, a revolutionary paper [Bac78]. This paper proclaimed conventional command-oriented programming languages obsolete because of their inefficiency, arising from retaining, at a high level, the so-called "memory access bottleneck" of the underlying computation model — the well-known von Neumann architecture. Alternatively, the (at the time already mature) functional programming style was put forward for two main reasons. Firstly, because of its potential for concurrent and parallel computation. Secondly — and Backus' emphasis was really put on this — because of its strong algebraic basis.

Backus' algebra of (functional) programs was providential in alerting computer programmers that computer languages alone are insufficient, and that only languages which exhibit an algebra for reasoning about the objects they purport to describe will be useful in the long run.

The impact of Backus' first argument in the computing science and computer architecture communities was considerable, in particular if assessed in quality rather than quantity and in addition to the almost contemporary structured programming trend³.
By contrast, his second argument for changing computer programming was by and large ignored, and only the so-called algebra-of-programming research minorities pursued this direction. However, the advances in this area throughout the last two decades are impressive and can be fully appreciated by reading a textbook written relatively recently by Bird and de Moor

³ Even the C programming language and the UNIX operating system, with their implicit functional flavour, may be regarded as subtle outcomes of the "going functional" trend.


54    CHAPTER 2. AN INTRODUCTION TO POINTFREE PROGRAMMING

[BdM97]. A comprehensive review of the voluminous literature available in this area can also be found in this book.

Although the need for a pointfree algebra of programming was first identified by Backus, perhaps influenced by the growing popularity of Iverson's APL in the USA at that time, the idea of reasoning and using mathematics to transform programs is much older and can be traced to the times of McCarthy's work on the foundations of computer programming [McC63], of Floyd's work on program meaning [Flo67] and of Paterson and Hewitt's comparative schematology [PH70]. Work of the so-called program transformation school was already very expressive in the mid 1970s; see for instance reference [BD77].

The mathematics adequate for the effective integration of these related but independent lines of thought was provided by the categorial approach of Manes and Arbib, compiled in a textbook [MA86] which has very strongly influenced the last decade of 20th-century theoretical computer science.

A so-called MPC ("Mathematics of Program Construction") community has been among the most active in producing an integrated body of knowledge on the algebra of programming, which has found in functional programming an eloquent and paradigmatic medium. Functional programming has a tradition of absorbing fresh results from theoretical computer science, algebra and category theory. Languages such as HASKELL [Bir98] have been competing to integrate the most recent developments and therefore are excellent prototyping vehicles in courses on program calculation, as happens with this book.


Chapter 3

Recursion in the Pointfree Style

How useful from a programmer's point of view are the abstract concepts presented in the previous chapter? Recall that a table was presented — table 2.1 — which records an analogy between abstract type notation and the corresponding data-structures available in common, imperative languages.

This analogy is precisely our point of departure for extending the abstract notation towards a most important field of programming: recursion.

3.1 Motivation

Let us consider a very common data-structure in programming: "linked-lists". In PASCAL one will write

    L = ^N;
    N = record
          first: A;
          next: ^N
        end;

to specify such a data-structure L. This consists of a pointer to a node (N), where a node is a record structure which puts some predefined type A together with a pointer to another node, and so on. In the C programming language, every x ∈ L will be declared as

    L x;

in the context of datatype definition

    typedef struct N {
        A first;


56    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        struct N *next;
    } *L;

and so on.

What interests us in such "first year programming course" datatype declarations? Records and pointers have already been dealt with in table 2.1. So we can use this table to find the abstract version of datatype L, by replacing pointers by the "1 + ···" notation and records (structs) by the "... × ..." notation:

    { L = 1 + N
    { N = A × (1 + N)    (3.1)

We obtain a system of two equations on unknowns L and N, in which L's dependence on N can be removed by substitution:

        { L = 1 + N
        { N = A × (1 + N)
    ≡        { substituting L for 1 + N in the second equation }
        { L = 1 + N
        { N = A × L
    ≡        { substituting A × L for N in the first equation }
        { L = 1 + A × L
        { N = A × L

System (3.1) is thus equivalent to:

    { L = 1 + A × L
    { N = A × (1 + N)    (3.2)

Intuitively, L abstracts the "possibly empty" linked-list of elements of type A, while N abstracts the "non-empty" linked-list of elements of type A. Note that L and N are independent of each other, but also that each depends on itself. Can we solve these equations in a way such that we obtain "solutions" for L and N, in the same way we do with school equations such as, for instance,

    x = 1 + x/2 ?    (3.3)

Concerning this equation, let us recall how we would go about it in school mathematics:

        x = 1 + x/2
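System (3.1) transliterates directly into a pair of mutually recursive HASKELL declarations. The following sketch is ours (names and the instantiation A = Int are hypothetical), included only to make the two unknowns L and N concrete:

```haskell
-- L = 1 + N : a possibly empty linked list.
data L = Empty | NonEmpty N deriving (Eq, Show)

-- N = A × (1 + N), i.e. A × L after the substitution: a non-empty list.
data N = Node Int L deriving (Eq, Show)

-- The linked list 2 -> 1 -> NIL:
sample :: L
sample = NonEmpty (Node 2 (NonEmpty (Node 1 Empty)))

-- Converting to a native HASKELL list:
toList :: L -> [Int]
toList Empty                 = []
toList (NonEmpty (Node a l)) = a : toList l

main :: IO ()
main = if toList sample == [2, 1]
         then putStrLn "linked list 2->1->NIL converts to [2,1]"
         else error "mismatch"
```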


3.1. MOTIVATION    57

    ≡        { adding −x/2 to both sides of the equation }
        x − x/2 = 1 + x/2 − x/2
    ≡        { −x/2 cancels x/2 }
        x − x/2 = 1
    ≡        { multiplying both sides of the equation by 2 etc. }
        2 × x − x = 2
    ≡        { subtraction }
        x = 2

We very quickly get solution x = 2. However, many steps were omitted from the actual calculation. This unfolds into the longer sequence of more elementary steps which follows, in which notation a − b abbreviates a + (−b) and a/b abbreviates a × 1/b, for b ≠ 0:

        x = 1 + x/2
    ≡        { adding −x/2 to both sides of the equation }
        x − x/2 = (1 + x/2) − x/2
    ≡        { + is associative }
        x − x/2 = 1 + (x/2 − x/2)
    ≡        { −x/2 is the additive inverse of x/2 }
        x − x/2 = 1 + 0
    ≡        { 0 is the unit of addition }
        x − x/2 = 1
    ≡        { multiplying both sides of the equation by 2 }
        2 × (x − x/2) = 2 × 1
    ≡        { 1 is the unit of multiplication }
        2 × (x − x/2) = 2
    ≡        { multiplication distributes over addition }


58    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        2 × x − 2 × x/2 = 2
    ≡        { 2 cancels its inverse 1/2 }
        2 × x − 1 × x = 2
    ≡        { multiplication distributes over addition }
        (2 − 1) × x = 2
    ≡        { 2 − 1 = 1 and 1 is the unit of multiplication }
        x = 2

Back to (3.2), we would like to submit each of the equations, e.g.

    L = 1 + A × L    (3.4)

to a similar reasoning. Can we do it? The analogy which can be found between this equation and (3.3) goes beyond pattern similarity. From chapter 2 we know that many properties required in the reasoning above hold in the context of (3.4), provided the "=" sign is replaced by the "≅" sign, that of set-theoretical isomorphism. Recall that, for instance, + is associative (2.46), 0 is the unit of addition (2.79), 1 is the unit of multiplication (2.81), multiplication distributes over addition (2.50), etc. Moreover, the first step above assumed that addition is compatible (monotonic) with respect to equality,

    a = b    c = d
    --------------
     a + c = b + d

a fact which still holds when numeric equality gives place to isomorphism and numeric addition gives place to coproduct:

    A ≅ B    C ≅ D
    --------------
     A + C ≅ B + D

— recall (2.44) for isos f and g.

Unfortunately, the main steps in the reasoning above are concerned with two basic cancellation properties

    x + b = c  ≡  x = c − b
    x × b = c  ≡  x = c/b    (b ≠ 0)

which hold about numbers but do not hold about datatypes. In fact, neither products nor


3.1. MOTIVATION    59

coproducts have arbitrary inverses¹, and so we cannot "calculate by cancellation". How do we circumvent this limitation?

Just think of how we would have gone about (3.3) in case we didn't know about the cancellation properties: we would be bound to the substitution of 1 + x/2 for x, plus the other properties. By performing such a substitution over and over again we would obtain:

        x = 1 + x/2
    ≡        { substitution of 1 + x/2 for x, followed by simplification }
        x = 1 + (1 + x/2)/2 = 1 + 1/2 + x/4
    ≡        { the same as above }
        x = 1 + 1/2 + (1 + x/2)/4 = 1 + 1/2 + 1/4 + x/8
    ≡        { over and over again, n times }
        ···
    ≡        { simplification }
        x = Σ_{i=0}^{n} 1/2^i + x/2^(n+1)
    ≡        { sum of the first n terms of a geometric progression }
        x = (2 − 1/2^n) + x/2^(n+1)
    ≡        { let n → ∞ }
        x = (2 − 0) + 0
    ≡        { simplification }
        x = 2

Clearly, this is a much more complicated way of finding solution x = 2 for equation (3.3). But we would have loved it in case it were the only known way, and this is precisely what happens with respect to (3.4). In this case we have:

        L = 1 + A × L
    ≡        { substitution of 1 + A × L for L }

¹ The initial and terminal datatypes do have inverses — 0 is its own "additive inverse" and 1 is its own "multiplicative inverse" — but not all the others.


60    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

        L = 1 + A × (1 + A × L)
    ≡        { distributive property (2.50) }
        L ≅ 1 + A × 1 + A × (A × L)
    ≡        { unit of product (2.81) and associativity of product (2.32) }
        L ≅ 1 + A + (A × A) × L
    ≡        { by (2.82), (2.84) and (2.87) }
        L ≅ A^0 + A^1 + A^2 × L
    ≡        { another substitution as above and similar simplifications }
        L ≅ A^0 + A^1 + A^2 + A^3 × L
    ≡        { after (n + 1)-many similar steps }
        L ≅ Σ_{i=0}^{n} A^i + A^(n+1) × L

Bearing a large n in mind, let us deliberately (but temporarily) ignore term A^(n+1) × L. Then L will be isomorphic to the sum of n-many contributions A^i,

    L ≅ Σ_{i=0}^{n} A^i

each of them consisting of i-long tuples, or sequences, of values of A. (Number i is said to be the length of any sequence in A^i.) Such sequences will be denoted by enumerating their elements between square brackets, for instance the empty sequence [ ], which is the only inhabitant in A^0, the two-element sequence [a_1, a_2], which belongs to A^2 provided a_1, a_2 ∈ A, and so on. Note that all such contributions are mutually disjoint, that is, A^i ∩ A^j = ∅ wherever i ≠ j. (In other words, a sequence of length i is never a sequence of length j, for i ≠ j.) If we join all contributions A^i into a single set, we obtain the set of all finite sequences on A, denoted by A* and defined as follows:

    A*  def=  ∪_{i≥0} A^i    (3.5)

The intuition behind taking the limit in the numeric calculation above was that term x/2^(n+1) was getting smaller and smaller as n went larger and larger and, "in the limit", it could be ignored. By analogy, taking a similar limit in the calculation just sketched above will mean that, for a "sufficiently large" n, the sequences in A^n are so long that it is very unlikely that we will ever use them! So, for n → ∞ we obtain

    L ≅ Σ_{i=0}^{∞} A^i


3.2. INTRODUCING INDUCTIVE DATATYPES    61

Because Σ_{i=0}^{∞} A^i is isomorphic to ∪_{i=0}^{∞} A^i (see exercise 2.20), we finally have:

    L ≅ A*

All in all, we have obtained A* as a solution to equation (3.4). In other words, datatype L is isomorphic to the datatype which contains all finite sequences of some predefined datatype A. This corresponds to the HASKELL [a] datatype, in general. Recall that we started from the "linked-list datatype" expressed in PASCAL or C. In fact, wherever the C programmer thinks of linked-lists, the HASKELL programmer will think of finite sequences.

But, what does equation (3.4) mean in fact? Is A* the only solution to this equation? Back to the numeric field, we know of equations which have more than one solution — for instance x = (x² + 3)/4, which admits two solutions, 1 and 3 —, which have no solution at all — for instance x = x + 1 —, or which admit an infinite number of solutions — for instance x = x.

We will address these topics in the next section about inductive datatypes and in chapter 7, where the formal semantics of recursion will be made explicit. This is where the "limit" constructions used informally in this section will be shown to make sense.

3.2 Introducing inductive datatypes

Datatype L as defined by (3.4) is said to be recursive because L "recurs" in the definition of L itself². From the discussion above, it is clear that set-theoretical equality "=" in this equation should give place to set-theoretical isomorphism ("≅"):

    L ≅ 1 + A × L    (3.6)

Which isomorphism in : 1 + A × L → L do we expect to witness (3.4)? This will depend on which particular solution to (3.4) we are thinking of. So far we have seen only one, A*.
By recalling the notion of algebra of a datatype (section 2.18), we may rephrase the question as: which algebra

    in : 1 + A × A* → A*

do we expect to witness the tautology which arises from (3.4) by replacing unknown L with solution A*, that is

    A* ≅ 1 + A × A* ?

² By analogy, we may regard (3.3) as a "recursive definition" of number 2.
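One possible encoding of this algebra and its inverse can be sketched in HASKELL over native lists, which here stand in for A*; the names inL and outL are ours:

```haskell
-- The list algebra in : 1 + A × A* -> A* and its inverse coalgebra,
-- with Either () (a, [a]) playing 1 + A × A*.
inL :: Either () (a, [a]) -> [a]
inL = either (const []) (uncurry (:))   -- the either [nil, cons]

outL :: [a] -> Either () (a, [a])       -- taking a sequence apart
outL []      = Left ()
outL (a : l) = Right (a, l)

main :: IO ()
main =
  if inL (outL [1, 2, 3 :: Int]) == [1, 2, 3]
     && outL (inL (Left ())) == (Left () :: Either () (Int, [Int]))
    then putStrLn "inL and outL are mutually inverse on samples"
    else error "not inverse"
```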


3.2. INTRODUCING INDUCTIVE DATATYPES    63

        [ [ ], cons · ⟨hd, tl⟩ ] · (=[ ] ?) = id
    ≡        { property of sequences: cons(hd s, tl s) = s }
        [ [ ], id ] · (=[ ] ?) = id
    ≡        { going pointwise (2.60) }
        { =[ ] a    ⇒ [ [ ], id ] (i_1 a) = a
        { ¬(=[ ] a) ⇒ [ [ ], id ] (i_2 a) = a
    ≡        { +-cancellation (2.38) }
        { =[ ] a    ⇒ [ ] a = a
        { ¬(=[ ] a) ⇒ id a = a
    ≡        { a = [ ] in one case and identity function (2.9) in the other }
        { a = [ ]    ⇒ a = a
        { ¬(a = [ ]) ⇒ a = a
    ≡        { property (p → f, f) = f holds }
        a = a

A comment on the particular choice of terminology above: symbol in suggests that we are going inside, or constructing (synthesizing), values of A*; symbol out suggests that we are going out, or destructing (analyzing), values of A*. We shall often resort to this duality in the sequel.

Are there more solutions to equation (3.6)? In trying to implement this equation, a HASKELL programmer could have written, after the declaration of type A, the following datatype declaration:

    data L = Nil () | Cons (A,L)

which, as we have seen in section 2.18, can be written simply as

    data L = Nil | Cons (A,L)    (3.13)

and generates the diagram

    1 --i_1--> 1 + A × L <--i_2-- A × L
        \          |in'          /
        Nil        v         Cons
          `------> L <-------'           (3.14)

leading to algebra in' = [ Nil, Cons ].


64    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

HASKELL seems to have generated another solution for the equation, which it calls L. To avoid the inevitable confusion between this symbol denoting the newly created datatype and symbol L in equation (3.6), which denotes a mathematical variable, let us use symbol T to denote the former (T stands for "type"). This can be coped with very simply by writing T instead of L in (3.13):

    data T = Nil | Cons (A,T)    (3.15)

In order to make T more explicit, we will write in_T instead of in'.

Some questions are on demand at this point. First of all, what is datatype T? What are its inhabitants? Next, is in_T : 1 + A × T → T an iso or not?

HASKELL will help us to answer these questions. Suppose that A is a primitive numeric datatype, and that we add deriving Show to (3.15) so that we can "see" the inhabitants of the T datatype. The information associated to T is thus:

    Main> :i T
    -- type constructor
    data T

    -- constructors:
    Nil :: T
    Cons :: (A,T) -> T

    -- instances:
    instance Show T
    instance Eval T

By typing Nil

    Main> Nil
    Nil :: T

we confirm that Nil is itself an inhabitant of T, and by typing Cons

    Main> Cons :: (A,T) -> T

we realize that Cons is not so (as expected), but it can be used to build such inhabitants, for instance:

    Main> Cons(1,Nil)
    Cons (1,Nil) :: T


3.2. INTRODUCING INDUCTIVE DATATYPES    65

or

    Main> Cons(2,Cons(1,Nil))
    Cons (2,Cons (1,Nil)) :: T

etc. We conclude that expressions involving Nil and Cons are inhabitants of type T. Are these the only ones? The answer is yes because, by design of the HASKELL language, the constructors of type T will remain fixed once its declaration is interpreted, that is, no further constructor can be added to T. Does in_T have an inverse? Yes, its inverse is coalgebra

    out_T : T → 1 + A × T
    out_T Nil = i_1 NIL
    out_T (Cons(a, l)) = i_2 (a, l)    (3.16)

which can be straightforwardly encoded in HASKELL using the Either realization of + (recall sections 2.9 and 2.18):

    outT :: T -> Either () (A,T)
    outT Nil = Left ()
    outT (Cons(a,l)) = Right(a,l)

In summary, isomorphism

    T ≅ 1 + A × T,  witnessed by out_T and in_T    (3.17)

holds, where datatype T is inhabited by symbolic expressions which we may visualize very conveniently as trees, for instance

        Cons
        /  \
       2   Cons
           /  \
          1   Nil

picturing expression Cons(2,Cons(1,Nil)). Nil is the empty tree and Cons may be


66    CHAPTER 3. RECURSION IN THE POINTFREE STYLE

regarded as the operation which adds a new root and a new branch, say a, to a tree t:

    Cons(a, t)  =      Cons
                       /  \
                      a    t

The choice of symbols T, Nil and Cons was rather arbitrary in (3.15). Therefore, an alternative declaration such as, for instance,

    data U = Stop | Join (A,U)    (3.18)

would have been perfectly acceptable, generating another solution for the equation under algebra [ Stop, Join ]. It is easy to check that (3.18) is but a renaming of Nil to Stop and of Cons to Join. Therefore, both datatypes are isomorphic, or "abstractly the same".

Indeed, any other datatype X inductively defined by a constant and a binary constructor accepting A and X as parameters will be a solution to the equation. Because we are just renaming symbols in a consistent way, all such solutions are abstractly the same. All of them capture the abstract notion of a list of symbols.

We wrote "inductively" above because the set of all expressions (trees) which inhabit the type is defined by induction. Such types are called inductive and we shall have a lot more to say about them in chapter 7.

Exercise 3.1. Obviously,

    either (const []) (:)

does not work as a HASKELL realization of the mediating arrow in diagram (3.9). What do you need to write instead?
✷

3.3 Observing an inductive datatype

Suppose that one is asked to express a particular observation of an inductive type such as T (3.15), that is, a function f : T → B for some target type B. Suppose, for


instance, that A is IN0 (the set of all non-negative integers) and that we want to add all elements which occur in a T-list. Of course, we have to ensure that addition is available in IN0,

  add : IN0 × IN0 → IN0
  add(x,y) def= x + y

and that 0 ∈ IN0 is a value denoting "the addition of nothing". So constant arrow 0 : 1 → IN0 is available. Of course, add(0,x) = add(x,0) = x holds, for all x ∈ IN0. This property means that IN0, together with operator add and constant 0, forms a monoid, a very important algebraic structure in computing which will be exploited intensively later in this book. The following arrow "packaging" IN0, add and 0,

  [0, add] : 1 + IN0 × IN0 → IN0        (3.19)

is a convenient way to express such a structure. Combining this arrow with algebra

  in_T : 1 + IN0 × T → T        (3.20)

which defines T, and the function f we want to define, the target of which is B = IN0, we get the almost closed diagram which follows, in which only the dashed arrow is yet to be filled in:

          in_T
  T <----------- 1 + IN0 × T        (3.21)
  |                   :
  f                   :
  v                   v
  IN0 <--------- 1 + IN0 × IN0
       [0, add]

We know that in_T = [Nil, Cons]. A pattern for the missing arrow is not difficult to guess: in the same way f bridges T and IN0 on the lefthand side, it will do the same job on the righthand side. So pattern ··· + ··· × f comes to mind (recall section 2.10), where the "···" are very naturally filled in by identity functions. All in all, we obtain diagram

       [Nil, Cons]
  T <-------------- 1 + IN0 × T        (3.22)
  |                     |
  f                     | id + id × f
  v                     v
  IN0 <------------ 1 + IN0 × IN0
       [0, add]

which pictures the following property of f

  f · [Nil, Cons] = [0, add] · (id + id × f)        (3.23)


and is easy to convert to pointwise notation:

  f · [Nil, Cons] = [0, add] · (id + id × f)
≡    { (2.40) on the lefthand side, (2.41) and identity id on the righthand side }
  [f · Nil, f · Cons] = [0, add · (id × f)]
≡    { either structural equality (2.58) }
  f · Nil = 0
  f · Cons = add · (id × f)
≡    { going pointwise }
  (f · Nil) x = 0 x
  (f · Cons)(a,x) = (add · (id × f))(a,x)
≡    { composition (2.6), constant (2.12), product (2.22) and definition of add }
  f Nil = 0
  f (Cons(a,x)) = a + f x

Note that we could have used out_T in diagram (3.21),

        out_T
  T -----------> 1 + IN0 × T        (3.24)
  |                   |
  f                   | id + id × f
  v                   v
  IN0 <--------- 1 + IN0 × IN0
       [0, add]

obtaining another version of the definition of f,

  f = [0, add] · (id + id × f) · out_T        (3.25)

which would lead to exactly the same pointwise recursive definition:

  f = [0, add] · (id + id × f) · out_T
≡    { (2.41) and identity id on the righthand side }
  f = [0, add · (id × f)] · out_T
≡    { going pointwise on out_T (3.16) }
  f Nil = ([0, add · (id × f)] · out_T) Nil
  f (Cons(a,x)) = ([0, add · (id × f)] · out_T)(Cons(a,x))
≡    { definition of out_T (3.16) }


  f Nil = ([0, add · (id × f)] · i1) Nil
  f (Cons(a,x)) = ([0, add · (id × f)] · i2)(a,x)
≡    { +-cancellation (2.38) }
  f Nil = 0 Nil
  f (Cons(a,x)) = (add · (id × f))(a,x)
≡    { simplification }
  f Nil = 0
  f (Cons(a,x)) = a + f x

Pointwise f mirrors the structure of type T in having as many definition clauses as constructors in T. Such functions are said to be defined by induction on the structure of their input type. If we repeat this calculation for IN0* instead of T, that is, for

  out = (! + 〈hd, tl〉) · (=[ ] ?)

— recall (3.10) — taking the place of out_T, we get a "more algorithmic" version of f:

  f = [0, add] · (id + id × f) · (! + 〈hd, tl〉) · (=[ ] ?)
≡    { +-functor (2.42), identity and ×-absorption (2.25) }
  f = [0, add] · (! + 〈hd, f · tl〉) · (=[ ] ?)
≡    { +-absorption (2.41) and constant 0 }
  f = [0, add · 〈hd, f · tl〉] · (=[ ] ?)
≡    { going pointwise on guard =[ ] ? (2.60) and simplifying }
  f l = l = [ ]    ⇒ 0 l
        ¬(l = [ ]) ⇒ (add · 〈hd, f · tl〉) l
≡    { simplification }
  f l = l = [ ]    ⇒ 0
        ¬(l = [ ]) ⇒ hd l + f (tl l)

The outcome of this calculation can be encoded in HASKELL syntax as

  f l | l == [] = 0
      | otherwise = head l + f (tail l)

or


  f l = if l == []
        then 0
        else head l + f (tail l)

both requiring the equality predicate "==" and destructors "head" and "tail".

3.4 Synthesizing an inductive datatype

The issue which concerns us in this section dualizes what we have just dealt with: instead of analyzing or observing an inductive type such as T (3.15), we want to be able to synthesize (generate) particular inhabitants of T. In other words, we want to be able to specify functions with signature f : B → T for some given source type B. Let B = IN0 and suppose we want f to generate, for a given natural number n > 0, the list containing all numbers less than or equal to n in decreasing order

  Cons(n, Cons(n − 1, Cons(..., Nil)))

or the empty list Nil, in case n = 0.

Let us try and draw a diagram similar to (3.24) applicable to the new situation. In trying to "re-use" this diagram, it is immediate that arrow f should be reversed. Bearing duality in mind, we may feel tempted to reverse all arrows just to see what happens. Identity functions are their own inverses, and in_T takes the place of out_T:

          in_T
  T <----------- 1 + IN0 × T
  ^                   ^
  f                   | id + id × f
  |                   |
  IN0 ...........> 1 + IN0 × IN0

Interestingly enough, the bottom arrow is the one which is not obvious to reverse, meaning that we have to "invent" a particular destructor of IN0, say

  g : IN0 → 1 + IN0 × IN0

fitting in the diagram and generating the particular computational effect we have in mind. Once we do this, a recursive definition for f will pop out immediately,

  f = in_T · (id + id × f) · g        (3.26)

which is equivalent to:

  f = [Nil, Cons · (id × f)] · g        (3.27)


Because we want f 0 = Nil to hold, g (the actual generator of the computation) should distinguish input 0 from all the others. One thus decomposes g as follows,

        =0 ?             ! + h
  IN0 --------> IN0 + IN0 --------> 1 + IN0 × IN0
   \_______________________________/
                  g

leaving h to fill in. This will be a split providing, on the lefthand side, for the value to be Cons'ed to the output and, on the righthand side, for the "seed" to the next recursive call. Since we want the output values to be produced contiguously and in decreasing order, we may define h = 〈id, pred〉 where, for n > 0,

  pred n def= n − 1        (3.28)

computes the predecessor of n. Altogether, we have synthesized

  g = (! + 〈id, pred〉) · (=0 ?)        (3.29)

Filling this in (3.27) we get

  f = [Nil, Cons · (id × f)] · (! + 〈id, pred〉) · (=0 ?)
≡    { +-absorption (2.41) followed by ×-absorption (2.25) etc. }
  f = [Nil, Cons · 〈id, f · pred〉] · (=0 ?)
≡    { going pointwise on guard =0 ? (2.60) and simplifying }
  f n = n = 0    ⇒ Nil
        ¬(n = 0) ⇒ Cons(n, f(n − 1))

which matches the function we had in mind:

  f n | n == 0 = Nil
      | otherwise = Cons(n, f(n-1))

We shall see briefly that the constructions of the f function adding up a list of numbers in the previous section and, in this section, of the f function generating a list of numbers are very standard in algorithm design and can be broadly generalized. Let us first introduce some standard terminology.

3.5 Introducing (list) catas, anas and hylos

Suppose that, back in section 3.3, we want to multiply, rather than add, the elements occurring in lists of type T (3.15). How much of the program synthesis effort presented there can be reused in the design of the new function?


It is intuitive that only the bottom arrow [0, add] : 1 + IN0 × IN0 → IN0 of diagram (3.24) needs to be replaced, because this is the only place where we can specify that target datatype IN0 is now regarded as the carrier of another (multiplicative rather than additive) monoidal structure,

  [1, mul] : 1 + IN0 × IN0 → IN0        (3.30)

for mul(x,y) def= x y. We are saying that the argument list is now to be reduced by the multiplication operator and that output value 1 is expected as the result of "nothing left to multiply".

Moreover, in the previous section we might have wanted our number-list generator to produce the list of even numbers smaller than a given number, in decreasing order (see exercise 3.4). Intuition will once again help us in deciding that only arrow g in (3.26) needs to be updated.

The following diagrams generalize both constructions by leaving such bottom arrows unspecified,

        out_T                               in_T
  T ----------> 1 + IN0 × T        T <----------- 1 + IN0 × T        (3.31)
  |                  |             ^                    ^
  f                  | id+id×f     f                    | id+id×f
  v                  v             |                    |
  B <---------- 1 + IN0 × B        B ..............> 1 + IN0 × B
        g                                   g

and express their duality (cf. the directions of the arrows). It so happens that, for each of these diagrams, f is uniquely dependent on the g arrow, that is to say, each particular instantiation of g will determine the corresponding f. So both gs can be regarded as "seeds" or "genetic material" of the f functions they uniquely define.³

Following the standard terminology, we express these facts by writing f = (|g|) with respect to the lefthand side diagram and by writing f = [(g)] with respect to the righthand side diagram. Read (|g|) as "the T-catamorphism induced by g" and [(g)] as "the T-anamorphism induced by g". This terminology is derived from the Greek words κατα (cata) and ανα (ana) meaning, respectively, "downwards" and "upwards" (compare with the direction of the f arrow in each diagram).
³ The theory which supports the statements of this paragraph will not be dealt with until chapter 7.

The exchange of parentheses "( )" and "[ ]" in the double parentheses "(| |)" and "[( )]" is aimed at expressing the duality of both concepts. We shall have a lot to say about catamorphisms and anamorphisms of a given type such as T. For the moment, it suffices to say that

• the T-catamorphism induced by g : 1 + IN0 × B → B is the unique function (|g|) : T → B


which obeys property (or is defined by)

  (|g|) = g · (id + id × (|g|)) · out_T        (3.32)

which is the same as

  (|g|) · in_T = g · (id + id × (|g|))        (3.33)

• given g : B → 1 + IN0 × B, the T-anamorphism induced by g is the unique function [(g)] : B → T which obeys property (or is defined by)

  [(g)] = in_T · (id + id × [(g)]) · g        (3.34)

From (3.31) it can be observed that T can act as a mediator between any T-anamorphism and any T-catamorphism, that is to say, (|g|) : T → B composes with [(h)] : C → T, for some h : C → 1 + IN0 × C. In other words, a T-catamorphism can always observe (consume) the output of a T-anamorphism. The latter produces a list of IN0s which is consumed by the former. This is depicted in the diagram which follows:

          g
  B <----------- 1 + IN0 × B        (3.35)
  ^                   ^
  (|g|)               | id + id × (|g|)
  |                   |
  T <----------- 1 + IN0 × T
  ^     in_T          ^
  [(h)]               | id + id × [(h)]
  |                   |
  C -----------> 1 + IN0 × C
          h

What can we say about the (|g|) · [(h)] composition? It is a function of signature C → B which resorts to T as an intermediate data-structure and can be subject to the following calculation (cf. outermost rectangle in (3.35)):

  (|g|) · [(h)] = g · (id + id × (|g|)) · (id + id × [(h)]) · h
≡    { +-functor (2.42) }
  (|g|) · [(h)] = g · ((id · id) + (id × (|g|)) · (id × [(h)])) · h
≡    { identity and ×-functor (2.28) }
  (|g|) · [(h)] = g · (id + id × ((|g|) · [(h)])) · h


This calculation shows how to define (|g|) · [(h)] : C → B in one go, that is to say, doing without any intermediate data-structure:

          g
  B <----------- 1 + IN0 × B        (3.36)
  ^                   ^
  (|g|)·[(h)]         | id + id × (|g|)·[(h)]
  |                   |
  C -----------> 1 + IN0 × C
          h

As an example, let us see what comes out of (|g|) · [(h)] for h and g respectively given by (3.29) and (3.30):

  (|g|) · [(h)] = g · (id + id × (|g|) · [(h)]) · h
≡    { (|g|) · [(h)] abbreviated to f and instantiating h and g }
  f = [1, mul] · (id + id × f) · (! + 〈id, pred〉) · (=0 ?)
≡    { +-functor (2.42) and identity }
  f = [1, mul] · (! + (id × f) · 〈id, pred〉) · (=0 ?)
≡    { ×-absorption (2.25) and identity }
  f = [1, mul] · (! + 〈id, f · pred〉) · (=0 ?)
≡    { +-absorption (2.41) and constant 1 (2.15) }
  f = [1, mul · 〈id, f · pred〉] · (=0 ?)
≡    { McCarthy conditional (2.59) }
  f = (=0 ?) → 1, mul · 〈id, f · pred〉

Going pointwise, we get — via (2.59) —

  f 0 = [1, mul · 〈id, f · pred〉] (i1 0)
      = { +-cancellation (2.38) }
        1 0
      = { constant function (2.12) }
        1

and

  f(n + 1) = [1, mul · 〈id, f · pred〉] (i2 (n + 1))
           = { +-cancellation (2.38) }


             mul · 〈id, f · pred〉 (n + 1)
           = { pointwise definitions of split, identity, predecessor and mul }
             (n + 1) × f n

In summary, f is but the well-known factorial function:

  f 0 = 1
  f(n + 1) = (n + 1) × f n

This result comes as no surprise if we look at diagram (3.35) for the particular g and h we have considered above and recall a popular "definition" of factorial:

  n! = n × (n − 1) × ... × 1        (n times)        (3.37)

In fact, [(h)] n produces T-list

  Cons(n, Cons(n − 1, ... Cons(1, Nil)))

as an intermediate data-structure which is consumed by (|g|), the effect of which is but the "replacement" of Cons by × and Nil by 1, therefore accomplishing (3.37) and realizing the computation of factorial.

The moral of this example is that a function as simple as factorial can be decomposed into two components (producer/consumer functions) which share a common intermediate inductive datatype. The producer function is an anamorphism which "represents" or produces a "view" of its input argument as a value of the intermediate datatype. The consumer function is a catamorphism which reduces this intermediate data-structure and produces the final result. Like factorial, many functions can be handsomely expressed by a (|g|) · [(h)] composition for a suitable choice of the intermediate type, and of g and h. The intermediate data-structure is said to be virtual in the sense that it only exists as a means to induce the associated pattern of recursion and disappears by calculation.

The composition (|g|) · [(h)] of a T-catamorphism with a T-anamorphism is called a T-hylomorphism⁴ and is denoted by [g,h]. Because g and h fully determine the behaviour of the [g,h] function, they can be regarded as the "genes" of the function they define.
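The factorial hylomorphism just calculated can be transcribed into HASKELL. The sketch below is ours, not the book's: it uses Either for +, Integer standing in for IN0, and ad-hoc names fF (the arrow part of the pattern 1 + IN0 × X) and hylo:

```haskell
-- F on arrows for F X = 1 + IN0 × X, i.e. F f = id + id × f
fF :: (a -> b) -> Either () (Integer, a) -> Either () (Integer, b)
fF f = either Left (\(n, x) -> Right (n, f x))

-- the hylomorphism recursion scheme of diagram (3.36): f = g · F f · h
hylo :: (Either () (Integer, b) -> b) -> (a -> Either () (Integer, a)) -> a -> b
hylo g h = g . fF (hylo g h) . h

-- g = [1, mul] as in (3.30); h = (! + <id, pred>) · (=0 ?) as in (3.29)
fact :: Integer -> Integer
fact = hylo (either (const 1) (uncurry (*)))
            (\n -> if n == 0 then Left () else Right (n, n - 1))
```

Running `fact 5` unfolds the virtual list 5,4,3,2,1 and folds it with multiplication, yielding 120.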
⁴ This terminology is derived from the Greek word υλος (hylos) meaning "matter".

As we shall see, this analogy with biology will prove specially useful for algorithm analysis and classification.

Exercise 3.2. A way of computing n², the square of a given natural number n, is to sum up the n first odd numbers. In fact, 1² = 1, 2² = 1 + 3, 3² = 1 + 3 + 5, etc., n² = (2n − 1) + (n − 1)². Following this hint, express function

  sq n def= n²        (3.38)


as a T-hylomorphism and encode it in HASKELL.
□

Exercise 3.3. Write function x^n as a T-hylomorphism and encode it in HASKELL.
□

Exercise 3.4. The following function in HASKELL computes the T-sequence of all even numbers less or equal to n:

  f n = if n


Although a generic theory of arbitrary datatypes requires a theoretical elaboration which cannot be explained at once, we can move a step further by taking the two observations above as starting points. We shall start from the latter in order to talk generically about inductive types. Then we introduce parameterization and functorial behaviour.

Suppose that, as a mere notational convention, we abbreviate every expression of the form "1 + IN0 × ..." occurring in the previous section by "F ...", e.g. 1 + IN0 × B by F B, e.g. 1 + IN0 × T by F T,

  T ≅ F T        (3.39)

(witnessed by in_T and out_T), etc. This is the same as introducing a datatype-level operator

  F X def= 1 + IN0 × X        (3.40)

which maps every datatype A into datatype 1 + IN0 × A. Operator F captures the pattern of recursion which is associated to so-called "right" lists (of non-negative integers), that is, lists which grow to the right. The slightly different pattern G X def= 1 + X × IN0 will generate a different, although related, inductive type

  X ≅ 1 + X × IN0        (3.41)

— that of so-called "left" lists (of non-negative integers). And it is not difficult to think of the pattern which merges both right and left lists and gives rise to bi-linear lists, better known as binary trees:

  X ≅ 1 + X × IN0 × X        (3.42)

One may think of many other expressions F X and guess the inductive datatype they generate, for instance H X def= IN0 + IN0 × X generating non-empty lists of non-negative integers (IN0⁺). The general rule is that, given an inductive datatype definition of the form

  X ≅ F X        (3.43)

(also called a domain equation), its pattern of recursion is captured by a so-called functor F.

3.7 Functors

The concept of a functor F, borrowed from category theory, is a most generic and useful device in programming.⁵
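The recursion patterns just listed can be transcribed into HASKELL declarations. The sketch below is ours; constructor names and the use of Int for IN0 are our own choices:

```haskell
-- "Right" lists, F X = 1 + IN0 × X (the pattern of T)
data R = RNil | RCons (Int, R)

-- "Left" lists, G X = 1 + X × IN0, cf. (3.41)
data L = LNil | LCons (L, Int)

-- Bi-linear lists (binary trees), X ≅ 1 + X × IN0 × X, cf. (3.42)
data BTree = Empty | Node (BTree, Int, BTree)

-- Non-empty lists, H X = IN0 + IN0 × X
data NE = One Int | More (Int, NE)
```

Each declaration is a solution of the corresponding domain equation, with the two (or more) summands of the pattern realized by the alternative constructors.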
⁵ The category theory practitioner must be warned of the fact that the word functor is used here in a too restrictive way. A proper (generic) definition of a functor will be provided later in this book.

As we have seen, F can be regarded as a datatype constructor


which, given datatype A, builds a more elaborate datatype F A; given another datatype B, builds a similarly elaborate datatype F B; and so on. But what is more important and has the most beneficial consequences is that, if F is regarded as a functor, then its data-structuring effect extends smoothly to functions in the following way: suppose that f : A → B is a function which observes A into B, where A and B are parameters of F A and F B, respectively. By definition, if F is a functor then F f : F A → F B exists for every such f:

  A          F A
  |            |
  f          F f
  v            v
  B          F B

F f extends f to F-structures and will, by definition, obey two very basic properties: it commutes with identity

  F id_A = id_(F A)        (3.44)

and with composition

  F (g · h) = (F g) · (F h)        (3.45)

Two simple examples of a functor follow:

• Identity functor: define F X = X, for every datatype X, and F f = f. Properties (3.44) and (3.45) hold trivially just by removing symbol F wherever it occurs.

• Constant functors: for a given C, define F X = C (for all datatypes X) and F f = id_C, as expressed in the following diagram:

  A          C
  |          |
  f        id_C
  v          v
  B          C

Properties (3.44) and (3.45) hold trivially again.

In the same way functions can be unary, binary, etc., we can have functors with more than one argument. So we get binary functors (also called bifunctors), ternary functors, etc. Of course, properties (3.44) and (3.45) have to hold for every parameter of an n-ary functor. For a binary functor B, for instance, equation (3.44) becomes

  B (id_A, id_B) = id_(B (A,B))        (3.46)


  Data construction | Universal construct | Functor | Description
  ------------------+---------------------+---------+------------
  A × B             | 〈f, g〉             | f × g   | Product
  A + B             | [f, g]              | f + g   | Coproduct
  B^A               | transpose of f      | f^A     | Exponential

  Table 3.1: Datatype constructions and associated operators.

and equation (3.45) becomes

  B (g · h, i · j) = B (g, i) · B (h, j)        (3.47)

Product and coproduct are typical examples of bifunctors. In the former case one has B (A,B) = A × B and B (f,g) = f × g — recall (2.22). Properties (2.29) and (2.28) instantiate (3.46) and (3.47), respectively, and this explains why we called them the functorial properties of product. In the latter case, one has B (A,B) = A + B and B (f,g) = f + g — recall (2.37) — and functorial properties (2.43) and (2.42). Finally, exponentiation is a functorial construction too: assuming A, one has F X def= X^A and F f def= the transpose of f · ap, and functorial properties (2.73) and (2.74). All this is summarized in table 3.1.

Just as functions, functors may compose with each other in the obvious way: the composition of F and G, denoted F · G, is defined by

  (F · G) X def= F (G X)        (3.48)
  (F · G) f def= F (G f)        (3.49)

3.8 Polynomial functors

We may put constant, product, coproduct and identity functors together to obtain so-called polynomial functors, which are described by polynomial expressions, for instance

  F X = 1 + A × X

— recall (3.6). A polynomial functor is either

• a constant functor or the identity functor, or
• the (finitary) product or coproduct (sum) of other polynomial functors, or
• the composition of other polynomial functors.
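The arrow-level actions of the product and coproduct bifunctors can be sketched in HASKELL, using Either for + as earlier in the text (function names prodB and coprodB are ours):

```haskell
-- B(f,g) = f × g : the product bifunctor acting on a pair of arrows
prodB :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
prodB f g (x, y) = (f x, g y)

-- B(f,g) = f + g : the coproduct bifunctor acting on a pair of arrows
coprodB :: (a -> c) -> (b -> d) -> Either a b -> Either c d
coprodB f g = either (Left . f) (Right . g)
```

Properties (3.46) and (3.47) then amount to the familiar facts that mapping identities over a pair (or an Either value) is the identity, and that mapping component-wise distributes over composition.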


So the effect on arrows of a polynomial functor is computed in an easy and structured way, for instance:

  F f = (1 + A × X) f
      = { sum of two functors, where A is a constant and X is a variable }
        (1) f + (A × X) f
      = { constant functor and product of two functors }
        id_1 + (A) f × (X) f
      = { constant functor and identity functor }
        id_1 + id_A × f
      = { subscripts dropped for simplicity }
        id + id × f

So, 1 + A × f denotes the same as id_1 + id_A × f, or even the same as id + id × f if one drops the subscripts.

It should be clear at this point that what was referred to in section 2.10 as a "symbolic pattern" applicable to both datatypes and arrows is after all a functor in the mathematical sense. The fact that the same polynomial expression is used to denote both the data and the operators which structurally transform such data is of great conceptual economy and practical application. For instance, once polynomial functor (3.40) is assumed, the diagrams in (3.31) can be written as simply as

        out_T                      in_T
  T ----------> F T        T <----------- F T        (3.50)
  |              |         ^                ^
  f            F f         f              F f
  v              v         |                |
  B <---------- F B        B ...........> F B
        g                          g

It is useful to know that, thanks to the isomorphism laws studied in chapter 2, every polynomial functor F may be put into the canonical form

  F X ≅ C0 + (C1 × X) + (C2 × X²) + · · · + (Cn × Xⁿ)
      = Σ_{i=0..n} Ci × Xⁱ        (3.51)

and that Newton's binomial formula

  (A + B)ⁿ ≅ Σ_{p=0..n} ⁿCp × A^(n−p) × B^p        (3.52)


can be used in such conversions. These are performed up to isomorphism, that is to say, after the conversion one gets a different but isomorphic datatype. Consider, for instance, functor

  F X def= A × (1 + X)²

(where A is a constant datatype) and check the following reasoning:

  F X = A × (1 + X)²
≅    { law (2.87) }
  A × ((1 + X) × (1 + X))
≅    { law (2.50) }
  A × ((1 + X) × 1 + (1 + X) × X)
≅    { laws (2.81), (2.31) and (2.50) }
  A × ((1 + X) + (1 × X + X × X))
≅    { laws (2.81) and (2.87) }
  A × ((1 + X) + (X + X²))
≅    { law (2.46) }
  A × (1 + (X + X) + X²)
≅    { canonical form obtained via laws (2.50) and (2.88) }
  A + (A × 2) × X + A × X²
  with C0 = A, C1 = A × 2 and C2 = A

Exercise 3.5. Synthesize the isomorphism ν : A + A × 2 × X + A × X² → A × (1 + X)² implicit in the above reasoning.
□

3.9 Polynomial inductive types

An inductive datatype is said to be polynomial wherever its pattern of recursion is described by a polynomial functor, that is to say, wherever F in equation (3.43) is polynomial. For instance, datatype T (3.20) is polynomial (n = 1) and its associated polynomial


functor is canonically defined with coefficients C0 = 1 and C1 = IN0. For reasons that will become apparent later on, we shall always impose C0 ≠ 0 to hold in a polynomial datatype expressed in canonical form.

Polynomial types are easy to encode in HASKELL wherever the associated functor is in canonical polynomial form, that is, wherever one has

  T ≅ Σ_{i=0..n} Ci × Tⁱ        (3.53)

Then we have

  in_T def= [f1, ..., fn]

where, for i = 1, ..., n, fi is an arrow of type Ci × Tⁱ → T. Since n is finite, one may expand exponentials according to (2.87) and encode this in HASKELL as follows:

  data T = C0
         | C1 (C1, T)
         | C2 (C2, (T, T))
         | ...
         | Cn (Cn, (T, ..., T))

Of course the choice of symbol Ci to realize each fi is arbitrary.⁶ Several instances of polynomial inductive types (in canonical form) will be mentioned in section 3.13. Section 3.17 will address the conversion between inductive datatypes induced by so-called natural transformations.

The concepts of catamorphism, anamorphism and hylomorphism introduced in section 3.5 can be extended to arbitrary polynomial types. We devote the following sections to explaining catamorphisms in the polynomial setting. Polynomial anamorphisms and hylomorphisms will not be dealt with until chapter 7.

⁶ A more traditional (but less close to (3.53)) encoding will be

  data T = C0 | C1 C1 T | C2 C2 T T | ... | Cn Cn T ... T        (3.54)

delivering every constructor in curried form.
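As a concrete instance of this encoding scheme (our own example, with names of our choosing), the binary-tree pattern X ≅ 1 + IN0 × X² of (3.42) has canonical form C0 + C2 × X² with C0 = 1 and C2 = IN0, and is encoded in the style of (3.53) as:

```haskell
-- Canonical-form encoding of X ≅ 1 + IN0 × X²: C0 = 1, C2 = IN0
data T2 = Leaf | Fork (Int, (T2, T2))

-- a simple observation defined by induction on the structure of T2
size :: T2 -> Int
size Leaf                = 0
size (Fork (_, (l, r)))  = 1 + size l + size r
```

Note how the single non-constant constructor Fork takes its C2 coefficient paired with the expanded exponential (T2, T2), exactly as prescribed above.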


In order to explain this, we need two notions which are easy to understand: first, that of an F-algebra, which simply is any function α of signature F A → A. A is called the carrier of F-algebra α and contains the values which α manipulates by computing new A-values out of existing ones, according to the F-pattern (the "type" of the algebra). As examples, consider [0, add] (3.19) and in_T (3.20), which are both algebras of type F X = 1 + IN0 × X. The type of an algebra clearly determines its form. For instance, any algebra α of type F X = 1 + X × X will be of form [α1, α2], where α1 is a constant and α2 is a binary operator. So monoids are algebras of this type.⁷

Secondly, we introduce the notion of an F-homomorphism, which is but a function observing a particular F-algebra α into another F-algebra β:

        α
  A <------- F A        f · α = β · (F f)        (3.55)
  |            |
  f          F f
  v            v
  B <------- F B
        β

Clearly, f can be regarded as a structural translation between A and B, that is, A and B have a similar structure.⁸ Note that — thanks to (3.44) — identity functions are always (trivial) F-homomorphisms and that — thanks to (3.45) — these homomorphisms compose, that is, the composition of two F-homomorphisms is an F-homomorphism.

3.11 F-catamorphisms

An F-algebra can be epic, monic or both, that is, iso. Iso F-algebras are particularly relevant to our discussion because they describe solutions to the X ≅ F X equation (3.43). Moreover, for polynomial F a particular iso F-algebra always exists, which is denoted by in : F μF → μF and has special properties. First, its carrier is the smallest among the carriers of other iso F-algebras, and this is why it is denoted by μF — μ for "minimal".⁹ Second, it is the so-called initial F-algebra.
What does this mean?

It means that, for every F-algebra α there exists one and only one F-homomorphism between in and α. This unique arrow mediating in and α is therefore determined by α itself, and is called the F-catamorphism generated by α. This construct, which was introduced in 3.5, is in general denoted by (|α|)_F:

⁷ But not every algebra of this type is a monoid, since the type of an algebra only fixes its syntax and does not impose any properties such as associativity, etc.
⁸ Cf. homomorphism = homo (the same) + morphos (structure, shape).
⁹ μF means the least fixpoint solution of equation X ≅ F X, as will be described in chapter 7.


         in
  μF <-------- F μF        (3.56)
  |               |
  f=(|α|)_F       | F (|α|)_F
  v               v
  A <--------- F A
         α

We will drop the F subscript in (|α|)_F wherever deducible from the context, and often call α the "gene" of (|α|)_F.

As happens with splits, eithers and transposes, the uniqueness of the catamorphism construct is captured by a universal property established in the class of all F-homomorphisms:

  k = (|α|) ⇔ k · in = α · F k        (3.57)

According to the experience gathered from section 2.12 onwards, a few properties can be expected as consequences of (3.57). For instance, one may wonder about the "gene" of the identity catamorphism. Just let k = id in (3.57) and see what happens:

  id = (|α|) ⇔ id · in = α · F id
≡    { identity (2.10) and F is a functor (3.44) }
  id = (|α|) ⇔ in = α · id
≡    { identity (2.10) once again }
  id = (|α|) ⇔ in = α
≡    { α replaced by in and simplifying }
  id = (|in|)

Thus one finds out that the genetic material of the identity catamorphism is the initial algebra in. Which is the same as establishing the reflection property of catamorphisms:

Cata-reflection:

         in
  μF <-------- F μF        (|in|) = id_μF        (3.58)
  |               |
  (|in|)          | F (|in|)
  v               v
  μF <-------- F μF
         in

In a more intuitive way, one might have observed that (|in|) is, by definition of in, the unique arrow mediating μF and itself. But another arrow of the same type is already known: the identity id_μF. So these two arrows must be the same.

Another property following immediately from (3.57), for k = (|α|), is


Cata-cancellation:

  (|α|) · in = α · F (|α|)        (3.59)

Because in is iso, this law can be rephrased as follows,

  (|α|) = α · F (|α|) · out        (3.60)

where out : μF → F μF denotes the inverse of in, witnessing isomorphism μF ≅ F μF.

Now, let f be F-homomorphism (3.55) between F-algebras α and β. How does it relate to (|α|) and (|β|)? Note that f · (|α|) is an arrow mediating μF and B. But B is the carrier of β and (|β|) is the unique arrow mediating μF and B. So the two arrows are the same:

Cata-fusion:

         in
  μF <-------- F μF        f · (|α|) = (|β|)  if  f · α = β · F f        (3.61)
  |               |
  (|α|)           | F (|α|)
  v               v
  A <--------- F A
  |      α        |
  f             F f
  v               v
  B <--------- F B
         β

Of course, this law is also a consequence of the universal property, for k = f · (|α|):

  f · (|α|) = (|β|)
⇔ (f · (|α|)) · in = β · F (f · (|α|))
⇔    { composition is associative and F is a functor (3.45) }
  f · ((|α|) · in) = β · (F f) · (F (|α|))
⇔    { cata-cancellation (3.59) }
  f · α · F (|α|) = β · F f · F (|α|)
⇔    { require f to be an F-homomorphism (3.55) }
  f · α · F (|α|) = f · α · F (|α|) ∧ f · α = β · F f
⇐    { simplify }
  f · α = β · F f

The presentation of the absorption property of catamorphisms entails the very important issue of parameterization and deserves to be treated in a separate section, as follows.
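For the list pattern F X = 1 + IN0 × X, the initial algebra and the catamorphism combinator can be sketched in HASKELL as follows. This is our own transcription: Either realizes +, Int stands in for IN0, and the names Mu, In, fF, cata and sumMu are ours:

```haskell
-- μF for F X = 1 + Int × X, with the initial algebra as constructor In
newtype Mu = In (Either () (Int, Mu))

-- F on arrows: F f = id + id × f
fF :: (a -> b) -> Either () (Int, a) -> Either () (Int, b)
fF f = either Left (\(n, x) -> Right (n, f x))

-- (|α|) = α · F (|α|) · out, cf. cata-cancellation in the form (3.60)
cata :: (Either () (Int, b) -> b) -> Mu -> b
cata alpha (In x) = alpha (fF (cata alpha) x)

-- example: the "add up a list" catamorphism of section 3.3, gene [0, add]
sumMu :: Mu -> Int
sumMu = cata (either (const 0) (uncurry (+)))
```

Pattern-matching on In plays the role of out; the universal property (3.57) then says that cata alpha is the only function satisfying this equation.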


3.12 Parameterization and type functors

By analogy with what we have done about splits (product), eithers (coproduct) and transposes (exponential), we now look forward to identifying F-catamorphisms which exhibit functorial behaviour.

Suppose that one wishes to square all numbers which are members of lists of type T (3.20). It can be checked that

  (|[Nil, Cons · (sq × id)]|)        (3.62)

will do this for us, where sq : IN0 → IN0 is given by (3.38). This catamorphism, which converted to pointwise notation is nothing but function h which follows

  h Nil = Nil
  h (Cons(a,l)) = Cons(sq a, h l)

maps type T to itself. This is because sq maps IN0 to IN0. Now suppose that, instead of sq, one would like to apply a given function f : IN0 → B (for some B other than IN0) to all elements of the argument list. It is easy to see that it suffices to substitute f for sq in (3.62). However, the output type is no longer T, but rather type T′ ≅ 1 + B × T′.

Types T and T′ are very close to each other. They share the same "shape" (recursive pattern) and only differ with respect to the type of elements — IN0 in T and B in T′. This suggests that these two types can be regarded as instances of a more generic list datatype List

  List X ≅ 1 + X × List X        (3.63)
  (with in = [Nil, Cons])

in which the type of elements X is allowed to vary. Thus one has T = List IN0 and T′ = List B.

By inspection, it can be checked that, for every f : A → B,

  (|[Nil, Cons · (f × id)]|)        (3.64)

maps List A to List B. Moreover, for f = id one has:

  (|[Nil, Cons · (id × id)]|)
=    { by the ×-functor-id property (2.29) and identity }
  (|[Nil, Cons]|)
=    { cata-reflection (3.58) }
  id


3.12. PARAMETERIZATION AND TYPE FUNCTORS 87

Therefore, by defining

List f def= (|[ Nil, Cons · (f × id) ]|)

what we have just seen can be written thus:

List id_A = id_(List A)

This is nothing but law (3.44) for F replaced by List. Moreover, it will not be too difficult to check that

List (g · f) = List g · List f

also holds — cf. (3.45). Altogether, this means that List can be regarded as a functor.

In programming terminology one says that List X (the “lists of Xs datatype”) is parametric and that, by instantiating parameter X, one gets ground lists such as lists of integers, booleans, etc. The illustration above deepens one’s understanding of parameterization by identifying the functorial behaviour of the parametric datatype along with its parameter instantiations.

All this can be broadly generalized and leads to what is commonly known as a type functor. First of all, it should be clear that the generic format

T ∼= F T

adopted so far for the definition of an inductive type is not sufficiently detailed because it does not provide a parametric view of T. For simplicity, let us suppose (for the moment) that only one parameter is identified in T. Then we may factor this out via type variable X and write (overloading symbol T)

T X ∼= B(X, T X)

where B is called the type’s base functor. Binary functor B(X,Y) is given this name because it is the basis of the whole inductive type definition. By instantiation of X one obtains F. In the example above, B(X,Y) = 1 + X × Y and in fact F Y = B(IN₀, Y) = 1 + IN₀ × Y, recall (3.40). Moreover, one has

F f = B(id, f)    (3.65)

and so every F-homomorphism can be written in terms of the base-functor of F, e.g.

f · α = β · B(id, f)

instead of (3.55).


88 CHAPTER 3. RECURSION IN THE POINTFREE STYLETX will be referred to as the type functor generated by B:TX }{{}type functor∼= B(X,TX)} {{ }base functorWe proceed to the description <strong>of</strong> its functorial behaviour — Tf — for a given B A .As far as typing rules are concerned, we shall havefBTBfT fATASo we should be able to express Tf as a B (A, )-catamorphism (|g|):ATAin T AB (A,TA)fT f=(|g|)B (id,T f)B TB B (A,TB)gAs we know that in TB is the standard constructor <strong>of</strong> values <strong>of</strong> type TB, we may put itinto the diagram too:ATAin T AB (A,TA)fT f=(|g|)B (id,T f)B TB gB (A,TB)in T BB (B,TB)The catamorphism’s gene g will be synthesized by filling the dashed arrow in the diagramwith the “obvious” B (f,id), whereby one getsTfdef= (|in TB · B (f,id)|) (3.66)and a final diagram, where in T A is abbreviated by in A (ibid. in T B by in B ):ATAin AB (A,TA)fT f=(|in B·B (f,id)|)B TB B (B,TB)in BB (f,id)B (id,T f)B (A,TB)
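Definition (3.66) can be sketched in Haskell for the list base functor B(X,Y) = 1 + X × Y, encoding 1 + X × Y as Either () (x, y). The names below (inList, cataList, bmapB, tmap) are ours, not the text’s:

```haskell
-- T f = (| in_TB . B(f,id) |), instantiated to lists (3.66).
data List a = Nil | Cons (a, List a) deriving (Eq, Show)

-- in = [Nil, Cons], the initial algebra of (3.63)
inList :: Either () (a, List a) -> List a
inList = either (const Nil) Cons

-- (|g|): the unique arrow out of the initial algebra
cataList :: (Either () (a, b) -> b) -> List a -> b
cataList g Nil           = g (Left ())
cataList g (Cons (a, l)) = g (Right (a, cataList g l))

-- B(f,g) acting on 1 + X x Y
bmapB :: (a -> c) -> (b -> d) -> Either () (a, b) -> Either () (c, d)
bmapB f g = either Left (\(a, b) -> Right (f a, g b))

-- the type functor T f of (3.66)
tmap :: (a -> b) -> List a -> List b
tmap f = cataList (inList . bmapB f id)
```

For instance, tmap sq squares every element of a list, recovering (3.62), and tmap id is the identity, as calculated above.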


3.12. PARAMETERIZATION AND TYPE FUNCTORS 89Next, we proceed to derive the useful law <strong>of</strong> cata-absorption(|g|) · Tf = (|g · B (f,id)|) (3.67)as a consequence <strong>of</strong> the laws studied in section 3.11. Our target is to show that, fork = (|g|) · Tf in (3.57), one gets α = g · B (f,id):(|g|) · Tf = (|α|)⇔ { type-functor definition (3.66) }(|g|) · (|in B · B (f,id)|) = (|α|)⇐ { cata-fusion (3.61) }(|g|) · in B · B (f,id) = α · B (id,(|g|))⇔ { cata-cancellation (3.59) }g · B (id,(|g|)) · B (f,id) = α · B (id,(|g|))⇔ { B is a bi-functor (3.47) }g · B (id · f,(|g|) · id) = α · B (id,(|g|))⇔ { id is natural (2.11) }g · B (f · id,id · (|g|)) = α · B (id,(|g|))⇔ { (3.47) again, this time from left to right }g · B (f,id) · B (id,(|g|)) = α · B (id,(|g|))⇐ { obvious }g · B (f,id) = αThe following diagram pictures this property <strong>of</strong> catamorphisms:ATAin AB (A,TA)fCT f(|g|)TCDB (C,TC)in CgB (C,D)B (f,id)B (id,(|g|))B (f,id)B(id,T f)B (A,TC)B (A,D)B(id,(|g|))It remains to show that (3.66) indeed defines a functor. This can be verified by checkingproperties (3.44) and (3.45) for F = T :


90 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

• Property type-functor-id, cf. (3.44):

T id
= { by definition (3.66) }
(|in_B · B(id, id)|)
= { B is a bi-functor (3.46) }
(|in_B · id|)
= { identity and cata-reflection (3.58) }
id

• Property type-functor, cf. (3.45):

T(f · g)
= { by definition (3.66) }
(|in_B · B(f · g, id)|)
= { id · id = id and B is a bi-functor (3.47) }
(|in_B · B(f, id) · B(g, id)|)
= { cata-absorption (3.67) }
(|in_B · B(f, id)|) · T g
= { again cata-absorption (3.67) }
(|in_B|) · T f · T g
= { cata-reflection (3.58) followed by identity }
T f · T g

3.13 A catalogue of standard polynomial inductive types

The following table contains a collection of standard polynomial inductive types and associated base type bi-functors, which are in canonical form (3.53). The table contains two extra columns which may be used as bookmarks for equations (3.65) and (3.66), respectively¹⁰:

3.13. A CATALOGUE OF STANDARD POLYNOMIAL INDUCTIVE TYPES 91

Description      | T X      | B(X,Y)      | B(id,f)       | B(f,id)
“Right” Lists    | List X   | 1 + X × Y   | id + id × f   | id + f × id
“Left” Lists     | LList X  | 1 + Y × X   | id + f × id   | id + id × f
Non-empty Lists  | NList X  | X + X × Y   | id + id × f   | f + f × id
Binary Trees     | BTree X  | 1 + X × Y²  | id + id × f²  | id + f × id
“Leaf” Trees     | LTree X  | X + Y²      | id + f²       | f + id
    (3.68)

All type functors T in this table are unary. In general, one may think of inductive datatypes which exhibit more than one type parameter. Should n parameters be identified in T, then this will be based on an (n+1)-ary base functor B, cf.

T(X₁, ..., Xₙ) ∼= B(X₁, ..., Xₙ, T(X₁, ..., Xₙ))

So, every (n+1)-ary polynomial functor B(X₁, ..., Xₙ, Xₙ₊₁) can be identified as the basis of an inductive n-ary type functor (the convention is to stick to the canonical form and reserve the last variable Xₙ₊₁ for the “recursive call”). While type bi-functors (n = 2) are often found in programming, the situation in which n > 2 is relatively rare. For instance, the combination of leaf-trees with binary-trees in (3.68) leads to the so-called “full tree” type bi-functor

Description    | T(X₁,X₂)      | B(X₁,X₂,Y)    | B(id,id,f)    | B(f,g,id)
“Full” Trees   | FTree(X₁,X₂)  | X₁ + X₂ × Y²  | id + id × f²  | f + g × id
    (3.69)

As we shall see later on, these types are widely used in programming. In the actual encoding of these types in HASKELL, exponentials are normally expanded to products according to (2.87), see for instance

data BTree a = Empty | Node (a, (BTree a, BTree a))

Moreover, one may choose to curry the type constructors as in, e.g.

data BTree a = Empty | Node a (BTree a) (BTree a)

Exercise 3.6. Write as catamorphisms

• the function which counts the number of elements of a non-empty list (type NList in (3.68)).

¹⁰ Since (id_A)² = id_(A²), one writes id² for id in this table.


92 CHAPTER 3. RECURSION IN THE POINTFREE STYLE• the function which computes the maximum element <strong>of</strong> a binary-tree <strong>of</strong> natural numbers.✷Exercise 3.7. Characterize the function which is defined by (|[ [],h ]|) for each <strong>of</strong> thefollowing definitions <strong>of</strong> h:h(x,(y 1 ,y 2 )) = y 1 + [x] + y 2 (3.70)h = + · (singl × +) (3.71)h = + · ( + × singl) · swap (3.72)assuming singl a = [a]. Identify in (3.68) which datatypes are involved as base functors.✷Exercise 3.8. Write as a catamorphism the function which computes the frontier <strong>of</strong> a tree<strong>of</strong> type LTree (3.68), listed from left to right.✷3.14 Functors and type functors in HASKELLThe concept <strong>of</strong> a (unary) functor is provided in HASKELL in the form <strong>of</strong> a particular classexporting the fmap operator:class Functor f wherefmap :: (a -> b) -> (f a -> f b)So fmap g encodes F g once we declare F as an instance <strong>of</strong> class Functor. The mostpopular use <strong>of</strong> fmap has to do with HASKELL lists, as allowed by declarationinstance Functor [] wherefmap f [] = []fmap f (x:xs) = f x : fmap f xs


3.15. THE MUTUAL-RECURSION LAW 93in language’s Standard Prelude.In order to encode the type functors we have seen so far we have to do the sameconcerning their declaration. For instance, should we writeinstance Functor BTreewhere fmap f =cataBTree ( inBTree . (id -|- (f >< id)) )concerning the binary-tree datatype <strong>of</strong> (3.68) and assuming appropriate declarations <strong>of</strong>cataBTree and inBTree, then fmap is overloaded and used across such binary-trees.Bi-functors can be added easily by writingclass BiFunctor f wherebmap :: (a -> b) -> (c -> d) -> (f a c -> f b d)Exercise 3.9. Declare all datatypes in (3.68) in HASKELL notation and turn them intoHASKELL type functors, that is, define fmap in each case.✷Exercise 3.10. Declare datatype (3.69) in HASKELL notation and turn it into an instance<strong>of</strong> class BiFunctor.✷3.15 The mutual-recursion lawThe theory developed so far for building (and reasoning about) recursive functions doesn’tcope with mutual recursion. As a matter <strong>of</strong> fact, the pattern <strong>of</strong> recursion <strong>of</strong> a givencata(ana,hylo)morphism involves only the recursive function being defined, even thoughmore than once, in general, as dictated by the relevant base functor.It turns out that rules for handling mutual recursion are surprisingly simple to calculate.As motivation, recall section 2.10 where, by mixing products with coproducts, weobtained a result — the exchange rule (2.47) — which stemmed from putting together thetwo universal properties <strong>of</strong> product and coproduct, (2.55) and (2.57), respectively.


94 CHAPTER 3. RECURSION IN THE POINTFREE STYLEThe question we want to address in this section is <strong>of</strong> the same brand: what can onetell about catamorphisms which output pairs <strong>of</strong> values? By (2.55), such catamorphismsare bound to be splits, as are the corresponding genes:TinF T(|〈h,k〉|)F(|〈h,k〉|)A × B F (A × B)〈h,k〉As we did for the exchange rule, we put (2.55) and the universal property <strong>of</strong> catamorphisms(3.57) against each other and calculate:〈f,g〉 = (|〈h,k〉|)≡ { cata-universal (3.57) }〈f,g〉 · in = 〈h,k〉 · F 〈f,g〉≡ { ×-fusion (2.24) twice }〈f · in,g · in〉 = 〈h · F 〈f,g〉,k · F 〈f,g〉〉≡ { (2.56) }f · in = h · F 〈f,g〉 ∧ g · in = k · F 〈f,g〉The rule thus obtained,{f · in = h · F 〈f,g〉g · in = k · F 〈f,g〉≡ 〈f,g〉 = (|〈h,k〉|) (3.73)is referred to as the mutual recursion law (or as “Fokkinga’s law”) and is useful in combiningtwo mutually recursive functions f and gTinF TTinF TfF 〈f,g〉A F (A × B)hgF 〈f,g〉B F (A × B)kinto a single catamorphism.When applied from left to right, law (3.73) is surprisingly useful in optimizing recursivefunctions in a way which saves redundant traversals <strong>of</strong> the input inductive type T.Let us take the Fibonacci function as example:fib 0 = 1fib 1 = 1fib(n + 2) = fib(n + 1) + fib n


3.15. THE MUTUAL-RECURSION LAW 95

It can be shown that fib is a hylomorphism of type LTree (3.68), fib = [[count, fibd]], for count = [1, add], add(x,y) = x + y and fibd n = if n < 2 then i₁ Nil else i₂(n − 1, n − 2). This hylo-factorization of fib tells its internal algorithmic structure: the divide step [(fibd)] builds a tree whose number of leaves is a Fibonacci number; the conquer step (|count|) just counts such leaves.

There is, of course, much re-calculation in this hylomorphism. Can we improve its performance? The clue is to regard the two instances of fib in the recursive branch as mutually recursive over the natural numbers. This clue is suggested not only by fib having two base cases (so, perhaps it hides two functions) but also by the lookahead n + 2 in the recursive clause.

We start by defining a function which reduces such a lookahead by 1,

f n = fib(n + 1)

Clearly, f(n + 1) = fib(n + 2) = f n + fib n and f 0 = fib 1 = 1. Putting f and fib together,

f 0 = 1
f(n + 1) = f n + fib n
fib 0 = 1
fib(n + 1) = f n

we obtain two mutually recursive functions over the natural numbers (IN₀) which transform into pointfree equalities

f · [0, suc] = [1, add · 〈f, fib〉]
fib · [0, suc] = [1, f]

over

IN₀ ∼= 1 + IN₀,  in = [0, suc]    (3.74)

Reverse +-absorption (2.41) will further enable us to rewrite the above into

f · in = [1, add] · F 〈f, fib〉
fib · in = [1, π₁] · F 〈f, fib〉

thus making functor F f = id + f explicit and preparing for mutual recursion removal:

f · in = [1, add] · F 〈f, fib〉
fib · in = [1, π₁] · F 〈f, fib〉


96 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

≡ { (3.73) }
〈f, fib〉 = (|〈[1, add], [1, π₁]〉|)
≡ { exchange law (2.47) }
〈f, fib〉 = (|[ 〈1, 1〉, 〈add, π₁〉 ]|)
≡ { going pointwise and denoting 〈f, fib〉 by fib′ }
fib′ 0 = (1, 1)
fib′(n + 1) = (x + y, x) where (x, y) = fib′ n

Since fib = π₂ · fib′ we easily recover fib from fib′ and obtain the intended linear version of Fibonacci (encoded in Haskell):

fib n = y where (x,y) = fib' n
fib' 0 = (1,1)
fib' (n+1) = (x+y,x)
  where (x,y) = fib' n

This version of fib is actually the semantics of the “for-loop” one would write in an imperative language, which would initialize two global variables x,y := 1,1, loop over assignment x,y := x + y,x and yield the result in y. In the C programming language, one would write

int fib(int n)
{
  int x=1; int y=1; int i;
  for (i=1;i<=n;i++) { int a=x; x=x+y; y=a; }
  return y;
}
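The calculated fib′ can be checked against the original doubly recursive definition; the sketch below (ours, not the text’s) transcribes both and recovers fib by the second projection:

```haskell
-- The naive Fibonacci, as specified at the start of this section.
fibNaive :: Integer -> Integer
fibNaive 0 = 1
fibNaive 1 = 1
fibNaive n = fibNaive (n - 1) + fibNaive (n - 2)

-- The linear version obtained by the mutual-recursion law:
-- fib' n = (f n, fib n), so fib = snd . fib'.
fib' :: Integer -> (Integer, Integer)
fib' 0 = (1, 1)
fib' n = let (x, y) = fib' (n - 1) in (x + y, x)

fibLinear :: Integer -> Integer
fibLinear = snd . fib'
```

Both agree on all inputs, but fibLinear runs in time linear in n.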


3.15. THE MUTUAL-RECURSION LAW 97

The answer is the sum of the first n squares, given by function

ns n def= ∑ i²  (i = 1, ..., n)

that is,

ns 0 = 0
ns(n + 1) = (n + 1)² + ns n

in Haskell. However, this hylomorphism is inefficient because each iteration involves another hylomorphism computing square numbers.

One way of improving ns is to introduce function bnm n def= (n + 1)² and express this over (3.74),

bnm 0 = 1
bnm(n + 1) = 2n + 3 + bnm n

hoping to blend ns with bnm using the mutual recursion law. However, the same problem arises in bnm itself, which now depends on term 2n + 3. We invent lin n def= 2n + 3 and repeat the process, thus obtaining:

lin 0 = 3
lin(n + 1) = 2 + lin n
bnm′ 0 = 1
bnm′(n + 1) = lin n + bnm′ n

By redefining

ns′ 0 = 0
ns′(n + 1) = bnm′ n + ns′ n

we obtain three functions — ns′, bnm′ and lin — mutually recursive over the polynomial base F g = id + g of the natural numbers.

Exercise 3.11 below shows how to extend (3.73) to three mutually recursive functions (3.75). (From this it is easy to extend it further to the n-ary case.) It is routine work to show that, by application of (3.75) to the above three functions, one obtains the linear version of ns which follows:


98 CHAPTER 3. RECURSION IN THE POINTFREE STYLE

ns'' n = let (a,b,c) = aux n in a
  where
    aux 0 = (0,1,3)
    aux(n+1) = let (a,b,c) = aux n
               in (a+b,b+c,2+c)

In retrospect, note that (in general) not every system of n mutually recursive functions

f₁ = φ₁(f₁, ..., fₙ)
...
fₙ = φₙ(f₁, ..., fₙ)

involving n functions and n functional combinators φ₁, ..., φₙ can be handled by a suitably extended version of (3.73). This only happens if all fᵢ have the same “shape”, that is, if they share the same base functor F.

Exercise 3.11. Show that law (3.73) generalizes to more than two mutually recursive functions, in this case three:

f · in = h · F 〈f, 〈g, j〉〉
g · in = k · F 〈f, 〈g, j〉〉    ≡    〈f, 〈g, j〉〉 = (|〈h, 〈k, l〉〉|)    (3.75)
j · in = l · F 〈f, 〈g, j〉〉

✷

Exercise 3.12. The exponential function eˣ : IR → IR (where “e” denotes Euler’s number) can be defined in several ways, one being the calculation of its Taylor series:

eˣ = ∑ xⁿ / n!  (n = 0, 1, ...)    (3.76)

The following function, in Haskell,

exp :: Double -> Integer -> Double
exp x 0 = 1
exp x (n+1) = x^(n+1) / fac (n+1) + (exp x n)

computes an approximation of eˣ, where the second parameter tells how many terms to compute. For instance, while exp 1 1 = 2.0, exp 1 10 yields 2.7182818011463845.


3.15. THE MUTUAL-RECURSION LAW 99

Function exp x n performs badly for larger and larger n: while exp 1 100 runs instantaneously, exp 1 1000 takes around 9 seconds, exp 1 2000 takes circa 33 seconds, and so on.

Decompose exp into mutually recursive functions so as to apply (3.75) and obtain the following linear version:

exp x n = let (e,b,c) = aux x n
          in e where
  aux x 0 = (1,2,x)
  aux x (n+1) = let (e,s,h) = aux x n
                in (e+h,s+1,(x/s)*h)

✷

Exercise 3.13. From the following basic properties of addition and multiplication,

a ∗ 0 = 0    (3.77)
a ∗ 1 = a    (3.78)
a ∗ (b + c) = a ∗ b + a ∗ c    (3.79)

show that a ∗ n is the “for-loop” (a+)ⁿ 0.

✷

Exercise 3.14. Show that, for all n ∈ IN₀, n = sucⁿ 0. Hint: use cata-reflection (3.58).

✷

As example of application of (3.73) for T other than IN₀, consider the following recursive predicate which checks whether a (non-empty) list is ordered,

ord : A⁺ → 2
ord [a] = TRUE
ord (cons(a, l)) = a ≥ (listMax l) ∧ (ord l)


100 CHAPTER 3. RECURSION IN THE POINTFREE STYLEwhere ≥ is assumed to be a total order on datatype A andcomputes the greatest element <strong>of</strong> a given list <strong>of</strong> As:listMax = (|[ id,max ]|) (3.80)A +[ singl,cons ]A + A × A +listMaxA[ id,max ]A + A × Aid+id×listMax(In the diagram, singl a = [a].)Predicate ord is not a catamorphism because <strong>of</strong> the presence <strong>of</strong> listMax l in therecursive branch. However, the following diagram depicting ordA +[ singl,cons ]A + A × A +ord2 A + A × (A × 2)[ TRUE,α ]id+id×〈listMax,ord〉(where α(a,(m,b)) def = a ≥ m ∧ b) suggests the possibility <strong>of</strong> using the mutual recursionlaw. One only has to find a way <strong>of</strong> letting listMax depend also on ord, which isn’tdifficult: for any A + g B , one hasA +[ singl,cons ]A + A × A +listMaxid+id×〈listMax,g〉A A + A × (A × B)[ id,max·(id×π 1 ) ]where the extra presence <strong>of</strong> g is cancelled by projection π 1 .For B = 2 and g = ord we are in position to apply Fokkinga’s law and obtain:〈listMax,ord〉 = (|〈[ id,max · (id × π 1 ) ],[ TRUE,α ]〉|)= { exchange law (2.47) }(|[ 〈id, TRUE〉, 〈max · (id × π 1 ),α〉 ]|)Of course, ord = π 2 · 〈listMax,ord〉. By denoting the above synthesized catamorphismby aux, we end up with the following version <strong>of</strong> ord:ordl = let (a,b) = auxlin b


3.16. “BANANA-SPLIT”: A COROLLARY OF THE MUTUAL-RECURSION LAW 101

where

aux : A⁺ → A × 2
aux [a] = (a, TRUE)
aux (cons(a, l)) = let (m, b) = aux l
                   in (max(a, m), (a ≥ m ∧ b))

3.16 “Banana-split”: a corollary of the mutual-recursion law

Let h = i · F π₁ and k = j · F π₂ in (3.73). Then

f · in = (i · F π₁) · F 〈f, g〉
≡ { composition is associative and F is a functor }
f · in = i · F (π₁ · 〈f, g〉)
≡ { by ×-cancellation (2.20) }
f · in = i · F f
≡ { by cata-cancellation }
f = (|i|)

Similarly, from k = j · F π₂ we get

g = (|j|)

Then, from (3.73), we get

〈(|i|), (|j|)〉 = (|〈i · F π₁, j · F π₂〉|)

that is

〈(|i|), (|j|)〉 = (|(i × j) · 〈F π₁, F π₂〉|)    (3.81)

by (reverse) ×-absorption (2.25).

This law provides us with a very useful tool for “parallel loop” inter-combination: “loops” (|i|) and (|j|) are fused together into a single “loop” (|(i × j) · 〈F π₁, F π₂〉|). The need for this kind of calculation arises very often. Consider, for instance, the function which computes the average of a non-empty list of natural numbers,

average def= (/) · 〈sum, length〉    (3.82)


102 CHAPTER 3. RECURSION IN THE POINTFREE STYLEwhere sum and length are the expected IN + catamorphisms:sum = (|[ id,+ ]|)length = (|[ 1,succ · π 2 ]|)As defined by (3.82), function average performs two independent traversals <strong>of</strong> the argumentlist before division (/) takes place. Banana-split will fuse such two traversals into asingle one (see function aux below), thus leading to a function which will run ”twice asfast”:average l =x/ywhere(x,y) = aux laux[a] = (a,1)aux(cons(a,l)) = (a + x,y + 1)where (x,y) = aux l(3.83)Exercise 3.15. Calculate (3.83) from (3.82). Which <strong>of</strong> these two versions <strong>of</strong> the samefunction is easier to understand?✷3.17 Inductive datatype isomorphismnot yet available3.18 Bibliography notesIt is <strong>of</strong>ten the case that the expressive power <strong>of</strong> a particular programming language orparadigm is counter-productive in the sense that too much freedom is given to programmers.Sooner or later, these will end up writing unintelligible (authorship dependent) codewhich will become a burden to whom has to maintain it. Such has been the case <strong>of</strong> imperativeprogramming in the past (inc. assembly code), where the unrestricted use <strong>of</strong> gotoinstructions eventually gave place to if-then-else, while and repeat structuredprogramming constructs.A similar trend has been observed over the last decades at a higher programminglevel: arbitrary recursion and/or (side) effects have been considered harmful in functionalprogramming. Instead, programmers have been invited to structure their code around


3.18. BIBLIOGRAPHY NOTES 103

generic program devices such as e.g. fold/unfold combinators, which bring discipline to recursion. One witnesses progress in the sense that the loss of freedom is balanced by the increase of formal semantics and the availability of program calculi.

Such disciplined programming combinators have been extended from list-processing to other inductive structures thanks to one of the most significant advances in programming theory over the last decade: the so-called functorial approach to datatypes, which originated mainly from [MA86], was popularized by [Mal90] and reached textbook format in [BdM97]. A comfortable basis for exploiting polymorphism [Wad89], the “datatypes as functors” motto has proved beneficial at a higher level of abstraction, giving birth to polytypism [JJ96].

The literature on anas, catas and hylos is vast (see e.g. [MH95], [JJ98], [GHA01]) and it is part of a broader discipline which has become known as the mathematics of program construction [Bac04]. This chapter provides an introduction to such a discipline. Only the calculus of catamorphisms is presented. The corresponding theory of anamorphisms and hylomorphisms demands further mathematical machinery and will not be dealt with before chapters 5 and 7. The results on mutual recursion presented in this chapter were pioneered by Maarten Fokkinga [Fok92].




Chapter 4Why Monads MatterIn this chapter we present a powerful device in state-<strong>of</strong>-the-art programming, that <strong>of</strong> amonad. The monad concept is nowadays <strong>of</strong> primary importance in computing sciencebecause it makes it possible to describe computational effects as disparate as input/output,comprehension notation, state variable updating, context dependence, partial behaviouretc. in an elegant and uniform way.Our motivation to this concept will start from a well-known problem in functionalprogramming (and computing as a whole) — that <strong>of</strong> coping with undefined computations.4.1 Partial functionsConsider the IR to IR functiong x def = 1/xClearly, g is undefined for x = 0 because g 0 = 1/0 is so big a real number that it cannotbe properly evaluated. In fact, the HASKELL output for g 0 = 1/0 is just “panic”:Main> g 0Program error: {primDivDouble 1.0 0.0}Main>Functions such as g above are called partial functions because they cannot be appliedto all <strong>of</strong> their inputs (i.e., they diverge for some <strong>of</strong> their inputs). Partial functions arevery common in mathematics or programming — for other examples think <strong>of</strong> e.g. listprocessingfunctions head and tail.105
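The head and tail examples just mentioned can likewise be made total by extending their output type. In Haskell’s standard library the rôle of the Ext type introduced below is played by Maybe; the following sketch (names safeHead/safeTail are ours) anticipates that move:

```haskell
-- head and tail made total: the exceptional case becomes Nothing.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

safeTail :: [a] -> Maybe [a]
safeTail []       = Nothing
safeTail (_ : xs) = Just xs
```

No input can now make these functions “panic”: callers are forced to inspect the result.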


106 CHAPTER 4. WHY MONADS MATTERPanic is very dangerous in programming. In order to avoid this kind <strong>of</strong> behaviourone has two alternatives, either ensuring that every call to g x is protected — i.e., thecontexts which wrap up such calls ensure pre-condition x ≠ 0, or one raises exceptions,i.e. explicit error values. In the former case, mathematical pro<strong>of</strong>s need to be carried outin order to guarantee safety (that is, pre-condition compliance). The overall effect is torestrict the domain <strong>of</strong> the partial function. In the latter case one goes the other wayround, by extending the co-domain (vulg. range) <strong>of</strong> the function so that it accommodatesexceptional outputs. In this way one might define, in HASKELL:data ExtReal = Ok Real | Errorand then redefineg :: Real -> ExtRealg 0 = Errorg n = Ok 1/nIn general, one might define parametric typedata Ext a = Ok a | Errorin order to extend an arbitrary data type a with its (polymorphic) exception (or errorvalue). Clearly, one hasExt A ∼ = MaybeA ∼ = 1 + ASo, in abstract terms, one may regard as partial every function <strong>of</strong> signaturefor some A and B 1 .g1 + A B4.2 Putting partial functions togetherDo partial functions compose? Their types won’t match in general:g1 + B Af1 + C B1 In conventional programming, every function delivering a pointer as result — as in e.g. the Cprogramming language — can be regarded as one <strong>of</strong> these functions.


4.2. PUTTING PARTIAL FUNCTIONS TOGETHER 107Clearly, we have to extend f — which is itself a partial function — to some f ′ able toaccept arguments from 1 + B:1i 1... i 2f ′1 + B1 + C BfgAThe most “obvious” instance <strong>of</strong> the ellipsis (...) in the diagram above is i 1 and thiscorresponds to what is called strict composition: an exception produced by the producerfunction g is propagated to the output <strong>of</strong> the consumer function f:f • g def = [ i 1 ,f ] · g (4.1)Expressed in terms <strong>of</strong> Ext, composite function f • g works as follows:where(f • g)a = f ′ (g a)f ′ Error= Errorf ′ (Ok b) = f bAltogether, we have the following Haskell expression for the meaning <strong>of</strong> f • g:\a -> f’ (g a)where f’ Error = Errorf’ (Ok b) = f bNote that the adopted extension <strong>of</strong> f can be decomposed — by reverse +-absorption(2.41) — intoas displayed in diagramf ′ = [ i 1 ,id ] · (id + f)1 + (1 + C)id+f1 + BgA[ i 1 ,id ]f1 + C B
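Using the standard Maybe type in place of Ext, the strict composition scheme f • g = [i₁, f] · g can be transcribed as follows (kcomp and safeInv are our names, not the text’s):

```haskell
-- f • g: an exception raised by the producer g propagates;
-- otherwise the consumer f takes over, cf. (4.1).
kcomp :: (b -> Maybe c) -> (a -> Maybe b) -> a -> Maybe c
kcomp f g a = case g a of
  Nothing -> Nothing   -- i1: propagate the exception
  Just b  -> f b       -- [i1, f] applied to a normal value

-- The motivating partial function g x = 1/x, made total:
safeInv :: Double -> Maybe Double
safeInv 0 = Nothing
safeInv x = Just (1 / x)
```

For instance, kcomp safeInv safeInv inverts twice when possible and yields Nothing on input 0.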


108 CHAPTER 4. WHY MONADS MATTERAll in all, we have the following version <strong>of</strong> (4.1):f • g def = [ i 1 ,id ] · (id + f) · gDoes this functional composition scheme have a unit, that is, is there a u such thatf • u = f = u • f (4.2)holds? Clearly, if it exists, it must bear type 1 + A A . Let us solve (4.2) for u:uf • u = f = u • f≡ { substitution }[ i 1 ,f ] · u = f = [ i 1 ,u ] · f⇐ { let u = i 2 }[ i 1 ,f ] · i 2 = f = [ i 1 ,i 2 ] · f ∧ u = i 2≡ { by +-cancellation (2.38) and +-reflection (2.39) }f = f = id · f ∧ u = i 2⇐ { identity }u = i 24.3 ListsIn contrast to partial functions, which can produce no output, let us now consider functionswhich deliver too many outputs, for instance, lists <strong>of</strong> output values:B ⋆gAC ⋆fBFunctions f and g do not compose but once again one can think <strong>of</strong> extending the consumerfunction (f) by mapping it along the output <strong>of</strong> the producer function (g):(C ⋆ ) ⋆ f⋆B ⋆C ⋆ fB


4.4. MONADS 109To complete the process, one has to flatten the nested-sequence output in (C ⋆ ) ⋆ via the obviouslist-catamorphism C ⋆ concat (C ⋆ ) ⋆ , where concat def = (|[ [ ], + ]|). In summary:as captured in the following diagram:f • g def = concat · f ⋆ · g (4.3)(C ⋆ ) ⋆f ⋆B ⋆gAconcatC ⋆fBExercise 4.1.(4.3).✷Show that singl (recall exercise 3.7) is the unit u <strong>of</strong> • in the context <strong>of</strong>Exercise 4.2. Encode in HASKELL a pointwise version <strong>of</strong> (4.3). Hint: first apply (list)cata-absorption (3.67).✷4.4 MonadsBoth function composition schemes (4.1) and (4.3) above share the same polytypic pattern:the output <strong>of</strong> the producer function is “F-times” more elaborate than the input <strong>of</strong> theconsumer function, where F is some parametric datatype — F X = 1+X in case <strong>of</strong> (4.1),and F X = X ⋆ in case <strong>of</strong> (4.3). Then a composition scheme is devised for such functions,which is displayed inF(F C)F fF BgAµF C fB


110 CHAPTER 4. WHY MONADS MATTERand is given byµf • g def = µ · F f · g (4.4)where F A F 2 A is a suitable polymorphic function. Together with a unit functionF A uA and µ, datatype F will form a so-called monad type, <strong>of</strong> which 1 + and( ) ⋆ are the two examples seen above.Arrow µ · F f is called the extension <strong>of</strong> f. Functions µ and u are referred to as themonad’s multiplication and unit, respectively. The monadic composition scheme (4.4) iscalled Kleisli composition.fA monadic arrow FB A conveys the idea <strong>of</strong> a function which produces anoutput <strong>of</strong> “type” B “wrapped by F”, where datatype F describes some kind <strong>of</strong> (computational)“effect”. The monad’s unit FB B is a primitive monadic arrow whichuproduces (i.e. promotes, injects, wraps) data together with such an effect.The monad concept is nowadays <strong>of</strong> primary importance in computing science becauseit makes it possible to describe computational effects as disparate as input/output, statevariable updating, context dependence, partial behaviour (seen above) etc. in an elegantand uniform way. Moreover, the monad’s operators exhibit notable properties which makeit possible to reason about such computational effects.The remainder <strong>of</strong> this section is devoted to such properties. First <strong>of</strong> all, the propertiesimplicit in the following diagrams will be required for F to be regarded as a monad:Multiplication :F 2 AµF 3 Aµ · µ = µ · F µ (4.5)µF AµF µF 2 AUnit :F 2 AµF Auid µF AF uF 2 Aµ · u = µ · F u = id (4.6)Simple but beautiful symmetries apparent in these diagrams make it easy to memorizetheir laws and check them for particular cases. For instance, for the (1 + ) monad, law(4.6) will read as follows:[ i 1 ,id ] · i 2 = [ i 1 ,id ] · (id + i 2 ) = id


4.4. MONADS 111

These equalities are easy to check.

In laws (4.5) and (4.6), the different instances of µ and u are differently typed, as these are polymorphic and exhibit natural properties:

µ-natural:

F f · µ = µ · F² f    (4.7)

u-natural:

F f · u = u · f    (4.8)

The simplest of all monads is the identity monad F X def= X, which is such that µ = id, u = id and f • g = f · g. So — in a sense — one may think of all the functional discipline studied so far as a particular case of a wider discipline in which an arbitrary monad is present.

4.4.1 Properties involving (Kleisli) composition

The following properties arise from the definitions and monadic properties presented above:

f • (g • h) = (f • g) • h    (4.9)
u • f = f = f • u    (4.10)
(f • g) · h = f • (g · h)    (4.11)
(f · g) • h = f • (F g · h)    (4.12)
id • id = µ    (4.13)

Properties (4.9) and (4.10) are the monadic counterparts of, respectively, (2.8) and (2.10), meaning that monadic composition preserves the properties of normal functional composition. In fact, for the identity monad, these properties coincide with each other.

Above we have shown that property (4.10) holds for the list monad, recall (4.2). A general proof can be produced similarly. We select property (4.9) as an illustration of the


112 CHAPTER 4. WHY MONADS MATTERrôle <strong>of</strong> the monadic properties:f • (g • h)= { definition (4.4) twice }µ · F f · (µ · F g · h)= { µ is natural (4.7) }µ · µ · F(F f) · F g · h= { functor F }µ · µ · F(F f · g) · h= { definition (4.4) }µ · (F f · g) • h= { definition (4.4) }(f • g) • hExercise 4.3. Check the other laws above.✷4.5 Monadic application (binding)The monadic extension <strong>of</strong> functional application ap (2.67) is another operator ap ′ whichis intended to be “tolerant” in face <strong>of</strong> any F’ed argument x:(F B) A × F A ap′ Bap ′ (f,x) = f ′ x = (µ · F f)x(4.14)If in curry/flipped format, monadic application is called binding and denoted by symbol“>>=”, looking very much like postfix functional application,((F B) A ) F A >= F B (4.15)that is:x >>= fdef= (µ · F f)x (4.16)


This operator will exhibit properties arising from its definition and the basic monadic properties, e.g.

   x >>= u
≡    { definition (4.16) }
   (µ · F u) x
≡    { law (4.6) }
   id x
≡    { identity function }
   x

At pointwise level, one may chain monadic compositions from left to right, e.g.

   (((x >>= f₁) >>= f₂) >>= ... fₙ₋₁) >>= fₙ

for functions f₁ : A → F B₁, f₂ : B₁ → F B₂, ..., fₙ : Bₙ₋₁ → F Bₙ.

4.6 Sequencing and the do-notation

Given two monadic values x and y, it becomes possible to "sequence" them, thus obtaining another such value, by defining the following operator:

   x >> y def= x >>= y̲    (4.17)

where y̲ denotes the "everywhere y" constant function. For instance, within the finite-list monad, one has

   [1,2] >> [3,4] = (concat · ([3,4]̲)⋆)[1,2] = concat [[3,4],[3,4]] = [3,4,3,4]

Because this operator is associative (prove this as an exercise), one may iterate it to more than two arguments and write, for instance,

   x₁ >> x₂ >> ... >> xₙ

This leads to the popular do notation, which is another piece of (pointwise) notation which makes sense in a monadic context:

   do x₁; x₂; ...; xₙ  def=  x₁ >> do x₂; ...; xₙ

for n ≥ 1. For n = 1 one trivially has

   do x₁  def=  x₁
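The sequencing operator (4.17) for the list monad can be sketched and its associativity spot-checked as follows; seqL is our own name (the Prelude already reserves (>>)), and const y plays the role of the constant function:

```haskell
-- Sequencing (4.17) in the list monad: x >> y = x >>= const y,
-- i.e. map the "everywhere y" function over x and flatten.
seqL :: [a] -> [b] -> [b]
seqL x y = concat (map (const y) x)
```

For instance, `seqL [1,2] [3,4]` yields `[3,4,3,4]`, matching the worked example above.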


4.7 Generators and comprehensions

The do-notation accepts a variant in which the arguments of the >> operator are "generators" of the form

   a ← x    (4.18)

where, for a of type A, x is an inhabitant of monadic type F A. One may regard a ← x as meaning "let a be taken from x". Then the do-notation extends as follows:

   do a ← x₁; x₂; ...; xₙ  def=  x₁ >>= λa.(do x₂; ...; xₙ)    (4.19)

Of course, we should now allow for the xᵢ to range over terms involving variable a. For instance (again in the list monad), by writing

   do a ← [1,2,3]; [a²]    (4.20)

we mean

   [1,2,3] >>= λa.[a²]
=  concat((λa.[a²])⋆ [1,2,3])
=  concat [[1],[4],[9]]
=  [1,4,9]

The analogy with classical set-theoretic ZF-notation, whereby one might write {a² | a ∈ {1,2,3}} to describe the set of the first three perfect squares, calls for the following notation,

   [ a² | a ← [1,2,3] ]    (4.21)

as a "shorthand" of (4.20). This is an instance of the so-called comprehension notation, which can be defined in general as follows:

   [ e | a₁ ← x₁, ..., aₙ ← xₙ ] = do a₁ ← x₁; ...; aₙ ← xₙ; u(e)    (4.22)

Alternatively, comprehensions can be defined as follows, where p, q stand for arbitrary generators:

   [ t ] = u t    (4.23)
   [ f x | x ← l ] = (F f) l    (4.24)
   [ t | p, q ] = µ [ [ t | q ] | p ]    (4.25)

Note, however, that comprehensions are not restricted to lists or sets — they can be defined for any monad F.
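The example (4.20, 4.21) and the nesting rule (4.25) can be checked in Haskell, where list comprehensions are built in. The names below are ours; the primed versions spell out the monadic unfoldings:

```haskell
-- (4.21) and its unfolding (4.20): unit is the singleton list, F f = map f,
-- mu = concat.
squares, squares' :: [Integer]
squares  = [ a ^ 2 | a <- [1,2,3] ]
squares' = concat (map (\a -> [a ^ 2]) [1,2,3])

-- Two generators, following (4.25): [ t | p, q ] = mu [ [ t | q ] | p ].
pairs, pairs' :: [(Integer, Char)]
pairs  = [ (a, b) | a <- [1,2], b <- "xy" ]
pairs' = concat [ [ (a, b) | b <- "xy" ] | a <- [1,2] ]
```

Both pairs of definitions agree: squares is [1,4,9] and pairs enumerates the cartesian product in the expected order.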


4.8 Monads in HASKELL

In the Standard Prelude for HASKELL, one finds the following minimal definition of the Monad class,

   class Monad m where
     return :: a -> m a
     (>>=)  :: m a -> (a -> m b) -> m b

where return refers to the unit of m, on top of which the "sequence" operator

   (>>)  :: m a -> m b -> m b
   fail  :: String -> m a

is defined by

   p >> q = p >>= \ _ -> q

as expected. This class is instantiated for finite sequences ([]), Maybe and IO.

The µ multiplication operator is function join in module Monad.hs:

   join :: (Monad m) => m (m a) -> m a
   join x = x >>= id

This is easily justified:

   join x = x >>= id    (4.26)
=    { definition (4.16) }
   (µ · F id) x
=    { functors commute with identity (3.44) }
   (µ · id) x
=    { law (2.10) }
   µ x

In Mpi.hs we define (Kleisli) monadic composition in terms of the binding operator:

   (.!) :: Monad a => (b -> a c) -> (d -> a b) -> d -> a c
   (f .! g) a = (g a) >>= f
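The Kleisli combinator above can be exercised on the Maybe instance. Below it is restated with conventional type-variable names; safeHead and safePred are hypothetical partial functions of our own, introduced only for illustration:

```haskell
-- (Kleisli) monadic composition, as in Mpi.hs, defined via binding.
(.!) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
(f .! g) a = g a >>= f

-- Hypothetical examples: two partial functions made total via Maybe.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

safePred :: Integer -> Maybe Integer
safePred n = if n > 0 then Just (n - 1) else Nothing
```

Then `(safePred .! safeHead) [3,7]` yields `Just 2`, while both `(safePred .! safeHead) []` and `(safePred .! safeHead) [0]` yield `Nothing`: failure at either stage propagates, as the theory prescribes.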


4.8.1 Monadic I/O

IO, a parametric datatype whose inhabitants are special values called actions or commands, is a most relevant monad. Actions perform the interconnection between HASKELL and the environment (file system, operating system). For instance, getLine :: IO String is a particular action. Parameter String refers to the fact that this action "delivers" — or extracts — a string from the environment. This meaning is clearly conveyed by the type String assigned to symbol l in

   do l ← getLine; ... l ...

which is consistent with the typing rule for generators (4.18). Sequencing corresponds to the ";" syntax in most programming languages (e.g. C) and the do-notation is particularly intuitive in the IO-context.

Examples of functions delivering actions are

   readFile : FilePath → IO String
   putChar  : Char → IO ()

— both produce I/O commands as result.

As is to be expected, the implementation of the IO monad in HASKELL — available from the Standard Prelude — is not totally visible, for it is bound to deal with the intricacies of the underlying machine:

   instance Monad IO where
     (>>=)  = primbindIO
     return = primretIO

Rather interesting is the way IO is regarded as a functor:

   fmap f x = x >>= (return . f)

This goes the other way round, the monadic structure "helping" in defining the functor structure, everything consistent with the underlying theory:

   x >>= (u · f)
=    { definition (4.16) }
   (µ · IO(u · f)) x
=    { functors commute with composition }
   (µ · IO u · IO f) x
=    { law (4.6) for F = IO }
   (IO f) x
=    { definition of fmap }
   (fmap f) x
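The monadic definition of fmap just justified is not specific to IO; it works for every monad. A sketch, under our own name fmapM to avoid clashing with the Prelude, tested on monads whose values admit equality (which IO values do not):

```haskell
-- fmap obtained from the monadic structure, as in the IO instance above:
-- map f inside the monad by binding and re-injecting with return.
fmapM :: Monad m => (a -> b) -> m a -> m b
fmapM f x = x >>= (return . f)
```

For instance, `fmapM (+1) [1,2,3]` yields `[2,3,4]` and `fmapM (*2) (Just 5)` yields `Just 10`, agreeing with the Prelude's fmap in both cases.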


For enjoyable reading on monadic input/output in HASKELL see [Hud00], chapter 18.

Exercise 4.4. Use the do-notation and the comprehension notation to output the following truth-table, in HASKELL:

   p / q   False  True
   False   False  False
   True    False  True

✷

Exercise 4.5. Extend the Maybe monad to the following "error message" exception-handling datatype:

   data Error a = Err String | Ok a deriving Show

In case of several error messages issued in a do sequence, how many turn up on the screen? Which ones? ✷

4.9 The state monad

NB: this section is still very drafty.

The so-called state monad is a monad whose inhabitants are state-transitions encoding a particular brand of state-based automaton known as the Mealy machine. Given a set A (input alphabet), a set B (output alphabet) and a set of states S, a deterministic Mealy machine (DMM) is specified by a transition function of type

   δ : A × S → B × S    (4.27)

Wherever (b, s′) = δ(a, s), we say that the machine has transition s --a|b--> s′


and refer to s as the before state, and to s′ as the after state.

It is clear from (4.27) that δ can be expressed as the split of two functions f and g, δ = ⟨f, g⟩, where b = f(a, s) and s′ = g(a, s)    (4.28)

The information recorded in the state of a DMM is either meaningless to the user of the machine (as in e.g. the case of states represented by numbers) or too complex to be handled explicitly (as is the case of e.g. the data kept in a large database). So, it is convenient to abstract from it. Such an abstraction leads to the state monad in the following way: recalling (2.75), we simply curry δ,

   δ̄ : A → (B × S)^S    (4.29)

the codomain (B × S)^S being abbreviated to (St S) B, thus "shifting" the input state to the output. In this way, δ̄ a is a function capturing all state-transitions (and corresponding outputs) for input a. For instance, the function which appends a new element to the back of a queue,

   enq(a, s) def= s ++ [a]

can be converted into a DMM by adding to it a dummy output of type 1 and then transposing:

   enqueue : A → (1 × S)^S
   enqueue a def= ⟨!, (++ [a])⟩    (4.30)

Action enqueue performs enq on the state while acknowledging it by issuing an output of type 1.
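The curried transition type (4.29) and the enqueue example (4.30) transliterate directly into Haskell. In the sketch below (our own names), () plays the role of the singleton type 1 and (++ [a]) is the "append at the back" function:

```haskell
-- A curried state transition, as in (4.29): take the before state,
-- deliver an output paired with the after state.
type St s b = s -> (b, s)

-- Append at the back of a queue, uncurried as in the text.
enq :: (a, [a]) -> [a]
enq (a, s) = s ++ [a]

-- enqueue (4.30): perform enq on the state, with a dummy output of type ().
enqueue :: a -> St [a] ()
enqueue a = \s -> ((), enq (a, s))
```

For instance, `enqueue 3 [1,2]` yields `((), [1,2,3])`: the output is the dummy acknowledgement and the state has grown at the back.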


Unit and multiplication. Let us show that

   (St S) A ≅ (A × S)^S    (4.31)

forms a monad. As we shall see, the fact that the values of this monad are functions brings the theory of exponentiation to the forefront. (Thus a review of section 2.14 is recommended.) Notation f̂ will be used to abbreviate uncurry f, and ‾f to denote the transpose (curry) of f. Thus the following variant of universal law (2.67),

   k̂ = f ⇔ f = ap · (k × id)    (4.32)

whose cancellation

   k̂ = ap · (k × id)    (4.33)

is written pointwise as follows:

   k̂(c, a) = (k c) a    (4.34)

First of all, what is the functor behind (4.31)? Fixing the state space S, we obtain

   F X def= (X × S)^S    (4.35)

on objects and

   F f def= (f × id)^S    (4.36)

on functions, where ( )^S is the exponential functor (2.71).

The unit of this monad is the transpose of the simplest of all Mealy machines — the identity:

   u : A → (A × S)^S
   u = ‾id    (4.37)

Let us see what this means:

   u = ‾id
≡    { (2.67) }
   ap · (u × id) = id
≡    { introducing variables }
   ap(u a, s) = (a, s)
≡    { definition of ap }
   (u a) s = (a, s)
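Pointwise, the derivation just concluded says that the unit pairs its argument with the incoming state, leaving the state untouched. A one-line Haskell sketch (unitSt is our own name, kept distinct from the Prelude's return):

```haskell
-- A curried state transition, as in (4.29)/(4.31).
type St s a = s -> (a, s)

-- Unit (4.37), pointwise: (u a) s = (a, s).
unitSt :: a -> St s a
unitSt a = \s -> (a, s)
```

For instance, `unitSt 'x' 0` yields `('x', 0)`: output 'x', state unchanged.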


From the type of µ for this monad,

   µ : ((A × S)^S × S)^S → (A × S)^S

one figures out µ = x^S (recalling the exponential functor as defined by (2.71)) for x of type (A × S)^S × S → A × S. This, in its turn, is easily recognized as an instance of the polymorphic function ap (2.67), which is such that ‾ap = id, recall (2.69). Altogether, we define

   µ = ap^S    (4.38)

Let us check the meaning of µ by applying it to an action expressed as in diagram (2.75):

   µ⟨f, g⟩ = ap^S ⟨f, g⟩
≡    { (2.71) }
   µ⟨f, g⟩ = ap · ⟨f, g⟩
≡    { extensional equality (2.5) }
   µ⟨f, g⟩ s = ap(f s, g s)
≡    { definition of ap }
   µ⟨f, g⟩ s = (f s)(g s)

We find out that µ "unnests" the action inside f by applying it to the state delivered by g.

Checking the monadic laws. The calculation of (4.6) is made in two parts, checking µ · u = id first,

   µ · u
=    { definitions }
   ap^S · ‾id
=    { exponentials absorption (2.72) }
   ‾(ap · id)
=    { reflection (2.69) }
   id


and then checking µ · (F u) = id:

   µ · (F u)
=    { (4.38, 4.36) }
   ap^S · (‾id × id)^S
=    { functor }
   (ap · (‾id × id))^S
=    { cancellation (2.68) }
   id^S
=    { functor }
   id

The proof of (4.5) is also not difficult once supported by the laws of exponentials.

Kleisli composition. Let us calculate f • g for this monad:

   f • g
=    { (4.4) }
   µ · F f · g
=    { (4.38) ; (4.36) }
   ap^S · (f × id)^S · g
=    { ( )^S is a functor }
   (ap · (f × id))^S · g
=    { (4.32) }
   f̂^S · g
=    { cancellation }
   f̂^S · ‾ĝ
=    { absorption (2.72) }
   ‾(f̂ · ĝ)

In summary, we have:

   f • g = ‾(f̂ · ĝ)    (4.39)
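Pointwise, (4.38) says that µ runs the outer action and then runs the action it delivers on the resulting state, and (4.39) says that f • g feeds g's output and after-state to f. A sketch in Haskell (our own names throughout), including a quick check of the stack law pop • push = u calculated next, with push and pop reading their state as a list built with (:):

```haskell
type St s a = s -> (a, s)

-- Multiplication (4.38), pointwise: mu <f,g> s = (f s)(g s).
muSt :: St s (St s a) -> St s a
muSt x = \s -> let (f, s') = x s in f s'

-- Kleisli composition (4.39), pointwise: run g, then f on its results.
kleisliSt :: (b -> St s c) -> (a -> St s b) -> a -> St s c
kleisliSt f g a = \s -> let (b, s') = g a s in f b s'

-- Unit (4.37), for comparison with pop . push below.
unitSt :: a -> St s a
unitSt a = \s -> (a, s)

-- Stack actions in the style of (4.41, 4.42).
push :: a -> St [a] ()
push a = \s -> ((), a : s)

pop :: () -> St [a] a
pop () = \s -> (head s, tail s)
```

For instance, `kleisliSt pop push 5 [1,2]` yields `(5, [1,2])`, which is exactly `unitSt 5 [1,2]`: pushing and then popping is the unit.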


Let us use this in calculating law

   pop • push = u    (4.40)

where push and pop are such that

   push : A → (1 × S)^S
   p̂ush def= ⟨!, (:)̂⟩    (4.41)

   pop : 1 → (A × S)^S
   p̂op def= ⟨head, tail⟩ · π₂    (4.42)

for S the datatype of finite lists. We reason:

   pop • push
=    { (4.39) }
   ‾(p̂op · p̂ush)
=    { (4.41, 4.42) }
   ‾(⟨head, tail⟩ · π₂ · ⟨!, (:)̂⟩)
=    { (2.20, 2.24) }
   ‾(⟨head, tail⟩ · (:)̂)
=    { out · in = id (lists) }
   ‾id
=    { (4.37) }
   u

Bind. The effect of binding a state transition x to a state-monadic function h is calculated in a similar way:

   x >>= h
=    { (4.16) }
   (µ · F h) x
=    { (4.38) and (4.36) }
   (ap^S · (h × id)^S) x
=    { ( )^S is a functor }


   (ap · (h × id))^S x
=    { cancellation (4.33) }
   ĥ^S x
=    { exponential functor (2.71) }
   ĥ · x

Let us unfold ĥ · x by splitting x into its two components f and g:

   ⟨f, g⟩ >>= h = ĥ · ⟨f, g⟩
≡    { go pointwise }
   (⟨f, g⟩ >>= h) s = ĥ(⟨f, g⟩ s)
≡    { (2.18) }
   (⟨f, g⟩ >>= h) s = ĥ(f s, g s)
≡    { (4.34) }
   (⟨f, g⟩ >>= h) s = h(f s)(g s)

In summary, for a given "before state" s, g s is the intermediate state upon which f s runs and yields the output and (final) "after state".

Two prototypical inhabitants of the state monad: get and put. These generic actions are defined as follows, in the PF-style:

   get def= ⟨id, id⟩    (4.43)
   put def= ‾⟨!, π₁⟩    (4.44)

Action get retrieves the data stored in the state while put (which can also be written

   put s = modify s̲    (4.45)

where

   modify f def= ⟨!, f⟩    (4.46)

updates the state via state-to-state function f) stores a particular value in the state.

The following is an example, in Haskell, of the standard use of get/put in managing context data, in this case a counter. The function decorates each node of a BTree (recall this datatype from page 91) with its position in the tree:


   decBTree Empty = return Empty
   decBTree (Node (a, (t1, t2))) =
     do n <- get
        put (n + 1)
        x <- decBTree t1
        y <- decBTree t2
        return (Node ((a, n), (x, y)))
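The example above can be made self-contained and runnable as follows. The St newtype, its instances and the BTree declaration are our own scaffolding; the book's BTree of page 91 is assumed to have this shape:

```haskell
-- Minimal state monad with get/put, supporting the decBTree example.
newtype St s a = St { runSt :: s -> (a, s) }

instance Functor (St s) where
  fmap f (St t) = St (\s -> let (a, s') = t s in (f a, s'))

instance Applicative (St s) where
  pure a = St (\s -> (a, s))
  St tf <*> St ta = St (\s -> let (f, s')  = tf s
                                  (a, s'') = ta s'
                              in  (f a, s''))

instance Monad (St s) where
  St t >>= h = St (\s -> let (a, s') = t s in runSt (h a) s')

get :: St s s                      -- (4.43): deliver the state, keep it
get = St (\s -> (s, s))

put :: s -> St s ()                -- (4.44): overwrite the state
put s = St (\_ -> ((), s))

data BTree a = Empty | Node (a, (BTree a, BTree a)) deriving (Eq, Show)

-- Decorate each node with a visit counter kept in the state.
decBTree :: BTree a -> St Int (BTree (a, Int))
decBTree Empty = return Empty
decBTree (Node (a, (t1, t2))) =
  do n <- get
     put (n + 1)
     x <- decBTree t1
     y <- decBTree t2
     return (Node ((a, n), (x, y)))
```

Running `runSt (decBTree t) 0` decorates t with counters starting at 0 and also returns the total number of nodes as the final state.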


Part II

Moving Away From (Pure) Functions


Bibliography

[ABH+92] C. Aarts, R.C. Backhouse, P. Hoogendijk, E. Voermans, and J. van der Woude. A relational theory of datatypes, December 1992. Available from www.cs.nott.ac.uk/~rcb.

[Bac78] J. Backus. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. CACM, 21(8):613–639, August 1978.

[Bac04] R.C. Backhouse. Mathematics of Program Construction. Univ. of Nottingham, 2004. Draft of book in preparation. 608 pages.

[Bar01] L.S. Barbosa. Components as Coalgebras. University of Minho, December 2001. Ph.D. thesis.

[BD77] R.M. Burstall and J. Darlington. A transformation system for developing recursive programs. JACM, 24(1):44–67, January 1977.

[BdM97] R. Bird and O. de Moor. Algebra of Programming. Series in Computer Science. Prentice-Hall International, 1997. C.A.R. Hoare, series editor.

[Bir98] R. Bird. Introduction to Functional Programming. Series in Computer Science. Prentice-Hall International, 2nd edition, 1998. C.A.R. Hoare, series editor.

[Flo67] R.W. Floyd. Assigning meanings to programs. In J.T. Schwartz, editor, Mathematical Aspects of Computer Science, volume 19, pages 19–32. American Mathematical Society, 1967. Proc. Symposia in Applied Mathematics.

[Fok92] M.M. Fokkinga. Law and Order in Algorithmics. PhD thesis, University of Twente, Dept INF, Enschede, The Netherlands, 1992.

[GHA01] Jeremy Gibbons, Graham Hutton, and Thorsten Altenkirch. When is a function a fold or an unfold?, 2001. WGP, July 2001 (slides).


[Gof84] J. Le Goff. Calendário, volume I, chapter 8, pages 260–292. I.N.-C.M., 1984. EINAUDI Encyclopedia (Portuguese translation).

[HM93] Graham Hutton and Erik Meijer. Monadic parsing in Haskell. Journal of Functional Programming, 8(4), 1998.

[Hud00] P. Hudak. The Haskell School of Expression - Learning Functional Programming Through Multimedia. Cambridge University Press, 1st edition, 2000. ISBN 0-521-64408-9.

[JJ96] J. Jeuring and P. Jansson. Polytypic programming. In Advanced Functional Programming, number 1129 in LNCS, pages 68–114. Springer, 1996.

[JJ98] P. Jansson and J. Jeuring. Polylib — a library of polytypic functions. In Workshop on Generic Programming (WGP'98), Marstrand, Sweden, 1998.

[Jon03] S.L. Peyton Jones. Haskell 98 Language and Libraries. Cambridge University Press, Cambridge, UK, 2003. Also published as a Special Issue of the Journal of Functional Programming, 13(1), Jan. 2003.

[LV03] R. Lämmel and J. Visser. A Strafunski Application Letter. In V. Dahl and P.L. Wadler, editors, Proc. of Practical Aspects of Declarative Programming (PADL'03), volume 2562 of LNCS, pages 357–375. Springer-Verlag, January 2003.

[MA86] E.G. Manes and M.A. Arbib. Algebraic Approaches to Program Semantics. Texts and Monographs in Computer Science. Springer-Verlag, 1986. D. Gries, series editor.

[Mal90] G. Malcolm. Data structures and program transformation. Science of Computer Programming, 14:255–279, 1990.

[McC63] J. McCarthy. Towards a mathematical science of computation. In C.M. Popplewell, editor, Proc. of IFIP 62, pages 21–28, Amsterdam-London, 1963. North-Holland Pub. Company.

[MH95] E. Meijer and G. Hutton. Bananas in space: Extending fold and unfold to exponential types. In S. Peyton Jones, editor, Proceedings of Functional Programming Languages and Computer Architecture (FPCA95), 1995.

[Mog89] Eugenio Moggi. Computational lambda-calculus and monads. In Proceedings 4th Annual IEEE Symp. on Logic in Computer Science, LICS'89, Pacific Grove, CA, USA, 5–8 June 1989, pages 14–23. IEEE Computer Society Press, Washington, DC, 1989.


[NR69] P. Naur and B. Randell, editors. Software Engineering: Report on a conference sponsored by the NATO SCIENCE COMMITTEE, Garmisch, Germany, 7th to 11th October 1968. Scientific Affairs Division, NATO, 1969.

[Oli08a] J.N. Oliveira. Extended static checking by calculation using the pointfree transform, 2008. Tutorial paper (56 p.) accepted for publication by Springer-Verlag, LNCS series.

[Oli08b] J.N. Oliveira. Transforming Data by Calculation. In GTTSE'07, volume 5235 of LNCS, pages 134–195. Springer, 2008.

[OR06] J.N. Oliveira and C.J. Rodrigues. Pointfree factorization of operation refinement. In FM'06, volume 4085 of LNCS, pages 236–251. Springer-Verlag, 2006.

[PH70] M.S. Paterson and C.E. Hewitt. Comparative schematology. In Project MAC Conference on Concurrent Systems and Parallel Computation, pages 119–127, August 1970.

[Wad89] P.L. Wadler. Theorems for free! In 4th International Symposium on Functional Programming Languages and Computer Architecture, pages 347–359, London, Sep. 1989. ACM.


