Why Functional Programming Matters

Papers we love Utrecht, June 2017


  • Old but good: 1984
  • No Haskell at that time, but LISP, ML, Miranda, etc.
  • Syntax in the paper: learn as we go

Intro: Before the "why", the "what"

  • Programming with functions
    • As opposed to objects or procedures
    • Program is a function, functions are defined in terms of functions…
  • Functions are defined by equations, like in maths
    • Equations mean that RHS can be substituted by LHS (vice-versa), always

Now the "why": The "advantages" you often hear

  • No assignments, variables never change
  • Functions have no side-effects
    • The only thing a function does is compute its result
  • Not REALLY advantages, right?

An analogy with structured programming

  • False "advantages"
    • No GOTOs
    • No multiple entry/exit points into code blocks
  • Real advantage: code becomes more modular

How modular is structured programming REALLY?

  • Modular problem solving:
    • Split problem into subproblems
    • Solve each subproblem
    • Glue solutions together
  • Separate compilation and scoping doesn't really help much…

The claim of this paper

  • We need better kinds of GLUE
  • Functional programming allows two great ways of glueing things
    • Using functions as parameters of other functions
    • Composing functions with no space waste

Glueing functions together

  • Having defined lists as follows

listof X ::= nil | cons X (listof X)

[]       means  nil
[1]      means  (cons 1 nil)
[1,2,3]  means  (cons 1 (cons 2 (cons 3 nil)))
  • We can define the sum of all elements in a list of numbers

sum nil = 0
sum (cons num list) = num + sum list

Glueing functions together

  • There are only two parts specific to computing a sum in sum

sum nil = 0
sum (cons num list) = num + sum list
  • We can build sum by giving the specific parts to a general function
  • This more general function is very useful

sum = reduce add 0
add x y = x + y

(reduce f x) nil = x
(reduce f x) (cons a l) = f a ((reduce f x) l)

Just how useful is reduce

product = reduce multiply 1
anytrue = reduce or false
alltrue = reduce and true
  • We can better understand what reduce does by example

cons 1 (cons 2 (cons 3 nil))
{- APPLY "reduce add 0" to line above -}
add 1 (add 2 (add 3 0))

cons 1 (cons 2 (cons 3 nil))
{- APPLY "reduce multiply 1" to line above -}
multiply 1 (multiply 2 (multiply 3 1))

Just how useful is reduce

  • We can even use reduce to append lists

append a b = reduce cons b a

append [1,2] [3,4]
reduce cons [3,4] [1,2]
(reduce cons [3,4]) (cons 1 (cons 2 nil))
cons 1 (cons 2 [3,4])

But wait, there's more

doubleall = reduce doubleandcons nil
  where doubleandcons num list = cons (2*num) list

doubleandcons = fandcons double
  where double n = 2 * n

fandcons f el list = cons (f el) list
fandcons f el = cons (f el)  {- because ∀ x. f x = g x ⟶  f = g -}

(f . g) h = f (g h)
fandcons f = cons . f

doubleall = reduce (cons . double) nil
map f = reduce (cons . f) nil
doubleall = map double

Now, for trees…

  • We can define ordered labeled trees as:

treeof X ::= node X (listof (treeof X))

node 1
     (cons (node 2 nil)
           (cons (node 3
                       (cons (node 4 nil) nil))

Reducing trees

  • We define redtree by analogy with reduce
    • 3 parameters, substituting node, cons and nil
    • redtree reduces trees, redtree' reduces list of subtrees

redtree f g a (node label subtrees) = f label (redtree’ f g a subtrees)

redtree’ f g a (cons subtree rest) = g (redtree f g a subtree) (redtree’ f g a rest)
redtree’ f g a nil = a

Example of using redtree

  • We can define a function to get a list of all labels in a tree

labels = redtree cons append nil

node 1
     (cons (node 2 nil)
           (cons (node 3
                       (cons (node 4 nil) nil))
{- APPLY "redtree cons append nil" to line above -}
cons 1
     (append (cons 2 nil)
             (append (cons 3
                           (append (cons 4 nil) nil))
  • By the way, of course we can map a function over a tree

maptree f = redtree (node . f) cons nil

Glueing programs together: composition

  • Remember Unix pipelines?
    • cmd1 | cmd2
    • Output of cmd1 fed to input of cmd2
    • Circular FIFO
  • In FP, we write
    cmd2 . cmd1
    • cmd1 ONLY runs enough to provide cmd2 with what it needs
    • No intermediate data stored
    • This is called lazy evaluation

Elegant sqrt

  • We use the Newton-Raphson algorithm
    • Start with approx. a0 and compute next approx. as follows:

a(n+1) = (a(n) + N/a(n)) / 2
  • If the sequence converges to a certain "a", then:

a = (a + N/a) / 2
2a = a + N/a
a = N/a
a*a = N
a = √N

This would normally be so ugly…

X = A0
Y = A0 + 2.*EPS
100  IF (ABS(X-Y).LE.EPS) GOTO 200
     Y = X
     X = (X + N/X) / 2.
     GOTO 100
  • Making it better with FP: first step is next

next N x = (x + N/x) / 2

A sequence of approximations appears!

next N x = (x + N/x) / 2

# abbreviating "next N" as "f", this is the sequence we want:
[a0, f a0, f (f a0), f (f (f a0)), ...

repeat f a = cons a (repeat f (f a))

repeat (next N) a0

Stopping before infinity

  • In the previous slide we define an infinite seq. of approxs.
  • But we can "stop" whenever the difference gets small enough

within eps (cons a (cons b rest)) = if abs (a - b) <= eps
                                        then b
                                        else within eps (cons b rest)

sqrt a0 eps N = within eps (repeat (next N) a0)
  • Square-root finder has two "parts":
    • Generate sequence of approximations
    • Select element from sequence
  • Let's re-use them for other goals

Numerical differentiation, FP-style

  • A way to approximate a derivative

easydiff f x h = (f(x+h)-f x) / h
  • Then we can get a sequence of approximations
  • And stop when the difference is small enough

halve x = x/2
differentiate h0 f x = map (easydiff f x) (repeat halve h0)

diffAttempt1 h0 f x = within eps (differentiate h0 f x)

This naïve attempt converges slowly…

  • Calculus helps with improving the sequence
  • The reasoning behind this is easy, in the paper
  • But essentially it gives a function improve

elimerror n (cons a (cons b rest)) = cons ((b*(2**n)-a)/(2**n-1)) (elimerror n (cons b rest))
order (cons a (cons b (cons c rest))) = round (log2 ((a-c)/(b-c) - 1))  {- NO PROOF HERE -}
improve s = elimerror (order s) s

within eps (improve (differentiate h0 f x))
within eps (improve (improve (differentiate h0 f x)))

What the glue allowed us here…

  • More understandable definitions
  • Above all: re-usable ingredients
    • improve, within also useful for integration

An example from artificial intelligence

  • Let's write a program capable of playing (deterministic) games
  • Using the minimax strategy, with the alpha-beta opt,
  • But in super elegant FP style
  • First of all we need a function giving the possible moves

moves: position -> listof position
  • With this we can calculate a game tree

Game trees

  • Remember the trees we had from section 3

treeof X ::= node X (listof (treeof X))
  • We can build a tree by repeated applications of a function
  • By repeat application of moves, we get a game tree

repeat f a = cons a (repeat f (f a))

reptree f a = node a (map (reptree f) (f a))

gametree p = reptree moves p

A first attempt at minimax

  • We need a function to define how much approx. a position is worth
    • Worth for the computer
    • Approx. means without lookahead

static: position -> number
  • Now we can express the core of minimax as two functions
    • For computer: maximise true value
    • For the opponent: minimise true value

maximise (node n sub) = max (map minimise sub)
minimise (node n sub) = min (map maximise sub)

A first attempt at minimax

  • Little detail: base case for maximise and minimise
  • Leaves of the tree = no more moves to analyse
    • Just use the "static" approximation at this point

maximise (node n nil) = n
minimise (node n nil) = n
  • With this we can define a minimax algorithm


Making out minimax better

  • Doesn't work with infinite game trees
  • Even for finite ones (tic-tac-toe), it can take a looooong time
  • Fortunately, we can prune a tree

prune 0 (node a x) = node a nil
prune n (node a x) = node a (map (prune (n-1)) x)
  • Compare the optimized definition with the unoptimized one
  • This modularity could NOT be achieved without lazy eval

evaluateSlow = maximise . maptree static . gametree
evaluateFast = maximise . maptree static . prune 5 . gametree

Last-step: alpha-beta

  • Gist of it: when computing a max, we don't need to compute all minima
  • Just involves a change from maximise into maximise'

maximise’ (node n nil) = cons n nil
maximise’ (node n l) = map minimise l
                     = map (min . minimise’) l
                     = map min (map minimise’ l)
                     = mapmin (map minimise’ l)
    where mapmin = map min

mapmin (cons nums rest) = cons (min nums) (omit (min nums) rest)

evaluate = max . maximise’ . maptree static . prune 8 . gametree

That's all for today!

  • Questions?