Lecture 4 — 2015-09-14

Folds; set theory

This lecture is written in literate Haskell; you can download the raw source.

We spent a good chunk of class going over the homework from a high level, talking about style, functionality, and grading.

The following were things that didn’t get points off, but are still ‘wrong’ from a style perspective.

Not so good                                       Better                                   Why
tabs                                              spaces                                   GHC doesn’t know how wide a tab is!
super long lines                                  lines broken to be <90 chars
sumUp(xs)                                         sumUp xs                                 Haskell uses space to denote application
f (x) (y)                                         f x y                                    Those parens do nothing.
[x] ++ xs                                         x:xs
insertBST x (Node l y r) | x == y = (Node l y r)  insertBST x t@(Node l y r) | x == y = t  You can capture outer structures in your patterns.
if e then True else False                         e
case e of { True -> e1 ; False -> e2 }            if e then e1 else e2
f (Node l x r) | l == Empty = x                   f (Node Empty x r) = x                   Compound patterns are easier to read.

In general, when you write conditionals in top-level functions, prefer the following, in order:

  1. Pattern matching
  2. Guards (conditions written with |)
  3. If/case expressions

Some people’s code used the same conditional on a bunch of different cases, as in:

f p1 = if e then ...
f p2 = if e then ...
f p3 = if e then ...

This isn’t great. The core logic of your function is more about the condition e than the patterns, so restructure the function to test e first. Maybe invert the conditional? Maybe condition less in the pattern? Maybe do a case expression on a pair? In any case, code that looks like it’s been copied and pasted is bad. Abstract it out!
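As a small sketch of one such restructuring, suppose every clause tests the same flag. The Shape type and both function names here are hypothetical, made up for illustration; they are not taken from the homework.

```haskell
-- Hypothetical type, assumed only for this example.
data Shape = Circle Int | Square Int | Rect Int Int

-- Repetitive version: the same conditional is copied into every clause.
areaOrZeroRep :: Bool -> Shape -> Int
areaOrZeroRep hidden (Circle r) = if hidden then 0 else 3 * r * r  -- crude integer stand-in for pi
areaOrZeroRep hidden (Square s) = if hidden then 0 else s * s
areaOrZeroRep hidden (Rect w h) = if hidden then 0 else w * h

-- Restructured version: the condition is decided once, by pattern matching
-- on the flag, and the remaining clauses only describe the shapes.
areaOrZero :: Bool -> Shape -> Int
areaOrZero True  _          = 0
areaOrZero False (Circle r) = 3 * r * r
areaOrZero False (Square s) = s * s
areaOrZero False (Rect w h) = w * h
```

Both versions agree on every input (for example, areaOrZero False (Square 3) evaluates to 9), but the second one states the shared condition exactly once.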

With all of the bad news out of the way, we spent most of the class talking about folds. There’s some nice material on Wikipedia about folds.

There are two kinds of folds: rightwards folds and leftwards folds. They correspond to direct and accumulating recursion, respectively. Since direct is easier to understand, we looked at foldr first.

The folds we’ll define, foldr and foldl, are already in the Prelude…

import Prelude hiding (foldr, foldl)

so we hide them for the rest of this file.

foldr

Here’s a rightward fold, a higher-order function that seems a bit opaque at first blush.

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f b [] = b
foldr f b (a:as) = f a (foldr f b as)

We can understand foldr using the following equation:

foldr f b [a1,...,an] = f a1 (f a2 (... (f an b) ...))

We can also understand foldr as replacing a list’s ‘spine’ with function applications, like so:

   :             f      
  / \           / \     
 1   :         1   f    
    / \    =>     / \   
   2   :         2   f  
      / \           / \ 
     3  []         3   b
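To make the equation and the picture concrete, here is a short worked trace. The names sumR and copyR are illustrative, not from the lecture, and the sketch uses the Prelude’s foldr, which behaves on lists like the definition above.

```haskell
-- Illustrative name: sum a list with a rightward fold.
sumR :: [Int] -> Int
sumR = foldr (+) 0

-- Unfolding the definition of foldr one step at a time:
--   foldr (+) 0 [1,2,3]
-- = 1 + foldr (+) 0 [2,3]
-- = 1 + (2 + foldr (+) 0 [3])
-- = 1 + (2 + (3 + foldr (+) 0 []))
-- = 1 + (2 + (3 + 0))
-- = 6

-- Replacing (:) with (:) and [] with [] rebuilds the same spine,
-- so this fold is the identity function on lists.
copyR :: [a] -> [a]
copyR = foldr (:) []
```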

We defined a few different functions using foldr.

head :: [a] -> a
head = foldr (\a _ -> a) undefined  -- undefined on the empty list

product :: [Int] -> Int
product = foldr (*) 1

and, or :: [Bool] -> Bool
and = foldr (&&) True
or  = foldr (||) False

In general, a function defined like:

g [] = v
g (x:xs) = f x (g xs)

can be implemented using foldr as:

g = foldr f v
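As a sketch of that translation, here is a directly recursive length function next to its foldr form. The names lengthDirect, lengthR, and mapR are assumptions chosen to avoid clashing with the Prelude.

```haskell
-- Direct recursion, matching the template with
--   v = 0   and   f x r = 1 + r   (the element itself is ignored).
lengthDirect :: [a] -> Int
lengthDirect []     = 0
lengthDirect (_:xs) = 1 + lengthDirect xs

-- The same function after the mechanical translation to foldr.
lengthR :: [a] -> Int
lengthR = foldr (\_ r -> 1 + r) 0

-- map fits the template too: v = [] and f x r = h x : r.
mapR :: (a -> b) -> [a] -> [b]
mapR h = foldr (\x r -> h x : r) []
```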

foldl

Leftward folds are like rightward folds, but they nest the applications in the other direction:

   :                f
  / \              / \
 1   :            f   3
    / \    =>    / \   
   2   :        f   2
      / \      / \ 
     3  []    b   1

The code for a leftward fold looks more like the accumulator-passing style functions we’ve already written.

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f b [] = b
foldl f b (a:as) = foldl f (f b a) as

Correspondingly, foldl satisfies the following equation:

foldl f b [a1,...,an] = f (... (f (f b a1) a2) ...) an
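The direction of the nesting matters whenever the operator is not associative. A minimal sketch using subtraction (leftDiff and rightDiff are illustrative names; the Prelude’s foldl and foldr are used, which on lists match the definitions in these notes):

```haskell
-- foldl nests to the left:
--   foldl (-) 10 [1,2,3] = ((10 - 1) - 2) - 3 = 4
leftDiff :: Int
leftDiff = foldl (-) 10 [1, 2, 3]

-- foldr nests to the right:
--   foldr (-) 10 [1,2,3] = 1 - (2 - (3 - 10)) = 1 - (2 - (-7)) = 1 - 9 = -8
rightDiff :: Int
rightDiff = foldr (-) 10 [1, 2, 3]
```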

We can use foldl to define the accumulating version of insertion sort:

insert :: [Int] -> Int -> [Int]
insert [] x = [x]
insert (y:ys) x | x < y     = x:y:ys
                | otherwise = y:insert ys x

insertionSort :: [Int] -> [Int]
insertionSort = foldl insert []

We can also use it to easily define last:

last :: [a] -> a
last = foldl (\_ b -> b) undefined  -- undefined on the empty list

Finally, we can mechanically translate recursive accumulating functions defined like:

g [] acc = acc
g (x:xs) acc = g xs (f acc x)

into leftward folds like:

g xs acc = foldl f acc xs
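For instance, list reversal written with an accumulator translates directly into a leftward fold. This is a sketch with illustrative names (reverseAcc, reverseL). Note that foldl hands the accumulator to the step function as its first argument.

```haskell
-- Accumulating recursion: the second argument carries the result so far.
reverseAcc :: [a] -> [a] -> [a]
reverseAcc []     acc = acc
reverseAcc (x:xs) acc = reverseAcc xs (x : acc)

-- The same function as a leftward fold, starting from an empty accumulator.
-- The step function is \acc x -> x : acc, which is flip (:).
reverseL :: [a] -> [a]
reverseL = foldl (flip (:)) []
```

With these definitions, reverseAcc [1,2,3] [] and reverseL [1,2,3] both evaluate to [3,2,1].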

Set theory

In the last fifteen minutes of class, we went over some of the set theory that I bungled last Wednesday.

We’ll typically use capital letters as metavariables to refer to sets. Examples of sets we’ve seen are the infinite sets ℕ and ℤ, for the naturals and integers. We’ve also seen the finite set 2 a/k/a the booleans, where 2 = { ⊥, ⊤ }, i.e., false and true. (Don’t confuse the set 2 with the number 2!)

We can interpret sets as types, e.g., ⊥ ∈ 2 means that false has type Boolean. We use :: to write types in Haskell, but we’ll use ∈ (pronounced “is a member of” or “is in”) for types in math.

For a given set X, the set 2^X is the type of sets of X. For example, the infinite set of primes { 2, 3, 5, 7, 11, 13, 17, … } ∈ 2^ℕ. For a finite set, we can be more concrete: 2^2 = { ∅, {⊥}, {⊤}, {⊥, ⊤} }.

The set exponentiation notation 2^X is another way of writing the function type X → 2, that is, functions from X to Bool. These so-called characteristic functions are effectively the same thing as sets. For example, the set of even numbers { 0, 2, -2, 4, -4, … } is just as easily characterized as the function:

f x = x `mod` 2 == 0

Every set in 2^X corresponds directly to a function that returns True on elements that are in the set and False on those that aren’t. If you’re familiar with ‘set builder notation’, then the link between sets and characteristic functions is even clearer: a set S ∈ 2^X with characteristic function f is equal to { x ∈ X | f x = True }.
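The correspondence can be sketched in Haskell by taking a set to be its characteristic function. The type synonym Set and the names member, evens, emptySet, union, intersection, and comprehend are all assumptions made for this illustration.

```haskell
-- Assumed representation: a set of a's is its characteristic function.
type Set a = a -> Bool

-- Membership is just function application.
member :: a -> Set a -> Bool
member x s = s x

-- The even integers, as in the example above.
evens :: Set Integer
evens x = x `mod` 2 == 0

-- The empty set contains nothing.
emptySet :: Set a
emptySet _ = False

-- Set operations become pointwise boolean operations.
union, intersection :: Set a -> Set a -> Set a
union s t x        = s x || t x
intersection s t x = s x && t x

-- Set-builder notation, restricted to a finite list of candidates:
-- keeps exactly the candidates on which the characteristic function is True.
comprehend :: Set a -> [a] -> [a]
comprehend s xs = [x | x <- xs, s x]
```

For example, comprehend evens [0..6] evaluates to [0,2,4,6], and member (-2) evens is True.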

You might notice that the set 2^2 has 2^2 = 4 elements. Cosmic coincidence? No, not at all! Here’s the intuition:

  • |2^X| =
  • # of sets in 2^X =
  • # of different sets of X =
  • # of different ways to include or not include an X in a set of Xs =
  • 2 * 2 * 2 * … (|X| times) * 2 =
  • 2^|X|
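The counting argument can be checked on small finite sets with a sketch like the following. It represents a finite set as a list and assumes the list has no duplicates; powerSet is an illustrative name.

```haskell
-- Each element is either left out of a subset or put into it, so every
-- element doubles the number of subsets built so far.
-- Assumes the input list has no repeated elements.
powerSet :: [a] -> [[a]]
powerSet = foldr (\x subsets -> subsets ++ map (x :) subsets) [[]]
```

Taking False and True for ⊥ and ⊤, powerSet [False, True] evaluates to [[],[True],[False],[False,True]], which has 4 elements, and length (powerSet "abcde") is 32, matching 2^5.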

The set 2^X is also called the power set of X, and is sometimes written ℘(X).