Lecture 17.0 — 2016-10-27

Lambda calculus: definition, encodings

This lecture is written in literate Haskell; you can download the raw source.

We defined the lambda calculus, originated by Alonzo Church in the 1930s and worked on by some other famous folks (Stephen Kleene, Haskell Curry, Alan Turing). Here are the collected definitions.

The lambda calculus

Let V be an infinite set of variable names.

We define the terms or expressions of the lambda calculus as follows:

e in LC ::= x | e1 e2 | lambda x. e

Note that application is left associative, so:

x y z = (x y) z

And that we write multiple variables after a lambda to indicate nesting:

lambda x y z. e = lambda x. (lambda y. (lambda z. e))

Free variables and substitution

We define the free variables of a term using the following function:

fv :: LC -> 2^V
fv(x) = {x}
fv(e1 e2) = fv(e1) U fv(e2)
fv(lambda x. e) = fv(e) \ {x}

We say x is free in e if x is in fv(e). If a variable occurs in e and it isn’t free, we say it is bound. Every occurrence of a bound variable appears under a lambda with that variable (prove this to yourself!). For example:

fv(lambda x y z. q (x y)) = {q}

We say e is closed when fv(e) = {}. In general, we will only want to think about closed terms; it’s a strange idea to run a program with free variables! If a term isn’t closed, it’s open.

Next, we define a substitution operation. We’ll use a simple function notation first, reading subst(e1,x,e2) as “substitute e2 for x in e1”:

subst :: LC -> V -> LC -> LC
subst(x,x,e2)            = e2
subst(y,x,e2)            = y                               when y ≠ x
subst(e11 e12,x,e2)      = subst(e11,x,e2) subst(e12,x,e2)
subst(lambda x. e1,x,e2) = lambda x. e1
subst(lambda y. e1,x,e2) = lambda y. subst(e1,x,e2)        when y ≠ x

We typically write e1[e2/x] (read “e1 with e2 for x”) to mean subst(e1,x,e2).
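
Since this lecture is written in literate Haskell, here is a small standalone sketch of both definitions as ordinary Haskell. The datatype LC and the constructor names (Var, App, Lam) are my own choices, not part of the lecture's source.

import Data.Set (Set)
import qualified Data.Set as Set
type V = String
data LC = Var V | App LC LC | Lam V LC
  deriving Show
-- fv computes the free variables of a term
fv :: LC -> Set V
fv (Var x)     = Set.singleton x
fv (App e1 e2) = fv e1 `Set.union` fv e2
fv (Lam x e)   = Set.delete x (fv e)
-- subst e1 x e2 is e1[e2/x]
subst :: LC -> V -> LC -> LC
subst (Var y) x e2
  | y == x    = e2
  | otherwise = Var y
subst (App e11 e12) x e2 = App (subst e11 x e2) (subst e12 x e2)
subst (Lam y e1) x e2
  | y == x    = Lam y e1                -- x is shadowed, so stop
  | otherwise = Lam y (subst e1 x e2)   -- naive: could capture if y is free in e2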

Equational reasoning

Now we can give the two rules that define how the lambda calculus behaves; both use substitution. First, there’s alpha equivalence, which says we can rename as long as we do so consistently:

lambda x. e = lambda y. e[y/x]
  when y not in fv(e)

Second, there’s beta equivalence, which says that we can substitute actual arguments for formal arguments in a function:

(lambda x. e1) e2 = e1[e2/x]

Note that we get some other principles for free, because = is an equivalence relation:

reflexivity: e = e
symmetry: e1 = e2 iff e2 = e1
transitivity: if e1 = e2 and e2 = e3 then e1 = e3
congruence (application): if e1 = e1' and e2 = e2' then e1 e2 = e1' e2'
congruence (lambda): if e = e' then lambda x. e = lambda x. e'

Using these two rules, we can reduce some lambda expressions. Try to write out the whole derivation of these equalities yourself, subscripting equality with alpha or beta to indicate when you use alpha or beta equivalence, respectively. (Just like in algebra, we don’t need to make a big deal of it when we use properties like reflexivity or congruence.)

(lambda x. x) (lambda y. y) = lambda y. y
(lambda x. z) (lambda y. y) = z
lambda x y. x = lambda a b. a
(lambda x y. x) (lambda z. z) (lambda c. d) = lambda z. z
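
As a model for how these derivations look, the first one needs just a single beta step:

(lambda x. x) (lambda y. y) =beta x[(lambda y. y)/x] = lambda y. y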

Encodings

Once we had these definitions, I made the claim that the lambda calculus is a universal model of computation, just like Turing machines and Gödel numbers. To back up this claim, I started defining the various elements we might expect in a programming language.

The key idea behind all of the encodings is to operationalize things: since the lambda calculus only has functions, there is no “data” per se.

Booleans

Booleans are functions that make a binary choice. Arbitrarily, we’ll say true takes the first of two choices.

true  = lambda a b. a
false = lambda a b. b

not = lambda bl. lambda a b. bl b a
and = lambda b1 b2. b1 b2 false
or  = lambda b1 b2. b1 true b2

To convince yourself that these definitions are correct, try to derive the following equalities. (Again, try to mark when you use alpha and when you use beta.)

not true = false
not (not true) = true
or false false = false

Once we have these booleans, we can encode conditionals as follows: to write if e1 then e2 else e3, simply write e1 e2 e3. If e1 is true, we’ll get e2 back; if it’s false, we’ll get e3, just like a conditional!
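
To play with these encodings as runnable code, here is a sketch in Haskell. The names (trueC, notC, and so on) and the CBool newtype are my own; the wrapper exists only to keep the polymorphism palatable to Haskell's type checker.

{-# LANGUAGE RankNTypes #-}
-- a Church boolean chooses between two alternatives
newtype CBool = CBool { cIf :: forall a. a -> a -> a }
trueC, falseC :: CBool
trueC  = CBool (\a _ -> a)
falseC = CBool (\_ b -> b)
notC :: CBool -> CBool
notC bl = CBool (\a b -> cIf bl b a)
andC :: CBool -> CBool -> CBool
andC b1 b2 = cIf b1 b2 falseC
orC :: CBool -> CBool -> CBool
orC b1 b2 = cIf b1 trueC b2
-- observe a Church boolean as a built-in Bool, e.g. toBool (notC trueC) == False
toBool :: CBool -> Bool
toBool b = cIf b True False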

Naturals

The booleans were easy: there are only two possibilities. Naturals presented a greater difficulty, since there are infinitely many. The idea is to think back to how natural numbers are built up in number theory, where a natural number is either (a) zero, or (b) the successor of some other natural number. We could do it in Haskell easily:

data Nat = Zero | Succ Nat

We can’t explicitly represent such data structures in the lambda calculus, though, since there are only functions, no data. But we can operationalize the idea: the natural number n can be represented as a function of two arguments that applies its first argument n times to its second. Natural numbers represented this way are called Church numerals, in honor of Alonzo Church.

zero  = lambda s z. z
one   = lambda s z. s z
two   = lambda s z. s (s z)
three = lambda s z. s (s (s z))
four  = lambda s z. s (s (s (s z)))

While we can follow this pattern in principle to write down any natural number—just apply the first argument, s, some number of times—we can also define the successor function itself.

succ = lambda n. lambda s z. s (n s z)

Verify that succ zero = one and succ (succ two) = four.
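
For instance, the first verification works out like this, unfolding the definitions of succ and zero and marking each use of beta:

succ zero
  = (lambda n. lambda s z. s (n s z)) zero
  =beta lambda s z. s (zero s z)
  = lambda s z. s ((lambda s z. z) s z)
  =beta lambda s z. s ((lambda z. z) z)
  =beta lambda s z. s z
  = one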

We can go on to define other numeric operations, using the intuition that m + n = 1 + 1 + ... (m times) ... + n and m * n = n + n + ... (m times) ... + 0.

plus = lambda m n. m succ n
times = lambda m n. m (plus n) zero

isZero = lambda n. n (lambda x. false) true

Verify that times two two = four and isZero (times three zero) = true.
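
Here is a corresponding Haskell sketch of the numerals; again the names and the Church newtype wrapper are my own, and isZeroC observes a built-in Bool rather than a Church boolean, just to make testing easy.

{-# LANGUAGE RankNTypes #-}
-- a Church numeral applies its first argument n times to its second
newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }
zeroC :: Church
zeroC = Church (\_ z -> z)
succC :: Church -> Church
succC n = Church (\s z -> s (runChurch n s z))
plusC :: Church -> Church -> Church
plusC m n = runChurch m succC n           -- apply succ m times to n
timesC :: Church -> Church -> Church
timesC m n = runChurch m (plusC n) zeroC  -- add n to zero, m times
-- the lecture's isZero returns a Church boolean; here we observe a built-in Bool
isZeroC :: Church -> Bool
isZeroC n = runChurch n (\_ -> False) True
-- read a numeral back as an Int, e.g. toInt (timesC twoC twoC) == 4
toInt :: Church -> Int
toInt n = runChurch n (+ 1) 0
twoC :: Church
twoC = succC (succC zeroC)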

In order to encode the predecessor function, we had to make a detour through pairs.

Pairs

We defined pairs using what amounts to a “callback” strategy: a pair is a function that takes a single callback, which it calls with the first and second elements of the pair, respectively.

pair = lambda a b. lambda c. c a b
fst  = lambda p. p (lambda f t. f)
snd  = lambda p. p (lambda f t. t)

We can then observe that fst (pair a b) = a and snd (pair a b) = b.
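
A Haskell sketch of the pair encoding, with names of my own choosing (pairC, fstC, sndC) to avoid clashing with the Prelude:

{-# LANGUAGE RankNTypes #-}
-- a pair hands its two components to whatever callback it is given
newtype CPair a b = CPair { onPair :: forall c. (a -> b -> c) -> c }
pairC :: a -> b -> CPair a b
pairC a b = CPair (\c -> c a b)
fstC :: CPair a b -> a
fstC p = onPair p (\f _ -> f)
sndC :: CPair a b -> b
sndC p = onPair p (\_ t -> t)
-- e.g. fstC (pairC 1 "two") == 1 and sndC (pairC 1 "two") == "two"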

Predecessor

The predecessor function is tricky because it is partial: in the naturals, the predecessor of zero is undefined. Our function will have to do something, so let’s just say it returns zero.

The trick to writing the predecessor function is similar to what we did when we wrote folds that computed more than one value: we’ll use pairs to simultaneously compute our current number and one less than that number.

pred = lambda n. snd (n (lambda p. pair (succ (fst p)) (fst p)) (pair zero zero))

How does pred work? Given a number, it does some computation on pairs and takes the second element. To understand the computation, let’s break it into parts:

pred_zero = pair zero zero
pred_succ = lambda p. pair (succ (fst p)) (fst p)
pred = lambda n. snd (n pred_succ pred_zero)

So pred n is equal to snd (pred_succ (pred_succ ... n times ... pred_zero)). We can compute:

pred zero = snd pred_zero = zero

Which is just the behavior we specified. We can also see:

pred three = snd (pred_succ (pred_succ (pred_succ pred_zero)))
           = snd (pred_succ (pred_succ (pred_succ (pair zero zero))))
           = snd (pred_succ (pred_succ (pair one zero)))
           = snd (pred_succ (pair two one))
           = snd (pair three two)
           = two

Note how, as we work through the number, the first element of the pair is always our “current” number and the second is the predecessor of our “current” number (except for when we’re just starting at zero). Try to prove to yourself that pred (succ e) = e.

Once we have predecessor, we can use the intuition that m - n = m - 1 - 1 - ... (n times) ... - 1 to define subtraction:

minus = lambda m n. n pred m
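
For example, unfolding the definitions:

minus three one = one pred three = pred three = two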

Once we have minus, we can define other standard predicates on the naturals:

lte = lambda m n. isZero (minus m n)
gte = lambda m n. lte n m
lt = lambda m n. lte (succ m) n
gt = lambda m n. lt n m
equal = lambda m n. and (lte m n) (lte n m)
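
As a quick check of these definitions, here is lt at work:

lt one two = lte (succ one) two
           = lte two two
           = isZero (minus two two)
           = isZero (two pred two)
           = isZero (pred (pred two))
           = isZero zero
           = true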