Recursion

Require Export DMFP.Types.

"Recursion is the root of computation since it trades description for time." --Alan Perlis

(See https://www.cs.yale.edu/homes/perlis-alan/quotes.html.)

Let's write some recursive functions!

Fixpoint plus (n : nat) (m : nat) : nat :=
  match n with
    | O ⇒ m
    | S n' ⇒ S (plus n' m)
  end.

Adding three to two gives us five, as we'd expect.

Compute (plus 3 2).

The simplification that Coq performs to reach this conclusion can be visualized as follows, a little more succinctly than what we wrote for evenb last time:

(*  plus (S (S (S O))) (S (S O))
==> S (plus (S (S O)) (S (S O)))
      by the second clause of the match
==> S (S (plus (S O) (S (S O))))
      by the second clause of the match
==> S (S (S (plus O (S (S O)))))
      by the second clause of the match
==> S (S (S (S (S O))))
      by the first clause of the match
*)

As a notational convenience, if two or more arguments have the same type, they can be written together. In the following definition, (n m : nat) means just the same as if we had written (n : nat) (m : nat).

Fixpoint mult (n m : nat) : nat :=
  match n with
    | O ⇒ O
    | S n' ⇒ plus m (mult n' m)
  end.

Compute (mult 3 3).
Compute (mult 5 2).

You can match two expressions at once by putting a comma between them:

Fixpoint minus (n m:nat) : nat :=
  match n, m with
  | O , _ ⇒ O
  | S _ , O ⇒ n
  | S n', S m' ⇒ minus n' m'
  end.

Our minus function has the same problem pred does: minus 0 n results in 0! This 'truncating' subtraction is necessary because we're working with natural numbers, i.e., there are no negative numbers.

Keep in mind: these definitions have suggestive names, but it's up to us as humans to agree that minus corresponds to our notion of subtraction. I would wager that it does: if we pick an n and an m such that m ≤ n, then minus n m yields n - m. We don't have the tools to prove this... yet!

Again, the _ in the first line is a wildcard pattern. Writing _ in a pattern is the same as writing some variable that doesn't get used on the right-hand side. This avoids the need to invent a variable name.

The pow function defines exponentiation: pow n m should yield n to the mth power.

Fixpoint pow n m :=
  match m with
    | 0 ⇒ 1
    | S m ⇒ n × (pow n m )
  end.

End NatPlayground2.

Exercise: 1 star, standard (factorial)

Here's the standard mathematical factorial function:

       factorial(0)  =  1
       factorial(n)  =  n * factorial(n-1)     (if n>0)

We'll meet this function again later in the course. For now, translate this into Coq.

Fixpoint factorial (n:nat) : nat
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐

Compute (factorial 3).
Compute (factorial 5).

With a little work, we can use the arithmetic-like syntax we're accustomed to:

Notation "x + y" := (plus x y)
                       (at level 50, left associativity)
                       : nat_scope.
Notation "x - y" := (minus x y)
                       (at level 50, left associativity)
                       : nat_scope.
Notation "x * y" := (mult x y)
                       (at level 40, left associativity)
                       : nat_scope.

Check ((0 + 1) + 1).

(The level, associativity, and nat_scope annotations control how these notations are treated by Coq's parser. The details are not important for our purposes, but interested readers can refer to the optional "More on Notation" section at the end of this chapter.)

Note that these do not change the definitions we've already made: they are simply instructions to the Coq parser to accept e.g. x + y in place of plus x y and, conversely, to the Coq pretty-printer to display plus x y as x + y.

Fixpoints and Structural Recursion

Here is a copy of the definition of addition:

Fixpoint plus' (n : nat) (m : nat) : nat :=
  match n with
  | O ⇒ m
  | S n' ⇒ S (plus' n' m)
  end.

When Coq checks this definition, it notes that plus' is "decreasing on 1st argument." What this means is that we are performing a structural recursion over the argument n -- i.e., that we make recursive calls only on strictly smaller values of n. This implies that all calls to plus' will eventually terminate. Coq demands that some argument of every Fixpoint definition is "decreasing."

This requirement is a fundamental feature of Coq's design: In particular, it guarantees that every function that can be defined in Coq will terminate on all inputs. However, because Coq's "decreasing analysis" is not very sophisticated, it is sometimes necessary to write functions in slightly unnatural ways.

Exercise: 2 stars, standard (decreasing)

To get a concrete sense of this, find a way to write a sensible Fixpoint definition (of a simple function on numbers, say) that does terminate on all inputs, but that Coq will reject because of this restriction.

You can write it Fail Fixpoint ... to ensure it fails to compile!

(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_decreasing : option (nat ×string) := None.
☐

Anonymous functions and fixpoints

Module DefnFixpoint.

Recall our definition of pred, the predecessor function on nats.

Definition pred (n : nat) : nat :=
  match n with
    | O ⇒ O
    | S n' ⇒ n'
  end.

Check pred. (* ===> nat -> nat *)
Check (pred 5). (* ===> nat *)
Compute (pred 5). (* ===> 4 *)
Compute (pred 0). (* ===> 0 (nobody's perfect) *)

In addition to the Check command to see an expression's type and the Compute command to see how it evaluates, there's a Print command to see something's definition. Let's try it.

Print pred.
(* ===>
pred = fun n : nat => match n with
                      | 0 => 0
                      | S n' => n'
                      end
     : nat -> nat
*)

Huh, weird! We defined pred with the Definition command, but Coq used a fun. It turns out that these notions are equivalent.

That is, writing:

Definition foo (x:T) (y:U) : Z := blah.

is just the same as:

Definition foo : T → U → Z := fun (x:T) (y:U) : Z ⇒ blah.

which is just the same as:

Definition foo (x:T) : U → Z := fun (y:U) ⇒ blah.

It's very useful to know equivalences in your programming language!

It's also important to know the various forms of syntax. A Definition takes the form:

Definition NAME ARGS× : RETURN TYPE := EXPRESSION.

where NAME is an identifier and ARGS× zero or more arguments (optionally, wrapped in parentheses and annotated with types). (It's conventional in computer science to use * to mean "zero or more". The name comes from the 20th Century theorist Stephen Cole Kleene.) The syntax for anonymous functions is similar:

fun ARGS+ ⇒ EXPRESSION

Where the + means one or more.

For Definitions return types are optional and for funs they are forbidden; a Definition need not have arguments, but anonymous functions need them.

Finally---we introduced some of these concepts without such a nice description yesterday. Hopefully things are clearer now!

What about recursive functions? I briefly showed you evenb'... let's look at the recursive double function again, in a bit more detail:

Fixpoint double (n:nat) :=
  match n with
  | O ⇒ O
  | S n' ⇒ S (S (double n'))
  end.

Print double.
(* ===>
double = fix double (n : nat) : nat := match n with
                                       | 0 => 0
                                       | S n' => S (S (double n'))
                                       end
     : nat -> nat
*)

Rather than writing fun, it says fix double. Interesting! After that, it looks mostly like a fun, though there's a return type, too.

Here's another equivalence, where ARGS+ means one or more arguments:

Fixpoint NAME ARGS+ : RETURN TYPE := EXPRESSION.

is just the same as:

Definition NAME := fix NAME ARGS+ : RETURN TYPE := EXPRESSION.

End DefnFixpoint.

Exercise: 2 stars, standard (half)

Define a function half : nat → nat that divides a number by two, rounding down.

Fixpoint half (n:nat) : nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Compute (half 4). (* ==> 2 *)
Compute (half 9). (* ==> 4 *)
☐

Exercise: 2 stars, standard (third)

Define a function third : nat → nat that divides a number by three, rounding down.

Fixpoint third (n:nat) : nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Compute (third 4). (* ==> 1 *)
Compute (third 9). (* ==> 3 *)
☐

Case study: binary and trinary numbers

Exercise: 3 stars, standard (binary)

Consider a different, more efficient representation of natural numbers using a binary rather than unary system. That is, instead of saying that each natural number is either zero or the successor of a natural number, we can say that each binary number is either

zero,
twice a binary number, or
one more than twice a binary number.

Inductive bin : Type :=
  | BZ (* zero *)
  | T₂ (n : bin) (* twice n *)
  | T2P₁ (n : bin). (* twice n, plus one *)

How does this work? Let's compare some binary and unary numbers. Zero is easy:

Definition u_zero : nat := O.
Definition b_zero : bin := BZ.

One isn't so bad either. To get 1 in unary, just add an S. To get it in binary, well, what could we do? If we simply enumerate possibilities, the next up after BZ itself are T₂ BZ and T2P₁.

T₂ BZ is "twice BZ", i.e., "twice zero", i.e. "zero". Not what we want.

T2P₁ BZ is "twice BZ plus one", i.e., "zero plus one", i.e., "one". Perfect!

Definition u_one : nat := S O.
Definition b_one : bin := T2P₁ BZ.

For the number two, the nats do the same thing: just add an S. Noting that 2 = 2 * 1, we have our answer for binary, too.

Definition u_two : nat := S (S O).
Definition b_two : bin := T₂ (T2P₁ BZ).

Let's do three.

Definition u_three : nat := S (S (S O)).
Definition b_three : bin := T2P₁ (T2P₁ BZ).

Let's skip a few. What about the number nine?

Definition u_nine : nat := S (S (S (S (S (S (S (S (S O)))))))).
Definition b_nine : bin := T2P₁ (T₂ (T₂ (T2P₁ BZ))).

Eleven?

Definition u_eleven : nat := S (S (S (S (S (S (S (S (S (S (S O)))))))))).
Definition b_eleven : bin := T2P₁ (T2P₁ (T₂ (T2P₁ BZ))).

You may be wondering... what do T₂ and T2P₁ and BZ have to do with, you know, binary? 0s and 1s?

To see, take a binary number and working from BZ out, write 1 for T2P₁ and then 0 for T₂.

b_one: 1
b_two: 10
b_three: 11
b_nine: 1001
b_eleven: 1011

These are indeed the binary numbers!

(a) First, write an increment function incr for binary numbers, and a function bin_to_nat to convert binary numbers to unary numbers.

(b) Write five unit tests for your increment and binary-to-unary functions. (A "unit test" in Coq is just a Compute that you can quickly check.) Notice that incrementing a binary number and then converting it to unary should yield the same result as first converting it to unary and then incrementing. You only need to write five tests total, but please be certain to test both functions.

(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_binary : option (nat ×string) := None.
☐

These next exercises are a continuation of the previous exercise about binary numbers. It's a bit harder.

(a) Next, write a function nat_to_bin that converts natural numbers to binary numbers.

(b) You might think that starting with a binary number, converting to a natural, and then back to binary yields the same number we started with. However, this is not true! Explain what the problem is.

(c) Define a "direct" normalization function -- i.e., a function normalize from binary numbers to binary numbers such that, for any binary number b, converting to a natural and then back to binary yields (normalize b).

It's okay if you can't complete this problem, but the autograder will reject your submission if you don't at least define nat_to_bin with the right type. You can write Definition nat_to_bin (n:nat) : bin. Admitted., but you'll have to figure out what TYPE is. If you're stuck... ask for help!

Exercise: 2 stars, standard (nat_to_bin)

(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_nat_to_bin : option (nat ×string) := None.
☐

(* Answer to (b): 0 has many representations: it can be written
    BZ, T₂ BZ, T₂ T₂ BZ, and so on.
    For these alternate representations if you do bin_to_nat
    then nat_to_bin, you don't get back what you started with.
    (Any other number also has many representations, after applying
    constructors to the multiple representations of zero.) *)

Exercise: 2 stars, standard, optional (normalize)

(* FILL IN HERE *)
Fixpoint normalize (b:bin) : bin (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
(* Do not modify the following line: *)
Definition manual_grade_for_normalize : option (nat ×string) := None.
☐

What's happening here?

You might be thinking, "What is this crap? I wanted to learn computer programming! Coq/Gallina is a stupid language with stupid chicken names."

A few thoughts.

1. This is a computer science department, not a computer programming department. You'll learn to program because it's a critical skill for computer science, but we're not in it just for the programming. (Similarly, astronomers learn to use cool telescopes, and you could hardly have modern astronomy without telescopes... but astronomy isn't just about telescopes.)

2. Recursion is fundamental to computer science; many core concepts are expressed recursively. You'll have pleny of time to write iterative loops. This is practice!

3. We're trying to show you something essential about how computation and data interrelate, and recursion lets us highlight that.

4. Most people learn iterative languages early on. Studying recursion lets us level the playing field.

5. In the latter two thirds of the course, we'll be proving things about our programs. It's MUCH easier to prove things about recursive programs than iterative ones.

(* 2022-01-12 10:20 *)