Types

Separate Compilation

Before getting started, we need to import all of our definitions from the previous chapter:
Require Export DMFP.Intro.
For the Require Export to work, you first need to use coqc or CoqIDE to compile Intro.v into Intro.vo. In CoqIDE, switch over to Intro.v and run the Compile buffer command from the Compile menu, then come back to Types.v and execute the above line again.
If it doesn't work, there are a few approaches you can try to fix things.
1. Check to make sure that you have a _CoqProject file in the same directory as Types.v and Intro.v; it should contain just the single line -Q . DMFP with exactly those nine characters.
2. Double-check to make sure that Types.v is in the same directory as Intro.v.
3. Check to make sure that you didn't change any filenames. Downloading things multiple times will cause little (1)s to appear in filenames, and that will break things.
If you have trouble (e.g., if you get complaints about missing identifiers later in the file), it may be because the "load path" for Coq is not set up correctly. The Print LoadPath. command may be helpful in sorting out such issues.
In particular, if you see a message like
Compiled library Types makes inconsistent assumptions over library Coq.Init.Intro
you should check whether you have multiple installations of Coq on your machine. If so, it may be that commands (like coqc) that you execute in a terminal window are getting a different version of Coq than commands executed by CoqIDE.
If you're feeling fancy, you can run Compile > Make makefile whenever you add a new file to the CS54 folder; then, Compile > Make will recompile just the files that have changed. You'll need the program make installed for this. On Mac OS X, it comes with the Xcode command line tools (in a Terminal window, run xcode-select --install) and on Windows I suggest installing the Chocolatey package manager and then choco install make.
An even better option is to eschew CoqIDE and use the text editor emacs with the Proof General plugin, which gives you a very fancy environment including automatic recompilation of changed files. Chat with Prof Osborn if you're curious about that!

Compound Types

The types we have defined so far are examples of "enumerated types": their definitions explicitly enumerate a finite set of elements, each of which is just a bare constructor. Here is a more interesting type definition, where one of the constructors takes an argument:
Inductive rgb : Type :=
  | red
  | green
  | blue.

Inductive color : Type :=
  | black
  | white
  | primary (p : rgb).
Let's look at this in a little more detail.
Every inductively defined type (day, bool, rgb, color, etc.) contains a set of constructor expressions built from constructors like red, primary, true, false, monday, etc. The definitions of rgb and color say how expressions in the sets rgb and color can be built:
  • red, green, and blue are the constructors of rgb;
  • black, white, and primary are the constructors of color;
  • the expression red belongs to the set rgb, as do the expressions green and blue;
  • the expressions black and white belong to the set color;
  • if p is an expression belonging to the set rgb, then primary p (pronounced "the constructor primary applied to the argument p") is an expression belonging to the set color; and
  • expressions formed in these ways are the only ones belonging to the sets rgb and color.
We can define functions on colors using pattern matching just as we have done for day and bool.
Definition monochrome (c : color) : bool :=
  match c with
  | blacktrue
  | whitetrue
  | primary pfalse
  end.
Since the primary constructor takes an argument, a pattern matching primary should include either a variable (as above) or a constant of appropriate type (as below). How does monochrome white evaluate? We can write out the computation like so, simulating the way Coq works:

    monochrome white
    computes to

    match white with
     | blacktrue
     | whitetrue
     | primary pfalse
     end
Coq considers each case in turn, comparing the scrutinee (here, white) against the patterns (here, black, white, and primary p).
The scrutinee will always be a value; the pattern will be a mix of constructors and variable names. The match construct tries to match the scrutinee to each pattern, where a match is when every constructor is matched with an identical constructor or a variable. Remember that constructors build data, while patterns break data apart. Put another way, you build data values from constructors and existing values (maybe stored in variables), and you break down data by taking a value and matching bits of it into constructors and naming new bindings (variables). Just like you can build deeply nested data, you can use deeply nested patterns to examine them.
Concretely, the scrutinee white doesn't match the pattern black, because white and black are different constructors. So Coq will skip that case, and keep evaluating:

     match white with
     | blacktrue
     | whitetrue
     | primary pfalse
     end
    computes to

     match white with
     | whitetrue
     | primary pfalse
     end
We're being particularly careful with our computation steps---as we gain facility with Coq, we won't need to step each case. But here it helps us see that each pattern is considered in turn, from top to bottom.
The next pattern is white, which matches our scrutinee white, so our pattern matches and we can execute the code in that branch of the match:

     match white with
     | whitetrue
     | primary pfalse
     end
    computes to

    true In this case, there were no variables in the match to keep track of, but in general, Coq will bind each variable in a pattern to the corresponding part of the scrutinee. For example, suppose we had run monochrome (primary blue):

     monochrome (primary blue)
    computes to

     match (primary blue) with
     | blacktrue
     | whitetrue
     | primary pfalse
     end
    computes to

     match (primary blue) with
     | whitetrue
     | primary pfalse
     end
    computes to

     match (primary blue) with
     | primary pfalse
     end
At this point, the argument to the primary constructor, blue, in the scrutinee corresponds to the variable p in the pattern primary p. So what Coq will do is evaluate the corresponding branch of the match, remember that p is equal to blue. (In this case, p doesn't occur at all on the right-hand side of , so it doesn't matter. But it could!)

     match (primary blue) with
     | primary pfalse
     end
     computes to

     false (* with p bound to blue *)
Definition isred (c : color) : bool :=
  match c with
  | blackfalse
  | whitefalse
  | primary redtrue
  | primary _false
  end.
The pattern primary _ here is shorthand for "primary applied to any rgb constructor except red." Recall again that Coq applies patterns in order. For example:

     isred (primary green)
computes to

     match (primary green) with
     | blackfalse
     | whitefalse
     | primary redtrue
     | primary _false
computes to

     match (primary green) with
     | whitefalse
     | primary redtrue
     | primary _false
computes to

     match (primary green) with
     | primary redtrue
     | primary _false
At this point, our scrutinee primary green doesn't match the pattern primary red: while the outermost constructors are both primary, the arguments differ: green and red are different constructors of color. So Coq will skip this pattern:

     match (primary green) with
     | primary redtrue
     | primary _false
computes to

     match (primary green) with
     | primary _false
computes to

     false In this case, the wildcard pattern _ has the same effect as the dummy pattern variable p in the definition of monochrome. Both functions don't actually use the argument to primary. We could have written isred a different way:
Definition isred' (c : color) : bool :=
  match c with
  | blackfalse
  | whitefalse
  | primary p
    match p with
    | redtrue
    | _false
    end
  end.
In this definition, we explicitly name the primary color we're working with and do a nested pattern match. We can compute isred' (primary green) as follows:

     isred' (primary green)
computes to

     match (primary green) with
     | blackfalse
     | whitefalse
     | primary p
       match p with
       | redtrue
       | _false
       end
     end
computes to

     match (primary green) with
     | whitefalse
     | primary p
       match p with
       | redtrue
       | _false
       end
     end
computes to

     match (primary green) with
     | primary p
       match p with
       | redtrue
       | _false
       end
     end
computes to

     match green with
     | redtrue
     | _false
     end
computes to

     match green with
     | _false
     end
computes to

    false It's good practice to use wildcards when you don't need to name the variable---it helps prevent mistakes, like referring to the wrong variable. Do keep in mind, though: patterns are matched top to bottom. An early wildcard may rule out later cases! Coq should warn you if this is the case.
It's also good practice to try to condense pattern matching: the definition of isred is cleaner than that of isred'.

Exercise: 1 star, standard (is_weekday')

Define a function is_weekday' that behaves like is_weekday but has fewer than four cases. Remember: the order of cases matters!
Definition is_weekday' (d:day) : bool (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Modules

Coq provides a module system, to aid in organizing large developments. In this course we won't need most of its features, but one is useful: If we enclose a collection of declarations between Module X and End X markers, then, in the remainder of the file after the End, these definitions are referred to by names like X.foo instead of just foo. We will use this feature to introduce the definition of the type nat in an inner module so that it does not interfere with the one from the standard library (which we want to use in the rest because it comes with a tiny bit of convenient special notation).
Module NatPlayground.

Numbers

In this section, we'll define the natural numbers, that is, the numbers:
0, 1, 2, 3, ...
You may have heard in a math class that 0 isn't a natural number---for our purposes, it will be. Computer scientists almost always start counting from 0!
In other programming languages, it's easy to take numbers for granted: int is simply a built-in type, represented in some low-level way on the computer (typically binary).
But in Coq we're building everything from scratch. The representation we'll be using is called unary, which is an odd sort of way of writing numbers. You can think of it like tick marks, so the number zero is represented by no tick marks, the number one is one tick mark, the number two is two tick marks, etc.
That is, a unary natural number is either:
  • zero (i.e., no tick marks), or
  • a natural number followed by another tick mark.
Notice that our definition of natural numbers is in terms of natural numbers themselves---that is, to allow the rules describing the natural numbers are inductive.
Here's the above definition translated into Coq:
Inductive nat : Type :=
  | O
  | S (n : nat).
The clauses of this definition can be read:
  • O is a natural number (note that this is the letter "O," not the numeral "0").
  • S can be put in front of a natural number to yield another one -- if n is a natural number, then S n is too.
Again, let's look at this in a little more detail. The definition of nat says how expressions in the set nat can be built:
  • O and S are constructors;
  • the expression O belongs to the set nat;
  • if n is an expression belonging to the set nat, then S n is also an expression belonging to the set nat; and
  • expressions formed in these two ways are the only ones belonging to the set nat.
The same rules apply for our definitions of day, bool, color, etc.
The above conditions are the precise force of the Inductive declaration. They imply that the expression O, the expression S O, the expression S (S O), the expression S (S (S O)), and so on all belong to the set nat, while other expressions built from data constructors, like true, andb true false, S (S false), and O (O (O S)) do not.
A critical point here is that what we've done so far is just to define a representation of numbers: a way of writing them down. The names O and S are arbitrary, and at this point they have no special meaning -- they are just two different marks that we can use to write down numbers (together with a rule that says any nat will be written as some string of S marks followed by an O). If we like, we can write essentially the same definition this way:
Inductive nat' : Type :=
  | stop
  | tick (foo : nat').
The interpretation of these marks comes from how we use them to compute. We use O and S rather than these longer names because (a) concision is nice, (b) we'll be using nat quite a bit, and (c) that's how the Coq standard library defines them.
We interpret by writing functions that pattern match on representations of natural numbers just as we did above with booleans and days -- for example, here is the predecessor function, which takes a given number and returns one less than that number, i.e.,
pred (S n) should compute to n
For example, pred (S O) computes to O, i.e., 0 is the predecessor of 1.
Definition pred (n : nat) : nat :=
  match n with
    | OO
    | S n'n'
  end.
Our definition has an unfortunate property: pred O computes to O, i.e., 0 is its own predecessor. A mathematician might take a different course of action, saying that pred O is undefined---there really is no natural number that comes before 0, since 0 is the first one! Coq won't let us leave anything undefined, though, so we'll simply have to be careful to remember that pred O = O.
The second branch can be read: "if n has the form S n' for some n', then return n'."
Because natural numbers are such a pervasive form of data, Coq provides a tiny bit of built-in magic for parsing and printing them: ordinary arabic numerals can be used as an alternative to the "unary" notation defined by the constructors S and O. Coq prints numbers in arabic form by default:
Check (S (S (S (S O)))).
  (* ===> 4 : nat *)

Definition minustwo (n : nat) : nat :=
  match n with
    | OO
    | S OO
    | S (S n') ⇒ n'
  end.

Compute (minustwo 4).
  (* ===> 2 : nat *)
The constructor S has the type nat nat, just like pred and functions like minustwo:
Check S.
Check pred.
Check minustwo.
These are all things that can be applied to a number to yield a number. However, there is a fundamental difference between the first one and the other two: functions like pred and minustwo come with computation rules -- e.g., the definition of pred says that pred 2 can be simplified to 1 -- while the definition of S has no such behavior attached. Although it is like a function in the sense that it can be applied to an argument, it does not do anything at all! It is just a way of writing down numbers. (Think about standard arabic numerals: the numeral 1 is not a computation; it's a piece of data. When we write 111 to mean the number one hundred and eleven, we are using 1, three times, to write down a concrete representation of a number.)
For most function definitions over numbers, just pattern matching is not enough: we also need recursion, i.e., functions that refer to themselves. For example, to check that a number n is even, we may need to recursively check whether n-2 is even. To write such functions, we use the keyword Fixpoint.
In your prior programming experience, you may not have spent a lot of thought on recursion. Many languages use loops (using words like while or for) rather than recursion. Coq doesn't have loops---only recursion. Don't worry if you're not particularly familiar with recursion... you'll get lots of practice!
Fixpoint evenb (n:nat) : bool :=
  match n with
  | Otrue
  | S Ofalse
  | S (S n') ⇒ evenb n'
  end.
The call to evenb n' is a recursive call. One might worry: is it okay to define a function in terms of itself? How do we know that evenb terminates?
We can check an example: is 3 even?

     evenb 3
is the same as

     evenb (S (S (S O)))
computes to

     match (S (S (S O))) with
     | Otrue
     | S Ofalse
     | S (S n') ⇒ evenb n'
     end
computes to

     match (S (S (S O))) with
     | S Ofalse
     | S (S n') ⇒ evenb n'
     end
computes to

     match (S (S (S O))) with
     | S (S n') ⇒ evenb n'
     end
computes to

     evenb n' (* with n' bound to S O *)
is the same as

     evenb (S 0)
computes to

     match (S O) with
     | Otrue
     | S Ofalse
     | S (S n') ⇒ evenb n'
     end
computes to

     match (S O) with
     | S Ofalse
     | S (S n') ⇒ evenb n'
     end
computes to

    false More generally, evenb only ever makes recursive calls on smaller inputs: if we put in the number n, we'll make a recursive call on two less than n.
A surprising fact: Coq will only let us write functions that are guaranteed to terminate on all inputs. A good rule of thumb is that every recursive function should make a recursive call only on subparts of its input.
We can define oddb by a similar Fixpoint declaration, but here is a simpler definition:
Definition oddb (n:nat) : bool := negb (evenb n).

Compute (oddb 1).
Compute (oddb 4).

Comparisons

When we say that Coq comes with almost nothing built-in, we really mean it: even equality testing for numbers is a user-defined operation! We now define a function eqb, which tests natural numbers for equality, yielding a boolean. Note the use of nested matches.
Fixpoint eqb (n m : nat) : bool :=
  match n with
  | Omatch m with
         | Otrue
         | S m'false
         end
  | S n'match m with
            | Ofalse
            | S m'eqb n' m'
            end
  end.
The leb function tests whether its first argument is less than or equal to its second argument, yielding a boolean.

Exercise: 2 stars, standard (leb)

Fixpoint leb (n m : nat) : bool (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
(* Do not modify the following line: *)
Definition manual_grade_for_test_leb : option (nat×string) := None.
Compute (leb 2 2).
Compute (leb 2 4).
Compute (leb 4 2).

Exercise: 1 star, standard (ltb)

The ltb function tests natural numbers for less-than, yielding a boolean. Instead of making up a new Fixpoint for this one, define it in terms of a previously defined function.
For full points, don't make any calls to eqb or negb!
Definition ltb (n m : nat) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
(* Do not modify the following line: *)
Definition manual_grade_for_ltb : option (nat×string) := None.
Compute (ltb 2 2).
Compute (ltb 2 4).
Compute (ltb 4 2).
We can define a notion of minimum.
Definition min (n1 n2 : nat) : nat :=
  if leb n1 n2
  then n1
  else n2.
We can lift this definition to talk about three things. Here's a lifting via a total rewrite:
Definition min3 (n1 n2 n3 : nat) : nat :=
  if leb n1 n2
  then if leb n1 n3
       then n1
       else n3
  else if leb n2 n3
       then n2
       else n3.
And here's how we'd reuse the existing definition. Do you suppose this is equivalent to min (min n1 n2) n3?
Definition min3' (n1 n2 n3 : nat) : nat :=
  min n1 (min n2 n3).

Functions as Data

Like most modern programming languages -- especially other "functional" languages, including OCaml, Haskell, Racket, Scala, Clojure, etc. -- Coq treats functions as first-class citizens, allowing them to be passed as arguments to other functions, returned as results, stored in data structures, etc.

Higher-Order Functions

Functions that manipulate other functions are often called higher-order functions. Here's a simple one:
Definition twice (f:natnat) (n:nat) : nat :=
  f (f n).
The argument f here is itself a function (from X to X); the body of twice applies f twice to some value n. More precisely, it applies f to n and then applies f to the result of that call.
Check twice : (nat nat) nat nat.

Compute (twice minustwo 9). (* ==> 5 *)
We can also define unnamed or "anonymous" functions to use in places where a function is expected with the fun keyword:
Compute (twice (fun nS n) 2). (* ==> 4 *)
Note that fun n S n is morally equivalent to just S, since it takes one argument and passes it right along:
Compute (twice S 2). (* ==> 4 *)
We can construct all kinds of functions "on the fly" without declaring them at the top level or giving them names.
Check (fun nn × n). (* ==> nat -> nat *)
The expression (fun n n + n) can be read as "the function that, given a number n, yields n + n", i.e., a function that doubles its input.
Compute (twice (fun nn + n) 2). (* ==> 8 *)
Work out on paper how twice (fun n n + n) 2 evaluate. Here are the first few steps:

   twice (fun nn + n) 2
]] =

[[
   (fun nn + n) ((fun nn + n) 2)
]] =

[[
   (fun nn + n) (2 + 2)
Pick up from here. Note: there's more than one way to do this, but you should get the same answer no matter what.
Here's a (debatably) natural example of using a function to pick a good day.
Definition favorite_day (f: day bool) : day :=
  if f monday then monday
  else if f tuesday then tuesday
       else if f wednesday then wednesday
            else if f thursday then thursday
                 else if f friday then friday
                      else if f saturday then saturday
                           else sunday.
You can give it any day-picking function you like and it gives you the first day which matches your preference (or else sunday if none do):
Compute favorite_day is_weekday. (* ===> monday *)
Compute favorite_day
        (fun dmatch d with
                  | tuesdaytrue
                  | _false
                  end). (* ===> tuesday *)
Let's think back to min. min only works on numbers, so using it can be a little tedious---and sometimes not really what we want. For example, if we want to find the least unhealthy thing on a fast food menu, we could convert the food items to calorie counts or something and then take the min of those; but then we'd have to figure out which food item had which calorie count, which is awkward. Instead, we can define a new function argmin, which yields the least of its arguments according to a given cost function.
This next definition uses the let ... in form to define a local variable; Definition and Fixpoint give you new top-level definitions, but let is how you get new local definitions. (Well, each branch of a match has its own definitions, but that's not exactly the same!)
The general syntax is:
    let NAME : TYPE := EXPR in
    EXPR
Observe that let ... in is equivalent to a single-arm match:

    match EXPR with
    | NAMEEXPR
    end
Definition argmin (cost : nat nat) (o1 o2 : nat) : nat :=
  let c1 := cost o1 in
  let c2 := cost o2 in
  if leb c1 c2
  then o1
  else o2.

Compute argmin (fun nif evenb n then Nat.pow n 2 else Nat.pow 2 n) 8 5. (* ===> 5, since 8^2 = 64 and 2^5 = 32 *)
Compute argmin (fun nif evenb n then Nat.pow n 2 else Nat.pow 2 n) 7 8. (* ===> 8, since 8^2 = 64 and 2^7 = 128 *)
As with min, we can lift it either by rewriting it to take three arguments:
Definition argmin3 (cost : nat nat) (o1 o2 o3 : nat) : nat :=
  let c1 := cost o1 in
  let c2 := cost o2 in
  let c3 := cost o3 in
  if leb c1 c2
  then if leb c1 c3
       then o1
       else o3
  else if leb c2 c3
       then o2
       else o3.
Or by using the definition we had lying around already:
Definition argmin3' (cost : nat nat) (o1 o2 o3 : nat) : nat :=
  argmin cost o1 (argmin cost o2 o3).

Exercise: 1 star, standard (mirror)

Now you try: write a higher order function mirror that takes a (nat nat nat) function like plus or minus as an input, and also takes a nat and returns a nat, returning the result of applying the function with the given nat for both arguments.
We've supplied two tests for you to check against.
(* Compute mirror plus 4. (* ===> 8 *) *)
(* Compute mirror minus 7. (* ===> 0 *) *)

(* FILL IN HERE *)

(* Do not modify the following line: *)
Definition manual_grade_for_mirror : option (nat×string) := None.

Functions That Construct Functions

Most of the higher-order functions we have talked about so far take functions as arguments. Let's look at some examples that involve returning functions as the results of other functions. To begin, here is a function that takes a value x (drawn from some type X) and returns a function from nat to X that yields x whenever it is called, ignoring its nat argument.
Definition constfun (x: nat) : natnat :=
  fun (k:nat) ⇒ x.

Definition f7 := constfun 7.

Compute f7 0. (* ===> 7 *)

Compute (constfun 5) 99. (* ===> 5 *)

Definition flip (f: nat bool) : nat bool :=
  fun (x:nat) ⇒ negb (f x).

Compute (flip evenb) 5. (* ===> true *)

Definition mirror' (f: nat nat nat) : nat nat :=
  fun (x:nat) ⇒ (f x x).
In fact, the multiple-argument functions we have already seen are also examples of passing functions as data. To see why, recall the type of plus.
Check plus : nat nat nat.
Each in this expression is actually a binary operator on types. This operator is right-associative, so the type of plus is really a shorthand for nat (nat nat) -- i.e., it can be read as saying that "plus is a one-argument function that takes a nat and returns a one-argument function that takes another nat and returns a nat." In the examples above, we have always applied plus to both of its arguments at once, but if we like we can supply just the first. This is called partial application.
Definition plus3 := plus 3.
Check plus3 : nat nat.

Compute (plus3 4). (* ==> 7 *)
Check twice : (nat nat) nat nat.
Check (twice plus3) : nat nat.
Compute (twice plus3 0). (* ==> 6 *)
Compute (twice (plus 3) 0). (* ==> 6 *)

Exercise: 1 star, standard (partial_app_minus)

Partial application is a powerful tool, used frequently by functional programmers. It can be very confusing at first, though. Explain what answer you might expect just from reading the code. Why do we get the answer we do?
Hint: Try showing every step of the computation.
Compute (twice (minus 1) 10).

(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_partial_app_minus : option (nat×string) := None.
(* 2022-01-14 16:11 *)