(** * Intro: Functional Programming in Coq *)

(** The [Require Export] statement on the next line tells Coq to use
    the [String] module from the standard library.  We won't really
    be using strings ourselves, but we need to [Require] it here so
    that the grading scripts can use it for internal purposes. *)
From Coq Require Export String.

(* REMINDER:

          #####################################################
          ###  PLEASE DO NOT DISTRIBUTE SOLUTIONS PUBLICLY  ###
          #####################################################

   (See the [Preface] for why.)
*)

(* ################################################################# *)
(** * Introduction *)

(** This course is about two things:

    - functional programming, and
    - (inductive) proof.

    The primary goal of this course is to get you "thinking like a
    computer scientist": how to structure code, how to think about
    what code does, and how to justify your beliefs with proof.

    The course has three parts:

    - a functional programming part in Coq (YOU ARE HERE),
    - a formal proof part in Coq, and
    - an informal proof part on paper.

    In the first part of the course, we'll use Coq's (French for
    'rooster') functional programming language Gallina (Spanish for
    'hen') to write programs. In the second part of the course,
    we'll use Coq's unnamed tactic language to learn to write proofs.
    In the third and final part of the course, we'll adapt what we've
    learned to write proofs on paper.
*)

(** This course is unconventional. Most computer science departments
    simply teach a "discrete math" course that more or less resembles
    the third part of this course. There are two reasons we teach
    this funny course at Pomona College.

    - Some students skip the intro course (CS051), and we want
      everyone to have the same programming fundamentals. Teaching
      functional programming is a good way to achieve that.

    - A lot of what people teach in discrete math courses isn't that
      relevant to many computer scientists. We'd rather focus on the
      parts that are most important.

    It's also unusual (but not unheard of) to teach undergraduates
    using Coq, a powerful tool not often encountered until graduate
    school. First, we trust you to handle this difficult
    material. Second, Coq is a critical part of this course's
    "middle": formal proof about functional programs you've
    written. We think that Coq makes an excellent tutor, making sure
    you follow the rules when you're learning how proof works. You'll
    be better at paper proofs after having learned formal proof in
    Coq.
*)

(* ================================================================= *)
(** ** What is functional programming? *)

(** The functional programming style is founded on simple, everyday
    mathematical intuition: If a procedure or method has no side
    effects, then (ignoring efficiency) all we need to understand
    about it is how it maps inputs to outputs -- that is, we can think
    of it as just a concrete method for computing a mathematical
    function. This is one sense of the word "functional" in
    "functional programming." The direct connection between programs
    and simple mathematical objects supports both formal correctness
    proofs and sound informal reasoning about program behavior.

    The other sense in which functional programming is "functional"
    is that it emphasizes the use of functions (or methods) as
    _first-class_ values -- i.e., values that can be passed as
    arguments to other functions, returned as results, included in
    data structures, etc. The recognition that functions can be
    treated as data gives rise to a host of useful and powerful
    programming idioms.

    Finally, a third sense of "functional" is that... it works!
    Functional programming rules out a variety of bugs that can occur
    in imperative programming. Learning a functional programming
    language will help you think clearly about programming... in any
    language.

    Other common features of functional languages include _algebraic
    data types_ and _pattern matching_, which make it easy to
    construct and manipulate rich data structures, and sophisticated
    _polymorphic type systems_ supporting abstraction and code reuse.
    Coq offers all of these features.

    The first half of this chapter introduces the most essential
    elements of Coq's functional programming language, called
    _Gallina_. The second half introduces some basic _tactics_ that
    can be used to prove properties of Coq programs. *)

(* ================================================================= *)
(** ** Getting the tools in order *)

(** We'll be using Emacs and Proof General throughout the course.
    You'll need to:

    1. Download and install Coq:
       - [https://coq.inria.fr/download]

       On macOS, it should install into
       [/Applications/CoqIDE_8.12.0]. On Windows, it should install
       into [C:\Coq].

    2. Download and install Emacs:
       - Windows:
         [http://mirrors.ocf.berkeley.edu/gnu/emacs/windows/emacs-26/emacs-26.3-x86_64.zip]

         I recommend extracting the zipfile into
         [C:\Program Files\Emacs]; then add a shortcut to
         [C:\Program Files\Emacs\bin\runemacs.exe] on your Desktop
         and maybe pin it to your taskbar.

       - macOS: [https://emacsformacosx.com/] (or `brew cask install emacs`)

       - Linux: `apt install emacs` (depending on distro)

    3. Install the init.el configuration file for Emacs, which will
       download Proof General automatically.
       - [https://cs.pomona.edu/~michael/courses/csci054f20/downloads/init.el]

       On macOS, you'll need to put the `init.el` file in
       `~/.emacs.d/`. If `init.el` is in your `Downloads` folder,
       run the command
       `mkdir ~/.emacs.d/; mv ~/Downloads/init.el ~/.emacs.d`.

       On Windows, you'll need to put the `init.el` file at
       `C:\Users\username\AppData\Roaming\.emacs.d\init.el`. Best to
       copy it by hand in Explorer.

       Working directly with your computer's filesystem may be new to
       you: you may be used to dragging and dropping things in the
       GUI, or using "open recent" or other automatic
       suggestions. Those are all great tools, but programming often
       means getting into the guts of the machine.

    4. Make a directory where you'll keep all of your CS054
       files. Don't create a subdirectory for each assignment, since
       each one depends on the previous ones. You need to create a
       [_CoqProject] file in that directory. There are two ways to do
       this.

       a. Download
          [https://cs.pomona.edu/~michael/courses/csci054f20/downloads/_CoqProject]
          and put it in the directory you want. Be attentive: your OS
          may not be happy with a file without an extension! If
          you're using Windows, we recommend enabling "developer
          mode" and having the Explorer show file extensions.

       b. Make it yourself. With Emacs open to this file, create it
          manually. Type [C-x C-f _CoqProject RET] to open/create the
          file in your current directory. Then type [-Q . DMFP] as
          the contents, then save the file with [C-x C-s].

    Once you've gotten all the software installed and set up, fire up
    Emacs. You need to check that everything is hunky-dory:

    1. Run the Emacs tutorial. You can get there by pressing [C-h t],
       i.e., press control and h at the same time, let go, then press
       the letter t.

    2. Once you've learned the basic Emacs ropes, open up this file
       in Emacs with [C-x C-f PATH/TO/Day01_intro.v]. It might take
       some time to get used to this way of working! (You can also
       drag and drop or use the menu bar.)

    3. Double check that you can compile the file. Go to the very end
       with [M->] and ask Coq to check everything with [C-c RET],
       where [RET] means the enter or return key.

    It's not as nice as the tutorial, but typing [C-h m] when you
    have a Coq file open will show you help for your current
    'mode'. There's also documentation online at
    https://proofgeneral.github.io/doc/master/userman/, but it's
    written for a more experienced audience.
*)

(** Everything working?
    If not, contact Prof. Greenberg or a TA. If so, great... let's
    get started! *)

(* ################################################################# *)
(** * Data and Functions *)

(* ================================================================= *)
(** ** Enumerated Types *)

(** One notable aspect of Coq is that its set of built-in
    features is _extremely_ small. For example, instead of providing
    the usual palette of atomic data types (booleans, integers,
    strings, etc.), Coq offers a powerful mechanism for defining new
    data types from scratch, with all these familiar types as
    instances.

    Naturally, the Coq distribution comes preloaded with an extensive
    standard library providing definitions of booleans, numbers, and
    many common data structures like lists and so on. But there is
    nothing magic or primitive about these library definitions. To
    illustrate this, we will explicitly recapitulate all the
    definitions we need in this course, rather than just getting them
    implicitly from the library. Later on, when we're doing proofs,
    we'll mostly use the library definitions. *)

(* ================================================================= *)
(** ** Days of the Week *)

(** To see how this definition mechanism works, let's start with
    a very simple example. The following declaration tells Coq that
    we are defining a new set of data values -- a _type_. *)

Inductive day : Type :=
  | monday
  | tuesday
  | wednesday
  | thursday
  | friday
  | saturday
  | sunday.

(** The type is called [day], and its members are [monday],
    [tuesday], etc.

    Having defined [day], we can write functions that operate on
    days. *)

Definition next_weekday (d:day) : day :=
  match d with
  | monday    => tuesday
  | tuesday   => wednesday
  | wednesday => thursday
  | thursday  => friday
  | friday    => monday
  | saturday  => monday
  | sunday    => monday
  end.

(** One thing to note is that the argument and return types of
    this function are explicitly declared.
    Like most functional programming languages, Coq can often figure
    out these types for itself when they are not given explicitly --
    i.e., it can do _type inference_ -- but we'll generally include
    them to make reading easier. *)

(** Having defined a function, we should check that it works on
    some examples. There are several different ways to check your
    work in Coq. Later on, we'll prove our work correct! For now, we
    can use the [Check] command to type check an expression and the
    [Compute] command to evaluate an expression involving
    [next_weekday]. *)

Check friday.
(* ==> friday : day *)

Check (next_weekday friday).
(* ==> next_weekday friday : day *)

Compute (next_weekday friday).
(* ==> monday : day *)

Compute (next_weekday (next_weekday saturday)).
(* ==> tuesday : day *)

(** (We show Coq's responses in comments, but, if you have a
    computer handy, this would be an excellent moment to fire up the
    Coq interpreter in Emacs (with Proof General) and try this for
    yourself. Load this file, [Day01_intro.v], from the book's Coq
    sources, find the above example, submit it to Coq, and observe
    the result.)

    We can ask Coq to _extract_, from our [Definition], a program in
    some other, more conventional, programming language (OCaml,
    Scheme, or Haskell) with a high-performance compiler. This
    facility is very interesting, since it gives us a way to go from
    proved-correct algorithms written in Gallina to efficient machine
    code. (Of course, we are trusting the correctness of the
    OCaml/Haskell/Scheme compiler, and of Coq's extraction facility
    itself, but this is still a big step forward from the way most
    software is developed today.) Indeed, this is one of the main
    uses for which Coq was developed. We won't talk about extraction
    any further in this course. *)

(* ================================================================= *)
(** ** Homework Submission Guidelines *)

(** If you are using Software Foundations in a course, your
    instructor may use automatic scripts to help grade your homework
    assignments.
    In order for these scripts to work correctly (so that you get
    full credit for your work!), please be careful to follow these
    rules:

      - The grading scripts work by extracting marked regions of the
        .v files that you submit. It is therefore important that you
        do not alter the "markup" that delimits exercises: the
        Exercise header, the name of the exercise, the "empty square
        bracket" marker at the end, etc. Please leave this markup
        exactly as you find it.

      - Do not delete exercises. If you skip an exercise (e.g.,
        because it is marked Optional, or because you can't solve
        it), it is OK to leave a partial proof in your .v file, but
        in this case please make sure it ends with [Admitted] (not,
        for example [Abort]).

      - It is fine to use additional definitions (of helper
        functions, useful lemmas, etc.) in your solutions. You can
        put these between the exercise header and the theorem you are
        asked to prove.

      - As we work our way through the files, keep in mind that we'll
        grade you in terms of _our_ old definitions, not yours. If
        you want to use a helper function from an earlier file in a
        later one, be sure to copy it over.
*)

(* ================================================================= *)
(** ** Booleans *)

(** In a similar way, we can define the standard type [bool] of
    booleans, with members [true] and [false]. *)

Inductive bool : Type :=
  | true
  | false.

(** Although we are rolling our own booleans here for the sake
    of building up everything from scratch, Coq does, of course,
    provide a default implementation of the booleans, together with a
    multitude of useful functions and lemmas. (Take a look at
    [Coq.Init.Datatypes] in the Coq library documentation if you're
    interested.) Whenever possible, we'll name our own definitions
    and theorems so that they exactly coincide with the ones in the
    standard library.

    Functions over booleans can be defined in the same way as above.
    First, we can use booleans to define a _predicate_, a function
    that identifies some _subset_ of a given set: *)

Definition is_weekday (d:day) : bool :=
  match d with
  | monday    => true
  | tuesday   => true
  | wednesday => true
  | thursday  => true
  | friday    => true
  | saturday  => false
  | sunday    => false
  end.

(** We can also define some of the usual operations on booleans.
    First comes _not_ or _negation_, which is often written as the
    operator [!]. *)

Definition negb (b:bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

(** Coq also lets you use conventional [if]/[then]/[else]
    notation for booleans, as in: *)

Definition negb' (b:bool) : bool :=
  if b then false else true.

(** Every other datatype will need you to use [match], though!

    Depending on other languages you've learned, you may have seen a
    "one-armed [if]" before. You can't do that in Coq---every
    expression must return a value, and a missing [else] branch would
    leave Coq wondering what to return. *)

(** Another common way of expressing functions from booleans to
    booleans is with a _truth table_.

    |---|--------|
    | b | negb b |
    |---|--------|
    | T |   F    |
    | F |   T    |
    |---|--------|
*)

(** Each column of the truth table represents an expression of
    type bool. Here the first column represents an arbitrary input b,
    which can be [true] (written [T]) or [false] (written [F]). It's
    typical to consider the initial columns of a truth table as
    representing inputs and the final column as representing an
    output.

    Each row of the truth table gives a possible assignment: you can
    read the first row as saying that if [b = true], then
    [negb b = false]; the second row says that if [b = false], then
    [negb b = true]. *)

Definition andb (b1:bool) (b2:bool) : bool :=
  match b1 with
  | true => b2
  | false => false
  end.

(** When constructing a truth table with more than one input,
    it's important to make sure your truth table has every possible
    input configuration accounted for.
    People have different ways of doing so, but I tend to like the
    following format, where we exhaust all of the possibilities for
    the first column to be true, and then we consider the cases where
    the first column is false. Electrical engineers, however, like to
    do it the opposite way: when false is 0 and true is 1, it makes
    sense to count "up". It doesn't _particularly_ matter which
    method you choose, but it's important to be consistent! *)

(**
    |----|----|------------|
    | b1 | b2 | andb b1 b2 |
    |----|----|------------|
    | T  | T  |     T      |
    | T  | F  |     F      |
    | F  | T  |     F      |
    | F  | F  |     F      |
    |----|----|------------|
*)

Definition orb (b1:bool) (b2:bool) : bool :=
  match b1 with
  | true => true
  | false => b2
  end.

(**
    |----|----|-----------|
    | b1 | b2 | orb b1 b2 |
    |----|----|-----------|
    | T  | T  |     T     |
    | T  | F  |     T     |
    | F  | T  |     T     |
    | F  | F  |     F     |
    |----|----|-----------|
*)

(** The last two of these definitions illustrate Coq's syntax
    for multi-argument function definitions. The corresponding
    multi-argument application syntax is illustrated by the following
    "unit tests," which constitute a complete specification -- a
    truth table -- for the [orb] function: *)

Compute (orb true  true ).
Compute (orb true  false).
Compute (orb false true ).
Compute (orb false false).

(** We can also introduce some familiar syntax for the boolean
    operations we have just defined. The [Notation] command defines a
    new symbolic notation for an existing definition. *)

Notation "x && y" := (andb x y).
Notation "x || y" := (orb x y).

(** _A note on notation_: In [.v] files, we use square brackets
    to delimit fragments of Coq code within comments; this
    convention, also used by the [coqdoc] documentation tool, keeps
    them visually separate from the surrounding text. In the html
    version of the files, these pieces of text appear in a
    [different font].
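
    Returning to the [Notation] command for a moment: the checks
    below are a small added sketch, not part of the course's own
    tests. They rely only on the [andb] and [orb] definitions and
    the [&&] and [||] notations introduced above, and they assume
    nothing else has redefined those symbols. The idea is that
    [true && false] is just another way of writing
    [andb true false]. *)

Compute (true && false).      (* ==> false : bool *)
Compute (andb true false).    (* ==> false : bool *)
Compute (false || true).      (* ==> true : bool *)
Compute (orb false true).     (* ==> true : bool *)

(** A notation only changes how an expression is written and
    printed, not what it computes, so each infix form above gives
    the same result as the prefix call next to it.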
*)

(** **** Exercise: 1 star, standard (nandb)

    Remove "[Admitted.]" and complete the definition of the following
    function; then make sure that the [Example] assertions below can
    each be verified by Coq. (Remove "[Admitted.]" and fill in each
    proof, following the model of the [orb] tests above.) The
    function should return [true] if either or both of its inputs are
    [false].

    You can use [negb], but please do not use [andb] when you're
    defining this function. *)

Definition nandb (b1:bool) (b2:bool) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

(* Do not modify the following line: *)
Definition manual_grade_for_nandb : option (nat*string) := None.
(** [] *)

(* What's that box symbol? It represents the end of an
   exercise. Since it's in a comment, it doesn't _do_ anything. It
   shouldn't hurt if you remove it, but there's no need to.  *)

Compute (nandb true  true ).
Compute (nandb true  false).
Compute (nandb false true ).
Compute (nandb false false).

(** Truth tables are a particularly nice way of calculating
    compound expressions involving booleans. In addition to having
    input and output columns, we can have intermediate columns
    representing subexpressions of the boolean we're interested
    in. When building such a truth table, _every_ subexpression of
    the final result should show up as a column.

    |---|--------|-----------------|
    | b | negb b | orb b (negb b)  |
    |---|--------|-----------------|
    | T |   F    |        T        |
    | F |   T    |        T        |
    |---|--------|-----------------|
*)

(** **** Exercise: 1 star, standard (impb)

    Write a function [impb] such that [impb b1 b2] has the same truth
    table as [orb (negb b1) b2]. Don't just trivially define it as
    [orb (negb b1) b2], though! Try using a [match]. *)

Definition impb (b1:bool) (b2:bool) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

(* Do not modify the following line: *)
Definition manual_grade_for_impb : option (nat*string) := None.
(** [] *)

(* ================================================================= *)
(** ** Function Types *)

(** Every expression in Coq has a type, describing what sort of
    thing it computes. The [Check] command asks Coq to print the type
    of an expression. *)

Check true.
(* ===> true : bool *)
Check (negb true).
(* ===> negb true : bool *)

(** Functions like [negb] itself are also data values, just like
    [true] and [false]. Their types are called _function types_, and
    they are written with arrows. *)

Check negb.
(* ===> negb : bool -> bool *)

(** The type of [negb], written [bool -> bool] and pronounced
    "[bool] arrow [bool]," can be read, "Given an input of type
    [bool], this function produces an output of type [bool]."
    Similarly, the type of [andb], written [bool -> bool -> bool],
    can be read, "Given two inputs, both of type [bool], this
    function produces an output of type [bool]." *)

Check orb.
(* ===> orb : bool -> bool -> bool *)

(** You might think of the function [orb] as taking two
    arguments, but every function in Coq takes one argument at a
    time. Each argument gets its own arrow. You can think of this as
    saying that [orb] is a function that takes a boolean and returns
    _another_ function that takes _another_ boolean... and that
    returns a boolean. Look: *)

Check (orb true).
(* ===> orb true : bool -> bool *)
Check (orb true false).
(* ===> true || false : bool *)

(** Function types are right associative, i.e., parentheses go
    on the right, i.e., [A -> B -> C] is the same as
    [A -> (B -> C)].

    Function application is left associative, i.e., parentheses go
    on the left, i.e., [a b c] is the same as [(a b) c]. *)

Check ((orb true) false).
(* ===> true || false : bool *)

(** The [Fail] prefix says that we expect a command to _not_
    work. It's useful for examples like this! *)

Fail Check (orb (true false)).
(* ===> The command has indeed failed with message:
        Illegal application (Non-functional construction):
        The expression "true" of type "bool"
        cannot be applied to the term
         "false" : "bool" *)

(* ================================================================= *)
(** ** Case study: DNA nucleotides *)

(** We'll use DNA processing as a running example throughout the
    course. Bioinformatics---the application of computational
    techniques to biological data---has richly blossomed over the
    last thirty years, and we'll only skim the surface. *)

(** We can start by defining the types of nucleotides: Cytosine,
    Guanine, Adenine, and Thymine. *)

Inductive base : Type :=
| C  (* cytosine *)
| G  (* guanine *)
| A  (* adenine *)
| T. (* thymine *)

(** DNA has a double helix structure comprising two paired
    strands, where each [C] corresponds to a [G] and each [A]
    corresponds to a [T]. We won't get to defining DNA strands for a
    few weeks, but we can already start thinking about DNA in a more
    detailed way. *)

(** The DNA double helix has 'complementary' structure: if you
    know the bases of one strand, you know the bases of the other.
    We can express this idea with a function that computes the
    complement for a given base. *)

Definition complement (b : base) : base :=
  match b with
  | C => G
  | G => C
  | A => T
  | T => A
  end.

(** **** Exercise: 1 star, standard (xorb)

    Here is the truth table for [xorb] (eXclusive OR on Booleans).

    |----|----|------------|
    | b1 | b2 | xorb b1 b2 |
    |----|----|------------|
    | T  | T  |     F      |
    | T  | F  |     T      |
    | F  | T  |     T      |
    | F  | F  |     F      |
    |----|----|------------|

    Define a function [xorb] that takes two booleans and returns a
    boolean, following the above truth table. *)

(* FILL IN HERE *)

(* Do not modify the following line: *)
Definition manual_grade_for_xorb : option (nat*string) := None.
(** [] *)

(** **** Exercise: 2 stars, standard (is_classday)

    Write a function [is_classday : day -> bool] that returns [true]
    exactly when it's a day we have CS054 (in FA2020, that's MW).
    It should be of type [day -> bool]. *)

(* Do not modify the following line: *)
Definition manual_grade_for_is_classday : option (nat*string) := None.
(** [] *)

(** **** Exercise: 2 stars, standard (eq_base)

    Write a function [eq_base : base -> base -> bool] that returns
    [true] exactly when two bases are equal. *)

Definition eq_base (b1 b2 : base) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
(** [] *)

(* ################################################################# *)
(** * How to succeed in this course *)

(** Here's some advice on how to succeed in this course.

    1. _Read the book._ It's very tempting to just skim things and go
       straight to the homework... resist! We recommend a multi-pass
       approach: read through the chapter but don't fill in any
       homework. Then watch the videos. Interleaving videos and
       problem solving is a good idea---having skimmed the chapter
       first, you should have a clear idea of whether you're properly
       stuck or merely haven't watched the right video yet.

    2. _Don't spin._ It's easy to get stuck in a rut: Coq rejects
       everything you say, so you just try different things for an
       hour. Don't waste your time spinning in place! Set a timer
       for, say, twenty minutes. If you can't make progress that's
       clearly closer to where you need to be, then...

    3. _Ask for help._ It's normal to need help: math and computer
       science are hard, and even more so together. We're here to
       help.
*)

(* Mon Oct 12 08:48:47 PDT 2020 *)