Setssets, cardinality, and countability


From DMFP Require Export IndProp.

Overview

A set is a collection of distinct objects. For writing down concrete sets, it's conventional to use curly braces. We write bool, the set of booleans, as follows:

    bool = { true, false }
We would write the set of single-digit odd numbers as:

    sdos = { 1, 3, 5, 7, 9 }
The empty set is ordinarily written as a circle with a slash from the top-right to the bottom left: ∅.
The objects in a set should be distinct. For example, {1,2,1} isn't a well-formed set, because 1 shows up twice. Simply write {1,2} to refer to the set containing 1 and 2.
We say that true is in the set bool but not the set sdos; similarly, 3 is a member of the set sdos.
Sets and their various operations are useful programmatically and mathematically. To help you get a sense of how sets work, we'll give an axiomatic account of sets in Coq. That is, we'll assume a few basic definitions and explore their consequences.

A word about priorities

At this point in the course, the most important thing for you to focus on is informal proof.
We've continued including a Coq development to allow you to have some formal definitions to follow along, but you should focus your energy on getting intuition by doing informal exercises.
There are many optional Coq exercises in this file. Doing them will definitely help your intuition, but don't get bogged down!
If it helps you prioritize, there are lots of points for informal work and just a few for Coq proofs.

Adding sets to Coq


Module AxiomaticSets.
Ordinarily, we've defined all of our types inductively. The keyword Parameter here means we're not specifying how sets are actually defined--there are no constructors.
Parameter set : Type Type.
In lieu of constructors, we'll assume that there are three relevant sets:
  • The empty set, typically written ∅.
Parameter Empty : {X:Type}, set X.
  • The universal set, which holds every object of type X. Typically called U or Univ.
Parameter Universe : {X:Type}, set X.
  • The set specified by a predicate: given P : X Prop, the set of objects a : X such that P a.
    Such sets are typically constructed with set-builder notation, where {x | P} denotes the set of x such that P.
    Paper use of set-builder notation will sometimes indicate x's type or the set its being drawn from, e.g., {x : nat | k, x = 2*k} specifies the even naturals. On paper, multiple premises will sometimes be separated by a comma, as in {x : nat | k, x = 2 × k, x > 2} denotes the evens counting from 4. A more precise notation would write {x : nat | ( k, x = 2 × k) x > 2}.
Parameter Spec : {X:Type}, (X Prop) set X.
Given our three set formers, there is one key predicate on sets: membership. A set contains its members, as {true,false} was seen to contain true above.
We typically write a X to mean that a is in X. Other phrasing include "is an element of", "belongs to", "is a member of", "is included in". One might also say X contains a or has a as a member.

Parameter Member : {X:Type}, X set X Prop.
If two sets contain the same elements, they are the same set. (Compare to functional extensionality in Logic.v.

Axiom extensionality : X (S1 S2 : set X),
    ( x, Member x S1 Member x S2) S1 = S2.
In our formulation of sets, there can be no ambiguity: each element is either in or not in the set.
That is, we'll just assume that the law of the excluded middle applies to sets.

Axiom inclusion_exclusion : {X} (S:set X) x,
    Member x S ~(Member x S).
Nothing is a member of the empty set--it's empty!

Axiom member_empty : X (x:X),
    ~(Member x Empty).
But everything is a member of the universal set, since it contains everything.

Axiom member_universe : X (x:X),
    Member x Universe.
Finally, an element is a member of a set built with set-builder notation when it satisfies the predicate. Stating this as separate axioms helps Coq do a better job with inference--if we state the iff explicitly, Coq sometimes guesses wrong when we apply the axiom.
Axiom member_spec_P : X (P : X Prop),
     x, Member x (Spec P) P x.

Axiom P_member_spec : X (P : X Prop),
     x, P x Member x (Spec P).
Given these primitives, we can derive common notions of sets. We'll use Spec extensively.
First, the singleton set holds just one element; it's written {x}.

Definition Singleton {X:Type} (x:X) : set X :=
  Spec (fun yx = y).
The union of two sets holds all of those elements in either of those sets; it's written S1 S2.

Definition Union {X:Type} (S1 S2 : set X) : set X :=
  Spec (fun xMember x S1 Member x S2).
The intersection of two sets holds the elements common to both sets; it's written S1 S2.
Definition Intersection {X:Type} (S1 S2 : set X) : set X :=
  Spec (fun xMember x S1 Member x S2).
Note the deliberate similarity between union/intersection (∪/∩) and disjunction/conjunction (∨/∧).
The difference between two sets are those elements in the first set but not the second. There are two notations in common use: S1 \ S2 and S1 - S2.

Definition Difference {X:Type} (S1 S2 : set X) : set X :=
  Spec (fun xMember x S1 ~(Member x S2)).
The complement of a set are those elements not in the set. To have a complement, we need a notion of a "universe" which elements are we to consider when picking those elements not in the set?
Coq's types give us a natural notion: if S : set X, then Complement S consists of those values of type X that weren't in S. So, for example, Complement {true,false} = and the complement of the odd naturals are the even naturals.
The common notation for complement is to draw a line over the set in question. One can also write U - S, where U stands for the universal set. (See complement__universe_difference for a justification.)

Definition Complement {X:Type} (S:set X) : set X :=
  Spec (fun x¬Member x S).
Given these definitions, we can use the axioms to characterize set membership for our various operations.

Exercise: 2 stars, standard, optional (set_properties)

These are some nice properties of sets; proving them shouldn't be too difficult and is nice practice both in Coq and, more importantly, working with sets. Please prove them informally even if you don't want to take the time to prove them in Coq!

Lemma member_singleton : X (x y:X),
    Member x (Singleton y) x = y.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma member_union : X (S1 S2 : set X) (x : X),
    Member x (Union S1 S2) Member x S1 Member x S2.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma member_intersection : X (S1 S2 : set X) (x : X),
    Member x (Intersection S1 S2) Member x S1 Member x S2.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma member_difference : X (S1 S2 : set X) (x : X),
    Member x (Difference S1 S2) Member x S1 ~(Member x S2).
Proof.
  (* FILL IN HERE *) Admitted.

Lemma member_complement : X (S : set X) (x : X),
    Member x (Complement S) ~(Member x S).
Proof.
  (* FILL IN HERE *) Admitted.
One set is a subset of another if it's contained entirely; we write AB to mean that A is a subset of B.
We might also say that A is included in B.

Definition subset {X : Type} (S1 S2 : set X) : Prop :=
   x, Member x S1 Member x S2.
If A is included in B but doesn't comprise all of B, then A is a proper subset of B; we write AB or AB.
Be careful, though: some authors will write AB to mean plain (not necessarily proper) subset.

Definition proper_subset {X : Type} (S1 S2 : set X) : Prop :=
  subset S1 S2 S1 S2.
The subset relation is reflexive and transitive.

Lemma subset_refl : X (S : set X),
    subset S S.
Proof.
  intros X S. unfold subset. intros x H. apply H.
Qed.

Lemma subset_trans : X (S1 S2 S3 : set X),
    subset S1 S2 subset S2 S3 subset S1 S3.
Proof.
  intros X S1 S2 S3 H12 H23.
  intros x H.
  apply H23.
  apply H12.
  apply H.
Qed.
Subset also provides a way to prove equality of sets: if two sets are subsets of each other, they must be equal. Such a proof of equality is "a proof by mutual inclusion".

Lemma subset_eq : X (S1 S2 : set X),
    subset S1 S2 subset S2 S1 S1 = S2.
Proof.
  intros X S1 S2.
  split.
  - unfold subset. intros [H12 H21].
    apply extensionality. intros x.
    split.
    + apply H12.
    + apply H21.
  - intros Heq. rewrite Heq. split.
    + apply subset_refl.
    + apply subset_refl.
Qed.

Exercise: 2 stars, standard, optional (subset_properties)

Lemma empty_subset : X (S : set X),
    subset Empty S.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma subset_universe : X (S : set X),
    subset S Universe.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma union_subset : X (S1 S2 : set X),
    subset (Union S1 S2) (Union S2 S1).
Proof.
  (* FILL IN HERE *) Admitted.

Lemma union_comm : X (S1 S2 : set X),
    Union S1 S2 = Union S2 S1.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma union_subset_l : X (S1 S2 : set X),
    subset S1 (Union S1 S2).
Proof.
  (* FILL IN HERE *) Admitted.

Corollary union_subset_r : X (S1 S2 : set X),
    subset S2 (Union S1 S2).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard, optional (complement_involutive)

You'll need to use inclusion_exclusion to prove this one.
Lemma complement_involutive : X (S : set X),
    Complement (Complement S) = S.
Proof.
  (* FILL IN HERE *) Admitted.
The power set of a set X is the set of all subsets of X. It's written P(X) or 2^X, where ^ means exponentiation.
Definition power_set {X : Type} (S : set X) : set (set X) :=
  Spec (fun S'subset S' S).
For example, the power set of the booleans is:

    P(bool) = { emptyset, {true}, {false}, {true, false} }

Exercise: 3 stars, standard, optional (de_morgan)

Lemma de_morgan : X (S1 S2 : set X),
   Complement (Intersection (Complement S1) (Complement S2)) = Union S1 S2.
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard, optional (de_morgan2)

This proof can be made much easier by judicious use of of other lemmas.
If you're stuck, maybe you read too fast and haven't looked at all of the wonderful lemmas above. Take a gander and see what might help here.
Corollary de_morgan2 : X (S1 S2 : set X),
   Complement (Union (Complement S1) (Complement S2)) = Intersection S1 S2.
Proof.
  (* FILL IN HERE *) Admitted.
The cartesian product of two sets is the set of pairs of values from each set; the product of X and Y is typically written X × Y.
Definition cartesian_product {X Y : Type} (Sx : set X) (Sy : set Y) : set (X × Y) :=
  Spec
       (fun (p:X×Y) ⇒ let (x,y) := p in Member x Sx Member y Sy).
For example, the product bool × unit is { (true,tt), (false,tt) }.

Exercise: 2 stars, standard, optional (cartesian_properties)

Lemma member_cartesian : {X Y : Type} (x:X) (y:Y) (Sx : set X) (Sy : set Y),
  Member (x,y) (cartesian_product Sx Sy) Member x Sx Member y Sy.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma empty_cartesian_identity_l : X Y (S : set Y),
    cartesian_product (Empty : set X) S = Empty.
Proof.
  (* FILL IN HERE *) Admitted.

A set-theoretic notion of relations and functions

We've defined many functions and relations in Coq. Function definitions have used the Definition and Fixpoint keywords. Relation definitions have used the Inductive keyword to define propositions. In fact, we've defined sets, as well--the Inductive keyword introduces recursive definitions for sets.
Coq takes recursive functions and inductive definitions of sets and relations as primitive; another common perspective is to take sets as primitive, defining functions and relations on top of sets.
For a relation on one thing--a unary relation, like ev or sorted--we simply use a set.
Definition unary_relation (X : Type) : Type := set X.

Definition even_nats : unary_relation nat := Spec (fun nev n).
Definition even_nats' : unary_relation nat := Spec (fun nevenb n = true).
For relations on two things--binary relations, like le or = or Permutation--we use a set of pairs.

Definition binary_relation (X Y : Type) : Type := set (X×Y).
For relations on more things--ternary relations on three things, like R from IndProp.v--we can use larger pairs.

Definition ternary_relation (X Y Z : Type) : Type := set (X×Y×Z).
So a relation R between X and Y can be intrepreted as a set of pairs of (a,b), where a : X and b : Y. We say a R b when (a,b)R.

Definition related_in {X Y : Type} (R:binary_relation X Y) (a : X) (b : Y) : Prop :=
  Member (a,b) R.

Definition empty_relation (X Y : Type) : binary_relation X Y := Empty.

Definition total_relation (X Y : Type) : binary_relation X Y := Universe.

Lemma cartesian_product__total : {X Y} (a:X) (b:Y), related_in (total_relation X Y) a b.
Proof.
  intros X Y a b.
  apply member_universe.
Qed.

Definition diagonal_relation (X : Type) : binary_relation X X :=
  Spec (fun p : X × Xlet (x,y) := p in x = y).

Definition reflexive {X : Type} (R : binary_relation X X) : Prop :=
   a, related_in R a a.

Definition symmetric {X : Type} (R : binary_relation X X) : Prop :=
   a b, related_in R a b related_in R b a.

Definition transitive {X : Type} (R : binary_relation X X) : Prop :=
   a b c, related_in R a b related_in R b c related_in R a c.

Definition functional {X Y : Type} (R : binary_relation X Y) : Prop :=
   a b1 b2, related_in R a b1 related_in R a b2 b1 = b2.

Definition total {X Y : Type} (R : binary_relation X Y) : Prop :=
   a, b, related_in R a b.

Exercise: 2 stars, standard, optional (enigma_functional)

Definition enigma : binary_relation nat nat :=
  Spec
       (fun p : nat×nat
          match p with
          | (O,_)False
          | (S n,m)n = m
          end).

Lemma enigma_functional : functional enigma.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma enigma_not_total : ~(total enigma).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 1 star, standard, optional (enigma_fenigma)

Definition enigma_fun : nat nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Lemma enigma_enigma_fun : a b, related_in enigma a b enigma_fun a = b.
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 1 star, standard (enigma_converse)

Can you prove the converse? If so, please do so (as an informal proof). If not, prove informally that there exists an a and b such that enigma_fun a = b but ¬ (related_in enigma a b).
(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_enigma_converse : option (nat×string) := None.
Given this set-theoretic notion of functions, we can define a few important sets for any given function.
The domain of a function is the set of its possible inputs; the codomain is the set of possible outputs. In Coq terms, we would say that if f : X Y, then X is the domain and Y is the codomain.

Definition domain {X Y} (f : binary_relation X Y) := X.
Definition codomain {X Y} (f : binary_relation X Y) := Y.

Definition preimage {X Y} (f : binary_relation X Y) :=
  Spec (fun x y, related_in f x y).
Definition image {X Y} (f : binary_relation X Y) :=
  Spec (fun y x, related_in f x y).
If a function f is total, then its preimage and domain coincide.

Lemma total_preimage_is_whole_domain : X Y (f : binary_relation X Y),
  functional f
  total f
  preimage f = Universe.
Proof.
  intros X Y f Hfunc Htotal.
  unfold preimage. apply extensionality.
  intros x. split.
  - intros H. apply member_universe.
  - intros _. apply P_member_spec.
    apply Htotal.
Qed.

Functions for counting: injections, surjections, and bijections

Recall the definition of injective from Logic.v.
A function f : A B is injective if each input is mapped to a distinct output. We also say f is one to one, also written 1:1.
Definition injective {A B : Type} (f : A B) : Prop :=
   x y : A, f x = f y x = y.
A function f : A B is surjective if for every value in B, there's a value in A that maps to it. We also call these functions onto, because f maps "onto" the entirety of B.
f is surjective precisely when its codomain is equal to its image.

Definition surjective {A B : Type} (f : A B) : Prop :=
   y, x, f x = y.

Definition bijective {A B : Type} (f : A B) : Prop :=
  injective f surjective f.
Both injectivitiy and surjectivity are preserved by function composition.

Definition compose {A B C} (f : A B) (g : B C) (a : A) : C :=
  g (f a).

Exercise: 1 star, standard, optional (inj_composition)

Lemma inj_composition : {A B C} (f : A B) (g : B C),
    injective f
    injective g
    injective (compose f g).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 1 star, standard, optional (surj_composition)

Lemma surj_composition : {A B C} (f : A B) (g : B C),
    surjective f
    surjective g
    surjective (compose f g).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 1 star, standard, optional (nat_natopt_inj)

Define a function nat_natopt : nat option nat that is injective. Your definition and proof should be in Coq.

Definition nat_natopt_inj (n : nat) : option nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Lemma nat_natopt_inj_correct : injective nat_natopt_inj.
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard (nat_natopt_bij)

Define a function nat_natopt_bij : nat option nat that is bijective. Your definition and proof should be in Coq.

Definition nat_natopt_bij (n : nat) : option nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Lemma nat_natopt_bij_correct : bijective nat_natopt_bij.
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard, optional (nat_inj)

Define a function nat_inj : nat nat that is injective but not surjective.

Definition nat_inj (n : nat) : nat (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Lemma nat_inj_injective : injective nat_inj.
Proof.
  (* FILL IN HERE *) Admitted.

Lemma nat_inj_not_surjective : ~(surjective nat_inj).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 3 stars, standard, optional (bool_inj__surj)

Prove that a function f : bool bool is injective iff it is surjective. Be patient with the case analysis!

Lemma bool_inj__surj : (f : bool bool),
    injective f surjective f.
Proof.
  (* FILL IN HERE *) Admitted.
If f : X Y, there might be an inverse g : Y X that does the opposite of f, i.e., f(x) = y g(y) = x, or, to put it differently, g(f(x)) = x and f(g(y)) = y. It's common to write the inverse of f as f^{-1}, i.e., f raised to the -1th power.

Definition inverse_of {X Y} (f : X Y) (g : Y X) :=
   x y, f x = y g y = x.
There are three critical theorems about functions and inverses:
  • If f : X Y is injective, then there is a g : Y X that is surjective.
  • If f : X Y is surjective, then there is a g : Y X that is injective.
  • If f : X Y is bijective, then there exists a g : Y X that is also bijective and is an inverse of f.
We can't prove these theorems in Coq (without assuming the LEM or properties of X and Y). There's a difference between knowing that a function must exist and knowing how to compute it.
For example, we know that the function square : non-negative real non-negative real that squares non-negative real numbers has an inverse--namely, the square-root function. But simply having the square function and knowing that it's bijective isn't enough to magically create the square-root function! It's up to us to find some implementation that actually knows how to find square roots using, e.g., Newton's method.

Cardinality

The size of a set is called its cardinality; we write the cardinality of X as |X| (not to be confused with absolute value). For finite sets, cardinality is easy: simply count the number of elements.
So |bool| = 2, because it has two elements; |unit| = 1 because it has just one, and |∅| = 0.
But cardinality for infinite sets is trickier. What cardinality is nat? What about real? What about nat nat?
Georg Cantor proposed a framework for understanding the cardinalities of infinite sets: use functions as counting arguments. Suppose we have two sets, A and B, and we want to determine their relative sizes. If we can define a function f : A B that's injective, that means every element of A maps to a distinct element of B, like so:

       f
   a1 > b1
   a2 > b2
   a3 > b3
   a4 > b4
   ...
and so on, for each ai A. Since each of the bi are distinct, there have to be at least as many Bs as there are As.
That is, if f : A B is injective, then |A| |B|. Suppose instead that f : A B is surjective. That means f covers all of B, though some elements might be covered twice.
[ f a1 > b1 a2 > b2 a3 > b1 a4 > b3 ... ] and so on, where each bi B shows up at least once. Since every bi shows up, there have to be least as many As as there are Bs.
That is, if f : A B is surjective, then |B| |A|. Finally, we can combine the two properties to define a notion of "same cardinality": if there is a bijection f : A B, then:
  • every element of A gets mapped to a distinct element of B (injectivity), so |A| |B|;
  • every element of B is accounted for in the mapping (surjectivity), so |B| |A|.
That is, if f : A B is bijective, then |A| = |B|.
Definition same_cardinality (X Y : Type) : Prop :=
   f : X Y, bijective f.
For example, we can define a set with two elements, two, and prove that it has the same cardinality as bool. To do so, we have to come up with a function that maps the elements of bool in a one-to-one and onto fashion, i.e., every element of bool is mapped to a distinct element of two and all elements of two are accounted for.
Inductive two : Type :=
| column_a
| column_b.

Lemma bool_two_cardinality : same_cardinality bool two.
Proof.
We'll have true map to column_a and false map to column_b. It doesn't really matter--we could have gone the other way. But we couldn't have had both true and false map to column_a, since that wouldn't be injective.
   (fun b : boolif b then column_a else column_b).
  split.
  - intros [] [] H.
    + reflexivity.
    + inversion H.
    + inversion H.
    + reflexivity.
  - intros [].
    + true. reflexivity.
    + false. reflexivity.
Qed.

(* An absolute unit. Positively in awe at the size of this lad,
   tt. *)

Print unit.

Lemma unit_bool_cardinality : ¬ (same_cardinality unit bool).
Proof.
  intros [f [Hinj Hsurj]].
  destruct (Hsurj true) as [[] Htrue].
  destruct (Hsurj false) as [[] Hfalse].
  rewrite Htrue in Hfalse. inversion Hfalse.
Qed.

Exercise: 2 stars, standard (bool_nat_cardinality)

Prove that bool and nat have different cardinalities.
Lemma bool_nat_cardinality : ¬ (same_cardinality bool nat).
Proof.
  (* FILL IN HERE *) Admitted.

Exercise: 3 stars, standard, optional (nat__list_unit)

Prove that nat and list unit have the same cardinality. It will be easier to define your function outside of the lemma. Hint: your function will need to be recursive!

Lemma nat__list_unit : same_cardinality nat (list unit).
Proof.
  (* FILL IN HERE *) Admitted.
Proving that two sets have the same cardinality via exhibiting a bijection is a straightforward process... once you've found the bijection.
Cantor is particularly notable because he came up with a clever way of showing that two sets don't have the same cardinality: a proof method called diagonalization.
Here's an example, both informally and in Coq.
  • Theorem: |nat| |nat nat|.
    Suppose, for a contradiction, that there exists a bijection f : nat (nat nat).
    Define g(n) = 1 + f n n. Since g : nat nat and f is surjective, there exists some number m such that f m = g.
    What is g m? By the definition of g, we have g m = 1 + f m m. But we know that f m = g, so we really have g m = 1 + g m... an impossibility! We've reached a contradiction, so f must not be bijective. Qed.
Lemma nat_natfun_diag : ¬ (same_cardinality nat (nat nat)).
Proof.
  intros [f [Hinj Hsurj]].
  unfold surjective in Hsurj.
  remember (fun nS ((f n) n)) as g.
  destruct (Hsurj g) as [n Hf].
  assert ( m, f n n = m). { (f n n). reflexivity. }
  destruct H as [m Hfm].
  assert (f n n = S m). { rewrite Hf. rewrite Heqg. rewrite Hfm. reflexivity. }
  rewrite Hfm in H. apply n_Sn in H. destruct H.
Qed.
Here's another example of the method, showing that |X| |P(X)|, i.e., a set and its powerset are of different cardinality.
I'll provide the Coq proof; you provide the informal proof.

Lemma set_powset_diag : {X}, ¬ (same_cardinality (set X) (set (set X))).
Proof.
  intros X [f [Hinj Hsurj]].
  unfold surjective in Hsurj.
  remember (Spec (fun x¬ (Member x (f x)))) as A.
  destruct (Hsurj A) as [e HA].
  destruct (inclusion_exclusion A e) as [Hin | Hnotin].
  - assert (¬Member e A) as Hnotin.
    { rewrite HeqA in Hin. apply member_spec_P in Hin .
      rewrite <- HA. apply Hin. }
    apply Hnotin. apply Hin.
  - assert (Member e A) as Hin.
    { rewrite HeqA. apply P_member_spec.
      rewrite HA. apply Hnotin. }
    apply Hnotin. apply Hin.
Qed.

Countability

Given that injections, surjections, and bijections give an account of the relative sizes of infinite sets, we can go on to define a particularly interesting class of infinite sets: countable sets are those that are no bigger than the natural numbers.
Since there are two ways to show relative size, there are two ways to show countability of a set X:
  • define a function f : X nat and prove that f is injective (so |X| |nat|)
  • define a function g : nat X and prove that g is surjective (so |X| |nat|)

Definition countable (X : Type) : Prop :=
  ( f : X nat, injective f)
  ( f : nat X, surjective f).

Lemma bool_countable_surj : countable bool.
Proof.
  right.
   (fun neqb n 0).
  intros [].
  - 0. reflexivity.
  - 1. reflexivity.
Qed.

Lemma bool_countable_inj : countable bool.
Proof.
  left.
   (fun b : boolif b then 1 else 0).
  intros [] [] H.
  - reflexivity.
  - inversion H.
  - inversion H.
  - reflexivity.
Qed.
Infinitely countable sets have the same cardinality as the nat--that is, we must exhibit a bijection.
Our definition in Coq of infinite countability allows us to choose which direction our bijection points. Working informally, it doesn't matter, because bijections are invertible. But since Coq functions aren't automatically invertible, this definition makes our (formal) life easier.

Definition infinitely_countable (X : Type) : Prop :=
  same_cardinality X nat same_cardinality nat X.
We've already shown that list unit is infinitely countable.
There are many surprising results in infinite countability.
First, we can add a finite number of elements to any countable set and it will still be countable. For example, you've already shown that option nat is countable, in nat_natopt_bij_correct!
It's possible to add or subtract an infinite number of elements and still have an infinitely countable set. Here's proof that there as many nats as there are even nats.
Let evens = {x:nat| evenb x}. To show |nat| = |evens|, we must define f : nat evens and prove that f is bijective.
We want a mapping like:

       f
    0 > 0
    1 > 2
    2 > 4
    3 > 6
    4 > 8
      ...
So let f(x) = double x. We must show that f is bijective, i.e., injective and surjective.
We have f injective by double_injective.
To see that f is surjective, let an even number m be given. We must show that there is some natural n such that f n = m. By even_bool_prop, we know that if m is even, then there exists a k such that m = double k, so let n be k. The argument underlying the previous proof is sometime called "Hilbert's Hotel", after the mathematician David Hilbert.
Suppose an mathematical conference is being held at a hotel. There are countably infinitely many attendees, one per natural number. Fortunately, the hotel has a countably infinite number of rooms, so we just put the nth guest in the nth room.

    0 > 0
    1 > 1
    2 > 2
    3 > 3
    ...
But what happens when another conference shows up at the hotel, this time with countably infinitely many computer scientists to crash the party? The hotel manager has a clever idea: switch it up so mathematicians are in every other room. There are now infintely many empty rooms for the CS folks!
    0 > 0
      ??? 1
    1 > 2
      ??? 3
    2 > 4
      ??? 5
    3 > 6
    ...

Russell's paradox

Our definition of sets is typed: we say set X to mean a set of values of type X.
What happens if we don't use types to "ramify" our sets, in the sense of "divide into branches or subdivisions; as, to ramify an art, subject, scheme" (Webster's unabridged 1913 dictionary).
Allowing heteregeneity adds flexibility--we can define a set not just of nats, but nats mixed with bools and lists.
Allowing such definitions of sets yields a dangerous circularity, though: can a set be a member of itself?
Bertrand Russell discovered that allowing sets to be members of themselves is contradictory--it's simply not sound mathematics.
His argument went as follows. Call a set that contains itself extraordinary; sets that don't contain themselves are ordinary. So S = {1,3,true} is ordinary, but T = {1,T} is extraordinary. Each set is ordinary or extraordinary--it either is a member of itself or it isn't.
  • Russell's paradox. Let the set X be defined as the set of ordinary sets, i.e. X = { A | A is ordinary}, i.e., X = {A | A A}.
    Is X ordinary or extraordinary? We find
    • If X is ordinary, then XX. But then, by definition, every ordinary set should be a member of X... including itself! So then X X and X is extraordinary... contradicting our assumption that it was ordinary.
    • If, on the other hand, that X is extraordinary, then X X. But X's members are exactly the ordinary sets, i.e., those A such that A A--so it's a contradiction to have X X.
We assumed that we could define X as the set of ordinary sets, but we arrived at a contradiction.
We've typically arrived at a contradiction after assuming some proposition we're trying to disprove: to show ¬P, we assume P and then derive an absurdity.
But this situation is different: we weren't trying to prove anything--we were just defining something! What does it mean to reach a contradiction from a definition?
If you're concerned, the feeling is justified. If a definition can produce a contradiction, what does mathematics even mean? How do we make sure we only define good definitions, that won't produce such paradoxes? These concerns wracked the mathematical community in the first part of the 20th Century.
Math has settled on two solutions to the problem.
  • First, Russell and Whitehead produced an incredible work, Principia Mathematica, which showed how to build up a theory of "ramified sets"--sets where a set of subdivisions or levels indicated which sets could be members of others. So began type theory--the underlying framework that Coq uses. Type theory has come a long way--Principia is famously unreadable, full of opaque and tedious calculations to find simple facts like 1 + 1 = 2. (It took them 352 pages to prove that fact!) Nothing like Coq, of course.
  • Later, Ernst Zermelo and Abraham Fraenkel came up with a set of axioms for working with sets; they axioms are collectively called ZF. Rather than dealing with the tedium of ramified sets, they instituted an axiom called regularity (a/k/a the axiom of foundation) that prevents dangerous circularities (while still allowing heterogeneous sets):

  regularity : x,
                a, ax
                y, yx ∧ (~ z, zyzx)
That is, for every x, if there is some element a of x, then there is another element y of x such that x and y are disjoint (i.e., there are no z such that z is an element of both y and x).
Most mathematicians work in ZF, but without ever really thinking about it. So long as you define ordinary sets, you'll never violate the ZF axioms. But if you're doing something weird--using extraordinary sets or really big infinite sets--you need to watch out.
One particularly good litmus test for a definition is to try to put it into Coq. While there are some good definitions that Coq won't accept, if Coq does accept your definition, then so will mathematicians everywhere.

(* Mon Apr 6 09:16:57 PDT 2020 *)