Set cardinality

We say the size of a set is its cardinality, written with vertical bars as in $|A|$ (from Latin cardinalis, "the hinge of a door", i.e., that on which a thing turns or depends---something of fundamental importance).

We'll spend today trying to understand cardinality.

Finite sets

Finite sets, like those we defined on day 8, are easy to count: their cardinality is just the number of elements in them. So we can say:

\[ \begin{array}{rcl} |\emptyset| &=& 0 \\ |\mathsf{unit}| &=& 1 \\ |\texttt{bool}| &=& 2 \\ |\mathsf{RPS}| &=& 3 \\ |\texttt{base}| &=& 4 \\ \end{array} \]

The various representations we had for sets had natural ways to compute cardinality: for a list representation, a set's cardinality was merely the length of the list; for a tree representation, a set's cardinality was merely the size of the tree.

The naturals

What about infinite sets? What's the cardinality of the naturals, $\mathbb{N}$? There's a great deal of folk mathematics around infinity, commonly written $\infty$, with folk theorems like $\infty + 1 = \infty$.

Mathematics wasn't particularly rigorous around infinities until Georg Cantor came up with the notion of cardinality and counting we'll see here. There's a rich mathematics of infinities involving ordinal and cardinal numbers; we'll only be skimming the surface.

We can say up front that there are different kinds of infinities. The cardinality of the naturals is the smallest infinity, $\aleph_0$, typically pronounced "aleph null". (Aleph is the first letter of the Aramaic/Hebrew/Phoenician alphabet. There are varying reasons as to why Cantor chose that letter.)

But before we can go further, we need to build up an understanding of how to determine relative cardinality.

Functions and relative cardinality

Cantor had many great insights, but perhaps the greatest was that counting is a process, and we can understand infinities by using them to count each other. That is, we can use functions to establish the relative size of sets.

Suppose we have a function $f : \mathbb{N} \rightarrow A$ for some set $A$. What can we say about relative size, i.e., how do $|\mathbb{N}|$ and $|A|$ relate? Different properties of $f$ will yield different relations.

Counting with an injection

If $f : \mathbb{N} \rightarrow A$ is injective, then we can write out $f$ as follows:

\[ \begin{array}{cccccc} 0 & 1 & 2 & 3 & 4 & \dots \\ \mapsto & \mapsto & \mapsto & \mapsto & \mapsto & \\ f(0) & f(1) & f(2) & f(3) & f(4) & \dots \\ = & = & = & = & = & \\ a_0 & a_1 & a_2 & a_3 & a_4 & \dots \\ \end{array} \]

Since $f$ is injective, we know that $a_i = a_j$ implies that $i = j$, i.e., each $a_i$ is only equal to itself. How many elements does the set $A$ have? The function $f$ counts distinct elements from $A$, one for each natural. So there can be no fewer elements in $A$ than there are in $\mathbb{N}$, i.e., $|\mathbb{N}| \le |A|$.

Now, $A$ could be much larger than $\mathbb{N}$---we don't know whether or not $f$ lists everything in $A$.

Formally, we say that if $f : A \rightarrow B$ is injective, then $|A| \le |B|$. If you can find an injection from a set $A$ to a set $B$, then $B$ is no smaller than $A$.

Counting with a surjection

If $g : \mathbb{N} \rightarrow A$ is surjective, then we can write out $g$ as follows:

\[ \begin{array}{cccccc} 0 & 1 & 2 & 3 & 4 & \dots \\ \mapsto & \mapsto & \mapsto & \mapsto & \mapsto & \\ g(0) & g(1) & g(2) & g(3) & g(4) & \dots \\ = & = & = & = & = & \\ a_0 & a_1 & a_2 & a_3 & a_4 & \dots \\ \end{array} \]

Since $g$ is surjective, we know that eventually every element of $A$ shows up as an $a_i$. Now, we might have some elements show up thousands of times---or even infinitely many times, since $\mathbb{N}$ is infinite!---but everything shows up at least once. So there can be no more elements in $A$ than there are in $\mathbb{N}$, i.e., $|A| \le |\mathbb{N}|$.

Now, $A$ could be much smaller than $\mathbb{N}$---because we're allowed to repeat elements.

Formally, we say that if $f : A \rightarrow B$ is surjective, then $|B| \le |A|$. If you can find a surjection from a set $A$ to a set $B$, then $B$ is no larger than $A$.

Bijections

If $f : A \rightarrow B$ is injective, then $|A| \le |B|$; if $f$ is surjective, then $|B| \le |A|$. So if $f$ is bijective, we can expect that $|A| = |B|$.

That is, if $f : A \rightarrow B$ is a bijection, then there is a one-to-one correspondence of elements in $A$ to elements in $B$. Since $f$ is surjective, every element of $B$ shows up in $\mathrm{Image}(f)$; since $f$ is injective, each element of $A$ gets a unique partner.

\[ \begin{array}{cccccc} a_0 & a_1 & a_2 & a_3 & a_4 & \dots \\ \mapsto & \mapsto & \mapsto & \mapsto & \mapsto & \\ f(a_0) & f(a_1) & f(a_2) & f(a_3) & f(a_4) & \dots \\ = & = & = & = & = & \\ b_0 & b_1 & b_2 & b_3 & b_4 & \dots \\ \end{array} \]

While the listing may go on infinitely, $A$ and $B$ are paired up neatly---so they must be the same size!
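For finite sets, injectivity, surjectivity, and bijectivity can be checked by brute force. Here is a minimal sketch in Python (our choice of language, not something these notes otherwise use), assuming a function is represented as a dictionary from inputs to outputs. The helper names `is_injective`, `is_surjective`, and `is_bijective` are hypothetical.

```python
def is_injective(f):
    """f is a dict mapping each domain element to its output."""
    outputs = list(f.values())
    # Injective: no output is hit by two different inputs.
    return len(outputs) == len(set(outputs))

def is_surjective(f, codomain):
    # Surjective: every element of the codomain is hit at least once.
    return set(codomain) <= set(f.values())

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

BOOL = {True, False}
negation = {True: False, False: True}       # pairs bool up with itself
constant_true = {True: True, False: True}   # repeats an output, misses False
is_even = {0: True, 1: False, 2: True}      # covers bool, but repeats True

print(is_bijective(negation, BOOL))
print(is_injective(constant_true), is_surjective(constant_true, BOOL))
print(is_injective(is_even), is_surjective(is_even, BOOL))
```

Only `negation` passes both checks, which matches the intuition that a bijection pairs elements up with nothing repeated and nothing left over.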

If-and-only-if

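In fact, the definitions we gave of relative cardinalities are if-and-only-ifs: $|A| \le |B|$ if and only if there exists an injection from $A$ to $B$; $|B| \le |A|$ if and only if there exists a surjection from $A$ to $B$ (so long as $B$ is non-empty); and $|A| = |B|$ if and only if there exists a bijection from $A$ to $B$.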

Countability

We say that a set $A$ is countable when $|A| \le |\mathbb{N}|$. People also use the word denumerable, which is a fancy, Latinate way of saying countable. We say a set $A$ is infinitely countable when $|A| = |\mathbb{N}|$.

Finite sets

Our definition of countable sets includes finite sets. It should come as no surprise that finite sets are countable: every set you've ever successfully counted is finite!

Finite sets have natural numbers as their cardinality. For example, we already saw that $|\texttt{bool}| = 2$ above.

It's easy to generate surjections from the naturals to non-empty finite sets. A finite set $B$ with $n$ elements is really of the form $B = \{ b_0, b_1, \dots, b_{n-1} \}$, so we can hit every element of $B$ early on and then just repeat an arbitrary one:

\[ f(x) = \begin{cases} b_0 & x = 0\\ b_1 & x = 1 \\ \vdots & \\ b_{n-1} & x = n-1 \\ b_0 & \text{otherwise} \\ \end{cases} \]
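The same construction can be sketched in Python, assuming the finite set is handed to us as a non-empty list `elements` standing for $b_0, \dots, b_{n-1}$. The name `surjection_onto` is hypothetical.

```python
def surjection_onto(elements):
    """Return f : N -> B, where elements lists b_0, ..., b_{n-1}.

    Assumes elements is non-empty, as in the text.
    """
    def f(x):
        # Hit each b_x once for x < n, then keep repeating b_0.
        return elements[x] if x < len(elements) else elements[0]
    return f

f = surjection_onto(["rock", "paper", "scissors"])
print([f(x) for x in range(6)])
```

Every element shows up among the first $n$ outputs, and everything afterwards is a repeat of $b_0$.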

Question: Would the strategy above work for the empty set, $\emptyset$? If not, what could you do instead?

Finite sets of functions: $\texttt{bool} \rightarrow \texttt{bool}$

Surprisingly, some sets of functions are countable. For example, there are four functions of type $\texttt{bool} \rightarrow \texttt{bool}$: the identity, negation, constant-$\top$, and constant-$\bot$.

How do we know this? Well, a function $f : \texttt{bool} \rightarrow \texttt{bool}$ is really a total and functional relation $f \subseteq \texttt{bool} \times \texttt{bool}$, of which there are four: they're all of the form $\{ (\top, b_0), (\bot, b_1) \}$, with two possible choices for $b_0$ and $b_1$, i.e., $2 \cdot 2 = 4$ possibilities.
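We can enumerate these relations mechanically. The following is a small sketch in Python, assuming each function is represented as a dictionary with one entry for $\top$ (`True`) and one for $\bot$ (`False`). The names `BOOL` and `functions` are ours.

```python
from itertools import product

BOOL = (True, False)

# Each function is a total, functional relation {(True, b0), (False, b1)}.
# product(BOOL, repeat=2) runs through every choice of (b0, b1).
functions = [dict(zip(BOOL, outputs)) for outputs in product(BOOL, repeat=len(BOOL))]

for f in functions:
    print(f)

print(len(functions))  # 2 * 2 = 4
```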

It's very tempting to say that $|\texttt{bool} \rightarrow \texttt{bool}| = |\texttt{bool} \times \texttt{bool}| = |\texttt{bool}| \cdot |\texttt{bool}|$. Such cardinal multiplication is well defined, but you must be very careful once infinities are involved---as we shall see.

Infinite sets

What about infinite sets? The naturals $\mathbb{N}$ are infinitely countable by definition, but also because the identity function is a bijection on any set, so $|\mathbb{N}| = |\mathbb{N}|$ trivially. What about other infinite sets?

Positives

Let's start with the positives, $\mathbb{N}^+ = \{ S(n) \mid n \in \mathbb{N} \} = \{ 1, 2, 3, \dots \}$. How big are the positives compared to the naturals? It's plain as day: there's one fewer positive number, since $0 \in \mathbb{N}$ but $0 \not\in \mathbb{N}^+$. So the positives must be smaller, right? Unfortunately, infinities are not so intuitive as all that.

We can come up with a bijection from the positives to the naturals as follows:

\[ f(n) = n - 1 \]

Why is $f$ a bijection? First, we can see that $f : \mathbb{N}^+ \rightarrow \mathbb{N}$, defined as the relation $\{ (S(n), n) \mid n \in \mathbb{N} \}$. Let's prove the rest.

Theorem: $|\mathbb{N}^+| = |\mathbb{N}|$, i.e., the positives are infinitely countable.

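Proof: We show that the $f$ given above is a bijection, i.e., it is injective and surjective. For injectivity, let $m, n \in \mathbb{N}^+$ be given such that $f(m) = f(n)$, i.e., $m - 1 = n - 1$; adding one to both sides gives $m = n$. For surjectivity, let $n \in \mathbb{N}$ be given; then $S(n) \in \mathbb{N}^+$ and $f(S(n)) = S(n) - 1 = n$. QED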

Surprised? That's fair! Infinities are hard to think about. When Georg Cantor first published his proofs, many mathematicians were upset and in disbelief. People were even worried about the religious implications, as infinity and the divine were thought to somehow correspond.

There's a crucial moral here: intuitions can bite you. We have $|\mathbb{N}| = \aleph_0$; we have $|\mathbb{N}^+| = |\mathbb{N}| - 1$... so $\aleph_0 = \aleph_0 - 1$. Cardinal numbers don't behave like the numbers you already know. Don't try to transfer your intuitions from arithmetic!

Integers

Recall that $\mathbb{Z}$ is the integers (where the "Z" comes from German Zahlen, "numbers"), i.e., the set $\{ 0, 1, -1, 2, -2, \dots \}$. How do $\mathbb{Z}$ and $\mathbb{N}$ relate? Surely there are more integers! There are almost twice as many integers, right?

Theorem: $|\mathbb{Z}| = |\mathbb{N}|$, i.e., the integers are infinitely countable.

Proof using a surjection: Since $\mathbb{N} \subseteq \mathbb{Z}$, we know that $|\mathbb{N}| \le |\mathbb{Z}|$. It suffices to find a surjective function $g : \mathbb{N} \rightarrow \mathbb{Z}$ to show that $|\mathbb{Z}| \le |\mathbb{N}|$. We say:

\[ g(n) = \begin{cases} \frac{n}{2} & n \text{ even} \\ \frac{n+1}{-2} & n \text{ odd} \\ \end{cases} \]

First, to gain some intuition, note that $g(0) = 0$, $g(1) = -1$, $g(2) = 1$, $g(3) = -2$, $g(4) = 2$, $g(5) = -3$, and so on.

We must show that $g$ is surjective. Let $x \in \mathbb{Z}$ be given: we must find an $n \in \mathbb{N}$ such that $g(n) = x$.

If $x < 0$, then let $n = -2 \cdot x - 1$. We can see that $n \in \mathbb{N}$ because if $x<0$ then $-2 \cdot x \ge 2$, so $-2 \cdot x - 1 \ge 1$. Next, $-2 \cdot x$ must be even, so $-2 \cdot x - 1$ must be odd... and so $g(n) = \frac{(-2 \cdot x - 1) + 1}{-2} = \frac{-2 \cdot x}{-2} = x$, as desired.

If $x \ge 0$, then let $n = 2 \cdot x$. We can directly find that $n \in \mathbb{N}$ because $x \ge 0$. Moreover, $n$ must be even. So $g(n) = \frac{2 \cdot x}{2} = x$, as desired. QED

Proof using an injection: Since $\mathbb{N} \subseteq \mathbb{Z}$, we know that $|\mathbb{N}| \le |\mathbb{Z}|$. It suffices to find an injective function $h : \mathbb{Z} \rightarrow \mathbb{N}$ to show that $|\mathbb{Z}| \le |\mathbb{N}|$. We say:

\[ h(x) = \begin{cases} 2 \cdot x & x \ge 0 \\ -2 \cdot x -1 & x < 0 \\ \end{cases} \]

We must show that $h$ is injective. Let $x, y \in \mathbb{Z}$ be given such that $h(x) = h(y)$; we must show $x = y$.

First, observe that $h$ maps non-negative numbers to even numbers and negative numbers to odd ones. So if $h(x) = h(y)$, then it must be that $x$ and $y$ are both non-negative or both negative.

If they're both non-negative, then $h(x) = 2 \cdot x = 2 \cdot y = h(y)$, which means that we must have $x = y$.

On the other hand, if they're both negative, then $h(x) = -2 \cdot x - 1 = -2 \cdot y - 1 = h(y)$, and again we must have $x = y$. QED
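A finite spot check is no substitute for the proofs, but it can build confidence. Here is a sketch in Python of $g$ and $h$ as defined in the two proofs; the window sizes are arbitrary choices of ours.

```python
def g(n):
    """g : N -> Z from the surjection proof."""
    # Even n goes to n/2; odd n goes to (n+1)/(-2).
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

def h(x):
    """h : Z -> N from the injection proof."""
    return 2 * x if x >= 0 else -2 * x - 1

print([g(n) for n in range(6)])  # [0, -1, 1, -2, 2, -3]

# g hits every integer in a window around 0 ...
window = range(-50, 51)
hit = {g(n) for n in range(101)}
print(all(x in hit for x in window))

# ... and h sends distinct integers in the window to distinct naturals.
images = [h(x) for x in window]
print(len(images) == len(set(images)) and all(y >= 0 for y in images))
```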

Question: The proof above uses the fact that $A \subseteq B$ implies $|A| \le |B|$. How would you prove that?

Question: Could you prove the theorem above using a bijection?

Question: How do $g$ and $h$ relate?

Pairs of naturals

What about pairs of naturals, i.e., $\mathbb{N} \times \mathbb{N}$? Shouldn't we be able to say that $|\mathbb{N} \times \mathbb{N}| = \aleph_0 \times \aleph_0 = \aleph_0^2$? Every grade schooler knows that "infinity squared" is bigger than infinity! Right? Let's see what happens.

Let's consider the following matrix:

\[ \begin{array}{c|ccccccc} & 0 & 1 & 2 & 3 & 4 & 5 & \dots \\ \hline 0 & 0 & 1 & 3 & 6 & 10 & 15 & \ddots \\ 1 & 2 & 4 & 7 & 11 & 16 & \ddots & \\ 2 & 5 & 8 & 12 & 17 & \ddots & & \\ 3 & 9 & 13 & 18 & \ddots & & & \\ 4 & 14 & 19 & \ddots & & & & \\ 5 & 20 & \ddots & & & & & \\ \vdots & \ddots & & & & & & \\ \end{array} \]

Before we explain what's happening here in detail, what do you notice? What do you wonder? Stop reading and think!

Okay: we'd have you notice a few things. First, every natural number will appear at some point in the table, and none of them will appear more than once. Next, every pair of natural numbers will appear as a row/column pair, and none will appear more than once.

Each appears once---that's surjectivity. None appears more than once---that's injectivity. This matrix isn't quite a function definition, but it's a good intuition towards one!

Having seen this matrix, you should be able to see that there must be the same number of naturals as there are pairs of naturals: we can set up a bijection between them. Learning that $|\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$ is often a troubling fact---take a moment to sit with it! It's normal to be confused here. Infinity is weird.

Let's turn this into a real function. We can go either way: from $\mathbb{N} \times \mathbb{N}$ to $\mathbb{N}$ or vice versa. We'll do both.

Pairs of numbers to numbers

Let's write $f : \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{N}$ to implement the intuition of the matrix.

\[ f(i, j) = \begin{cases} 0 & i = j = 0 \\ f(j-1, 0) + 1 & i = 0, j > 0 \\ f(i-1, j+1) + 1 & i > 0, j \ge 0 \\ \end{cases} \]

So we have:

\[ \begin{array}{rclclcr} f(0, 0) & & & & & = & 0 \\ f(0, 1) &=& f(1-1,0) + 1 &=& f(0,0) + 1 & = & 1 \\ f(1, 0) &=& f(1-1,0+1) + 1 &=& f(0,1) + 1 & = & 2 \\ f(0, 2) &=& f(2-1,0) + 1 &=& f(1,0) + 1 & = & 3 \\ f(1, 1) &=& f(1-1,1+1) + 1 &=& f(0,2) + 1 & = & 4 \\ f(2, 0) &=& f(2-1,0+1) + 1 &=& f(1,1) + 1 & = & 5 \\ \end{array} \]

Hopefully you're convinced our function $f$ correctly models the numbers in the matrix. You might be worried that $f$ isn't obviously terminating: but notice that on every recursive call either (a) the sum of $i$ and $j$ decreases by one, or (b) the sum stays the same but $i$ itself decreases. If you keep doing (a) and (b) arbitrarily long to a pair $(i, j)$, eventually you'll reach the base case, $(0, 0)$, no matter what.
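To see the recursion at work, here is a direct transcription of $f$ into Python, offered as a sketch rather than an efficient implementation: each call recurses once per step back through the matrix, so it is only suitable for small inputs. The `table` variable is ours; it rebuilds the upper-left triangle of the matrix, with rows indexed by $i$ and columns by $j$.

```python
def f(i, j):
    if i == 0 and j == 0:
        return 0
    if i == 0:
        # j > 0: continue from the end of the previous diagonal.
        return f(j - 1, 0) + 1
    # i > 0: step back one entry along the same diagonal.
    return f(i - 1, j + 1) + 1

# Rebuild the part of the matrix with i + j <= 5.
table = [[f(i, j) for j in range(6 - i)] for i in range(6)]
for row in table:
    print(row)
```

The first row printed is `[0, 1, 3, 6, 10, 15]` and the last is `[20]`, matching the matrix. On this triangle, every number from 0 to 20 appears exactly once.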

It remains to see that this function is injective and surjective. How might you prove that?

Numbers to pairs of numbers

Let's go the other direction, constructing $g : \mathbb{N} \rightarrow \mathbb{N} \times \mathbb{N}$ in line with the intuition of the matrix.

To write $g$, we want a helper function we'll call $\newcommand{\next}{\mathsf{next}}\next : \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{N} \times \mathbb{N}$.

\[ \next(i,j) = \begin{cases} (0, i+1) & j = 0 \\ (i+1, j-1) & \text{otherwise} \\ \end{cases} \]

So we have:

\[ \begin{array}{rclcr} \next(0, 0) &=& (0\phantom{{}+1}, 0+1) &=& (0, 1) \\ \next(0, 1) &=& (0+1, 1-1) &=& (1, 0) \\ \next(1, 0) &=& (0\phantom{{}+1}, 1+1) &=& (0, 2) \\ \next(0, 2) &=& (0+1, 2-1) &=& (1, 1) \\ \next(1, 1) &=& (1+1, 1-1) &=& (2, 0) \\ \next(2, 0) &=& (0\phantom{{}+1}, 2+1) &=& (0, 3) \\ \end{array} \]

Hopefully you're convinced that $\next$ computes the next entry. Now we can give a very simple definition for $g$:

\[ g(n) = \begin{cases} (0,0) & n = 0 \\ \next(g(n-1)) & \text{otherwise} \\ \end{cases} \]

Hopefully you're convinced that our function $g$ correctly models the intuition of the matrix. Our $g$ function is obviously terminating, unlike $f$ above.
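Here is a sketch of $\mathsf{next}$ and $g$ in Python. We assume an iterative loop in place of the recursion, so large $n$ doesn't run into Python's recursion limit; it computes the same thing as the recursive definition. The name `next_pair` is ours, chosen to avoid shadowing Python's built-in `next`.

```python
def next_pair(i, j):
    # At the end of a diagonal (j = 0), jump to the top of the next one;
    # otherwise move one step down the current diagonal.
    return (0, i + 1) if j == 0 else (i + 1, j - 1)

def g(n):
    # Unrolls g(n) = next(g(n-1)), starting from g(0) = (0, 0).
    pair = (0, 0)
    for _ in range(n):
        pair = next_pair(*pair)
    return pair

print([g(n) for n in range(7)])
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3)]
```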

It remains to see, though, that $g$ is injective and surjective. How might you prove that?

A note on the videos

In the videos, I show a slightly different matrix---one where it's easy to draw a line rather than the right-to-left, top-to-bottom zig zag we used here. It turns out that the boustrophedon line in the video is easier to understand, but the "take it from the top" line is easier to model formally. Sorry for any confusion!

Rationals

Our last example of a countable set is the rational numbers or fractions, written $\mathbb{Q}$. It turns out that $|\mathbb{Q}| = |\mathbb{N}|$. Surprising, perhaps, but at this point you ought to be ready for anything.

We could try to come up with a bijection between the rationals and the naturals, but it's actually quite tricky: we have to contend with the fact that $\frac{1}{1} = \frac{2}{2} = \dots$, or that $\frac{1}{2} = \frac{2}{4} = \dots$. We could do it, I'm sure, but it would be pretty annoying.

There's a better way: we'll hem in $\mathbb{Q}$ so there's no way to escape, showing that $|\mathbb{N}| \le |\mathbb{Q}|$ and $|\mathbb{Q}| \le |\mathbb{N}|$.

Inclusion

First, note that every natural number is a rational number, i.e., $\mathbb{N} \subseteq \mathbb{Q}$. That's enough to see that $\mathbb{Q}$ is at least as large as $\mathbb{N}$.

Theorem: If $A \subseteq B$, then $|A| \le |B|$.

Proof: Let $A$ and $B$ be given such that $A \subseteq B$. We define the inclusion map or insertion $\iota : A \rightarrow B$ as $\iota(x) = x$ (where $\iota$ is the Greek letter "iota"). Since $A \subseteq B$, we know that $\iota$ is well typed. Furthermore, $\iota$ is injective by definition, since it's just the identity function. QED

So we have $|\mathbb{N}| \le |\mathbb{Q}|$ by inclusion. What about the other direction?

Rationals are pairs

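We saw already that pairs of naturals are infinitely countable, i.e., $|\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$. But any rational number is really just a pair of numbers. Write each rational in lowest terms with a positive denominator, and define the mapping $p : \mathbb{Q} \rightarrow \mathbb{Z} \times \mathbb{N}$ as $p(\frac{a}{b}) = (a, b)$.

First, observe that $p$ is well typed: the numerator is an integer (it may be negative) and the denominator is a natural. Fixing the lowest-terms form also makes $p$ well defined, since $\frac{1}{2}$ and $\frac{2}{4}$ are both sent to $(1, 2)$. Next, notice that $p$ is injective: a fraction is just mapped to the pair of its numerator and denominator, and distinct rationals have distinct lowest-terms forms. Finally, applying the injection $h : \mathbb{Z} \rightarrow \mathbb{N}$ from above to the first component gives an injection from $\mathbb{Z} \times \mathbb{N}$ to $\mathbb{N} \times \mathbb{N}$. So $|\mathbb{Q}| \le |\mathbb{Z} \times \mathbb{N}| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$.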
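The mapping $p$ can be sketched in Python with the standard library's `Fraction` type. We assume here that every rational is taken in lowest terms with a positive denominator, which is how `Fraction` normalizes its values; that is what lets equal fractions land on the same pair. The sample set `qs` is ours.

```python
from fractions import Fraction

def p(q):
    # Fraction keeps q in lowest terms with a positive denominator,
    # so equal rationals always yield the same pair.
    return (q.numerator, q.denominator)

print(p(Fraction(1, 2)), p(Fraction(2, 4)))  # (1, 2) (1, 2)
print(p(Fraction(-3, 6)))                    # (-1, 2)

# Spot check injectivity on a small sample of rationals.
qs = {Fraction(a, b) for a in range(-5, 6) for b in range(1, 6)}
print(len({p(q) for q in qs}) == len(qs))
```

Note that numerators may be negative, so the pairs live in $\mathbb{Z} \times \mathbb{N}$ rather than $\mathbb{N} \times \mathbb{N}$.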

A rock and a hard place

We found that $|\mathbb{N}| \le |\mathbb{Q}| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$, so it must be that $|\mathbb{Q}| = |\mathbb{N}|$, i.e., the rationals are infinitely countable.

The trick we used here is a convenient one: sometimes it's easier to prove countability of some closely related set (e.g., pairs of naturals rather than the rationals themselves), and then use other, simpler mappings to tie things together.

What else?

Lots of other useful sets are countable: the algebraic numbers are countable. Every inductive type we've defined in Coq is countable: lists of countable things, trees of countable things, and arithmetic expressions... to name a few.

Uncountability

At this point, you might be wondering what isn't countable. Having learned that $\aleph_0 \cdot \aleph_0 = \aleph_0$, it may seem implausible that anything could be bigger than the naturals!

Function spaces

Let's consider the set of unary functions on the naturals, $\mathbb{N} \rightarrow \mathbb{N}$. Could this set be countable? You might try to construct some kind of matrix, like we did for $\mathbb{N} \times \mathbb{N}$... but it won't work!

Theorem: $\mathbb{N} \rightarrow \mathbb{N}$ is uncountable, i.e., it is infinite but not countable, i.e., $|\mathbb{N}| < |\mathbb{N} \rightarrow \mathbb{N}|$.

Proof: First, it's easy to show that $|\mathbb{N}| \le |\mathbb{N} \rightarrow \mathbb{N}|$ with the map $f(n) = \texttt{fun _} \Rightarrow n$, i.e., each number $n$ is mapped to the constant-$n$ function, i.e., $f(0)$ is the constant-$0$ function, $f(1)$ is the constant-$1$ function, and so on. (You might write $f(n)(\_) = n$, though I find that harder to read.) Since $f$ is an injection, we have $|\mathbb{N}| \le |\mathbb{N} \rightarrow \mathbb{N}|$.

To see that $|\mathbb{N}| \ne |\mathbb{N} \rightarrow \mathbb{N}|$, it suffices to show that either there can be no injection $(\mathbb{N} \rightarrow \mathbb{N}) \rightarrow \mathbb{N}$ or no surjection $\mathbb{N} \rightarrow (\mathbb{N} \rightarrow \mathbb{N})$. We will go the latter route.

Suppose for a contradiction that there exists a surjection $g : \mathbb{N} \rightarrow (\mathbb{N} \rightarrow \mathbb{N})$.

Let $d(n) = g(n)(n) + 1$. That is, $d$ looks up the $n$th function in $g$, applies that function to $n$, and adds one.

Since $d : \mathbb{N} \rightarrow \mathbb{N}$ and we've assumed $g$ is surjective, it must be the case that there exists some $m \in \mathbb{N}$ such that $g(m) = d$.

We find a contradiction: what is $d(m)$? By definition: \[ \begin{array}{rclr} d(m) &=& g(m)(m) + 1 & \text{by definition} \\ &=& d(m) + 1 & \text{by assumption that $g(m) = d$} \end{array} \] How can it be that $d(m) = d(m) + 1$?! We've arrived at a contradiction, but we've merely assumed that $g$ is surjective... so it must not be the case that $g$ is surjective! QED
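No program can check all of $\mathbb{N} \rightarrow \mathbb{N}$, but we can watch the gadget defeat any particular candidate. The sketch below is in Python. The candidate enumeration `g` (where `g(n)` multiplies its input by `n`) is purely hypothetical and stands in for whatever surjection someone might propose; the name `diagonal` is ours.

```python
def diagonal(g):
    """Build the gadget d(n) = g(n)(n) + 1 from a candidate enumeration g."""
    return lambda n: g(n)(n) + 1

# A hypothetical candidate: the n-th function multiplies its input by n.
def g(n):
    return lambda m: n * m

d = diagonal(g)

# d disagrees with the k-th function at input k, so d can't be g(k).
print(all(d(k) != g(k)(k) for k in range(1000)))
```

Swapping in a different `g` changes the values of `d`, but the disagreement at input `k` survives, because `d(k)` is always one more than `g(k)(k)`.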

Another matrix intuition

How did that proof just work? We showed that it was impossible to find a certain surjection. The trick was to define something using and then contradicting said surjection. The name for the "something" is a gadget or diagonalization gadget.

Why diagonalization? Let's think of how we might write out the (totally nonexistent) surjection $g$ from the proof. We have $g : \mathbb{N} \rightarrow (\mathbb{N} \rightarrow \mathbb{N})$, i.e., $g(n)$ is a function that maps each natural to another natural. We can draw $g$ out as a matrix:

\[ \begin{array}{c|cccccc} & 0 & 1 & 2 & 3 & 4 & \dots \\ 0 & n_{00} & n_{01} & n_{02} & n_{03} & n_{04} & \dots \\ 1 & n_{10} & n_{11} & n_{12} & n_{13} & n_{14} & \dots \\ 2 & n_{20} & n_{21} & n_{22} & n_{23} & n_{24} & \dots \\ 3 & n_{30} & n_{31} & n_{32} & n_{33} & n_{34} & \dots \\ 4 & n_{40} & n_{41} & n_{42} & n_{43} & n_{44} & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} \]

We would say that $g(0) = h$ where $h(m) = n_{0m}$. And, say, $g(1) = k$, where $k(i) = n_{1i}$, and so on. In general, $g(i)(j) = n_{ij}$. Since $g$ is surjective, every function from naturals to naturals should show up as some row in the matrix.

We defined our gadget $d$ so that $d(m) = g(m)(m) + 1 = n_{mm} + 1$. That tricky ${} + 1$ is the turn of the screw. Where could we possibly put $d$ in the matrix? If $d$ is in the $k$th row, then $n_{k0} = d(0)$ and $n_{k1} = d(1)$, and so on. In particular, $n_{kk} = d(k) = g(k)(k) + 1 = n_{kk} + 1$, which is impossible.

The essential thing is that (1) $d$ identifies the diagonal of the matrix underlying $g$, and (2) $d$ contradicts whatever is on the diagonal. The combination of these two qualities means that there can be no entry for $d$ in $g$, which means $g$ can't be surjective.

Reals

What about the real numbers $\mathbb{R}$? These are the numbers you probably have seen the most in math classes; they're used extensively in trigonometry, algebra, and calculus.

The reals are the classical example of an uncountable set---they're the first set that Cantor proved to be uncountable. It's a bit tricky to do the proof for all reals in general, so we'll restrict our view to just a narrow window on the reals: the open interval $(0,1)$, i.e., every real number between $0$ and $1$, exclusive. We'll call it $\mathbb{R}_{(0,1)}$. (Note that our interval looks like a tuple, but it isn't!)

The interval $(0,1)$

First, let's see that $\newcommand{\Rint}{\mathbb{R}_{(0,1)}}|\Rint| = |\mathbb{R}|$. (This alone should seem like trouble... aren't the reals much bigger than that interval?)

Well, we can find a nice bijection that clearly shows they're the same size. In fact, we can find two (though there are many others):

\[ \begin{array}{rcl} f(x) &=& -\frac{1}{\mathrm{tan}(\pi \cdot x)} \\ g(x) &=& \mathrm{tan}(\pi \cdot x + \frac{\pi}{2}) \\ \end{array} \]

Here's a graph (the two function lines are the same):

[Figure: a graph of $f$ and $g$ from the text; they swoop from deep negatives close to 0 up to high positives close to 1.]

We won't do the legwork, but you can verify that $f(x) = g(x)$, that $f$ and $g$ are defined on every real in $(0,1)$ (but not at $0$ and $1$ themselves), and that for any $y \in \mathbb{R}$, we can find exactly one $x \in (0,1)$ such that $f(x) = g(x) = y$.
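A quick numerical spot check of these claims can be sketched in Python. We read $f$ as the negated reciprocal of the tangent, $-1/\tan(\pi \cdot x)$, and compare it with $g$ on a grid of sample points; the grid and the tolerances are arbitrary choices of ours, and floating-point agreement is evidence, not proof.

```python
import math

def f(x):
    return -1.0 / math.tan(math.pi * x)

def g(x):
    return math.tan(math.pi * x + math.pi / 2)

xs = [k / 100 for k in range(1, 100)]  # sample points strictly inside (0, 1)

# f and g agree up to floating-point error ...
print(all(math.isclose(f(x), g(x), rel_tol=1e-9, abs_tol=1e-9) for x in xs))

# ... and f is strictly increasing on the samples, as an injection should be.
print(all(f(a) < f(b) for a, b in zip(xs, xs[1:])))
```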

Having established that the interval and the reals have the same cardinality, if we can show that $|\mathbb{N}| < |\Rint|$, then we'll know that $|\mathbb{N}| < |\mathbb{R}|$, too.

Theorem: $|\mathbb{N}| < |\Rint|$.

Proof: We've already established that $\Rint$ is infinite, since it's equal in size to $\mathbb{R}$, which itself contains the naturals. So we only need to show that it has strictly greater cardinality.

Suppose, for the sake of a contradiction, that there exists some surjective function $h : \mathbb{N} \rightarrow \Rint$.

Every number in $\Rint$ is of the form $0.d_0\ d_1\ d_2\ \dots$, where each $d_i$ is a digit, i.e., 0 through 9. (If we're being very, very precise, we have to be careful: $0.000\dots = 0$ isn't in our set, nor is $0.999\dots = 1$. Let's ignore this subtlety---we're technically working with a slight superset of $\Rint$, but it won't matter.) So $h(n) = 0.d_{n0}\ d_{n1}\ d_{n2}\ \dots$.

Define a "flipped" digit as one that turns a 4 into a 5 and everything else into a 4, i.e.:

\[ \overline{d} = \begin{cases} 5 & d = 4 \\ 4 & \text{otherwise} \\ \end{cases} \]

Notice that $d \ne \overline{d}$ for any digit $d$. (By not using $0$s or $9$s, we avoid uncomfortable questions about representations of the reals.)

Let our gadget $q$ be defined as the flipped digits on the diagonal, i.e., $q = 0.\overline{d_{00}}\ \overline{d_{11}}\ \overline{d_{22}}\ \dots$.

Since $h$ is surjective, there must be some $k$ such that $q = h(k)$. But then what is the $k$th digit of $q$? Since $q = h(k) = 0.d_{k0}\ d_{k1}\ d_{k2}\ \dots$, it's $d_{kk}$... but we defined $q$ so that its $k$th digit is $\overline{d_{kk}}$, and it can't be the case that $d_{kk} = \overline{d_{kk}}$. We've arrived at a contradiction, so no such surjection $h$ can exist. QED
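The gadget can be computed for any finite prefix of a proposed listing. Here is a sketch in Python, assuming each number is given by a list of its first few digits after the decimal point. The sample `rows` are made up for illustration, and the names `flip` and `diagonal_digits` are ours.

```python
def flip(d):
    # Turns a 4 into a 5 and everything else into a 4, so flip(d) != d.
    return 5 if d == 4 else 4

def diagonal_digits(rows):
    """rows[n] holds at least the first len(rows) digits of h(n)."""
    return [flip(rows[n][n]) for n in range(len(rows))]

# A hypothetical finite prefix of an enumeration h.
rows = [
    [1, 4, 1, 5],  # h(0) = 0.1415...
    [7, 1, 8, 2],  # h(1) = 0.7182...
    [4, 4, 4, 4],  # h(2) = 0.4444...
    [5, 0, 0, 0],  # h(3) = 0.5000...
]

q = diagonal_digits(rows)
print(q)  # [4, 4, 5, 4]

# q differs from the k-th row in its k-th digit, for every row we listed.
print(all(q[k] != rows[k][k] for k in range(len(rows))))
```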

Question: Can you draw the matrix that gives the intuition for why $q$ captures the diagonal?

A recipe for uncountability

What just happened? We used Cantor's diagonal argument, also called diagonalization. To see that an infinite set was strictly bigger than another, we showed there could be no surjection from the smaller set to the bigger one.

We can sum up diagonalization as follows.

Recipe: $|A| < |B|$

(makes 1 diagonal argument)

Ingredients:

Steps:

  1. Prove that $|A| \le |B|$ by conventional methods (i.e., injection or surjection).
  2. Suppose for a contradiction that there exists $f : A \rightarrow B$, surjective.
  3. Define a gadget that uses $f$ "on itself" somehow and then modifies the result, i.e., disrupts the diagonal.
  4. Observe that there can be no $a \in A$ such that $f(a)$ yields your gadget.
  5. Since $f$ doesn't hit your gadget, $f$ can't be surjective---a contradiction!

The hardest part here is step 3: how do you find a gadget? Sadly, the answer is: ingenuity and experience. For $\mathbb{N} \rightarrow \mathbb{N}$, our gadget on $n$ looked up the $n$th function, called it on $n$, and added 1. On the reals, it looked up the $n$th number and its $n$th digit and then it... flipped it.