We've defined relations like $\le$ in Coq... what are they like in mathematics?

Relations as sets of tuples

The conventional account of relations is that they are sets: a unary relation or "property" is just the set of elements that satisfy the relation; a binary relation is a set of pairs; and an $n$-ary relation is a set of $n$-tuples.

So we might say, concretely, that the unary relation $\mathsf{isPrime}$ is just the set $\{ 2, 3, 5, 7, 11, \dots \}$, while the binary relation $\ge$ is the set $\{ (0,0), (1, 0), (2, 0), \dots, (1, 1), (2, 1), \dots \}$. We've already seen ternary relations: say, the mystery relation $R$ from day 16.

Here's an example of a binary relation we can use throughout: $B \subseteq \mathsf{RPS} \times \mathsf{RPS}$ defines the "beats" relation for rock/paper/scissors:

\[ B = \{ (\mathsf{rock}, \mathsf{scissors}), (\mathsf{scissors}, \mathsf{paper}), (\mathsf{paper}, \mathsf{rock}) \} \]
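
To tie this back to Coq, here's a minimal sketch of the same relation as a Prop-valued definition (the inductive type RPS and the name beats are ours, not from any library):

Inductive RPS : Type := rock | paper | scissors.

Definition beats (x y : RPS) : Prop :=
  match x, y with
  | rock, scissors | scissors, paper | paper, rock => True
  | _, _ => False
  end.

(* e.g., beats rock scissors reduces to True; beats rock paper reduces to False *)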

Relation notation

For a unary relation $R$---a relation on one thing---we might write:

\[ R(x) \qquad \text{or} \qquad x \in R \]

to say that the relation holds of $x$. For a binary relation $S$, we might write:

\[ x \mathrel{S} y \qquad S(x, y) \qquad \text{or} \qquad (x, y) \in S \]

to say that $x$ is related to $y$ in $S$. The tuple membership notation using $\in$ is universally understood, but there are many different notations people use.

Concretely, we might say that $\mathsf{rock} \mathrel{B} \mathsf{scissors}$ or $(\mathsf{rock}, \mathsf{scissors}) \in B$ to say that $\mathsf{rock}$ beats $\mathsf{scissors}$.

To get nicer typesetting, use \mathrel{P} to tell $\LaTeX$ to interpret $P$ as a relation. Compare the spacing: x P y gives $x P y$ and x \mathrel{P} y gives $x \mathrel{P} y$. You can even use a macro, writing \newcommand{\P}{\mathrel{P}} to define a macro \P. Check out \mathbin (for binary operators, like $+$) and \mathord (to treat an operator or relation as an ordinary symbol), too.

Types

Each relation has a type; whenever you introduce a relation, you should give its type.

Unary relations are essentially subsets, i.e., $\mathsf{isPrime} \subseteq \mathbb{N}$ and $\mathsf{isPrime} \in \mathcal{P}(\mathbb{N})$ are two ways to give a type to $\mathsf{isPrime}$.

Binary (and on through higher arity) relations are subsets of a product set. So $\mathord{\ge} \subseteq \mathbb{N} \times \mathbb{N}$, and the mystery relation $R \subseteq \mathbb{N} \times \mathbb{N} \times \mathbb{N}$. Equivalently, we might say that $\mathord{\ge} \in \mathcal{P}(\mathbb{N} \times \mathbb{N})$ or that $R \in 2^{\mathbb{N} \times \mathbb{N} \times \mathbb{N}}$.

I gently prefer the $R \subseteq \dots$ notation to the various powerset notations.
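
The Coq analogue of giving a relation a type is a Prop-valued arrow type: a unary relation on $A$ becomes a term of type A -> Prop, and a binary relation becomes A -> B -> Prop. A small sketch (the names unary_rel and binary_rel are ours):

Definition unary_rel (A : Type) : Type := A -> Prop.
Definition binary_rel (A B : Type) : Type := A -> B -> Prop.

Check (le : binary_rel nat nat).
Check ((fun n : nat => exists k, n = 2 * k) : unary_rel nat).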

Inference rules as sets of tuples

In Coq, we use inductive definitions to define relations. How does that work here? Let's recall the definition of sorted:

Inductive sorted: list nat -> Prop :=
  | sorted_nil          : sorted []
  | sorted_1    (x:nat) : sorted [x]
  | sorted_cons (x y:nat) (l:list nat) (Hxy : x <= y) (H : sorted (y::l)) : sorted (x::y::l).

We can turn the inductive Coq definition into an inductively defined unary relation $\newcommand{\sorted}{\textsf{sorted}}\sorted \subseteq \texttt{list}(\mathbb{N})$ as follows:

\[ \begin{array}{l} {[]} \in \sorted \\ {[x]} \in \sorted \quad \text{for all } x \in \mathbb{N} \\ \text{if } x \le y \text{ and } y::l \in \sorted \text{, then } x::y::l \in \sorted \\ \end{array} \]

Notice how the axioms sorted_nil and sorted_1 just become assertions about some members of the $\sorted$ relation, but the inference rule sorted_cons turns into an if/then statement about membership.

We could also use function notation, writing:

\[ \begin{array}{l} \sorted([]) \\ \sorted([x]) \quad \text{for all } x \in \mathbb{N} \\ \text{if } x \le y \text{ and } \sorted(y::l) \text{, then } \sorted(x::y::l) \\ \end{array} \]

Or we could even just use English: the empty list is sorted; any one-element list is sorted; and if $x \le y$ and $y::l$ is sorted, then $x::y::l$ is sorted.

All three of these approaches are good, common ways to define relations.

What about a binary relation, like $\newcommand{\perm}{\textsf{permutation}}\perm$? The idea is the same, but the syntax changes to account for being a binary relation. First, recall the Coq definition:

Inductive Permutation {A : Type} : list A -> list A -> Prop :=
| perm_nil: Permutation [] []
| perm_skip (x:A) (l l':list A) (H : Permutation l l') : Permutation (x::l) (x::l')
| perm_swap (x y:A) (l:list A) : Permutation (y::x::l) (x::y::l)
| perm_trans (l1 l2 l3 : list A) (H12 : Permutation l1 l2) (H23 : Permutation l2 l3) 
  : Permutation l1 l3.

Here's a way to define it directly using tuples:

\[ \begin{array}{l} ([], []) \in \perm \\ \text{if } (l, l') \in \perm \text{, then } (x::l, x::l') \in \perm \\ (y::x::l, x::y::l) \in \perm \\ \text{if } (l_1, l_2) \in \perm \text{ and } (l_2, l_3) \in \perm \text{, then } (l_1, l_3) \in \perm \\ \end{array} \]

It's not uncommon for binary relations to use infix notation. We might define a relation $\newcommand{\P}{\mathrel{P}}\mathord{\P} \subseteq \texttt{list}(A) \times \texttt{list}(A)$ for any set $A$ as:

\[ \begin{array}{l} {[]} \P {[]} \\ \text{if } l \P l' \text{, then } x::l \P x::l' \\ y::x::l \P x::y::l \\ \text{if } l_1 \P l_2 \text{ and } l_2 \P l_3 \text{, then } l_1 \P l_3 \\ \end{array} \]

Here, we'd read $x \P y$ as "$x$ permutes $y$" or "$x$ is a permutation of $y$".

Properties of relations

We've already seen several properties of relations---let's meet their formal, set-theoretic definitions.

For all but the last of these properties (determinism), we'll talk about relations "on" a set, i.e., relations that relate a set to itself, i.e., $R \subseteq A \times A$.

Reflexivity

A relation $R \subseteq A \times A$ is reflexive when it relates every element of $A$ to itself. Here are some examples of reflexive relations: $=$ and $\le$ and $\ge$ on $\mathbb{N}$, and $\perm$ on lists (every list is a permutation of itself).

And here are some relations that aren't reflexive: $<$ and $>$ and $\ne$ on $\mathbb{N}$, and the $B$ relation on $\mathsf{RPS}$ (nothing beats itself).

Formally, we say $R \subseteq A \times A$ is reflexive iff $\forall a \in A, (a,a) \in R$.
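
Back in Coq, this definition might be sketched as follows (the name reflexive is ours; the standard library has an equivalent definition in Relation_Definitions):

Definition reflexive {A : Type} (R : A -> A -> Prop) : Prop :=
  forall a : A, R a a.

(* <= on nat is reflexive: le_n is exactly the "relate n to itself" constructor *)
Example le_reflexive : reflexive le.
Proof. intro a. apply le_n. Qed.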

Symmetry

A relation $R \subseteq A \times A$ is symmetric when you can swap places. Here are some examples of symmetric relations: $=$ and $\ne$ on $\mathbb{N}$, and $\perm$ on lists (if $l_1$ permutes $l_2$, then $l_2$ permutes $l_1$).

And here are some relations that aren't symmetric: $\le$ and $<$ on $\mathbb{N}$, and the $B$ relation ($\mathsf{rock}$ beats $\mathsf{scissors}$, but not vice versa).

Formally, we say $R \subseteq A \times A$ is symmetric iff $\forall a, b \in A, (a,b) \in R \Rightarrow (b,a) \in R$. Notice that we don't bother writing an if-and-only-if---one direction is enough, because $a$ and $b$ range over arbitrary elements from the set $A$.

Antisymmetry

A relation $R \subseteq A \times A$ is antisymmetric when symmetry only occurs for equal things. That is, formally, $\forall a, b \in A, (a,b) \in R \wedge (b,a) \in R \Rightarrow a = b$. Partial orders like $\le$ and $\ge$ are classic examples of antisymmetric relations.

Transitivity

A relation $R \subseteq A \times A$ is transitive when you can combine hops. Here are some examples of transitive relations: $=$ and $\le$ and $<$ on $\mathbb{N}$, and $\perm$ on lists.

And here are some relations that aren't transitive: $\ne$ on $\mathbb{N}$ ($1 \ne 2$ and $2 \ne 1$, but not $1 \ne 1$), and the $B$ relation ($\mathsf{rock}$ beats $\mathsf{scissors}$ and $\mathsf{scissors}$ beats $\mathsf{paper}$, but $\mathsf{rock}$ doesn't beat $\mathsf{paper}$).

Formally, we say $R \subseteq A \times A$ is transitive iff $\forall a, b, c \in A, (a,b) \in R \wedge (b,c) \in R \Rightarrow (a,c) \in R$.
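
A Coq sketch of the same definition, using the standard library's Nat.le_trans to show that $\le$ is transitive (the name transitive is ours):

Require Import PeanoNat.

Definition transitive {A : Type} (R : A -> A -> Prop) : Prop :=
  forall a b c : A, R a b -> R b c -> R a c.

Example le_transitive : transitive le.
Proof. intros a b c Hab Hbc. exact (Nat.le_trans a b c Hab Hbc). Qed.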

Determinism

Finally, a relation $R \subseteq A \times B$ is deterministic when everything is related to at most one thing. Here are some examples of deterministic relations: equality on $\mathbb{N}$, the $B$ relation on $\mathsf{RPS}$ (each move beats exactly one other move), and the relation that relates a list to its length.

And here are some that are not deterministic: $\le$ and $<$ on $\mathbb{N}$ (for example, $0 < 1$ and $0 < 2$), $\ne$, and $\perm$ (a list is typically a permutation of many lists).

Formally, we say that $R \subseteq A \times B$ is deterministic when: \[ \forall a \in A, b_1, b_2 \in B, (a, b_1) \in R \wedge (a, b_2) \in R \Rightarrow b_1 = b_2. \]

We can talk about higher arity relations being deterministic, too: we treat the last position specially. So $Q \subseteq A \times B \times C \times D$ is deterministic when $(a,b,c,d_1) \in Q$ and $(a,b,c,d_2) \in Q$ implies $d_1 = d_2$.
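
Here's a Coq sketch of determinism, using the relation that relates each number to its successor as an example (the names deterministic and succ_rel are ours):

Definition deterministic {A B : Type} (R : A -> B -> Prop) : Prop :=
  forall (a : A) (b1 b2 : B), R a b1 -> R a b2 -> b1 = b2.

(* the relation { (n, S n) | n in nat }, written as a Prop-valued definition *)
Definition succ_rel (n m : nat) : Prop := m = S n.

Example succ_rel_det : deterministic succ_rel.
Proof. intros a b1 b2 H1 H2. unfold succ_rel in *. subst. reflexivity. Qed.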

Equality is deterministic?

You might be surprised to see equality in the list of deterministic relations. How can equality be deterministic when $4 = 2 + 2$ but $4 = 3 + 1$?

The real question here is: what even is equality? Equality on the natural numbers is defined as:

\[ E \doteq \{ (n,n) \mid n \in \mathbb{N} \} \]

Where we say $n = m$ iff $(n,m) \in E$. (We write $\doteq$ in the definition to distinguish the equals sign doing the defining from the $=$ symbol being defined.)
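
This diagonal-style definition is essentially how Coq defines equality, too: a single constructor that only ever relates a thing to itself. Roughly, the standard library's definition looks like:

(* a sketch of (roughly) how Coq.Init.Logic defines equality *)
Inductive eq {A : Type} (x : A) : A -> Prop :=
| eq_refl : eq x x.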

If you wanted to talk about equality on some other set---say, the set of arithmetic expressions that evaluate to the same answer---then indeed that notion of equality is not deterministic.

Operations on relations

Inverse

Given a relation $R \subseteq A \times B$, we can talk about its inverse, written $R^{-1} \subseteq B \times A$, defined as $R^{-1} = \{ (b,a) \mid (a,b) \in R \}$. To put it another way:

\[ a \mathrel{R} b \Leftrightarrow b \mathrel{R^{-1}} a \]

Concretely, the inverse of $\le$ is $\ge$, i.e., $a \ge b$ iff $b \le a$. The inverse of our $B$ relation above is:

\[ B^{-1} = \{ (\mathsf{scissors},\mathsf{rock}), (\mathsf{paper},\mathsf{scissors}), (\mathsf{rock},\mathsf{paper}) \} \]

We might call $B^{-1}$ the "loses to" relation.

The notation here is an analogy to $x^{-1} = \frac{1}{x}$, i.e., the multiplicative inverse of $x$. We'll learn more about inverses when we look at inverses of functions.
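
In Coq, the inverse is just an argument swap; here's a sketch (the name inverse is ours):

Definition inverse {A B : Type} (R : A -> B -> Prop) : B -> A -> Prop :=
  fun b a => R a b.

(* the inverse of <= behaves like >= *)
Example inverse_le_ge : forall n m : nat, inverse le n m <-> n >= m.
Proof. intros n m. split; intro H; exact H. Qed.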

Composition

Relations can be composed or chained together. Suppose we had the following two relations: $R \subseteq \mathsf{City} \times \mathsf{Dish}$ and $S \subseteq \mathsf{Dish} \times \mathsf{Ingredient}$. We define them as:

\[ \begin{array}{rcl} R &=& \{ (\mathsf{Vienna}, \mathsf{schnitzel}), (\mathsf{Philadelphia}, \mathsf{cheesesteak}), (\mathsf{Puebla}, \mathsf{al\ pastor}) \} \\ S &=& \{ (\mathsf{schnitzel}, \mathsf{veal}), (\mathsf{schnitzel}, \mathsf{pork}), \\ && \phantom{\{} (\mathsf{cheesesteak}, \mathsf{wiz}), (\mathsf{cheesesteak}, \mathsf{provolone}), (\mathsf{hargao}, \mathsf{shrimp}) \} \\ \end{array} \]

We can compose these two relations together to get a relation between cities and ingredients: think of it as using the dish as a transitive "hop", a mathematical layover flight. We'd compose them to find a new relation:

\[ T = \{ (\mathsf{Vienna}, \mathsf{veal}), (\mathsf{Vienna}, \mathsf{pork}), (\mathsf{Philadelphia}, \mathsf{wiz}), (\mathsf{Philadelphia}, \mathsf{provolone}) \} \]

We say that $T = R;S$ or $T = S \circ R$. These differing notations with differing orderings are indeed confusing. Here the types of $R$ and $S$ make it unambiguous---there's only one way to compose these two relations. If the types don't settle it, always check which way folks mean! I tend to prefer the $R;S$ notation, but we'll meet $\circ$ again.

Formally, we define composition as:

\[ R;S = \{ (a,c) \mid (a,b) \in R, (b, c) \in S \} \]

Use this definition to check for yourself that $T = R;S$. Notice that some things are left unrelated: $\mathsf{al\ pastor}$ isn't related to anything by $S$, and no city is related to $\mathsf{hargao}$ by $R$... so $\mathsf{Puebla}$ and $\mathsf{shrimp}$ are left out of $T$.
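
A Coq sketch of composition, where the intermediate $b$ shows up as an existential (the name rel_comp is ours):

Definition rel_comp {A B C : Type}
    (R : A -> B -> Prop) (S : B -> C -> Prop) : A -> C -> Prop :=
  fun a c => exists b, R a b /\ S b c.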

Closure operations

It's not uncommon to have a "starter" relation that one defines and then modifies in some way. Since relations are just sets, all of the set operations apply---union, intersection, difference, and restriction (a/k/a set-builder notation, $\{(a,b) \in R \mid P(a, b)\}$) are all meaningful and useful things to do to relations.

There are three so-called closure operations that take a relation and produce a similar relation that also enjoys a useful relational property.

Reflexive closure and the diagonal relation

Given a relation $R \subseteq A \times A$, we can make it reflexive by taking the reflexive closure, i.e., taking $R$ and adding in some tuples to make sure that our new relation is reflexive.

For example, the reflexive closure of $<$ is $\le$ and the reflexive closure of $>$ is $\ge$. The reflexive closure of $\ne$ is the total relation---the relation that relates everything.

Formally, we define the reflexive closure of $R \subseteq A \times A$ as $R \cup \{ (a,a) \mid a \in A \}$.
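
Sketched in Coq (the name refl_closure is ours):

Definition refl_closure {A : Type} (R : A -> A -> Prop) : A -> A -> Prop :=
  fun a b => R a b \/ a = b.   (* R union the diagonal *)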

The diagonal relation and the matrix intuition

People sometimes call $\Delta_A = \{ (a,a) \mid a \in A \}$ the diagonal relation on $A$. Why?

One way to visualize a relation $R \subseteq A \times A$ is as a (possibly infinite) matrix. Let the rows and columns correspond to elements of $A$ in some fixed order; the entry in the $i$th row and $j$th column is $\top$ iff $(a_i, a_j) \in R$.

Here's how we'd write $B$ for $\mathsf{RPS}$:

\[ \begin{array}{c|ccc} & \mathsf{rock} & \mathsf{paper} & \mathsf{scissors} \\ \hline \mathsf{rock} & \bot & \bot & \top \\ \mathsf{paper} & \top & \bot & \bot \\ \mathsf{scissors} & \bot & \top & \bot \\ \end{array} \]

Notice that $\mathsf{paper}$ beats $\mathsf{rock}$, so the row for $\mathsf{paper}$ has $\top$ in the column for $\mathsf{rock}$.

Now, suppose we have some arbitrary set $A$. Here's $\Delta_A$:

\[ \begin{array}{c|cccc} & a_1 & a_2 & a_3 & \dots \\ \hline a_1 & \top & \bot & \bot & \dots \\ a_2 & \bot & \top & \bot & \dots \\ a_3 & \bot & \bot & \top & \dots \\ \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} \]

The diagonal of the matrix for $\Delta_A$ is all $\top$ and everything else is $\bot$!

Question: if $R$ is already reflexive, does taking the reflexive closure produce the same relation or a different one?

Question: does it make sense to take the reflexive closure of a relation $R \subseteq A \times B$?

Symmetric closure and the inverse operation

Given a relation $R \subseteq A \times A$, we can generate a similar relation that enjoys symmetry by adding in $R$'s inverse. For example, the symmetric closure of $\le$ is the total relation; the symmetric closure of $<$ is $\ne$.

Formally, we define the symmetric closure of a relation $R \subseteq A \times A$ as $R \cup R^{-1}$.
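
And in Coq (the name sym_closure is ours):

Definition sym_closure {A : Type} (R : A -> A -> Prop) : A -> A -> Prop :=
  fun a b => R a b \/ R b a.   (* R union its inverse *)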

Question: how would you prove that the symmetric closure produces symmetric relations?

Question: what is the symmetric closure of $B \subseteq \mathsf{RPS} \times \mathsf{RPS}$?

Question: if $R$ is already symmetric, does taking the symmetric closure produce the same relation or a different one?

Question: does it make sense to take the symmetric closure of a relation $R \subseteq A \times B$?

Transitive closure and the composition operation

Finally, we can take a relation $R \subseteq A \times A$ and produce a similar relation that's transitive by iteratively composing $R$ with itself, i.e., taking $R \cup (R;R) \cup (R;R;R) \cup \dots$.

Negation

Let the relation $A = \{ (\top, \bot), (\bot, \top) \}$, i.e., $A \subseteq \texttt{bool} \times \texttt{bool}$ is the relation corresponding to negb, boolean negation.

What is $A;A$? It's defined as $\{ (x,z) \mid (x,y) \in A \wedge (y,z) \in A \}$. On the one hand, we could say $x = \top$ and $y=\bot$ (first tuple) and $z=\top$ (second tuple). So $(\top, \top) \in A;A$. On the other hand, we could say $x = \bot$ and $y = \top$ (second tuple) and $z = \bot$ (first tuple). So $(\bot, \bot) \in A;A$.

We've looked at all pairs of tuples in $A$, so $A;A = \{ (\top,\top), (\bot, \bot) \} = \Delta_{\texttt{bool}}$. To compute the transitive closure of $A$, we really need to compute $\bigcup_{1 \le i} A^i = A^1 \cup A^2 \cup A^3 \cup \dots = A \cup (A;A) \cup (A;A;A) \cup \dots$.

We've already got the first two parts of that infinite union: $A$ and $A;A$. $A \cup A;A = \{ (\top, \bot), (\bot, \top) \} \cup \{ (\top,\top), (\bot, \bot) \} = \texttt{bool} \times \texttt{bool}$. Okay, we can stop---we've saturated, there's no way to add anything more.

Operational intuition

While we formally define transitive closure using an infinite union, there's a clean operational intuition for finding the transitive closure of some relation $R$. Pick a tuple $(a,b) \in R$. Find every tuple of the form $(b,c) \in R$ for some $c$, and now add $(a,c)$ to $R$. Keep this process up until you can't find any new tuples---and then you're done!

What we've just described is called a least fixed point operation. They turn up all over computer science.
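
In Coq, this least-fixed-point flavor comes out naturally as an inductive definition; here's a sketch (the standard library has a similar clos_trans in Coq.Relations.Relation_Operators):

Inductive trans_closure (A : Type) (R : A -> A -> Prop) : A -> A -> Prop :=
| tc_step  (a b : A)   : R a b -> trans_closure A R a b
| tc_hop   (a b c : A) : R a b -> trans_closure A R b c -> trans_closure A R a c.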

Successor

For example, the transitive closure of the successor relation $N = \{ (n, S(n)) \mid n \in \mathbb{N} \} \subseteq \mathbb{N} \times \mathbb{N}$ is $<$. How come? Here's a matrix for $N$ itself:

\[ \begin{array}{c|cccccc} & 0 & 1 & 2 & 3 & 4 & \dots \\ \hline 0 & \bot & \top & \bot & \bot & \bot & \dots \\ 1 & \bot & \bot & \top & \bot & \bot & \dots \\ 2 & \bot & \bot & \bot & \top & \bot & \dots \\ 3 & \bot & \bot & \bot & \bot & \top & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} \]

Here's $N;N$:

\[ \begin{array}{c|cccccc} & 0 & 1 & 2 & 3 & 4 & \dots \\ \hline 0 & \bot & \bot & \top & \bot & \bot & \dots \\ 1 & \bot & \bot & \bot & \top & \bot & \dots \\ 2 & \bot & \bot & \bot & \bot & \top & \dots \\ 3 & \bot & \bot & \bot & \bot & \bot & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} \]

And here's $N;N;N$:

\[ \begin{array}{c|cccccc} & 0 & 1 & 2 & 3 & 4 & \dots \\ \hline 0 & \bot & \bot & \bot & \top & \bot & \dots \\ 1 & \bot & \bot & \bot & \bot & \top & \dots \\ 2 & \bot & \bot & \bot & \bot & \bot & \dots \\ 3 & \bot & \bot & \bot & \bot & \bot & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} \]

Notice that with each composition, the $\top$'s "march" one step further to the right: composing $N$ with itself $m$ times relates $n$ to $n + m$. Taking the union over all of these compositions relates $n$ to every strictly larger number, which is exactly $<$.

Formalism

People use the following notation:

\[ \begin{array}{rcl} R^0 &=& \Delta_A \\ R^1 &=& R \\ R^{n+1} &=& R;R^n \\ \end{array} \]

That is, $R^n$ is $R$ composed with itself $n$ times.

And then we formally define transitive closure as:

\[ R^+ = \bigcup_{1 \le i} R^i \]

People also define the reflexive, transitive closure as:

\[ R^* = \bigcup_{0 \le i} R^i = \Delta_A \cup R^+ \]

The star in $R^*$ is called "Kleene star", after Stephen Cole Kleene. You'll learn more about it in CS 101!
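
Here's a Coq sketch of this formalism: $R^n$ as a Fixpoint on $n$, and then $R^+$ and $R^*$ as "for some number of steps" definitions (all of the names are ours):

Fixpoint rel_pow {A : Type} (R : A -> A -> Prop) (n : nat) : A -> A -> Prop :=
  match n with
  | O    => fun a b => a = b                                 (* R^0 is the diagonal *)
  | S n' => fun a c => exists b, R a b /\ rel_pow R n' b c   (* R^(n+1) = R ; R^n *)
  end.

Definition plus_closure {A : Type} (R : A -> A -> Prop) (a b : A) : Prop :=
  exists n, 1 <= n /\ rel_pow R n a b.   (* R^+ *)

Definition star_closure {A : Type} (R : A -> A -> Prop) (a b : A) : Prop :=
  exists n, rel_pow R n a b.             (* R^* *)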

Question: if $R$ is already transitive, does taking the transitive closure produce the same relation or a different one?

Question: does it make sense to take the transitive closure of a relation $R \subseteq A \times B$?

Functions as relations

We've spent most of this chapter understanding relations as a special kind of set. It turns out that we can understand a function as a special kind of relation. Recall our "beats" relation, $B \subseteq \mathsf{RPS} \times \mathsf{RPS}$:

\[ B = \{ (\mathsf{rock}, \mathsf{scissors}), (\mathsf{scissors}, \mathsf{paper}), (\mathsf{paper}, \mathsf{rock}) \} \]

We say $\mathsf{rock} \mathrel{B} \mathsf{scissors}$ to say that $\mathsf{rock}$ beats $\mathsf{scissors}$. But we could also think of $B$ as a function: given an element $x$ of $\mathsf{RPS}$, our relation $B$ identifies a unique other element that $x$ beats. We might even go so far as to say $B(\mathsf{rock}) = \mathsf{scissors}$, using function notation.

But we have to be careful---not every relation is a function! For example, the $<$ relation isn't a function: $0 < 1$ and $0 < 2$.

Which relations are functions, and what justifies using the function notation $f(x)$? Two characteristics are critical:

  1. determinism, and
  2. totality.

Determinism a/k/a functionality

A relation is deterministic when: \[ \forall a \in A, b_1, b_2 \in B, (a, b_1) \in R \wedge (a, b_2) \in R \Rightarrow b_1 = b_2. \]

To be considered a function, a relation must be deterministic. Such relations are also called functional, i.e., they can be treated like functions. That is, a relation $R \subseteq A \times B$ is a function from $A$ to $B$ when each $a \in A$ is related to at most one element in $B$.

So $B$ is a function, and so are the successor relation ($\{ (n, S(n)) \mid n \in \mathbb{N} \}$) and the predecessor relation ($\{ (S(n), n) \mid n \in \mathbb{N} \}$). The relation that relates a list to its length is a function, but the relation that relates a length to every list of that length isn't. None of $\ne$ and $<$ and $\perm$ are functions (equality, on the other hand, is deterministic; it's just the identity function).

Totality

We said a functional relation $R \subseteq A \times B$ relates each element in $A$ to at most one element in $B$. Two of our examples---the $B$ relation on $\mathsf{RPS}$ and the successor relation on $\mathbb{N}$---relate every $A$ to a $B$. But the predecessor relation doesn't relate $0$ to anything.

We say that $B$ and successor are total functions, while predecessor is a partial function. Formally, a relation $R \subseteq A \times B$ is total when:

\[ \forall a \in A, \exists b \in B, (a, b) \in R \]

Some of the nomenclature here is confusing: a relation "is total" when it relates everything to at least one thing; the "total relation" relates everything to everything.
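
Sketched in Coq (the name total is ours), with the successor relation as an example of a relation that is total:

Definition total {A B : Type} (R : A -> B -> Prop) : Prop :=
  forall a : A, exists b : B, R a b.

Example succ_rel_total : total (fun n m : nat => m = S n).
Proof. intro n. exists (S n). reflexivity. Qed.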

Function notation

When a relation $R \subseteq A \times B$ is deterministic and total, we can justify writing (1) $R : A \rightarrow B$ to say that $R$ is a function, and (2) $R(a)$ to uniquely determine an element $b \in B$ such that $(a, b) \in R$. Why?

Since $R$ is total, we know that there is at least one $b \in B$ such that $(a, b) \in R$. Since $R$ is deterministic, we know that there is at most one such $b \in B$. So there must be exactly one $b \in B$ such that $(a,b) \in R$. We say that $R(a)$ denotes that unique $b$.

People will sometimes write $\exists^! b \in B, (a,b) \in R$ to mean "there exists exactly one". You can think of this notation as:

\[ \exists^! x \in S, P(x) \equiv \exists x \in S, \left[ P(x) \wedge \forall y \in S, P(y) \Rightarrow x =y \right] \]

One final note: the general convention is that $f$ and $g$ and the like are common function variable names, while $R$ and $S$ are common relation names. When we start defining relations and then showing that they're functions, we'll end up with confusing utterances like $R : A \rightarrow B$. There's no avoiding it for now, but when you're doing your own work, you can try to pick names that are appropriate (e.g., lowercase ones for functions and uppercase ones for relations).

Revisiting exponentiation

If we write $A \rightarrow B$ as $B^A$ (as we saw in day 24), we can see an interesting fact: a function from $A$ to $B$ is an $A$-fold product of $B$'s, i.e., for each element of $A$ we have a $B$. Concretely, a function $f : \texttt{bool} \rightarrow \mathbb{N}$ can be described by an element of $\mathbb{N} \times \mathbb{N}$: the number for $f(\top)$ and the number for $f(\bot)$. That is, we could model the function

\[ f(b) = \begin{cases} 47 & b = \top \\ 54 & b = \bot \\ \end{cases} \]

as a relation $f = \{ (\top, 47), (\bot, 54) \} \subseteq \texttt{bool} \times \mathbb{N}$. But we can also see that $f = (47, 54) \in \mathbb{N}^\texttt{bool}$ using the ordering $\top, \bot$. Or we might write a mapping, as in $f = \langle \top \mapsto 47, \bot \mapsto 54 \rangle$. More generally:

\[ A \rightarrow B \quad = \quad B^A \quad = \quad \underbrace{B \times B \times \dots \times B}_{\text{$|A|$ times}} \]

So we might say:

\[ f = (b_1, b_2, \dots, b_{|A|}) = \langle a_1 \mapsto b_1, a_2 \mapsto b_2, \dots, a_{|A|} \mapsto b_{|A|} \rangle \]

The first notation simply fixes some order for the elements of $A$; the second notation is a bit more explicit, using angle brackets to differentiate this $A$-wise mapping from an ordinary tuple. (Folks use square or curly brackets, too.) You can think of this mapping as a record or object, where each field of that object is an element of $A$.

In any case, if $A$ is infinite, the dots above can only be informally interpreted... but maybe this intuition helps you!

Parts of a function

There are some key terms for thinking about functions that are useful to have in your pocket: domain and co-domain, and pre-image and image.

Domain and codomain

When $f : A \rightarrow B$, we say that $f$'s domain is $A$ and its codomain is $B$. People write $\operatorname{dom}(f)$ to mean the domain and $\operatorname{cod}(f)$ to mean the codomain. So $B : \mathsf{RPS} \rightarrow \mathsf{RPS}$ has $\operatorname{dom}(B) = \operatorname{cod}(B) = \mathsf{RPS}$, while $\texttt{sum} : \texttt{list}(\mathbb{N}) \rightarrow \mathbb{N}$ has domain $\texttt{list}(\mathbb{N})$ and codomain $\mathbb{N}$.

Image

Recall the successor relation, $\{ (n, S(n)) \mid n \in \mathbb{N} \}$. Let's call it a function, $\mathit{succ}$. (Now, note that $\mathit{succ}(n) = S(n)$, so this definition doesn't get us anywhere we couldn't go already---but let's ignore that for now!)

The image of $\mathit{succ}$ is the set of things that it maps to, i.e., $\mathrm{Image}(\mathit{succ}) = \{ m \mid \exists n, \mathit{succ}(n) = m \}$. Here, the image of $\mathit{succ}$ is the positives: there is no $n$ such that $\mathit{succ}(n) = 0$!

Formally, we say that:

\[ \mathrm{Image}(f) = \{ y \in \operatorname{cod}(f) \mid \exists x \in \operatorname{dom}(f), f(x) = y \} \]

Observe that $\mathrm{Image}(f) \subseteq \operatorname{cod}(f)$ by definition.
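
A Coq sketch of the image as a predicate on the codomain, together with the observation above that $0$ is not in the image of successor (the name in_image is ours):

Definition in_image {A B : Type} (f : A -> B) (y : B) : Prop :=
  exists x : A, f x = y.

Example zero_not_in_image_of_S : ~ in_image S 0.
Proof. unfold in_image. intros [x Hx]. discriminate Hx. Qed.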

Pre-image

Consider the function $\texttt{sum} : \texttt{list}(\mathbb{N}) \rightarrow \mathbb{N}$. Which lists $l$ yield $\texttt{sum}(l) = 0$? Well, the empty list $\texttt{[]}$ is surely one. But so is $\texttt{[$0$]}$, and $\texttt{[$0$, $0$]}$, and so on.

Another way to phrase our question is to ask what the pre-image (a/k/a the inverse image) of $\{ 0 \}$ is for $\texttt{sum}$. For a given set $B \subseteq \operatorname{cod}(f)$, the pre-image of $B$ under $f$ is the set of inputs which will cause $f$ to produce an element in $B$. Formally:

\[ \mathrm{Pre}_f(B) = \{ x \in \operatorname{dom}(f) \mid f(x) \in B \} \]

So $\mathrm{Pre}_{\texttt{sum}}(\{ 0 \}) = \{ l \mid \forall x, x \text{ in } l \Rightarrow x = 0 \}$: the lists all of whose elements are zero.

Question: It's a theorem that $\mathrm{Pre}_f(\operatorname{cod}(f)) = \operatorname{dom}(f)$ iff $f$ is a total function. How would you prove it?

Home on the range

You may have heard the word "range" when talking about a function. There's a problem, though: some people think a function's range is its codomain, and some think a function's range is its image. Rather than have to constantly clarify, it's better to simply avoid the word and use more precise language.

Properties of functions

There are a number of important properties of functions; we'll talk about three in particular: partiality/totality, injectivity, and surjectivity.

Partial functions

We learned about partial functions already: a function $f : A \rightarrow B$ is called total when $f(x)$ is defined for all $x \in A$. A function is partial when it isn't total; people often write $f : A \rightharpoonup B$ to say that $f$ is partial.

We've already met many partial functions: $\texttt{pred} : \mathbb{N} \rightharpoonup \mathbb{N}$ is a partial function, as is subtraction on naturals. But note that in Coq, every function was total---we made $\texttt{pred}$ total by saying $\texttt{pred}(0) = 0$. In math, folks are usually content to leave $\texttt{pred}(0)$ undefined.

Undefinedness is a hazard to navigation. If a function $g$ is partial, then you must always be careful: if you say $g(x)$, you need to know that $g$ is defined for $x$!

Other examples of partial functions are: square root (not defined on negative numbers, at least for the reals $\mathbb{R}$), division (not defined when its second argument is 0), the head and tail functions on lists (not defined on empty lists), and the mean function on lists (not defined on empty lists). You'll meet many others as you learn more!

Confusingly, some people say total functions are a subset of the partial functions, i.e., every total function is also a partial function. I find that confusing!

Injectivity a/k/a one-to-one

A function is injective (from Latin for "throwing in") when no two inputs yield the same output. Some examples of injective functions are successor and (true) predecessor. Some non-injective functions are $f(x) = x^2$ on the integers (since $f(-5) = f(5)$) or the Coq predecessor function (which says pred 1 = pred 0 = 0).

Formally, a function $f : A \rightarrow B$ is injective when:

\[ \forall a_1, a_2 \in A, f(a_1) = f(a_2) \Rightarrow a_1 = a_2 \]

Note that we've seen a very similar but not identical property in determinism. A relation is deterministic when no input has two outputs; a function is injective when no output has two inputs.

People will say that an injective function is one-to-one or 1:1, meaning that each (one) input is mapped uniquely to one output.
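
A Coq sketch (the name injective is ours), with successor as the worked example:

Definition injective {A B : Type} (f : A -> B) : Prop :=
  forall a1 a2 : A, f a1 = f a2 -> a1 = a2.

Example S_injective : injective S.
Proof. intros a1 a2 H. injection H. intros H'. exact H'. Qed.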

Surjectivity a/k/a onto

A function is surjective (from Latin for "throwing on") when everything in the codomain is a possible output, i.e., there's no part of the codomain that the function can't 'hit'. Examples of surjective functions are predecessor (true or Coq's truncating), and list length. Some non-surjective functions are successor (0 isn't the successor of any natural) and absolute value on the integers, $\mathbb{Z}$, or reals, $\mathbb{R}$ (absolute value doesn't hit negative numbers).

Formally, a function $f : A \rightarrow B$ is surjective when:

\[ \forall b \in B, \exists a \in A, f(a) = b \]

Or, to put it another way, a function is onto when its codomain and its image coincide.

Note that we've seen a very similar but not identical property in totality. A relation is total when every input has an output; a function is surjective when every output has an input. In both cases, the existentials are of the "at least one" form.

People will say that a surjective function is onto, meaning that the domain is mapped 'onto' the codomain in its entirety.
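
A Coq sketch (the name surjective is ours), using Coq's truncating predecessor:

Definition surjective {A B : Type} (f : A -> B) : Prop :=
  forall b : B, exists a : A, f a = b.

(* pred (S b) = b, so every b is hit *)
Example pred_surjective : surjective pred.
Proof. intro b. exists (S b). reflexivity. Qed.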

Bijectivity

A function is bijective (from Latin for "throwing both"... but it's really just bi- "both" from the common -jective "throwing") when it's both injective and surjective.

Bijective functions are also called invertible functions, isomorphisms (from Greek isos "same, equal", morphos "shape, form"), or---and this is most confusing---a one-to-one correspondence, not to be confused with a function being "one to one". (Best to know about but not use this form.)

An example of a bijection on $\mathbb{N}$ is the following $\mathit{swap}$ function:

\[ \mathit{swap}(n) = \begin{cases} n+1 & n \text{ even} \\ n-1 & n \text{ odd} \\ \end{cases} \]

Question: How would you prove that $\mathit{swap}$ was bijective?

Function composition

We've already seen relation composition. If functions are relations, what does composition of functions mean? Let's consider two functions on the naturals, $f, g : \mathbb{N} \rightarrow \mathbb{N}$:

\[ \begin{array}{rcl} f(x) &=& x + 1 \\ g(y) &=& 2\cdot y \\ \end{array} \]

Using composition, we can build a new function, $h : \mathbb{N} \rightarrow \mathbb{N}$ where $h(z) = 2 \cdot z + 1$. We could simply say that:

\[ h = f \circ g = g ; f\]

(We use $\circ$ rather than $;$, as the former notation is slightly more common on functions.) How does that work? Recall the definition of composition:

\[ S \circ R = R;S = \{ (a,c) \mid (a,b) \in R, (b, c) \in S \} \]

Let's compute the definition of $h$:

\[ \begin{array}{rcl} h &=& \{ (a,c) \mid (a,b) \in g, (b, c) \in f \} \\ &=& \{ (a,c) \mid g(a) = b, f(b) = c \} \\ &=& \{ (a,c) \mid 2 \cdot a = b, b + 1 = c \} \\ &=& \{ (a,b + 1) \mid 2 \cdot a = b \} \\ &=& \{ (a,(2 \cdot a) + 1) \} \\ \end{array} \]

That is, $h(z) = 2 \cdot z + 1$!

We don't need a special definition for function composition---it's just a special case of relation composition---but it may be helpful to have it spelled out. Formally, if $f : B \rightarrow C$ and $g : A \rightarrow B$, then we define the composition of $f$ and $g$, written $f \circ g$ or $g;f$, as:

\[ (f \circ g)(x) = f(g(x)) \]
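
Here's a Coq sketch, with the $f$ and $g$ from above written out (the primed names are just to avoid clashes; the standard library has an equivalent compose in Coq.Program.Basics):

Definition compose {A B C : Type} (f : B -> C) (g : A -> B) : A -> C :=
  fun x => f (g x).

Definition f' (x : nat) : nat := x + 1.
Definition g' (y : nat) : nat := 2 * y.

(* h = f o g, i.e., h z = f (g z) = 2 * z + 1 *)
Example h_at_3 : compose f' g' 3 = 7.
Proof. reflexivity. Qed.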

Question: It's a theorem that $\texttt{map}(f, \texttt{map}(g, l)) = \texttt{map}(f \circ g, l)$. How might you prove it?

Function inverses

One final concept: function inverses. We've defined inverse relations as mere swapping: $R^{-1} = \{ (b,a) \mid (a,b) \in R \}$, so we might expect $f^{-1}(x) = y$ when $f(y) = x$.

The $\mathsf{RPS}$ beats relation, $B$, is a function and has an inverse: the loses-to function. What is $f^{-1}$, and when does it exist?

If $f : A \rightarrow B$, then $f$'s inverse, written $f^{-1} : B \rightarrow A$ is a function that does the "opposite" of $f$. In particular, $(f \circ f^{-1})(x) = x$, i.e., a function and its inverse 'cancel' each other and compose to form the identity.

Involutive functions

Recall the function $\mathit{swap}$ (from our discussion of bijectivity)---$\mathit{swap}$ is its own inverse. Similarly, negb is its own inverse, as are complement and reverse. We've stumbled on an interesting fact: involutive functions are exactly the functions that are their own inverses.

Oddly, the inverse operation is itself involutive: if $f^{-1} = g$, then $g^{-1} = f$. To put it another way, $(f^{-1})^{-1} = f$.

A non-example

Consider the constant-five function, $g : \mathbb{N} \rightarrow \mathbb{N}$, where $g(x) = 5$ for all $x$. Could $g$ have an inverse? It's impossible: $g(0) = 5$ and $g(1) = 5$... what could $g^{-1}(5)$ be? $g^{-1}$ couldn't even be deterministic! Worse still, what would $g^{-1}(3)$ be? Since there is no $x$ such that $g(x) = 3$, it can't be the case that $g^{-1}$ is total, either.

Question: Before you read the next two sections... when do you think a function has an inverse? If you have an idea, test it with a few examples.

Mirror images

To see which functions have inverses, we'll take a detour through interesting function properties. Let's recall the four we're interested in most for a given function $f : A \rightarrow B$, rewriting determinism and totality to use function notation:

\[ \begin{array}{ll} \text{deterministic:} & \forall a \in A, \forall b_1, b_2 \in B, f(a) = b_1 \wedge f(a) = b_2 \Rightarrow b_1 = b_2 \\ \text{total:} & \forall a \in A, \exists b \in B, f(a) = b \\ \text{injective:} & \forall a_1, a_2 \in A, f(a_1) = f(a_2) \Rightarrow a_1 = a_2 \\ \text{surjective:} & \forall b \in B, \exists a \in A, f(a) = b \\ \end{array} \]

We can pair them up: determinism and injectivity; totality and surjectivity. Where determinism uniquely determines the codomain, injectivity uniquely determines the domain. Where totality covers the domain, surjectivity covers the codomain. There's a mirror image quality here---a duality.

Theorem: If $f$ is injective, then $f^{-1}$ is a deterministic relation (and vice versa).

Theorem: If $f$ is surjective, then $f^{-1}$ is a total relation (and vice versa).

How might you prove these theorems? There are some interesting consequences: injections have inverses, but those inverses might be partial (because the original function might not hit everything in its codomain, and the codomain of $f$ is the domain of $f^{-1}$); surjections have inverses, but those inverses might not be functions (because they might not be deterministic).

Bijective functions have inverses

Suppose we have $f : A \rightarrow B$, bijective. That means that $f$ is deterministic and total (since it's a function!) as well as injective and surjective. What do we know about $f^{-1}$?

The two theorems above tell us quite a bit:

  1. since $f$ is injective, $f^{-1}$ is a deterministic relation;
  2. since $f$ is surjective, $f^{-1}$ is a total relation;
  3. since $f$ is deterministic, $f^{-1}$ is injective (apply the first theorem to $f^{-1}$, whose inverse is $f$); and
  4. since $f$ is total, $f^{-1}$ is surjective (apply the second theorem to $f^{-1}$).

That is, $f^{-1}$ is also deterministic, total, injective, and surjective---$f^{-1} : B \rightarrow A$ is also a bijection.

Bijections are called invertible functions because they necessarily have inverses which are themselves bijections.