$\newcommand{\C}[2]{\left( \begin{array}{@{}c@{}} {#1} \\ {#2} \\ \end{array} \right)}$

Pascal's Triangle

In the choice worksheet, we show Pascal's triangle; here are the first eight rows, numbered 0 through 7.

\[ \begin{array}{c} 1 \\ 1\quad 1 \\ 1\quad 2\quad 1 \\ 1\quad 3\quad 3\quad 1 \\ 1\quad 4\quad 6\quad 4\quad 1 \\ 1\quad 5\quad 10\quad 10\quad 5\quad 1 \\ 1\quad 6\quad 15\quad 20\quad 15\quad 6\quad 1 \\ 1\quad 7\quad 21\quad 35\quad 35\quad 21\quad 7\quad 1 \\ \end{array} \]

If you haven't taken a look at that worksheet yet, please read it and work through it as much as you can. There's no need to get the right answer the first time, but you'll get the most out of this chapter if you do the reflective exercises first.

Pascal's Identity

One thing to notice in Pascal's triangle is that each value is the sum of the two values diagonally above it. (Wikipedia has a nice animation demonstrating this.)

Another observation to note is that the $k$th entry in the $n$th row is $\C{n}{k}$, counting both rows and entries from zero. E.g., for the fourth row:

\[ \begin{array}{ccccc} \C{4}{0} & \C{4}{1} & \C{4}{2} & \C{4}{3} & \C{4}{4} \\ \parallel & \parallel & \parallel & \parallel & \parallel \\ 1 & 4 & 6 & 4 & 1 \\ \end{array} \]

These two facts combined suggest the following equation:

\[ \C{n}{k} = \C{n-1}{k-1} + \C{n-1}{k} \]

That equation holds for $0 < n$ and $0 \le k \le n$, if you say that $\C{n}{k} = 0$ when $n < k$ or when $n < 0$ or $k < 0$. How might you prove that Pascal's identity holds?
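Before trying a proof, it can help to sanity-check the identity by machine. Here's a small sketch in Python using the standard library's `math.comb`, with a tiny wrapper since `comb` rejects negative arguments rather than returning 0:

```python
from math import comb

def c(n, k):
    # math.comb raises on negative arguments, so extend it with
    # the convention that C(n, k) = 0 when n < 0 or k < 0
    if n < 0 or k < 0:
        return 0
    return comb(n, k)  # comb already returns 0 when k > n

# Check Pascal's identity on every entry of the first 20 rows
for n in range(1, 20):
    for k in range(n + 1):
        assert c(n, k) == c(n - 1, k - 1) + c(n - 1, k)
```

A passing check is evidence, not a proof — but it's a good way to catch a wrong conjecture early.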

Choice, recursively

Pascal's identity suggests an alternative definition for choice. We originally said that:

\[ \C{n}{k} = \frac{n!}{k! \cdot (n-k)!} \]

But we could instead define a recursive function $C : \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{N}$:

\[ C(n, k) = \begin{cases} 0 & n < k \\ 1 & k = 0 \\ C(n-1, k-1) + C(n-1, k) & \text{otherwise} \\ \end{cases} \]

How might you prove that $C(n, k) = \C{n}{k}$?
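As a sanity check (again, not a proof!), we can transcribe $C$ into Python and compare it against `math.comb` on a bunch of inputs. The base case here is $k = 0$, so the recursion always bottoms out within $\mathbb{N} \times \mathbb{N}$:

```python
from math import comb

def C(n, k):
    # The recursive definition; base case k = 0, so every
    # recursive call stays within the natural numbers
    if n < k:
        return 0
    if k == 0:
        return 1
    return C(n - 1, k - 1) + C(n - 1, k)

# Compare against the factorial-based definition (math.comb)
for n in range(10):
    for k in range(n + 1):
        assert C(n, k) == comb(n, k)
```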

Sums: a digression

Before getting to the heart of today's material, let's take a tiny diversion to talk about summations, or sums. If you've taken calculus, you might have already discussed sequences and series. In particular, one might define the (finite) sequence:

\[ 1 ~~~~ 4 ~~~~ 6 ~~~~ 4 ~~~~ 1 \]

And its summation:

\[ 1 + 4 + 6 + 4 + 1 = 16 \]

Mathematicians tend to use $\sum$ to denote sums---the capital Greek letter sigma, the Greek 'S' (for 'summation'!). So we might write the summation above as:

\[ \sum_{k=0}^4 \C{4}{k} = \C{4}{0} + \C{4}{1} + \C{4}{2} + \C{4}{3} + \C{4}{4} = 1 + 4 + 6 + 4 + 1 = 16 \]

Let's take a closer look at the formula $\sum_{k=0}^4 \C{4}{k}$. (Notice that $\LaTeX$ typesets sums slightly differently in display and inline math modes.) The $\sum$ itself just says "hey, we're doing a sum". The $k=0$ says that $k$ will be the index of our sum, and it will start at 0---our lower bound. The 4 on top says that we'll stop at 4---our upper bound. Note that both bounds of a sum are inclusive, while most programming languages treat the lower bound inclusively but the upper bound exclusively. The expression after the $\sum$---here, $\C{4}{k}$---is an arbitrary arithmetic expression in terms of $k$ that says what to do as we "iterate" the sum.

You can think of sums as being defined recursively as follows:

\[ \begin{array}{rclr} \displaystyle \sum_{i=l}^u f(i) &=& 0 & \text{when $l > u$} \\ \displaystyle \sum_{i=l}^u f(i) &=& \displaystyle f(l) + \sum_{i=l+1}^u f(i) & \text{otherwise} \\ \end{array} \]

Or, we can express it iteratively (but a bit less formally):

\[ \sum_{i=l}^u f(i) = f(l) + f(l+1) + \dots + f(u-1) + f(u) \]

Or, we can express it even more simply, as a loop (noting that Python's range function is a typical one, with inclusive lower bound and exclusive upper bound):

def summation(lower, upper, f):
    # Named `summation` to avoid shadowing Python's built-in `sum`
    total = 0
    for i in range(lower, upper + 1):  # upper bound made inclusive
        total += f(i)
    return total

To return to our example sum above, we've chosen $l=0$ and $u=4$, where $f(k) = \C{4}{k}$, so $\sum_{i=l}^u f(i) = f(0) + f(1) + f(2) + f(3) + f(4)$, which is equal to the first expansion of our sum above.
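Here's that computation end to end, with the loop defined as `summation` so this block is self-contained and doesn't shadow Python's built-in `sum`:

```python
from math import comb

def summation(lower, upper, f):
    # Inclusive on both bounds, matching the mathematical convention
    total = 0
    for i in range(lower, upper + 1):
        total += f(i)
    return total

# Our example: l = 0, u = 4, f(k) = C(4, k)
assert summation(0, 4, lambda k: comb(4, k)) == 1 + 4 + 6 + 4 + 1 == 16
```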

Identities

There are many ways to work with sums, and having some facility with them is important when you analyze the complexity of various algorithms. Here are some interesting identities, though there are many more on Wikipedia:

\[ \begin{array}{cccr} \displaystyle \sum_{i=l}^u f(i) = \sum_{i=l-1}^{u-1} f(i+1) & ~~~~~~~~ & \displaystyle \sum_{i=l}^u f(i) = \sum_{i=l+1}^{u+1} f(i-1) & \text{shifting}\\[2em] \displaystyle \sum_{i=l}^u f(i) = f(l) + \sum_{i=l+1}^u f(i) && \displaystyle \sum_{i=l}^u f(i) = f(u) + \sum_{i=l}^{u-1} f(i) & \text{extracting} \\[2em] \displaystyle \sum_{i=l}^u f(i) + \sum_{i=l}^u g(i) = \sum_{i=l}^u \left[ f(i) + g(i) \right] && \displaystyle \sum_{i=l}^u f(i) - \sum_{i=l}^u g(i) = \sum_{i=l}^u \left[ f(i) - g(i) \right] & \text{combining} \\[2em] \displaystyle \sum_{i=l}^u C \cdot f(i) = C \cdot \left[ \sum_{i=l}^u f(i) \right] && \displaystyle \sum_{i=l}^u \frac{f(i)}{C} = \frac{\sum_{i=l}^u f(i)}{C} & \text{distributing} \end{array} \]
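These identities are easy to spot-check numerically. Here's a sketch with arbitrary choices of $f$, $g$, bounds, and constant---any choices should do:

```python
from math import comb

# Arbitrary summands, bounds, and constant, just for testing
f = lambda i: comb(10, i)
g = lambda i: i * i
l, u, c = 2, 7, 3

def S(lo, hi, fn):
    # A finite sum with inclusive bounds; empty when lo > hi
    return sum(fn(i) for i in range(lo, hi + 1))

# shifting
assert S(l, u, f) == S(l - 1, u - 1, lambda i: f(i + 1))
assert S(l, u, f) == S(l + 1, u + 1, lambda i: f(i - 1))
# extracting
assert S(l, u, f) == f(l) + S(l + 1, u, f)
assert S(l, u, f) == f(u) + S(l, u - 1, f)
# combining
assert S(l, u, f) + S(l, u, g) == S(l, u, lambda i: f(i) + g(i))
assert S(l, u, f) - S(l, u, g) == S(l, u, lambda i: f(i) - g(i))
# distributing
assert S(l, u, lambda i: c * f(i)) == c * S(l, u, f)
```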

Notations

Finally, there are some important notations that you'll see in the future---we'll mention them here. None of the following information is critical for this course.

First, the default lower bound is zero. If you see $\displaystyle\sum_i^{10} \C{10}{i}$, then that's the same as $\displaystyle\sum_{i=0}^{10} \C{10}{i}$.

Next, multiple sums are often combined. If you see $\displaystyle\sum_{i,j}^n f(i,j)$, that means $\displaystyle\sum_{i,j=0}^n f(i,j)$ which means $\displaystyle\sum_{i=0}^n \sum_{j=0}^n f(i, j)$. Notice that this compressed "single-$\sum$" syntax only works when the bounds are the same---otherwise you have to write two distinct sums. These nested sums are like nested for loops. You can freely interchange nested sums so long as there's no dependency:

\[ \displaystyle \sum_{i=0}^n \sum_{j=0}^m f(i,j) = \sum_{j=0}^m \sum_{i=0}^n f(i,j) \]

But you have to be careful---there could be dependency, and then you can't swap them. The following isn't even coherent mathematics:

\[ \displaystyle \sum_{i=0}^n \sum_{j=0}^i f(i,j) \ne \sum_{j=0}^i \sum_{i=0}^n f(i,j) \]

In the sum on the right, what is $i$ in the outer sum?!
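Here's the nested-sum picture in Python: with independent bounds the two orders agree, while a dependent inner bound simply means the inner range varies with the outer index (and there's no well-formed "swapped" version):

```python
def f(i, j):
    return i * 10 + j  # any function of two indices will do

n, m = 4, 6

# Independent bounds: the two iteration orders give the same total
left  = sum(f(i, j) for i in range(n + 1) for j in range(m + 1))
right = sum(f(i, j) for j in range(m + 1) for i in range(n + 1))
assert left == right

# Dependent inner bound: the inner sum only runs up to the outer index
triangular = sum(f(i, j) for i in range(n + 1) for j in range(i + 1))
```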

The Binomial Theorem

Another thing to notice about Pascal's triangle is that the sum of the $n$th row is $2^n$. For example, we've already seen the fourth row:

\[ \sum_{k=0}^4 \C{4}{k} = \C{4}{0} + \C{4}{1} + \C{4}{2} + \C{4}{3} + \C{4}{4} = 1 + 4 + 6 + 4 + 1 = 16 \]

You can verify for yourself that other rows enjoy this property. So we can say:

\[ 2^n = \sum_{k=0}^n \C{n}{k} \]

How might you prove that equation is a theorem?
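One way to build confidence before proving it is to check a bunch of rows by machine. A quick sketch:

```python
from math import comb

# Row n of Pascal's triangle should sum to 2**n
for n in range(20):
    assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n
```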

Binomials

Binomial means "two named", from Latin: bi means two (e.g., biscuits are twice cooked) and nomen means name. A binomial is a polynomial with two variables, e.g., $x^2 + 2xy + y^2 = (x+y)^2$ is a binomial in the variables $x$ and $y$. It turns out that Pascal's triangle has a close relationship to binomials! To see why, let's write out $(x+y)^n$ for a variety of $n$.

\[ \begin{array}{rcc} (x+y)^0 &=& 1\\ (x+y)^1 &=& 1 \cdot x + 1 \cdot y \\ (x+y)^2 &=& 1 \cdot x^2 + 2 \cdot x y + 1 \cdot y^2 \\
(x+y)^3 &=& 1 \cdot x^3 + 3 \cdot x^2 y + 3 \cdot x y^2 + 1 \cdot y^3 \\ (x+y)^4 &=& 1 \cdot x^4 + 4 \cdot x^3 y + 6 \cdot x^2 y^2 + 4 \cdot xy^3 + 1 \cdot y^4 \\ \end{array} \]

Will you take a look at that! The coefficients of each term end up matching Pascal's triangle exactly. In fact, people call the choice operation $\C{n}{k}$ the "binomial coefficient", since $\C{n}{k}$ computes the $k$th coefficient of $(x+y)^n$.

Aside: choice is symmetric

Notice that it's the $k$th coefficient when you list the terms in an organized way like we did, with $x$'s exponent decreasing and $y$'s increasing. We could flip the order and get the same answer---because choice is symmetric:

\[ \C{n}{k} = \C{n}{n - k} \]
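This symmetry is another easy machine check:

```python
from math import comb

# Choosing k elements and choosing the n-k left behind are the same count
for n in range(15):
    for k in range(n + 1):
        assert comb(n, k) == comb(n, n - k)
```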

The Theorem

We can use this observation to come up with a theorem---which you'll prove in your homework.

\[ (x+y)^n = \sum_{k=0}^n \C{n}{k} x^{n-k} y^{k} \]

Test it out to confirm for yourself: the $n-k$ and $k$ business can be a bit confusing. Here's an example at $n=4$:

\[ \begin{array}{rl} & (x+y)^4 \\ = & \sum_{k=0}^4 \C{4}{k} x^{4-k} y^{k} \\ = & \C{4}{0} \cdot x^4 + \C{4}{1} \cdot x^3 y + \C{4}{2} \cdot x^2 y^2 + \C{4}{3} \cdot xy^3 + \C{4}{4} \cdot y^4 \\ = & 1 \cdot x^4 + 4 \cdot x^3 y + 6 \cdot x^2 y^2 + 4 \cdot xy^3 + 1 \cdot y^4 \\ \end{array} \]
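We can also spot-check the theorem numerically by comparing both sides at a bunch of integer points (a spot check, not a proof):

```python
from math import comb

def binomial_rhs(x, y, n):
    # The right-hand side of the Binomial Theorem
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))

# Compare against direct exponentiation of (x + y)
for x in range(-3, 4):
    for y in range(-3, 4):
        for n in range(6):
            assert (x + y) ** n == binomial_rhs(x, y, n)
```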

Using the Binomial Theorem

The Binomial Theorem is---if you ask me---interesting. But it's also useful, as it (a) can give you a quick solution to a tricky sum, and (b) can quickly tell you things about a binomial without having to multiply it out.

Here are a few examples.

Power of two

We can pick $x$ and $y$ to be particular values. Let's let $x = y = 1$. Then we have:

\[ 2^n = (1+1)^n = \sum_{k=0}^n \C{n}{k} 1^{n-k} 1^{k} = \sum_{k=0}^n \C{n}{k} \]

So our earlier observation that the sums of the rows of Pascal's triangle are powers of two is a corollary of the Binomial Theorem.

Symmetry

Let's let $x = 1$ and $y = -1$. Then we have:

\[ 0 = 0^n = (1 - 1)^n = (1 + (-1))^n = \sum_{k=0}^n \C{n}{k} 1^{n-k} (-1)^{k} = \sum_{k=0}^n \C{n}{k} (-1)^{k} \]

The term $(-1)^k$ will alternate between $1$ (on even exponents) and $-1$ (on odd exponents). Let's pick $n = 4$; then we have:

\[ \begin{array}{rl} \sum_{k=0}^4 \C{4}{k} (-1)^{k} = & \C{4}{0} \cdot (-1)^0 + \C{4}{1} \cdot (-1)^1 + \C{4}{2} \cdot (-1)^2 + \C{4}{3} \cdot (-1)^3 + \C{4}{4} \cdot (-1)^4 \\ = & \C{4}{0} \cdot 1 + \C{4}{1} \cdot -1 + \C{4}{2} \cdot 1 + \C{4}{3} \cdot -1 + \C{4}{4} \cdot 1 \\ = & \C{4}{0} + -\C{4}{1} + \C{4}{2} + -\C{4}{3} + \C{4}{4} \\ = & \C{4}{0} - \C{4}{1} + \C{4}{2} - \C{4}{3} + \C{4}{4} \\ \end{array} \]

That is, the Binomial Theorem lets us see some of the symmetries. It's starker in odd rows (which have an even number of entries), where one of $\C{n}{k}$ and $\C{n}{n-k}$ is negated and the other isn't---showing us that $\C{n}{k} = \C{n}{n-k}$ (at least for those rows). Try it yourself for $n=5$!
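Here's the $n = 5$ case by machine, showing the alternating terms and their cancellation:

```python
from math import comb

n = 5
# The signed terms C(5, k) * (-1)^k, for k = 0..5
terms = [comb(n, k) * (-1) ** k for k in range(n + 1)]
assert terms == [1, -5, 10, -10, 5, -1]  # each pair cancels its mirror
assert sum(terms) == 0
```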

Set-based interpretations of the Binomial Theorem and choice

There is one final, important interpretation of choice: a set-theoretic one. We've already talked about $\C{n}{k}$ as meaning "choose a set of $k$ elements from a set of $n$ elements", i.e., an unordered selection.

We'll be talking more about sets on day 24, but you can use the Binomial Theorem to count subsets: there are $2^n = \sum_{k=0}^n \C{n}{k}$ subsets of a set with $n$ elements. The two sides of this equation identify two different ways of counting.

Using the power of two: bitstrings and subsets

To see that there are $2^n$ subsets of a set $A$ with $n$ elements, think of a subset as a string of $n$ booleans or bits. If we have:

\[ A = \{ a_1, a_2, \dots, a_n \} \]

Then a subset $B \subseteq A$ can be described by the bitstring $b_1 b_2 \dots b_n$, where $b_i = \top$ iff $a_i \in B$. Concretely, suppose $A = \{ \textsf{rock}, \textsf{paper}, \textsf{scissors} \}$; we can fix that order to model a subset as $b_1 b_2 b_3$, where $b_1 = \top$ iff $\textsf{rock}$ is in the subset, $b_2 = \top$ iff $\textsf{paper}$ is in the subset, and $b_3 = \top$ iff $\textsf{scissors}$ is in the subset. So to model the subset $B = \{ \textsf{rock}, \textsf{scissors} \}$, we might write $\top\bot\top$, i.e., just $b_1$ and $b_3$ are set; the empty set would be $\bot\bot\bot$ and $A$ itself would be $\top\top\top$. Since there are $2^n$ possible bitstrings, and each bitstring corresponds to a subset of $A$, there must be $2^n$ subsets! (We'll develop this intuition when we talk about countability on day 26.)
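Here's the bitstring correspondence made concrete, using `itertools.product` to generate every bitstring (with Python booleans standing in for $\top$ and $\bot$):

```python
from itertools import product

A = ["rock", "paper", "scissors"]

# Each bitstring b1 b2 b3 (True for ⊤, False for ⊥) picks out one subset
subsets = [{a for a, bit in zip(A, bits) if bit}
           for bits in product([True, False], repeat=len(A))]

assert len(subsets) == 2 ** len(A) == 8
assert {"rock", "scissors"} in subsets   # the bitstring ⊤⊥⊤
assert set() in subsets                  # ⊥⊥⊥, the empty set
assert set(A) in subsets                 # ⊤⊤⊤, A itself
```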

Using the sum: subsets of each possible size

Why are there $\sum_{k=0}^n \C{n}{k}$ subsets of a set of size $n$? Well, $\C{n}{k}$ is the number of ways to choose a $k$-sized subset of $n$ things. For a set of size $n$, we have to consider subsets of size 0, size 1, size 2, and so on, up through size $n$. That is:

\[ \text{# of subsets of a set of size $n$} = \C{n}{0} + \C{n}{1} + \dots + \C{n}{n-1} + \C{n}{n} = \sum_{k=0}^n \C{n}{k} \]
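We can check this size-by-size count directly with `itertools.combinations`, which enumerates the $k$-sized subsets:

```python
from math import comb
from itertools import combinations

A = ["a", "b", "c", "d", "e"]
n = len(A)

# Count the k-sized subsets by enumeration, and compare with C(n, k)
for k in range(n + 1):
    assert len(list(combinations(A, k))) == comb(n, k)

# Totalling over every size recovers 2**n
assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n
```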

All roads lead to Rome

We've just counted something two different ways. If we did our work right, that means the two ways of counting ought to be equal. A proof of the Binomial Theorem justifies both of these methods of counting as correct.

Binomials

But wait, you might say... doesn't the Binomial Theorem say something about binomials in $x$ and $y$? The examples you gave set $x = y = 1$! You can relate the set-based intuition back to binomials by thinking of choice the following way: in $(x+y)^n$, the $k$th term, when we list by descending powers of $x$, will be $c_k \cdot x^{n-k} \cdot y^k$ for some constant coefficient $c_k$. Where does $c_k$ come from?

When you multiply out a binomial, some terms appear more than once. To 'normalize' the binomial, you group like terms. For example:

\[ \begin{array}{rl} (x+y)^2 =& (x+y) \cdot (x+y) \\ =& x \cdot (x+y) + y \cdot (x+y) \\ =& x^2 + x \cdot y + y \cdot x + y^2 \\ =& x^2 + x \cdot y + x \cdot y + y^2 \\ =& x^2 + 2 \cdot x \cdot y + y^2 \\ =& x^2 + 2 x y + y^2 \\ \end{array} \]

Notice how $x \cdot y$ and $y \cdot x$ are the same (because multiplication is commutative), and the term occurs twice---hence the coefficient 2.

We can state a general rule: the $k$th term's coefficient, $c_k$, will be the number of ways you can form $x^{n-k} \cdot y^k$. As you multiply out $(x+y)^n = (x+y) \cdot \dots \cdot (x+y)$, there are $\C{n}{n-k} = \C{n}{k}$ ways to choose the $x$ part $n-k$ times (or, symmetrically, the $y$ part $k$ times). So it must be that the coefficient $c_k = \C{n}{k}$: there are precisely that many ways to choose terms with the $k$th shape.
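This counting argument is easy to simulate: multiply out $(x+y)^n$ by literally choosing an $x$ or a $y$ from each factor, and tally how many of the resulting products have $k$ copies of $y$:

```python
from math import comb
from itertools import product
from collections import Counter

n = 4

# Multiply out (x+y)^n by choosing 'x' or 'y' from each of the n factors
counts = Counter()
for picks in product("xy", repeat=n):
    k = picks.count("y")      # this product is the term x^(n-k) * y^k
    counts[k] += 1

# The tally for each k is exactly the binomial coefficient C(n, k)
for k in range(n + 1):
    assert counts[k] == comb(n, k)
```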