Today we discussed closed sets and the Cantor set. Click here for more details.

Today we continued discussing point-set topology, in particular, the notion of limit point. Click here for more details.

Today we discussed point-set topology, in particular, the notion of open set. Click here for more details.

Today we considered more examples of metric spaces. Click here for more details.

Today we proved Holder's inequality and gave more examples of metric spaces. Click here for more details.

Today we gave preliminary examples of a metric space. Click here for more details.

Today we proved the Monotone Subsequence Theorem, as well as the alternating series test. Click here for more details.

On these days we proved theorems about absolute convergence, the Cauchy Criterion, and Bolzano-Weierstrass. Click here for more details.

Today we proved the Root and Ratio tests. Click here for more details.

Today we proved the Monotone Convergence Theorem and the Integral Test. Click here for more details.

Today we introduce sequences and series. Click here for more details.

Today we proved that any two infinite cardinals are comparable. Click here for more details.

I started by discussing some more of the history
of the Continuum Hypothesis. We then proved that
ℵ_{0} is the smallest infinite cardinal;
this is a fancy way of saying that any infinite set which
injects into ℕ is countable. This result is similar
to results we've proved before, and (following
suggestions of Dan) we were able to give the following
quick proof.
Suppose A is an infinite set which injects into ℕ;
in our terminology, A ≤ ℕ. On the other hand,
we know that A must contain a countable subset, whence
ℕ ≤ A. Cantor-Bernstein implies that A ~ ℕ,
or in words, A is countable.

This is an elegant proof, but is rather inexplicit. Is
it possible to write down the bijection between A and
ℕ? The first realization was that this is a bit
difficult from a purely notational standpoint, since A
is some arbitrary (infinite) set; we have no idea whether
its elements are numbers, or animals, or jalopies, or
whatever other crazy thing. However, we know that A injects
into ℕ; this means that *the image of A is
(set-theoretically) indistinguishable from A itself*.
This means that anything we want to prove about A, we
can prove about its image instead. The advantage is that
the image is some subset of ℕ.

To be more specific, suppose the function f maps A
injectively into ℕ. Let A'=f(A). Then A'⊆ℕ,
and A' ~ A. By Cantor-Bernstein, it suffices to find an
injection g from ℕ into A'. We can do this fairly
explicitly using the well-ordering principle. First, let
g(1) = least[A']. Next, we let g(2) be the second smallest
element of A'; formally, this can be written
g(2) = least[A'-{g(1)}]. (It's good to write things in terms
of the least element rather than in terms of the second-smallest
element, because this way we can apply the well-ordering
principle.) Generally, set
g(n) = least[A'-g(ℕ_{< n})].
Note that this is well-defined, since A'-g(ℕ_{< n})
is nonempty for all n. Finally, why is g an injection? Well,
suppose m and n are distinct natural numbers. WLOG, m < n, i.e.
m ∈ ℕ_{< n}. It follows that
g(m) ∉ A'-g(ℕ_{< n}). By contrast,
g(n) ∈ A'-g(ℕ_{< n}) by construction. Thus,
g(m) ≠ g(n). We conclude that g is injective, as claimed.

In the next lecture, we'll see that this proof generalizes very nicely.

Last time we found an injection from
{0,1}^{ℕ} into [0,1). (Recall
that the former is the set of functions
f: ℕ → {0,1}; the latter is the
half-open unit interval.) Now we wish to find
an injection in the other direction. In other words,
to each number x ∈ [0,1) we wish to associate some
function f_{x} ∈ {0,1}^{ℕ},
in such a way that distinct numbers always correspond
to distinct functions. Vaguely, the idea is simple:
given x, our function will encode `walking directions'
to that number. Here's a more precise description of
the idea.

Fix x ∈ [0,1). To define a function
f_{x} ∈ {0,1}^{ℕ}, I need to
tell you the values f_{x}(1), f_{x}(2),
etc. We will define the function recursively.
f_{x}(1) will indicate which half of the interval
x belongs to. If x ∈ [0,1/2), then set
f_{x}(1) = 0, and let a_{1} = 0 and
b_{1} = 1/2. Otherwise, if x ∈ [1/2,1), set
f_{x}(1) = 1, and let a_{1} = 1/2 and
b_{1} = 1. In either case, we have now specified
f_{x}(1), and have that
x ∈ [a_{1}, b_{1}), which is an interval
of length 1/2. We now proceed similarly, at each stage
chopping the interval in half and using f_{x} to
indicate which half of the interval x belongs to. More
precisely, at the n-th stage we are given that x belong to
an interval [a_{n-1}, b_{n-1}) of length
1/2^{n-1}. If x belongs to the left half of the
interval, set f_{x}(n) = 0; otherwise, set
f_{x}(n) = 1. Either way, we now know that x
lives in some interval [a_{n}, b_{n})
which is half the length of
[a_{n-1}, b_{n-1}). This allows the
recursion to continue.

Having defined a function f_{x} we must
now show that the map sending x to f_{x} is
injective. Suppose f_{x} = f_{y}.
This means that f_{x}(k) = f_{y}(k)
for every natural number k. This in turn implies
that for every natural number n, there exists an
interval [a_{n}, b_{n}) of length
1/2^{n} such that both x and y live in that
interval. (This step is not entirely obvious; make
sure you can justify it rigorously!) But this implies
that |x-y| ≤ 1/2^{n} for every n, which
shows that x = y.

We have thus proved that the map sending x to
f_{x} is an injection. We observed in class
that it is NOT a surjection. For example, the function
g defined g(1) = 0 and g(n) = 1 for all n≥2 will
never appear as an output of our map. (Why?)

Applying Cantor-Bernstein, we conclude that
{0,1}^{ℕ} ~ [0,1). Since
P(ℕ) ~ {0,1}^{ℕ}, we conclude that
P(ℕ) ~ [0,1). In particular, [0,1) must be
uncountable.

We concluded lecture with a discussion of infinite
cardinals. The cardinality of ℕ is called
ℵ_{0} (ℵ, read aleph, is the
first letter of the Hebrew alphabet). The cardinality
of ℝ is called c, the continuum. We've proved that
ℵ_{0} < c, a strict inequality. Is there
anything in between? The famous Continuum Hypothesis
asserts that the answer is no: there is no set whose
cardinality is strictly in between ℵ_{0}
and c. In 1940, Kurt Gödel proved that one cannot
disprove the Continuum Hypothesis by using the axioms of
set theory alone (called ZFC, which we will discuss next
week). Twenty years later, Paul Cohen proved that one
cannot prove the CH to be *true* using the axioms of set
theory, either. Together, these statements imply that the
ZFC axioms for set theory are not robust enough to say
anything about CH. The same is true about the Generalized
Continuum Hypothesis (GCH), which asserts that there
does not exist any cardinal strictly between that of A
and P(A), for any infinite set A.

Last time, we proved that A injects into B iff B surjects onto A. Using this, we can restate Cantor-Bernstein: if there exists an injection of A into B and a surjection of A onto B, then there is a bijection from A to B.

Next, we started proving that P(ℕ) ~ [0,1],
where P(A) denotes the power set of A. First, you
will prove in your assignment that
P(ℕ) ~ {0,1}^{ℕ}, where the latter
denotes the set of all function from ℕ to the
set {0,1} of two elements. Thus, it suffices to prove
that {0,1}^{ℕ} ~ [0,1]. We do this by
using Cantor-Bernstein; thus, we must find injections
in both directions. Inspired by an
idea of Kelvin, we mapped a function f to the number
∑ f(n) / 3^{n}, and proved that this
gives an injection. Next lecture we will discuss
an injection in the other direction (or, equivalently,
a surjection in the same direction).

We began by cleaning up the last step of the proof
of the Cantor-Bernstein theorem. We next illustrated the
utility of the theorem by giving a nicer proof that
ℚ is countable. To do this, we first proved that
ℚ_{>0} ~ ℕ. Since ℕ trivially
inject into ℚ_{>0}, it suffices (by Cantor-
Bernstein) to find an injection in the other direction.
There are many such; the one we used mapped the reduced
fraction a/b to 2^{a}3^{b}. Thus, we
deduce that ℚ_{>0} ~ ℕ, so there exists
a bijection f : ℚ_{>0} → ℕ. We
easily extended this to a bijection from ℚ to ℤ.
Finally, since ℤ ~ ℕ, we conclude that ℚ
must be countable.

Next, we turned to uncountable sets. Given a set A, how
does one make a bigger set? One way is to look at the
*power set* of A, denoted P(A); this is the set of all
subsets of A. We proved that any set A is strictly smaller
than its power set; in other words, we proved that there
exists an injection from A into P(A), but there does **not**
exist an injection from P(A) into A. There are a couple of nice
consequences of this theorem. First off, it gives a nicer proof
of the existence of uncountable sets: P(ℕ) must be
uncountable, since it is strictly larger than ℕ! (In fact,
we will prove that P(ℕ) ~ ℝ.) Moreover, it shows that
uncountable sets come in many different sizes: for, if A is any
uncountable set, then P(A) is a strictly larger one!

To prove the theorem, we first proved a lemma: A injects into B iff B surjects onto A. Thus, to prove the theorem, it suffices to prove that there does not exist a surjection from A onto P(A). (We also have to prove that A injects into P(A), but this is trivial.) Let g : A → P(A) be any map, and consider the set B = {a ∈ A : a ∉ g(a)}. Clearly, B ∈ P(A). If g were a surjection, there would have to exist an element b ∈ A such that g(b) = B. But this easily leads to a contradiction (does b ∈ B?). This proves that g cannot be a surjection. In other words, there exists no surjection from A onto P(A).

We concluded the proof of the Cantor-Bernstein theorem.

We proved most of the Cantor-Bernstein theorem (click the link for a summary of the proof).

After recalling the definitions, we reinterpreted the notion of countability: an infinite set is countable iff there exists an algorithm which enumerates the elements in such a way that any given element is enumerated by the algorithm after a finite number of steps. We then proved that the open interval (0,1) is uncountable. The idea was to show that, given any algorithm for enumerating the numbers in (0,1), we can construct some number which the algorithm will never reach in finite time. The argument we gave is sometimes called the `diagonal argument'; it is due to Cantor.

Last time, we proved that any infinite set contains a countable subset. We gave an explicit such subset for (0,1). This led to a discussion of comparing sizes of infinite sets. We already have a notion for this: if there is a bijection from A to B, we said that A and B have the same size (which we denoted A~B). But what about sets which don't have the same size, such as ℕ and (0,1)? Given two sets A and B, we will say A ≤ B iff there is an injection from A into B. There are three scenarios:

(1) A ≤ B and B ≤ A;

(2) A ≤ B but B is not ≤ A; and

(3) A is not ≤ B and B is not ≤ A.

On Thursday, we'll prove that if (1) holds, then A~B (this is the Cantor-Bernstein theorem). If (2) holds, we write A < B; we shall see that ℕ < (0,1). Finally, next week we'll prove that (3) never happens; this will be a consequence of the notorious well-ordering theorem.

We recalled the notions (and notations) of injections, surjections, and bijections. I pointed out a different way one could think about injections and surjections: f: A → B is an injection iff [a = b whenever f(a) = f(b)], and a surjection iff [for all b in B there exists a in A such that f(a) = b].

The notion of bijection gives a nice way to compare two infinite
sets: we'll say that sets A and B have the same size (denoted
A~B) iff there exists a bijection from A to B. Note that, even
if A~B, it's possible to construct maps from A to B which are
*not* bijections! We checked that ~ is an equivalence relation
(i.e. it's reflexive, symmetric, and transitive). We considered
various examples (ℕ~2ℕ, ℕ~ℤ and
ℕ~ℚ); we called a set *countable* iff it has the
same size as ℕ. (Note that any countable set must be
infinite.) We left as an exercise to find an explicit bijection
between ℚ and ℕ.

We concluded lecture by proving two quick theorems. The first was that every infinite set contains a countable subset. The second asserts that every infinite set has the same size as one of its proper subsets. We'll discuss these, and some further developments, next week.

We first reviewed the notion of equivalence relation
(and gave as an example the relation ~ on positive integers
defined by a~b iff the sum of the digits of a equals the
sum of the digits of b; e.g. 701~44). We next recalled the
definition of partition, and the fact that any partition on
a set induces an equivalence relation on that set. Next,
we proved the converse: an equivalence relation on a
set induces a partition of that set. More precisely, if
~ is an equivalence relation on S, then S can be partitioned
into sets S_{α} := {x in S : x~α}.

Next we returned to functions. Recall that if
f : A → B is a function, then the image of any element
of A consists of a single element of B (i.e. |f(a)| = 1 for
every a in A). If the preimage of any element of B also consists
of a single element of A (i.e. |f^{-1}(b)| = 1 for
every b in B), then f is called a *bijection*. In practice,
one often proves that f is a bijection in two stages:

(1) prove that |f^{-1}(b)| ≤ 1 for all b in B, and

(2) prove that |f^{-1}(b)| ≥ 1 for all b in B.

Any function satisfying condition (1) is called an *injection*;
any function satisfying condition (2) is called a *surjection*.
Thus a bijection is a function which is both injective and
surjective. I also introduced the fancy arrow notations for such
functions (curvy arrow for injections, double-headed arrow for
surjections).

We concluded lecture with an analogy for these concepts. Imagine
a theatre with a bunch of seats, and a bunch of people with tickets
to the theatre; the set of people is A, the set of seats is B, and
the function f: A → B plays the role of the ticket. (It's a
function means that each ticket assigns a *single seat*, i.e.
no ticket has two different seat numbers on it.) Then f is an
injection iff no two people are assigned the same seat, and f is
a surjection iff every seat in the house is filled (possibly with
more than one person!). Finally, f is a bijection iff each seat has
precisely one person sitting in it.

After clarifying the notation for unions of multiple
sets, we talked about functions. We talked about the image
and preimage of a function. Among other things, even if a
function f: A → B is not invertible, we denote the preimage
of B'⊆B under f by f^{-1}(B'). We proved several
propositions: the preimage of a union (or intersection) is the
union (or intersection) of the preimages, and the image of a
union is the union of the images. Perhaps surprisingly, the image
of an intersection is *not* necessarily the intersection of
the images; Kaidi came up with the example f(x) = x^{2} with
A=[-1,0] and B=[0,1] to demonstrate this.

Next, we discussed partitions and equivalence relations. A
partition is a way of breaking a set up into non-overlapping
(`pairwise disjoint') pieces; formally, a partition of S is a
way of writing S as the union of sets S_{α} where
any two sets S_{α} and S_{β} are either
secretly the same set, or are disjoint. This may seem abstract, but
we do this all the time in life (and in math). For example,
implicit in many jokes is a partition of the set of all people
by hair colour
(blond, brunette, redhead, etc.). Or we partition ℤ into
even and odd numbers. In these examples, we treat elements in each
piece of the partition as equivalent (even if they're not equal).
For example, the integers 2 and 4 are different, but they're both
even; so when we partition ℤ into evens and odds, we are
effectively overlooking the differences between 2 and 4 and focusing
our attention on their shared characteristic of evenness.
Thus, the set of even numbers is an `equivalence class',
and the set of odds is another.
Stereotypes operate on the same principle -- all members of a single
group (race, nationality, gender, religion, age, affluence, hair
colour, etc.) are considered equivalent, despite obvious variation between
the individuals in most respects.
Another example of this we discussed
was from Euclidean geometry. A big part of geometry is studying
conditions under which two triangles are congruent. Of course, just
because two triangle are congruent doesn't mean they're the
*same* -- if they're located in different places in the plane,
they're definitely not the same! --
but we overlook this and treat them as equivalent.

We tried to capture what these different equivalences have in common
by introducing the concept of an equivalence relation. We say that a binary
relation ~ on a set S is an *equivalence relation* if it satisfies
three properties: reflexivity, symmetry, and transitivity. We then
observed that given any set S, a partition on S induces a natural
equivalence relation on S (a~b iff a and b live in the same slice of
the partition). It turns out that the converse to this also
holds: any equivalence relation on a set S induces a natural partition
of S. We will prove this converse theorem in the next lecture.

In MATA31 and MATA37, you studied the structure of
the set ℝ and how functions act on elements of that
set. In Linear Algebra, you studied the structure of
ℝ^{n} and how matrices act on its elements.
There were a lot of similarities between the two,
and in this course we're going to take this a step
further and unify the theory by talking about more
general sets and the `operators' which act on their elements.

As an example, consider the set *S* of functions whose
domain and range are both ℝ. There is a natural way
of combining together elements of this set -- composition --
which satisfies most of the properties we know about
addition in ℝ (associativity, existence of identity,
existence of inverses). In other words, in this space,
functions play the same role which number do in ℝ.
What plays the role of functions, then? There were a couple
of ideas: the differential d/dx, and the integral. Certainly,
both of these operations can be used to transform one function
into another (although not all functions can be differentiated
or integrated). There are other important transforms, like the
Fourier transform, which you will learn about later. The point
is, *S* behaves in many ways like ℝ. This observation
immediately leads to questions about how deep the similarity goes.
For example, recall that the rationals are dense in the reals;
i.e. even though we*cannot write down* most real numbers
in any way, we can approximate any real number with arbitrary
precision by fractions, which *are* quite simple to write down.
Is there a similar result for *S*? In other words, is there
some family of particularly simple functions which are dense in
*S*? As soon as you pose the question, you're forced to
ask another: what does it mean for one function to be `close' to
another? How does one measure distances between functions? These
are the sorts of questions we will address in this course.
Among other things, we will develop a notion of how to measure
distances between two elements of rather general spaces --
`metric spaces' -- and develop a theory of them and the operators
which act upon their elements.

After this, we discussed naive set theory. (Naive, because we never define the terms `set' or `element'. As we discussed, this is actually a pretty serious issue, since it leads to paradoxes. For our purposes, we'll get by naively.) We recalled some basic notations, including unions and intersections of multiple sets. New material begins next lecture!