Additional comments related to material from the
class. If anyone wants to convert this to a blog, let me know. These additional
remarks are for your enjoyment, and will not be on homeworks or exams. These are
just meant to suggest additional topics worth considering, and I am happy to
discuss any of these further.
- Thursday, December 15. Last day! Today
was a chance to see things come full circle, specifically how so many of the
topics connect to each other. The motivation for a lot of this was starting
with the Euler
totient function (phi(n)), and exploring some of its properties.
-
We looked at data for phi(n) for n small and found lots of patterns, which
we then tried to prove. The key result is that phi is a
multiplicative function, though sadly not
completely multiplicative. We saw that if we can understand it for
primes and prime powers we can understand it everywhere. Proving these
facts is a nice challenge. It isn't too bad to do n = p, and then a little
thought gives n = p^k or n = pq (distinct primes) by counting (essentially
inclusion/exclusion). The interesting question is how to attack the general
case. We can use unique factorization to write n and m as products
of prime powers, and then study phi(nm), pulling factors out one at a
time. This indicates that perhaps induction might be useful, as pulling
out a factor leaves us with a smaller number, or perhaps inducting on the
number of prime factors.
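If you want to experiment with these patterns, here is a minimal Python sketch (my code, not from class) that computes phi via the product formula and spot-checks multiplicativity on coprime pairs:
```python
from math import gcd

def phi(n):
    """Euler's totient via the product formula phi(n) = n * prod_{p | n} (1 - 1/p)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # multiply result by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                          # a leftover prime factor
        result -= result // m
    return result

# Multiplicativity check: phi(nm) = phi(n) phi(m) whenever gcd(n, m) = 1.
for n in range(2, 50):
    for m in range(2, 50):
        if gcd(n, m) == 1:
            assert phi(n * m) == phi(n) * phi(m)
```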
-
We saw an interesting pattern: Sum_{d|n} phi(d) = n, where d ranges over
all divisors of n (including 1 and n). At first there was a suspicion that
this was due to our choosing special values of n; we looked at n=15 and
n=24, both one less than a perfect square. Next 27 was suggested, but
that's a perfect cube.
Are all numbers
interesting? We then tried 18, which also worked. How should we prove
this in general? We first did n = p then n = p^k, and saw it worked there.
Next we tried n = pq for distinct primes, and saw that everything canceled
and it worked. We then moved on to p^n q^m. Here we spent a lot of time on
the best way to write the factors. For numbers like 15, 18 or 24 we can
write the factors in increasing order, but for a general number like p^n
q^m, that's no longer possible. We saw two good approaches. The first was
to write the factors in rows; the first row was 1, p, p^2, ..., p^n; the
next row was q, q p, q p^2, ..., q p^n, and so on until we reach q^m, q^m
p, ..., q^m p^n. The first row (by an earlier result) sums to p^n; writing
q as q * 1 shows the second row sums to phi(q) p^n, and so on until the last,
which sums to phi(q^m) p^n. Adding these (and noting the first row is phi(1) * p^n)
gives (phi(1) + phi(q) + ... + phi(q^m)) * p^n, which by our earlier
result is q^m p^n. We could generalize this to a number involving three
distinct prime factors and think of some cube, but this is hard to
visualize (and then four factors...). A better approach is to note we can
write an arbitrary divisor as p^i q^j, and thus Sum_{d|p^nq^m} phi(d) =
Sum_{i=0 to n} Sum_{j=0 to m} phi(p^i q^j); the claim now follows from
using the multiplicativity of the phi function and our result for sums of
the divisors of the powers of a prime. Both of these facts
are needed here; we have many multiplicative functions, but we also need
the result about sums over the divisors of prime powers for our function.
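A quick brute-force check of the pattern Sum_{d|n} phi(d) = n (a Python sketch; the helper is mine):
```python
from math import gcd

def phi(n):
    """Brute-force totient: count 1 <= a <= n with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

# Verify Sum_{d|n} phi(d) = n for the first couple hundred n.
for n in range(1, 200):
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
```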
-
We talked about other number theoretic functions, such as the
Mobius function.
We saw how this function arises in 1/zeta(s), and saw yet again the
connections between the Riemann zeta function and the building block
functions of number theory.
-
A great conjecture about the Mobius function involves how large the sum of
mu(n) for n <= x can be; sadly this conjecture,
Mertens'
Conjecture, is false (though it may only fail by a few powers of log).
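For experimenting, here is a small sketch (my code, not from class) computing mu(n) and the partial sums M(x) = Sum_{n <= x} mu(n) that the conjecture is about:
```python
def mobius(n):
    """mu(n): 0 if n has a squared prime factor, else (-1)^(number of prime factors)."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:   # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                # a leftover prime factor
        result = -result
    return result

def mertens(x):
    """M(x) = sum of mu(n) for n <= x; the conjecture claimed |M(x)| < sqrt(x)."""
    return sum(mobius(n) for n in range(1, x + 1))

print([mertens(x) for x in (10, 100, 1000)])  # tiny compared to sqrt(x) in this range
```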
-
We ended with one last tour of the connections between different subjects,
and mentioned how Feynman reduced all of physics to U = 0 -- while this is
correct, it isn't tractable, and it is important to have something we can
extract information from. I strongly urge you to read Feynman's books.
They're entertaining and have some great stories.
- Tuesday, December 13. Lots of great
stuff today: error correcting codes, fixed point theorems, and square-free
numbers.
-
Error correcting / detecting codes,
Hamming codes,
UPC and
ISBN, and of course
Hamming himself.
Here's a bit on
inventories. Lasers
seem to date to the late 1950s. The Wikipedia entry on
barcode readers
doesn't seem to mention when they came to market.
-
Intermediate value theorem,
Sperner's lemma,
Brouwer
fixed point theorem,
prisoner's
dilemma, zero
sum games,
nash equilibrium, and of course
Nash.
Nash's thesis is available here.
-
We studied zeta(2),
using the
method of inclusion / exclusion (which was used by
Brun to show the
sum of the
reciprocals of the twin primes converges). It is sadly very common for
people to be careless in dropping the floor function in the proof -- this
is the basis of many false proofs of `true' mathematical results (such as
the twin
prime conjecture).
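One standard numerical illustration of this circle of ideas: inclusion/exclusion over primes suggests two random integers are coprime with probability prod_p (1 - 1/p^2) = 1/zeta(2) = 6/pi^2. A Python sketch (my code; here the floor functions are handled honestly because we actually count):
```python
from math import gcd, pi

N = 500
coprime = sum(1 for a in range(1, N + 1)
                for b in range(1, N + 1) if gcd(a, b) == 1)
print(coprime / N**2, 6 / pi**2)   # the two numbers should be close
```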
- Thursday, December 8. Today we
discussed sphere packing and the sketch of the proof of the Prime Number
Theorem, and how the zeros of zeta(s) enter.
-
Some links:
Kepler's conjecture,
kissing number,
sphere packing,
Leech lattice,
Thomas C.
Hales,
automated proof checking, Flyspeck project.
-
Here's a repeat from Thursday, December 1, on Riemann's paper and his
hypothesis.
Terrific
advice given to all young mathematicians (and this advice applies to many
fields) is to read the greats. In particular, you should read Riemann's
original paper.
In case your mathematical German is poor, you can click
here for the English translation of Riemann's paper.
The key passage is on page 4 of the paper:
- One now finds indeed approximately this number of real roots
within these limits, and it is very probable that all roots are real.
Certainly one would wish for a stricter proof here; I have meanwhile
temporarily put aside the search for this after some fleeting futile
attempts, as it appears unnecessary for the next objective of my
investigation.
- Riemann did have a formula for computing zeros, and in fact verified
the first six had real part equal to 1/2, but never mentioned this in
his paper! Siegel discovered Riemann's formula many years after
Riemann's death when going through Riemann's old papers. It's now known as
the
Riemann-Siegel formula, and I believe it was better than anything
developed between Riemann's paper and its rediscovery.
- We spent a lot of time today discussing how complex analysis leads to
a proof of the Prime Number Theorem (PNT). We did this for several
reasons. First, the PNT is a huge result in the subject, and it is good to
know what goes into the proof. Our tour showed us much of the complex
landscape. We got to see how the zeros of the zeta function enter the
picture, which wasn't clear initially. The integration we did showed why
we need to be able to extend the definition of zeta(s) beyond
Re(s) > 1. We saw the power of the Euler product (the product over
primes); we needed to take a logarithm to convert that to a sum, but the
logarithm sets us up beautifully for a logarithmic derivative, which is
easy to integrate. It's so nice when everything works out!
- The argument
principle shows that the
integral of the logarithmic derivative of a nice function along a closed
curve is equal to the number of zeros minus the number of poles of the
function in the interior. This is a key ingredient in the proof of three
big results in complex analysis (discussed below). Instead of integrating
f'(z)/f(z) against 1, we could integrate it against a test function g(z).
This leads to what is known as explicit
formulas; these weighted versions of the argument principle appear
throughout number theory (see
for instance the proof sketch on Wikipedia of the Prime Number Theorem).
We used g(z) = x^z/z; a big part of the subject is finding good test
functions.
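For the curious, here is a rough LaTeX statement of the explicit formula with this test function; the contour (c) is a vertical line Re(s) = c > 1, and convergence issues and hypotheses are suppressed (this is the standard formula, not something we derived in full in class):
```latex
% Weighted argument principle with g(s) = x^s / s applied to zeta; psi(x) is
% Chebyshev's function sum_{p^k <= x} log p, and rho runs over the
% non-trivial zeros of zeta (for x > 1 not a prime power):
\[
  \psi(x) = \frac{1}{2\pi i}\int_{(c)} \left(-\frac{\zeta'(s)}{\zeta(s)}\right)\frac{x^s}{s}\,ds
          = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)}
            - \frac{1}{2}\log\left(1 - x^{-2}\right).
\]
```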
-
Rouche's
theorem is the first of
many consequences of the argument principle; in fact, the remaining big
theorems are consequences of Rouche's theorem. Rouche's theorem can be
used to prove the Fundamental
Theorem of Algebra. From the Wikipedia article: One popular,
informal way to summarize this argument is as follows: If a person were
to walk a dog on a leash around and around a tree, and if the length of
the leash is less than the minimum radius of the walk, then the person
and the dog go around the tree an equal number of times.
- The Open
Mapping Theorem is the
next consequence. Again, the complex situation differs enormously from
the real case.
- The Maximum
Modulus Principle is the
biggest implication of the Open Mapping Theorem. This states that a
non-constant holomorphic function attains its maximum modulus on the boundary. Applications
of this include the Schwarz
lemma (which is a key
ingredient in proving the Riemann
Mapping Theorem, which allows us to reduce the study of simply
connected open sets not all of the complex plane to studying the unit
disk), and the Phragmen-Lindelof
Principle, which is very useful in bounding quantities in
number theory. One such application is in proving convexity bounds for
the Riemann zeta function (and more generally L-functions); see for
instance Heath-Brown's
note. It's a very good exercise to work through some similar
examples for real valued functions and see what goes wrong.
Specifically, look at f(x) = 1 - x^2 on (-1,1): f((-1,1)) = (0,1], and
the maximum is attained in the interior, not on the boundary.
-
The complex analytic proof of the Prime
Number Theorem uses
several key facts. We need the functional equation of the Riemann zeta
function (which follows from Poisson summation and properties of the Gamma
function), the Euler product (namely that zeta(s) is a product over
primes), and one important fact that no one questioned in class: what
if the Riemann zeta function had a zero on the line Re(s) = 1?
If this happened, then the main term of x from integrating zeta'(s)/zeta(s)
* x^s/s arising from the pole of zeta(s) at s=1 would be cancelled by the
contribution from this zero! Thus it is essential that there be no zero of
zeta(s) on Re(s) = 1. There are many proofs of this result. My
favorite proof is
based on a wonderful trig identity: 3 + 4 cos(x) + cos(2x) = 2 (1 + cos(x))^2
>= 0 (many people have said that w^2 >= 0 for real w is the most important
inequality in mathematics). If people are interested I'm happy to give
this proof in class next week (or see Exercise 3.2.19 in our textbook;
this would make a terrific aside if anyone is still looking for a
problem). There is an elementary proof of the prime number theorem (ie,
one without complex analysis). For those interested in history and some
controversy, see
this article by Goldfeld for a terrific analysis of the history of the
discovery of the elementary proof of the prime number theorem and the
priority dispute it created in the mathematics community.
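Since the sign in the identity is easy to garble, here is the short derivation in LaTeX, together with how it is deployed:
```latex
% Using cos(2x) = 2 cos^2(x) - 1:
\[
  3 + 4\cos x + \cos 2x = 2 + 4\cos x + 2\cos^2 x = 2\,(1 + \cos x)^2 \ge 0.
\]
% Feeding this into log zeta via the Euler product yields, for sigma > 1,
\[
  \zeta(\sigma)^3 \,\bigl|\zeta(\sigma + it)\bigr|^4 \,\bigl|\zeta(\sigma + 2it)\bigr| \ge 1,
\]
% and letting sigma -> 1+ shows zeta cannot vanish at 1 + it.
```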
- Tuesday, December 6. We heard two nice
talks today on generalized factorial functions and origami, and then heard a
bit more about complex analysis.
- Thursday, December 1. We heard two
nice talks today on protein folding and the Chinese Remainder Theorem in
cryptography, and then I discussed the beginnings of complex analysis, one of
the most powerful tools in number theory.
-
The complex analytic proof of the Prime
Number Theorem uses
several key facts. We need the functional equation of the Riemann zeta
function (which follows from Poisson summation and properties of the Gamma
function), the Euler product (namely that zeta(s) is a product over
primes), and
the fact that the Riemann zeta function has no zeros on the line Re(s) = 1!
There is an elementary proof of the prime number theorem (ie, one without
complex analysis). For those interested in history and some controversy, see
this article by Goldfeld for a terrific analysis of the history of the
discovery of the elementary proof of the prime number theorem and the
priority dispute it created in the mathematics community.
-
Terrific
advice given to all young mathematicians (and this advice applies to many
fields) is to read the greats. In particular, you should read Riemann's
original paper.
In case your mathematical German is poor, you can click
here for the English translation of Riemann's paper.
The key passage is on page 4 of the paper:
- One now finds indeed approximately this number of real roots
within these limits, and it is very probable that all roots are real.
Certainly one would wish for a stricter proof here; I have meanwhile
temporarily put aside the search for this after some fleeting futile
attempts, as it appears unnecessary for the next objective of my
investigation.
- To see the connection between zeros of the Riemann zeta function and
the distribution of primes requires some results from complex analysis.
The main input we will need is that the integral along a circle (or more
generally a nice curve) of the logarithmic derivative of a nice function is
just the order of the zero or pole at the center of the circle. In other
words, suppose we have an expansion f(z) = a_k z^k + ... (where a_k is the
first non-zero coefficient; if k > 0 we say the function
has a zero of order k at the origin, while if k < 0 we say the function
has a pole of order |k|). The Residue theorem (which we'll discuss as time
permits later) then gives: (1 / 2 pi i) Integral_{|z| = r} f'(z)/f(z) dz =
k. Note that if the function doesn't have a zero or pole at the origin
then this integral is zero (for r sufficiently small). More generally, if
g(z) is a nice function, (1 / 2 pi i) Integral_{|z| = r} g(z) f'(z)/f(z) dz
= k g(0). We will use a further generalization of this on Monday to relate
the zeros of the Riemann zeta function to counting the number of primes at
most x. For more details on the complex analysis we are using, see Cauchy-Riemann
equations, Cauchy-Goursat
Theorem, Residue
Theorem, Green's
Theorem. The key takeaways from today's class are: (1) we can convert
certain types of integrals to finding the a_{-1} coefficient in a Laurent
expansion (and this is good as algebra is easier than integration); (2)
integrating the logarithmic derivative is useful as the answer is related
to the zeros and poles of the function.
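The one-line reason the integral computes k, written out in LaTeX:
```latex
% If f(z) = a_k z^k (1 + c_1 z + ...) near the origin with a_k nonzero, then
\[
  \frac{f'(z)}{f(z)} = \frac{k}{z} + (\text{a function holomorphic near } 0),
\]
% so the residue of f'/f at 0 is k, and the Residue Theorem gives
% (1/(2 pi i)) Integral_{|z| = r} f'(z)/f(z) dz = k for all small r.
```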
- Some useful facts about the exponential function:
Recall the exponential
function exp is defined by
e^z = exp(z) = sum_{n = 0 to oo} z^n/n!. This series converges for all z.
The notation suggests that e^z e^w = e^(z+w); this is true, but it needs
to be proved. (What we have is an equality of three infinite sums; the
proof uses the binomial
theorem.) Using the Taylor series expansions for cosine
and sine, we find e^(iθ) = cos θ + i sin θ. From this we find |e^(iθ)|
= 1; in fact, we can use these ideas to prove all trigonometric
identities! For example:
- Inputs: e^(iθ) = cos θ + i sin θ and
e^(iθ) e^(iφ) = e^(i (θ+φ))
- Identity: from e^(iθ) e^(iφ) = e^(i (θ+φ))
we get, upon substituting in the first identity, that (cos θ + i sin θ)
(cos φ + i sin φ) = cos(θ+φ) + i sin(θ+φ). Expanding the left hand side
gives (cos θ cos φ - sin θ sin φ) + i (sin θ cos φ + cos θ sin φ) =
cos(θ+φ) + i sin(θ+φ). Equating the real parts and the imaginary parts
gives the addition formulas for cosine and sine.
- One can prove other identities along
these lines....
- Tuesday, November 29. While the
previous two lectures were designed to emphasize the need to ask the right
question, today's lecture serves a different purpose: technical mastery. It's
important to be able to do both. Today we saw some very powerful techniques in
number theory, and got a sense for the type of results possible.
- We began by studying the Riemann zeta function. We may write it as a
sum or as a product because of unique factorization. Sadly, truncating
either makes the other side complicated. This led to a change in
perspective -- rather than going for a strict equality, instead we tried
to rewrite our desired quantity in terms of upper and lower bounds of
simpler beasts. When doing the analysis, we often don't have to be too
careful on the error terms, as they're dwarfed by the main terms. We saw
this in showing that Sum 1/p diverges. We started with an inequality
(summing all integers up to y gave a contribution less than the product of
primes up to y). We then took logarithms as, well, we saw a product! We
Taylor expanded the logarithm, and noted that log(sum_{n<y} 1/n) is about
log log y, so errors that are bounded don't matter much. We were able to
get sum 1/p is at least log log y (up to a constant, which is dwarfed by
large y). With a bit of work, we could have obtained an upper bound as
well, and then had the true rate of growth. Using
partial summation,
can you now show that pi(x) is at least as large as sqrt(x)? Or some other
technique?
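In symbols, a LaTeX sketch of the class comparison (error terms hidden in the O(1)'s):
```latex
% Every integer n <= y factors entirely into primes p <= y, so expanding the
% geometric series in each Euler factor gives the comparison from class:
\[
  \sum_{n \le y} \frac{1}{n} \;\le\; \prod_{p \le y} \Bigl(1 - \frac{1}{p}\Bigr)^{-1}.
\]
% Taking logarithms and Taylor expanding -log(1 - 1/p) = 1/p + O(1/p^2):
\[
  \log\log y + O(1) \;\le\; \sum_{p \le y} \frac{1}{p} + O(1),
\]
% since sum_{n <= y} 1/n is about log y and the 1/p^2 tail is bounded.
```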
- We then turned to consequences of Chebyshev, and saw the power of
dyadic decompositions. You'll see this technique again and again in upper
level analysis courses. What's nice is we can bound our true value by best
and worst cases, and we see that there isn't 'much' difference between the
two. Again, it takes a while to truly appreciate this perspective, but it
is quite powerful. If we just care about the general behavior, we can
often get it without too much work (again, don't let the algebra
intimidate -- it beats integrating!).
- Finally, we turned to
analytic continuations of the Riemann zeta function. Notice they give
the series expansion we had, but with (-1)^{n+1}; it's always tricky as to
what is the 'right' notation / normalization. The idea of the proof here
is similar to that of the (boring) proof of the Geometric Series Formula,
and we thus see this is a technique and not just a one-time trick. See
this paper by
Sondow
for more on the alternating zeta function.
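In LaTeX, the manipulation relating the alternating series to zeta (the identity behind the continuation; pure algebra for Re(s) > 1, then extension to Re(s) > 0):
```latex
\[
  \eta(s) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^s}
          = \sum_{n=1}^{\infty} \frac{1}{n^s} - 2\sum_{m=1}^{\infty} \frac{1}{(2m)^s}
          = \left(1 - 2^{1-s}\right)\zeta(s),
\]
% so zeta(s) = eta(s)/(1 - 2^{1-s}) extends zeta to Re(s) > 0, away from the
% zeros of 1 - 2^{1-s}.
```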
- Speaking of alternating series, there's a lot of fun things that can
be said.
Check
out this fun paper.
- Tuesday, November 22. We gave the
standard and a `fun' proof of the geometric series formula, and talked about
the need to extend functions from their original domain to a more extensive
one. We then showed how to do this for the factorial function, and then
started discussions of the Riemann Zeta function.
- We discussed the geometric
series formula. The standard proof is nice; however, for our course
the `basketball' proof is very important, as it illustrates a key concept
in mathematics. Specifically, if we have a memoryless
game, then frequently after some number of moves it is as if the game
began again. This is how we were able to quickly calculate the probability
that the first shooter wins, as after both miss it is as if the game just
started.
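In symbols (with make-probabilities p and q for the two shooters, an assumption just for this sketch), the memoryless argument reads:
```latex
% After both players miss, the game restarts, so
\[
  P(\text{first shooter wins}) = p + (1-p)(1-q)\,P(\text{first shooter wins})
  \quad\Longrightarrow\quad
  P = \frac{p}{1 - (1-p)(1-q)}.
\]
% Unwinding the recursion instead gives the geometric series
% p \sum_{k \ge 0} [(1-p)(1-q)]^k, which is the same answer.
```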
- The geometric series formula only makes sense when |r| < 1, in which
case 1 + r + r^2 + ... = 1/(1-r); however, the right hand side makes sense
for all r other than 1. We say the function 1/(1-r) is a (meromorphic)
continuation of
1+r+r^2+.... This means that they are equal when both are defined;
however, 1/(1-r) makes sense for additional values of r. Interpreting
1+2+4+8+.... as -1 or 1+2+3+4+5+... as -1/12 actually DOES make sense, and
arises in modern physics and number theory (the latter is zeta(-1), where
zeta(s) is the Riemann
zeta function)!
- We next considered the Gamma
function, which generalizes the standard factorial function. We gave a
proof of its functional equation, Γ(s+1) = sΓ(s); this allows us to take
the Gamma function (initially defined only when the real part of s is
positive) and extend it to be well-defined for all s other than the
non-positive integers. For more on the Gamma function and another proof of
the value of Γ(1/2), see
my (sadly handwritten) lecture notes. This approach uses the Beta
distribution.
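The integration-by-parts derivation of the functional equation, in LaTeX (valid for Re(s) > 0):
```latex
\[
  \Gamma(s+1) = \int_0^{\infty} x^{s} e^{-x}\,dx
              = \Bigl[-x^{s} e^{-x}\Bigr]_0^{\infty} + s \int_0^{\infty} x^{s-1} e^{-x}\,dx
              = s\,\Gamma(s),
\]
% and rewriting this as Gamma(s) = Gamma(s+1)/s pushes the definition one
% strip to the left at a time, avoiding only the non-positive integers.
```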
- One nice application of the Gamma
function and normalization constants is a proof of Wallis'
formula, which says π/2 = (2·2 / 1·3) (4·4 / 3·5) (6·6 / 5·7) ···. I
have a proof which is mostly elementary (see
my article in the American Mathematical Monthly). Not surprisingly,
the proof uses one of my favorite techniques, the theory of normalization
constants (caveat: it does have one advanced ingredient from measure
theory, namely Lebesgue's
Dominated Convergence Theorem). Today we showed how to
get the normalization constant for the standard normal by noting it is
Gamma(1/2). In fact, one can determine all the moments of the standard
normal via the Gamma function, if one can find a way to evaluate
the Gamma function! Fortunately there are some beautiful
change of variable techniques to do so -- let me know if you're interested
and I'll pass along some notes (some might be in my scanned handwritten
notes mentioned above).
-
There are nice formulas
for the volume and surface area of
n-dimensional spheres. Interestingly, there are connections
between how many spheres can be packed into a given space and codes in
information theory!
The formulas depend crucially on the gamma function!
-
One last word on the Gamma function: it is useful to interpolate the
factorial function to be defined for all real numbers (indeed, complex numbers
too!) as we can then use powerful techniques to estimate its values, and
we do want to be able to estimate large values. A great
example of the need for this is in estimating binomial coefficients. Thus
we need a formula to approximate n!. This is done by
Stirling's
formula, which says n! is approximately (n/e)^n sqrt(2 pi n) (1 +
1/(12n) + ...).
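A quick numerical sanity check of the leading term (a Python sketch, not from class):
```python
from math import factorial, sqrt, pi, e

def stirling(n):
    """Leading term of Stirling's formula: (n/e)^n sqrt(2 pi n)."""
    return (n / e) ** n * sqrt(2 * pi * n)

for n in (5, 10, 20, 50):
    # The ratio tends to 1, roughly like 1 + 1/(12n).
    print(n, factorial(n) / stirling(n))
```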
-
Finally, we mentioned the Riemann zeta function briefly: ζ(s) = sum_{n = 1
to ∞} 1/n^s = prod_p (1 - 1/p^s)^(-1).
This is intimately tied to the distribution of the primes (which isn't
surprising, as it relates something we want to know about (the primes) to
something very well understood (the integers)).
- Thursday, November 17. After talking about Carmichael numbers and Korselt's criterion, we moved on to sums of divisors and perfect numbers.
- We proved that all Carmichael numbers are odd. The proof highlights a lot of good habits. We somehow have to work the definition of the object of interest into the proof. In this case, that meant working in the definition of Carmichael numbers, which states a^n = a mod n for all a. Then it becomes a question of how to use it. Do we take a independent of n, or dependent on n? We tried a=2 but that didn't seem fruitful; we didn't try 0, 1 or n as the statement is trivially true for them. We then tried a=n-1, and that was useful, and led to a proof (sketched below). Often in these universal statements the proof boils down to finding a good, clever choice.
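Here is the a = n-1 argument written out, in LaTeX:
```latex
% Since n - 1 \equiv -1 \pmod{n}, the Carmichael condition
% a^n \equiv a \pmod{n} with a = n - 1 reads
\[
  (-1)^n \equiv -1 \pmod{n}.
\]
% If n were even the left side would be 1, forcing n \mid 2, which no
% Carmichael number satisfies; hence every Carmichael number is odd.
```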
- We then turned to divisors and sums of divisors. One can look at either the sum of all divisors or the sum of all proper divisors (those less than n); as mentioned in class we have issues as to what would be the sum of the proper divisors of 1! We spent a lot of time trying to come up with questions and conjectures for this function. Letting sigma(n) be the sum of the proper divisors of n, we saw sigma^{-1}(1) is the set of primes, sigma^{-1}(2) is empty, and then the big proof was that sigma(n) --> oo as n --> oo through the composites. We tried at first to show sigma(n) >= n/2 for n composite, but that failed (took n = 3p). Then we tried sigma(n) >= g(n) for n composite for some function g(n). While one can take g(n) = log(n), it isn't too hard to show one may take g(n) = sqrt(n) (which means our sum function is quite bumpy!). This implied that sigma^{-1}(k) is bounded in size for any fixed k (there are at most k^2 elements). Thus if we want to find an interesting set other than the primes, the input must be allowed to vary. This led us to studying sigma(n) = n, or the perfect numbers.
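A small Python sketch of this discussion (my code; the helper name s is mine), checking the sqrt(n) lower bound for composites:
```python
from math import isqrt

def s(n):
    """Sum of the proper divisors of n (divisors strictly less than n)."""
    return sum(d for d in range(1, n) if n % d == 0)

# For composite n there is a divisor d with 1 < d <= sqrt(n), so n/d >= sqrt(n)
# is a proper divisor and s(n) >= sqrt(n); spot-check the bound:
for n in range(4, 2000):
    if any(n % d == 0 for d in range(2, isqrt(n) + 1)):   # n composite
        assert s(n) >= isqrt(n)
```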
- If there's interest, perhaps we'll show the average growth rate of the sum of the divisors function is logarithmic.
- The Mathematica program I wrote to find perfect numbers is available here. To be brutally honest, it sucks. It's very slow and inefficient as it doesn't use any knowledge about perfect numbers. The even numbers could be checked much faster using the Mersenne connection, and skipping the odd numbers alone would double the speed. It is illustrative to see how well one can do by brute force. It took about 53 minutes for the program to brute force check all numbers up to 100,000,000, and it did find the five perfect numbers in that range: 6, 28, 496, 8128, 33,550,336. According to Wikipedia, the earliest known reference to the fifth perfect number is 1456 (the first four go back to Euclid). A Python analogue is sketched below.
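A brute-force search in the same spirit (a sketch, still ignoring the Mersenne connection, though it at least pairs each divisor d with n/d to run in roughly sqrt(n) steps per n):
```python
from math import isqrt

def sigma_proper(n):
    """Sum of the proper divisors of n, pairing d with n // d up to sqrt(n)."""
    total = 1  # 1 divides every n > 1
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
    return total

print([n for n in range(2, 10**5) if sigma_proper(n) == n])  # [6, 28, 496, 8128]
```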
- If you are interested in helping to find Mersenne primes, join GIMPS.
- Tuesday, November 15. Today we discussed Carmichael numbers and primality testing. There have been huge advances in the past few years in these fields in particular, and efficient algorithms in general.
- Remarkably, we now have a fast, deterministic way of showing a number is prime, and it is an unconditional test (ie, we don't need to assume the Riemann Hypothesis). This is the famous Primes is in P paper (the in P is a reference to algorithmic complexity; the P vs NP problem is one of the Clay Millennium Problems). Click here for a nice summary of the hoopla Primes in P generated.
- Read the abstract for this paper and hopefully you'll smile!
- Here's another fun paper.
- Here is the classic paper in the subject: There are infinitely many Carmichael numbers, W. R. Alford, A. Granville and C. Pomerance, Ann. of Math. (2) 139 (1994), 703–722.
- We talked a lot about (Fermat) witnesses, which involve using Fermat's little Theorem to test for primality. See the Mathematica notebook for code and plots of the investigations. There are a lot of fascinating problems and conjectures one can generate about these witnesses. Is there a limiting distribution to the first witness? Can we easily show most composites have many witnesses? We'll talk a bit more about these questions on Thursday, but you should also be thinking of your own questions.
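A minimal Python sketch of Fermat witnesses (my code, not the class notebook); note how the Carmichael number 561 only has witnesses that share a factor with it:
```python
def is_fermat_witness(a, n):
    """a is a Fermat witness for the compositeness of n if a^(n-1) != 1 mod n."""
    return pow(a, n - 1, n) != 1

# 91 = 7 * 13 has plenty of witnesses among small a, so it is caught quickly.
print([a for a in range(2, 20) if is_fermat_witness(a, 91)])

# 561 = 3 * 11 * 17 is Carmichael: by Korselt's criterion every a coprime to
# 561 satisfies a^560 = 1 mod 561, so the only witnesses share a factor with it.
print([a for a in range(2, 20) if is_fermat_witness(a, 561)])  # [3, 6, 9, 11, 12, 15, 17, 18]
```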
- A nice feature about the prevalence of witnesses is that it is often fast to determine if a number is composite. Speed matters. There are lots of other efficiencies we can search for and use. The Strassen algorithm (see also the Mathworld entry here, which I think is a bit more readable) multiplies two NxN matrices A and B in about N^(log_2 7) multiplications; the reason for this savings is that one can multiply two 2x2 matrices with seven and not 8 multiplications (3 = log_2 8); the standard way we learn to multiply matrices takes N^3 operations! The moral? Efficiencies are everywhere. The best known algorithm is the Coppersmith-Winograd algorithm, which is of the order of N^2.376 multiplications. See also this paper for some comparison analysis, or email me if you want to see some of these papers. Some important facts. (1) The Strassen algorithm has some issues with numerical stability. (2) One can ask similar questions in one dimension, ie, how many bit operations it takes to multiply two N digit numbers. It can be done in less than N^2 bit operations (again, very surprising!). One way to do this is with the Karatsuba algorithm (see also the Mathworld entry for the Karatsuba algorithm), sketched below.
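A minimal Python sketch of Karatsuba's idea (my code; the point is that three recursive multiplications replace the four of the grade-school method, giving about N^(log_2 3) bit operations):
```python
def karatsuba(x, y):
    """Multiply non-negative integers with three recursive multiplications."""
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    B = 10 ** m
    x1, x0 = divmod(x, B)
    y1, y0 = divmod(y, B)
    z2 = karatsuba(x1, y1)
    z0 = karatsuba(x0, y0)
    z1 = karatsuba(x1 + x0, y1 + y0) - z2 - z0   # the saved multiplication
    return z2 * B * B + z1 * B + z0

assert karatsuba(123456789, 987654321) == 123456789 * 987654321
```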
- We talked about the output of primality tests; this can range from sometimes saying 'I don't know' to using phrases such as 'probably prime'. There is some similarity with the medical industry and tests there. A great example is false positives. See Bayes' Theorem, especially the bit at the end. Or, better yet, click here for the drug testing example. The results are shocking -- if you test positive on a test that seems to be 99% accurate you might only have a 33% chance of being ill!
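The computation behind the 33% figure, as a sketch (the 99% sensitivity and specificity and the 0.5% prevalence are the usual illustrative assumptions):
```python
# Bayes' Theorem for a '99% accurate' test applied to a rare condition.
prevalence, sensitivity, specificity = 0.005, 0.99, 0.99

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_ill_given_positive = sensitivity * prevalence / p_positive
print(p_ill_given_positive)   # about 0.33: a positive result means ill only 1/3 of the time
```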
- To look up papers / math, try MathSciNet, JStor, and the arxiv. (Of course, one can also google!)
- Tuesday, November 8. We talked about denseness, equidistribution, and applications to Monte Carlo Integration. There's a lot that we've done, but it's only the tip of the iceberg. What I like about this unit is that it shows the interconnectedness of mathematics.
- In class we talked about denseness of certain sequences. Other fun ones are sin(n) and cos(n) -- are these dense in the interval [-1, 1]? Equidistributed? What can you say about these? (I believe one is somewhat elementary, one is more advanced. Email me for a hint on what math results might be useful.)
- During the student presentations, we'll see an application of the equidistribution of n alpha mod 1 to Benford's law. For applications, it is often important to have a sense as to how rapidly one has convergence in equidistribution results. One of the common techniques involves using the Erdos-Turan theorem (the web resources aren't great; I have a copy of a good book that shows how the irrationality exponent is connected to quantifying the rate of convergence to equidistribution).
- For our probability detour, we need to know the following: means, variances; to get a sense of how well the numerical approximations are in the Monte-Carlo integration, we can use the Central Limit Theorem or Chebyshev's Theorem.
- We discussed Monte Carlo integration, which has been hailed by some as one of the (if not the) most influential papers in the 20th century. We only touched the briefest part of the theory here. It can be combined with the Central Limit Theorem or Chebyshev's Theorem to give really good results on numerically evaluating integrals. Specifically, if N is large and we choose N points uniformly, we can simultaneously assert that with extremely high probability (such as at least 1 - N^{-1/2}) the error is extremely small (at most N^{-1/4}). If you want to know more, please see me -- there are a variety of applications from statistics to mathematics to economics to .... Below are links to two papers on the subject to give you a little more info:
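A minimal Monte Carlo sketch in Python (my code; the standard-error estimate comes from the Central Limit Theorem heuristic):
```python
import random
from math import sqrt

def monte_carlo(f, N):
    """Estimate int_0^1 f(x) dx with N uniform samples; also return a
    standard-error estimate via the sample variance."""
    samples = [f(random.random()) for _ in range(N)]
    mean = sum(samples) / N
    var = sum((s - mean) ** 2 for s in samples) / (N - 1)
    return mean, sqrt(var / N)

est, err = monte_carlo(lambda x: x * x, 100000)
print(est, "+/-", err)   # true value is 1/3
```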
- We ended with the briefest of introductions to quasirandom processes. This is a very active area of research, with lots of great applications. James Propp has written some very nice papers and talks on the subject. You can click here for some papers and talks. A great talk of his is online: click here for the slides.
- Thursday, November 3. Below are some fun papers to look at, followed by some comments from what we did in class.
- Tuesday, October 25. We began our discussion of irrationality with the standard proof of the irrationality of sqrt(2), seeing that while it may depend on the Fundamental Theorem of Arithmetic, we can bypass that with a poor mathematician's version. It's worthwhile spending time seeing how `deep' a result is; in other words, what do you really need to prove something?
- Thursday, October 20. Good notation can really help shine a light on the key features of a problem. We discussed big-Oh notation (there's also little-oh notation). We then turned to the question of whether there are more numbers with exactly two prime factors than primes up to x, and talked about the proof of Chebyshev's theorem.
- Our proof in class is typical of many results in number theory -- be as crude as possible as long as possible in your estimations; if you don't get your result, then refine your estimates as needed. We did numerous 'worst case' approximations and still won. Now, if we asked a harder question (estimate the RATE of convergence, or give the OPTIMAL value of the constant), then we of course couldn't be so crude. Many of our arguments used dyadic interval decompositions and telescoping series.
- We needed to estimate the central binomial coefficient, (2n choose n). For the general problem of (m choose k) one uses Stirling's formula, which gives n! is about n^n e^{-n} sqrt(2 pi n) (1 + error of size 1/(12n) + ...); see the estimate below. We can get upper and lower bounds by using the comparison method in calculus (basically the integral test); we could get a better result by using a better summation formula, say Simpson's method or Euler-Maclaurin.
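Feeding Stirling's formula into the central binomial coefficient, in LaTeX:
```latex
\[
  \binom{2n}{n} = \frac{(2n)!}{(n!)^2}
  \;\approx\; \frac{(2n/e)^{2n}\sqrt{4\pi n}}{\left((n/e)^n \sqrt{2\pi n}\right)^2}
  \;=\; \frac{4^n}{\sqrt{\pi n}},
\]
% which is exactly the kind of estimate the Chebyshev argument needs.
```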
- Whenever we see a product we want to take a logarithm.
- Again, we saw the advantages of breaking difficult sums into cases. While there are more things to consider, we have more control over what is going on.
- One of my favorite theorems in number theory is the Erdos-Kac theorem, which basically says that for x large, the number of distinct prime factors of integers around x is approximately normally distributed with mean loglog(x) and variance loglog(x). This generalizes a result of Hardy and Ramanujan on the expected number of prime factors. For a little more, see some notes from my probability course at Williams a few years back.
- Tuesday, October 18. Today we talked a lot about methods of proofs, especially breaking things up into cases. The big theoretical result mentioned was Skewes's Theorem.
- While we didn't talk about it in class, prime races (see also here) are a lot of fun, and show how misleading the data can be. It was observed that there seem to be more often than not more primes up to x congruent to 3 mod 4 than 1 mod 4. This is known as Chebyshev's bias. We looked at the related quantity Li(x) - pi(x); this was also observed to be positive as far as people could see, but it turns out that they flip infinitely often as to which is larger. This was shown by Littlewood, but it was not known how far one must go to see pi(x) > Li(x). His student Skewes showed it suffices to go up to 10^(10^(10^34)) if the Riemann hypothesis is true, or 10^(10^(10^963)) otherwise (as large as these numbers are, they are dwarfed by Graham's number). We 'believe' it's around 10^316 where pi(x) beats Li(x) for the first time (note this is well beyond what we can investigate on the computer!). The proof involves the Grand Simplicity Hypothesis (that the imaginary parts of the non-trivial zeros of Dirichlet L-functions are linearly independent over the rationals); this is used to show that (n gamma_1, ..., n gamma_k) mod 1 is equidistributed in [0,1]^k where the gamma_j are the imaginary parts of these zeros. A quick numerical look at the mod 4 race is below.
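A Python sketch of the mod 4 race (my code; the cutoff 100000 is arbitrary):
```python
from math import isqrt

# Sieve of Eratosthenes, then count the two residue classes.
N = 100000
is_prime = [True] * (N + 1)
is_prime[0] = is_prime[1] = False
for p in range(2, isqrt(N) + 1):
    if is_prime[p]:
        for q in range(p * p, N + 1, p):
            is_prime[q] = False

team3 = sum(1 for p in range(3, N + 1) if is_prime[p] and p % 4 == 3)
team1 = sum(1 for p in range(3, N + 1) if is_prime[p] and p % 4 == 1)
print(team3, team1)   # 3 mod 4 is (slightly) ahead at most cutoffs one can check
```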
- We talked a bit about rearranging the terms of the harmonic series. This is known as the Riemann rearrangement theorem, and is worth reading about.
- One can ask about the sum of x_n/n, where x_n = 1 with probability 1/2 and -1 with probability 1/2. This turns out to be a fun problem! See here for more.
- We shouldn't forget about the regular harmonic series. A nice application is building overhangs with blocks. The harmonic series naturally arises here. See http://www.ken.duisenberg.com/potw/archive/arch03/030728sol.html as well as http://www.cs.cmu.edu/afs/cs/academic/class/16741-s07/www/projects06/chechetka_16-741_project_report.pdf, or the movie of the week: Stacking blocks. For recent results on what can be done if you allow non-simple patterns, see this paper.
- For another example of applications of harmonic numbers, see the coin collector problem (if you want more info on this problem, let me know -- I have lots of notes on it from teaching it in probability).
- Thursday, October 13. We proved the infinitude of primes many ways -- what could be better! The point is to see how different approaches can lead to different perspectives on the same problem. There's not one way to do or prove things, each has its advantages and disadvantages.
- Calculating Brun's constant (the sum of the reciprocals of twin primes) led Nicely to discover the Pentium bug; a nice description of the discovery of the bug is given at http://www.trnicely.net/pentbug/pentbug.html.
- We did Euclid again, so repasting the comment: In class we defined pi(x) to be the number of primes at most x. We discussed Euclid's argument which shows that pi(x) tends to infinity with x, and mentioned that with some work one can show Euclid's argument implies pi(x) >> log log x. As a nice exercise (for fun), prove this fact. This leads to an interesting sequence: 2, 3, 7, 43, 13, 53, 5, 6221671, 38709183810571, 139, 2801, 11, 17, 5471, 52662739, 23003, 30693651606209, 37, 1741, 1313797957, 887, 71, 7127, 109, 23, 97, 159227, 643679794963466223081509857, 103, 1079990819, 9539, 3143065813, 29, 3847, 89, 19, 577, 223, 139703, 457, 9649, 61, 4357.... This sequence is generated as follows. Let a_1 = 2, the first prime. We apply Euclid's argument and consider 2+1; this is the prime 3 so we set a_2 = 3. We apply Euclid's argument and now have 2*3+1 = 7, which is prime, and set a_3 = 7. We apply Euclid's argument again and have 2*3*7+1 = 43, which is prime and set a_4 = 43. Now things get interesting: we apply Euclid's argument and obtain 2*3*7*43 + 1 = 1807 = 13*139, and set a_5 = 13. Thus a_n is the smallest prime not on our list generated by Euclid's argument at the nth stage. There are a plethora of (I believe) unknown questions about this sequence, the biggest of course being whether or not it contains every prime. This is a great sequence to think about, but it is a computational nightmare to enumerate! I downloaded these terms from the Online Encyclopedia of Integer Sequences (homepage is http://www.research.att.com/~njas/sequences/ and the page for our sequence is http://www.research.att.com/~njas/sequences/A000945). You can enter the first few terms of an integer sequence, and it will list whatever sequences it knows that start this way, provide history, generating functions, connections to parts of mathematics, .... This is a GREAT website to know if you want to continue in mathematics. There have been several times I've computed the first few terms of a problem, looked up what the future terms could be (and thus had a formula to start the induction). One last comment: we also talked about the infinitude of primes from zeta(2) = pi^2/6. While at first this doesn't seem to say anything about how rapidly pi(x) grows, one can isolate a growth rate from knowing how well pi^2 can be approximated by rationals (see http://arxiv.org/PS_cache/arxiv/pdf/0709/0709.2184v3.pdf for details; unfortunately the growth rate is quite weak, and the only way I know to prove the needed results on how well pi^2 is approximable by rationals involves knowing the Prime Number Theorem!).
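A short Python sketch generating the first few terms of this sequence (known as the Euclid-Mullin sequence); my code uses trial-division factoring, so only a handful of terms are feasible:
```python
def smallest_prime_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

# a_1 = 2, and a_{n+1} is the smallest prime factor of (a_1 * ... * a_n) + 1.
terms, product = [2], 2
for _ in range(6):
    terms.append(smallest_prime_factor(product + 1))
    product *= terms[-1]
print(terms)   # [2, 3, 7, 43, 13, 53, 5]
```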
- It is believed that there are only finitely many Fermat primes. The Fermat numbers F_n = 2^(2^n) + 1 have many interesting properties. One is that no two Fermat numbers share a common factor, which as a nice exercise we saw gives another proof of the infinitude of primes (see the identity below)! Fermat primes also arise in determining which regular n-gons can be constructed with a straightedge and a compass. We used binary indicator random variables and linearity of expectation in modeling how often Fermat numbers are prime. One must be careful when using such models to predict properties of primes and integers, as these models miss arithmetic (for example, if we are too crude we'll predict there are infinitely many triples such that n, n+2 and n+4 are all prime, which is clearly absurd as at least one of these three must be divisible by 3). These models can be improved and some of the arithmetic can be incorporated -- if you want to know more, let me know.
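The identity behind the pairwise-coprimality exercise, in LaTeX:
```latex
% Provable by induction on n:
\[
  F_0 F_1 \cdots F_{n-1} \;=\; F_n - 2, \qquad F_k = 2^{2^k} + 1.
\]
% A common prime divisor of F_m and F_n (m < n) would divide
% F_n - (F_n - 2) = 2, impossible since every Fermat number is odd; so
% distinct Fermat numbers are coprime, and picking a prime factor of each
% gives infinitely many primes.
```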
- Mersenne numbers and Mersenne primes are wonderful objects to study, and we saw they led to another proof of the infinitude of primes. If you want, you too can join the Great Internet Mersenne Prime Search. This would be a great topic for someone to use as a project.
- Thursday, October 6, 2011. We discussed Lagrange's theorem, Fermat's little Theorem and applications to RSA.
- Tuesday, October 4, 2011. We saw a few gems of finite group theory, and a lot of patterns that hopefully make you want to learn more.
- Thursday, September 29, 2011. We described the mechanics of RSA, and saw some of the issues that could arise in implementing it. We saw how the Euclidean algorithm is useful, and began our investigations into group theory.
- RSA encryption is just one of many encryption schemes based on number theory. We'll meet some others later in the semester (elliptic curve cryptography is a very important modern example; another interesting approach is NTRU's lattice based scheme). What's nice about these methods is that one can describe the systems with undergraduate mathematics (though attacks on these systems use everything in our arsenal).
- Earlier in the semester we talked about adding points on an elliptic curve. It turns out that there is a group structure on elliptic curves, which replaces the group (Z/pqZ)* with a more complicated group and thus opens up another possibility for encryption. In both systems the difficulty in cracking the code comes from having to solve the discrete log problem.
- We're only doing the briefest introduction to groups. The Rubik's cube group is one of many fun groups. The idea is to break complicated groups into simpler ones (this is a general plan of attack in almost all mathematical endeavors). A big result in the field is the Feit-Thompson Theorem (Walter Feit was one of my professors at Yale; John Thompson won a Fields medal for his work on this problem).
- Finally, some humor (mathematical and otherwise):
- Tuesday, September 27, 2011. We continued our investigation of the Euclidean Algorithm, which will play a key role in proving certain sets are groups.
- Thursday, September 22, 2011. Elliptic curves are a wonderful generalization of Pythagorean triples, and have enormous applications in cryptography. It's a bit of a (mathematical) miracle that we can create a group law and add points with rational coordinates on the elliptic curve and get a new point with rational coordinates.
- A big part of number theory / modern mathematics is efficiency. Some great algorithms are Horner's algorithm, fast exponentiation and the Euclidean algorithm, all of which are described in detail in Chapter 1 of my book. Horner's algorithm, for example, is very useful in fractal geometry as there one needs to repeatedly evaluate polynomials.
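Minimal Python sketches of Horner's algorithm and fast exponentiation (my code, not the book's):
```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients [a_n, ..., a_1, a_0] in n
    multiplications, instead of recomputing each power of x."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def fast_pow(a, n, m):
    """Compute a^n mod m by repeated squaring: O(log n) multiplications."""
    result, a = 1, a % m
    while n:
        if n & 1:
            result = (result * a) % m
        a = (a * a) % m
        n >>= 1
    return result

assert horner([2, -3, 0, 5], 4) == 2 * 4**3 - 3 * 4**2 + 5
assert fast_pow(7, 560, 561) == pow(7, 560, 561)
```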
- We talked a lot about efficiency of computation. The Strassen algorithm (see also the Mathworld entry here, which I think is a bit more readable) multiplies two NxN matrices A and B in about N^(log_2 7) multiplications; the reason for this savings (from N^3 operations) is that one can multiply two 2x2 matrices with seven and not 8 multiplications (3 = log_2 8). The best known algorithm is the Coppersmith-Winograd algorithm, which is of the order of N^2.376 multiplications. See also this paper for some comparison analysis, or email me if you want to see some of these papers. Some important facts. (1) The Strassen algorithm has some issues with numerical stability. (2) One can ask similar questions in one dimension, ie, how many bit operations it takes to multiply two N digit numbers. It can be done in less than N^2 bit operations (again, very surprising!). One way to do this is with the Karatsuba algorithm (see also the Mathworld entry for the Karatsuba algorithm).
- Tuesday, September 20, 2011. We discussed a dimensional analysis proof of the Pythagorean theorem, and talked about Pythagorean triples. As there are trivially infinitely many if we just rescale all by the same amount, this led us to a quest for primitive triples, and then a quest for the right way to view what we did.
- We used dimensional analysis to prove the Pythagorean Theorem. There are many proofs; one particularly nice one is due to James Garfield, a Williams alum (and president of the US). Victor Hill, an emeritus professor of mathematics here, has a very enjoyable article on Garfield and his proof.
- We talked a bit about prep work for the GREs. Here's a nice way to rederive any needed trig identity quickly. Recall the exponential function exp is defined by e^z = exp(z) = sum_{n = 0 to oo} z^n/n!. This series converges for all z. The notation suggests that e^z e^w = e^(z+w); this is true, but it needs to be proved. (What we have is an equality of three infinite sums; the proof uses the binomial theorem.) Using the Taylor series expansions for cosine and sine, we find e^(iθ) = cos θ + i sin θ. From this we find |e^(iθ)| = 1; in fact, we can use these ideas to prove all trigonometric identities! For example:
- Inputs: e^(iθ) = cos θ + i sin θ and e^(iθ) e^(iφ) = e^(i (θ+φ))
- Identity: from e^(iθ) e^(iφ) = e^(i (θ+φ)) we get, upon substituting in the first identity, that (cos θ + i sin θ) (cos φ + i sin φ) = cos(θ+φ) + i sin(θ+φ). Expanding the left hand side gives (cos θ cos φ - sin θ sin φ) + i (sin θ cos φ + cos θ sin φ) = cos(θ+φ) + i sin(θ+φ). Equating the real parts and the imaginary parts gives the addition formulas for cosine and sine.
- One can prove other identities along these lines....
- Finally, a common theme in mathematics is the need to simplify tedious algebra. Frequently we have claims that can be proven by long and involved computations, but these often leave us without a real understanding of why the claim is true. If you want, let me know and I'll show you my 40-50 page proof of Morley's theorem; Conway has a beautiful proof which you can read here (it's after the irrationality of sqrt(2)). If you like non-standard proofs of the irrationality of sqrt(2), see the article I wrote with a SMALL student (to appear in Mathematics Magazine), which we'll probably discuss later in the course.
- Thursday, September 15, 2011. Today we reviewed modular arithmetic, proofs by induction, triangular numbers, and saw a proof of the Pythagorean theorem. We saw how powerful modular arithmetic can be -- it allowed us to `easily' prove certain families can have infinitely many primes, or that certain numbers cannot be a sum of two squares.
- There is at least one prime congruent to a mod b if a and b are relatively prime; the only way we can do this in general is to prove there are INFINITELY many such primes, and in fact that they occur in the right proportion; this is Dirichlet's Theorem on Primes in Arithmetic Progression, included in Chapter 3 of my book (follow the link for more information).
- One of the main results we proved today was the Pythagorean Theorem, which relates the length of the hypotenuse of a right triangle to the lengths of the sides (President Garfield is credited with a proof). For many classes, this result is important as it gives us a way to compute the length of vectors. While we only proved it in the special case of a vector with two components, the result holds in general. Specifically, if v = (v_1, ..., v_n) then ||v|| = sqrt(v_1^2 + ... + v_n^2). It is a nice exercise to prove this. One way is to use Mathematical Induction (one common image for induction is that of falling dominoes); see also the appendix from my probability book.
- We also discussed notation for the natural numbers, the integers, the rationals, the reals and the complex numbers. We will not do too much with the complex numbers in the course, but it is important to be aware of their existence. Generalizations of the complex numbers, the quaternions, played a key role in the development of mathematics, but have thankfully been replaced with vectors (online vector identities here). The quaternions themselves can be generalized a bit further to the octonions (there are also the sedenions, which I hadn't heard of until doing research for today's comments).
- A natural question to ask is, if all we care about are real numbers, then why study complex numbers? The reason is that certain operations are not closed under the reals. For example, consider quadratic polynomials f(x) = ax^2 + bx + c with a, b and c real numbers. Say we want to find the roots of f(x) = 0; unfortunately, not all polynomials with real coefficients have real roots, and thus finding the solutions may require us to leave the reals. Of course, you could say that if all you care about is real world problems, this won't matter as your solutions will be real. That said, it becomes very useful (algebraically) to allow imaginary numbers such as i = sqrt(-1). The reason is that it allows us a very clean way to manipulate many quantities. There is an explicit, closed form expression for the three roots of a cubic; while it may not be as simple as the quadratic formula, it does the job. Interestingly, if you look at x^3 - 15x - 4 = 0, the aforementioned method yields (2 + 11i)^(1/3) + (2 - 11i)^(1/3). It isn't at all obvious, but algebra will show that this does in fact equal 4 (see the check below)! As you continue further and further in mathematics, the complex numbers play a larger and larger role.
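The algebra, written out in LaTeX (the key observation is that 2 + i is a cube root of 2 + 11i):
```latex
% (2 + i)^3 = 8 + 12i + 6i^2 + i^3 = 2 + 11i, so one may take
\[
  (2 + 11i)^{1/3} + (2 - 11i)^{1/3} \;=\; (2 + i) + (2 - i) \;=\; 4,
\]
% and indeed x = 4 satisfies x^3 - 15x - 4 = 0.
```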
- Tuesday, September 13, 2011. Today we discussed generating functions.
- The current record for writing large odd numbers as the sum of three primes is that any odd number at least 10^1346 is the sum of three primes (this number is far beyond the range of anything we can investigate on the computer). The key ingredient in our investigations is to use generating functions; the difficulty is finding such functions whose coefficients encode the information we want while being tractable enough to work with. One enormous advantage of the modern formulation of the Circle Method over the original is that we just use finite series; this avoids many convergence issues and simplifies the analysis.
- Many functions in mathematical physics initially exist only for some values of the parameters but can be continued elsewhere; my favorite is the Riemann zeta function (and the extension uses the Gamma function). What is amazing (and not initially apparent) is that the following frequently occurs. We have some function and we only care about its values at the real numbers (or maybe even just the integers); nevertheless, it is often easier to study it as a function of a complex variable (z = x + iy), as then we have all the tools and techniques of complex analysis at our disposal. A terrific example is the Prime Number Theorem (which says that, to first order, the number of primes at most x is about x/log x). This is a statement about integers, yet the `easiest' and `best' proofs all use the Riemann zeta function at complex arguments (and, as you may reasonably ask, why should we need to use complex numbers to count integers!). What follows is an aside on an aside -- this is clearly not needed for the course!
- In class we defined pi(x) to be the number of primes at most x. We discussed Euclid's argument which shows that pi(x) tends to infinity with x, and mentioned that with some work one can show Euclid's argument implies pi(x) >> log log x. As a nice exercise (for fun), prove this fact. This leads to an interesting sequence: 2, 3, 7, 43, 13, 53, 5, 6221671, 38709183810571, 139, 2801, 11, 17, 5471, 52662739, 23003, 30693651606209, 37, 1741, 1313797957, 887, 71, 7127, 109, 23, 97, 159227, 643679794963466223081509857, 103, 1079990819, 9539, 3143065813, 29, 3847, 89, 19, 577, 223, 139703, 457, 9649, 61, 4357.... This sequence is generated as follows. Let a_1 = 2, the first prime. We apply Euclid's argument and consider 2+1; this is the prime 3 so we set a_2 = 3. We apply Euclid's argument and now have 2*3+1 = 7, which is prime, and set a_3 = 7. We apply Euclid's argument again and have 2*3*7+1 = 43, which is prime and set a_4 = 43. Now things get interesting: we apply Euclid's argument and obtain 2*3*7*43 + 1 = 1807 = 13*139, and set a_5 = 13. Thus a_n is the smallest prime not on our list generated by Euclid's argument at the nth stage. There are a plethora of (I believe) unknown questions about this sequence, the biggest of course being whether or not it contains every prime. This is a great sequence to think about, but it is a computational nightmare to enumerate! I downloaded these terms from the Online Encyclopedia of Integer Sequences (homepage is http://www.research.att.com/~njas/sequences/ and the page for our sequence is http://www.research.att.com/~njas/sequences/A000945). You can enter the first few terms of an integer sequence, and it will list whatever sequences it knows that start this way, provide history, generating functions, connections to parts of mathematics, .... This is a GREAT website to know if you want to continue in mathematics. There have been several times I've computed the first few terms of a problem, looked up what the future terms could be (and thus had a formula to start the induction).
- Thursday, September 8, 2011. Here are some additional links to topics discussed today.
- I strongly urge you to read the graduation speech here -- wonderful advice!
- The double-plus-one strategy is but one of many overlaps between probability and gambling. Other famous ones (recently) include card counting in blackjack. There are many references; see Thorp's original article as well as his book. Another fun read is Bringing Down The House.
- It is worth remarking that many of the identities in combinatorics are proved by showing that two different ways of counting the same thing are equivalent, and then if we evaluate one we get the other for free. We saw a few examples of this today with stories about the combinatorial summands. Our solution to the cookie problem is quite elegant, and in some respects reminiscent of geometry class (remember all those proofs where the teacher cleverly adds auxiliary lines; the difference here is we just add more cookies). While it is possible to solve many combinatorial problems by brute force in principle, in practice this is not a good way to go -- it is time consuming, and quite likely that one makes a mistake. Typically one finds a way to interpret a given quantity two ways; we can compute one of them and thus we obtain a formula for the other. For example, we showed the number of ways of dividing C cookies among P people is (C + P - 1 choose P-1); here all the identical cookies are divided. What if we don't assume all the cookies are divided -- what is the answer now? It is just Sum_{c = 0 to C} (c + P - 1 choose P - 1); this is because we are just going through all the cases (we give out no cookies, 1 cookie, ...). What does this sum equal? Imagine now we have another person, say the Cookie Monster (this is one of Cameron's favorite clips), who gets all the remaining cookies. Then dividing at most C cookies among P people is the same as dividing exactly C cookies among P+1 people, and hence our sum equals (C + P+1 - 1 choose P+1 - 1). We saw how easy it is to add lower bound constraints on the number of cookies people get; sadly upper bound ones are harder (but we can say a few things by appealing to central limit theorems from probability).
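A quick check of the Cookie Monster identity in Python (a sketch; the ranges are arbitrary):
```python
from math import comb

# Dividing at most C cookies among P people equals dividing exactly C cookies
# among P + 1 people (the extra person is the Cookie Monster).
for C in range(0, 30):
    for P in range(1, 10):
        assert sum(comb(c + P - 1, P - 1) for c in range(C + 1)) == comb(C + P, P)
```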
- We saw the method of Divine Inspiration led to a proof of the Fibonacci recurrence, but divine inspiration can be fickle. We'll see a better proof when we turn to generating functions.