Archive for April, 2009

A Riemann Sum

April 29, 2009

One of my office-mates asked me an intuiging question today, one which neither of us could quite resolve. So I thought I’d post it here, and see if anybody had any comments for us.

Consider the integral \int_1^{b+1}\frac{1}{x}\, dx, whose value we know is \ln(b+1) by the Fundamental Theorem of Calculus. The question is: how to show this with Riemann sums.

Let’s set up the notation. Let f(x)=\frac{1}{x}. Consider partitioning the interval [1,b+1] into n intervals of equal width. The width of each sub-interval will be \Delta x=\frac{b}{n}, and the endpoints of the sub-intervals will be the x_i=1+\frac{b}{n}\cdot i for i=0,\ldots,n. Let’s use right endpoints to set up the Riemann sum (I think using left-, or mid-, points won’t be any better, but I could be mistaken), obtaining

\displaystyle \int_1^{b+1}\frac{1}{x}\, dx=\lim_{n\to\infty}\sum_{i=1}^{n}f(x_i)\Delta x=\lim_{n\to\infty}\sum_{i=1}^{n}\frac{1}{1+\frac{b}{n}\cdot i}\cdot \frac{b}{n}.

The question is, how to show that this limit is equal to \ln(b+1)?

The obvious thing to do seems to me to be first to multiply the two fractions together, obtaining the expression \frac{b}{n+bi}, and then perhaps factor out a b, writing the expression as \frac{1}{\frac{n}{b}+1}. Making a substitution, m=\frac{n}{b}, our limit becomes

\displaystyle \lim_{m\to\infty}\sum_{i=1}^{mb}\frac{1}{m+i}.

My next thought was to perhaps group the terms in these series in batches of m, say as

\displaystyle \sum_{i=1}^{mb}\frac{1}{m+i}=\sum_{d=1}^{b}\sum_{i=1}^{m}\frac{1}{dm+i}.

However, it’s not apparent to me that this has accomplished anything.

The only other thought I have had so far is to rewrite the sum as

\displaystyle \sum_{i=1}^{mb}\frac{1}{m+i}=\sum_{k=1}^{mb}\frac{1}{k}-\sum_{k=1}^{m}\frac{1}{k}.

We have thus re-written our sum as the difference of two partial sums for the harmonic series. I’m not sure what, exactly, this has gotten us, in terms of answering this question. Let H_n=\sum_{i=1}^{n}\frac{1}{i} be the n-th partial sum of the harmonic series. We have thus written

\displaystyle \int_1^{b+1}\frac{1}{x}\, dx=\lim_{m\to\infty}H_{m(b+1)}-H_{m}.

Of course, as the harmonic series diverges, this is, on the face of it, not a pleasant limit to consider.

One way we might cheat is to use the approximation H_n\approx \ln(n)+\gamma, where \gamma is the Euler-Mascheroni constant. Let’s ignore the fact that anything you say about this constant is probably intimately related to the fact that \int_1^t\frac{1}{x}\, dx=\ln(t). If we just blindly use this approximation, we do get the right answer, as H_{m(b+1)}-H_{m}\approx \ln(m(b+1))+\gamma-\ln(m)-\gamma=\ln(b+1).

As I think about this sum, I find it pretty interesting. If you think about bigger and bigger partial sums for the harmonic series, and keep subtracting an initial segment whose length is proportional to the number of terms, then you get (in the limit), natural logs. That’s kinda fascinating.

But anyway, anybody have any suggestions for us? I haven’t tried it, but perhaps it’s easier to see the area using trapezoids (or something?), instead of rectangles? Or is there some clever geometry trick one can apply first (or second?)? Perhaps a more convenient partition of the interval [1,b+1] makes the algebra more apparent?


Hardy and Wright, Chapter 7

April 25, 2009

Another week, another chapter (last week’s chapter). This week was “General Properties of Congruences”, a reasonably fun chapter. On a side note, I’d like to say how pleased I am that our little group is still going. And note that having a scheduled blog post every week has been a fun habit. It also makes the tags for these post dominate my tag-cloud. I guess that means I should try to spend more time writing about other things as well.

Chris mentioned, before our meeting, how he thought some of the statements were imprecise. I disagreed, but we both thought that some of the wording could be improved. I think that’s to be expected from a book first written 70 years ago. Perhaps I should try to dig up some specific examples.

Chris and Eric and I all agreed that the beginning of the chapter, the first few sections on results about polynomials mod p, brought us back to our first year, graduate-level algebra class.

We discussed how the numbers A_l, defined to be “the sum of the products of l different members of the set 1,2,\ldots,p-1” was just the coefficient of x^l in the polynomial (x-1)(x-2)\cdots (x-(p-1)). Chris pointed out that really it was the absolute value of that coefficient. If I remember correctly, this was some of the wording Chris was unhappy with.

Most of the rest of our time was spend tracing through the discussion in the text from sections 7.7 and 7.8 on “The residue of \{\frac{1}{2}(p-1)\}!” and “A theorem of Wolstenholme”. We all seemed to think that these were fun sections.

As none of the three of us spend much time actually working with integers, we had a bit to discuss with the “associate” of a number mod p, or mod p^2. We all realized that the associate was just the multiplicative inverse, but had to stop for a second to think about the difference between the inverse mod p and mod p^2. We realized, before too long, that if the inverse of i, mod p, is n, then the inverse of i, mod p^2, has the form n+p\cdot k.

We thought about the relationship between integers mod p and mod p^2. Thinking mod p^2, we write down a p by p array, the first row being 0,1,2,\ldots,p-1, the next row being p,p+1,\ldots,2p-1, and on until the final row, p(p-1),\ldots p^2-1. Thinking about multiplicative inverses again, if the inverse of i is n mod p (as in the previous paragraph), then the inverse of i mod p^2 appears in the same column as n in this array we have constructed. We were trying to interpret Wolstenholme’s theorem (the sum 1+\frac{1}{2}+\cdots+\frac{1}{p-1} is equivalent to 0 mod p^2, where 1/i is the multiplicative inverse of i mod p^2) as summing across rows in this array. That’s not quite right, because the inverses don’t all lie in a row, they are scattered around. If I’m thinking about things correctly, though, these inverses, 1,\frac{1}{2},\ldots,\frac{1}{p-1} (mod p^2) should occur one in each column of the array we have made (I guess, besides the first column, which is the column of multiplies of p).

We didn’t make it to the theorem of von Staudt, concerning Bernoulli numbers taken mod 1. Perhaps we’ll return to it sometime later.

Hardy and Wright, Chapter 6

April 20, 2009

Last week’s exploration into “Congruences and Residues” primed us (no pun intended) for Chapter 6 of Hardy and Wright, “Fermat’s Theorem and its Consequences”. This was, for me, the most challenging chapter so far. Luckily, we had our largest attendance to date (four) to get through it.

Chris started out by telling us that he thinks about as many of these mod-p congruence results as possible as results about the cyclic group of units in the integers mod p. He’s an algebraist, so it’s reasonable.

Eric told us about a relation between some of these mod-p results concerning binomial coefficients and topology. He mentioned something about the sphere spectrum, it’s localization at p, and the Eilenberg-Maclane spectrum H\mathbb{Z}. Perhaps he’d help us out by providing those remarks in the comments, as I’m not particularly familiar with them.

I asked if anybody knew if there was some way I was allowed to think about a^{\frac{1}{2}(p-1)} (whose equivalence class mod p was a hot topic in this chapter) as (\sqrt{a})^{p-1}. When a is a quadratic residue, this just about makes sense (of course, it would have two roots, but both, when raised to the p-1 power, give 1 as a result). Alternatively, when a is not a quadratic residue, a^{\frac{1}{2}(p-1)}\equiv -1\pmod{p}. So when a is not a quadratic residue, I’m still tempted to think of \sqrt{a} in some relation to the imaginary number i. None of us in the meeting seemed to know if there was some reasonable connection here, so perhaps a reader can fill us in?

While I’m talking about quadratic residues, I should point out that the Mathworld page on quadratic residues has some fun pictures.

I then mentioned that I had found the Shanks-Tonelli algorithm for finding the “square root” of a quadratic residue (that is, given a quadratic residue a, find an x so that x^2\equiv a\pmod{p}). While I’m dropping names, I also noted that the result that the Fermat number F_k is prime iff F_k|(3^{\frac{1}{2}(F_k-1)}+1), at the end of section 6.14, sometimes goes by the name of Pepin’s Test.

Next, I mentioned that I had spent some time thinking about Theorems 86 and 87 for composite moduli. The question is: For which n is there an 0<m<n and 0<x,0\leq y so that 1+x^2+y^2=mn? It’s easy to show that this is never satisfied if n is 4, or a multiple thereof. Eric noted that this was easy to see by thinking about the squares mod 4 (only 0 and 1 are squares mod 4). I found this to be a fun little problem to use as inspiration for learning a little more python, and generated a big table of how to write multiples of n as one more than a square, or a sum of squares, for varying n. Turns out, there are many ways, in general, of doing so.

I asked if anybody had any ideas what it was about 1093 that made it one of the two known primes so that 2^{p-1}-1\equiv 0\pmod{p^2}, in relation to Theorem 91. That is, was there something I should have gleaned from the proof? Nobody was particularly sure, but Eric thought maybe there was some relation to regular primes (or irregular). This is a topic that’ll apparently come up later in the book, but Eric told us that it was related to some statements about Dedekind domains and the class group. When we looked up the regular and irregular primes on the Online Encyclopedia of Integer Sequences, though, Eric guessed that maybe it was something else, as there are quite a few of both type. For your convenience, the regular primes are sequence A007703, and the irregulars are A000928.

I guess this last question of mine wasn’t particularly bright. If more were known about why 1093 worked, I feel like either people would have found more, or there would be proofs that there weren’t others. Perhaps I could look into it some more. Either way, it was only the beginning of my poor questions. I also asked why it was we were studying quadratic residues, instead of, say, triadic (?) or higher. My guess, which Eric seconded, was that it was simply the next easiest thing to consider after linear residues, which are somewhat straightfoward. We both guess that things are harder to say about higher powers. I suddenly wonder about higher powers of 2 though… quartic, octic… I’m just making up words.

My remaining not particularly bright question was why there was a whole section about the “quadratic character of 2.” Why 2, in particular? We decided that it was just because it was next after 1, and the answer was relatively easy to obtain. Andy pointed out that, since the quadratic character of -3 was done in the book, as was -1, the quadratic character of 3 was known, using the result about products of quadratic residues and non-residues.

In the 6th edition of the book (the one I’m using, to which all references refer) there is a typo in theorem 97, which we verified because Andy has an older version of the text, without the typo (so… they re-typed the text? I almost think that’s a job I could handle happily. I’ve done it before [pdf]). In the theorem, they make a claim about when 7 is a quadratic residue, and then return to prove the theorem after Gauss’ law of reciprocity. However, when they return, they actually prove a theorem about 5, not 7. I tried my hand at 7, but it seemed hard to say anything about it (at least, consider primes of various classes mod 10).

Eric asked about the efficiency of the primality tests that occupy the second to last section of this chapter (Theorems 101 and 102). None of us seemed to know anything, but my guess was that finding the element h, that seems to help out both of these tests, was not easy. But I really have no idea.

Finally, we concluded with my account of Carmichael’s paper “Fermat Numbers” in the American Mathematical Monthly, V. 26 No. 4 from 1919, pp 137-146 (available (in a sense) at JSTOR here). At least, my account of the part about Euler/Lucas’ result on divisors of Fermat numbers, which I mentioned in our discussion of chapter 2. For completeness or so, allow me to present the proof here.

Theorem: If the prime p divides the Fermat number F_n=2^{2^n}+1, then there is a k so that p=k2^{n+2}+1.

Proof: Since 2^{2^n}\equiv -1\pmod{p} by assumption, we see easily that (2^{2^n})^2=2^{2^{n+1}}\equiv 1\pmod{p}, and so by Fermat’s theorem, 2^{n+1}|p-1, which is to say p=k'\cdot 2^{n+1}+1 (this is Euler’s result). It remains to be seen that k' is even. If n\geq 2 then p=k'\cdot 2^{n+1}+1\equiv 1\pmod{8}, meaning p is of the form 8t+1. Since we studied the quadratic character of 2, we know that 2 is a quadratic residue of a prime of this form, and so 2^{\frac{1}{2}(p-1)}\equiv 1\pmod{p}. Thus, \frac{1}{2}(p-1) is divisible by 2^{n+1}, and so p-1 is divisible by 2^{n+2}, which was our goal. Of course, one should check the claim for the numbers F_0 and F_1.

Since the product of two numbers of the form k2^{n+2}+1 has the same form, the theorem can actually be stated about any divisor of Fermat numbers, not just prime divisors.

Hardy and Wright, Chapter 5

April 11, 2009

As per schedule, we talked about chapter 5 from Hardy and Wright’s Number Theory book this week (chapter 4 last week). “We” was the smallest it’s been, and so “talked about” was the most brief of our talks to date (besides, perhaps, the organizational meeting).

Chapter 5, “Congruences and Residues”, was, in my mind, a somewhat odd chapter. The first handful of definitions and results, about congruence mod m, I feel like I’ve (mostly) seen several times, so I take it as pretty basic (which isn’t necessarily to say easy to read). I also like thinking about greatest common divisors, which were introduced in this chapter, in the context of category theory, but still am not sure where to go with it. I had hoped, fleetingly, that some lemma in this section might be stated in terms of some lemma about limits or so, but didn’t notice one. Anyway, after this introduction, some crazy sums are mentioned (Gauss’, Ramanujan’s, and Kloosterman’s), but I don’t see the motivation, and they aren’t used. Looking at the index, it looks like only Ramanujan’s will show up later in the text. After this section on sums, the chapter ends with a proof that the 17-gon is constructable. All in all, it seems like a hodge-podge chapter, both in topic and difficulty.

Chris and I agree that using \equiv for “is congruent to” and “is logically equivalent to” is pretty obnoxious. Especially when used all in the same line. For example:

t+ym'\equiv t+zm'\pmod{p}\equiv m|m'(y-z)\equiv d|(y-z)

Chris brought up the result that the congruence kx\equiv l\pmod{m} is soluble iff d=(k,m) divides l, in which case there are d solutions (this is Theorem 57 in the book). He brought it up because he wasn’t entirely aware of the generalization past d=1. I’m not sure I was either, but it is reasonable.

We did agree that using the notation \overline{x} for the solution x' to the equation xx'\equiv 1\pmod{m} (when it exists) was a nice choice of notation. Embedding the integers mod m in the unit circle in the complex plane via k\mapsto e^{2\pi ik/m}, the multiplicative inverse of x is the complex conjugate, so the notation lines up.

We finished by talking a little bit about geometric constructions. I liked the long argument about the 17-gon being constructable, by showing that the cosine of an interior angle was constructable, by showing that it was a solution to a system of quadratics (or so). At the beginning of this process, 3 is chosen as a primitive root of 17, and used to define a permutation of \{0,\ldots,15\} by m\mapsto 3^m\pmod{17}. I’m not exactly sure why this was done, or why 3 in particular, but it’s fun to trace through the argument from then on. Chris liked the geometric construction, and is tempted to try actually performing it. We reminded ourselves how to bisect angles and draw perpendiculars with straight-edge and compass.

Hardy and Wright, Chapter 4

April 3, 2009

Following last week’s discussion on chapter 3 of Hardy and Wright’s number theory book, we talked about chapter 4 (Irrational Numbers) this week.

Chris claimed (perhaps quoting something) that Pythagoras is like Shakespeare (or was it Homer?): things with his name as author may not have been authored by him. That’s probably not his exactly quote, but it’s the spirit of it.

We agreed that it was fun to see that roots of monic polynomials with integer coefficients are either integer or irrational. The historical notes about why square roots were only shown irrational (or integral) up to (possibly including) 17 were also interesting.

One of the lemmas in the book is that \log_{10}2 is irrational, and the book states that “\log_n m is irrational if n and m are integers, one of which has a prime factor which the other lacks.” When I came across this note, I wondered if there was some cute way to say this condition, that “one of which has a prime factor which the other lacks.” I thought maybe something with gcds. We talked about it a little, and didn’t come up with much. We discussed whether it meant prime power factor, or just prime factor. So, for instance, are n=2^2\cdot 3 and m=2\cdot 3 handled by the test? Andy pointed out that “having a prime power factor which the other lacks” is simply saying that the numbers aren’t equal, which clearly is too strong a statement here. However, a more general statement, allowing differing prime powers, seems hard (or at least harder) to state. So we left it.

The main thing I was curious about from this chapter was from the end of the chapter, section 4.7, “Some more irrational numbers”, where it is shown that e^{\gamma} is irrational for any rational \gamma and that \pi^2 is irrational. Both of these proofs rely on a function


whose appearance here I would really love to understand. Is it just something terribly clever somebody (I’m looking at you Euler, even if it doesn’t make sense historically (which I have no idea about)) wrote down? Or should I have expected to see it, or even come up with it myself?

In the proof that e^{\gamma} is irrational, they reduce to the case \gamma=h an integer, and then define


While sitting down to write this post, I noticed that this comes from trying to integrate h^{2n+1}e^{hx}f(x), so I guess that’s ok. But I still don’t see where the f(x) comes from. Is it just picked out because it has the property that f(1-x)=f(x), and is bounded above by 1/n!? Or perhaps it comes about because you can think of it as the n-th term in the Taylor series for e^{x(1-x)}, as Chris suggested? Why would I care about that Taylor series though? And what does the integral they compute, \int_0^1 h^{2n+1}e^{hx}f(x)\,dx have to do with any of this? Chris suggested perhaps some error term in a Taylor series? Perhaps the Taylor series for e^h (whose irrationality we are trying to show, after all)?

We had a couple of other thoughts (whose they were, I don’t recall). Was this integral some sort of useful convolution? Or perhaps related to the Gamma function? Or maybe the symmetry f(x)=f(1-x) reminds one of the Zeta function? My other comment was that \frac{d}{dx}e^{hx}F(x) made me think it looked like an integrating factor for some differential equation, but I think that was the most off-track of my comments. The key seems, to me, to be to determine where this integral, \int_0^1 h^{2n+1}e^{hx}f(x)\, dx comes from. Suggestions? I guess I should dig up the references…

Kan Extensions

April 1, 2009

Let’s set some notation up. Let \mathscr{A,B,C} be categories, and suppose that F:\mathscr{A}\to \mathscr{B} and G:\mathscr{A}\to \mathscr{C} are functors. Our goal is to define a functor H:\mathscr{B}\to\mathscr{C} that in some reasonable sense extends F (like, maybe we could make a commuting triangle?).

In particular, I’d like my H to also come with a natural transformation \eta:HF\to G (this is what justifies the term “extension”). And I’d like H to be the “best” such functor. That is, if K:\mathscr{B}\to\mathscr{C} comes with a natural transformation KF\to G, then I want to have a natural transformation K\to H (presumably making appropriate diagrams commute). If I can find an H with this property, I will call it the (right) Kan extension of G along F.

Calling this the right extension has always confused me. I always, in fact, had it backwards – I would have called this the left extension. Since HF\to G, I think of H as being on the left of G, so I thought it was the left extension. I guess instead I should be thinking about how it shows up on the right of any other extension (there’s a natural transformation to it).

Since I had my extension on the wrong side, I’ve always been confused by the pointwise construction of the extension. Let’s have H be the right extension, in the “correct” sense above, so that HF\to G. In several places (references at the bottom) I’ve seen that one can compute H(b) as a limit over an arrow (slice) category:

H(b)=\lim\limits_{\substack{Fa\to b\\ \in F\downarrow b}}G(a).

For convenience, let me suppose F:\mathscr{A}\to\mathscr{B} is the inclusion of a sub-category. Then I can suppress it from the notation, and the above becomes

H(b)=\lim\limits_{\substack{a\to b\\ \in \mathscr{A}\downarrow b}}G(a).

But here’s where I always got confused – I’m supposed to get a natural transformation H\to G. But this means I’m looking for a family of maps, each of which is a map out of a limit. I don’t like mapping out of limits. That’s not what they are for. Limits are for mapping to.

Today, I finally realized that you do, actually, map “out” of this limit, in a sense. Taking a\in \mathscr{A}, we’re supposed to get a map H(a)\to G(a) – that is, a map

\left(\lim\limits_{\substack{a'\to a\\ \in \mathscr{A}\to a}}G(a')\right)\to G(a).

The map we want actually comes from part of the definition of a limit. Since id:a\to a\in \mathscr{A}\to a, this arrow is an object that the limit is taken over. And the value of the functor whose limit we are taking, evaluated at this object, is just G(a). And so, by definition of limit, we always have a “projection” from the limit to the functor evaluated at any of the objects we are taking the limit over, and so we’ve got our map.

Of course, you’re supposed to check this is “works” – that’s its natural and universal and probably other things as well. But I’m happy now, because I finally have things on the standard side, and see where the maps come from.

While I’m on the subject, I should point out that if \mathscr{A} is a full subcategory of \mathscr{C}, then H(a)=G(a) for all a\in\mathscr{A}, because id:a\to a is an initial object in the arrow category you take the limit over. If the inclusion isn’t full, though, this need not happen.

As another (final) note, I should mention why I’m looking at Kan extensions. Possibly after changing limits to homotopy limits in the above, Kan extensions are useful because they preserve homotopy limits. With the notation above, \text{holim }H\simeq \text{holim }G. So if you’ve got a functor out of some big category, but show that it’s equivalent to the extension of a functor on a subcategory, you can work with the smaller category to think about homotopy limits.

References: When Wikipedia isn’t enough (Kan extension), I look at MacLane’s “Categories for the Working Mathematician”. I’m also a big fan of Borceux’s 3 volume “Handbook of Categorical Algebra”. Perhaps with my new understanding, I should go see what I can make of MacLane’s statement that “All Concepts Are Kan Extensions”…