Posts Tagged ‘farey’


November 27, 2009

I think I’m about ready to tie together Farey sequences and the Riemann zeta function (specifically, the Riemann hypothesis). I know there were lots of gaps along the way, and lots more to explore from here, so that all gets to be the dénouement, if/when I get to it.

Yesterday we ended with the identity

\displaystyle \frac{1}{\zeta(s)}=\sum_{n}\frac{\mu(n)}{n^s}.

Define M(x)=\sum_{n\leq x}\mu(n), which you might call the cumulative Möbius function. If you are comfortable with your measure theory, you can decide that there is a measure, which I’ll denote dM, such that \int_a^b dM=M(b)-M(a). And if you’re ok with that statement, you probably also don’t mind writing the identity for the reciprocal of zeta, above, as

\displaystyle \frac{1}{\zeta(s)}=\int_0^{\infty} x^{-s}\ dM.

If you then integrate by parts you find that

\displaystyle \frac{1}{\zeta(s)}=s\int_0^{\infty}x^{-s-1}M(x)\ dx.

Now if M(x)\leq x^a (we could say something more like M(x)=O(x^a)) for some positive real value a, then this most recent integral will converge for \text{Re }s>a. Which means that \zeta(s) would have no zeroes in the half-plane \text{Re }s>a. Since there are zeroes on the critical line, where \text{Re }s=1/2, we know that no value of a less than 1/2 will work. But if you can show that M(x)\leq x^{(1/2)+\epsilon} for every positive \epsilon, then you’ll have proven the Riemann hypothesis (and earned yourself a cool million).

Ok, so now we just have to relate the Farey sequences to this M(x) somehow.

It’s been a while, so perhaps we should recall some basics about the Farey sequences. The n-th Farey sequence, denoted F_n consists of the reduced fractions p/q in [0,1] with q\leq n, ordered as usual. For today, let’s actually remove 0 from the sequences. Thinking about how many terms are in F_n, we quickly determine that there are \sum_{i\leq n}\phi(i), where \phi(i) is the number of positive integers less than i that are relatively prime to i (with \phi(1)=1). Let me call this sum \Phi(n), so that F_n has \Phi(n) terms.

To keep my notation under control, fix an n. Then for 1\leq v\leq \Phi(n), let r_v be the v-th term in the Farey sequence F_n. The terms in F_n are not equally spaced in [0,1], so let us also consider the sequence of \Phi(n) equidistant points in [0,1]. These will be the values s_v=v/\Phi(n). We’re interested in how far the Farey sequence is from this equidistant sequence, so let’s set \delta_v=r_v-s_v.

If f:[0,1]\to \mathbb{R}, then you can show (essentially by Möbius inversion) that

\displaystyle \sum_{v=1}^{\Phi(n)}f(r_v)=\sum_{k=1}^{\infty}\sum_{j=1}^{k} f(j/k)M(n/k).

The idea is that the function D(m) that is 1 for m>1 and 0 for m<1 is also

D(m)=\sum_n M(m/n),

because you can write (fairly cleverly, I feel)

M(m)=\sum_n \mu(n)D(m/n).

By the way, I’m being a bit loose with my values of these stepwise things when they actually make their steps. Apparently the convention is to define the value at the change to be the midpoint of the values on either side. I feel like it’s a technicality I’m not hugely interested in right now.

Now we apply this identity to f(u)=e^{2\pi iu}, obtaining

\displaystyle \sum_v e^{2\pi ir_v}=\sum_k M(n/k)\sum_j f(j/k).

That inner sum on the right-hand side is the sum of k unit vectors equally distributed around the unit circle, and so is 0 except when k=1. So we obtain

M(n)=\sum_v e^{2\pi ir_v}.

Finally, replace r_v with s_v+\delta_v. After shuffling some symbols around, you can write

M(n)=\sum_v e^{2\pi s_v}(e^{2\pi i\delta_v}-1)+\sum_v e^{2\pi is_v}.

Taking absolute values on each side, using the triangle inequality, and the identity we already used about sums of equally spaced unit vectors, we can write

\begin{array}{l}|M(n)|\leq \displaystyle \sum_v \left|e^{2\pi i\delta_v}-1\right| =\sum_v \left| e^{\pi i\delta_v}-e^{-\pi i\delta_v}\right|=2\sum_v \left|\sin \pi\delta_v\right|\\ \displaystyle \qquad \leq 2\pi \sum_v \left|\delta_v\right|.\end{array}

Using our lemma from earlier, about how to obtain the Riemann hypothesis from M, we have, at last:

If \sum_v |\delta_v| is O(n^{(1/2)+\epsilon}) for every \epsilon>0, then the Riemann hypothesis is true.

Hurray! The result that encouraged me to wander down this path!

So, that was fun. It turns out that both of today’s results that imply the Riemann hypothesis are actually equivalent to the Riemann hypothesis, but perhaps that’s a topic for another day…

[Update 20091203: Noticed I had some absolute value bars in the wrong places in the line before the final theorem, and corrected them (hopefully)]



November 10, 2009

Sorry about the lack of a MaBloWriMo post yesterday. I got a little behind, left resources I wanted at the office, and did some other things.

If you remember, my original goal in this process was to relate the Farey sequences to the Riemann hypothesis. I have some vague understanding that the relationship is based on the distribution of the terms in the Farey sequence, as points in the unit interval. It’s getting on toward time to talk about the Riemann hypothesis side of this picture, so I can start relating the two. But before I do, there’s one more diversion (hopefully it isn’t too far off base) I’d like to make related to Farey sequences. I’ve still got some learning to do about it, but let’s see what I can get down today, laying some foundation for the next day or two, perhaps.

Recall that the way to construct one Farey sequence, F_{n+1}, from the previous one, F_n, is to throw in all of the mediants of successive terms whose denominators aren’t too big. What if we didn’t bother with that last part, and just left in all the mediants? All the numbers we’ll write down are still reduced, and we’ll obtain all of the fractions in [0,1] this way, because they all show up in some Farey sequence, and we’re getting all the terms in all the Farey sequences, just sometimes a little earlier than Farey would.

Just to be clear, our first few sequences now look like this:

0/1\quad 1/1

0/1\quad 1/2\quad 1/1

0/1\quad 1/3\quad 1/2\quad 2/3\quad 1/1

0/1\quad 1/4\quad 1/3\quad 2/5\quad 1/2\quad 3/5\quad 2/3\quad 3/4\quad 1/1

and so on. We see that the n-th sequence in this list contains all of F_n, as well as some other numbers with bigger denominators. A nice property of these sequences, which isn’t shared with the Farey sequences, is that every other term in the n-th sequence is a new term that wasn’t in the n-1-st sequence. So when I go to make the next sequence in this process, I’m always taking the mediant of a “new” number and an “old” number. Moreover, each “new” number will be involved in two mediants… one taken “to the left” (with the “old” number on the left, making a smaller value) and one taken “to the right”.

This language let’s me write down a binary tree (every node has two branches down) as follows: Begin with 1/2 as the root (at the top, I’m throwing out 0/1 and 1/1). This is a new fraction in the second sequence above, so to make the third sequence, I added in the “left mediant” (1/3) and “right mediant” (2/3) from 1/2. To make the sequence after that, 1/3 gives a “left mediant” (1/4) and a “right mediant” (2/5), and similarly 2/3 spawns 3/5 and 3/4. So I build up a tree that looks like this:

\begin{picture}(200,80) \put(85,65){1/2} \put(45,45){1/3}\put(125,45){2/3} \put(25,25){1/4} \put(65,25){2/5} \put(105,25){3/5} \put(145,25){3/4} \put(5,5){\vdots} \put(45,5){\vdots} \put(85,5){\vdots} \put(125,5){\vdots} \put(165,5){\vdots} \put(55,55){\line(2,1){20}} \put(35,35){\line(1,1){10}} \put(125,55){\line(-2,1){20}} \put(65,35){\line(-1,1){10}} \put(115,35){\line(1,1){10}} \put(145,35){\line(-1,1){10}} \end{picture}

I’ll call this the Stern-Brocot tree, though the actual definition my be slightly different.

Another way I could build up a sequence of sequences, and generate another binary tree, is to use midpoints, instead of mediants. If my first sequence is still 0/1 and 1/1, then taking midpoints will always give me denominators that are powers of 2. My tree will now look like:

\begin{picture}(200,80) \put(85,65){1/2} \put(45,45){1/4}\put(125,45){3/4} \put(25,25){1/8} \put(65,25){3/8} \put(105,25){5/8} \put(145,25){7/8} \put(5,5){\vdots} \put(45,5){\vdots} \put(85,5){\vdots} \put(125,5){\vdots} \put(165,5){\vdots} \put(55,55){\line(2,1){20}} \put(35,35){\line(1,1){10}} \put(125,55){\line(-2,1){20}} \put(65,35){\line(-1,1){10}} \put(115,35){\line(1,1){10}} \put(145,35){\line(-1,1){10}} \end{picture}

I could now line up these trees, and determine a function that takes one rational number, finds it’s position in the first tree, and spits out the value in the same position from the second tree. For example, using our pictures above, we see that this function will take 3/5 to 5/8. This function is known as Minkowski’s question mark function, ?(x), and Wikipedia has a graph, so you should click that last link. I’ve only defined it on rationals, but you wave the “continuity” wand to extend my definition to a function on [0,1].

So, to evaluate ?(x) for some rational, you have to find that rational in the Stern-Brocot tree. How do you do that? You know that you don’t have to go further down in the tree than your denominator, but that’s still a lot of fractions to look at, potentially. Perhaps there’s a better way?

I guess one way is to play the over-under game. Start at 1/2. If that’s your fraction, you are done. Otherwise, if your fraction is bigger, move right, and if your fraction is smaller, move left. Repeat until you get to your fraction.

But there’s more math we can say if we think about this process slightly differently.

The point of the tree being a tree is that there is only one way to get from the root, 1/2, to any fraction. Working your way down, you make a sequence of moves, either left (L) or right (R). So you generate some string – LLRLRRRL, for example. Of course, that notation gets out of hand quickly, so we might also use exponents to write this example as L^2RLR^3L. To such a string we can associate a sequence of fractions by writing down whatever fraction we are at when we change direction. Let us write this sequence as p'_1/q'_1,p'_2/q'_2,\ldots, and call these “turning points”. I’ve used primes here, because the numbers I’m actually interested in are one before I make a turn. Let me call that sequence p_1/q_1,p_2/q_2,\ldots, calling these values parents (so p_k/q_k has a line directly down to p'_k/q'_k in the Stern-Brocot tree, connecting the two). So, for example, the sequence L^3R^2L^2 has turning points p'_1/q'_1=1/5, p'_2/q'_2=3/13, and finally p'_3/q'_3=7/31 and associated parents p_1/q_1=1/4, p_2/q_2=2/9, and 5/22.

What is the relationship among successive turning points and parents? Suppose you are at the turning point p'_k/q'_k, with parent p_k/q_k, looking to get to the next parent or turning point, and are going to do this with a sequence of a moves (left or right, it doesn’t actually matter). The next parent will be the (1,a-1) weighted mediant of the turning point p'_k/q'_k with its parent p_k/q_k. That is, p_{k+1}=p'_k+(a-1)p_k, and q_{k+1}=q'_k+(a-1)q_k. Similarly, the next turning point will be the (1,a) weighted mediant of p'_k/q'_k with p_k/q_k. This also means that it is the (1,1) weighted mediant of p_k/q_k with p_{k+1}/q_{k+1}.

Combining all of these relationships, messing with the algebra just slightly, you can determine that p_{k+1}=ap_k+p_{k-1} and q_{k+1}=aq_k+q_{k-1}. You may recall that these are exactly the relations between the convergents of a continued fraction (if you start putting subscripts on your a, writing it as a_{k+1}).

So, given a continued fraction [0;a_1,\ldots,a_n] for a rational x in (0,1), begin at 1/1 (which isn’t in the tree, I know) and move left a_1 times (the first move getting you to 1/2), then move a_2 right, and so on. In your final step, you only move a_n-1 (if you wonder about the “-1”, you might think that the n-th convergent is the fraction you are after, and that is the n-th parent, not the n-th turning point), and you’ll arrive at x. On your way, all of the fractions that you hit, one before a turning point, will be the convergents to x.

That only gets us half-way into evaluating ?(x), for this rational x you’ve picked. You first find its continued fraction and then, by the process above, you’ll see how to get down to x from the top of your tree. That means ?(x) will be the value in our second tree (taking averages, the tree with powers of two in the denominator) that follows the same path. The initial a_1 lefts, from 1/1, will get us to a value of 1/2^{a_1}. Then we are supposed to take a_2 rights. Taking 1 right will get us to 1/2^{a_1}+1/2^{a_1+1}. Another right will then be 1/2^{a_1}+1/2^{a_1+1}+1/2^{a_1+2}, which I’ll write 1/2^{a_1}+3/2^{a_1+2}. Yet another right gets us to 1/2^{a_1}+7/(2^{a_1+3}), and so on, until 1/2^{a_1}+(2^{a_2}-1)/2^{a_1+a_2}. After that, we move left a_3 times. The analysis of where we end up is basically the same, you just have to remember that you are subtracting 1/2^{a_1+a_2+i}s, instead of adding them.

Putting this all together, we obtain (remembering that in the last stage we only went a_n-1 moves.

?([0;a_1,\ldots,a_n])=\dfrac{1}{2^{a_1}}+\dfrac{2^{a_2}-1}{2^{a_1+a_2}}-\dfrac{2^{a_3}-1}{2^{a_1+a_2+a_3}}+\cdots\pm \dfrac{2^{a_n-1}-1}{2^{a_1+\cdots+a_n-1}}

Just as a warning, I may be off slightly in some of my formulas. I haven’t written down anything knowing that it was false, but I’m not convinced I obtain the correct formula at the end of the day (it looks slightly different from the formula at Wikipedia, for instance). If anybody sees an error, please let me know.

That’s probably plenty for today. Up next (I think), I want to take the derivative (even though I can’t) of ?(x), because Wikipedia thinks I’ll get something related to the density of the Farey sequences.

By the way, a paper I was looking at to figure out about these things was “The Minkowski Question Mark, GL(2,\mathbb{Z}) and the Modular Group (Expository)” by Linas Vepstas. The printed copy I have, from some time summer 2009, is incomplete, but it looks to be available online.

Ford Circles

November 5, 2009

Now that we’ve got some foundations laid for Farey sequences and rational approximations, let’s put pictures to some of those claims, via Ford circles.

For every rational h/k, in reduced terms, draw a circle in the plane with center (h/k,1/(2k^2)) and radius 1/(2k^2). I’m not going to try to come up with a better picture for you than the one used on the Wikipedia page for Ford circles. (I tried embedding the picture below, but that didn’t work. So I’ll leave it to you to click the link.)

What’s so great about this construction? Well, you may recognize those radii as the error term in our discussion of rational approximations. The statement that for every irrational \omega there are infinitely many reduced fractions h/k such that |\omega-h/k|<1/(2k^2) is the same as saying that the vertical line x=\omega intersects infinitely many of the Ford circles. The improved bound 1/(\sqrt{5}k^2) can also be found by (very carefully) analyzing vertical lines passing through Ford circles (more specifically, the gaps between them), but I think for now I’m going to let that statement be for now. If you want, the proof via Ford circles is in [Rademacher], and a proof via continued fractions is in [Hardy and Wright]. The result is known as Hurwitz’s theorem.

What else can be said about Ford circles? Well, yesterday I was talking about neighbors in Farey sequences. It turns out that h/k and H/K are neighbors in a Farey sequence if and only if the Ford circles at h/k and at H/K are tangent to each other.

To see why this is true, consider the circles at h/k and H/K, with h/k<H/K so Hk-hK>0). Since we know the location of their centers, we can find an expression for the distance between those centers. The two circles will intersect iff the distance between the centers is no bigger than the sum of the radii of the two circles. Re-arranging terms (and working with squares to avoid square roots), you might look at d^2-(r_k+r_K)^2, where d is the distance between the centers, and r_k and r_K denote the (hopefully obvious) radii. The two circles will be tangent if this is 0, and won’t intersect if the difference is positive. More care needs to be used when talking about what happens if the difference is negative, but we’ll see momentarily that this doesn’t happen, so let’s forget it. So, it turns out that d^2-(r_k+r_K)^2 is ((Hk-hK)^2-1)/(k^2K^2). Since all the letters there are positive integers, we see that d^2-(r_k+r_K)^2\geq 0, with equality iff Hk-hK=1. This says, based on our work yesterday, precisely that two Ford circles intersect iff the fractions they correspond to are neighbors in a Farey sequence, and when this happens, they intersect at a single point (they are tangent to eachother).

This justifies the picture from above, then. The circles never overlap, and some of them are tangent.

It might be worth it to go back and justify why any vertical line at an irrational value slices through infinitely many of the Ford circles. To that end, let \omega\in (0,1) be irrational. The two circles at 0/1 and 1/1 both have radius 1/2, and so certainly x=\omega hits at least one Ford circle, kicking off an induction argument.

Suppose that x=\omega hits the Ford circle h/k. For the sake of argument, let us suppose that \omega>h/k, so the vertical line lies to the right of the center of the circle. Not much changes if the opposite inequality is true, so let’s just stick with this one. I aim to show that x=\omega hits a circle with a smaller radius, i.e. goes through a circle corresponding to a fraction with a bigger denominator than k. I claim that it hits one of the circles tangent to the circle at h/k. By the above, we know that these are the circles corresponding to neighbors of h/k in some Farey sequence. Since I’m thinking about circles to the right of h/k, these will be successors in the Farey sequence. Yesterday we said that all of these successors are (m,1)-weighted mediants of h/k with H/K where H/K is the successor of h/k in F_k (the first Farey sequence that h/k shows up in).

Consider, then, the circles corresponding to the (1,1), (2,1), (3,1),… weighted mediants of h/k with H/K. Let me call the circles C_1,C_2,\ldots, using C for the circle at h/k. It isn’t too much work to check the necessary inequalities to show that the center of C_1 is further right than the right-most point on C. I’ve already said, even if not directly, that for each n, C_n is tangent to C_{n+1}. So, if you want, you may start at the top of C_1, and move left along the tops of all the circles in turn, jumping from one to the next at points of tangency. How far can you go doing this? Well, if you take the limit as m\to\infty of the (m,1)-weighted mediant of h/k with H/K, you get h/k (l’Hospital!). So, the tops of the circles form an unbroken path from the top of C_1 to as close to h/k, on the right, as we care to get. The vertical line x=\omega, which passes through the right side of C, must surely pass through this path, and therefore at least one of the C_n.

That’s probably just about enough with Ford circles for now. I’m not sure what I’ll be writing about tomorrow. Hopefully something else fun. I would like to note that, as I was writing this post, I was excited to find a connection to continued fractions. I knew a connection should be there (and I’ll probably find plenty more later), and found it while asking “how can we tell which of the C_n does x=\omega pass through?”. I remembered from reading about continued fractions, [a_0;a_1,a_2,\ldots], that the “convergents”, [a_0;a_1,\ldots,a_n] are rationals p_n/q_n that give “best rational approximations”. Those should be the circles that x=\omega passes through, or so. And so if you pass through p_n/q_n, the next circle you should get to is p_{n+1}/q_{n+1}. Well, it turns out, there is a nice relationship between successive convergents. Namely, p_{n+1}=a_{n+1}p_n+p_{n-1} and q_{n+1}=a_{n+1}q_n+q_{n-1}. Look at that! It’s the (a_{n+1},1)-weighted mediant of p_n/q_n with p_{n-1}/q_{n-1}! Fantastic. Could this story get any better? I’m glad I’ve decided to find out this month.

Neighbors in Farey Sequences

November 5, 2009

I was reading about Ford circles today, and was planning on writing about them. But then I noticed that one of the things I wanted to say was based on some things I didn’t say about Farey sequences when I talked about them a few days ago. So I thought I’d go back and fill in that gap, and save Ford circles for tomorrow.

So what did I skip? Well, I mentioned that to get from one Farey sequence to the next, you throw in mediants whose denominators aren’t too big. Is there a way, given a fraction h/n in F_n to figure out what terms in F_{n-1} it was the mediant of? Well, we could, of course, list the terms in F_{n-1}, in order, and then pick the two that surround the value h/n. I’m after another way.

Since h/n is supposed to be reduced, h and n are relatively prime, and so there are integers s and t such that hs-nt=1 (this usually (in my mind) gets written with a “+”, not a “-“, but just change the sign of your t). Once you’ve found such a pair (s,t), all of the other pairs that satisfy the same equation are of the form (s+mn,t+mh), where m is any integer. Among the s+mn, there is only one in the interval [0,n). Abusing notation a bit, or assuming you picked well to start, let’s call this value s, and its partner t. I claim that t/s is the term directly before h/n in F_n.

Before talking about t/s, I should have checked that s\neq 0. If this is not the case, then hs-nt=1 means n=1, in which case we are in F_1 which is just 0/1 and 1/1. Not much interesting going on there. So let’s go ahead and assume s>0.

Now, the linear equation hs-nt=1 gets re-written h/n-t/s=1/(ns) to show that h/n>t/s (so the fractions are at least in the correct order). Messing about with inequalities some more, you can check that 0\leq t<s, and since s and t are relatively prime (they satisfy the linear equation), the fraction t/s is actually in F_n. Let a/b be the predecessor of h/n in F_n (we want to show a=t,b=s). Since it is a neighbor, we showed last time that then hb-na=1. But now we’re saying that all solutions of the equation hx-ny=1 have a fixed form. That is, b=s+mn and a=t+mh for some integer m. But if m\neq 0 then b is not in [0,n), and so a/b is not in F_n. So m=0 and a=t,b=s, and we’re done.

Man, ok, so, after all that, we now know how to find the term before h/n in F_n. I’ll call it t/s, sticking with our notation above. What about the term after h/n? You could play mostly the same games as above, I suppose. Or, if the successor is T/S, then we know that h=t+T and n=s+S, since we made h/n as a mediant. So T=h-t and S=n-s, easy enough. [Update 20091109: You have to be careful with this. The above method for finding the successor to h/n in F_n only works because you are in F_n. If you look for successors in h/k in F_n, where k<n, this method, using the predecessor, is wrong. For example, 1/4 is the predecessor of 1/3 in F_5, but the fraction (1-1)/(3-4) suggested by the previous method is clearly NOT the sucessor of 1/3 in F_5]

So we know the neighbors of h/n in F_n. What is the next Farey sequence when h/n gets a new neighbor? Let’s stick to the next time it gets a new neighbor “on the left” (less than itself). We know that when that happens, it had to be because we made the mediant with the old neighbor, our friend t/s. So the next neighbor is (t+h)/(s+n), which doesn’t show up until F_{s+n}. The newest neighbor after that will be the mediant with this new neighbor, i.e., (t+2h)/(s+2n), in F_{s+2n}. Continuing on, we see that all of the neighbors are of the form (t+mh)/(s+mn), for some positive integer m. I called these (1,m)-weighted mediants (of t/s and h/n) yesterday.

The story goes pretty much the same on the right side. I called the neighbor on the right, in F_n, T/S. The next neighbors will be (h+T)/(n+S), (2h+T)/(2n+S), etc. These are the (m,1)-weighted mediants of h/n and T/S.

We can put both of these into the same statement by noticing that the (1,-m) mediant of t/s with h/n is the same as the (m-1,1) mediant of h/n with T/S. In other words, all of the neighbors of h/n, in all of the Farey sequences, can be written as (1,m) weighted mediants of t/s with h/n. They will have numerators of the form a=t+mh and denominators of the form b=s+mn. Guess what? These are exactly all of the solutions to hx-ny=1. Go back a few paragraphs, it’s the same thing I said there.

In summary, the neighbors to h/n, in any Farey sequence, are in one-to-one correspondence with solutions to hx-ny=1.

If I knew what I were doing, I could probably have gotten to this statement a bit more quickly. But I don’t remember the last time I knew what I was doing, so there you have it.

Rational Approximations

November 3, 2009

Suppose x is an irrational in the interval (0,1), and fix a positive integer n. Let’s see what I can say about rational approximations to x whose denominator is no bigger than n.

First of all, x lies in one of the intervals [j/n,(j+1)/n], where 0\leq j<n. Since x is irrational, it won’t be an endpoint of such an interval, nor will it be the midpoint. Which is to say, it is closer to one end than the other. More explicitly, we can say that |x-j/n|<1/(2n) for some 0\leq j\leq n.

Instead of requiring the denominator of a rational approximation to actually be n, what if we just ask that it be no bigger than n? Then we’re asking how well rationals in the n-th Farey sequence, F_n, approximate x.

Instead of thinking about |x-p/q|, let’s multiply everything by q, and so look at |qx-p| (we’ll divide by that q again soon enough). So now we’re not asking about rational approximations, as such, but asking about pairs of integers, p and q with 0\leq p\leq q\leq n, so that qx is “pretty close” to the integer p.

Let’s re-phrase once more. Consider all of the multiples x,2x,3x,\ldots,nx of x. All of these are irrationals, but could now be bigger than 1 (our original x was less than 1). I don’t want to think about things bigger than 1, so let me trim all of these multiples qx down to just the “fractional part”. Which is to say, I’ll now think about all of the values qx-\lfloor qx\rfloor (the “floor” notation picks out the “greatest integer less than”), which are all irrationals in the interval (0,1). Now for each q I have an integer \lfloor qx\rfloor, and let me denote this integer by p_q. You might note that p_q is always less than n.

So now I am thinking about the n irrational values qx-p_q in the interval (0,1). I’ll think about them in comparison to the rational points 0,1/n,2/n,\ldots,1, like I did before. Following Rademacher (as most of my content has been and will be), we distinguish two cases.

  1. Suppose some qx-p_q falls in the interval (0,1/n). Then |qx-p_q|< 1/n, and so |x-p_q/q|<1/(qn)\leq 1/q^2 (since q\leq n).
  2. If (0,1/n) is free of our points, then by the pidgeon-hole principle, some interval (j/n,(j+1)/n) contains two values, say ax-p_a and bx-p_b (arrange notation so that a<b, forcing also p_a<p_b). Then |(bx-p_b)-(ax-p_a)|< 1/n, which we re-arrange to say that |(b-a)x-(p_b-p_a)|< 1/n. Letting q=b-a and p=p_b-p_a, we have |qx-p|< 1/n, which we again re-write, as in case (1), to say that |x-p/q|< 1/q^2.

We have, therefore, improved the “order” of our approximation. To begin with, we said that rational approximations could be found with |x-j/n|< 1/(2n). I think of this as a “order 1” approximation, since the denominator of the error is a linear function of the denominator of the rational approximation. We have now improved this so that |x-p/q|< 1/q^2, which I would therefore call an “order 2” approximation. If you are a little more careful with your inequalities, you can improve the bound to |x-p/q|< 1/(2q^2) (an error that I think I’ll probably end up mentioning again tomorrow), basically for the same reason there is a coefficient of 2 in the order 1 approximation, I guess.

I should probably be a little (a lot) more careful in what I am saying with the above. I chose an irrational x and an integer n, and said that an “order 2” rational approximation could be found. The stronger claim is that, in fact, infinitely many order 2 rational approximations can be found, if you let n increase. Suppose your first order 2 approximation gives you |x-p/q|< 1/q^2, when you have bounded the denominator by n. Find a new N so that 1/N< 1/q^2. Then go through the process above, using N as the new bound on the denominator. When you find |x-p'/q'|< 1/(q'N)\leq 1/(q')^2 with that process, then you’ve found a new rational approximation (since it is closer to x then p/q was) that is still order 2 (since that’s what the process does for you).

So, can order 2 be improved on? Can you find order 3, or higher? It turns out, “no”, in general (and the golden ratio is an example). You can improve the coefficient of 2 in the order 2 approximation to \sqrt{5}, but that’s apparently the best you can do, in general. There’s more than can be said along these lines. The Roth-Liouville irrationality measure might be fun to talk about (it earned Roth a Fields medal, so it must be ok), as would continued fractions. I’m not sure how much these are related to my stated goal of understanding the link between Farey sequences and the Riemann hypothesis, so for now, perhaps I’ll leave them alone.

For some reason, I thought it might be fun to make a visualization about rational approximations using F_5. Here is a picture of the rationals in F_5, as a subset of [0,1]:

\begin{picture}(200,10) \put(0,0){\line(1,0){200}}\multiput(0,0)(40,0){6}{\circle*{3}}\multiput(50,0)(50,0){3}{\circle*{3}}\multiput(67,0)(67,0){2}{\circle*{3}}\end{picture}

Here’s a graph of the function taking x to “the rational in F_5 closest to x“:

\begin{picture}(200,200) \put(0,0){\line(1,0){200}}\put(0,0){\line(0,1){200}}\multiput(0,0)(40,0){6}{\circle*{3}}\multiput(50,0)(50,0){3}{\circle*{3}}\multiput(67,0)(67,0){2}{\circle*{3}}\put(20,40){\line(1,0){25}}\put(45,50){\line(1,0){13}}\put(58,67){\line(1,0){16}}\put(74,80){\line(1,0){16}}\put(90,100){\line(1,0){20}}\put(110,120){\line(1,0){16}}\put(126,133){\line(1,0){16}}\put(142,150){\line(1,0){13}}\put(155,160){\line(1,0){25}}\put(180,200){\line(1,0){20}}\end{picture}

Farey Sequences

November 3, 2009

Ok, so if I’m going to talk about Farey sequences and the Riemann hypothesis for MaBloWriMo, then Farey sequences are probably a good place to start.

Let us say that the n-th Farey sequence is the sequence of fractions h/k where 0\leq h\leq k\leq n and h and k are relatively prime (so that h/k is in “lowest terms”). Let’s denote this sequence by F_n. The first few are:

F_1: 0/1\quad 1/1

F_2: 0/1 \quad 1/2\quad 1/1

F_3: 0/1\quad 1/3\quad 1/2\quad 2/3\quad 1/1

F_4: 0/1\quad 1/4\quad 1/3\quad 1/2\quad 2/3\quad 3/4\quad 1/1.

There’s sort of a nice way to get from one Farey sequence to the next: add fractions incorrectly.

The mediant of the fractions h_1/k_1 and h_2/k_2 is defined to be the fraction (h_1+h_2)/(k_1+k_2). It turns out that the mediant of two fractions always lies between them, which you can check by taking differences and noticing they have the correct sign. Now, to make the sequence F_{n+1}, knowing the sequence F_n, you simply include all the existing terms, and any mediants of successive terms that have a denominator no bigger than n+1. So, for example, consider F_4 above. We consider the fractions obtained by adding to this sequence the mediants:

0/1\quad 1/5\quad 1/4\quad 2/7\quad 1/3\quad 2/5\quad 1/2\quad 3/5\quad 2/3\quad 5/7\quad 3/4 \quad 4/5\quad 1/1

and then we throw out any fraction whose denominator is bigger than 5, giving us

F_5: 0/1\quad 1/5\quad 1/4 \quad 1/3\quad 2/5\quad 1/2\quad 3/5\quad 2/3\quad 3/4 \quad 4/5\quad 1/1.

There’s a little something to worry about in this process. Well, probably a few little somethings. The one I’ve got in mind is: are the mediants always reduced? However, you might also ask: are we sure we get all of the fractions this way?

To begin to answer these, let me change the subject. Look at any of the Farey sequences listed above, and pick any two successive terms in that sequence. Call them h_1/k_1 and h_2/k_2, and let’s say that we’ve picked our indices so that h_1/k_1 < h_2/k_2. Now compute h_2/k_2 - h_1/k_1. Done? You got 1/(k_1k_2), didn’t you? This happens no matter what Farey sequence you are in, and no matter which successive terms you take. You always find that, with the notation above, h_2k_1-h_1k_2=1. This is fairly interesting.

We already knew that, for example, h_1 and k_1 were relatively prime. And for a pair h_1,k_1 of relatively prime integers, there always exist integers s,t such that h_1s+k_1t=1. Conversely, if you can find such a pair of integers, then h_1 and k_1 are relatively prime. What’s perhaps a little surprising is that, in this case, the s and t happen to show up in the next term of the Farey sequence: s=-k_2 and t=h_2.

Armed with the observation that the difference between successive terms has a 1 in the numerator, in even just the first Farey sequence, F_1, we can proceed by induction on n to show that if h_1/k_1 < h_2/k_2 are successive terms in F_n then: (1) h_2k_1-h_1k_2=1, and also (2) the mediant of these fractions is reduced. Therefore, if the mediant has a denominator no bigger than n+1, it will get included in F_{n+1}.

Does this process of expanding one Farey sequence to the next by throwing in mediants actually give us all of the terms of the next Farey sequence? That is, if m/n is a fraction in reduced form, is it the mediant of two successive fractions from F_{n-1}?

Let me answer that by considering what I’ll call weighted mediants (a name I just made up, it may or may not show up in your favorite reference). Let me say that the (a,b)-weighted mediant of fractions h_1/k_1 < h_2/k_2 is the fraction (ah_1+bh_2)/(ak_1+bk_2). I’ll assume a,b,h_1,h_2,k_1,k_2>0. So, for example, the normal mediant, defined above, is just the (1,1)-weighted mediant. I claim that: (1) every weighted mediant lies between the two original fractions, (2) every fraction between two successive terms in a Farey sequence can be obtained as a weighted mediant.

Claim (1) is easy, just take appropriate differences and note that they have the appropriate sign, as before. For claim (2), suppose h_1/k_1<p/q<h_2/k_2, where h_1/k_1,h_2/k_2 are successive terms in a Farey sequence, so that h_2k_1-h_1k_2=1. Our goal is to find a,b so that p=ah_1+bh_2 and q=ak_1+bk_2. The assumption h_2k_1-h_1k_2=1 means that we can, indeed, find integers satisfying this condition, namely: a=-k_2p+h_2q and b=k_1p-h_1q.

So, back to the question about obtaining every fraction in F_n by taking mediants from F_{n-1}. We need to show that if m/n<1 and m is relatively prime to n, then m/n is the mediant of two successive fractions in F_{n-1}. Well, m/n certainly lies between two of them, say h_1/k_1 and h_2/k_2 in keeping with our notation, and we have just shown that it may therefore be obtained as some weighted mediant (we want it to be the (1,1)-weighted mediant). Since the denominator of the (a,b)-weighted mediant is ak_1+bk_2, the smallest denominator is obtained only with the (1,1)-weighted (i.e., normal) mediant. This denominator must be at least n, because otherwise the fraction would show up in some Farey sequence before F_n. As m/n is obtainable as a weighted mediant, it must be the (1,1)-weighted mediant, which is what we wanted to show. Let’s call it a day.


November 3, 2009

I’m at least a day behind, but I’m thinking I might try this “Math Blog Writing Month” suggested by Charles Siegel over at Rigorous Trivialities.

A few months ago I was reading Hardy and Wright’s “An Introduction to the Theory of Numbers” with some friends. While that reading seems to have ended (we made it further than originally expected, to be honest), there was a topic I read about that I decided I wanted to learn more about. When H&W covered Farey sequences, I checked out the Wikipedia page and was fairly delighted to find that there is a statement about Farey sequences that is equivalent to the Riemann hypothesis (RH).

I know nearly nothing about the RH. You take a function defined by an infinite series, extend to the complex plane, notice that the zeroes have to lie in a particular region, and then the claim is that you can narrow down that region even more. Great. Why do I care about the zeroes of this function? Oh, right, because they are related to the distribution of the primes. And how is that again? I have no idea. Probably I should be embarrassed about this, being a math graduate student and not knowing the math behind the most famous open problem.

Well, now seems like as good a time as any to learn more. Farey sequences look fun, and close to some pretty pictures (Ford circles and the ? function). And so my plan is to learn enough about all of that, and the RH, to see if I can understand the relationship between the two. Not so that I can try to prove RH, mind you (I seriously have no delusions about this). It just seems like a fun thing to learn about. Plus maybe it’ll give me an excuse to give a talk in my department’s “graduate seminar”, which I always think is a fun thing to do.

I’ve got a few sources that I plan on looking at (and, indeed, have already been looking at). I found Rademacher’s “Higher Mathematics from an Elementary Point of View” to be quite delightful, and it contains information about the Farey and Ford part of the story. Of course, H&W might also help out here. I’ve got some papers tucked away somewhere I may dig up as well, and when I do, I’ll mention them. And, on the Riemann side of the story, I think I’ll look at Edwards’ “Riemann’s Zeta Function”, which has a section on the connection with Farey sequences.

I figure I’ll try to read at least a little bit every day, and share what I come across here. Perhaps in a month I’ll have some vague understanding (goal: less vague than it is now) about what’s going with the topic. Or perhaps I’ll crap out after a week (or less). But for now, let’s say I’m trying. Starting off a day late (and with an essentially non-mathematical post) means I probably should try to find a day or two and write more than one post. We’ll see. Also, I don’t expect I’ll get close to the 1000 word mark very often. We’ll see.