Posts Tagged ‘ford circle’

LFTs and Ford Circles

November 7, 2009

Given 4 complex numbers, a,b,c,d, we may consider the linear fractional transformation (LFT)

z\mapsto\dfrac{az+b}{cz+d}.

Well, 4 numbers are enough to make a matrix, \left(\begin{smallmatrix} a & b\\c & d\end{smallmatrix}\right). Is there any better reason to relate the linear fractional transformation with this matrix?

Suppose you have two matrices, \left(\begin{smallmatrix}a&b \\ c&d\end{smallmatrix}\right) and \left(\begin{smallmatrix}a'&b'\\c'&d'\end{smallmatrix}\right). Then the product is as follows:

\begin{pmatrix}a&b\\c&d\end{pmatrix}\begin{pmatrix}a'&b'\\c'&d'\end{pmatrix}=\begin{pmatrix}aa'+bc' & ab'+bd' \\ ca'+dc' & cb'+dd'\end{pmatrix}.

If you take the composite of the two linear fractional transformations, i.e.,

\dfrac{a\cdot \frac{a'z+b'}{c'z+d'}+b}{c\cdot \frac{a'z+b'}{c'z+d'}+d}

and then play around simplifying that expression for a few minutes, you obtain the LFT

\dfrac{(aa'+bc')z+(ab'+bd')}{(ca'+dc')z+(cb'+dd')},

which is precisely the LFT corresponding to the product matrix above. So, if nothing else, writing LFTs as matrices this way won’t lead us astray when thinking about composites.
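If you would rather not do the simplification by hand, here is a minimal Python sketch that checks the claim on one example with exact rational arithmetic. The helper names `lft` and `matmul`, and the particular matrices and test point, are my own choices for illustration.

```python
from fractions import Fraction

def lft(m, z):
    """Apply the LFT of the 2x2 matrix m = ((a, b), (c, d)) to z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    """Ordinary 2x2 matrix product m * n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

M = ((2, 1), (1, 3))
N = ((1, -1), (4, 5))
z = Fraction(3, 7)

composite = lft(M, lft(N, z))      # apply N first, then M
product = lft(matmul(M, N), z)     # apply the LFT of the product matrix

print(composite, product)          # the two values agree
```

One example is not a proof, of course, but it is a quick way to catch a sign error in the algebra.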

This idea is not without its confusion, for me anyway. Generally when you think about a 2×2 matrix of complex values, you are thinking about that matrix as a linear map \mathbb{C}^2\to\mathbb{C}^2, which is not what we are doing above. Instead, I guess we are saying that the “monoid” (group without inverses) of 2×2 matrices, M_2(\mathbb{C}), acts (in the technical sense) on \mathbb{C} as linear fractional transformations. My guess is that there are even better ways to say what is going on.

I think it is also important to keep in mind that two different matrices may correspond to the same LFT. For example, \left(\begin{smallmatrix}1&2\\ 0&2\end{smallmatrix}\right) represents the same LFT as \left(\begin{smallmatrix} 1/2 & 1\\ 0 & 1\end{smallmatrix}\right). More generally, if \lambda is any complex value (nonzero), then \left(\begin{smallmatrix}a&b\\c&d\end{smallmatrix}\right) represents the same LFT as \left(\begin{smallmatrix}\lambda a&\lambda b\\ \lambda c & \lambda d\end{smallmatrix}\right). I guess one can think of M_2(\mathbb{C}) as a \mathbb{C}-vector space (isomorphic to \mathbb{C}^4), and then think of its projective space (the quotient where two “vectors” (matrices here) are the same when they differ by a scalar (complex) multiple), which I’ll denote P(M_2(\mathbb{C})). Then I think I’m saying that the action of M_2(\mathbb{C}) on \mathbb{C} actually is an action of the quotient, P(M_2(\mathbb{C})). I’m not sure if this is a useful viewpoint (or, indeed, correct).
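Written out, the scalar just cancels from the numerator and denominator:

\dfrac{(\lambda a)z+\lambda b}{(\lambda c)z+\lambda d}=\dfrac{\lambda(az+b)}{\lambda(cz+d)}=\dfrac{az+b}{cz+d}.

In the example above, \lambda=1/2, and both matrices give the LFT (z+2)/2=z/2+1.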

Yesterday, when I was talking about how to picture what an LFT does to \mathbb{C}, I wrote down a factorization of the LFT as a composite. Our new notation gives us another way to write that factorization (recall \alpha=-(ad-bc)/c^2, and that we had assumed c\neq 0):

\begin{pmatrix}a&b\\c&d\end{pmatrix}=\begin{pmatrix}\alpha & a/c\\ 0&1\end{pmatrix}\begin{pmatrix}0&1\\ 1&0\end{pmatrix}\begin{pmatrix}1&d/c\\ 0&1\end{pmatrix}.

As is frequently useful, we will assume that ad-bc\neq 0 (indeed, this factorization seems to require it – I think I’m missing something somewhere, anybody see it?). Notice that ad-bc is the determinant of the matrix representing our LFT. Scaling every entry of a 2×2 matrix by \lambda scales its determinant by \lambda^2, so we may multiply all entries in our matrix (without changing the LFT, as discussed above) by 1/\sqrt{ad-bc} (either square root will do), and obtain a matrix with determinant 1. Let’s do that, making \alpha=-1/c^2.

Yesterday, when I was working on the factorization above, I only had something like \epsilon idea where I was going. I think today I’ve got about twice that, so I want to re-write the factorization. Let me write it as

\begin{pmatrix}1 & a/c\\ 0 & 1\end{pmatrix}\begin{pmatrix}1&0 \\ 0& c^2\end{pmatrix}\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}\begin{pmatrix}1 & d/c\\ 0 & 1\end{pmatrix}.
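Here is a small Python sketch that multiplies out the four factors for one sample matrix with ad-bc=1. The sample values a,b,c,d=2,1,5,3 and the helper `matmul` are assumptions of mine. Note that the product comes out as c times the original matrix, which is the same LFT, since scalar multiples represent the same transformation.

```python
from fractions import Fraction as F
from functools import reduce

def matmul(m, n):
    """Ordinary 2x2 matrix product m * n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

a, b, c, d = 2, 1, 5, 3            # sample integers with ad - bc = 1

factors = [
    ((F(1), F(a, c)), (F(0), F(1))),      # translation by a/c
    ((F(1), F(0)), (F(0), F(c * c))),     # z -> z / c^2
    ((F(0), F(-1)), (F(1), F(0))),        # z -> -1/z
    ((F(1), F(d, c)), (F(0), F(1))),      # translation by d/c
]

product = reduce(matmul, factors)
scaled_original = ((c * a, c * b), (c * c, c * d))

print(product == scaled_original)
```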

So what’s the connection with Ford circles? Recall that for a reduced fraction h/k, the associated Ford circle is the circle centered at (h/k,1/(2k^2)) with radius 1/(2k^2). Following Rademacher (and, presumably, others), let us say that the “fraction” 1/0 also gets a Ford “circle”, the line y=1 in the plane. This isn’t such a nasty thing to do, as it has the tangency properties I talked about when talking about Ford circles. Anyway, let us think about applying our transformation \left(\begin{smallmatrix}a&b\\c&d\end{smallmatrix}\right), as the composite given above, and see what happens to this line y=1. We’ll assume that a,b,c, and d are all integers.

The first step, \left(\begin{smallmatrix}1&d/c\\ 0 &1\end{smallmatrix}\right) is the linear translation z+d/c. Since d/c is real (since c,d are integers), this translation is a horizontal shift, which leaves y=1 unchanged.

Next up, \left(\begin{smallmatrix}0&-1\\1&0\end{smallmatrix}\right), which is -1/z. Thinking of a point on the line y=1, you can quickly determine that its polar coordinates are (\csc \theta,\theta). The transformation -1/z is the composite of: (1) inversion with respect to the unit circle (the point becomes (\sin \theta,\theta)), (2) reflection across the horizontal axis (giving (\sin \theta,-\theta)), and finally (3) multiplication by -1 (giving (-\sin \theta,-\theta), since these are polar coordinates). This final point is (\sin \theta,\pi-\theta)=(\sin(\pi-\theta),\pi-\theta). As \theta varies through (0,\pi), \pi-\theta also varies through this interval, and so we get the graph of the polar curve r=\sin \theta. If you know your polar curves, you know what this looks like…

So, the first two transformations take the line y=1 to the circle with center (0,1/2) and radius 1/2. The next in our composite is multiplication by 1/c^2, which is just a scaling (since c\in \mathbb{R}). This scaling takes our circle to the circle with center (0,1/(2c^2)) and radius 1/(2c^2). Finally, the last transformation is another horizontal translation, leaving our circle centered at (a/c,1/(2c^2)). We recognize this as the Ford circle for the fraction a/c (as long as that fraction is reduced).
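As a numerical sanity check on the whole chain, here is a minimal Python sketch that pushes a few points of the line y=1 through the LFT and measures their distance from the expected center. The sample matrix (a,b,c,d=2,1,5,3, with ad-bc=1), the sample x-values, and the floating-point tolerance are all my own assumptions. It also tries a few points with y=2, which should land strictly inside the circle.

```python
def lft(a, b, c, d, z):
    """Apply z -> (az + b) / (cz + d) to a complex number z."""
    return (a * z + b) / (c * z + d)

a, b, c, d = 2, 1, 5, 3                      # integers with ad - bc = 1
center = complex(a / c, 1 / (2 * c * c))     # expected Ford circle center
radius = 1 / (2 * c * c)                     # expected Ford circle radius

xs = (-10, -1, 0, 0.5, 3, 40)
on_line = [complex(x, 1) for x in xs]        # points on y = 1
above_line = [complex(x, 2) for x in xs]     # points above y = 1

# Images of points on y = 1 should sit on the circle (up to rounding).
on_circle = all(
    abs(abs(lft(a, b, c, d, z) - center) - radius) < 1e-12 for z in on_line
)
# Images of points above y = 1 should sit strictly inside it.
inside = all(abs(lft(a, b, c, d, z) - center) < radius for z in above_line)

print(on_circle, inside)
```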

Wasn’t that fun? If you want to think about it some more, you might convince yourself that any point above the line y=1 will get moved to a point inside the Ford circle resulting from this process.

Anyway, enough out of me. Hopefully tomorrow I’ll have slightly more of an idea what I’m talking about. Don’t count on it though.


Ford Circles

November 5, 2009

Now that we’ve got some foundations laid for Farey sequences and rational approximations, let’s put pictures to some of those claims, via Ford circles.

For every rational h/k, in reduced terms, draw a circle in the plane with center (h/k,1/(2k^2)) and radius 1/(2k^2). I’m not going to try to come up with a better picture for you than the one used on the Wikipedia page for Ford circles. (I tried embedding the picture below, but that didn’t work. So I’ll leave it to you to click the link.)

What’s so great about this construction? Well, you may recognize those radii as the error term in our discussion of rational approximations. The statement that for every irrational \omega there are infinitely many reduced fractions h/k such that |\omega-h/k|<1/(2k^2) is the same as saying that the vertical line x=\omega intersects infinitely many of the Ford circles. The improved bound 1/(\sqrt{5}k^2) can also be found by (very carefully) analyzing vertical lines passing through Ford circles (more specifically, the gaps between them), but I think I’m going to let that statement be for now. If you want, the proof via Ford circles is in [Rademacher], and a proof via continued fractions is in [Hardy and Wright]. The result is known as Hurwitz’s theorem.

What else can be said about Ford circles? Well, yesterday I was talking about neighbors in Farey sequences. It turns out that h/k and H/K are neighbors in a Farey sequence if and only if the Ford circles at h/k and at H/K are tangent to each other.

To see why this is true, consider the circles at h/k and H/K, with h/k<H/K (so Hk-hK>0). Since we know the location of their centers, we can find an expression for the distance between those centers. The two circles will intersect iff the distance between the centers is no bigger than the sum of the radii of the two circles. Re-arranging terms (and working with squares to avoid square roots), you might look at d^2-(r_k+r_K)^2, where d is the distance between the centers, and r_k and r_K denote the (hopefully obvious) radii. The two circles will be tangent if this is 0, and won’t intersect if the difference is positive. More care needs to be used when talking about what happens if the difference is negative, but we’ll see momentarily that this doesn’t happen, so let’s forget it. So, it turns out that d^2-(r_k+r_K)^2 is ((Hk-hK)^2-1)/(k^2K^2). Since all the letters there are positive integers, and Hk-hK is a positive integer too, we see that d^2-(r_k+r_K)^2\geq 0, with equality iff Hk-hK=1. This says, based on our work yesterday, precisely that two Ford circles intersect iff the fractions they correspond to are neighbors in a Farey sequence, and when this happens, they intersect at a single point (they are tangent to each other).
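The expression for d^2-(r_k+r_K)^2 is easy to check with exact arithmetic. Here is a minimal Python sketch; the function name `gap` and the sample fractions are my own, chosen for illustration.

```python
from fractions import Fraction

def gap(h, k, H, K):
    """Return d^2 - (r_k + r_K)^2 for the Ford circles at h/k and H/K."""
    dx = Fraction(H, K) - Fraction(h, k)        # horizontal distance between centers
    r = Fraction(1, 2 * k * k)                  # radius (and height) at h/k
    R = Fraction(1, 2 * K * K)                  # radius (and height) at H/K
    d_squared = dx * dx + (R - r) ** 2
    return d_squared - (r + R) ** 2

def closed_form(h, k, H, K):
    """The simplified expression ((Hk - hK)^2 - 1) / (k^2 K^2)."""
    return Fraction((H * k - h * K) ** 2 - 1, k * k * K * K)

tangent = gap(1, 3, 1, 2)     # 1/3 and 1/2: Hk - hK = 1, so the circles touch
apart = gap(1, 3, 2, 3)       # 1/3 and 2/3: Hk - hK = 3, so they are disjoint

print(tangent, apart)
```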

This justifies the picture from above, then. The circles never overlap, and some of them are tangent.

It might be worth it to go back and justify why any vertical line at an irrational value slices through infinitely many of the Ford circles. To that end, let \omega\in (0,1) be irrational. The two circles at 0/1 and 1/1 both have radius 1/2, and so certainly x=\omega hits at least one Ford circle, kicking off an induction argument.

Suppose that x=\omega hits the Ford circle h/k. For the sake of argument, let us suppose that \omega>h/k, so the vertical line lies to the right of the center of the circle. Not much changes if the opposite inequality is true, so let’s just stick with this one. I aim to show that x=\omega hits a circle with a smaller radius, i.e. goes through a circle corresponding to a fraction with a bigger denominator than k. I claim that it hits one of the circles tangent to the circle at h/k. By the above, we know that these are the circles corresponding to neighbors of h/k in some Farey sequence. Since I’m thinking about circles to the right of h/k, these will be successors in the Farey sequence. Yesterday we said that all of these successors are (m,1)-weighted mediants of h/k with H/K where H/K is the successor of h/k in F_k (the first Farey sequence that h/k shows up in).

Consider, then, the circles corresponding to the (1,1), (2,1), (3,1),… weighted mediants of h/k with H/K. Let me call the circles C_1,C_2,\ldots, using C for the circle at h/k. It isn’t too much work to check the necessary inequalities to show that the center of C_1 is further right than the right-most point on C. I’ve already said, even if not directly, that for each n, C_n is tangent to C_{n+1}. So, if you want, you may start at the top of C_1, and move left along the tops of all the circles in turn, jumping from one to the next at points of tangency. How far can you go doing this? Well, if you take the limit as m\to\infty of the (m,1)-weighted mediant of h/k with H/K, you get h/k (l’Hospital!). So, the tops of the circles form an unbroken path from the top of C_1 to as close to h/k, on the right, as we care to get. The vertical line x=\omega, which passes through the right side of C, must surely pass through this path, and therefore at least one of the C_n.

That’s probably just about enough with Ford circles for now. I’m not sure what I’ll be writing about tomorrow. Hopefully something else fun. I would like to note that, as I was writing this post, I was excited to find a connection to continued fractions. I knew a connection should be there (and I’ll probably find plenty more later), and found it while asking “how can we tell which of the C_n does x=\omega pass through?”. I remembered from reading about continued fractions, [a_0;a_1,a_2,\ldots], that the “convergents”, [a_0;a_1,\ldots,a_n] are rationals p_n/q_n that give “best rational approximations”. Those should be the circles that x=\omega passes through, or so. And so if you pass through p_n/q_n, the next circle you should get to is p_{n+1}/q_{n+1}. Well, it turns out, there is a nice relationship between successive convergents. Namely, p_{n+1}=a_{n+1}p_n+p_{n-1} and q_{n+1}=a_{n+1}q_n+q_{n-1}. Look at that! It’s the (a_{n+1},1)-weighted mediant of p_n/q_n with p_{n-1}/q_{n-1}! Fantastic. Could this story get any better? I’m glad I’ve decided to find out this month.
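To see the recurrence and the weighted mediant side by side, here is a minimal Python sketch. The function names, and the choice of the partial quotients [1;2,2,2,2] (the start of the expansion of \sqrt{2}), are my own assumptions for illustration.

```python
def convergents(partial_quotients):
    """Convergents (p_n, q_n) via p_{n+1} = a_{n+1} p_n + p_{n-1}, same for q."""
    p_prev, q_prev = 1, 0                       # the conventional p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1              # p_0 / q_0 = a_0 / 1
    out = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def weighted_mediant(m, u, v):
    """The (m,1)-weighted mediant of the fractions u and v, as (num, den) pairs."""
    return (m * u[0] + v[0], m * u[1] + v[1])

quotients = [1, 2, 2, 2, 2]
cs = convergents(quotients)
print(cs)

# Each convergent from the third on is a weighted mediant of the previous two.
for n in range(1, len(cs) - 1):
    assert weighted_mediant(quotients[n + 1], cs[n], cs[n - 1]) == cs[n + 1]
```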

Hardy and Wright, Chapter 10

May 29, 2009

We decided to split the reading of chapter 10 into two weeks (chapter 9 here, in case you missed it). It’s a longish chapter, and I really like continued fractions (though I’m not particularly sure why, they’re just fun) and some of the other readers thought it might be worth it to spend more time reading it carefully.

Our first meeting covered the first few sections, which only involved basic definitions, and the theorem that every real number has an essentially unique continued fraction expansion, and the expansion is finite if and only if the number is rational. Eric stated that he was unimpressed so far, and didn’t see what I was so fascinated by. None of us seemed to have any questions about the reading, so I gave a glimpse of things to come (relation of periodic continued fractions to quadratics, and rational approximations). I also mentioned that there are some interesting tie-ins to the “modular group” (SL_2(\mathbf{Z})), Farey sequences, and Ford circles (which have come up before). Eric has been reading about hypergeometric series, and said there are some interesting formulas there related to continued fractions. He also asked if there was some relation to surreal numbers, because continued fractions approximate numbers from the left and right, alternatingly.

We picked up, the second week, in section 10.10 “A lemma”, defining an equivalence relation on reals. The relation works out to be that two numbers are equivalent if the tails of their continued fractions are the same. Chris corrected a misinterpretation Eric brought up, about canonical representatives of equivalence classes. I had wondered if the equivalence meant that, in terms of periodic continued fractions representing “quadratic” numbers, two numbers (a_1+\sqrt{b})/d_1 and (a_2+\sqrt{b})/d_2 would always be equivalent. In fact, I thought I had decided they were. But an example in the book shows that this is not the case (\sqrt{5}=[2,\dot{4}] while (\sqrt{5}+1)/2=[\dot{1}], dots representing the repeating part). Eric pointed out that two points are related if they are in the same orbit of the modular group acting on \mathbb{R} as a subset of \mathbb{C}, acting as linear fractional transformations.

We spent a little while talking about periodic continued fractions, how the two directions of the proof that they are equivalent to “quadratics” go. I think the proof that any quadratic has a periodic continued fraction is fascinating. It gives no indication how long the period will be, or when it will start.

Next I mentioned that there’s a convenient algorithm for finding the continued fraction for a “quadratic surd”, and that I intend to post some Python code here implementing it (and other fun functions for playing with continued fractions). While it’s essentially the normal algorithm, taking floors and then reciprocals, there’s some convenience in having quadratics around, because you can “rationalize the numerator” and that sort of thing. Not mentioned in the text, but stated at both Wikipedia and Mathworld (links below), is that Lagrange showed that the continued fraction for \sqrt{D} has a period smaller than 2D, that the period begins after a single non-repeating term, and that the last term in the period is twice a_0 (the first term of the continued fraction). All of these things are true of the examples given in the text. And, while finding links… holy crap! the repeating part, besides the last numeral, is palindromic! Is there no end to the fascination!?
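In the meantime, here is a minimal sketch of the standard integer recurrence for \sqrt{D}. The name `sqrt_cf` is hypothetical, and this is only the special case of a pure square root, not a general quadratic surd. The stopping rule assumes that the first term equal to 2a_0 marks the end of the period, in line with the facts quoted above.

```python
from math import isqrt

def sqrt_cf(D):
    """Return (a_0, period) for the continued fraction of sqrt(D).

    Each complete quotient is kept in the form (m + sqrt(D)) / d, so only
    integer arithmetic is needed. A perfect square gives an empty period.
    """
    a0 = isqrt(D)
    if a0 * a0 == D:
        return a0, []
    m, d, a = 0, 1, a0
    period = []
    while a != 2 * a0:              # assumed: the period ends at the term 2 * a_0
        m = d * a - m               # subtract the floor, then take the reciprocal
        d = (D - m * m) // d        # "rationalizing"; the division is exact
        a = (a0 + m) // d           # floor of the next complete quotient
        period.append(a)
    return a0, period

for D in (2, 5, 7, 13):
    a0, period = sqrt_cf(D)
    body = period[:-1]
    print(D, a0, period, body == body[::-1])   # last column: palindrome check
```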

I’ll go ahead and just direct you to the Wikipedia page on (periodic) continued fractions, and similarly the Mathworld page on (periodic) continued fractions. Both (and undoubtedly many others) make for fascinating reading.

Our next main focus was on approximation by convergents. Chris pointed out how remarkable the final theorem is, that any time a fraction is sufficiently close to a number (in terms of its denominator), it is automatically a convergent. I mentioned one thing I read about in Rademacher’s “Higher Mathematics from an Elementary Point of View” (which I love), which was that the existence of infinitely many p/q such that |p/q-x|<\frac{1}{2q^2} (corollary of theorem 183) can be interpreted as saying that a vertical line at x passes through infinitely many Ford circles.

I then tried to explain the difference between Theorems 181 and 182, and point out that there are two reasonable definitions of “closest rational approximation”. I had read about these in Khinchin’s “Continued Fractions” (which I also love). I bumbled it a bit, but believe I was saying true things throughout. Basically, the story goes… convergents are best rational approximations in the stronger sense (thm 182), and mediants of successive convergents are best rational approximations in the weaker sense (thm 181). In fact, choose an irrational (for convenience) x, and let \square denote the operation “mediant”. For any n, define m_{n,1}=(p_n/q_n)\square (p_{n+1}/q_{n+1}), and then iteratively m_{n,k}=m_{n,k-1}\square (p_{n+1}/q_{n+1}). The last of these mediants that is on the same side of x as p_n/q_n will be p_{n+2}/q_{n+2}. Continued fractions rock.

It’s really best to think about these lemmas with an actual continued fraction example. I, personally, used 61/45=[1;2,1,4,3], and looked at the mediants m_{1,k}, between the first and second convergent.
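Here is a minimal Python sketch of that example. I am assuming the indexing p_0/q_0=1/1, p_1/q_1=3/2, p_2/q_2=4/3, so that m_{1,k} starts from the mediant of 3/2 and 4/3. Fractions are kept as (numerator, denominator) pairs so that the mediant is taken on the pairs themselves, and the helper name `mediant` is my own.

```python
from fractions import Fraction

def mediant(u, v):
    """Mediant of two fractions given as (numerator, denominator) pairs."""
    return (u[0] + v[0], u[1] + v[1])

x = Fraction(61, 45)              # [1; 2, 1, 4, 3]
p1, p2 = (3, 2), (4, 3)           # assumed: p_1/q_1 and p_2/q_2

# Keep taking mediants with p_2/q_2 while we stay on the same side of x as p_1/q_1.
same_side = []
m = mediant(p1, p2)
while (Fraction(*m) > x) == (Fraction(*p1) > x):
    same_side.append(m)
    m = mediant(m, p2)

print(same_side)                  # the last entry is the convergent 19/14
```

The loop takes four steps here, matching the partial quotient a_3=4.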

We finished with me trying to explain something that I thought was quite surprising. Let f_k(x)=n_k(x)/k denote the closest rational to x, with denominator k (let’s not require the fraction in reduced terms). I was quite surprised, honestly (and convinced Eric he should be too), that for a chosen x, the sequence of such rationals will not be successively better approximations to x. Having had the chance to go through an example with Eric, and then a few hours to mull it over, I’ve since realized this is not particularly surprising at all. Suppose x lies in [1/4,1/3]. Half of these x will be better approximated by 1/3 than 1/4.
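For a concrete instance, here is a minimal Python sketch with x=3/10, which lies in [1/4,1/3]. The choice of x and the helper name `closest_with_denominator` are my own assumptions. The error at denominator 4 is larger than the error at denominator 3.

```python
from fractions import Fraction

def closest_with_denominator(x, k):
    """The fraction n/k closest to x (Fraction reduces it, the value is unchanged)."""
    return Fraction(round(k * x), k)

x = Fraction(3, 10)
errors = {k: abs(x - closest_with_denominator(x, k)) for k in range(1, 7)}

for k, err in errors.items():
    print(k, closest_with_denominator(x, k), err)
```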

So, anyway, I guess that’s all I have to say about continued fractions for now. Perhaps Eric will show us sometime about fun relationships between hypergeometric series and continued fractions. If you haven’t already stopped reading this post to go find all sorts of other interesting things to read about continued fractions, either online or in Rademacher’s or Khinchin’s books, you can now.

Hardy and Wright, Chapter 3

March 28, 2009

Well, somewhat surprisingly, my little reading group hasn’t yet disbanded entirely. Last week we had a great discussion about chapter 2 of H&W, which put us in chapter 3 for this week.

Chris commented that one of the more interesting things, for him, in this chapter was that people would actually be interested in proving things about sequences of reduced fractions with bounded denominator (the Farey sequences that start the chapter). I rather enjoyed playing with them and reading about them. My understanding is that they are useful for starting to get bounds on approximating irrationals by rationals, a topic which I’m fairly certain we will return to. I also was glad to see a proof of Minkowski’s theorem about symmetric regions with large area containing lattice points. I think I’d heard the result before, but it’s always nice to see (and be able to follow!) a proof.

I was a little confused about when a mediant is reduced. I thought that the mediant of reduced fractions was reduced. For any three consecutive terms of a Farey sequence, the middle is the mediant of the other two. There is a footnote stating that the middle term might be the reduced form of the mediant. Why would you need this footnote, if mediants are already reduced? Chris pointed out that in the Farey sequence with denominator no larger than 3 (denoted F_3), 1/3, 1/2, 2/3 are consecutive terms, but the middle is the reduced mediant of the outer terms. All the text claims is that the first time a term shows up, it is the reduced mediant of its neighbors. Clearly mediants of reduced fractions aren’t, necessarily, reduced. Eric and I were frequently embarrassed by things we thought were true during this meeting, and I pointed out that this was exactly why I wanted to read about these things: I don’t work with integers too much 🙂

Eric wondered about the fact that the distance between two consecutive terms h/k and h'/k' in a Farey sequence is 1/(kk'). Chris and I noted that this was obvious from the lemma that kh'-hk'=1. Eric was wondering, though, if one could determine it without that lemma somehow. We didn’t come up with much.
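For the record, the computation from the lemma is one line, putting the two fractions over a common denominator:

\dfrac{h'}{k'}-\dfrac{h}{k}=\dfrac{kh'-hk'}{kk'}=\dfrac{1}{kk'}.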

I presented the answer I had found to the question I had asked myself: “For h/n in F_n, can you find the first n'>n for which h/n has a new successor in F_{n'}? What can you say about the sequence of all such n'?” The book has a lemma about how to find the next term for h/n in F_n. Suppose that its successor is h'/k'. By other lemmas in this chapter, the next time something comes between these terms, it will be the mediant, \frac{h+h'}{n+k'}, and this occurs in F_{n+k'}. The next time h/n will have a new successor, it will be the mediant of h/n with this new term, \frac{h+h'}{n+k'}, and so will be \frac{2h+h'}{2n+k'}. Continuing on, we see that h/n has a new successor in the Farey sequences F_{m\cdot n+k'}.
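Here is a minimal brute-force Python sketch that watches the successor of 1/3 as n grows. The sample fraction, the range of n, and the helper names `farey` and `successor` are my own choices. In F_3 the successor of 1/3 is 1/2, so k'=2, and the new successors should show up at n'=3m+2.

```python
from fractions import Fraction

def farey(n):
    """The Farey sequence F_n: reduced fractions in [0, 1] with denominator <= n."""
    return sorted({Fraction(h, k) for k in range(1, n + 1) for h in range(k + 1)})

def successor(x, n):
    """The term following x in F_n (x must belong to F_n and be less than 1)."""
    seq = farey(n)
    return seq[seq.index(x) + 1]

x = Fraction(1, 3)
changes = []                       # pairs (n', new successor of x in F_{n'})
previous = successor(x, 3)
for n in range(4, 12):
    current = successor(x, n)
    if current != previous:
        changes.append((n, current))
        previous = current

print(changes)
```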

Next, I talked about the Stern-Brocot Tree and Minkowski’s Question Mark function, ?(x). We had mentioned in our meeting that the terms of the Farey sequence are not equally distributed, and I pointed out that there was some relation between that and how ‘wobbly’ ?(x)-x is. Notice that the Farey sequences are least dense around 1/2, and most dense around 0 and 1, which relates well to the wobbliness of the graph of ?(x)-x. This has something to do with defining an appropriate measure so that ?(x) is the integral (of… something. 1? x?) with respect to this measure. At least, that’s what I gathered from Wikipedia. I’d like to read more about this.

Also, Eric and I had both noticed the comment on the Wikipedia page that there is a relation between Farey sequences and the Riemann hypothesis, which we found pretty intriguing. Of course, neither of us knew much about it. Perhaps a topic for another day. It seems to be related to the density, or perhaps distribution is more accurate, of the terms of Farey sequences in the interval [0,1].

I completely forgot to bring up Ford circles. If you put circles of appropriate sizes above the points in a Farey sequence, you get lots of nice tangent circles, with fun properties. Perhaps the property most relevant to this section is that any circles tangent to the circle above p/q are centered at x-coordinates that are neighbors of p/q in some Farey sequence.