Archive for the ‘Uncategorized’ Category

An R Mumble

September 18, 2011

This weekend I attended beCamp, “Charlottesville’s version of the BarCamp unconference phenomenon”. Basically a tech meetup with talks, talking outside of talks, food, and drinks. All around, a good time. This was my first beCamp, but I enjoyed it, and will probably try to go to others in the future.

The way the camp goes, you show up Friday night and people stand up and say things they could talk about, if there’s interest. Then people put marks indicating whether they’d attend, and a schedule of talks for Saturday comes together. When I signed up for the camp a week beforehand, my intention was to spend some free time during the week preparing a talk about R. Didn’t happen, but I stood up Friday and said I could maybe say a few words about it anyway. I got a few ‘Learn’ checkmarks, indicating a little interest. Lucky for all involved, though, I didn’t get officially scheduled to talk. Of course, I didn’t know that until I showed up Saturday, having spent about 2 hours that morning trying to throw something together, just in case. I can’t say I was too disappointed with not having to talk, though it could have been fun. At lunch, there was about an hour of “lightning talks”, just 5-minute talks. While I was sitting there, I figured that would actually be a good amount for me. Just as the line of talkers for that was starting to wind down, and I was thinking about joining it, a handful more queued up. Those talks used all the remaining time, so I was, again, off the hook.

But, hey, I’ve got this venue, I can talk about stuff whenever I want, right? So here’s some notes about R I was just about prepared to mumble about Saturday.

According to the webpage, “R is a free software environment for statistical computing and graphics.” It’s something like an open source clone of the S software for stats processing. The language has some genuinely interesting aspects, and also some things that really get me sometimes. R has some good built-in “call for help” utilities, making it fairly easy to pick up and do interesting things quickly. Perhaps one of the best things is the huge collection of freely available libraries. I’ll try to talk about some or all of these things, showing by example and not being too formal (me, informal?), but hopefully still fairly close to correct (or, at least, usably incorrect). Folks wishing for more can certainly jump to the official introduction. John D. Cook has some good notes about picking up R if you know some programming (I’m assuming you do), and I also found “The R Inferno” [pdf] informative.

Right, so let’s look at some code. I always feel, with a language like R, you don’t start off with “Hello World”, but with calculations. Because it’s just so novel to have a calculator:

> 2+2
[1] 4
> exp(5)
[1] 148.4132
> (1i)^(1i)
[1] 0.2078796+0i
> exp(pi*(1i))
[1] -1+0i

Those lines with the arrow in front are the input lines, where you type, and the other lines are the output. Anyway, pretty exciting, no?

I should mention that you can reproduce this, or other examples, by running R from the command line, or using R-specific editors (or emacs add-ons). I’ve been using RStudio recently, and, while it’s got its issues, it’s got some nice features too. Worth a check-out.

Ok, so R does the normal hello world too, and has if-statements (imagine!) and for loops (how are you not using it already!?). The ‘:’ operator is pretty helpful for making ranges, as the example shows:

> print("hello world")
[1] "hello world"
> if (TRUE) { print("yep") } else { print("nope") }
[1] "yep"
> for (n in 1:10) { print(log(n)) }
[1] 0
[1] 0.6931472
[1] 1.098612
[1] 1.386294
[1] 1.609438
[1] 1.791759
[1] 1.94591
[1] 2.079442
[1] 2.197225
[1] 2.302585

Here’s an example of writing our own function. Pretty straightforward… note that ‘<-’ is the assignment operator (‘=’ would also work here, but I think ‘<-’ is more R-ey). There’s another assignment operator, ‘<<-’, which, I think, is a little bit of a bad habit to use. It’s along the lines of making the thing on the left a global variable, but I’d rather not get too much into that sort of thing (i.e., things I don’t understand at all, instead of things I can at least pretend a little about). I seem to recall reading somewhere that R uses lexical scoping rules. If you know what that means, good for you. I think it applies here, because if we didn’t assign to fact, then the call to fact at the end of the function would fail. Oh, and note you can return things in R with return(), but more typically results get returned by just having them be the last expression in the function (in my (limited) experience).

> fact <- function(n) {
+     if (n <= 1) {
+        return(1)
+     }
+     n*fact(n-1)
+ }
> fact(3)
[1] 6
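
Since I brought up lexical scoping and ‘<<-’, here’s a tiny made-up illustration (nothing to do with factorials, and the function names are mine): a function that builds a counter. The inner function finds count in the environment it was defined in, and ‘<<-’ updates it there rather than creating a new local variable.

> make.counter <- function() {
+     count <- 0
+     function() {
+         count <<- count + 1   # '<<-' assigns to the count in the enclosing environment
+         count
+     }
+ }
> counter <- make.counter()
> counter()
[1] 1
> counter()
[1] 2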

There’s a few ways to iterate over a bunch of things and apply a function to each object in turn (i.e., “map”). A simple way is with sapply. In the following example, we use sapply to get the factorial of the values 0 to 4, then plot the results with lines connecting the values. We add a title to the plot, and re-plot the points. Note that we pass 0:4 in to specify the x-coordinates of the values; R indexes from 1 by default (I don’t hold R personally responsible for this, since they’re trying to be S-compatible, but still). Anyway, example (run it yourself for the picture):

> first.facts <- sapply(0:4, fact)
> plot(0:4, first.facts, xlab="n", ylab="fact(n)", type="l")
> title("Factorial")
> points(0:4, first.facts)

Plotting can get just about as complicated and involved as you’d like. So now’s probably a good place to introduce R’s help system. If you want to find out more about a command, just use ‘?’:

> ?plot

There’s another help operator, ‘??’, I’ll use later. (I think I actually saw there’s a third, ‘???’, but I haven’t used it). Another cool thing about R is that you can look at the source for functions. Just type the function without the parentheses:

> title
function (main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
    line = NA, outer = FALSE, ...)
{
    main <- as.graphicsAnnot(main)
    sub <- as.graphicsAnnot(sub)
    xlab <- as.graphicsAnnot(xlab)
    ylab <- as.graphicsAnnot(ylab)
    .Internal(title(main, sub, xlab, ylab, line, outer, ...))
}
<environment: namespace:graphics>

Ok, only so informative, since it passes work off with that .Internal call, but the principle is there.

I want to do one more function example, because it shows sort of a fun thing you can do. Most functions allow you to define default values for arguments. R lets you define the value in terms of passed in values. When you call the function, you can pass in values by naming the arguments, so you don’t have to worry about order. And, when you do, you can actually cheat and use initial substrings of the argument names. An example is in order:

> feet.to.yards <- function(feet) {
+   feet/3
+ }
> yards.to.feet <- function(yards) {
+   yards*3
+ }
> feet.to.yards(3)
[1] 1
> yards.to.feet(8)
[1] 24
> convert <- function(feet=yards.to.feet(yards), yards=feet.to.yards(feet)) {
+ 	print(paste(feet, "feet is", yards, "yards"))
+ }
>
> convert(3)
[1] "3 feet is 1 yards"
> convert(yards=8)
[1] "24 feet is 8 yards"
> convert(y=5)
[1] "15 feet is 5 yards"

(that example is based on something I read in a blog post, but I can’t find the link… it was talking about the things I said in the last paragraph, and did an example of converting polar to cartesian coordinates, if memory serves…) (Update 20111004 – found it).

Anyway, I said R was for stats… we should do some of that.

> # random sample of size 100 from normal with mean 7, s.d. 3
> nums <- rnorm(100, 7, 3)
> # calculate sample standard deviation
> sd(nums)
[1] 2.655835
> # plot a histogram
> hist(nums)
> # have a quick look at nums
> summary(nums)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.6268  4.8390  6.8200  6.5500  8.4870 11.8500

One of the really fun things about R is “vectorized” operations (terminology taken from the R Inferno, mentioned above… could be standard-ish though, I dunno). In particular, the usual arithmetic operations are applied componentwise to vectors (and typing ‘2’ gives a vector of length one, for example). Shorter vectors are recycled. There’s lots of ways to make vectors: ‘:’ was used above, c() is easy, as is seq(). Anyway, here’s an example:

> (1:10)*c(2,3)
 [1]  2  6  6 12 10 18 14 24 18 30
> c(1*2, 2*3, 3*2, 4*3, 5*2, 6*3, 7*2, 8*3, 9*2, 10*3)
 [1]  2  6  6 12 10 18 14 24 18 30
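
Just to round out the vector-making bits mentioned above, here’s seq(), and recycling again with a different operator (nothing surprising, but worth seeing once):

> seq(0, 1, by=0.25)
[1] 0.00 0.25 0.50 0.75 1.00
> (1:10) + c(1, 100)   # the shorter vector gets recycled, as with '*' above
 [1]   2 102   4 104   6 106   8 108  10 110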

Fitting lines to data sounds like a statsy thing to do, and R is for stats, so let’s do that. Let’s take the nums we made above and use them as offsets from the line y=5x. Then we’ll fit a line to that data, and its slope should be pretty close to 5. Note that ‘.’ isn’t special in R like it is in many other languages, so it’s typically used to separate words in variable names (although, it can be used in formulas, see later). Sort of the equivalent to other languages’ ‘.’ is ‘$’, exemplified below.

> sloped.nums <- 5*(1:100) + nums
> sloped.line <- lsfit(1:100, sloped.nums)
> print(sloped.line$coefficients)
Intercept         X
 6.256159  5.005810

I’ll leave it for the curious to sort out why the intercept is 6ish. Of course, if you’re running this at home, you’ll get different numbers, owing to the random sampling. Anyway, we should probably plot the thing and make sure it seems ok:

> plot(1:100, sloped.nums)
> abline(sloped.line, col="RED")

Looks fine to me. Of course, it’d probably be cool to try something with actual data. If you’ve got a csv sitting around, R is quite good at reading them in. If you don’t have a csv sitting around, I found some census data, and it seems R will open csvs across the wire (which is kinda hot).

> census <- read.csv("http://www.census.gov/popest/national/files/NST_EST2009_ALLDATA.csv")

So… what’d we just get? Well, we could read about it, or just mess with it. We’ve already seen summary(), so we can try that. Below, I’ve only shown the top of the output.

> summary(census)
     SUMLEV      REGION    DIVISION      STATE               NAME
 Min.   :10.00   0: 1   5      : 9   Min.   : 0.00   Alabama   : 1
 1st Qu.:40.00   1:10   8      : 8   1st Qu.:12.00   Alaska    : 1
 Median :40.00   2:13   4      : 7   Median :27.00   Arizona   : 1
 Mean   :38.07   3:18   1      : 6   Mean   :27.18   Arkansas  : 1
 3rd Qu.:40.00   4:14   0      : 5   3rd Qu.:41.00   California: 1
 Max.   :40.00   X: 1   3      : 5   Max.   :72.00   Colorado  : 1
                        (Other):17                   (Other)   :51
 CENSUS2000POP       ESTIMATESBASE2000   POPESTIMATE2000
 Min.   :   493782   Min.   :   493783   Min.   :   493958
 1st Qu.:  1808344   1st Qu.:  1808344   1st Qu.:  1806962
 Median :  4301261   Median :  4302015   Median :  4328070
 Mean   : 14878497   Mean   : 14878639   Mean   : 14918075
 3rd Qu.:  8186453   3rd Qu.:  8186781   3rd Qu.:  8230161
 Max.   :281421906   Max.   :281424602   Max.   :282171957

Each of these headers is a name we can use to index into the census object. It’s technically a “data frame”, one of the types in R. That means it’s basically a (not-necessarily-numeric) matrix (as you might expect from a csv table), each column has the same number of entries, and within a column, all of the entries are the same type (no mixing strings with numbers). The names are the column names, and you can use the ‘$’ operator to get at any particular column.

> head(census$NAME)
[1] United States Northeast     Midwest       South         West
[6] Alabama
57 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming

The last line of output indicates that census$NAME is a “factor”, one of the types in R. Basically a list of values all taken from a set. I don’t want to say too much about it. While I’m at it, though, we might as well talk about indexing into R objects. It’s one of the cool things about R. Let’s grab that NAME column, and convert it to strings:

> # $NAME and [['NAME']] do the same thing
> states <- as.character(census[['NAME']])
> states[c(5, 18, 37)]
[1] "West"       "Idaho"      "New Mexico"
> states[-(c(1:10, 15:50))] # negative indices = remove those ones
 [1] "Colorado"                 "Connecticut"
 [3] "Delaware"                 "District of Columbia"
 [5] "Vermont"                  "Virginia"
 [7] "Washington"               "West Virginia"
 [9] "Wisconsin"                "Wyoming"
[11] "Puerto Rico Commonwealth"
> states[c(11:14,51:57)]
 [1] "Colorado"                 "Connecticut"
 [3] "Delaware"                 "District of Columbia"
 [5] "Vermont"                  "Virginia"
 [7] "Washington"               "West Virginia"
 [9] "Wisconsin"                "Wyoming"
[11] "Puerto Rico Commonwealth"
> which(substr(states, 1, 1) == "P") # substr is, apparently, "vectorized"
[1] 44 57
> states[which(substr(states, 1, 1) == "P")]
[1] "Pennsylvania"             "Puerto Rico Commonwealth"

Like I said, census is basically a table. You can pull out rectangular bits of it easily, as the example below shows. And it’s easy enough to generalize that a little, and start pulling out whatever bits you want. If you leave one of the coordinates in the ‘[]’ empty, it means “everything”. So census[,] is the same as census (for some definition of “same as”).

> census[1:6,1:5] # rows 1 to 6, columns 1:5
  SUMLEV REGION DIVISION STATE          NAME
1     10      0        0     0 United States
2     20      1        0     0     Northeast
3     20      2        0     0       Midwest
4     20      3        0     0         South
5     20      4        0     0          West
6     40      3        6     1       Alabama

Right, we’re supposed to be doing statsy things. We should be actually playing with the data… Let’s pick out some easy bit. Let’s play with the “POPESTIMATE2009”, “BIRTHS2009”, and “DEATHS2009” values for just the states.

> state.rows <- c(6:13, 15:56)
> some.data <- census[state.rows, c("POPESTIMATE2009", "BIRTHS2009", "DEATHS2009")]
> summary(some.data)
 POPESTIMATE2009      BIRTHS2009       DEATHS2009
 Min.   :  544270   Min.   :  6407   Min.   :  3140
 1st Qu.: 1802408   1st Qu.: 25748   1st Qu.: 14667
 Median : 4403094   Median : 58800   Median : 36536
 Mean   : 6128138   Mean   : 85094   Mean   : 49617
 3rd Qu.: 6647091   3rd Qu.:100143   3rd Qu.: 58657
 Max.   :36961664   Max.   :558912   Max.   :241733
> dim(some.data)
[1] 50  3
> colnames(some.data)
[1] "POPESTIMATE2009" "BIRTHS2009"      "DEATHS2009"
> rownames(some.data)
 [1] "6"  "7"  "8"  "9"  "10" "11" "12" "13" "15" "16" "17" "18" "19" "20" "21"
[16] "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36"
[31] "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51"
[46] "52" "53" "54" "55" "56"
> rownames(some.data) <- as.character(census$NAME[state.rows])
> head(some.data)
           POPESTIMATE2009 BIRTHS2009 DEATHS2009
Alabama            4708708      62265      47157
Alaska              698473      11307       3140
Arizona            6595778     103956      49657
Arkansas           2889450      40539      28003
California        36961664     558912     241733
Colorado           5024748      72537      31679

To get an idea how the variables relate to each other, you can plot each of them against each of the others quickly, with pairs():

> pairs(some.data)

We fit a line to data earlier, and can expand on that here. Let’s model the population estimate given the births and deaths. I’m not trying to display some dazzling insight into the data, just show how easy it is in R to do this sort of thing.

> l <- lm(POPESTIMATE2009 ~ ., data=some.data)
> print(l)

Call:
lm(formula = POPESTIMATE2009 ~ ., data = some.data)

Coefficients:
(Intercept)   BIRTHS2009   DEATHS2009
  -73663.83        42.52        52.07

> summary(l)

Call:
lm(formula = POPESTIMATE2009 ~ ., data = some.data)

Residuals:
     Min       1Q   Median       3Q      Max
-1074015  -205561   -12694   171717   997020

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -73663.831  70178.173   -1.05    0.299
BIRTHS2009      42.518      1.540   27.62   <2e-16 ***
DEATHS2009      52.075      3.099   16.80   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 342400 on 47 degrees of freedom
Multiple R-squared: 0.9976,	Adjusted R-squared: 0.9975
F-statistic:  9652 on 2 and 47 DF,  p-value: < 2.2e-16

That looks good and statsy.

If you don’t have your own data, by the way, R comes with some, and many libraries bring in their own, as examples. Just poking around a little (here’s to tab completion), I found, for example, that R comes with a data frame with state-by-state arrest counts by crime (USArrests). I mentioned earlier that '?' is good for getting help on a command. If you want to find a command to use, use '??'. This frequently returns lists of help topics, from various packages, that you can poke at. One of the great things about R is that many of the libraries you can install come with 'vignettes', providing a little tutorial on some of the tools in the package. R makes it very easy to install packages (install.packages()).
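
Just to give the flavor, here are the sorts of commands I have in mind (output omitted, and the package name is only an example, not a recommendation):

> data()                        # list the data sets that ship with R
> head(USArrests)               # peek at the built-in arrests data frame
> ??regression                  # search help pages for a topic
> install.packages("ggplot2")   # grab a package from CRAN
> library(ggplot2)              # load it
> vignette()                    # list vignettes available from installed packages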

I’m sort of running out of steam for the evening, so think I may wrap this up. I had sort of envisioned talking about a couple of fun packages. Guess I’ll save that for another post (this has gotten long anyway), and try to do some cool examples, with prettier pictures.

A homotopy limit description of the embedding functor

February 25, 2010

(Approximate notes for a talk I gave this afternoon.)

Setup

So according to the title, I should be telling you about {\text {Emb}(M,N)}, as a functor of manifolds {M} and {N}. That’s perhaps a bit ambitious. I’ll only be thinking about {M=\coprod _{m} D^{n}}, a disjoint union of closed disks, and I’ll actually fix {M}. And instead of letting {N} range over any manifolds, it’ll just be {V}, ranging over real vector spaces.

By taking derivatives at centers, we obtain a homotopy equivalence from {\text {Emb}(M,V)} to something I’ll maybe denote {\text {AffEmb}_{0}(\coprod _{m} \mathbb {R}^{n},V)}. This is componentwise affine (linear followed by translation) maps {\coprod _{m} \mathbb {R}^{n}\to V} whose restriction to {\epsilon }-balls around the 0s is an embedding. I may use {\underline{m}=\{1,\ldots ,m\}}, and write {\underline{m}\times \mathbb {R}^{n}}. And I’ll actually send everything to spectra, instead of topological spaces, via {\Sigma ^{\infty }}.

So really I’ll be talking about

\displaystyle  \Sigma ^{\infty }\text {AffEmb}_{0}(\underline{m}\times \mathbb {R}^{n},V)

as a functor of {V}. I’ll be lazy and write it {\Sigma ^{\infty }\text {Emb}(M,V)}, having fixed an {m} and {n} to give {M=\underline{m}\times D^{n}}.

Useful Cases

The first useful case is when {M=\underline{m}} (i.e., {n=0}). Then embeddings are just configuration spaces, {F(m,V)}. I’ve talked before about a homotopy limit model in this case, but let me remind you about it.

The category I have in mind is something I’ll denote {\mathcal{P} (m)}. Objects will be non-trivial partitions of {\underline{m}}, and I’ll probably denote them {\Lambda }, perhaps writing {\Lambda \vdash \underline{m}}. Non-trivial means that some equivalence class is larger than a singleton. I’ll write \Lambda\leq \Lambda' if {\Lambda } is finer than {\Lambda'}, meaning that whenever {x\equiv y\pmod{\Lambda }}, then {x\equiv y\pmod{\Lambda '}}.

The functor I want is something I’ll denote {\text {nlc}(-,V)} and call “non-locally constant” maps. So {\text {nlc}(\Lambda ,V)} is the set (space) of maps {f:\underline{m}\to V} such that there is an {x\equiv y\pmod{\Lambda }} where {f(x)\neq f(y)}. Equivalently, maps which don’t factor through {\underline{m}/\Lambda }.

Depending on which order you make your poset {\mathcal{P} (m)} go, {\text {nlc}(-,V)} is contravariant, and you can show

\displaystyle  \Sigma ^{\infty }F(m,V)\simeq \text {holim}_{\mathcal{P} (m)}\Sigma ^{\infty }\text {nlc}(-,V).

The second useful case is when {M=D^{n}} (i.e., {m=1}). Then the space of embeddings is homotopy equivalent to the space of injective linear maps. You can obtain a homotopy limit model in this case that looks strikingly similar to the previous case. Namely, you set up a category of “linear” partitions (equivalent to modding out by a linear subspace), and take the holim of the non-locally constant maps functor, as before.

I like to think of both cases as being holims over categories of kernels, and the non-locally constant maps of some kernel are maps that are known to fail to be not-injective for some particular reason. Embeddings fail to be not-injective for every reason.

But there’s another model I want to use in what follows. My category will be denoted {\mathcal{L} (n)}, and objects in the category will be vector spaces {E} with non-zero linear maps {f:E\to \mathbb {R}^{n}}. Morphisms from {f:E\to \mathbb {R}^{n}} to {f':E'\to \mathbb {R}^{n}} will be surjective linear maps {\alpha :E\to E'} with {f=f'\circ \alpha }. You might think of the objects as an abstract partition ({E}) with a map to {\mathbb {R}^{n}}, which then determines a partition of {\mathbb {R}^{n}}, by taking the image.

The functor out of this category is something I’ll still denote {\text {nlc}(-,V)}. On an object {f:E\to \mathbb {R}^{n}} it gives all non-constant affine maps {E\to V}. Arone has shown

\displaystyle  \Sigma ^{\infty }\text {Inj}(\mathbb {R}^{n},V)\simeq \text {holim}_{\mathcal{L} (n)}\Sigma ^{\infty }\text {nlc}(-,V).

Product Structure

The space of embeddings we are considering splits, as

\displaystyle  \text {Emb}(M,V)\simeq F(m,V)\times \text {Inj}(\mathbb {R}^{n},V)^{m}.

We know a homotopy limit model of each piece of the splitting, and might hope to combine them into a homotopy limit model for the product. This can, in fact, be done, using the following:

Lemma: If {\Sigma ^{\infty }X_{i}\simeq \text {holim}_{\mathcal{C}_{i}}\Sigma ^{\infty }F_{i}}, {i=1,\ldots ,k}, then

\displaystyle  \Sigma ^{\infty }\prod _{i} X_{i} \simeq \text {holim}_{*_{i}\mathcal{C}_{i}}\Sigma ^{\infty }*_{i} F_{i}.

Here {*} denotes the join. For categories, {C*D} is {(C_{+}\times D_{+})_{-}}, obtained by adding a new initial object to each of {C} and {D}, taking the product, and removing the initial object of the result.

Proof (Sketch): Consider the case {k=2}. The idea is to line up two squares (not reproduced here), both of which are homotopy pullbacks. The equivalence of the lower-right corners follows because join is similar enough to smash, which plays nicely with {\text {holim}}.

So, anyway, applying this lemma and perhaps cleaning things up with some homotopy equivalences, we obtain an equivalence

\displaystyle  \Sigma ^{\infty }\text {Emb}(M,V)\simeq \text {holim}_{\mathcal{P} (m)*\mathcal{L} (n)^{*m}}\Sigma ^{\infty }\text {nlc}(-,V).

Objects in the category consist of a partition {\Lambda \vdash \underline{m}} along with, for {i\in \underline{m}}, linear {f_{i}:E_{i}\to \mathbb {R}^{n}}. To tidy up a little bit, I’ll denote this category {\mathcal{J}=\mathcal{J}(M)}, for join. The functor takes an object as above and returns the set of componentwise affine maps {\coprod _{i} E_{i}\to V} such that either (a) the map is non-constant on some component, or (b) when restricted to the image of {0:\underline{m}\hookrightarrow \coprod E_{i}}, the map is non-locally constant with respect to {\Lambda }.

There you have it, a homotopy limit description for the embedding functor.

But not a particularly nice one. If we had an embedding {M\hookrightarrow M'}, then we’d have a map {\text {Emb}(M,V)\leftarrow \text {Emb}(M',V)}. It’d be really swell if this map were modelled by a map {\mathcal{J}(M)\to \mathcal{J}(M')} of the categories we are taking homotopy limits over. But that’s not going to happen. What can go wrong? Non-trivial partitions of {M}, when sensibly composed with the map to {M'}, may become trivial, and thus not part of the category. This is, essentially, because several components of {M} might map to a single component of {M'}. If {M} has two components, and {M'} one, say, where do you send the object consisting of the non-trivial {\Lambda \vdash \underline{2}} paired with some 0 vector spaces?

A More Natural Model

We sort of need to expand the category we take the homotopy limit over, and make it a more natural construction. We actually have an indication on how to do this from the discussion, above, in the case of linear injective maps from a single component. Perhaps we can find a proper notion of “abstract partition”, pair such a beast with a map to {M}, sensibly define non-locally constant, and get what we want. Let’s see how it goes…

An affine space is, loosely, a vector space that forgot where its 0 was. There is, up to isomorphism, one of any given dimension, just like for vector spaces; I’ll denote the one of dimension {d} by {A^{d}}, say. That should be enough of a description for now.

Let me define a Complete Affine Partition (CAP), {\rho }, to be a partition of a disjoint union of affine spaces, such that equivalence classes contain components. That is, everybody that’s in the same component is in the same equivalence class. Given a {\rho }, I’ll denote by {A(\rho )} the underlying component-wise affine space. The data that determines a {\rho } is: a finite set {s} (the set of components), a partition of {s}, and a dimension function, {d:s\to \mathbb {N}_{0}} (non-negative integers). With this information, {A(\rho )} is {\coprod _{i\in s}A^{d(i)}}.

By a refinement {\alpha } from {\rho } to {\rho'}, denoted {\alpha :\rho \to \rho'}, I will mean an affine map {\alpha :A(\rho )\to A(\rho')} so that the “affine closure” of the partition {\alpha (\rho )} is coarser than {\rho'}. I don’t want to spend too much time talking about the affine closure operation, on partitions of a component-wise affine space. If {\rho } and {\rho'} have a single component, a refinement is just a surjective affine map (recall before we had surjective linear maps in {\mathcal{L} (n)}). If {\rho } and {\rho'} have dimension function 0, so basically {\rho =\Lambda } and {\rho'=\Lambda'} (partition of possibly distinct finite sets), a refinement just means {\alpha (\Lambda ) \geq \Lambda'}.

We’re now ready to define a category, which I’ll denote {\mathcal{C}=\mathcal{C}(M)}. The objects will be pairs of: a CAP, {\rho }, along with a non-locally constant affine {f:A(\rho )\to M} (subsequently denoted {f:\rho \to M}). A morphism from {f:\rho \to M} to {f':\rho'\to M} will be a refinement {\alpha :\rho \to \rho'} such that {f=f'\circ \alpha }. This should look familiar from the {\mathcal{L} (n)} construction.

The functor I’ll consider still deserves to be called {\text {nlc}(-,V)}, and it takes {f:\rho \to M} to the set of non-locally constant affine maps {A(\rho )\to V}. We’d really like to be able to say

\displaystyle  \Sigma ^{\infty }\text {Emb}(M,V)\simeq \text {holim}_{\mathcal{C}(M)}\Sigma ^{\infty }\text {nlc}(-,V).

It seems sensible to try to do so by showing that

\text {holim}_{\mathcal{C}(M)}\Sigma ^{\infty }\text {nlc}(-,V)\simeq \text {holim}_{\mathcal{J}(M)}\Sigma ^{\infty }\text {nlc}(-,V),

since {\mathcal{J}(M)\subseteq \mathcal{C}(M)}, and we know the homotopy limit over {\mathcal{J}(M)} has the right homotopy type. This is our goal.

Semi-direct Product Structure

I’ll use the semi-direct product notation for the Grothendieck construction, as follows. Recall that for a category {D}, and a functor {F:D\to E}, the Grothendieck construction is a category, which I’ll denote {D\ltimes F}, whose objects are pairs {(d,x)} where {x\in F(d)}. Morphisms {(d,x)} to {(d',x')} are morphisms {h:d\to d'} such that {F(h)(x)=x'}. Of course, my functors are all contravariant as defined, so you have to mess about getting your arrows right. Best done in private.

I claim that {\mathcal{C}} can be written as a Grothendieck construction. Actually, it can in a few ways. The obvious way is to set {\mathcal{U}} to be the category of CAPs {\rho }, paired with refinements. The functor you need is then {\text {nlc}(-,M)}. You find that {\mathcal{C}=\mathcal{U}\ltimes \text {nlc}(-,M)}.

But there’s another way to slice it. Let {\mathcal{U}_{0}} be the category of CAPs {\rho }, along with functions {f_{0}:\rho \to \underline{m}}. Now the functor you need is not all non-locally constant maps to {M}, but only those that are lifts of {f_{0}}. You might denote this set {\text {nlc}_{f_{0}}(\rho ,M)}. I’m tired of all the notation, so let me let {\mu } denote this non-locally constant lifts functor. We have, then {\mathcal{C}=\mathcal{U}_{0}\ltimes \mu }.

While I’m simplifying notation, let me also write {\nu } for {\Sigma ^{\infty }\text {nlc}(-,V)}. Notice that it is actually a functor from {\mathcal{U}}, and thus from {\mathcal{U}_{0}}.

Let’s return to the category {\mathcal{J}(M)} again. It has the same structure. In fact, we just need to pick out of {\mathcal{U}_{0}} the subcategory of CAPs whose set of components is {\underline{m}}, and where {f_{0}} is the identity on {\underline{m}}. Calling this subcategory {\mathcal{R}}, we have {\mathcal{J}=\mathcal{R}\ltimes \mu }.

Summarizing all the notation, our goal is to show that

\displaystyle  \text {holim}_{\mathcal{U}_{0}\ltimes \mu }\nu \simeq \text {holim}_{\mathcal{R}\ltimes \mu }\nu .

The first thing I’d like to do is use twisted arrow categories to re-write things, so perhaps I should tell you about these categories first. If {D} is a category, the twisted arrow category, {{}^{a}D}, has objects the morphisms of {D}. Morphisms from {d_{1}\to d_{2}} to {d_{1}'\to d_{2}'} are commuting squares: a map {d_{1}'\to d_{1}} together with a map {d_{2}\to d_{2}'} so that the evident square commutes.

If {F} and {G} are contravariant functors from {D}, one can check that {(d_{1}\to d_{2})\mapsto \text {mor}(F(d_{2}),G(d_{1}))} is a covariant functor from {{}^{a}D}. I’ll denote it {G^{F}}. One can show that

\displaystyle  \text {holim}_{D\ltimes F}G\simeq \text {holim}_{{}^{a}D}G^{F}.

Using this, we’re hoping to show

\displaystyle  \text {holim}_{{}^{a}\mathcal{U}_{0}}\nu ^{\mu }\simeq \text {holim}_{{}^{a}\mathcal{R}}\nu ^{\mu }.

Proof Outline

We’ve got {{}^{a}\mathcal{U}_{0}\supseteq {}^{a}\mathcal{R}}. Between them lies a category I’ll denote {\mathcal{U}_{0}\to \mathcal{R}}, consisting of arrows {u\to r} with {u\in \mathcal{U}_{0}}, {r\in \mathcal{R}}. Morphisms are “twisted” commuting squares, as they should be, as part of the twisted arrow category. One can reduce the holim over {{}^{a}\mathcal{U}_{0}} to one over {\mathcal{U}_{0}\to \mathcal{R}}, and from there to one over {{}^{a}\mathcal{R}}.

To reduce from {\mathcal{U}_{0}\to \mathcal{R}} to {{}^{a}\mathcal{R}}, one can show that for all {u\to r\in \mathcal{U}_{0}\to \mathcal{R}}, the over-category {{}^{a}\mathcal{R}\downarrow (u\to r)} is contractible. In fact, this result seems to rely very little on our particular {\mathcal{R}} and {\mathcal{U}_{0}}, and doesn’t depend on the functors, {\mu }, {\nu }, or {\nu ^{\mu }}.

For the reduction from {{}^{a}\mathcal{U}_{0}} to {\mathcal{U}_{0}\to \mathcal{R}}, one shows that for all {u_{1}\to u_{2}\in {}^{a}\mathcal{U}_{0}}, we have

\displaystyle  \text {mor}(\mu (u_{2}),\nu (u_{1}))\stackrel{\sim }{\rightarrow } \text {holim}_{\substack {(u_1\to u_2)\to (u\to r)\\ \in (u_1\to u_2)\downarrow (\mathcal{U}_0\to \mathcal{R})}}\text {mor}(\mu (r),\nu (u)).

Essentially this shows that {\nu ^{\mu }}, as a functor from {{}^{a}\mathcal{U}_{0}}, is equivalent to the right Kan extension of its restriction to a functor from {\mathcal{U}_{0}\to \mathcal{R}}. And the homotopy limit of a right Kan extension is equivalent to the homotopy limit of the restricted functor.

It is in this second reduction, {{}^{a}\mathcal{U}_{0}} to {\mathcal{U}_{0}\to \mathcal{R}}, that we rely on information about our categories and functors ({\mu }, in particular). Pick your object {u_{1}\to u_{2}\in {}^{a}\mathcal{U}_{0}}. You can quickly reduce the crazy over-category above to just {u_{2}\downarrow \mathcal{R}}. Now remember {u_{2}} is a CAP with a function to {\underline{m}}. I’ll denote it {f_{0}:u_{2}\to \underline{m}}. If this function is locally constant (all objects within an equivalence class get sent to the same point), then you sort of replace {u_{2}} with an object obtained by taking affine and direct sums of its components. The result is an object of {\mathcal{R}}, but from the perspective of {\mu }, the two objects give equivalent spaces of lifts. Alternatively, if {f_{0}:u_{2}\to \underline{m}} is non-locally constant, then every lift {u_{2}\to M} is non-locally constant, and so {\mu (u_{2})\simeq *}.

This all works out to be useful in the whole proof. But I’ll maybe save all that for another day.

Carnival of Mathematics #60

December 4, 2009

Welcome to the Carnival of Mathematics! Finding that the 60th is apparently the “diamond anniversary,” I was reminded of the symmetry in the Buckyball C_{60}, which has the shape of a truncated icosahedron. You can make pretty nice ones using modular origami.

Before getting to this month’s links, allow me a diversion to talk about some geometry I learned a little of this month.

There are 6!=720 ways to order the letters A, B, E, I, L, and S. If we declare that two orderings are the same if one is obtained from the other by cyclic permutation (for example, ABEILS and ILSABE are the same), there are 6!/6=5!=120 combinations. If we also declare that a word and its reverse are the same (ABEILS = SLIEBA), we have arrived at 6!/(6*2)=60 combinations.

Pick any 6 distinct points on a circle (or any conic section). Choose any of the points as a starting point, and draw a line to any of the other points. Then draw a line to one of the remaining 4 points. Continue until all of the points have been hit, and then draw a line back to your starting point. How many different pictures can you make in this process? 60, again, because you could label the points A, B, E, I, L, S, and then pictures correspond to words from the previous calculation.

Each picture you draw is a figure with six edges. These six edges can be put into three sets of pairs, where two edges are paired if they are “opposite.” In the process of drawing the lines, above, the line opposite the very first line is the fourth line you draw. Similarly, the second and fifth form a pair, and then the third and sixth.

Now, if you extend all of the lines, each pair of opposite edges will determine a point of intersection (or infinity… maybe try another setup for your original points :)). So each picture you draw determines 3 points in the plane (or infinity). When he was only 16, Pascal showed that these three points are always collinear.

So, given 6 points on a conic, the process outlined above determines 60 lines, called Pascal Lines. Mathworld has more on Pascal Lines, for the inquisitive, so it’s probably about time to direct you over there and get on to this month’s blog posts!

In honor of 60 being both a colossally abundant number and a superior highly composite number, I thought it fitting to include as many links as divisors of 60. I ended up with slightly more links than that, so here are d(60)=12 (d counting the number of divisors) groups of links from the previous month, one group for each divisor:

1) At the beginning of the month, Charles Siegel, at Rigorous Trivialities decided to parallel the National Novel Writing Month (NaNoWriMo) by introducing Math Blog Writing Month, MaBloWriMo. After putting it to a vote, he wrote a series on intersection theory. Also taking up MaBloWriMo were Akhil Mathew at Delta Epsilons, Qiaochu Yuan at Annoying Precision, Harrison Brown at Portrait of the Mathematician and, well, yours truly. I found it to be a great experience, and hope next year brings many more authors. If you like your daily math in bite-size fashion, and not just in MaBloWriMo, you might check out probfact on twitter for daily probability facts.

2) At approximately halfway through the month, Wednesday the 18th was determined to be the 150th birthday of the Riemann Hypothesis. Plus Magazine and Math In The News both had articles.

3) Riemann’s zeta function, the lead character in his hypothesis, is connected to primes by Euler’s product formula. If you are interested in the distribution of the primes, Matt Springer at Built on Facts has a post about the function Li(x), as part of his running Sunday Function series. If natural number primes aren’t exciting enough for you, Rich Beveridge at Where The Arts Meet The Sciences has a post for you on Gaussian Primes.

4) It would hardly be a month of math posts without some puzzles:

If you prefer unsolved puzzles, Bill the Lizard has recently written posts about the Collatz Conjecture and the Perfect Cuboid Problem. Alternatively, for some behind-the-scenes on the notoriously difficult Putnam exam (and yet more puzzles), head over to Izabella’s post at The Accidental Mathematician.

5) It’ll take a while to get to the 3435th Carnival of Math, so I think I’m not stepping on too many toes if I point you at Mike Croucher’s quick post at Walking Randomly and Dan MacKinnon’s slightly longer post at mathrecreation that talk about what makes 3435 interesting.

6) Brian, at bit-player, finds some interesting math in a collection of staples, as described in The birth of the giant component.

10) Fëanor at JOST A MON presents Accumulated Causes and Unknowable Effects, related to Pascal’s Wager.

12) This month also saw some nice calculus posts. Daniel Colquitt at General Musings describes the fascinating trumpet of Torricelli. Kalid at BetterExplained asked Why Do We Need Limits and Infinitesimals? and had A Friendly Chat About Whether 0.999… = 1.

15) Kareem at The Twofold Gaze points out that asking for a Best Proximate Value has two reasonable answers.

20) Plus Magazine had an article entitled Pandora’s 3D Box, talking about a recently discovered fractal inspired by the Mandelbrot set.

30) Dave Richeson at Division By Zero reports on a case of mistaken identity in Legendre Who?

60) Finally, Samuel at ACME Science discusses the fractured state of the current mathematics community, noting that Mathematics Really is Discrete. This post was closely followed by Abstruse Goose’s Landscape.

That’s it for now. Look for the next Carnival, Math Teachers at Play, in two weeks!

I apologize for any omissions or errors.

Rational Approximations

November 3, 2009

Suppose x is an irrational in the interval (0,1), and fix a positive integer n. Let’s see what I can say about rational approximations to x whose denominator is no bigger than n.

First of all, x lies in one of the intervals [j/n,(j+1)/n], where 0\leq j<n. Since x is irrational, it won’t be an endpoint of such an interval, nor will it be the midpoint. Which is to say, it is closer to one end than the other. More explicitly, we can say that |x-j/n|<1/(2n) for some 0\leq j\leq n.

Instead of requiring the denominator of a rational approximation to actually be n, what if we just ask that it be no bigger than n? Then we’re asking how well rationals in the n-th Farey sequence, F_n, approximate x.

Instead of thinking about |x-p/q|, let’s multiply everything by q, and so look at |qx-p| (we’ll divide by that q again soon enough). So now we’re not asking about rational approximations, as such, but asking about pairs of integers, p and q with 0\leq p\leq q\leq n, so that qx is “pretty close” to the integer p.

Let’s re-phrase once more. Consider all of the multiples x,2x,3x,\ldots,nx of x. All of these are irrationals, but could now be bigger than 1 (our original x was less than 1). I don’t want to think about things bigger than 1, so let me trim all of these multiples qx down to just the “fractional part”. Which is to say, I’ll now think about all of the values qx-\lfloor qx\rfloor (the “floor” notation picks out the “greatest integer less than”), which are all irrationals in the interval (0,1). Now for each q I have an integer \lfloor qx\rfloor, and let me denote this integer by p_q. You might note that p_q is always less than n.

So now I am thinking about the n irrational values qx-p_q in the interval (0,1). I’ll think about them in comparison to the rational points 0,1/n,2/n,\ldots,1, like I did before. Following Rademacher (as most of my content has been and will be), we distinguish two cases.

  1. Suppose some qx-p_q falls in the interval (0,1/n). Then |qx-p_q|< 1/n, and so |x-p_q/q|<1/(qn)\leq 1/q^2 (since q\leq n).
  2. If (0,1/n) is free of our points, then by the pigeonhole principle, some interval (j/n,(j+1)/n) contains two values, say ax-p_a and bx-p_b (arrange notation so that a<b, forcing also p_a<p_b). Then |(bx-p_b)-(ax-p_a)|< 1/n, which we re-arrange to say that |(b-a)x-(p_b-p_a)|< 1/n. Letting q=b-a and p=p_b-p_a, we have |qx-p|< 1/n, which we again re-write, as in case (1), to say that |x-p/q|< 1/q^2.

We have, therefore, improved the “order” of our approximation. To begin with, we said that rational approximations could be found with |x-j/n|< 1/(2n). I think of this as an “order 1” approximation, since the denominator of the error is a linear function of the denominator of the rational approximation. We have now improved this so that |x-p/q|< 1/q^2, which I would therefore call an “order 2” approximation. If you are a little more careful with your inequalities, you can improve the bound to |x-p/q|< 1/(2q^2) (a bound that I think I’ll probably end up mentioning again tomorrow), basically for the same reason there is a coefficient of 2 in the order 1 approximation, I guess.
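
As a quick sanity check (my numbers, rounded): take x=\pi-3=0.14159\ldots and the approximation 1/7=0.14286\ldots. Then |x-1/7|\approx 0.00126, comfortably less than 1/(2\cdot 7^2)\approx 0.0102, so 1/7 is an order 2 approximation of \pi-3.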

I should probably be a little (a lot) more careful in what I am saying with the above. I chose an irrational x and an integer n, and said that an “order 2” rational approximation could be found. The stronger claim is that, in fact, infinitely many order 2 rational approximations can be found, if you let n increase. Suppose your first order 2 approximation gives you |x-p/q|< 1/q^2, when you have bounded the denominator by n. Find a new N so that 1/N< 1/q^2. Then go through the process above, using N as the new bound on the denominator. When you find |x-p'/q'|< 1/(q'N)\leq 1/(q')^2 with that process, then you’ve found a new rational approximation (since it is closer to x than p/q was) that is still order 2 (since that’s what the process does for you).

So, can order 2 be improved on? Can you find order 3, or higher? It turns out, “no”, in general (and the golden ratio is an example). You can improve the coefficient of 2 in the order 2 approximation to \sqrt{5}, but that’s apparently the best you can do, in general. There’s more that can be said along these lines. The Roth-Liouville irrationality measure might be fun to talk about (it earned Roth a Fields medal, so it must be ok), as would continued fractions. I’m not sure how much these are related to my stated goal of understanding the link between Farey sequences and the Riemann hypothesis, so for now, perhaps I’ll leave them alone.

For some reason, I thought it might be fun to make a visualization about rational approximations using F_5. Here is a picture of the rationals in F_5, as a subset of [0,1]:

\begin{picture}(200,10) \put(0,0){\line(1,0){200}}\multiput(0,0)(40,0){6}{\circle*{3}}\multiput(50,0)(50,0){3}{\circle*{3}}\multiput(67,0)(67,0){2}{\circle*{3}}\end{picture}

Here’s a graph of the function taking x to “the rational in F_5 closest to x“:

\begin{picture}(200,200) \put(0,0){\line(1,0){200}}\put(0,0){\line(0,1){200}}\multiput(0,0)(40,0){6}{\circle*{3}}\multiput(50,0)(50,0){3}{\circle*{3}}\multiput(67,0)(67,0){2}{\circle*{3}}\put(20,40){\line(1,0){25}}\put(45,50){\line(1,0){13}}\put(58,67){\line(1,0){16}}\put(74,80){\line(1,0){16}}\put(90,100){\line(1,0){20}}\put(110,120){\line(1,0){16}}\put(126,133){\line(1,0){16}}\put(142,150){\line(1,0){13}}\put(155,160){\line(1,0){25}}\put(180,200){\line(1,0){20}}\end{picture}

Homotopy Limits and Arrow Categories

September 13, 2009

Ouch, no posts in 2 months? Shame on me. Let’s suppose I’ve been busy.

Today, and other scattered times in the past, I’ve been trying to come to grips with a proof concerning homotopy limits. Suppose \mathscr{C}'\subseteq \mathscr{C} and that |\mathscr{C}'\downarrow c|\simeq * for every c\in \mathscr{C}. Suppose that F is a (covariant) functor from \mathscr{C} to, say, spaces. Then the lemma I’ve been thinking about is that \text{holim}_{\mathscr{C}'}F\simeq \text{holim}_{\mathscr{C}}F, and in fact the natural map from right to left gives the equivalence.

I’ve been studying Bousfield and Kan‘s proof a little, and trying to make it look like something I can think about. Their F goes to simplicial sets, or so, and they have some assumption about it taking values in the collection of fibrant objects. I’m going to ignore that, because in Top all spaces are fibrant (right?), and because ignoring hypotheses can never lead you astray (right?).

So anyway, I thought I’d share something like an outline of a proof. I’ll write = where there probably should be \simeq instead. And that likely won’t be the worst of my transgressions.

Consider the bi-complex A_{i,j} (I guess it’s a bi-cosimplicial space, or so) where

\displaystyle A_{i,j}=\prod_{\substack{c_0'\leftarrow\cdots\leftarrow c_i'\\c_j\leftarrow c_0'\\c_0\leftarrow \cdots\leftarrow c_j}} F(c_0),

where the c_k represent objects of \mathscr{C}, the c_k' come from \mathscr{C}', and arrows lie where they should (in particular, between primed elements, the arrows are in \mathscr{C}'). I’m going to get tired of that indexing, so I’ll let \underline{c_j} represent a chain of the form c_0\leftarrow \cdots \leftarrow c_j, and similarly \underline{c_i'} is c_0'\leftarrow\cdots\leftarrow c_i'. In fact, I’m not going to put the subscripts, as the subscript on a \underline{c_j} will always be j, and similarly i for the primes. If I write \underline{c}\leftarrow \underline{c'}, I mean that c_j\leftarrow c_0'. So I can write

\displaystyle A_{i,j}=\prod_{\underline{c}\leftarrow \underline{c'}}F(c_0),

letting the ambiguation begin (or was that earlier?).

Next, let’s start thinking about what some homotopy limits are. I should probably be calling things Tot’s, for totalization, but my understanding is that they’re basically holim. Finally, “\holim” is not a built-in latex command. For convenience, I’m going to just write “lim”.

\lim means \text{holim} below.

Ok, so, fix a j. In what follows, Map(X,Y) denotes the mapping space, and N_n(\mathscr{D}) represents the n-th layer in the simplicial nerve of a category \mathscr{D}. The geometric realization of \mathscr{D} is denoted |\mathscr{D}|. Compute:

\begin{array}{rl} \lim_i A_{i,j} &= \displaystyle \lim_i \prod_{\underline{c}\leftarrow \underline{c'}} F(c_0) \\ &= \displaystyle \lim_i \prod_{\underline{c}}\prod_{c_j\leftarrow \underline{c'}} F(c_0) \\ &= \displaystyle \prod_{\underline{c}} \lim_i \prod_{c_j\leftarrow \underline{c'}}F(c_0) \\ &= \displaystyle \prod_{\underline{c}} \lim_i Map(N_i(\mathscr{C}'\downarrow c_j),F(c_0)) \\ &= \displaystyle \prod_{\underline{c}} Map(|\mathscr{C}'\downarrow c_j|,F(c_0))\end{array}

Finally, our assumption that these arrow categories are contractible lets us write this as simply \prod_{\underline{c}} F(c_0). Which means that

\displaystyle \lim_{\mathscr{C}}F=\lim_j \prod_{\underline{c}} F(c_0) = \lim_j \lim_i A_{i,j}.

That’s nice. Let’s do the homotopy limit the other way, fixing i to begin.

\begin{array}{rl} \lim_j A_{i,j} &= \displaystyle \lim_j \prod_{\underline{c}\leftarrow \underline{c'}}F(c_0) \\ &= \displaystyle \lim_j \prod_{\underline{c'}}\prod_{\underline{c}\leftarrow c_0'} F(c_0) \\ &= \displaystyle \prod_{\underline{c'}} \lim_j \prod_{\underline{c}\leftarrow c_0'} F(c_0) \end{array}

Now \lim_{\mathscr{C}'}F=\lim_i \prod_{\underline{c'}} F(c_0'), which almost looks close to what we’ve got above. To see that they are equivalent, notice that the arrow category c_0'\downarrow \mathscr{C} has the initial object c_0'\to c_0'. This means that

\displaystyle F(c_0')=\lim_{c_0'\downarrow \mathscr{C}}F=\lim_j \prod_{\underline{c}\leftarrow c_0'}F(c_0).

We conclude that

\displaystyle \lim_{\mathscr{C}'} F=\lim_i \lim_j A_{i,j}.

Finally, apply Fubini’s theorem for homotopy limits, and conclude that

\displaystyle \lim_{\mathscr{C}}F=\lim_{\mathscr{C}'}F

Now, to translate to the situation when my categories are internal categories in Top…

Hardy and Wright, Chapter 10

May 29, 2009

We decided to split the reading of chapter 10 into two weeks (chapter 9 here, in case you missed it). It’s a longish chapter, I really like continued fractions (though I’m not particularly sure why; they’re just fun), and some of the other readers thought it might be worth spending more time reading it carefully.

Our first meeting covered the first few sections, which only involved basic definitions, and the theorem that every real number has an essentially unique continued fraction expansion, and the expansion is finite if and only if the number is rational. Eric stated that he was unimpressed so far, and didn’t see what I was so fascinated by. None of us seemed to have any questions about the reading, so I gave a glimpse of things to come (relation of periodic continued fractions to quadratics, and rational approximations). I also mentioned that there are some interesting tie-ins to the “modular group” (SL_2(\mathbf{Z})), Farey sequences, and Ford circles (which have come up before). Eric has been reading about hypergeometric series, and said there are some interesting formulas there related to continued fractions. He also asked if there was some relation to surreal numbers, because continued fractions approximate numbers from the left and right, alternately.

We picked up, the second week, in section 10.10 “A lemma”, defining an equivalence relation on reals. The relation works out to be that two numbers are equivalent if the tails of their continued fractions are the same. Chris corrected a misinterpretation Eric brought up, about canonical representatives of equivalence classes. I had wondered if the equivalence meant that, in terms of periodic continued fractions representing “quadratic” numbers, two numbers (a_1+\sqrt{b})/d_1 and (a_2+\sqrt{b})/d_2 would always be equivalent. In fact, I thought I had decided they were. But an example in the book shows that this is not the case (\sqrt{5}=[2,\dot{4}] while (\sqrt{5}+1)/2=[\dot{1}], dots representing the repeating part). Eric pointed out that two points are related if they are in the same orbit of the modular group acting on \mathbb{R} (as a subset of \mathbb{C}) by linear fractional transformations.

We spent a little while talking about periodic continued fractions, how the two directions of the proof that they are equivalent to “quadratics” go. I think the proof that any quadratic has a periodic continued fraction is fascinating. It gives no indication how long the period will be, or when it will start.

Next I mentioned that there’s a convenient algorithm for finding the continued fraction for a “quadratic surd”, and that I intend to post some python code here implementing it (and other fun functions for playing with continued fractions). While it’s essentially the normal algorithm, taking floors and then reciprocals, there’s some convenience in having quadratics around, because you can “rationalize the numerator” and that sort of thing. Not mentioned in the text, but stated at both Wikipedia and Mathworld (links below), is that Lagrange showed that the continued fraction for \sqrt{D} has a period smaller than 2D, that the period begins after a single non-repeating term, and that the last term in the period is twice a_0 (the first term of the continued fraction). All of these things are true of the examples given in the text. And, while finding links… holy crap! the repeating part, besides the last numeral, is palindromic! Is there no end to the fascination!?
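
Until I get around to that post, here’s a rough sketch (in R rather than Python, and just for \sqrt{D} with D not a perfect square), using the standard m, d, a recurrence; the function name and defaults are mine, so treat it as illustrative only:

> sqrt.cf <- function(D, terms=10) {
+     a0 <- floor(sqrt(D))
+     m <- 0; d <- 1; a <- a0
+     cf <- a0
+     for (i in 1:terms) {
+         m <- d*a - m
+         d <- (D - m^2)/d      # d always divides D - m^2, so this stays an integer
+         a <- floor((a0 + m)/d)
+         cf <- c(cf, a)
+     }
+     cf
+ }
> sqrt.cf(5, 4)   # matches \sqrt{5}=[2,\dot{4}] from above
[1] 2 4 4 4 4
> sqrt.cf(7, 8)   # period 1,1,1,4: last term is 2*a_0, and the rest is a palindrome
[1] 2 1 1 1 4 1 1 1 4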

I’ll go ahead and just direct you to the Wikipedia page on (periodic) continued fractions, and similarly the Mathworld pages on (periodic) continued fractions. All (and undoubtedly many others) make for fascinating reading.

Our next main focus was on approximation by convergents. Chris pointed out how remarkable the final theorem is, that any time a fraction is sufficiently close to a number (in terms of its denominator), it is automatically a convergent. I mentioned one thing I read about in Rademacher’s “Higher Mathematics from an Elementary Point of View” (which I love), which was that the existence of infinitely many p/q such that |p/q-x|<\frac{1}{2q^2} (corollary of theorem 183) can be interpreted as saying that a vertical line at x passes through infinitely many Ford circles.

I then tried to explain the difference between Theorems 181 and 182, and point out that there are two reasonable definitions of “closest rational approximation”. I had read about these in Khinchin’s “Continued Fractions” (which I also love). I bumbled it a bit, but believe I was saying true things throughout. Basically, the story goes… convergents are best rational approximations in the stronger sense (thm 182), and mediants of successive convergents are best rational approximations in the weaker sense (thm 181). In fact, choose an irrational (for convenience) x, and let \square denote the operation “mediant”. For any n, define m_{n,1}=(p_n/q_n)\square (p_{n+1}/q_{n+1}), and then iteratively m_{n,k}=m_{n,k-1}\square (p_{n+1}/q_{n+1}). The last of these mediants that is on the same side of x as p_n/q_n will be p_{n+2}/q_{n+2}. Continued fractions rock.

It’s really best to think about these lemmas with an actual continued fraction example. I, personally, used 61/45=[1;2,1,4,3], and looked at the mediants m_{1,k}, between the first and second convergent.
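
To make that concrete, here’s my own quick check in R. The convergents of 61/45 are 1/1, 3/2, 4/3, 19/14, 61/45; taking p_1/q_1=3/2 and p_2/q_2=4/3, the mediants m_{1,k} work out to (3+4k)/(2+3k):

> x <- 61/45
> k <- 1:5
> num <- 3 + 4*k; den <- 2 + 3*k   # mediant of 3/2 with 4/3, taken k times
> data.frame(mediant=paste(num, den, sep="/"), above.x=(num/den > x))
  mediant above.x
1     7/5    TRUE
2    11/8    TRUE
3   15/11    TRUE
4   19/14    TRUE
5   23/17   FALSE

Since 3/2 lies above x, the last mediant still on the same side as 3/2 is 19/14, which is exactly the next convergent after 4/3, as the theorem promises.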

We finished with me trying to explain something that I thought was quite surprising. Let f_k(x)=n_k(x)/k denote the closest rational to x, with denominator k (let’s not require the fraction to be in reduced terms). I was quite surprised, honestly (and convinced Eric he should be too), that for a chosen x, the sequence of such rationals will not be successively better approximations to x. Having had the chance to go through an example with Eric, and then a few hours to mull it over, I’ve since realized this is not particularly surprising at all. Suppose x lies in [1/4,1/3]. Half of these x will be better approximated by 1/3 than by 1/4.

So, anyway, I guess that’s all I have to say about continued fractions by now. Perhaps Eric will show us sometime about fun relationships between hypergeometric series and continued fractions. If you haven’t already stopped reading this post to go find all sorts of other interesting things to read about continued fractions, either online or in Rademacher’s or Khinchin’s books, you can now.

Hardy and Wright, Chapter 9

May 14, 2009

Continuing on, we talked about “The Representation of Numbers by Decimals” this week. I thought the first few sections were fun, in the precision and care used to prove things like “Rational numbers have repeating decimals, and vice versa”.

Chris and I both, apparently, took a minute to digest the example that 29310478561 is divisible by 7, at the end of section 9.5. I feel like in my first year of undergrad I learned about this, or perhaps a similar, test for division by 7. I thought that instead of taking the sum of digits (like you would for mod 3 or mod 9), or alternating sum (for mod 11), you could take some sort of weighted sum of the digits, where the weights depended on 7^n\pmod{10} for various n. Looking around online while writing this post, it seems I was (mis-)remembering the method of Pascal, listed on this page at Mathworld. This Mathworld page mentions several other tests, which look interesting. My search turned up another as well, over at God Plays Dice.
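
For what it’s worth, here’s a quick R sketch of one such weighted-digit test, using weights 10^n\pmod{7} (which may or may not be exactly the method I was half-remembering), applied to the book’s example:

> digits <- rev(as.integer(strsplit("29310478561", "")[[1]]))        # least significant digit first
> weights <- rep(c(1, 3, 2, 6, 4, 5), length.out=length(digits))     # 10^n mod 7 repeats 1,3,2,6,4,5
> sum(digits * weights) %% 7    # 0, so the original number is divisible by 7
[1] 0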

Eric mentioned that he’s read a lot about the game of Nim (section 9.8) because Conway writes about it in On Numbers and Games, and Eric likes surreal numbers.

I like the section on “Integers with missing digits” (section 9.9), because I like to show my calculus students that the sum of the reciprocals of such numbers is a convergent series (even though, writing out the first several terms, it looks like you haven’t thrown out many terms from the harmonic series). This is known as the Kempner Series, which I first learned about in the book Gamma, by Havil.

We concluded with a little discussion on normal numbers, the last few sections of the chapter. It seems we all ran out of steam while reading this chapter, so we didn’t get through the proofs in these sections. But the ideas are interesting, and the results are fun. According to Wikipedia, there is a conjecture that all irrational algebraic numbers are normal, even though there is no known proof that any particular irrational algebraic number is normal. I remain a little confused about the definition of normal, actually. “We say that x is normal in the scale of r if all of the numbers x,rx,r^2x,\ldots are simply normal in all of the scales r,r^2,r^3,\ldots“. I think the idea with multiplying by these powers and changing the scale is that you are looking at longer and longer digit sequences, instead of just single digits (which would be simply normal). I was a little unclear about why you need to multiply x by all of those powers of r, but I guess if you don’t then you won’t ever get any digits (in the scale r^n) greater than r^{n-1}, perhaps?

Update 20090515: I completely forgot to mention another thing we talked about, and I had promised the group I’d link to. Section 9.9, on “Integers with missing digits” begins with the line “There is a familiar paradox” and a footnote that reads “Relevant in controversies about telephone directories”. I wasn’t exactly sure what this controversy was, but we decided probably it was the fact that the probability of picking a random number out of the telephone book, and having it not contain a 9 (say) is fairly small. At first, I had thought maybe the controversy the footnote was hinting at might be related to Benford’s Law, which I also remembered was just recently in the news (slashdot).

A Sibling!

May 13, 2009

Ok, ok, so this isn’t a math post. But my parent has formed a new kid, tasked with learning Python/Sage. I’m a big brother now!

I’m in the Carnival!

May 8, 2009

My post on a Riemann sum made the 52nd Carnival of Mathematics!

Hardy and Wright, Chapter 8 (Second Part)

May 8, 2009

Picking up where we left off last week, we finished chapter 8 today. Most of the time was spent trying to trace through the proofs of the various statements, so I won’t go into too much detail here about that. Many of the proofs had the same flavor, cleverly grouping terms in a polynomial, or setting corresponding coefficients equal in two different representations of a polynomial.

When I had first read the chapter, I didn’t pay too close attention to some of the later sections, for example the section on Leudesdorf’s Theorem (generalizing Wolstenholme’s), and the “Further consequences of Bauer’s Theorem“. However, during our meeting we worked through most of Leudesdorf’s theorem, and we were able to gain some appreciation for the various cases (specifically, why they arise).

One of the theorems in the sections we kinda glossed over was the following (Theorem 131 in the book): If p is prime, and 2v<p-3, then the numerator of S_{2v+1}=1+\frac{1}{2^{2v+1}}+\cdots+\frac{1}{(p-1)^{2v+1}} is divisible by p^2. I noted that this S_{2v+1} is a partial sum for \zeta(2v+1)=\sum_{n=1}^{\infty} n^{-(2v+1)} (Wikipedia, Mathworld). Eric wondered if perhaps they were thinking about this sum as a generalization of this \zeta function to some finite field, but the modulus of p^2 didn’t fit that entirely. Eric also reminded us that closed forms for \zeta(2v) can be found, while closed forms for \zeta(2v+1) are not known.

