## A Formula for Primes

I wanted to write this post yesterday, but I couldn’t work out some of the steps that I wanted to be able to work out, so I had to wait a day and try again. But I think I’ve got it now, let’s see…

We’ve been working with the function

$\displaystyle J(x)=\sum_{p^n\leq x}\frac{1}{n}.$

Last time, we arrived (skipping over some rather large gaps) at an analytic formula for $J(x)$, namely:

$\begin{array}{l}\displaystyle J(x)=Li(x)-\sum_{\text{Im }\rho>0}\left(Li(x^{\rho})+Li(x^{1-\rho})\right)\\ \displaystyle\qquad\qquad +\int_x^{\infty}\frac{dt}{t(t^2-1)\ln t}-\ln 2,\end{array}$

where (mod issues)

$\displaystyle Li(x)=\int_0^x \frac{dt}{\ln t}.$

The goal today is to use this formula for $J(x)$ to find a formula for primes. In particular, we will find a formula for $\pi(x)$, which is the number of primes no larger than $x$. These two functions are related, as follows: recall that $J(x)$ is a step function which jumps by $1/n$ at every prime power $p^n$. Pick an $x_0$. How many jumps of size 1 does $J(x)$ make before getting to $x_0$? Well, it jumps by 1 at the primes, so it makes $\pi(x_0)$ jumps of size 1. How many jumps of size 1/2? These are jumps at prime squares, $p^2\leq x_0$. This inequality is the same as $p\leq x_0^{1/2}$, so we make $\pi(x_0^{1/2})$ jumps of size 1/2. Similarly, we’ll make $\pi(x_0^{1/n})$ jumps of size $1/n$ to calculate $J(x_0)$. Eventually (for some large enough $N$) $x_0^{1/N}$ will be less than 2, and so $\pi(x_0^{1/N})=0$, and we can stop looking for jumps.

Translating the above into symbols, we have just decided that

$J(x)=\pi(x)+\frac{1}{2}\pi(x^{1/2})+\frac{1}{3}\pi(x^{1/3})+\cdots+\frac{1}{n}\pi(x^{1/n})+\cdots.$

I’ve written it as an infinite sum, but eventually (like I said) those $\pi$ values will be 0, so we’ve really got a finite sum.

Now wait, our goal was a formula for $\pi(x)$, in terms of $J(x)$, and I’ve got a formula the other way, $J(x)$ in terms of $\pi(x)$.

This is the part I got stuck on yesterday. Edwards’ book says (or perhaps doesn’t, causing my problems) that basically you do Möbius inversion to the sum above, and obtain a formula for $\pi(x)$ in terms of $J(x)$. He says slightly more than that, but I wasn’t able to piece it together as easily as he seemed to indicate it should work. But anyway, what’s this Möbius inversion thing? I’ll take a pretty low-level account, leaving lots of room for things to be jazzed up (which I may return to and do sometime soonish).

An integer is said to be “squarefree” if it factors into the product of distinct primes or, equivalently, it is not divisible by a square. Define the following function, called the Möbius function, on positive integers ($p_i$ is supposed to indicate a prime):

$\displaystyle \mu(n) = \begin{cases}1& n=1 \\ 0 & n\text{ not squarefree} \\ (-1)^k & n=p_1\cdots p_k\text{ squarefree}\end{cases}$

So, for example, $\mu$ is -1 at any prime, $\mu(6)=\mu(10)=1$, and $\mu(8)=\mu(12)=0$.

Möbius inversion is then the following:

If $f(n)=\sum_{d|n}g(d)$ then $g(d)=\sum_{d|n}\mu(d)f(n/d)$, and conversely.

Well, I tried to figure out how my expression for $J(x)$, in terms of $\pi(x^{1/n})$s was such an $f$ of $g$. I don’t see it, as stated. Perhaps somebody can point me in the right direction in the comments.

The observation you are supposed to make comes when you consider something like $\frac{1}{2}J(x^{1/2})$. Using our formula above, we can write

$\frac{1}{2}J(x^{1/2})=\frac{1}{2}\pi(x^{1/2})+\frac{1}{4}\pi(x^{1/4})+\frac{1}{6}\pi(x^{1/6})+\cdots$

or, more generally, we have

$\displaystyle \frac{1}{n}J(x^{1/n})=\sum_k \frac{1}{nk}\pi(x^{1/nk}).$

If we subtract this equation (with $n=2$) from the equation for $J(x)$ in terms of $\frac{1}{n}\pi(x^{1/n})$s, we can eliminate one term from the right-hand side. The idea is to iterate this process. I think a (reasonably, by my blog’s standards) clean way to write what is happening is as follows (it helps when you know the answer you are supposed to get :)):

Consider the sum

$\displaystyle \sum_n \frac{\mu(n)}{n}J(x^{1/n}),$

which by our most recent formula is the same as

$\displaystyle \sum_{n,k}\frac{\mu(n)}{nk}\pi(x^{1/nk}).$

Now, fix a positive integer $m$. The coefficient of $\pi(x^{1/m})$ in this last double sum is

$\displaystyle \sum_{nk=m}\frac{\mu(n)}{m}=\frac{1}{m}\sum_{n|m}\mu(n).$

Almost there… what is $\sum_{n|m}\mu(n)$? Turns out, it is 1 if $m=1$ (which is clear), and 0 otherwise. Indeed, suppose $m$ has prime factorization $p_1^{e_1}\cdots p_k^{e_k}$. Then all of the divisors, $n$ in the sum we’re after, are of the form $p_1^{f_1}\cdots p_k^{f_k}$ where each $0\leq f_i\leq e_i$. The only divisors, $n$, where $\mu(n)\neq 0$ are those where the $f_i$ are all either 0 or 1. So the worthwhile $n$ could be 1, or some $p_i$, or a $p_ip_j$ ($i\neq j$), or $p_ip_jp_{\ell}$ ($i\neq j$, $i\neq \ell$, $j\neq \ell$), etc. So

$\begin{array}{l}\displaystyle \sum_{n|m}\mu(n)=1+\sum_i \mu(p_i)+\sum_{i,j} \mu(p_ip_j)+\cdots+\mu(p_1\cdots p_k) \\ \qquad \qquad \displaystyle =1-k+\binom{k}{2}-\binom{k}{3}+\cdots+(-1)^k=(1-1)^k=0.\end{array}$

(By the way, this is (almost verbatim) the proof in Hardy and Wright’s “An Introduction to the Theory of Numbers”, and I’ll count that as one of my references for anything I say about $\mu$).

So, let’s gather back up here. We’ve now shown that in the sum

$\displaystyle \sum_n \frac{\mu(n)}{n}J(x^{1/n})=\sum_{n,k}\frac{\mu(n)}{nk}\pi(x^{1/nk})$

the coefficient of $\pi(x^{1/m})$ is 0 unless $m=1$, when the coefficient is 1. Therefore, we have obtained out formula for $\pi(x)$:

$\begin{array}{l} \pi(x)=\displaystyle \sum_n \frac{\mu(n)}{n}J(x^{1/n})\\ \qquad \displaystyle=J(x)-\frac{1}{2}J(x^{1/2})-\frac{1}{3}J(x^{1/3})-\frac{1}{5}J(x^{1/5})+\frac{1}{6}J(x^{1/6})-\cdots\end{array}$

I guess the point of Riemann’s paper was that the terrible integral formula he obtained for $J(x)$ could then be used in this expression for $\pi(x)$, to give an exact analytic expression (that is, one only an analyst would find pretty) for $\pi(x)$. Pretty wild, in my mind.

It seems that if you get some feeling for the terms in $J(x)$, you can decide that the “dominant” term is the $Li(x)$ term. And then that still ends up as the dominant term in the expression above for $\pi(x)$. And so you get to say

$\pi(x)\sim Li(x).$

So you’ve got that going for you, which is nice.