1. ## The Basel Problem

Hi

I have been trying to learn and figure out the elementary rigorous proof posted here:

Basel problem - Wikipedia, the free encyclopedia
under "A rigorous proof"

I know there are...simpler proofs, but because of the elementary knowledge needed to understand this, I'm trying to learn this one.

I wasn't sure if this belonged in the calculus section, but it has some elements of calculus in it.

For the proof, I know most of the background knowledge. I understand De Moivre's theorem, the binomial theorem, one-to-one functions, trig identities, the inequality given, limits, and the squeeze theorem.
I'm not completely sure about Viète's formulas... I know that they give us a value for the sum of the roots.

Anyway, I somewhat understand this proof but not completely.
In the first step, I know how the (cotx+i)^n function is derived and how it is expanded.

I see how imaginary and real parts are grouped and the sine function is set equal to the imaginary parts.

In the next step it says 0 is equal to that equation and I understand why it's equal to 0. However, I do not see why it ends with (-1)^m. I think I see why it would be a '1' since the 'i' was factored out, but I do not understand why it is negative and why it's raised to the mth power.

In the next equation, a function p(t) is defined. I was wondering why (cotx)² needs to be a one-to-one function in the interval in order for this to happen. I also do not see why the equation before it must equal 0 for this to happen.

The steps following I generally understand the processes.

It would be greatly appreciated if a run-through of the proof were posted, but most importantly I would like the points above clarified.

Thank you very much.

2. Originally Posted by Anthonny
In the next step it says 0 is equal to that equation and I understand why it's equal to 0. However, I do not see why it ends with (-1)^m. I think I see why it would be a '1' since the 'i' was factored out, but I do not understand why it is negative and why it's raised to the mth power.

In the next equation, a function p(t) is defined. I was wondering why (cotx)² needs to be a one-to-one function in the interval in order for this to happen. I also do not see why the equation before it must equal 0 for this to happen.
This $\displaystyle (-1)^m$ comes from the fact that $\displaystyle i^{2m+1}=(-1)^m i$. Let me rewrite the expansion of $\displaystyle (\cot x + i)^{2m+1}$:

$\displaystyle (\cot x+i)^{2m+1}=\sum_{k=0}^{2m+1}{2m+1\choose k}(\cot x)^{2m+1-k}i^k$ $\displaystyle =\sum_{p=0}^m {2m+1\choose 2p}(\cot x)^{2m+1-2p}i^{2p}+\sum_{p=0}^m {2m+1\choose 2p+1}(\cot x)^{2m+1-(2p+1)}i^{2p+1}$

(separating terms with even and odd indices) and $\displaystyle i^{2p}=(-1)^p$, $\displaystyle i^{2p+1}=(-1)^p i$, hence

$\displaystyle (\cot x+i)^{2m+1}=\sum_{p=0}^m (-1)^p{2m+1\choose 2p}(\cot x)^{2m+1-2p}$ $\displaystyle +i\sum_{p=0}^m (-1)^p{2m+1\choose 2p+1}(\cot x)^{2(m-p)}$.

Equating imaginary parts, we get

$\displaystyle \frac{\sin((2m+1)x)}{(\sin x)^{2m+1}}=\sum_{p=0}^m (-1)^p{2m+1\choose 2p+1}(\cot x)^{2m+1-(2p+1)}=P\big((\cot x)^2\big)$,

where we define the polynomial $\displaystyle P(t)=\sum_{p=0}^m (-1)^p{2m+1\choose 2p+1}t^{m-p}$.
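If it helps, the identity above can be checked numerically. This is only a sanity-check sketch, not part of the proof; the function names `P`, `lhs` and `rhs` are my own choices for illustration.

```python
import math

def P(t, m):
    """Evaluate P(t) = sum_{p=0}^{m} (-1)^p * C(2m+1, 2p+1) * t^(m-p)."""
    return sum((-1) ** p * math.comb(2 * m + 1, 2 * p + 1) * t ** (m - p)
               for p in range(m + 1))

def lhs(x, m):
    """Left-hand side: sin((2m+1)x) / (sin x)^(2m+1)."""
    return math.sin((2 * m + 1) * x) / math.sin(x) ** (2 * m + 1)

def rhs(x, m):
    """Right-hand side: P evaluated at (cot x)^2."""
    cot = math.cos(x) / math.sin(x)
    return P(cot ** 2, m)

# Compare both sides at a few sample points in (0, pi/2).
for m in (1, 2, 5):
    for x in (0.3, 0.7, 1.2):
        print(m, x, lhs(x, m), rhs(x, m))
```

The two printed columns should agree up to floating-point rounding for every pair of `m` and `x`.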

For $\displaystyle r=1,\ldots,m$, the real number $\displaystyle x_r=\frac{r\pi}{2m+1}$ satisfies $\displaystyle 0<x_r<\frac{\pi}{2}$, hence $\displaystyle \sin x_r\neq 0$, and $\displaystyle \sin((2m+1)x_r)=0$, so that the previous equation shows that $\displaystyle (\cot x_r)^2$ is a root of $\displaystyle P$.

The degree of $\displaystyle P$ is $\displaystyle m$, hence it has at most $\displaystyle m$ roots. Furthermore, since $\displaystyle 0<x_1<x_2<\cdots<x_m<\frac{\pi}{2}$, we have $\displaystyle \cot x_1>\cot x_2>\cdots>\cot x_m>0$, hence $\displaystyle (\cot x_1)^2>(\cot x_2)^2>\cdots>(\cot x_m)^2>0$, which shows that these $\displaystyle m$ roots are distinct. Therefore they are the $\displaystyle m$ roots of $\displaystyle P$.
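Here is a small numerical sketch of this step, assuming a sample value `m = 4` (any positive integer should behave the same way). It evaluates $\displaystyle P$ at the values $\displaystyle (\cot x_r)^2$ and checks that they are positive and strictly decreasing. The helper names are hypothetical.

```python
import math

def P(t, m):
    """Evaluate P(t) = sum_{p=0}^{m} (-1)^p * C(2m+1, 2p+1) * t^(m-p)."""
    return sum((-1) ** p * math.comb(2 * m + 1, 2 * p + 1) * t ** (m - p)
               for p in range(m + 1))

def candidate_roots(m):
    """Return (cot x_r)^2 for x_r = r*pi/(2m+1), r = 1..m."""
    return [1 / math.tan(r * math.pi / (2 * m + 1)) ** 2
            for r in range(1, m + 1)]

m = 4  # sample value chosen for illustration
roots = candidate_roots(m)
for t in roots:
    # P(t) should vanish up to rounding error.
    print(t, P(t, m))

# The candidates are strictly decreasing, hence pairwise different.
print(all(a > b for a, b in zip(roots, roots[1:])))
```

Since there are `m` different values and the degree is `m`, no other root can exist.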

For any polynomial $\displaystyle P(t)=a_0+a_1 t+\cdots +a_m t^m$ of degree $\displaystyle m$ with $\displaystyle m$ roots $\displaystyle t_1,\ldots,t_m$ (counted with multiplicity), one can factorize $\displaystyle P(t)=\lambda(t-t_1)(t-t_2)\cdots(t-t_m)$. Expanding this expression (mentally) and gathering the terms of common degree, we see that the term of degree $\displaystyle m$ is just $\displaystyle a_m t^m=\lambda t^m$ and the term of degree $\displaystyle m-1$ is $\displaystyle a_{m-1}t^{m-1}=-\lambda(t_1+\cdots+t_m)t^{m-1}$. Thus $\displaystyle \lambda=a_m$ and $\displaystyle t_1+\cdots+t_m=-\frac{a_{m-1}}{a_m}$. This is the first of Viète's formulas.
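To see the sum-of-roots formula on a concrete case, here is a minimal sketch that expands $\displaystyle \lambda(t-t_1)\cdots(t-t_m)$ into coefficients and compares $\displaystyle -a_{m-1}/a_m$ with the sum of the roots. The roots 1, 2, 3 and the leading coefficient 2 are arbitrary values picked for illustration, and `expand_from_roots` is a hypothetical helper.

```python
def expand_from_roots(roots, lead=1.0):
    """Return [a_0, ..., a_m] for lead * (t - t_1) * ... * (t - t_m)."""
    coeffs = [lead]
    for root in roots:
        # Multiply the current polynomial by (t - root).
        shifted = [0.0] + coeffs                      # current * t
        scaled = [-root * c for c in coeffs] + [0.0]  # current * (-root)
        coeffs = [s + c for s, c in zip(shifted, scaled)]
    return coeffs

sample_roots = [1.0, 2.0, 3.0]
coeffs = expand_from_roots(sample_roots, lead=2.0)
a_m, a_m_minus_1 = coeffs[-1], coeffs[-2]

print(coeffs)               # coefficients a_0 .. a_m
print(-a_m_minus_1 / a_m)   # should match the sum of the roots
print(sum(sample_roots))
```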

In the case of our polynomial, $\displaystyle a_m={2m+1\choose 1}$ and $\displaystyle a_{m-1}=-{2m+1\choose 3}$, so this gives $\displaystyle \sum_{r=1}^m \big(\cot\frac{r\pi}{2m+1}\big)^2 = \frac{{2m+1\choose 3}}{{2m+1\choose 1}}=\frac{2m(2m-1)}{6}$.
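A quick numerical sketch comparing the cotangent sum with the ratio of binomial coefficients, for a few values of $\displaystyle m$ chosen arbitrarily. The function names are mine, for illustration only.

```python
import math

def cot_squared_sum(m):
    """Sum of cot^2(r*pi/(2m+1)) for r = 1..m."""
    return sum(1 / math.tan(r * math.pi / (2 * m + 1)) ** 2
               for r in range(1, m + 1))

def binomial_ratio(m):
    """C(2m+1, 3) / C(2m+1, 1), the value given by the sum-of-roots formula."""
    return math.comb(2 * m + 1, 3) / math.comb(2 * m + 1, 1)

for m in (1, 2, 3, 10, 50):
    print(m, cot_squared_sum(m), binomial_ratio(m))
```

The two columns should agree up to rounding for each `m`.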

On the other hand, we have, for $\displaystyle 0<x<\frac{\pi}{2}$, $\displaystyle \cot^2x<\frac{1}{x^2}<1+\cot^2x$. Applying this to $\displaystyle x=x_r$ and summing over $\displaystyle r=1,\ldots,m$ gives two bounds for $\displaystyle \sum_{r=1}^m \frac{1}{x_r^2}$. Since $\displaystyle \frac{1}{x_r^2}=\frac{(2m+1)^2}{\pi^2 r^2}$, multiplying through by $\displaystyle \frac{\pi^2}{(2m+1)^2}$ turns these into bounds for $\displaystyle \sum_{r=1}^m \frac{1}{r^2}$. Letting $\displaystyle m$ go to infinity, both bounds are easily seen to converge toward $\displaystyle \frac{\pi^2}{6}$, hence the conclusion by the squeeze theorem.
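The squeeze can also be watched numerically. The sketch below computes the cotangent sum directly (so it does not rely on any closed form), rescales both bounds by $\displaystyle \pi^2/(2m+1)^2$, and prints them next to the partial sum of $\displaystyle 1/r^2$. The function name `squeeze_bounds` and the sample values of `m` are assumptions made for illustration.

```python
import math

def squeeze_bounds(m):
    """Return (lower, partial_sum, upper) for the partial sum of 1/r^2, r = 1..m."""
    cot_sum = sum(1 / math.tan(r * math.pi / (2 * m + 1)) ** 2
                  for r in range(1, m + 1))
    # 1/x_r^2 = (2m+1)^2 / (pi^2 r^2), so rescale by pi^2 / (2m+1)^2.
    scale = math.pi ** 2 / (2 * m + 1) ** 2
    partial_sum = sum(1 / r ** 2 for r in range(1, m + 1))
    lower = scale * cot_sum          # from cot^2 x < 1/x^2
    upper = scale * (m + cot_sum)    # from 1/x^2 < 1 + cot^2 x
    return lower, partial_sum, upper

for m in (1, 10, 100, 1000):
    print(m, squeeze_bounds(m))
print("pi^2/6 =", math.pi ** 2 / 6)
```

As `m` grows, the lower and upper values should close in on $\displaystyle \pi^2/6$ with the partial sum trapped between them.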

Feel free to ask for more clarifications.

3. Originally Posted by Laurent
Feel free to ask for more clarifications.
Thank you so much for the detailed clarification. The clarification helped me understand the binomial expansion completely and why it ends with the (-1)^m.

However, I have just one more question.

Is it essential that the m roots of P must be distinct?
My understanding of "distinct" is that all the values are different, but I am not completely sure about that either.

Again, thank you very much.

4. Originally Posted by Anthonny
Is it essential that the m roots of P must be distinct?
(distinct = different; maybe this is a "Gallicism", I don't know...)

What is essential is that we know what the roots of P are (for we need to know what their sum is), together with their multiplicity (here, they are simple). On one hand, we know there are at most m of them (by the degree of P). On the other hand, we have m different ones at our disposal, namely $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$. As a consequence, the roots are exactly $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$.

There could have been double roots, provided we knew which ones they were (and counted them twice in the sum of the roots).

If the degree of $\displaystyle P$ were $\displaystyle m+1$, then there would be one root missing, and the sum of the roots would not be determined by knowledge of $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$ alone.

5. Originally Posted by Laurent
(distinct = different; maybe this is a "Gallicism", I don't know...)

What is essential is that we know what the roots of P are (for we need to know what their sum is), together with their multiplicity (here, they are simple). On one hand, we know there are at most m of them (by the degree of P). On the other hand, we have m different ones at our disposal, namely $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$. As a consequence, the roots are exactly $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$.

There could have been double roots, provided we knew which ones they were (and counted them twice in the sum of the roots).

If the degree of $\displaystyle P$ were $\displaystyle m+1$, then there would be one root missing, and the sum of the roots would not be determined by knowledge of $\displaystyle (\cot x_1)^2,\ldots,(\cot x_m)^2$ alone.

Thank you!
I now basically understand the proof of this problem completely.

Again, thank you so much.