# Thread: When number theory and probabilities clash

1. ## When number theory and probabilities clash

Hey everybody,
I'm trying to prove the following :

Say you are given a set of $\displaystyle n$ random numbers $\displaystyle 0 \leq X_i \leq 255$ with $\displaystyle n > 1$. Prove that $\displaystyle P \left ( \sum_{i = 1}^n X_i \equiv a \pmod{256} \right ) = \frac{1}{256}$. That is, prove that the sum of all the random numbers modulo $\displaystyle 256$ has an equal chance of being any number between $\displaystyle 0$ and $\displaystyle 255$.
I've been trying various methods, but I cannot find a general way to prove this. I guess it requires playing with inequalities and common sense, but I already am having difficulties proving it when $\displaystyle n = 2$ (I've been separating it in two cases, the first one being $\displaystyle X_1 + X_2 < 256$, then I add both to find the desired probability). Does anyone have a general proof for this ?

Thanks all

2. The statement is wrong: the probability should be $\displaystyle \frac{1}{256}$, i.e. every residue of $\displaystyle X_1+\cdots+X_n$ modulo $\displaystyle 256$ is equally likely.

There is a strong symmetry between the variables, and this is the key.

Consider the simplest case to better understand it: if $\displaystyle X_1, X_2,\ldots$ are random with equal probability to equal 0 or 1, and independent of each other, then $\displaystyle X_1+X_2+\cdots+X_n\pmod{2}$ is 0 if $\displaystyle X_n$ has same parity as $\displaystyle X_1+\cdots+X_{n-1}$ and 1 else. However, by the case with n-1 variables, the parity of $\displaystyle X_1+\cdots+X_{n-1}$ is 0 or 1 with same probability. Using independence, we conclude that the parities match with probability 1/2.

The general case is the same. You can procede by induction: you have (equalities are modulo N) $\displaystyle P(X_1+\cdots+X_n+X_{n+1}=x)=P(X_{n+1}=x-(X_1+\cdots+X_n))$ $\displaystyle =\sum_{y=0}^{N-1} P(X_{n+1}=x-y, X_1+\cdots+X_n=y)$ $\displaystyle =\sum_{y=0}^{N-1} P(X_{n+1}=x-y)P(X_1+\cdots+X_n=y)$ (using independence) and the conclusion is immediate since each probability is $\displaystyle \frac{1}{N}$ by assumption or induction hypothesis.

3. Yes my bad, it is indeed 256, just added this before posting and realized it was wrong a couple of seconds ago
I'll edit. Thanks for your reply, I will attempt to prove it by induction and see how it flows.

4. I don't understand
and the conclusion is immediate since each probability is 1/N by assumption or induction hypothesis.
I understand that $\displaystyle P(X_{n + 1} = x - y) = \frac{1}{256}$ since only one value of $\displaystyle y$ satisfies it, but I don't get the next probability. That's what I'm trying to prove, isn't it ? That $\displaystyle P(X_1+\cdots+X_n=y) = \frac{1}{N}$ ? And if both probabilities are $\displaystyle \frac{1}{N}$, then the overall probability should be $\displaystyle \frac{1}{N^2}$ which is incorrect ? This is really puzzling me. Can you give me a little more help on this one Laurent ?

5. And I just thought of something, wouldn't it be possible to show that two random numbers $\displaystyle a$ and $\displaystyle b$ between $\displaystyle 0$ and $\displaystyle 255$ add up equally to any number between $\displaystyle 0$ and $\displaystyle 255$, then apply the same argument over the sum of all random numbers in an iterative way ? (like take the first two numbers, say their sum is equally distributed over $\displaystyle 0$..$\displaystyle 255$, and take the next number, rince and repeat ?)

If this is possible then I only have to show the case when $\displaystyle n = 2$, and if I really get stuck I can always prove it by exhaustive search (ugh I wish this won't happen, would look bad in my textbook)

6. Originally Posted by Bacterius
I don't understand

I understand that $\displaystyle P(X_{n + 1} = x - y) = \frac{1}{256}$ since only one value of $\displaystyle y$ satisfies it, but I don't get the next probability. That's what I'm trying to prove, isn't it ? That $\displaystyle P(X_1+\cdots+X_n=y) = \frac{1}{N}$ ?
No. This is a proof by induction, although I didn't write every detail. We assume that $\displaystyle P(X_1+\cdots+X_n=y)=\frac{1}{N}$ for every $\displaystyle y$ (this is obvious if n=1), and deduce that $\displaystyle P(X_1+\cdots+X_n+X_{n+1}=x)=\frac{1}{N}$ for every $\displaystyle x$.

Then you get $\displaystyle \sum_{y=0}^{N-1}\frac{1}{N^2}=\frac{N}{N^2}=\frac{1}{N}$.

About your last post: adding the random variables one by one is exactly was is done in the above induction.

You can write the induction step in a different way, like $\displaystyle P(X_1+\cdots+X_{n+1}=x)=P(X+Y=x)$, where $\displaystyle X=X_1+\cdots+X_n$ and $\displaystyle Y=X_{n+1}$ are independent and uniformly distributed on $\displaystyle \{1,\ldots,N\}$ by the induction hypothesis, so you are reduced to the case $\displaystyle n=2$. Maybe you like it better this way; the computation is exactly the same as the one I did (sum over the value y of $\displaystyle Y$), a very little bit longer but it may be clearer.

7. Oh, I understand now, that was smart ! Thanks Laurent !

8. ...or without induction, every step at once:

$\displaystyle P(X_1+\cdots+X_n=x)=$ $\displaystyle \sum_{0\leq x_1,\ldots,x_{n-1}<N}P(X_1=x_1,\ldots,X_{n-1}=x_{n-1},X_n=x-(x_1+\cdots+x_{n-1}))$ $\displaystyle =\sum_{0\leq x_1,\ldots,x_{n-1}<N}\frac{1}{N^n}=\frac{N^{n-1}}{N^n}=\frac{1}{N}$.

9. Originally Posted by Laurent
...or without induction, every step at once:

$\displaystyle P(X_1+\cdots+X_n=x)=$ $\displaystyle \sum_{0\leq x_1,\ldots,x_{n-1}<N}P(X_1=x_1,\ldots,X_{n-1}=x_{n-1},X_n=x-(x_1+\cdots+x_{n-1}))$ $\displaystyle =\sum_{0\leq x_1,\ldots,x_{n-1}<N}\frac{1}{N^n}=\frac{N^{n-1}}{N^n}=\frac{1}{N}$.

DISCLAIMER: the writer of this post is NOT a probability/measure theory lover and, in fact, pretty much sucks at it.

I don't see this clearly: take $\displaystyle n=2\,,\,\,a=5\,,\,7\!\!\!\pmod{256}$ . Now, there are only 6 options to get a sum that equals $\displaystyle 5\!\!\!\pmod{256}$ according to the conditions: $\displaystyle 0-5\,,\,5-0\,,\,1-4\,,\,4-1\,,\,2-3\,,\,3-2$, but there're 8 options for $\displaystyle 7\!\!\!\pmod{256}: 7-0\,,\,0-7\,,\,1-6\,,\,6-1\,,\,2-5\,,\,5-2\,,\,3-4\,,\,4-3$ , so this means the probability to obtain 5 is less than the prob. to obtain 7...or ain't so?

Tonio

10. Originally Posted by tonio
I don't see this clearly: take $\displaystyle n=2\,,\,\,a=5\,,\,7\!\!\!\pmod{256}$ . Now, there are only 6 options to get a sum that equals $\displaystyle 5\!\!\!\pmod{256}$ according to the conditions: $\displaystyle 0-5\,,\,5-0\,,\,1-4\,,\,4-1\,,\,2-3\,,\,3-2$, but there're 8 options for $\displaystyle 7\!\!\!\pmod{256}: 7-0\,,\,0-7\,,\,1-6\,,\,6-1\,,\,2-5\,,\,5-2\,,\,3-4\,,\,4-3$ , so this means the probability to obtain 5 is less than the prob. to obtain 7...or ain't so?
What about $\displaystyle 6+255=6+(-1)$? Like in my previous post, the equality is to be understood modulo 256. In fact, everything would be simpler to write if we considered that $\displaystyle X_i$ is uniformly chosen in $\displaystyle \mathbb{Z}/256\mathbb{Z}$, and this is what I implicitly do here.

Thus 5 can be obtained as 0+5, 1+4, 2+3, 3+2, 4+1, 5+0, 6+255,... and it is immediate to see that there are 256 possibilities, and there is nothing specific to 5 here, so it would be the same for any sum. I guess suddenly the problems seems simpler than at first sight...