# Thread: X=0 almost surely => E(X)=0

1. X=0 almost surely => E(X)=0

Axioms of expectation:
1. X≥0 => E(X)≥0
2. E(1)=1
3. E(aX+bY) = aE(X) + bE(Y)
4. If X_n is a nondecreasing sequence of nonnegative random variables and lim(X_n) = X, then E(X) = lim E(X_n) [monotone convergence theorem]

Definition: P(A)=E[I(A)]

Using the above, prove that if X=0 almost surely [i.e. P(X=0)=1 ], then E(X)=0.

Proof:
X=0 almost surely <=> |X|=0 almost surely

[note: I is the indicator/dummy variable
I(A)=1 if event A occurs
I(A)=0 otherwise]

|X| = |X| I(|X|=0) + |X| I(|X|>0)
=> E(|X|) = E(0) + E[|X| I(|X|>0)]
=E(0*1) + E[|X| I(|X|>0)]
=0E(1) + E[|X| I(|X|>0)] (axiom 3)
=0 + E[|X| I(|X|>0)] (axiom 2)
=E[|X| I(|X|>0)]
=E[lim |X| * I(0<|X|≤N)] (lim here means the limit as N->∞)
=lim E[|X| * I(0<|X|≤N)] (axiom 4)
≤ lim E[N * I(0<|X|≤N)] (since |X| I(0<|X|≤N) ≤ N I(0<|X|≤N))
=lim N * E[I(0<|X|≤N)]
=lim N * P(0<|X|≤N) (by definition)
=lim (0) since P(X=0)=1 => P(0<|X|≤N)=0
=0
Since E(|X|) ≥ 0 by axiom 1, this forces E(|X|) = 0. Finally, -|X| ≤ X ≤ |X|, so axioms 1 and 3 give -E(|X|) ≤ E(X) ≤ E(|X|), hence E(X)=0.
=======================================

Now, I don't understand the parts in red.
1) The following is a proof of the claim E[|X| I(|X|=0)] =0.
"If X=0, then |X|=0, so |X| I(|X|=0) = 0
If X≠0, then I(|X|=0) =0, so |X| I(|X|=0) = 0
Therefore, |X| I(|X|=0) is always 0 and E[|X| I(|X|=0)] =0. "

But in the assumptions, we are given that X=0 almost surely [i.e. P(X=0)=1]. We can say that X=0 with certainty. It is impossible for X to not be 0. Then, why are we even considering the case X≠0? There should only be one possible case, X=0, right?
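(For what it's worth, the quoted claim is purely pointwise and can be sanity-checked mechanically; a minimal Python sketch, where `ind` is my own stand-in for the indicator I:)

```python
def ind(condition):
    """Indicator: 1 if the event occurs, 0 otherwise."""
    return 1 if condition else 0

# |x| * I(|x| = 0) is 0 for EVERY real x, whether x = 0 or not --
# the case split in the quoted proof covers all of them:
for x in [0.0, 1.5, -3.2, 1e-9]:
    assert abs(x) * ind(abs(x) == 0) == 0
```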

2) |X| * I(0<|X|≤N) ≤ N * I(0<|X|≤N)
=> E[|X| * I(0<|X|≤N)] ≤ E[N * I(0<|X|≤N)] (I am OK with this step)
But does this imply
lim E[|X| * I(0<|X|≤N)] ≤ lim E[N * I(0<|X|≤N)] ???????????
(note: lim=the limit as N->∞)

I was flipping through my calculus textbooks, but I couldn't find a theorem that applies and justifies the last step about the limit.

Any help is greatly appreciated!

[also under discussion in talk stats forum and S.O.S. math cyberboard, yet nobody has provided an answer to the two follow-up questions above]

2. Suppose the random variable $\displaystyle X$ has the standard normal distribution, $\displaystyle X\sim\mathrm N(0,1)$. Pick any real number, say $\displaystyle 1.5$. What is the probability of the event $\displaystyle \{X=1.5\}$? The answer is $\displaystyle 0$, i.e. $\displaystyle \mathrm P(X=1.5)=0$, as any high school student will tell you. So the event $\displaystyle \{X\neq1.5\}$ has probability $\displaystyle 1$, that is $\displaystyle X\neq1.5$ almost surely.

Do you think this means that it is impossible for $\displaystyle X$ to be exactly $\displaystyle 1.5$? You may say yes, and you may be right, but then the same is true for any other value. You are left wondering how it is possible for $\displaystyle X$ to take any value whatsoever, given that it's impossible for $\displaystyle X$ to take any particular value.

Do you begin to see the difference between the terms surely and almost surely?

3. Originally Posted by halbard
Suppose the random variable $\displaystyle X$ has the standard normal distribution, $\displaystyle X\sim\mathrm N(0,1)$. Pick any real number, say $\displaystyle 1.5$. What is the probability of the event $\displaystyle \{X=1.5\}$? The answer is $\displaystyle 0$, i.e. $\displaystyle \mathrm P(X=1.5)=0$, as any high school student will tell you. So the event $\displaystyle \{X\neq1.5\}$ has probability $\displaystyle 1$, that is $\displaystyle X\neq1.5$ almost surely.

Do you think this means that it is impossible for $\displaystyle X$ to be exactly $\displaystyle 1.5$? You may say yes, and you may be right, but then the same is true for any other value. You are left wondering how it is possible for $\displaystyle X$ to take any value whatsoever, given that it's impossible for $\displaystyle X$ to take any particular value.

Do you begin to see the difference between the terms surely and almost surely?
1) But ever since high school, I have been taught that an event with probability 1 is a certain event and an event with probability 0 is an impossible event.
P(Ω)=1
P(empty set)=0

I still don't quite see the difference between "X=0" and "X=0 almost surely".

Also, about your example P(X=1.5)=0: it is actually a big puzzle to me... I have never been able to fully understand why P(X=1.5) would be exactly 0. What I actually believe is that P(X=1.5) would be very close to 0, instead of exactly 0.
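(One way to see why it is exactly 0, sketched numerically: P(X=1.5) is bounded above by P(1.5-ε ≤ X ≤ 1.5+ε) for every ε>0, and those interval probabilities can be made smaller than any positive number. A Python sketch using the standard normal CDF built from `math.erf` — my own helper, not anything from the thread:)

```python
import math

def norm_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# P(1.5 - eps <= X <= 1.5 + eps) shrinks toward 0 as eps -> 0,
# and P(X = 1.5) is <= every one of these values.
for eps in [0.1, 0.01, 0.001, 1e-6]:
    p = norm_cdf(1.5 + eps) - norm_cdf(1.5 - eps)
    print(f"eps={eps:g}: interval probability ~= {p:.2e}")
```

Since P(X=1.5) is a single nonnegative number smaller than all of these, it cannot be any positive value; the only number ≤ every positive ε-bound is 0.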

4. Originally Posted by kingwinner
2) |X| * I(0<|X|≤N) ≤ N * I(0<|X|≤N)
=> E[|X| * I(0<|X|≤N)] ≤ E[N * I(0<|X|≤N)] (I am OK with this step)
But does this imply
lim E[|X| * I(0<|X|≤N)] ≤ lim E[N * I(0<|X|≤N)] ???????????
(note: lim=the limit as N->∞)

I was flipping through my calculus textbooks, but I couldn't find a theorem that applies and justifies the last step about the limit.
You need to rewrite this so that you can see what is happening:

$\displaystyle f_1(N) \le f_2(N)$

Then:

$\displaystyle \lim_{N\to N_0} f_1(N) \le \lim_{N \to N_0} f_2(N)$

This should be familiar, since it is equivalent to:

$\displaystyle g(N)=f_2(N)-f_1(N)\ge 0$

Then:

$\displaystyle \lim_{N \to N_0}g(N) \ge 0$

(I have missed out the caveats about what is meant if any of the limits do not exist)

CB
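(The rewriting above can be checked numerically with concrete functions; a minimal sketch where f1, f2 are my own illustrative choices with N_0 = ∞, not anything specific to the thread:)

```python
# Illustrative choice: f1(N) = 1/N <= f2(N) = 1/N + 1/N**2 for N >= 1,
# so g(N) = f2(N) - f1(N) = 1/N**2 is nonnegative for every N,
# and its limit as N -> infinity (here 0) is also >= 0.
def f1(N): return 1 / N
def f2(N): return 1 / N + 1 / N**2
def g(N):  return f2(N) - f1(N)

for N in [1, 10, 100, 10_000]:
    assert g(N) >= 0      # g never dips below 0...

print(g(10_000))          # ...so it approaches its limit 0 from above
```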

5. Originally Posted by CaptainBlack
You need to rewrite this so that you can see what is happening:

$\displaystyle f_1(N) \le f_2(N)$

Then:

$\displaystyle \lim_{N\to N_0} f_1(N) \le \lim_{N \to N_0} f_2(N)$

This should be familiar, since it is equivalent to:

$\displaystyle g(N)=f_2(N)-f_1(N)\ge 0$

Then:

$\displaystyle \lim_{N \to N_0}g(N) \ge 0$

(I have missed out the caveats about what is meant if any of the limits do not exist)

CB
Just to clarify a bit, by writing f(N)≤g(N), you mean "f(N)≤g(N) for ALL N", am I right?

Is it true that f(N)≤g(N) ALWAYS implies
lim_{N->∞} f(N) ≤ lim_{N->∞} g(N) ?
And can we safely take the limit of both sides while still preserving the same inequality sign??
(I flipped through my calculus textbooks again, but I am still unable to find the exact statement of a theorem that guarantees the above, so I am not sure whether it is a correct statement or not.)
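(The theorem being asked about does exist: if f(N) ≤ g(N) for all N and both limits exist, then lim f(N) ≤ lim g(N) — limits preserve non-strict inequalities. One caveat worth seeing: a strict inequality f(N) < g(N) can collapse to equality in the limit. A small Python sketch with hypothetical f and g of my own choosing:)

```python
# f(N) < g(N) strictly for every N >= 1, yet both limits are 0,
# so in the limit "<" weakens to "<=" (here, "=").
def f(N): return 1 / (2 * N)
def g(N): return 1 / N

for N in [1, 10, 1000]:
    assert f(N) < g(N)    # strict inequality at every finite N

big = 10**9
print(f(big), g(big))     # both arbitrarily close to the common limit 0
```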