ERM Induction consistency problem

• Oct 19th 2012, 12:01 PM
ERM Induction consistency problem
Suppose that $X_1, \ldots, X_n$ are i.i.d. from $F$, an unknown distribution. We wish to estimate the true mean $\mu$ of $F$ with the loss function $L(X, f(X)) = (X - f(X))^2$.

Prove that $\bar{X}$ is consistent for ERM induction.

Proof so far.

Now, I know that the Bayes learner $f_{Bayes} = \mu$ and the ERM learner $f_{ERM} = \bar{X}$.

I need to show that both the risk function $R( \bar{X} ) = E[ (X - \bar{X})^2 ]$ and the empirical risk function $\hat{R}( \bar{X} ) = \frac{1}{n} \sum_{k=1}^n E[ (X_k - \bar{X})^2 ]$ converge to the Bayes risk $R_{Bayes} = R( \mu ) = E\,L( X, \mu ) = E[ (X - \mu)^2 ]$.
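As a numerical sanity check of that claim before proving it (a sketch only, assuming a concrete choice $F = N(2, 1)$, so the Bayes risk is $\sigma^2 = 1$):

```python
# Sketch (not a proof): with F = Normal(mu, sigma^2) the Bayes risk is sigma^2.
# Check that the true risk R(X_bar) = sigma^2 + (X_bar - mu)^2 and the
# empirical risk (1/n) sum (X_k - X_bar)^2 both approach sigma^2 as n grows.
import random
import statistics

random.seed(0)
mu, sigma = 2.0, 1.0
bayes_risk = sigma ** 2

for n in (10, 1000, 100000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    true_risk = sigma ** 2 + (x_bar - mu) ** 2        # E[(X - x_bar)^2] for a fresh X
    emp_risk = sum((x - x_bar) ** 2 for x in sample) / n
    print(n, abs(true_risk - bayes_risk), abs(emp_risk - bayes_risk))
```

Both gaps shrink toward 0 as $n$ grows, which is what the consistency statement asserts.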

That is, I need to show that $\forall \epsilon > 0$, $\Pr( | R( \mu ) - R( \bar{X} ) | > \epsilon ) \to 0$ and $\Pr( | R( \mu ) - \hat{R}( \bar{X} ) | > \epsilon ) \to 0$ as $n \to \infty$.

For the first part, I have:

$\Pr( | R( \mu ) - R( \bar{X} ) | > \epsilon )$

$= \Pr( | E[ (X - \mu)^2 - (X - \bar{X})^2 ] | > \epsilon )$

$= \Pr( | E[ -2X\mu + \mu^2 + 2X\bar{X} - \bar{X}^2 ] | > \epsilon )$

$= \Pr( | -2\mu^2 + \mu^2 + 2E[ X\bar{X} ] - E[ \bar{X}^2 ] | > \epsilon ) \ldots$ Looks like I'm lost...

I know by the Law of Large Numbers that $\Pr( | \mu - \bar{X} | > \epsilon ) \to 0$, so I'm basically trying to mold the expression into that form, but I'm stuck...

I'm guessing that one mistake I made is that the loss function should be $L( \mu , f ) = ( \mu - f )^2$, but if I try it that way, I can't get $\bar{X} = f_{ERM}$, and I still can't get the ERM consistency going...

Any help, please? Thank you!!!
• Oct 19th 2012, 04:12 PM
chiro
Re: ERM Induction consistency problem

Since you have an expectation within the probability, you can use the identity E[X_bar] = mu which means you will get mu^2 and X_bar^2 left inside the expression.

From there you can either prove that X_bar^2 converges to mu^2, or use the factorisation mu^2 - X_bar^2 = (mu - X_bar)(mu + X_bar) and attempt to show that if z = mu + X_bar, then P(|z|*|mu - X_bar| > epsilon) -> 0. (You may have to show that the above holds for absolute values by considering the different branches.)
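That factorisation can be checked numerically (a sketch, again assuming an illustrative Normal(2, 1) sample): |mu + X_bar| stays bounded near 2*mu while |mu - X_bar| -> 0 by the LLN, so the product, and hence |mu^2 - X_bar^2|, shrinks with n.

```python
# Sketch of the hint: mu^2 - X_bar^2 = (mu - X_bar)(mu + X_bar), a bounded
# factor times a factor that vanishes in probability by the LLN.
import random
import statistics

random.seed(1)
mu = 2.0
gaps = []
for n in (100, 10000, 100000):
    sample = [random.gauss(mu, 1.0) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    # the algebraic identity itself, up to floating-point rounding
    assert abs((mu - x_bar) * (mu + x_bar) - (mu ** 2 - x_bar ** 2)) < 1e-9
    gaps.append(abs(mu ** 2 - x_bar ** 2))
print(gaps)
```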
• Oct 19th 2012, 06:05 PM
Re: ERM Induction consistency problem
So is my loss function correct?
• Oct 19th 2012, 07:00 PM
chiro
Re: ERM Induction consistency problem
Do you mean is the proof for the loss function correct?

The proof will be correct if the bound |mu^2 - X_bar^2| <= |mu + X_bar|*|mu - X_bar| holds (you should look at Banach spaces for more information), and this is equivalent to saying that the norm is continuous (which it should be, but you may have to state a few things).
• Oct 19th 2012, 07:11 PM
Re: ERM Induction consistency problem
Hey! Thank you for your post and I have to tell you I deeply appreciate you taking your time to help me here. I'm actually just trying to do some problems from previous courses that my professor taught and hope to learn something new lol.

What I meant before was: should the loss function be $( \mu - f )^2$ or $( X - f )^2$?

Since we are estimating the true mean, I think the loss function should be the first one.

If it is the first one, then I can't prove that $\bar{X}$ is the ERM learner, since:

Attempted proof: I want to show that $\hat{R}(f) \geq \hat{R}( \bar{X} ) \ \ \ \ \ \forall f \in \mathcal{F} = \mathbb{R}$.

Now, $\hat{R}(f) = \frac{1}{n} \sum_{k=1}^n ( \mu - f )^2$

$= \frac{1}{n} \sum_{k=1}^n ( \mu - \bar{X} + \bar{X} - f )^2$

$= \frac{1}{n} \sum_{k=1}^n [ ( \mu - \bar{X} )^2 + 2( \mu - \bar{X} )( \bar{X} - f ) + ( \bar{X} - f )^2 ]$

$= \frac{1}{n} \sum_{k=1}^n ( \mu - \bar{X} )^2 + \frac{1}{n} \sum_{k=1}^n 2( \mu - \bar{X} )( \bar{X} - f ) + \frac{1}{n} \sum_{k=1}^n ( \bar{X} - f )^2$

$= \hat{R}( \bar{X} ) + \ldots$ I'm stuck.

Basically I want to show that the two terms to the right of $\hat{R}( \bar{X} )$ are nonnegative. Of course the third term is nonnegative; the problem is the second one, since $2( \mu - \bar{X} )( \bar{X} - f )$ can have either sign...
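A quick grid-search sketch shows which loss actually yields $\bar{X}$ as the empirical risk minimiser (the Normal(2, 1) sample and the grid of candidate $f$ values are illustrative assumptions, not part of the problem):

```python
# Sketch: with L(X, f) = (X - f)^2 the empirical risk (1/n) sum (X_k - f)^2
# is minimised at f = X_bar; with L(mu, f) = (mu - f)^2 it is minimised at
# f = mu itself, so X_bar cannot come out as f_ERM under that loss.
import random
import statistics

random.seed(2)
mu = 2.0
sample = [random.gauss(mu, 1.0) for _ in range(500)]
x_bar = statistics.fmean(sample)

grid = [i / 1000 for i in range(1000, 3001)]   # candidate f values in [1, 3]

def emp_risk(f):
    # empirical risk under the loss (X - f)^2
    return sum((x - f) ** 2 for x in sample) / len(sample)

f_erm = min(grid, key=emp_risk)                 # minimiser under (X - f)^2
f_alt = min(grid, key=lambda f: (mu - f) ** 2)  # minimiser under (mu - f)^2

print(f_erm, x_bar)   # f_erm is the grid point nearest X_bar
print(f_alt)          # 2.0, i.e. mu itself
```

This supports using $L(X, f) = (X - f)^2$ in the original problem: that is the loss for which $f_{ERM} = \bar{X}$.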