ERM Induction consistency problem

Suppose that X_1, ..., X_n are i.i.d. from F, an unknown distribution. We wish to estimate the true mean mu of F with the loss function L(theta, x) = (x - theta)^2.

Prove that X_bar = (1/n) sum_{i=1}^n X_i is consistent for ERM induction.

Proof so far.

Now, I know that the Bayes learner is mu and the ERM learner is X_bar.

I need to show that both the risk function and the empirical risk function converge to the Bayes risk.

That is, I need to show that R(X_bar) -> R(mu) in probability and R_hat_n(X_bar) -> R(mu) in probability.
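Just to convince myself of what I'm trying to prove, here is a quick simulation (my own sketch, not part of the problem: I picked F = Normal(mu=2, var=1), so under squared loss R(theta) = Var(X) + (theta - mu)^2 and the Bayes risk is R(mu) = Var(X) = 1):

```python
# Sketch: watch both R(X_bar) and R_hat_n(X_bar) approach the Bayes risk.
# F = Normal(mu=2, var=1) is my own choice for illustration.
import numpy as np

rng = np.random.default_rng(0)
mu, var = 2.0, 1.0

for n in (100, 10_000, 1_000_000):
    x = rng.normal(mu, np.sqrt(var), size=n)
    x_bar = x.mean()
    risk_of_erm = var + (x_bar - mu) ** 2       # R(X_bar) = Var(X) + (X_bar - mu)^2
    empirical_risk = ((x - x_bar) ** 2).mean()  # R_hat_n(X_bar)
    print(n, risk_of_erm, empirical_risk)       # both tend to Var(X) = 1
```

Both columns settle down near 1 as n grows, which is the behavior the two convergence statements describe.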

For the first part, I have: P(|R(X_bar) - R(mu)| > epsilon) = P(|E[(X - X_bar)^2] - E[(X - mu)^2]| > epsilon).

Looks like I'm lost...

I know, by the Law of Large Numbers, that X_bar -> mu in probability, so I'm basically trying to mold that into what I need, but I'm stuck...
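Here is what I mean by the LLN statement, checked by simulation (my own sketch; the distribution, eps, and rep count are just choices for illustration): P(|X_bar - mu| > eps) should shrink as n grows.

```python
# Sketch: estimate P(|X_bar - mu| > eps) by repeated simulation
# (F = Normal(0, 1), eps = 0.1, 2000 repetitions -- my own choices).
import numpy as np

rng = np.random.default_rng(1)
mu, eps, reps = 0.0, 0.1, 2000

for n in (10, 100, 1000):
    x_bars = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    p_hat = np.mean(np.abs(x_bars - mu) > eps)
    print(n, p_hat)  # estimated exceedance probability, shrinking in n
```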

I'm guessing that one mistake I made is in the form of the loss function, but when I try the other form, it doesn't work out either, and I still can't get the ERM consistency going...

Any help, please? Thank you!!!

Re: ERM Induction consistency problem

Hey tttcomrader.

Since you have an expectation within the probability, you can use the identity E[X_bar] = mu, which means you will be left with mu^2 and X_bar^2 inside the expression.

From there you can either prove that X_bar^2 converges to mu^2 in probability, or use the factorization mu^2 - X_bar^2 = (mu - X_bar)(mu + X_bar): writing z = mu + X_bar, show that P(|z|*|mu - X_bar| > epsilon) -> 0. (You may have to handle the absolute values by considering the different branches.)
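A quick numerical sketch of what I mean (the distribution and all names here are mine, just for illustration): the factorization is an exact identity, and X_bar^2 tracks mu^2 as n grows because |mu - X_bar| shrinks while |mu + X_bar| stays bounded.

```python
# Sketch: check mu^2 - X_bar^2 = (mu - X_bar)(mu + X_bar) and watch
# X_bar^2 approach mu^2 (F = Normal(3, 1) is my own choice).
import numpy as np

rng = np.random.default_rng(2)
mu = 3.0

for n in (100, 10_000, 1_000_000):
    x_bar = rng.normal(mu, 1.0, size=n).mean()
    lhs = mu**2 - x_bar**2
    rhs = (mu - x_bar) * (mu + x_bar)
    assert np.isclose(lhs, rhs)        # the algebraic identity, exactly
    print(n, abs(mu**2 - x_bar**2))    # shrinks as n grows
```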

Re: ERM Induction consistency problem

So is my loss function correct?

Re: ERM Induction consistency problem

Do you mean is the proof for the loss function correct?

The proof will be correct if you establish the inequality |mu^2 - X_bar^2| <= |mu + X_bar|*|mu - X_bar| (you should look at Banach spaces for more information); this is equivalent to saying that the norm is continuous (which it should be, but you may have to state a few things).

Re: ERM Induction consistency problem

Hey! Thank you for your post, and I have to tell you I deeply appreciate you taking the time to help me here. I'm actually just trying to do some problems from previous courses that my professor taught, hoping to learn something new lol.

What I meant before was: which of the two forms should the loss function take?

Since we are estimating the true mean, I think the loss function should be the first one.

If it is the first one, then I can't prove that X_bar is the ERM learner, since:

Attempted proof: I want to show that (1/n) sum_{i=1}^n (X_i - theta)^2 >= (1/n) sum_{i=1}^n (X_i - X_bar)^2 for every theta.

Now, (1/n) sum_{i=1}^n (X_i - theta)^2 = (1/n) sum_{i=1}^n (X_i - X_bar)^2 + (2/n)(X_bar - theta) sum_{i=1}^n (X_i - X_bar) + (X_bar - theta)^2.
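At least the decomposition itself checks out numerically (my own sketch; the sample, seed, and theta values are arbitrary choices):

```python
# Sketch: verify (1/n) sum (X_i - theta)^2
#   = (1/n) sum (X_i - X_bar)^2
#     + (2/n)(X_bar - theta) sum (X_i - X_bar)
#     + (X_bar - theta)^2
# on a random sample, for a few values of theta.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(2.0, 1.0, size=50)
x_bar = x.mean()

for theta in (-1.0, 0.0, 2.5):
    lhs = np.mean((x - theta) ** 2)
    term1 = np.mean((x - x_bar) ** 2)
    term2 = (2.0 / len(x)) * (x_bar - theta) * np.sum(x - x_bar)
    term3 = (x_bar - theta) ** 2
    assert np.isclose(lhs, term1 + term2 + term3)  # identity holds
    print(theta, term2)  # the troublesome middle term
```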

I'm stuck.

Basically, I want to show that the two terms to the right of (1/n) sum (X_i - X_bar)^2 are nonnegative. Of course the third term is nonnegative; the problem is the second one...