
Deriving the Least Squares Estimates

  1. #1
    Newbie
    Joined
    May 2009
    Posts
    6

    Deriving the Least Squares Estimates

    Hello everyone!

    My question is about the steps one takes to derive least squares estimators. It's a question which is part of a topic I am studying at uni. Obviously it's important that I do this myself and learn how to, so I am only posting the first 2 questions; hopefully with some help I can do the rest myself and maybe check my answers with the brains in this forum.

    Thanks




  2. #2
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    I have no idea what this has to do with least squares.

    But  \sum_{k=1}^n\bar x=\bar x\sum_{k=1}^n 1=n\bar x

    and by definition \bar x={\sum_{k=1}^nx_k\over n}, so  n\bar x=\sum_{k=1}^nx_k.

    Thus  \sum_{k=1}^nx_k=n\bar x=\sum_{k=1}^n\bar x.
    Last edited by matheagle; May 20th 2009 at 09:40 PM.
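
    A concrete instance of this identity (numbers chosen purely for illustration): if x_1=2, x_2=4, x_3=9, then \bar x=5, and \sum_{k=1}^3\bar x=5+5+5=15=2+4+9=\sum_{k=1}^3x_k.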

  3. #3
    Newbie
    Joined
    May 2009
    Posts
    6
    Thanks very much for the help.

    I know what you mean - this doesn't have relevance to least squares - yet. The question is sort of leading me by the hand through the process of deriving least squares estimators.

    This was only the beginning of the questions, and I didn't really think posting all the questions would be the right thing to do.

  4. #4
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    the suspense is killing me

  5. #5
    Newbie
    Joined
    May 2009
    Posts
    6
    Haha!

    I mean I would feel bad posting the entire thing - sort of guilty, like I should do it myself in order to learn.

    I think what throws me off is the sigmas; they confuse me a lot. Looking at your answer, it's really easy to follow, but I just didn't even think to do that.

    The part I am struggling with right now is:


    Just getting my head around what it means is proving to be difficult. Are a and b just constants (could be any number)? Is the idea to try to make the LHS (the bit on the left with the y's) equal the RHS? How would I start off doing this?

  6. #6
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    I figured this was next and I was going to state it yesterday.
    All you need to do is expand the sum.
    BUT, from ......  \sum_{k=1}^nx_k=\sum_{k=1}^n\bar x we have....

     \sum_{k=1}^n(x_k-\bar x) =\sum_{k=1}^nx_k-\sum_{k=1}^n\bar x=0.
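
    A quick numerical check of these two identities - a minimal sketch, assuming numpy is available, with made-up data:

        import numpy as np

        x = np.array([2.0, 4.0, 9.0])   # made-up sample
        xbar = x.mean()

        print(x.sum(), len(x) * xbar)   # 15.0 and 15.0: the sum of the x_k equals n * xbar
        print(np.sum(x - xbar))         # 0.0 (up to floating point): deviations from the mean sum to zero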

  7. #7
    Newbie
    Joined
    May 2009
    Posts
    2

    Deriving the Least Squares Estimates

    Speaking of this topic, I am having trouble with this question:



    Prove that by choosing b_0 and b_1 to minimize \sum_{k=1}^n (y_i - b_0 - b_1x_1)^2 you obtain the least squares estimators, namely:


    b_1=beta_1={\sum_{k=1}^n(x_i-\bar x)(y_i-\bar y)\over \sum_{k=1}^n(x_i-\bar x)^2}


    b_0=beta_0=\bar y-b_i\bar x

    wow that syntax was a pain to work out...

    Thanks for any help.

  8. #8
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    A few corrections....
    you can't use both i and k, only one index
    b_1, not b_i
    x_i, not x_1....


    Quote Originally Posted by Campari View Post

    Prove that by choosing b_0 and b_1 to minimize L=\sum_{i=1}^n (y_i - b_0 - b_1x_i)^2 you obtain the least squares estimators, namely:

    b_1={\sum_{i=1}^n(x_i-\bar x)(y_i-\bar y)\over \sum_{i=1}^n(x_i-\bar x)^2}


    b_0=\bar y-b_1\bar x

    AND most importantly b_1\ne \beta_1. \beta_1 is an unknown parameter and b_1 is a random variable that estimates it.

    IF you are asking how to show these are the LSEs, just differentiate L with respect to the two parameters and set the derivatives equal to zero. This is just Calc I.
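
    A minimal numerical sketch of those closed-form estimators (not part of the original post; it assumes numpy and uses made-up data). It computes b_1 and b_0 from the formulas in the quote and cross-checks them against numpy's own degree-1 least squares fit:

        import numpy as np

        # made-up data for illustration
        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

        xbar, ybar = x.mean(), y.mean()
        b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)   # SSxy / SSxx
        b0 = ybar - b1 * xbar

        slope, intercept = np.polyfit(x, y, 1)   # numpy's least squares line, for comparison
        print(b1, b0)
        print(slope, intercept)                  # should agree with the line above up to rounding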

  9. #9
    Newbie
    Joined
    May 2009
    Posts
    2
    Thanks for your help, MathEagle. I am sort of struggling with the concept, so to get the proof rolling I have to:

    Minimise SSE(\hat {\beta}_0, \hat {\beta}_1) = \sum_{i=1}^n (y_i-\hat {\beta}_0-\hat {\beta}_1x_i)^2

    Thus

    {\partial SSE \over \partial \hat {\beta}_0} = 2 \sum_{i=1}^n (y_i-\hat {\beta}_0-\hat {\beta}_1x_i) (-1) = 0


    {\partial SSE \over \partial \hat {\beta}_1} = 2 \sum_{i=1}^n (y_i-\hat {\beta}_0-\hat {\beta}_1x_i) (-x_i) = 0

    Then solve these for the OLS estimates \hat {\beta}_0 of \beta_0 and \hat {\beta}_1 of \beta_1, in order to obtain the least squares estimators that were given in the question?

    Thanks for your time.
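
    A sketch of the remaining algebra (my own working, following the two equations above): the first equation gives \sum_{i=1}^n y_i - n\hat {\beta}_0 - \hat {\beta}_1\sum_{i=1}^n x_i = 0, i.e. \hat {\beta}_0 = \bar y - \hat {\beta}_1\bar x. Substituting this into the second equation and using \sum_{i=1}^n x_i = n\bar x gives \sum_{i=1}^n x_iy_i - n\bar x\bar y = \hat {\beta}_1\left(\sum_{i=1}^n x_i^2 - n\bar x^2\right), so \hat {\beta}_1 = {\sum_{i=1}^n x_iy_i - n\bar x\bar y\over \sum_{i=1}^n x_i^2 - n\bar x^2} = {\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y)\over \sum_{i=1}^n (x_i-\bar x)^2}, exactly the estimators stated in the question.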

  10. #10
    Senior Member
    Joined
    Jan 2009
    Posts
    404
    For the complete derivation of the least squares estimates, look at
    Simple linear regression - Wikipedia, the free encyclopedia

    I was learning this last week. The proof is very understandable and complete, and it also includes the second derivative test, which is great.

    Good luck!

  11. #11
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    The best way to solve least squares for ANY model is to use matrices.

    Writing the model y=\beta_0+\beta_1x_1+\cdots+\beta_kx_k+\epsilon as Y=X\beta+\epsilon

    where Y is a column vector of your responses, \beta a column vector of ALL of your parameters, and X your design matrix,

    the least squares solution is \hat\beta =(X^tX)^{-1}X^tY.

    Now \hat\beta is unbiased for \beta and SSE = Y^tY-\hat\beta^t X^tY

    and the unbiased estimator of \sigma^2 is MSE=SSE/(n-(k+1)), where k+1 is just the number of parameters.
    Last edited by matheagle; May 22nd 2009 at 06:51 AM.
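
    A minimal numpy sketch of this matrix route (not from the original post; made-up data, and the design matrix carries a leading column of ones for the intercept):

        import numpy as np

        # made-up data for a straight-line model y = beta_0 + beta_1 x + eps
        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

        n = len(y)
        X = np.column_stack([np.ones(n), x])          # design matrix
        beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X^t X)^{-1} X^t Y, without forming the inverse explicitly

        sse = y @ y - beta_hat @ (X.T @ y)            # SSE = Y^t Y - beta_hat^t X^t Y
        mse = sse / (n - X.shape[1])                  # SSE / (n - (k+1))
        print(beta_hat, mse)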

  12. #12
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    But back to your basic model of y=\beta_0+\beta_1x+\epsilon

    the SSxy term can be written three ways since....

     \sum_{k=1}^n(x_k-\bar x) =0 or better yet  \sum_{k=1}^n(y_k-\bar y) =0.

    SSxy can be written as \sum_{k=1}^n (x_k-\bar x) (y_k-\bar y)

    or \sum_{k=1}^n (x_k-\bar x) y_k, which is useful when deriving the statistical properties of the betas,

    or \sum_{k=1}^n x_k(y_k-\bar y).
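
    The equivalence is quick to check (a line of my own working): expanding the first form, \sum_{k=1}^n (x_k-\bar x)(y_k-\bar y)=\sum_{k=1}^n (x_k-\bar x)y_k-\bar y\sum_{k=1}^n (x_k-\bar x)=\sum_{k=1}^n (x_k-\bar x)y_k, since the sum of the deviations is zero; swapping the roles of x and y in the same argument gives the third form.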

  13. #13
    Senior Member
    Joined
    Jan 2009
    Posts
    404
    Quote Originally Posted by matheagle View Post
    and the unbiased estimator of \sigma^2 is MSE=SSE/(n-(k+1)), where k+1 is just the number of parameters.
    Typically, to estimate V(X_i), we use the sample variance S^2 = [1/(n-1)]∑(X_i - X bar)^2.

    Now, by the definition of variance, V(ε_i) = E[(ε_i - E(ε_i))^2], so to estimate \sigma^2 = V(ε_i) in the context of simple linear regression, shouldn't we use MSE = S^2 = [1/(n-2)]∑(ε_i - ε bar)^2? This form looks much more analogous to the formula for the sample variance above (compare the deviation-from-the-mean terms).

    However, I know that the estimator of V(ε_i) [which is the MSE] is based on the residuals e_i, not the errors ε_i. What is the reason behind this? What explains the discrepancy?

    Thank you~
    Last edited by kingwinner; May 23rd 2009 at 02:19 AM.

  14. #14
    MHF Contributor matheagle's Avatar
    Joined
    Feb 2009
    Posts
    2,763
    Thanks
    5
    Nope, this is a totally different animal.

    Here SSE \sim\sigma^2\chi^2_{n-p} where n is the number of observations and p is the number of parameters.


    BUT, if your model is y=mx+\epsilon, then MSE is SSE/(n-1).

    It's the distributions that count.
    We need our t distribution and F's in order to make tests and obtain confidence and prediction intervals.
    Last edited by matheagle; May 23rd 2009 at 07:37 AM.
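
    A small simulation sketch of this claim (my own addition, assuming numpy; the true coefficients and sigma are made up): for the straight-line model, SSE/(n-p) should average out to about \sigma^2 over repeated samples.

        import numpy as np

        rng = np.random.default_rng(0)
        n, sigma = 30, 2.0
        x = np.linspace(0.0, 10.0, n)
        X = np.column_stack([np.ones(n), x])
        p = X.shape[1]                                     # number of parameters (here 2)

        mses = []
        for _ in range(5000):
            y = 1.0 + 0.5 * x + rng.normal(0.0, sigma, n)  # simulate from a known model
            beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
            resid = y - X @ beta_hat
            mses.append(resid @ resid / (n - p))           # MSE = SSE / (n - p)

        print(np.mean(mses), sigma ** 2)                   # the average MSE should be close to sigma^2 = 4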

  15. #15
    Senior Member
    Joined
    Jan 2009
    Posts
    404
    S^2 = [1/(n-1)]∑(X_i - X bar)^2 (estimator for V(X_i), the GENERAL formula for the sample variance used throughout ch. 1-10 in Wackerly, which I believe ALWAYS holds)

    S^2 = [1/(n-2)]∑(Y_i - Y_i hat)^2 (estimate for V(ε_i) = V(Y_i))

    Why are we using Y_i hat here instead of Y bar (the sample mean)? The sample mean Y bar is always the best unbiased estimator of the population mean μ = E(Y_i), so shouldn't we always use Y bar in calculating the sample variance? What makes V(ε_i) so different from ch. 1-10 in Wackerly?

    Thank you~
    Last edited by kingwinner; May 23rd 2009 at 01:19 PM.
