Results 1 to 5 of 5

Math Help - Regression Help - Using Excel

  1. #1
    Newbie
    Joined
    Nov 2006
    Posts
    2

    Question Regression Help - Using Excel

    I need some help with regression analysis and in particular with the figures which Excel displays.

    I have a sheet which lists test scores and I'd like to create a formula for predicting what a candidates first year salary would be based on their test scores.

    Code:
    Candidate Maths IT   Science History Salary
    ----------------------------------------------------
    Joe     78.7  82.5  81.7  95.0  $20,000
    Mary    87.3  85.4  86.6  68.2  $24,000
    Alison  74.6  83.4  79.2  73.1  $19,000
    Sarah   86.8  91.5  90.2  54.2  $24,000
    Alan    63.5  72.4  74.5  74.2  $17,000
    Dane    78.3  62.8  61.0  85.2  $19,000
    Beth    92.5  97.6  95.4  72.3  $27,000
    Eliza   84.7  88.3  87.7  75.5  $23,000
    Corey   79.3  87.2  87.1  82.2  $21,000
    Don     91.3  85.2  86.3  95.4  $26,000
    What I am trying to do is to use regression to see if I can construct a formula to predict a candidates salary based on their test results.

    Salary = A*Maths Score + B*IT Score + C*Science Score + D*History Score

    1. How can I determine if each test score is significant? Excel will state figures such as Multiple R, R Squar, Standard Error, t Stat and P Value but what do they mean!

    2. How can I determine if the results are independent.

    e.g. It's quite likely that Students with high Maths scores could also have high IT scores and thus there could be an error due to both results not being independent.

    Here are the figures from Excel

    Code:
    Regression Statistics
    -----------------------	
    Multiple R	0.98368756
    R Square	0.967641215
    Adjusted R Square	0.941754187
    Standard Error	796.3869571
    Observations	10
    Code:
    ANOVA
    	df	SS	MS	F	Significance F
    Regression	4	94828839.07	23707209.77	37.37938615	0.00064401
    Residual	5	3171160.927	634232.1854		
    Total	9	98000000
    What are both tables telling me? Is this a good test or not?

    Code:
    Coefficients
    Intercept	-8559.315329
    Maths	313.8348267
    IT	-203.3732385
    Science	276.2479898
    History	-12.81029006
    I presume these are the figures I need to use to construct the formula?

    Salary = -8559 + 313.8 * Maths -203.3 * IT + 276.2 * Science - 12.8 * History

    Also on that last table is shown:

    Code:
    	Standard Error
    Intercept	3539.055274
    Maths	41.49049481
    IT	148.8447209
    Science	145.5849295
    History	23.53289212
    
    	t Stat
    Intercept	-2.418531124
    Maths	7.564017449
    IT	-1.366344989
    Science	1.897504026
    History	-0.544356809
    
    	P-value
    Intercept	0.060226461
    Maths	0.000640424
    IT	0.230071381
    Science	0.116235029
    History	0.609583227
    Does this tell me if some of the test scores should or should not be used?

    TIA
    Follow Math Help Forum on Facebook and Google+

  2. #2
    MHF Contributor Quick's Avatar
    Joined
    May 2006
    From
    New England
    Posts
    1,024
    I recommend graphing it, using the average test score and the salary, then let excel pick the line of best fit
    Follow Math Help Forum on Facebook and Google+

  3. #3
    Global Moderator

    Joined
    Nov 2005
    From
    New York City
    Posts
    10,616
    Thanks
    10
    Are these ordered pairs? Or more.

    If they are ordered pairs, I have software that does method of least square. And then I can give you all the results quickly if you provide me with the list of points.
    Follow Math Help Forum on Facebook and Google+

  4. #4
    Newbie
    Joined
    Nov 2006
    Posts
    2
    If I knew what an ordered pair was I could tell you

    This is more of a basics learning exercise. I understand what regression is and what it can be used for. What I don't know is what the figures are telling me. I can't find out if xyz having a value of 0.1 is good or bad; I can't find out if abc having a value of 9.99 is good or bad either.

    Using the example I gave above what do the figures say?

    I think I'm right in assuming that the R Squared figure of 0.967 means that the the relationship between Y and all the X values is 96.7% good?

    What I'm really trying to determine is

    a) Is this a reliable method of estimating salary when only test score results are known.

    b) Which individual test scores will best estimate the salary. In otherwords should I drop History and possibly look to include other test results such as Geography or Art.

    c) Are there two results scores which are corelated and thus causing an error in the figures. e.g. is having Maths and IT a bad combination because it could be correlated and thus 'double counting'?

    I just don't know what the numbers (the excel results std error, t-test, p-value etc.) are telling me, or what I should be looking for.
    Follow Math Help Forum on Facebook and Google+

  5. #5
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    Quote Originally Posted by biscayne View Post
    I need some help with regression analysis and in particular with the figures which Excel displays.

    I have a sheet which lists test scores and I'd like to create a formula for predicting what a candidates first year salary would be based on their test scores.

    Code:
    Candidate Maths IT   Science History Salary
    ----------------------------------------------------
    Joe     78.7  82.5  81.7  95.0  $20,000
    Mary    87.3  85.4  86.6  68.2  $24,000
    Alison  74.6  83.4  79.2  73.1  $19,000
    Sarah   86.8  91.5  90.2  54.2  $24,000
    Alan    63.5  72.4  74.5  74.2  $17,000
    Dane    78.3  62.8  61.0  85.2  $19,000
    Beth    92.5  97.6  95.4  72.3  $27,000
    Eliza   84.7  88.3  87.7  75.5  $23,000
    Corey   79.3  87.2  87.1  82.2  $21,000
    Don     91.3  85.2  86.3  95.4  $26,000
    What I am trying to do is to use regression to see if I can construct a formula to predict a candidates salary based on their test results.

    Salary = A*Maths Score + B*IT Score + C*Science Score + D*History Score

    1. How can I determine if each test score is significant? Excel will state figures such as Multiple R, R Squar, Standard Error, t Stat and P Value but what do they mean!

    2. How can I determine if the results are independent.

    e.g. It's quite likely that Students with high Maths scores could also have high IT scores and thus there could be an error due to both results not being independent.

    Here are the figures from Excel

    Code:
    Regression Statistics
    -----------------------    
    Multiple R    0.98368756
    R Square    0.967641215
    Adjusted R Square    0.941754187
    Standard Error    796.3869571
    Observations    10
    Code:
    ANOVA
        df    SS    MS    F    Significance F
    Regression    4    94828839.07    23707209.77    37.37938615    0.00064401
    Residual    5    3171160.927    634232.1854        
    Total    9    98000000
    What are both tables telling me? Is this a good test or not?

    Code:
    Coefficients
    Intercept    -8559.315329
    Maths    313.8348267
    IT    -203.3732385
    Science    276.2479898
    History    -12.81029006
    I presume these are the figures I need to use to construct the formula?

    Salary = -8559 + 313.8 * Maths -203.3 * IT + 276.2 * Science - 12.8 * History

    Also on that last table is shown:

    Code:
       Standard Error
    Intercept    3539.055274
    Maths    41.49049481
    IT    148.8447209
    Science    145.5849295
    History    23.53289212
     
       t Stat
    Intercept    -2.418531124
    Maths    7.564017449
    IT    -1.366344989
    Science    1.897504026
    History    -0.544356809
     
       P-value
    Intercept    0.060226461
    Maths    0.000640424
    IT    0.230071381
    Science    0.116235029
    History    0.609583227
    Does this tell me if some of the test scores should or should not be used?

    TIA
    The R values are telling you that the model is highly predictive.

    The p-values are telling you that most of the prediction comes from the
    intercept and the maths score.

    Rerun the model using only the intercept and maths scores in the model

    Look at the residuals of this and the rerun model, are they significantly different
    (they will be worse with the simpler model, so you will have to decide if the
    difference is significant)

    RonL
    Follow Math Help Forum on Facebook and Google+

Similar Math Help Forum Discussions

  1. Ncaaf multiple regression/ stepwise regression
    Posted in the Advanced Statistics Forum
    Replies: 0
    Last Post: August 10th 2011, 07:21 AM
  2. Regression SS in multiple linear regression
    Posted in the Advanced Statistics Forum
    Replies: 0
    Last Post: June 20th 2009, 01:23 PM
  3. excel
    Posted in the Math Software Forum
    Replies: 8
    Last Post: April 23rd 2008, 08:48 AM
  4. excel
    Posted in the Math Software Forum
    Replies: 1
    Last Post: January 21st 2008, 06:24 AM
  5. Excel
    Posted in the Math Software Forum
    Replies: 1
    Last Post: August 13th 2007, 06:26 PM

Search Tags


/mathhelpforum @mathhelpforum