Math Help - Multiple regression analysis and qualitative variables

  1. #1
    Junior Member
    Joined
    Oct 2010
    Posts
    36

    Multiple regression analysis and qualitative variables

    Hello all,

    I am currently analysing a data set on household energy usage and trying to build a statistical model for predicting the daily energy consumption of any house. So far I've determined that house type, household occupancy and socio-economics are the key predictor variables. (Note that socio-economic status in the UK is often measured using the ACORN system, a categorical ranking in which A is basically the highest ranking (rich people), with wealth decreasing as you go through the alphabet.)

    So basically I have one quantitative predictor variable (occupancy, the number of people in the house) and two qualitative (categorical) variables: house type (detached, semi-detached, flat or terraced) and ACORN group.

    At some point I want to perform a multiple regression analysis on the data and derive an equation which predicts y (household energy consumption) from the three variables.

    Right - so here's my question: for the qualitative variables, is it necessary to have data for ALL the possible household types in order to perform a regression analysis? For example, I have households with all possible occupancies 1-6, but I DO NOT HAVE data for a detached 1 person household in ACORN group B (or in many other ACORN groups either). Does this matter? Do I need a sample so large that it covers every combination that is possible with the categorical variables? (Many hundreds of possible combinations in this instance...)

    Many Thanks
    -Rob

  2. #2
    MHF Contributor
    Joined
    May 2010
    Posts
    1,028
    Thanks
    28

    Re: Multiple regression analysis and qualitative variables

    I assume your regression model is of the form:

    Y_i = a + b_1X_{1i} + b_2X_{2i} + \text{stuff}

    where X_1 and X_2 are the categorical variables.
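
    Since a categorical variable with several levels enters a regression as a set of 0/1 indicator (dummy) variables, the model can be written out more fully. The version below is only a sketch: the choice of detached and ACORN A as baseline levels, and the symbols H, G, c and e, are assumptions made here for illustration and do not come from the thread.

```latex
Y_i = a + c\,\mathrm{occ}_i
      + \sum_{h \neq \mathrm{detached}} b_{1h}\, H_{hi}
      + \sum_{g \neq \mathrm{A}} b_{2g}\, G_{gi}
      + e_i,
\qquad
H_{hi} = \begin{cases} 1 & \text{if house } i \text{ is of type } h \\ 0 & \text{otherwise} \end{cases}
\quad
G_{gi} = \begin{cases} 1 & \text{if house } i \text{ is in ACORN group } g \\ 0 & \text{otherwise} \end{cases}
```

    Each non-baseline level gets one coefficient, so the number of parameters grows with the number of levels, not with the number of combinations of levels.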


    My thinking:

    You don't need data for all combinations, because the (basic) OLS model assumes that the effect of each variable on E(Y) is independent of the values of the other variables (unless you use combined indicators, i.e. interaction terms, but you haven't said anything to suggest you intend to do that).

    Of course the assumption of independent effects is a strong one, and you will have less chance of detecting any violation of it if you don't collect data from all combinations of categories.
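
    To make the point concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the effect sizes, the three ACORN groups, the baseline levels and the helper names (true_energy, design_row, solve, fit_ols) are invented for illustration, and the data are generated without noise from a purely additive model. It fits the dummy-coded model with every detached/ACORN B household left out, then predicts a detached 1 person household in group B.

```python
from itertools import product

# Hypothetical levels and effect sizes, invented for this illustration.
HOUSES = ["detached", "semi", "flat", "terraced"]   # baseline: detached
ACORNS = ["A", "B", "C"]                            # baseline: A
HOUSE_EFFECT = {"detached": 0.0, "semi": -1.0, "flat": -3.0, "terraced": -2.0}
ACORN_EFFECT = {"A": 0.0, "B": -0.5, "C": -1.0}


def true_energy(occ, house, acorn):
    """Assumed additive, noise-free data-generating process (no interactions)."""
    return 5.0 + 2.0 * occ + HOUSE_EFFECT[house] + ACORN_EFFECT[acorn]


def design_row(occ, house, acorn):
    """Intercept, occupancy, then one 0/1 dummy per non-baseline level."""
    row = [1.0, float(occ)]
    row += [1.0 if house == h else 0.0 for h in HOUSES[1:]]
    row += [1.0 if acorn == a else 0.0 for a in ACORNS[1:]]
    return row


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Observe every combination EXCEPT detached houses in ACORN group B.
cells = [c for c in product(range(1, 7), HOUSES, ACORNS)
         if not (c[1] == "detached" and c[2] == "B")]
X = [design_row(*c) for c in cells]
y = [true_energy(*c) for c in cells]
beta = fit_ols(X, y)

# Predict a combination that never appeared in the fitted data.
missing = (1, "detached", "B")
prediction = sum(b * x for b, x in zip(beta, design_row(*missing)))
print(len(cells), "of", 6 * len(HOUSES) * len(ACORNS), "combinations observed")
print("predicted:", round(prediction, 6), "true:", true_energy(*missing))
```

    The fit only needs each level of each variable to appear somewhere (and the observed combinations to link the levels together), which is why the empty cell does no harm here. If the real data contained an interaction, say detached houses in group B behaving differently from what the two separate effects imply, this model would still produce a prediction for that cell, but there would be no data with which to notice that it was wrong.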

    Does that make sense?

  3. #3
    Junior Member
    Joined
    Oct 2010
    Posts
    36

    Re: Multiple regression analysis and qualitative variables

    Hello,

    Yes, that does make sense, thanks, and you're correct in that the model will be of the form y = a + b1x1 + b2x2 + ... + bnxn. The predictor variables are independent of each other, so from what you posted it appears as though I don't need to have all possible combinations of categorical variables covered (thankfully!)

    Thanks for the reply.

    -Rob
