I am building a statistical model for domestic water use, in the form of a multiple regression equation. So far I have found that the type of house (detached, semi-detached, flat or terraced), whether the household has children (yes/no), socio-economic status (wealthy/not wealthy) and region of the country (14 regions) are all significant predictors of household water demand (p<0.05 in each case). Each of these independent variables is categorical, not quantitative, and so will have to be entered into the model as dummy variables.
However, if I include all of these categorical predictors, I will end up with a regression equation containing a lot of dummy variables, especially for the 'region' variable, which has 14 categories!
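To make the count concrete, here is a rough sketch of the tally. It assumes the usual reference-level coding, where a variable with k categories needs k - 1 dummy variables and the omitted category acts as the baseline. The variable labels are just illustrative names, not columns from my actual data.

```python
# Number of categories for each predictor, as described above.
# The dictionary keys are illustrative labels, not names from a real dataset.
levels = {
    "house_type": 4,   # detached, semi-detached, flat, terraced
    "children": 2,     # yes / no
    "wealthy": 2,      # wealthy / not wealthy
    "region": 14,      # 14 regions
}

# Assumption: reference-level (treatment) coding, so a variable with
# k categories needs k - 1 dummies; the omitted category is the baseline.
dummies = {name: k - 1 for name, k in levels.items()}
total_dummies = sum(dummies.values())

print(dummies)
print("dummy variables:", total_dummies)
print("coefficients including intercept:", total_dummies + 1)
```

Under that coding the model would have 18 dummy variables (13 of them for region), or 19 coefficients once the intercept is counted.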
Is this too many? Does it matter how many dummy variables I have in a multiple regression equation?
Thanks
-Rob