# Thread: Why is this a case of collinearity?

1. ## Why is this a case of collinearity?

My model is as follows:

reg lhourpay edage ... hiquala_1 hiquala_2 hiquala_3 hiquala_4 hiquala_6 hiquala_7 hiquala_8 hiquala_9 doc pgce other dk hiquala1_doc hiquala1_pgce hiquala1_other hiquala1_dk

Where hiquala = highest qualification:
hiquala_1 = Higher Degree
hiquala_2 = First Degree
hiquala_3 = Diploma in HE
hiquala_4 = HE below Degree
hiquala_5 = A-Levels (default)
hiquala_6 = GCSEs
hiquala_7 = Below GCSE
hiquala_8 = No Qual
hiquala_9 = Don't Know

hiquala1_doc = Higher Degree * Doctorate
hiquala1_ma = Higher Degree * MA (default)
hiquala1_pgce = Higher Degree * PGCE
hiquala1_other = Higher Degree * Other
hiquala1_dk = Higher Degree * Don't Know

I know that the correct answer is to leave out the standalone dummy variables for doc, pgce, other and dk, but I'm confused as to why. For instance, a model such as:

y = a + b1*female + b2*child + b3*(female*child)

...is apparently fine. So why is it different in my case? Thanks to anyone that can clarify this for me!

2. Originally Posted by Raving
My model is as follows: [...] So why is it different in my case?

If you are using EViews, you can have collinear circles drawn and it will tell you which variables are collinear.

3. Originally Posted by Raving
My model is as follows: [...] So why is it different in my case?

I'm having a hard time figuring out exactly how many predictors you have and what they are. There may be more than one collection of variables that you can throw out in order to fix this. Chucking out main effects seems like a bad plan in terms of interpretability. Either your model is overparameterized or some effects are confounded.
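
To make the confounding concrete, here is a minimal sketch in plain Python. It assumes something the thread does not confirm: that doc, pgce, other and dk can only equal 1 for people whose hiquala_1 equals 1 (i.e. the subtype is only recorded for Higher Degree holders). Under that assumption, doc and hiquala_1*doc are the same column, so the design matrix loses a rank. The toy rows and the helper names `matrix_rank` and `design` are made up for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination on Fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this column adds nothing new
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design(pairs):
    """Design matrix with columns: constant, x1, x2, x1*x2."""
    return [[1, x1, x2, x1 * x2] for x1, x2 in pairs]


# female/child case: all four combinations of the two dummies occur.
free = design([(0, 0), (0, 1), (1, 0), (1, 1)])

# hiquala_1/doc case (ASSUMED nesting): doc = 1 never occurs with hiquala_1 = 0.
nested = design([(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)])

rank_free = matrix_rank(free)
rank_nested = matrix_rank(nested)
print(rank_free, rank_nested)
```

In the female/child model the four columns are linearly independent, so all four coefficients can be estimated. In the nested case the third and fourth columns are identical, which is why one of them has to go. Whether the main effect or the interaction is the one dropped does not change the fit, only the labelling of the coefficient.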