Hi all, this is my first post on these forums. I should make clear now that I'm not after answers, just a nudge in the right direction and hopefully some helpful guidance. I have some stats coursework on the general linear model and I am completely stumped on where to begin. I have made an attempt based on examples I have done previously, but I doubt it is right. I should also say that I am much more of a pure mathematician than a statistician, hence my confusion! Anyway, the question is as follows.

Three possible models were considered:

(i) There is a difference between cholesterol levels for urban and rural males, and the relationship between cholesterol and age also differs between the two groups.

(ii) There is a difference between cholesterol levels for urban and rural males, but the relationship between cholesterol and age is the same for the two groups.

(iii) Both the cholesterol levels and the relationship between cholesterol and age are the same for urban and rural males.

The models are to be fitted in the general linear model framework

y = Xβ + ε

where the notation in the course notes applies.

(a) Give the matrix formulation of each of the three models, explaining all notation and the meaning of the parameters you are trying to estimate. For each model you should give the response vector, the design matrix, the parameter vector and the error vector. You should explain why you have chosen this particular set of parameters. State any assumptions clearly. You should aim to make the parameter systems as similar as possible for the three models: this will make model comparison easier.
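For part (a), one common way to keep the three parameter systems nested is to code area as a 0/1 indicator and add an interaction column. The sketch below builds the three design matrices that way for four rows of the data. The indicator name `u`, the choice of rural as the baseline group, and the helper names `design_row` and `design_matrix` are all assumptions for illustration, not the notation from the course notes.

```python
# Four rows taken from the data in the post: (area, age, cholesterol).
rows = [("rural", 46, 181), ("rural", 52, 228),
        ("urban", 78, 241), ("urban", 51, 225)]


def design_row(area, age, model):
    """One row of the design matrix; u is an assumed 0/1 urban indicator."""
    u = 1 if area == "urban" else 0
    if model == "i":      # separate intercepts and separate slopes
        return [1, u, age, u * age]
    if model == "ii":     # separate intercepts, common slope
        return [1, u, age]
    if model == "iii":    # one common line for both groups
        return [1, age]
    raise ValueError(f"unknown model: {model}")


def design_matrix(rows, model):
    """Stack the design rows for every observation."""
    return [design_row(area, age, model) for area, age, _ in rows]


y = [chol for _, _, chol in rows]      # response vector
X_i = design_matrix(rows, "i")
X_ii = design_matrix(rows, "ii")
X_iii = design_matrix(rows, "iii")

for name, X in [("(i)", X_i), ("(ii)", X_ii), ("(iii)", X_iii)]:
    print("model", name)
    for r in X:
        print("  ", r)
```

With this coding the columns of (iii) are a subset of those of (ii), which are a subset of those of (i), so each simpler model is the larger one with a parameter set to zero.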

(b) Fit model (ii) above to these data using the matrix operations available in Excel (or another computer package).

You should provide:

• parameter estimates

• a table of observed values, fitted values and residuals

• anything else you think might be useful

On the basis of these results, comment on the hypothesis that urban and rural cholesterol levels do not differ. What other analyses do you think would be helpful here?

You should check your results using a specialist statistics package (e.g. MINITAB).

Area   Age  Cholesterol
rural  46   181
rural  52   228
rural  39   182
rural  65   249
rural  54   259
rural  33   201
rural  49   121
rural  76   339
rural  71   224
rural  41   112
rural  58   189
rural  18   137
rural  44   173
rural  33   177
urban  78   241
urban  51   225
urban  43   223
urban  44   190
urban  58   257
urban  63   337
urban  19   189
urban  42   214
urban  30   140
urban  47   196
urban  58   262
urban  70   261
urban  67   356
urban  31   159
urban  21   191
urban  56   197
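For part (b), the usual least squares estimate is (X'X)^(-1) X'y, which in Excel can be built from TRANSPOSE, MMULT and MINVERSE. A minimal sketch of the same matrix steps in plain Python follows, assuming model (ii) is coded with an intercept, a 0/1 urban indicator (rural as baseline) and age. The helper names `transpose`, `matmul` and `inverse` are hypothetical stand-ins for those Excel functions, and the t statistic at the end is only one possible way to look at the urban/rural hypothesis.

```python
# Data from the post: (area, age, cholesterol).
data = [
    ("rural", 46, 181), ("rural", 52, 228), ("rural", 39, 182),
    ("rural", 65, 249), ("rural", 54, 259), ("rural", 33, 201),
    ("rural", 49, 121), ("rural", 76, 339), ("rural", 71, 224),
    ("rural", 41, 112), ("rural", 58, 189), ("rural", 18, 137),
    ("rural", 44, 173), ("rural", 33, 177),
    ("urban", 78, 241), ("urban", 51, 225), ("urban", 43, 223),
    ("urban", 44, 190), ("urban", 58, 257), ("urban", 63, 337),
    ("urban", 19, 189), ("urban", 42, 214), ("urban", 30, 140),
    ("urban", 47, 196), ("urban", 58, 262), ("urban", 70, 261),
    ("urban", 67, 356), ("urban", 31, 159), ("urban", 21, 191),
    ("urban", 56, 197),
]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (fine for a small matrix)."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


# Assumed coding for model (ii): columns are intercept, urban indicator, age.
X = [[1.0, 1.0 if area == "urban" else 0.0, float(age)]
     for area, age, _ in data]
y = [[float(chol)] for _, _, chol in data]     # response as a column vector

Xt = transpose(X)
XtX_inv = inverse(matmul(Xt, X))
beta = matmul(XtX_inv, matmul(Xt, y))          # (X'X)^(-1) X'y
fitted = matmul(X, beta)
resid = [[yi[0] - fi[0]] for yi, fi in zip(y, fitted)]

n, p = len(X), len(X[0])
rss = sum(e[0] ** 2 for e in resid)            # residual sum of squares
s2 = rss / (n - p)                             # estimate of error variance
se = [(s2 * XtX_inv[j][j]) ** 0.5 for j in range(p)]
t_urban = beta[1][0] / se[1]                   # t statistic, urban effect

for label, b, s in zip(["intercept", "urban", "age"], beta, se):
    print(f"{label:10s} estimate {b[0]:9.3f}  s.e. {s:8.3f}")
print(f"t statistic for urban effect: {t_urban:.3f} on {n - p} d.f.")
for (area, age, chol), f, e in zip(data, fitted, resid):
    print(f"{area} {age:3d} {chol:4d} {f[0]:8.2f} {e[0]:8.2f}")
```

Under the usual normal-error assumptions, the t statistic for the urban coefficient would be compared with a t distribution on n - p = 27 degrees of freedom. The printed estimates can then be checked against a regression of cholesterol on the indicator and age in a statistics package.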

So any help, or pointers to references or examples, would be greatly appreciated. As I said, this is my first post, so be gentle!

Many thanks.