Hi all,

I've been trying to model some of my data using either a GLMM or a GEE in R. With a gaussian family and an identity link, I can check the assumptions of normality and homoscedasticity easily: a Q-Q plot and a Shapiro-Wilk test on the residuals for normality, and a plot of residuals against fitted values with a spline for homoscedasticity. However, if I want to use a non-gaussian family and a link other than identity, how do I check that the residuals meet the model's assumptions? I've spent a great deal of time looking around online, but I can't seem to find anything. Specifically, I'd like to use a Gamma family with a log link for my data. Below is a sample of my code and the diagnostic plots I used (which are geared toward checking against a normal distribution):

library(MASS)  # provides glmmPQL

VPGLMM3 <- glmmPQL(VPL ~ Age + Sex + DET + DOC + GS + NND + Month, random = ~1 | FocalID, family = Gamma(link = "log"), data = VPtest)

#Normality

VPresid<-residuals(VPGLMM3)

qqnorm(VPresid)

qqline(VPresid,col=2)

#Homoscedasticity

plot(fitted(VPGLMM3), residuals(VPGLMM3),xlab = "Fitted Values", ylab = "Residuals")

abline(h=0, lty=2)

lines(smooth.spline(fitted(VPGLMM3), residuals(VPGLMM3)))
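
For reference, here is a minimal sketch of one check that might be a better fit for a Gamma model than the normal-theory plots above: Pearson residuals plotted against the fitted linear predictor. It uses simulated data rather than my own, so the data frame sim, the variables y, x and id, and the shape value are all made up for illustration. I'm assuming that residuals(..., type = "pearson") on a glmmPQL fit is a sensible scale to look at, and I'm not sure that it is.

```r
library(MASS)  # glmmPQL (fits via nlme::lme internally)

set.seed(1)

# Hypothetical data: 20 subjects, 10 observations each, Gamma response
n_id  <- 20
n_obs <- 10
sim <- data.frame(
  id = factor(rep(seq_len(n_id), each = n_obs)),
  x  = rnorm(n_id * n_obs)
)
b      <- rnorm(n_id, sd = 0.3)                 # random intercepts
mu     <- exp(1 + 0.5 * sim$x + b[sim$id])      # mean on the response scale (log link)
sim$y  <- rgamma(nrow(sim), shape = 5, rate = 5 / mu)

fit <- glmmPQL(y ~ x, random = ~ 1 | id,
               family = Gamma(link = "log"),
               data = sim, verbose = FALSE)

# Pearson residuals are scaled by the estimated standard deviation,
# so their spread should be roughly constant if the mean-variance
# relationship implied by the family is adequate (assumption).
pres <- residuals(fit, type = "pearson")
eta  <- fitted(fit)                             # fitted values from the underlying lme fit

plot(eta, pres, xlab = "Fitted values", ylab = "Pearson residuals")
abline(h = 0, lty = 2)
lines(lowess(eta, pres), col = 2)               # smoother to reveal any trend
```

If the smoother stays close to zero and the vertical spread looks similar across the range of fitted values, I would read that as no obvious problem, but I'd welcome corrections on whether this is the right diagnostic.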

I also have a small side question: my sigma value (which I believe is a measure equivalent to dispersion) for this model is 1.01. I understand that data are considered overdispersed when this value is over 1 and that overdispersion needs to be accounted for, but is a value of 1.01 really a big deal in that regard?

Any help is appreciated, thanks!

Edit: This page could be helpful, but I'm still trying to parse it http://www.r-bloggers.com/exploratory-data-analysis-quantile-quantile-plots-for-new-yorks-ozone-pollution-data/
