Ensuring test assumptions are met in a GLMM or GEE when using non-gaussian families

Sep 2014
Hi all,

I've been trying to analyse some of my data using either a GLMM or a GEE in R. With a Gaussian family and an identity link, I can easily check the assumptions of normality and homoscedasticity: a Q-Q plot and a Shapiro-Wilk test on the residuals for normality, and a plot of residuals against fitted values with a spline for homoscedasticity. However, if I want to use a non-Gaussian family and a link other than identity, how do I check that the residuals meet the assumptions? I've spent a great deal of time looking around online, but I can't seem to find anything. Specifically, I'm thinking I'd like to use a Gamma family with a log link for my data. I'll provide a sample of my code and the diagnostic plots I used (which are geared to testing against a normal distribution) below:



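# Residuals vs. fitted values for the fitted model VPGLMM3
plot(fitted(VPGLMM3), residuals(VPGLMM3), xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, lty = 2)  # reference line at zero
lines(smooth.spline(fitted(VPGLMM3), residuals(VPGLMM3)))  # spline to reveal any trend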
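
For anyone who wants to reproduce the Gaussian checks described above, here is a minimal sketch. The data and VPGLMM3 itself aren't shown in this thread, so it assumes simulated data and a plain lm() fit as a stand-in for the mixed model. The names x, y, fit, res and sw are made up for illustration.

```r
# Simulated stand-in data (assumption: the real data are not available here)
set.seed(1)
n <- 200
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)

# Plain Gaussian linear model as a stand-in for the mixed model
fit <- lm(y ~ x)
res <- residuals(fit)

# Normality: Q-Q plot plus Shapiro-Wilk test on the residuals
qqnorm(res)
qqline(res)
sw <- shapiro.test(res)
print(sw$p.value)

# Homoscedasticity: residuals vs. fitted values with a smoothing spline
plot(fitted(fit), res, xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, lty = 2)
lines(smooth.spline(fitted(fit), res))
```

With a real mixed model the same calls would be applied to its residuals() and fitted() values instead.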

I also have a small side question: my sigma value (which I believe is equivalent to a dispersion measure) for this model is 1.01. I know that data are overdispersed when the value is above 1, and that this needs to be accounted for, but is a value of 1.01 really a big deal in that regard?
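
As a point of reference for the dispersion question, one common dispersion statistic is the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below is only an illustration: it assumes simulated Poisson counts and a plain glm(), not the model from this thread, and whether the sigma reported above is computed the same way is not confirmed here. The names counts, fit_pois and dispersion are hypothetical.

```r
# Simulated Poisson data with no overdispersion built in (assumption)
set.seed(2)
n <- 500
x <- runif(n)
counts <- rpois(n, lambda = exp(0.5 + x))

# Poisson GLM with a log link
fit_pois <- glm(counts ~ x, family = poisson(link = "log"))

# Pearson chi-square statistic divided by the residual degrees of freedom
pearson <- residuals(fit_pois, type = "pearson")
dispersion <- sum(pearson^2) / df.residual(fit_pois)
print(dispersion)
```

Because the simulated data are generated without overdispersion, any departure of this ratio from 1 is sampling noise.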

Any help is appreciated, thanks!

Edit: This page could be helpful, but I'm still trying to parse it http://www.r-bloggers.com/exploratory-data-analysis-quantile-quantile-plots-for-new-yorks-ozone-pollution-data/


MHF Helper
Sep 2012
Hey StickyFishy.

What are the requirements for residuals in a GLMM or GEE?
Sep 2014
Well, I'm a bit confused about this part as well.

I believe that if you're fitting a GLMM, then the following need to hold:
1. Residuals of the random factor are normally distributed.
2. Residuals from the model output must fit the distribution of the given family.
3. Residuals from the model output must show variance in accordance with the family and link function.

As for GEEs, I think it's pretty much the same, except that the distribution of the random factor residuals doesn't matter... but I don't feel very sure anymore. I'm finding it extremely hard to find information online about this stuff. Essentially, I just want to know how I should test my GLMM's residuals given that I specify, say, an inverse Gaussian family with a log link. It seems I need to use glmmPQL (from the MASS package) in my analysis, and it doesn't seem to report the other aids for assessing model fit, such as AIC, BIC, and logLik, that are given when using lme4.
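
One way the residual checks in points 2 and 3 above are often approached is with deviance residuals plotted against the linear predictor. The sketch below is a rough illustration, not a definitive answer: it assumes simulated Gamma data and a plain glm() with a log link, so it leaves out the random effects (point 1) entirely. The names mu, shape, fit_gamma, dev_res and eta are made up for the example.

```r
# Simulated Gamma data with a log-linear mean (assumption: no random effects)
set.seed(3)
n <- 300
x <- runif(n)
mu <- exp(1 + 0.8 * x)
shape <- 5
y <- rgamma(n, shape = shape, rate = shape / mu)  # mean mu, variance mu^2 / shape

# Gamma GLM with a log link
fit_gamma <- glm(y ~ x, family = Gamma(link = "log"))

# Deviance residuals and the linear predictor (link scale)
dev_res <- residuals(fit_gamma, type = "deviance")
eta <- predict(fit_gamma, type = "link")

# Variance check: the spread around zero should look roughly constant
plot(eta, dev_res, xlab = "Linear predictor", ylab = "Deviance residuals")
abline(h = 0, lty = 2)
lines(smooth.spline(eta, dev_res))

# Distribution check (a heuristic): deviance residuals are often roughly
# normal when the family and link are adequate
qqnorm(dev_res)
qqline(dev_res)
```

The point of using deviance (or Pearson) residuals rather than raw ones is that they are scaled by the family's variance function, so a Gamma fit is not judged against a constant-variance normal reference on the raw scale.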