Hello,

As a non-statistician with a scientific background, I have to build a model with a binary outcome (survival or not).

The covariates (a mix of continuous and discrete variables) are collected each year on a given population.

For various reasons, we cannot collect the data every year for every person.

For example, the data may be available for the following person/year combinations:

    Person    2006    2007    2008    2009    2010
    1         Yes     Yes     Yes     No      No
    2         No      No      Yes     Yes     Yes
    3         Yes     Yes     No      No      Yes
    4         Yes     No      Yes     Yes     Yes
    5         Yes     Yes     Yes     Yes     Yes
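
To make the structure concrete, here is a minimal base R sketch of how I picture this availability pattern in "long" format, with one row per observed person-year. The object and column names (`avail`, `long`, `person`, `year`, `observed`) are only my own illustration, not the names in my real data.

```r
# Availability pattern copied from the table above (TRUE = data collected).
years <- 2006:2010
avail <- matrix(
  c(TRUE,  TRUE,  TRUE,  FALSE, FALSE,
    FALSE, FALSE, TRUE,  TRUE,  TRUE,
    TRUE,  TRUE,  FALSE, FALSE, TRUE,
    TRUE,  FALSE, TRUE,  TRUE,  TRUE,
    TRUE,  TRUE,  TRUE,  TRUE,  TRUE),
  nrow = 5, byrow = TRUE
)

# All person-year combinations; expand.grid varies `person` fastest,
# which matches the column-major order of as.vector(avail).
long <- expand.grid(person = 1:5, year = years)
long$observed <- as.vector(avail)

# Keep only the combinations that were actually observed,
# sorted so that each person's rows are contiguous.
long <- long[long$observed, c("person", "year")]
long <- long[order(long$person, long$year), ]
rownames(long) <- NULL

# Number of observed years per person: the clusters are unbalanced.
table(long$person)
```

In the real data, each of these rows would also carry the covariates measured that year.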

I have already fitted a logistic regression that ignores the correlation between observations.

Since then I have read a few articles on correlated data, and I think I need to account for this correlation in my model, because measurements on the same person are correlated (repeated measures).

My questions are the following:

----------------------------------------------

+ Which class of model should I use? I think I should use Generalized Estimating Equations, since I'm interested in the survival probability as a function of the covariates.

+ If several methods could work, do you have a reference to a text describing them?

+ Which function should I use in R to obtain the regression coefficients? I have seen a few packages that deal with correlated data (geepack, MCMCglmm, lme4, glmmAK, ...).

+ Where can I find documents that explain the R functions and the models behind them?
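
To make the GEE option concrete, this is roughly the kind of call I have in mind, using `geeglm()` from geepack on simulated data. Everything about the data is hypothetical: the covariates `age` and `smoker`, the outcome `survived`, the sample size and the coefficients are made up for illustration, and I am assuming one binary outcome per observed person-year. The exchangeable working correlation is also just a guess on my part.

```r
library(geepack)

set.seed(1)

# Hypothetical long-format data: one row per observed person-year.
n_person <- 200
d <- expand.grid(person = seq_len(n_person), year = 2006:2010)
d <- d[runif(nrow(d)) > 0.3, ]        # drop roughly 30% of person-years at random
d <- d[order(d$person, d$year), ]     # geeglm expects each person's rows to be contiguous

d$age      <- rnorm(nrow(d), mean = 50, sd = 10)   # made-up continuous covariate
d$smoker   <- rbinom(nrow(d), 1, 0.3)              # made-up discrete covariate
d$survived <- rbinom(nrow(d), 1,
                     plogis(2 - 0.03 * d$age - 0.5 * d$smoker))

# Marginal (population-averaged) logistic model with an
# exchangeable working correlation within person.
fit <- geeglm(survived ~ age + smoker,
              id = person, data = d,
              family = binomial, corstr = "exchangeable")

summary(fit)
```

As far as I understand, the subject-specific counterpart would be a random-intercept logistic model, for example with `glmer()` from lme4, but I am not sure which of the two fits my problem better.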

Your answer and comments would be of great use. Thank you.