Hello,

As a non-statistician with a scientific background, I have to build a model with a binary outcome (survival or not).

The covariates (a mix of continuous and discrete variables) are collected each year on a given population.

For various reasons, we cannot collect the data every year for every person.
For example, the data may be available for the following person/year combinations:

          Year
Person    2006   2007   2008   2009   2010
1         Yes    Yes    Yes    No     No
2         No     No     Yes    Yes    Yes
3         Yes    Yes    No     No     Yes
4         Yes    No     Yes    Yes    Yes
5         Yes    Yes    Yes    Yes    Yes
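
To make the layout concrete, here is a small sketch in R of the same availability pattern in long format, i.e. one row per person and year for which data exist. The object and column names (available, long, person, year) are placeholders I made up for this example, not my real variables.

```r
# Availability pattern from the table above: TRUE = data collected that year.
years <- 2006:2010
available <- rbind(
  c(TRUE,  TRUE,  TRUE,  FALSE, FALSE),  # person 1
  c(FALSE, FALSE, TRUE,  TRUE,  TRUE),   # person 2
  c(TRUE,  TRUE,  FALSE, FALSE, TRUE),   # person 3
  c(TRUE,  FALSE, TRUE,  TRUE,  TRUE),   # person 4
  c(TRUE,  TRUE,  TRUE,  TRUE,  TRUE)    # person 5
)

# All (person, year) combinations; person varies fastest, which matches the
# column-major order of as.vector(available).
long <- expand.grid(person = 1:5, year = years)

# Keep only the combinations that were actually observed.
long <- long[as.vector(available), ]
long <- long[order(long$person, long$year), ]
rownames(long) <- NULL

# Number of observed years per person (the panel is unbalanced).
table(long$person)
```

The covariates and the outcome would then be extra columns of this data frame, with person as the grouping (cluster) variable.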


I have already fitted a logistic regression that ignores the correlation between observations.
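
For reference, the fit was of roughly the following form. This is only a sketch on simulated stand-in data: the covariates age and treated, the outcome survived, and all numbers are hypothetical, chosen just so the example runs.

```r
set.seed(1)

# Simulated placeholder data (NOT my real data): one continuous and one
# discrete covariate, plus a binary outcome.
n <- 200
d <- data.frame(
  age     = rnorm(n, mean = 60, sd = 10),
  treated = rbinom(n, size = 1, prob = 0.5)
)
d$survived <- rbinom(n, size = 1,
                     prob = plogis(-0.05 * (d$age - 60) + 0.5 * d$treated))

# Ordinary logistic regression: every row is treated as independent, so
# repeated measurements on the same person are not accounted for.
fit <- glm(survived ~ age + treated, family = binomial, data = d)
summary(fit)$coefficients
```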

Since then I have read a few articles on correlated data, and I think I have to account for this correlation in my model, since measurements on the same person are correlated (repeated measures).

My questions are the following:
----------------------------------------------
+ Which class of model should I use? I think I should use Generalized Estimating Equations (GEE), since I'm interested in the survival probability as a function of the covariates.
+ If several methods could work, do you have a reference to a text describing them?

+ Which function should I use in R to obtain the regression coefficients? I have seen that a few packages deal with correlated data (geepack, MCMCglmm, lme4, glmmAK, ...).
+ Where can I find documents that explain these R functions and the models behind them?


Your answers and comments would be of great use. Thank you.