I have data from a questionnaire that asked respondents how they felt on a few variables (for example, quality of sleep) before and after taking a fitness training course. There are 140+ responses.

However, there are several (maybe about 10) respondents who didn't answer the basic questions about their age and/or gender, although they did answer the ordinal scale questions that came later in the questionnaire (for example quality of sleep).

Would it be statistically acceptable if I keep such observations when I am analyzing only one variable at a time (for example a hypothesis test about only the variable quality of sleep)?

Could these missing values (age, gender) be remedied with some imputation method, or must I delete all the observations with missing gender and/or age when I go on to jointly analyze at least one of these variables with one or several other answers in the questionnaire?

Hey osku809.

I took a look at my old notes for missing values [university ones] and the three classifications of missing values are missing completely at random, missing at random, and not missing at random.

Do you know about any of these?

chiro, can you explain missing at random and not missing at random to me?

I dug up my notes and the definitions are as follows:

Missing at random: P(Ri | Y(i,O), Y(i,M), Xi) = P(Ri | Y(i,O), Xi), that is, Ri is conditionally independent of Y(i,M) given Y(i,O) and Xi.

Y(i,M) - Missing values [dependent variable]
Y(i,O) - Observed values [dependent variable]
Ri - Response indicator
Xi - Independent variable

My lecture notes reference the following text: Applied Longitudinal Analysis by Fitzmaurice et al.
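
To make the three classes more concrete, here is a rough simulation sketch in Python. It is not from the notes or the book. Everything in it is an assumption for illustration: the function names, the 0.7 slope linking Y(i,O) to Y(i,M), the logistic form of the missingness probability, and the sample size. It leaves Xi out for simplicity. Under each mechanism it draws the response indicator Ri and then estimates the mean of the possibly-missing variable in two ways: from the complete cases only, and from a straight-line regression on the observed variable.

```python
import math
import random
import statistics


def simulate(mechanism, n=20000, seed=0):
    """Return a list of (y_o, y_m, r) rows; r = 1 means y_m was observed."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y_o = rng.gauss(0.0, 1.0)               # always observed
        y_m = 0.7 * y_o + rng.gauss(0.0, 1.0)   # may go missing; true mean is 0
        if mechanism == "MCAR":
            p_missing = 0.3                      # depends on nothing
        elif mechanism == "MAR":
            p_missing = 1.0 / (1.0 + math.exp(-2.0 * y_o))  # depends on observed value only
        elif mechanism == "NMAR":
            p_missing = 1.0 / (1.0 + math.exp(-2.0 * y_m))  # depends on the missing value itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        r = 0 if rng.random() < p_missing else 1
        rows.append((y_o, y_m, r))
    return rows


def complete_case_mean(rows):
    """Mean of y_m using only the rows where it was observed."""
    return statistics.fmean([y_m for _, y_m, r in rows if r == 1])


def regression_adjusted_mean(rows):
    """Fit y_m = a + b * y_o on complete cases, then average the predictions over all rows."""
    xs = [y_o for y_o, _, r in rows if r == 1]
    ys = [y_m for _, y_m, r in rows if r == 1]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return statistics.fmean([intercept + slope * y_o for y_o, _, _ in rows])


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR", "NMAR"):
        rows = simulate(mechanism)
        print(mechanism,
              round(complete_case_mean(rows), 3),
              round(regression_adjusted_mean(rows), 3))
```

With these settings the complete-case mean should sit close to the true value of 0 under missing completely at random and be pulled clearly below 0 under the other two. The regression-adjusted mean should recover roughly 0 under missing at random, because the missingness there depends only on a value you actually observed. Under not missing at random neither estimate should recover it, which is why that case is the hard one.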