general: when are the results of t-tests interpretable?
is it sufficient if the distribution of the statistic under H0 is normal? or do the data that we feed into the test also need to be reasonably gaussian?
t-tests are interpretable when the mathematical assumptions are met and when the data and results also make sense in the context of the experiment or inference.
Statistical tests are based on assumptions and if you put garbage in, you'll get garbage out as well.
The t statistic has a specific distribution (the Student t distribution), and this result depends on (n-1)s^2/sigma^2 having roughly a chi-square distribution, the sample mean having roughly a normal distribution, and s^2 being independent of the sample mean. If these are not met, you can't use a t distribution.
The Central Limit Theorem can give some sort of guarantee for the distribution of the sample mean (in fact the CLT is a big part of why frequentist statistics works in practice), and in some respects it does the same for the variance (recall that a chi-square is a sum of squared standard normals). But this depends on the underlying distribution: really skewed distributions will require a lot more samples to get close to these assumptions than less skewed ones.
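To make that concrete, here is a minimal numpy simulation sketch (the Exponential(1) population is a hypothetical choice, not from the thread) showing the skewness of the sample mean shrinking as n grows, which is the CLT at work:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean_skewness(sampler, n, reps=20_000):
    """Estimate the skewness of the sampling distribution of the mean
    for samples of size n drawn by `sampler`."""
    means = sampler(size=(reps, n)).mean(axis=1)
    z = (means - means.mean()) / means.std()
    return float((z ** 3).mean())

# Exponential(1) has skewness 2; the CLT drives the skewness of the
# sample mean toward 0, roughly like 2 / sqrt(n).
for n in (5, 20, 100):
    print(n, round(sample_mean_skewness(rng.exponential, n), 3))
```

A heavily skewed population would need larger n before the sample-mean skewness gets anywhere near zero, which is exactly the point above.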
thanks for your reply.
I guess the question is how to test whether you are putting garbage in! How can the experimenter know their data is garbage? Isn't that why we do tests? If we knew the results already, we wouldn't be doing any tests in the first place!
to be more specific and give an example:
say we have a sample of 20 numbers and we want to test whether their mean is significantly different from zero, and we prefer to use a t-test. how would you go about doing this test?
I can see that one could opt for non-parametric alternatives, but note that you would probably lose power by switching to a non-parametric test if the required conditions for the t-test are in fact met.
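For reference, a minimal sketch of the mechanics of both options in Python using scipy (the 20 numbers here are simulated and purely hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=0.5, scale=1.0, size=20)  # hypothetical sample of 20

# One-sample t-test of H0: population mean = 0
result = stats.ttest_1samp(x, popmean=0.0)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")

# Non-parametric alternative for the same question (Wilcoxon
# signed-rank), usually with somewhat less power when the
# t-test's assumptions actually hold:
w = stats.wilcoxon(x)
print(f"Wilcoxon signed-rank p = {w.pvalue:.4f}")
```

Whether those p-values are trustworthy is exactly the question in this thread; the code only shows the mechanics.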
Well, the statistical techniques don't really care about how you got the data or whether it accurately represents what it should: they only care that the data meet the statistical and mathematical assumptions for the analysis.
For the t-test I've mentioned the sample mean and variance being independent of one another (and I'll come back to this), but probably the most important thing to consider is whether each observation is really independent of all the others.
If your measurements have strong dependencies, then nearly all the common statistical techniques will not work, because they assume each observation is independent of the others: the value of one is not related to the values of the rest, or the relationship is so weak it can be treated as independence.
Think of an <x, y, z> vector: if I change x then y and z won't change, but if, for example, y = f(x, z), then changing x or z will affect y.
Now, in terms of the mean and variance being independent: for many underlying population distributions they are, but not all. The classic example is the Poisson distribution, whose variance equals its mean; in this case you can't use something like a t-test. For a Normal population, on the other hand, the sample mean and variance are exactly independent, which is the ideal case for the t-test.
Experimentally, you need to make sure that the way you collect the observations doesn't create dependencies between them, since nearly all the statistics that scientists and engineers use rely on independent observations; only really specialized methods deal with dependence.
I very much agree with you on the independence point. I am pretty confident that my data are always independent (from physically separate sources at least).
Let's go to the more specific question now! Could you tell me what set of checks you would do on a set of numbers whose mean you wish to compare with zero using a t-test?
Well, in terms of that question, you can't really test the distributional assumptions on s^2 and x_bar (the sample variance and sample mean) directly, unless you decide to graph the histogram. The x_bar assumption is taken care of by the CLT, as is (approximately) the one on s^2, but what you can do is look at the sample size (the bigger it is, the better these distributional approximations) and at the histogram itself.
If the underlying distribution is very far from normal and highly skewed, then it will take longer for the CLT to do its magic (i.e. more samples are needed), and in some cases it may not be practical, so other techniques need to be resorted to (like non-parametric tests).
Quantitatively, skewness is measured by the third standardized moment (and heavy tails by the kurtosis, the fourth), and you could do a hypothesis test on these to see whether you have evidence they are too large to be confident using the CLT (and hence the t-test analysis). You would have to check the literature on this though.
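scipy does ship moment-based tests along these lines; a sketch on a hypothetical skewed sample (the exponential data here are simulated, not real measurements):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200)  # hypothetical skewed sample

# H0 in each case: the moment matches that of a normal distribution.
sk = stats.skewtest(x)       # third standardized moment
ku = stats.kurtosistest(x)   # fourth standardized moment
omni = stats.normaltest(x)   # D'Agostino-Pearson omnibus (combines both)
print(f"skewtest p = {sk.pvalue:.2g}")
print(f"kurtosistest p = {ku.pvalue:.2g}")
print(f"normaltest p = {omni.pvalue:.2g}")
```

A tiny p-value here is evidence the sample is too skewed or heavy-tailed to lean on the CLT at this sample size.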
The other main thing is the relationship between the mean and variance of the distribution: as pointed out before, the Poisson has these equal, and many other distributions have relationships between the two (like the exponential, and so on). If s^2 is a function of x_bar (or the reverse) then this won't work (i.e. it won't meet the assumptions).
In relation to the above advice, one avenue to look at is assessing whether the histogram fits particular parametric forms of common distributions; you can then check whether the first two central moments of the fitted form are independent quantities (i.e. you can't write a = f(b) or b = f(a), where a and b are the first two central moments).
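One way to sketch that fitting step in Python with scipy (the gamma sample and the candidate families are hypothetical choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=1.5, size=300)  # hypothetical skewed sample

# Shapiro-Wilk tests the normal parametric form directly.
sw_stat, sw_p = stats.shapiro(x)
print(f"Shapiro-Wilk p = {sw_p:.2g}")  # small -> normal form fits poorly

# For another candidate family, fit its parameters and run a KS test.
# Caveat: KS p-values are optimistic when the parameters were estimated
# from the same data (a Lilliefors-type correction would be needed).
shape, loc, scale = stats.gamma.fit(x, floc=0)
ks = stats.kstest(x, "gamma", args=(shape, loc, scale))
print(f"KS vs fitted gamma p = {ks.pvalue:.2g}")
```

Once a family fits, you can read off whether its first two central moments are linked (for the gamma they are: both are functions of shape and scale).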
There are always more general tests to resort to anyway, but the above are some things that come to mind.