Statistical analysis of MD trajectory

Dear all,

This is a fairly basic question about using statistical methods on an MD trajectory.

I have run an MD simulation of a protein molecule (P) for 30 ns using two force fields (F1 and F2). I have taken the RMSD from both trajectories (vectors rmsd1 and rmsd2) and computed the average RMSDs (avg1 and avg2) and the standard deviations (std1 and std2).

Now, if I want to compare these trajectories and show that the sample means of the RMSD are equal (avg1 = avg2), what statistical analysis do I need to conduct? Looking at the average values, the difference (avg1 - avg2) is small. Is there a statistical way to show that the difference is not significant? And is comparing the sample means alone sufficient, or do I also need to compare the standard deviations?

Please let me know if there are any other suggestions.

Thank you

Gurunath

Re: Statistical analysis of MD trajectory

Hey gurukatagi.

In the statistical literature, this is basically a two-sample t-test. Depending on the nature of the data, you may have to look at several kinds of tests that all do the same thing intuitively but make different assumptions about the data and the experiment.

If you have a large enough sample size, then a t-test is going to be your best bet.

You have three kinds of t-tests: one assumes the bare minimum (un-equal variances, independent samples and observations, enough observations, sample variance is roughly chi-square), another assumes equal variances, and another assumes that there is a link between the two data sets.

The second is called a pooled test, the third is a paired test and the first is an un-pooled/un-paired test.

You can use a variety of statistical packages to test whether there is evidence that the means are equal or not. You can also do it by hand: compute the test statistic, then use a computer (or a website that has an applet) to get the p-value for it. Then select a statistical significance level, compare your p-value against it, and decide, based on what the data, the statistics, and your own experience tell you, whether all of this indicates a difference between the means or not.
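The last step, going from a test statistic to a decision, can be sketched as follows. This is an illustration under a loud assumption: it uses the standard normal distribution in place of the t distribution, which is only reasonable when the degrees of freedom are large. For small samples, use a t table or a statistics package instead. The function names and the example value of t_stat are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_normal_approx(t_stat):
    """Two-sided p-value, approximating the t distribution by a standard
    normal (an assumption that only holds for large degrees of freedom)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

def decide(p_value, alpha=0.05):
    """Compare the p-value against the chosen significance level."""
    if p_value < alpha:
        return "evidence of a difference between the means"
    return "no evidence of a difference between the means"

t_stat = 1.2  # hypothetical test statistic computed from the two samples
p = two_sided_p_normal_approx(t_stat)
print(f"p = {p:.3f}: {decide(p)}")
```

Note that a p-value above the significance level only means the data show no evidence of a difference; it does not by itself prove that the two means are equal.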

Here is a site that Google returned that looks intuitive enough to get a result:

T-Test Statistics Calculator

Since I don't know much about your data, your process, or your context, I should stress that you should do your own research and double-check what I or others say, especially if this is going to be used in a paper or for a client/employer. The t-test is typically what is used for these kinds of comparisons, and it works theoretically because of the Central Limit Theorem, but one needs to be careful about using it.

Chances are that if your sample size is big enough, this will be OK to use. The Central Limit Theorem already takes care of the distribution of the sample mean, so the real issue for these tests is whether the sample variance is (roughly) chi-square distributed. If both conditions are met, you are good to go.

Also, the mean has to be independent of the variance; if it is not, you can't use a t-test. An example where they are not independent is a Poisson distribution, whose variance equals its mean.