
Originally Posted by
chiro
Hey froggy124.
Have you tried deleting some of the factors, re-fitting the model, and comparing the result with the other models?
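To make the "delete a factor and compare" idea concrete, here is a minimal sketch using a partial F statistic on nested least-squares fits. The data is synthetic and the helper names (`rss`, `partial_f`) are made up for illustration; it assumes NumPy is available and is only one of several ways to compare models.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def partial_f(X_full, X_reduced, y):
    """F statistic for the factors that were dropped from the full model."""
    n = len(y)
    p_full = X_full.shape[1] + 1    # +1 for the intercept
    p_red = X_reduced.shape[1] + 1
    rss_full, rss_red = rss(X_full, y), rss(X_reduced, y)
    return ((rss_red - rss_full) / (p_full - p_red)) / (rss_full / (n - p_full))

# Synthetic example (assumption): column 2 has no effect on y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

f_drop_noise = partial_f(X, X[:, [0, 1]], y)  # drop the irrelevant factor
f_drop_real = partial_f(X, X[:, [1, 2]], y)   # drop a factor that matters
print(f_drop_noise, f_drop_real)
```

A large F statistic says the dropped factor was carrying real explanatory weight; a small one says the reduced model fits about as well.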
Also try doing what is called a Principal Components Analysis (PCA) on the data. This creates an orthogonal set of random variables (uncorrelated, but not necessarily independent) by "rotating" the data set, and it sorts the resulting components in decreasing order of variance.
A component with higher variance contributes more to the variance of the entire data set, and a component with lower variance contributes less.
It is useful as an additional tool to answer your question.
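As a rough illustration of the rotation described above, here is a small PCA sketch done with a plain SVD in NumPy. The data is synthetic (two correlated columns and one unrelated column, which is an assumption for the example), and a library routine would do the same job.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data (assumption): two strongly correlated columns plus one independent column
z = rng.normal(size=500)
X = np.column_stack([
    z + 0.1 * rng.normal(size=500),
    z + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])

Xc = X - X.mean(axis=0)                    # centre each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_var = s**2 / (len(X) - 1)        # variance of each component, in decreasing order
ratio = explained_var / explained_var.sum()
scores = Xc @ Vt.T                         # the data in the rotated coordinates

print(ratio)                               # share of total variance per component
print(np.round(np.cov(scores, rowvar=False), 6))  # off-diagonals are ~0: uncorrelated
```

The rows of `Vt` show how each original variable loads on each component, which is what a PCA plot visualises.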
The other thing to do is test for interaction terms between the factors and for correlation between sets of random variables.
PCA helps with identifying correlation: a PCA plot shows how the variables are related to one another.
The interaction tests are just standard tests of whether the coefficient on the product of two factors is statistically significantly different from zero. A significant interaction also points to possible bias in a model that leaves it out, and these tests are used in experimental design analyses.
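A minimal sketch of such an interaction test, assuming a linear model with two factors and their product as an extra column. The data and the helper `ols_t_stats` are invented for the example; it reports the t statistic for the product term rather than a p-value, to keep it NumPy-only.

```python
import numpy as np

def ols_t_stats(A, y):
    """Return OLS coefficients and their t statistics for design matrix A."""
    n, p = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - p)           # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)      # covariance of the coefficient estimates
    return beta, beta / np.sqrt(np.diag(cov))

# Synthetic data (assumption): a genuine x1*x2 interaction with coefficient 0.8
rng = np.random.default_rng(2)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, main effects, and the product term
A = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, t = ols_t_stats(A, y)
print(beta[3], t[3])   # estimate and t statistic for the interaction
```

A t statistic far from zero for the last column is evidence that the effect of one factor depends on the level of the other.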
In terms of worrying about missing variables that contribute to causality, this is the reason why we say that correlation doesn't imply causation: it's basically a pessimistic (but needed) view that if you don't have all the data that truly represents the process, you can't establish causality with certainty.
There are techniques called forward, backward, and stepwise regression that you should check.
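For a feel of how forward selection works, here is a rough greedy sketch: at each step it adds the factor that reduces the residual sum of squares the most, and stops when the improvement falls below an F-to-enter threshold. The threshold of 4.0, the helper names, and the synthetic data are all assumptions for illustration; real packages differ in their entry criteria.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def forward_select(X, y, f_enter=4.0):
    """Greedy forward selection using a partial F statistic as the entry rule."""
    n, k = X.shape
    selected, remaining = [], list(range(k))
    current = rss(X, y, selected)
    while remaining:
        # Candidate that gives the lowest RSS when added to the current model
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        new = rss(X, y, selected + [best])
        df_resid = n - (len(selected) + 2)   # intercept + chosen columns + candidate
        f_stat = (current - new) / (new / df_resid)
        if f_stat < f_enter:
            break                            # no remaining factor helps enough
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected

# Synthetic data (assumption): only columns 1 and 3 affect y
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(size=200)

selected = forward_select(X, y)
print(selected)
```

Backward elimination runs the same idea in reverse (start with everything, drop the weakest factor), and stepwise regression alternates between the two.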
Finally, you also want to think about the data in the context of both the process involved and how well the data really represents that process: if it misrepresents the process for any reason, that misalignment will cause bad inferences. This is the hardest part, because you can be really careful and have a lot of experience but still miss critical factors that are hidden in layers of things outside of your awareness.
It's definitely not an easy problem.