# Math Help - Is this due to a bias?

1. ## Is this due to a bias?

Consider the following:

An IQ test has been carried out, and the test result is T. However, since the test result is known to be normally distributed around the true IQ Q, the following reasoning is done:

The IQ of people taking this test is known to be normally distributed around 100. Specifically, the IQ has the following probability density function:

$f_{100}(Q) = C_1e^{-A(Q-100)^2}$

where A and $C_1$ are constants. The test result is in turn known to be normally distributed around the IQ, with the following probability density function:

$f_Q(T) = C_2e^{-B(T-Q)^2}$

where B and $C_2$ are constants. Taking all of this into account, the joint probability density for a person taking the test to have the IQ Q and get the test result T is

$f(Q, T) = f_{100}(Q)f_Q(T) = C_1e^{-A(Q-100)^2}C_2e^{-B(T-Q)^2}$

$= C_1C_2 e^{-(A+B)Q^2 + (A\cdot 200 + 2BT)Q -10000A - BT^2}$

$= C_Te^{-(A+B)\left(Q-\frac{A\cdot 100 + BT}{A+B}\right)^2}$

where $C_T$ is a factor depending only on T. What this tells us is basically that if a person has got the test result T, the most likely value of his IQ, and its expected value, is actually:

$\hat Q_{map} = E(Q \mid T) = \frac{A\cdot 100 + BT}{A+B}$

(which is closer to 100 than T is) for a fixed value of T. However, this method of estimating the IQ from a test result is not consistent, since a sequence of estimators would converge in probability to $(A\cdot 100 + BQ)/(A+B)$ and not to Q (note that for real IQ tests, B is most often many times larger than A).
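The claim about where the adjusted score ends up on average can be checked numerically. The following is only a minimal sketch: the function names are made up for illustration, and the numbers (population standard deviation 15, test standard deviation 5, true IQ 130) are assumptions, not values from any real test. It uses the fact that a density proportional to $e^{-B(T-Q)^2}$ has variance $1/(2B)$.

```python
import random


def adjusted_score(t, a, b, prior_mean=100.0):
    """Adjusted estimate (A*100 + B*T) / (A + B) for one test result t."""
    return (a * prior_mean + b * t) / (a + b)


def expected_adjusted_score(q, a, b, prior_mean=100.0):
    """Closed form for E[adjusted score | Q = q].

    The adjusted score is linear in T and E[T | Q = q] = q,
    so the expectation is the same formula evaluated at q.
    """
    return adjusted_score(q, a, b, prior_mean)


def simulate_adjusted_mean(q, a, b, n_draws=100_000, seed=0):
    """Monte Carlo average of the adjusted score over many people with IQ q."""
    rng = random.Random(seed)
    test_sd = (1.0 / (2.0 * b)) ** 0.5  # exp(-B (T-Q)^2) has variance 1/(2B)
    total = 0.0
    for _ in range(n_draws):
        total += adjusted_score(rng.gauss(q, test_sd), a, b)
    return total / n_draws


# Assumed, illustrative values: population sd 15, test sd 5.
A = 1.0 / (2.0 * 15.0 ** 2)
B = 1.0 / (2.0 * 5.0 ** 2)

print(expected_adjusted_score(130.0, A, B))
print(simulate_adjusted_mean(130.0, A, B))
```

With these assumed values the closed form works out to 127, so the adjusted score of a person with IQ 130 averages about 3 points below the true value, and the simulated average should land close to the same number.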

Now, what I'm wondering is: would you say that this inconsistency is due to a bias? If not, what is it caused by? And when tests like these are carried out, which is most often used to estimate the measured parameter: the actual test value or the "adjusted" test value?

2. Any idea?

3. Originally Posted by TriKri
Now, what I'm wondering is: would you say that this inconsistency is due to a bias? If not, what is it caused by? And when tests like these are carried out, which is most often used to estimate the measured parameter: the actual test value or the "adjusted" test value?
To talk about convergence in probability you need a sequence of random variables. T is a single observation. If, however, you were talking about taking the test infinitely often, then you could talk about the sample mean $\bar{T}_n$, which has variance proportional to $1/n$. The ultimate effect of this is that the term $B$ will be proportional to $n$, which will give consistency.

The final estimator you have is just the Bayes estimator of Q with a normal prior on it, but your formula is only valid for a single test observation. You have $E(Q \mid T)$ right, but you need to calculate $E(Q \mid T_1, T_2, \ldots, T_n)$ if you want to talk about what happens asymptotically (note that $\bar{T}_n$ is a complete sufficient statistic, so it suffices to condition on that alone). If you take the test a bunch of times and use all the data, you'll end up with a $\bar{T}_n$ instead of just T, and the term $B$ will grow as n does.
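For reference, the n-observation version can be worked out the same way as the single-test formula. This is a sketch under the added assumption that the n test results are independent given Q and all share the same constant B:

$f(Q \mid T_1, \ldots, T_n) \propto e^{-A(Q-100)^2} \prod_{i=1}^{n} e^{-B(T_i-Q)^2}$

$\sum_{i=1}^{n} (T_i-Q)^2 = n(Q-\bar{T}_n)^2 + \sum_{i=1}^{n} (T_i-\bar{T}_n)^2$

$f(Q \mid T_1, \ldots, T_n) \propto e^{-(A+nB)\left(Q-\frac{A\cdot 100 + nB\bar{T}_n}{A+nB}\right)^2}$

$E(Q \mid T_1, \ldots, T_n) = \frac{A\cdot 100 + nB\bar{T}_n}{A+nB}$

So B is replaced by nB and T by $\bar{T}_n$. As n grows, the weight $nB/(A+nB)$ on $\bar{T}_n$ tends to 1, and since $\bar{T}_n$ converges in probability to Q, so does the estimator.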

4. Originally Posted by theodds
The ultimate effect of this is that the term $B$ will be proportional to $n$, which will give consistency.
No, the test is carried out once, and then the person leaves. What I mean is that, for a person with IQ $Q\ (Q \neq 100)$, the expected value of the corrected value of the IQ test, let's call that T', is actually closer to 100 than Q is. The expected value of T' (which is the most likely value of the person's IQ, from the point of view of the people carrying out the test) is therefore not Q. But doesn't this imply some kind of contradiction for a test like this? Note that T' is calculated from the test result T.

5. Originally Posted by TriKri
No, the test is carried out once, and then the person leaves. What I mean is that, for a person with IQ $Q\ (Q \neq 100)$, the expected value of the corrected value of the IQ test, let's call that T', is actually closer to 100 than Q is. The expected value of T' (which is the most likely value of the person's IQ, from the point of view of the people carrying out the test) is therefore not Q. But doesn't this imply some kind of contradiction for a test like this? Note that T' is calculated from the test result T.
It's totally fine for it to be biased. Except in very trivial cases, Bayes estimators are always biased when you condition on the true parameter value. From the standpoint of point estimation, unbiasedness isn't all it's cracked up to be and it isn't nearly as important as an introductory math-stat course would have you believe. The point is, though, that as you collect more and more data (i.e. you take the test more and more times) the Bayes estimator converges to an unbiased estimator.

One way to think about this is to imagine that there is actually very little variability in the population but very high variability in the test. Say the variance of the population is 1 and the variance of the test score is 40. Then, if someone gets a score of 140, I'm still going to be pretty sure that his true value is near 100. My propensity to pull him towards the prior mean is what creates the bias; after all, the test is assumed unbiased, it just isn't very precise in this example. But, as he takes the test infinitely often, if his true IQ is 140, I will become more and more convinced that his real IQ is 140, i.e. asymptotically I have unbiasedness.
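To put numbers on this example, here is a small Python sketch. The function name is made up for illustration, and it makes two simplifying assumptions: the repeated tests are independent with the same variance, and the average score is held at exactly 140 for every n (in reality the average would fluctuate and only converge to the true IQ). The constants follow from $A = 1/(2 \cdot \text{population variance})$ and $B = 1/(2 \cdot \text{test variance})$.

```python
def posterior_mean(t_bar, n, prior_var, test_var, prior_mean=100.0):
    """Posterior mean of Q after n independent tests with average score t_bar."""
    a = 1.0 / (2.0 * prior_var)  # A in the thread's notation
    b = 1.0 / (2.0 * test_var)   # B in the thread's notation
    # With n tests, B is replaced by n*B and T by the average score.
    return (a * prior_mean + n * b * t_bar) / (a + n * b)


# Population variance 1, test variance 40, observed average 140 (held fixed).
for n in (1, 10, 100, 1000, 10000):
    estimate = posterior_mean(140.0, n, prior_var=1.0, test_var=40.0)
    print(n, round(estimate, 2))
```

After a single test the estimate is only about 100.98, since the weight on the score is 1/41. After 1000 tests averaging 140 it is about 138.46, and it keeps approaching 140 as n grows.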