# Thread: searching for an unbiased estimator of the Teachman index

1. ## searching for an unbiased estimator of the Teachman index

Hi all!

The Teachman index is often used as an estimator of diversity in the context of social sciences. It is defined as follows:

TI = - sum[ pk*ln(pk) ]

where pk is the proportion of group members in the kth category and the sum runs over all K categories. Diversity is therefore maximal when the sample is spread equally across the K categories, in which case TI = ln(K).
Unfortunately, this measure is biased: it underestimates diversity when the sample size (n) is small. This is comparable to the biased estimate of the variance you get when you divide by n rather than by (n-1).
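
To make the bias concrete, here is a small simulation sketch. It is only an illustration under assumptions that are not part of the question: four equally likely categories, sample sizes of 5, 20 and 100, and the helper names `teachman_index` and `mean_plugin_estimate`, which are made up for this example.

```python
import math
import random
from collections import Counter

def teachman_index(counts):
    """Plug-in Teachman index from category counts (natural log)."""
    counts = list(counts)
    n = sum(counts)
    # Empty categories contribute 0, following the convention 0 * ln(0) = 0.
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def mean_plugin_estimate(probs, n, trials=5000, seed=0):
    """Average the plug-in index over repeated samples of size n drawn from probs."""
    rng = random.Random(seed)
    categories = range(len(probs))
    total = 0.0
    for _ in range(trials):
        sample = rng.choices(categories, weights=probs, k=n)
        total += teachman_index(Counter(sample).values())
    return total / trials

# Assumed "true" population: four equally likely categories, so TI = ln(4).
true_probs = [0.25, 0.25, 0.25, 0.25]
true_value = -sum(p * math.log(p) for p in true_probs)

for n in (5, 20, 100):
    # Columns: sample size, average plug-in estimate, true value.
    print(n, round(mean_plugin_estimate(true_probs, n), 3), round(true_value, 3))
```

With these settings the average estimate should sit below ln(4) for every n and move closer to it as n grows, which is the underestimation described above.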

I wonder if there is a correction formula for this equation that yields unbiased estimates of the index. I am not a statistician, so any hints are helpful, even the obvious ones...

thanks!

trantor

2. In reply to Trantor75:

This index has the same form as the Shannon information entropy (with natural logarithms), so I'm pretty sure that there's no unbiased estimator. See p. 8 of http://www.gatsby.ucl.ac.uk/~pel/pos...tham_sfn00.pdf for example.

However, you might have a go at reading through this: Letter to the Editor
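
Separately from the links above, a standard first-order adjustment for entropy-type indices is the Miller-Madow correction, which adds (m - 1) / (2n) to the plug-in value, where m is the number of categories actually observed and n is the sample size. It reduces the downward bias but does not remove it, so it is not the unbiased estimator asked for. A minimal sketch, with function names and example counts made up for illustration:

```python
import math

def plugin_index(counts):
    """Plug-in index: -sum(pk * ln(pk)) over the observed categories."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts)

def miller_madow_index(counts):
    """Plug-in index plus the Miller-Madow term (m - 1) / (2 * n)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)   # sample size
    m = len(counts)   # number of categories with at least one member
    return plugin_index(counts) + (m - 1) / (2 * n)

# Hypothetical group of 5 people spread over 3 categories.
counts = [3, 1, 1]
print(plugin_index(counts), miller_madow_index(counts))
```

The correction term shrinks as n grows, so the two values converge for large groups.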

No doubt someone will correct me...