Reconciling two or more data sets when sample sizes vary widely
I am a real estate researcher by trade. I am doing a report on median and mean home prices
in my town over comparable time periods, i.e. quarter to quarter and year over year.
My problem is this:
For example, in Q4 of 2009 there were 35 sales with a median price of $335,500
and a mean price of $343,960.
In Q1 of 2010 there were only 11 sales with a median of $349,900 and a mean of $368,245.
My problem is comparing two or more data sets that differ widely in the number of sales,
in this instance 35 versus 11. With such a low volume of sales in Q1, I would think that the data is more volatile, yes?
Is there a formula or method to 'normalize' the two data sets, or to show the difference between them,
given the wide gap in sales volume? I think that the median and mean prices can't be compared fairly
when the sales counts differ this much. Am I right or way off?
I did perform a mean price comparison using a 'trimmed mean' analysis, but I wanted something with more bite:
something that shows the volatility of the data when volume is erratic, and that supports the conclusion that the median and mean
are only reliable for comparison when sales volume is similar across the data sets.
Basically, I want to break down the data and make it reliable and comparable.
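
To put a rough number on the volatility hunch: if both quarters had the same underlying spread of prices (an assumption, not something I have verified), the standard error of the mean would scale with 1/sqrt(n), so the 11-sale quarter would be noticeably noisier than the 35-sale one. Below is a minimal Python sketch of that idea, plus a bootstrap interval for the median. The function names are my own, and bootstrap_median_interval expects the individual sale prices for a quarter, which are not listed in this post.

```python
import math
import random
import statistics


def standard_error_ratio(n_large, n_small):
    """Factor by which the standard error of the mean grows for the smaller
    sample, ASSUMING both samples share the same standard deviation."""
    return math.sqrt(n_large / n_small)


def bootstrap_median_interval(prices, n_boot=2000, seed=0):
    """Rough 95% percentile-bootstrap interval for the median of `prices`.

    Resamples the sales with replacement, records each resample's median,
    and returns the 2.5th and 97.5th percentiles of those medians.
    """
    rng = random.Random(seed)
    n = len(prices)
    medians = sorted(
        statistics.median(rng.choices(prices, k=n)) for _ in range(n_boot)
    )
    lower = medians[int(0.025 * n_boot)]
    upper = medians[int(0.975 * n_boot) - 1]
    return lower, upper


# Sales counts from the post: 35 in Q4 2009 versus 11 in Q1 2010.
print(f"{standard_error_ratio(35, 11):.2f}")  # prints 1.78
```

Under that equal-spread assumption, the Q1 2010 mean is about 1.8 times less precise than the Q4 2009 mean. Running bootstrap_median_interval on each quarter's actual sale prices would give an interval around each median; a much wider interval for the 11-sale quarter would be one way to show the volatility directly.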