I have a population of items, and I calculate how similar each item is to every other item in the population. I store these values in a symmetric n x n similarity matrix. When I look at the distribution of the scores, they follow a beta distribution: most similarity scores are close to 0 (the items are dissimilar), and then the distribution tails off, with fewer and fewer scores close to 1 (the items are very similar).
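To make the setup concrete, here is a minimal sketch of how the pairwise scores could be pulled out of the matrix and a beta distribution fitted by the method of moments. The toy matrix, the Beta(1, 5) draws that fill it, and the helper names `upper_triangle` and `beta_method_of_moments` are all assumptions for illustration, not the actual data or code.

```python
import random
from statistics import fmean, pvariance


def upper_triangle(sim):
    """Return the off-diagonal scores of a symmetric matrix, each pair once."""
    n = len(sim)
    return [sim[i][j] for i in range(n) for j in range(i + 1, n)]


def beta_method_of_moments(scores):
    """Estimate Beta(a, b) shape parameters from the sample mean and variance."""
    m = fmean(scores)
    v = pvariance(scores, mu=m)
    # A beta distribution requires 0 < variance < mean * (1 - mean).
    if not 0 < v < m * (1 - m):
        raise ValueError("scores are not compatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


# Toy stand-in for the real similarity matrix (assumption: independent
# Beta(1, 5) draws, so most scores sit near 0 and few are near 1).
random.seed(0)
n = 200
sim = [[1.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        sim[i][j] = sim[j][i] = random.betavariate(1.0, 5.0)

scores = upper_triangle(sim)
a, b = beta_method_of_moments(scores)
print(f"pairs={len(scores)} a={a:.2f} b={b:.2f}")
```

In terms of the fitted parameters, the shape described above (mass piled up near 0) corresponds to a <= 1 < b, while a hump with tails on both sides corresponds to a > 1 and b > 1, so the same fit applied to the scores within a candidate group would indicate which shape that group has.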
I want to draw sub-samples from this population: I want to group together items whose within-group similarity scores are also beta distributed, but with most of the mass centered around the mean and tailing off on both ends (like a hump). These hump-shaped groupings should (I'm pretty sure) form naturally in my data, so my question is: is there a method for sampling from my population that gives me a smaller population whose scores follow a given distribution? Any starting point would help... thanks!