# Thread: looking for algorithm for weighted random choice

1. ## looking for algorithm for weighted random choice

The subject line probably doesn't quite call it what it's supposed to be called but, anyway, here's the problem:

Just for the sake of argument, say there are 5 possibilities (each "possibility" represents someone's blog feed for instance.) Over the course of time, each feed receives votes on how interesting it is to the audience reading the feeds.

So, feed1 has received 41 votes, feed2 - 25, feed3 - 14, feed4 - 10, feed5 - 8.

What would be the formula for randomly choosing one of the five feeds, while at the same time giving the feeds with a higher number of votes a better chance of getting selected? In other words, how to randomly choose one of the feeds while factoring in their popularity (or unpopularity, as the case may be).

Thanks in advance for any replies.

2.
Hi m3p,

If I may generalize your question slightly, I think you are asking for an algorithm for making a random choice of one of $n$ objects in such a way that the probability of choosing object $i$ is $p_i$, where $0 \leq p_i$ and $p_1 + p_2 + \dots + p_n = 1$.

On a computer, one way to do this is to generate a pseudo-random number $u$ from a Uniform(0,1) distribution. Then choose object $m$ where $m$ is the least integer such that
$\sum_{i=1}^m p_i > u$.

For your example, you would take $p_i$ to be the number of votes for feed $i$, divided by the total number of votes.
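
The rule above can be sketched in Perl roughly as follows. This is only a minimal illustration, not code from the thread: the helper name `pick_index` is hypothetical, it returns a 0-based index (so $m$ is the returned value plus one), and the final fallback is an added guard against floating-point rounding when $u$ is very close to 1.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper (not from the thread): return the 0-based index of the
# first weight whose running share of the total exceeds $u.
# $u is expected to be a uniform draw in [0, 1), e.g. the value of rand().
sub pick_index {
    my ($u, @weights) = @_;
    my $total      = sum(@weights);
    my $cumulative = 0;
    for my $i (0 .. $#weights) {
        $cumulative += $weights[$i] / $total;   # p_i = votes_i / total votes
        return $i if $cumulative > $u;          # least index with sum > u
    }
    return $#weights;   # guard against rounding error when u is close to 1
}

my @feeds = ('feed1', 'feed2', 'feed3', 'feed4', 'feed5');
my @votes = (41, 25, 14, 10, 8);
print $feeds[ pick_index(rand(), @votes) ], "\n";
```

Passing $u$ in as an argument, rather than calling `rand()` inside the helper, makes it easy to check specific values by hand: for instance $u = 0.41$ is below $41/98 \approx 0.418$ and selects feed1, while $u = 0.42$ selects feed2.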

3. ## re: looking for algorithm for weighted random choice - solution

O.k., it turns out the most concise phrases to use on search engines when looking for an answer to this question are "weighted random numbers" or "weighted random number generator".

In any case, I think I've found my answer, and I thought I'd share it here in return for the possibility (pun intended) of receiving an answer from the knowledgeable members of this forum.

There were several pages I found that provided excellent approaches to the subject, but without question the most elegant and efficient solution was found here. The Perl code (adapted from the PHP solution I found) is shown below:

    001 @feeds = ('f1', 'f2', 'f3', 'f4', 'f5');
    002 @votes = (41, 25, 14, 10, 8);
    003
    004 $tot = eval join '+', @votes;   # total in this case = 98
    005
    006 $rand = rand(1000);
    007
    008 $offset = 0;
    009 $indx = 0;
    010 foreach $vote (@votes) {
    011     $offset += ($vote / $tot) * 1000;
    012     if ($rand <= $offset) {
    013         return $indx;           # only valid when this code is the body of a sub
    014     }
    015     $indx++;
    016 }

What's going on:

line 004: get the total number of votes
line 006: get a random floating-point number in the range [0, 1000)
line 010: start looping through the array of votes
line 011: for the current feed, find the ratio of votes to total votes, multiply that value by 1000, and add the result to `$offset`
line 012: if the random number obtained before entering the loop is less than or equal to `$offset` (i.e. the upper boundary of that feed's range), then return the index of the "random" feed

In other words, feed f1 has a 41/98 (about 41.8%) chance of being selected, feed f2 a 25/98 (about 25.5%) chance, etc.
So if the random number, which lies in the range [0, 1000), comes up at or below about 418.4, then feed f1 "wins"; if it falls between about 418.4 and 673.5, feed f2 wins, etc.
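
To make the ranges concrete, the boundaries can be computed directly. The following is just a sketch under the same assumptions as the code above (votes 41, 25, 14, 10, 8 and a scale of 1000); the helper name `upper_bounds` is made up for illustration and is not part of the original solution.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: upper boundary of each feed's range on a 0..$scale axis.
sub upper_bounds {
    my ($scale, @weights) = @_;
    my $total   = sum(@weights);
    my $running = 0;
    my @bounds;
    for my $w (@weights) {
        $running += $w;                            # cumulative vote count
        push @bounds, $running / $total * $scale;  # scaled cumulative share
    }
    return @bounds;
}

my @feeds  = ('f1', 'f2', 'f3', 'f4', 'f5');
my @votes  = (41, 25, 14, 10, 8);
my @bounds = upper_bounds(1000, @votes);

# Print the range owned by each feed; each range starts where the last ended.
my $low = 0;
for my $i (0 .. $#feeds) {
    printf "%s: %.2f to %.2f\n", $feeds[$i], $low, $bounds[$i];
    $low = $bounds[$i];
}
```

For these vote counts the upper boundaries should come out at roughly 418.37, 673.47, 816.33, 918.37 and 1000.00, so the width of each range is proportional to that feed's share of the votes.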

The beauty of this solution is that it doesn't use up a lot of memory, like some solutions I found, it doesn't
care how many possibilities there are, what their weights are, or what the total of the weights is, and most importantly in my case, it's so easy to understand.
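
One informal way to check that the selection really follows the vote counts is to draw many samples and compare the observed frequencies with votes divided by total. The sketch below is an illustration only: `weighted_pick` is a hypothetical helper, the sample size is arbitrary, and it draws the random number directly in [0, total) instead of scaling to 1000, which is a small variation on the code above rather than the original method. It keeps only a running total, so no expanded lookup table is built.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: one weighted draw using a running total of the weights.
sub weighted_pick {
    my @weights = @_;
    my $target  = rand(sum(@weights));   # uniform in [0, total)
    my $running = 0;
    for my $i (0 .. $#weights) {
        $running += $weights[$i];
        return $i if $target < $running;
    }
    return $#weights;   # defensive fallback; not expected to be reached
}

my @votes = (41, 25, 14, 10, 8);
my $draws = 100_000;          # arbitrary sample size chosen for illustration
my @hits  = (0) x @votes;     # one counter per feed
$hits[ weighted_pick(@votes) ]++ for 1 .. $draws;

my $total = sum(@votes);
for my $i (0 .. $#votes) {
    printf "feed%d: observed %.3f, expected %.3f\n",
        $i + 1, $hits[$i] / $draws, $votes[$i] / $total;
}
```

The observed and expected columns should agree to within sampling noise. Each draw is a single pass over the weights, so the cost grows linearly with the number of feeds.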

So... thanks to all for tolerating my question.