I tried getting help with this in a computer science forum and newsgroup without any luck. If this is off topic then my apologies but I think you math gurus may be able to help.
My question concerns avoiding modulo bias, which I came across when reading the Wikipedia page about the Fisher–Yates shuffle (also known as the Knuth shuffle). The relevant part is the modulo bias section of that page.
Many pseudorandom number generators return a random integer between 0 and a predefined maximum. Users generally need to convert that number to a required range, and often do so using the modulo operator. This introduces a modulo bias unless the size of the required range evenly divides the number of distinct values the generator can return.
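To make the bias concrete, here is a minimal sketch in C. It is only an illustration: the function name count_outputs_for_residue is made up for this post, and it assumes a generator whose outputs are exactly 0..output_count - 1, each equally likely. It counts how many raw outputs end up on a given result after the modulo step.

```c
#include <assert.h>

/* Hypothetical helper: count how many generator outputs in
   0..output_count-1 map to `residue` under `value % range`.
   Assumes every raw output is equally likely. */
int count_outputs_for_residue(int output_count, int range, int residue)
{
    assert(output_count > 0 && range > 0);
    assert(residue >= 0 && residue < range);

    int count = 0;
    for (int value = 0; value < output_count; value++) {
        if (value % range == residue) {
            count++;
        }
    }
    return count;
}
```

With 32768 possible outputs and a range of 100, the results 0..67 are each reachable from 328 raw outputs, while 68..99 are each reachable from only 327, because 32768 = 327 * 100 + 68. The low results are therefore slightly more likely.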
The way to avoid this modulo bias is to accept only random numbers below the largest multiple of the required range size that fits within the generator's output, and to discard and redraw anything at or above that multiple, as shown by the following pseudocode.
Code:
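/* Random number in the range 0..99 with no modulo bias. */
/* GetRandomNumber () stands for the system generator described above. */
int SystemMaxRandomNumber = 32768;
int RandomNumberRequiredRange = 100;
/* Integer division already rounds down: 32768 / 100 = 327. */
int i = SystemMaxRandomNumber / RandomNumberRequiredRange;
/* Highest value that keeps every result equally likely: 327 * 100 - 1 = 32699. */
int HighestAcceptableRandNum = ( i * RandomNumberRequiredRange ) - 1;
int RandomNumber = GetRandomNumber ();
/* Discard and redraw any value above the cutoff. */
while ( RandomNumber > HighestAcceptableRandNum ) {
    RandomNumber = GetRandomNumber ();
}
int RandomNumberNoModuloBias = RandomNumber % RandomNumberRequiredRange;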
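For comparison, the same rejection idea can be sketched against the C standard library. This is only one possible rendering, not a definitive implementation: random_below is a name made up for this post, and the sketch assumes rand() returns each value in 0..RAND_MAX with equal probability, which real implementations only approximate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return a value in 0..range-1 without modulo bias,
   assuming rand() is uniform over 0..RAND_MAX. */
int random_below(int range)
{
    /* Number of distinct values rand() can return. */
    unsigned output_count = (unsigned)RAND_MAX + 1u;
    assert(range > 0 && (unsigned)range <= output_count);

    /* Largest multiple of `range` that fits in the generator's output. */
    unsigned limit = output_count - output_count % (unsigned)range;

    unsigned value;
    do {
        value = (unsigned)rand();   /* redraw until below the cutoff */
    } while (value >= limit);

    return (int)(value % (unsigned)range);
}
```

Calling random_below(100) gives a number in 0..99. The loop normally runs only once or twice, since fewer than half of the raw outputs can ever be rejected.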
So I understand this kind of modulo bias and how to avoid it. The last paragraph of the modulo bias section of the Wikipedia Fisher–Yates shuffle page states the following:
A related problem occurs with implementations that first generate a random floating-point number—usually in the range [0,1)—and then multiply it by the size of the desired range and round down. The problem here is that random floating-point numbers, however carefully generated, always have only finite precision. This means that there are only a finite number of possible floating point values in any given range, and if the range is divided into a number of segments that doesn't divide this number evenly, some segments will end up with more possible values than others. While the resulting bias will not show the same systematic downward trend as in the previous case, it will still be there.
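One way to see what the quoted paragraph describes is to model a deliberately coarse floating-point generator and count the values that land in each segment. The sketch below is an illustration under stated assumptions: count_float_segments is a made-up name, and the generator is assumed to produce only the equally spaced values k / value_count for k = 0..value_count - 1. Real generators have far more possible values, but the counting argument is the same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: model a generator whose only outputs are
   k / value_count in [0, 1), then count how many of those outputs fall
   into each of `range` segments after multiplying and rounding down.
   segment_counts must have room for `range` entries. */
void count_float_segments(int value_count, int range, int *segment_counts)
{
    assert(value_count > 0 && range > 0 && segment_counts != NULL);

    for (int s = 0; s < range; s++) {
        segment_counts[s] = 0;
    }
    for (int k = 0; k < value_count; k++) {
        double r = (double)k / value_count;   /* simulated random float */
        int segment = (int)(r * range);       /* multiply, then round down */
        segment_counts[segment]++;
    }
}
```

With 32768 possible values and 100 segments, some segments receive 328 values and others 327. Unlike the modulo case, the larger segments are scattered through the range (for example, segments 0, 1 and 2 get 328 while segment 3 gets 327) rather than grouped at the low end.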
I do not understand this. Can someone please explain more clearly why floating-point values can introduce bias? Also, what steps should be taken to avoid bias when the random number generator provides floating-point numbers in the range 0 to 1?
Many thanks.