Generate a random integer of set [0,3]

T

Toby Newman

I need to randomly choose one of four paths in my program. Using the
tools I know, the best way I can think to do it is by doing something
like the following:

//==============================
#include <stdlib.h> //allow rand() function
int i;
i = rand (); // i = any number of the set [0,2^32 -1] or [0, 4294967295]

if (0<=i<(4294967295/4))
{ /* choice one*/ }
else if (4294967295/4<=i<(4294967295/2))
{ /* choice two*/ }
else if (4294967295/2<=i<3*(4294967295/4))
{ /* choice three*/ }
else if (3*4294967295<=i<=4294967295)
{ /* choice four*/ }
//==============================

This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?
 
B

boa

Toby said:
I need to randomly choose one of four paths in my program. Using the
tools I know, the best way I can think to do it is by doing something
like the following:

//==============================
#include <stdlib.h> //allow rand() function
int i;
i = rand (); // i = any number of the set [0,2^32 -1] or [0, 4294967295]

if (0<=i<(4294967295/4))
{ /* choice one*/ }
else if (4294967295/4<=i<(4294967295/2))
{ /* choice two*/ }
else if (4294967295/2<=i<3*(4294967295/4))
{ /* choice three*/ }
else if (3*4294967295<=i<=4294967295)
{ /* choice four*/ }
//==============================

This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html
 
P

pete

Toby said:
I need to randomly choose one of four paths in my program. Using the
tools I know, the best way I can think to do it is by doing something
like the following:

//==============================
#include <stdlib.h> //allow rand() function
int i;
i = rand (); // i = any number of the set [0,2^32 -1] or [0, 4294967295]

if (0<=i<(4294967295/4))
{ /* choice one*/ }
else if (4294967295/4<=i<(4294967295/2))
{ /* choice two*/ }
else if (4294967295/2<=i<3*(4294967295/4))
{ /* choice three*/ }
else if (3*4294967295<=i<=4294967295)
{ /* choice four*/ }
//==============================

This seems like a lot of work for a simple task.
Is there some function along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

/* BEGIN new.c */

#include <stdio.h>

#define LU_RAND_SEED 123456789LU
#define LU_RAND(S) ((S) * 69069 + 362437 & 0xffffffff)
#define LINES 40

int main(void)
{
long unsigned x, y;
int r;

y = LU_RAND_SEED;
puts("\n/* BEGIN output from new.c */\n");
for (x = 0; x != LINES; ++x) {
y = LU_RAND(y);
r = (y & 48) >> 4;
printf("%i", r);
}
puts("\n\n/* END output from new.c */");
return 0;
}

/* END new.c */


/* BEGIN output from new.c */

1202300203032200013123313232113330201220

/* END output from new.c */
 
P

Paul Hsieh

boa said:
Toby said:
This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html

This answer given inthe FAQ is just plain incorrect. Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

Depending on the size of RAND_MAX and the range you want, it might be
better to try to concatenate two rand()'s together to form a larger
random number and perform the same trick as above, however, the
success of that will depend a lot on the actual value of RAND_MAX.

If you want more a predictable interface for random numbers, you can
use the Mersenne Twister which can be found here:

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/eindex.html
 
W

William L. Bahn

Paul Hsieh said:
boa said:
Toby said:
This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html

This answer given inthe FAQ is just plain incorrect. Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

That'll need to be a 4 instead of 3 won't it?

The OP either wants {0,1,2,3} or {1,2} - I'm guessing the former.
 
E

Eric Sosman

Paul said:
boa said:
Toby said:
This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html


This answer given inthe FAQ is just plain incorrect.

That's a little strong, I think. The FAQ explicitly states
that its technique is to be used only when "N is much less than
RAND_MAX," and when that is the case the inaccuracy is negligible.
For the problem at hand we have N=4 and RAND_MAX>=32767, hence
the error is one-eighth of one percent or less. How many samples
would a statistical test require to detect an error of this size,
particularly given that soundness of the underlying rand() is
unknown?
> Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

This answer given in the posting is just plain incorrect. ;-)
Once the typo is fixed, though, the technique is a valuable one,
and will work for all positive N up to and including RAND_MAX.
Depending on the size of RAND_MAX and the range you want, it might be
better to try to concatenate two rand()'s together to form a larger
random number and perform the same trick as above, however, the
success of that will depend a lot on the actual value of RAND_MAX.

That's a risky technique, since it will tend to emphasize
flaws in the underlying rand(). Combining K successive rand()
values uses rand() as a source of random K-dimensional points
rather than one-dimensional values, and the "spectral accuracy"
diminishes as K increases. (As W. Givens remarked, the points
generated by linear congruential generators "stay mainly in the
planes." Look up "spectral test" and/or Coveyou & MacPherson.)
If you want more a predictable interface for random numbers, you can
use the Mersenne Twister which can be found here:

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/eindex.html

The "Minimal Standard Random Number Generator" of Park and
Miller is another candidate, requiring a good deal less memory
per stream of values. George Marsaglia has posted several very
fast generators on this newsgroup, with memory requirements
greater than MS' but less than MT's.
 
T

Toby Newman

# Paul Hsieh
This answer given inthe FAQ is just plain incorrect. Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

Maybe (Maybe? Probably!) I've misunderstood how macros work, but I can't
understand your example. This is how I interpret it:

####################################

#define RAND_RANGE(x) (RAND_MAX / (x))
// RAND_MAX is the maximum value that can
//be returned using the rand() function,
// i.e. (2^32 -1) = 4294967295
// on my target

do
{
i = rand();
// set i to a value between 0 and 4294967295
}
while
(
// i >= 3 * RAND_RANGE (3)
// or
// i >= 3 * RAND_MAX / (3)
// or
// i >= 3 * 4294967295 / 3
// or
i >= 4294967295 // This is not going to happen
);

// i /= RAND_MAX / (3);
i = i / 1431655765;

####################################

I'm afraid I'm not following the logic of what you're doing here.
 
P

Paul Hsieh

William L. Bahn said:
Paul Hsieh said:
boa said:
Toby Newman wrote:
This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html

This answer given inthe FAQ is just plain incorrect. Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

That'll need to be a 4 instead of 3 won't it?

You're right. I misread it as [0,3).
 
J

Jarno A Wuolijoki

# Paul Hsieh
#define RAND_RANGE(x) (RAND_MAX / (x))
do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

Maybe (Maybe? Probably!) I've misunderstood how macros work, but I can't
understand your example. This is how I interpret it:

[snip]
####################################

I'm afraid I'm not following the logic of what you're doing here.

Paul was trying to derive an algorithm that would give you
a truly equiprobable random number between [0..n] if rand()
was a perfect random number generator. Unfortunately, for
most combinations of RAND_MAX and n, this is impossible to
do with any fixed amount of calls to rand().

His implementation was buggy, but the technique is pretty cool
anyway. The trick is to loop until get a number between 0 and
n*((RAND_MAX+1)/n)-1 (*) before proceeding with division.
^^ ^^

*) This is not C, btw. You'll have to tweak your way around
overflow by using tricks like RAND_MAX-n+1 or somesuch.
 
P

Paul Hsieh

Eric Sosman said:
Paul said:
boa said:
Toby Newman wrote:
This seems like a lot of work for a simple task. Is there some function
along similar lines to:
rand(3);
Which would generate a random integer of the set [0,3] ?

Check out the clc faq for an answer:
http://www.eskimo.com/~scs/C-faq/q13.16.html

This answer given inthe FAQ is just plain incorrect.

That's a little strong, I think. The FAQ explicitly states
that its technique is to be used only when "N is much less than
RAND_MAX," and when that is the case the inaccuracy is negligible.
For the problem at hand we have N=4 and RAND_MAX>=32767, hence
the error is one-eighth of one percent or less. How many samples
would a statistical test require to detect an error of this size,
particularly given that soundness of the underlying rand() is
unknown?

The FAQ gives a general formula that's almost always wrong. Only
powers of 2 will work correctly, and as I posted I mistakenly thought
the range was [0,3) not [0,3]. So the FAQ answer actually works
correctly for the specific question that the O.P. had, but fails as a
general answer.

It requires about 1000 * (RAND_MAX / range) number of samples to
definitively see that the FAQ answer is flawed for *every* range that
is not a power of 2.
Its still wrong for smaller samples, its just harder to detect. Four
of the ANSI C compilers I have installed on my system set RAND_MAX to
32767. I would not use the FAQ answers for ranges that were not a
power of 2 period -- its just plain wrong.
That's a risky technique, since it will tend to emphasize
flaws in the underlying rand(). Combining K successive rand()
values uses rand() as a source of random K-dimensional points
rather than one-dimensional values, and the "spectral accuracy"
diminishes as K increases.

Right, but your choice is between one random number generation that
can't even get the straight probability correct, and one which at
least has E(|x=X|) = n*p. When you get to this level of test, you
need to go to better generators like the Mersenne Twister anyhow.

The only problem I have with Marsaglia's generators, is that they
either have a smaller cycle than MT, or they are not portable (like
his CMWC generator.) Though he's constantly working on stuff, so I am
sure that he has not spoken his final word on the issue. I just point
out MT, because its the one generator that you can use with the fewest
issues (either theoretically or practically).
 
P

Paul Hsieh

Toby Newman said:
# Paul Hsieh
This answer given inthe FAQ is just plain incorrect. Here's an
example of an answer that ought to work well in practice:

#define RAND_RANGE(x) (RAND_MAX / (x))

do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

Maybe (Maybe? Probably!) I've misunderstood how macros work, but I can't
understand your example. This is how I interpret it:

[...]

while
(
// i >= 3 * RAND_RANGE (3)
// or
// i >= 3 * RAND_MAX / (3)
// or
// i >= 3 * 4294967295 / 3
// or
i >= 4294967295 // This is not going to happen

Why are you so sure that this is not going to happen? This is the
whole point of what this doing is trying to do.
);

// i /= RAND_MAX / (3);
i = i / 1431655765;

####################################

I'm afraid I'm not following the logic of what you're doing here.

Well first of all you can change the "3" to a "4" for your case. The
point of my logic is to reject a small number of the output results
from rand() in order to make sure the remaining results have the
correct probabilities (in your case it won't matter for the sole
reason that your range is a power of 2). This wont help if your range
is greater than RAND_MAX, but apparently your compiler has RAND_MAX
equal to INT_MAX, so you should be ok.
 
K

Keith Thompson

The FAQ gives a general formula that's almost always wrong. Only
powers of 2 will work correctly, and as I posted I mistakenly thought
the range was [0,3) not [0,3]. So the FAQ answer actually works
correctly for the specific question that the O.P. had, but fails as a
general answer.

The powers of 2 proviso is correct assuming that RAND_MAX is one less
than a power of 2. I suspect it is for most or all real-world
implementations, but the standard doesn't guarantee it, and robust
code shouldn't assume it.
 
P

Paul Hsieh

Jarno A Wuolijoki said:
# Paul Hsieh
#define RAND_RANGE(x) (RAND_MAX / (x))
do {
i = rand();
} while (i >= 3 * RAND_RANGE (3));
i /= RAND_RANGE (3);

Maybe (Maybe? Probably!) I've misunderstood how macros work, but I can't
understand your example. This is how I interpret it:

[snip]
####################################

I'm afraid I'm not following the logic of what you're doing here.

Paul was trying to derive an algorithm that would give you
a truly equiprobable random number between [0..n] if rand()
was a perfect random number generator. Unfortunately, for
most combinations of RAND_MAX and n, this is impossible to
do with any fixed amount of calls to rand().

Its impossible for any pseudo-random number generator. A *true*
U(0,1) distribution that matches the mathematical definition is not
something that you can practically produce. But if we instead ask
what can be practically achieved then we see that getting at least the
very minimal (and almost always sufficient) test of E(|X=x|) = np is
trivial, and achievable. The method described in the C FAQ does *NOT*
achieve this, and given what was posted in this thread I felt it was
extremely important that I point this out.
His implementation was buggy, but the technique is pretty cool
anyway. The trick is to loop until get a number between 0 and
n*((RAND_MAX+1)/n)-1 (*) before proceeding with division.
^^ ^^

*) This is not C, btw. You'll have to tweak your way around
overflow by using tricks like RAND_MAX-n+1 or somesuch.

Ok, yes, I have a bug in my program, but its not this. Using the
formula: n*((RAND_MAX+1)/n)-1 instead of the formula I use might widen
or maximize the range or whatever (I am too lazy right now to
precisely characterize it) but is kind of useless if RAND_MAX ==
INT_MAX (which is apparently the OP's actual situation.) Perhaps you
could create a more convoluted formula which includes a "(RAND_MAX ==
INT_MAX)? ... : ..." but I don't see how that's important.

The key is the "while (i >= RANGE * LARGE_ENOUGH_VALUE)" condition. I
may chose a "LARGE_ENOUGH_VALUE" that's sometimes 1 smaller than it
could be, but that's not going to change the overall correctness of
the algorithm.

My bug was the use of "3" instead of "4" which the original poster
asked for.
 
K

Keith Thompson

Toby Newman said:
# Paul Hsieh


Maybe (Maybe? Probably!) I've misunderstood how macros work, but I can't
understand your example.
[...]

rand() returns a value in the range 0..RAND_MAX (one of RAND_MAX+1
possible values, each equally likely). You want a value in the range
0..N-1 (N possible values, each equally likely).

You can do this straightfowardly if the number of possible values
returned by rand() is an exact multiple of the number of possible
values you want; just divide the value returned by rand() by some
number (using truncating integer division) to reduce it to the proper
range. Paul's RAND_RANGE macro determines the divisor (if you feed it
the right value).

For example, suppose rand() returns values in the range 0..9 (not
really possible), and you want values in the range 0..2. Since 10 is
not a multiple of 3, any mapping you pick is going to introduce a
bias. For example, you might have:

0,1,2 --> 0 (30%)
3,5,5 --> 1 (30%)
6,7,8,9 --> 2 (40%)

The do-while loop skips any rand() calls that yield a value greater
than 8, letting you pick an unbiased distribution:

0,1,2 --> 0 (33+%)
3,4,5 --> 1 (33+%)
6,7,8 --> 2 (33+%)
9 (discarded)

(This could be a problem for a hard real-time system; if rand()
happens to give you a long sequence of 9s, it could throw off your
program's timing.)

The effect isn't as dramatic for larger values of RAND_MAX (which must
be at least 32767), but it's still probably worth the effort.

For that matter, it's probably worth the effort of picking some better
random number generator than rand(), which is of mediocre quality in
many implementations.

If I were redesigning the C standard library, I'd probably add a
second function that takes N as an argument and returns a random
integer in the range 0..N-1, so this particular wheel doesn't have to
be reinvented quite so many times.
 
T

Tim Rentsch

Considering the amount of discussion there has been on this topic, it
might be worth getting the random number generator code given in
Knuth's Graph Base book, which produces high quality pseudo-random
numbers, has a period of at least 2**55, is written to produce
consistent values in different systems, runs fast, and has the nice
property that the low order bits are also nicely random. So
just 'gb_next_rand() & 3' would do nicely here.

A search for 'gb_flip_cycle' should find source code, if anyone is
interested. Or get the Graph Base book, it's a worthwhile addition
to any serious practitioner's library.
 
J

Jarno A Wuolijoki

The key is the "while (i >= RANGE * LARGE_ENOUGH_VALUE)" condition. I
may chose a "LARGE_ENOUGH_VALUE" that's sometimes 1 smaller than it
could be, but that's not going to change the overall correctness of
the algorithm.

Actually I just missed the equals sign in the comparision (just like
the other poster) and thought there was something fishy when
i>=4294967295 was "not" going to happen.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,145
Messages
2,570,828
Members
47,374
Latest member
anuragag27

Latest Threads

Top