Random Integers from 0 to 999

C

CBFalconer

Keith said:
.... snip ...

Do you know of a specific C library implementation that uses an
algorithm in which rand() never returns 0?

No. :) But I never bothered looking.
 
T

Tim Rentsch

Eric Sosman said:
Keith said:
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

If RAND_MAX is something reasonable, like 2**32 - 1 or less, do
100*(RAND_MAX+1) iterations, and if zero comes up less than 10 or more
than 1000 times, the implementation of rand() is no good anyway. (If
RAND_MAX is >= 2**32, do a similar test with 100*2**32 iterations,
checking for values < RAND_MAX/2**32.)

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.
 
M

Michael Wojcik

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.

Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. CMWC PRNGs are flexible, appear to have
excellent characteristics (they have long periods, pass all of
George's DIEHARD tests, etc), and are based on relatively simple
theory. (Output bits are the binary digits of the fractional portion
of a ratio with a long period in that representation, if memory
serves.) Then there's the Mersenne Twister and so forth.

At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.

On a related note: there was a thread on sci.crypt not long ago
regarding producing an unbiased subrange of the PRNG's range (a
common task which we were recently discussing here) while consuming
the minimum possible number of bits from the generator's output. In
other words, someone presented a more sophisticated and less wasteful
approach than just throwing away everything below the largest in-
range multiple of the desired range. That may be of interest to PRNG
tinkerers. It shouldn't be hard to find with Google Groups.
 
E

Eric Sosman

Michael said:
At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.

Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense. Also, even
for a finite set of generators it is hard to justify "better"
across all possible problem spaces.

It seems that everybody[*] eventually latches onto a favorite
substitute for rand(). A few years ago it was proposed that
the C Standard should mandate the "Mersenne Twister" generator
(as it was before it was fixed ...). Now there's a fan base for
Knuth (in whatever version is now current; it, too, has been fixed)
and for Marsaglia (a name to reckon with in RNG's, although it's
not clear which of his many generators is today's favorite). My
own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.

[*] Yes, I confess it: I am as guilty as any. Permit me, if
I may, to introduce you to the "Minimal Standard Random Number
Generator" of Park and Miller, CACM volume 31 number 10 (Oct 1988).
It Rulez!
 
T

Tim Rentsch

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.

Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. [snip]

The CMWC code that I was able to find (some of it in CLC postings)
usually used 'long long'. Maybe I just wasn't looking in the right
places. I'd like to see a write-up that explains enough so I can see
why and how the code is working, but that doesn't bury the reader in a
blizzard of terminology and notation that's typical of a lot of the
writing in mathematics. Also code that doesn't have to depend on a
'long long' or 'uint64_t' type. On the plus side, the CMWC PRNG's
seem pretty lightweight in terms of memory footprint, so if someone
wanted a large number of random number streams then CMWC looks like
it might be a good choice.

By the way, I had trouble tracking down web articles on CMWC, because
most of the web writing on CMWC says complImentary rather than
complEmentary.

On a related note: there was a thread on sci.crypt not long ago
regarding producing an unbiased subrange of the PRNG's range (a
common task which we were recently discussing here) while consuming
the minimum possible number of bits from the generator's output. In
other words, someone presented a more sophisticated and less wasteful
approach than just throwing away everything below the largest in-
range multiple of the desired range. [snip]

For practical purposes I'm inclined to think it's not worth the
trouble. Even in the worst case where the subrange is one bigger than
half the PRNG range, it still takes less than two calls to the PRNG on
the average to get an in-range result. Unless the time cost of
producing a PRN is pretty high (and it shouldn't be), it seems like
it would be simpler and faster to use the usual approach.
 
T

Tim Rentsch

Ben Pfaff said:
Or, if you want a solution in portable C, you can use mine:
http://www.stanford.edu/~blp/writings/clc/random.html

The GraphBase code is also portable C, I believe, except for one
dependency that is easily corrected - the 'long' operands that are
being subtracted should be cast to 'unsigned long' before subtracting.
The GraphBase code is also portable in the sense that it produces the
same values on all systems. It doesn't have implementation dependent
behavior (assuming the noted change to 'unsigned long' has been made).
 
T

Tim Rentsch

Eric Sosman said:
Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense. Also, even
for a finite set of generators it is hard to justify "better"
across all possible problem spaces.

I think you may have over-snipped. Michael was responding to a
comment (by myself) about implementations of rand(). Certainly it
seemed to me like his comment could or should be read about
implementations of rand(), not necessarily about rand() generally.

And, to defend the generic point - if one's worry about rand() is
that it might be bad on some (future) platform, there are good
replacements that are "better" in that they will not misbehave on
those platforms. So that's a sense of better that seems both
reasonable and defensible.

It seems that everybody[*] eventually latches onto a favorite
substitute for rand().

There certainly is some evidence for your thesis. Among the three
people responding to my mention of the GraphBase RNG, there were
four different ("favorite") alternatives suggested.

I should say that the GraphBase RNG is not my "favorite" RNG. It is
the one I recommend to people when they want to rely on something
better than rand(). Mostly that recommendation is not because it's
fantastic at generating random numbers; I do think it's reasonably
good but I don't think it's fantastic. Rather it's because it's
fairly good along several different axes, and not really bad along any
axis. All the other RNG's that I've seen are weak in one aspect or
another - slow, large memory footprint, not as portable in one way or
another, smaller cycle length, and/or not explained well enough. I
think most implementors want code that they really understand and can
use without the code being "black magic"; very few RNG's satisfy this
property, and IMO the GraphBase RNG is one of those that do. The
GraphBase RNG is also the only one I've seen (in recent reading) that
includes a test case to make sure the RNG has been coded correctly.

A few years ago it was proposed that
the C Standard should mandate the "Mersenne Twister" generator
(as it was before it was fixed ...). Now there's a fan base for
Knuth (in whatever version is now current; it, too, has been fixed)
and for Marsaglia (a name to reckon with in RNG's, although it's
not clear which of his many generators is today's favorite). My
own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.

I agree with you - specifications for rand() should be left more or
less as is. But I hope that doesn't stop people from adding newer,
more dependable RNG's to the set that is presumed commonly available.
 
W

websnarf

Eric said:
The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

No, but there *are* statistical statements you *CAN* make. Either:

1. The calls to rand() are not independent

or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.

This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.
 
C

Chris Croughton

No, but there *are* statistical statements you *CAN* make. Either:

1. The calls to rand() are not independent

At least one implementation had the bottom bits cycling 0 1 2 3 on
successive calls. Not possible to detect statistically but detectable
using other methods.
or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.

Or more generally that some values are represented more than others (one
I saw never produced RAND_MAX as a value).
This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.

Not true, some of us are sensitive to it, and prefer to use truly random
sources when they are available even though they are not "portable C"
(hardware random events, Linux /dev/random, timing keystroke entry,
etc.; GPG for instance will use whichever of those is around when
generating keys)). However, for a source of repeatable "random-ish"
numbers rand() is "good enough", and the 'repeatable' part is often the
most important (to be able to reproduce a problem for debugging).

The Fortran RNG may be 'better' in some senses, but it still isn't
totally random (unless the Fortran spec. now insists that the hardware
has an atomic source of random data!). The same will apply in any
language, people really concerned about it will use their own
implementations.

Chris C
 
M

Michael Wojcik

Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense.

Sure. In this case, though, a "better" one is merely one that
produces less dissatisfaction in the subject (since dissatisfaction
with rand() is a precondition). Especially perverse cases aside that
looks eminently achievable, as you yourself suggest:
It seems that everybody[*] eventually latches onto a favorite
substitute for rand().

There you are, then. Everyone will be able to find a "better"
substitute for rand, on the grounds that they will feel that it's
better, and so it will displease them less. What's programming for,
if not to console programmers?
My own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.

Agreed.

--
Michael Wojcik (e-mail address removed)

What is it with this warm, quiet, nauseating bond between them?
-- Rumiko Takahashi, _Maison Ikkoku_, trans. Mari Morimoto, adapt. Gerard
Jones
 
M

Michael Wojcik

Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. [snip]

The CMWC code that I was able to find (some of it in CLC postings)
usually used 'long long'.

Yes, I think the version I have does too, though if memory serves
there were no great differences in adapting it to use a pair of
unsigned longs, if long has fewer than 64 bits. It's been a while
since I looked at it, though.
By the way, I had trouble tracking down web articles on CMWC, because
most of the web writing on CMWC says complImentary rather than
complEmentary.

Oh, well. (Perhaps it *is* "complimentary": the multiply says
something favorable about its operands, or happens for free. I
suspect this is just a case of homophone confusion, though.)
For practical purposes I'm inclined to think it's not worth the
trouble.

Probably not, though it's a fun discussion. I can see some areas
where it might be useful, though, such as providing a deterministic-
time unbiased generator for a real-time application or making it
easier to blind the time and power consumption of the PRNG for crypto
purposes.

--
Michael Wojcik (e-mail address removed)

Duck: No secret what's worth a hoot ought to be kept quiet.
Pogo: Secrets is usually perty doggone fascinatin'.
Duck: Egg-zackly ... it's completely illogical to keep a secret secret.
Pogo: An' unfair. -- Walt Kelly
 
R

Richard Bos

1. The calls to rand() are not independent

or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.

This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.

I dare say that people who use C for mathematical purposes _are_ aware
of the above, and moreover are aware that

Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin.
Alfred E. Neumann, 1951

Richard
 
G

Grumble

Keith said:
If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

I was under the impression that the PRNG discussed in [1] had been
used often in the old days?

[1] "Random number generators: good ones are hard to find"
http://www-scf.usc.edu/~csci105/links/p1192-park.pdf

u_0 = 1
u_{n+1} = 16807*u_n mod (2^31-1)

Come to think of it, an implementor could very well return u_n - 1
and set RAND_MAX to 2^31-3...
 
K

Keith Thompson

I dare say that people who use C for mathematical purposes _are_ aware
of the above, and moreover are aware that

Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin.
Alfred E. Neumann, 1951

I dare say most of them are aware that the quotation is by John von
Neumann, not the guy on the cover of Mad Magazine.

(There is, of course, a connection between Mad Magazine and computer
science; google Potrzebie Knuth for details.)
 
R

Richard Bos

Keith Thompson said:
I dare say most of them are aware that the quotation is by John von
Neumann, not the guy on the cover of Mad Magazine.

(There is, of course, a connection between Mad Magazine and computer
science; google Potrzebie Knuth for details.)

Or buy Van der Linden's Expert C Programming.

Richard
 
G

Grumble

Grumble said:
Keith said:
If an implementation's rand() function never returns 0, the
implementation is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

I was under the impression that the PRNG discussed in [1] had been
used often in the old days?

[1] "Random number generators: good ones are hard to find"
http://www-scf.usc.edu/~csci105/links/p1192-park.pdf

u_0 = 1
u_{n+1} = 16807*u_n mod (2^31-1)

Come to think of it, an implementor could very well return u_n - 1
and set RAND_MAX to 2^31-3...

Has this implementation of rand() been used often?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,162
Messages
2,570,896
Members
47,434
Latest member
TobiasLoan

Latest Threads

Top