O
orz
I am currently in the process of putting together a C++ library that
includes a variety of (pseudo-)random number generators (RNGs) and
related functionality, and I am interested in what people think the
best interface to an individual RNG instance would be. The current
interface has each RNG implemented as a class or template class with
the following methods:
//====== random number generating methods
Uint8 raw8 ();
Uint16 raw16 ();
Uint32 raw32 ();
Uint64 raw64 ();
//random bits - raw8() will return 0 to 255 with uniform probability,
//... raw16() will return 0 to 65535 with uniform probability, etc.
Uint32 randi ( Uint32 max );
Uint32 randi ( Uint32 min, Uint32 max ) {return randi(max-min) + min;}
//random integer - returns a uniform random integer less than "max"
and no less than "min".
//if no "min" is provided then the minimum is treated as zero
//note that there is no possible return value that could meet those
criteria if "min" is equal to "max"...
//... in that case, it will return any possible 32 bit value,
equivalent to raw32()
//also note that max must be greater than min...
//...if max is less than min then it will return any value that is >=
min OR < max
//that produces aproximately correct behavior in most cases when
negative signed integers
//... are used instead of the unsigned integers that these methods are
prototyped for
Uint64 randli ( Uint64 max );
Uint64 randli ( Uint64 min, Uint64 max ) {return randli(max-min) +
min;}
//random long integer - same as random integer except using 64 bit
values instead of 32 bit values
Uint32 randi_fast ( Uint32 max );
Uint32 randi_fast ( Uint32 min, Uint32 max ) {return randi_fast(max-
min) + min;}
//fast random integer - same as random integer, except optimized for
speed instead of correctness
//specifically, the distribution returned is only approximately
uniform instead of exactly uniform
//also, behavior when max == min is different - in that case, the
return value is
//... always equal min, just as if max was equal to min+1
//no 64 bit version is provided for the fast variant of randi
float randf ();
float randf (float max) {return randf() * max;}
float randf (float min, float max) {return randf() * (max-min) + min;}
//random floating point number - returns a uniform random floating
point number less than "max" and no less than "min"
//if no "min" is provided, then the minimum is treated as zero. if no
"max" is provided then the maximum is treated as 1
double randlf ();
double randlf (double max) {return randlf() * max;}
double randlf (double min, double max) {return randlf() * max;}
//random long floating point number - as random floating point
numbers, only double-precision instead of single
//====== end of random number generating methods
That's the basics for getting output from an RNG using this
interface.
I'm not at all sure about having the maximum values for randi
exclusive but the minimum values inclusive. Possibly randi (and
randi_fast and randli) should be changed to treat both max and min as
inclusive.
I'm also considering overloading to make more of those methods share a
common name, so that the compiler would automatically figure out which
to use. randi and randli could be combined, randf and randlf could be
combined, and conceivably randi and randf could be combined. However,
I'm a bit nervous about the extra ambiguity created by function
overloading, and in a few cases some of those functions would have to
get dropped (the zero parameter versions of randf and randlf in
particular). And I don't like the compile errors generated when the
compiler can't figure out which version to use.
Note this does not include any non-uniform distributions... the plan
is that for non-uniform distributions you use an external adaptor
object that takes an RNG as a parameter and uses it to generate
whatever distributions you want. That would look like this:
double a = distributions::gaussian(rng, mean, deviation);//
distributions::gaussian is a template function
They are not included inside the RNG class definitions because there
are a whole bunch of distributions that might be of interest to some
mathematician somewhere, and I don't want to clutter up the interface
with that many. But it would be possible to add one or two of the
most common distributions as methods of the RNGs if that was an
inconvenient interface to that functionality.
//====== state management methods
void seed32 (Uint32 seed);
void seed64 (Uint64 seed);
void seed (EntropyPool *seed);
//seeding functions
//seed the RNG with either a 32 bit integer, a 64 bit integer, or a
//... special object designed to help with special seeding
requirements
int serialize ( Uint8 *dest, int maxlength );
//writes the state to the destination pointer, provided that
//... the state would fit in to maxlength bytes or less
//if the state will not fit, it returns 0, otherwise it returns
//... the number of bytes written
int deserialize ( Uint8 *src, int maxlength );
//reads the state from the source pointer, provided that
//... maxlength bytes is sufficient to read the state from
//if maxlength is insufficient or if there is an error then it
//.. will return 0, otherwise it returns the # of bytes read
//====== end of state management methods
That's the basics for managing the state of an RNG using this
interface. Serialize and deserialize for saving state in a platform-
independant manner, seed for seeding, what more could you want?
Other details in the current interface:
The underlieing RNG engine definitions don't actually define all of
the standard methods... many are added by by adaptor templates that
convert the underlying RNG engine to this interface.
Some RNGs support additional special functionality (fast-forwarding,
rewinding, cryptographic checkpoints, etc).
The normal RNG interface has no virtual methods (and thus no vtable),
but you can get a virtualized version of any RNG, with all standard
interface methods available as virtual methods.
Also, there are currently 4 broad categories of ways RNGs are defined
in the library at the moment, which seems a bit excessive:
1. Standard RNGs:
Provides the complete standard interface as a simple class. Intended
for normal RNG use. No virtual methods or vtable. Only a few RNGs
algorithms are provided this way.
2. Raw RNGs:
The core of each RNG algorithm is provided as a simple class or
templated class, and it is adapted to the standard interface by a
system of templates. This is advantegous because it results in full
speed and functionality RNGs from minimal code, but it can also
produce slow compile times and possibly even extra sensitivity to
compiler bugs, so it's only really recommended for use in research
projects or other applications that may need a wide variety of RNG
implementations. A very large number of RNG algorithms are provided
this way - every algorithm included in the library, including
duplicates of those are are provided by other means.
3. Virtual RNGs:
Provides the complete standard interface through vtable, so that you
can pick which RNG you want to use at runtime. By default only a few
RNGs are available this way, but you can easily add support for any
Raw RNG via a simple template.
4. Inline-only RNGs:
Just in case you happen to like this stuff but don't want to link with
it or are unable to link with it, a very small number of RNG
algorithms are provided in an inline-only format that does not require
linking with this library. Inline-only RNGs may be missing some
functionality (like seeding from EntropyPool objects) because they are
completely stand-alone and don't rely on anything else from this
library except the integer type definitions (Uint8, Uint16, Uint32,
Uint64).
Unfortunately, I happen to be quite fond of each of those... I don't
want to have to deal with the long compile times of Raw RNGs in
ordinary applications that just happen to use RNGs, nor do I want
hundreds of RNG algorithms to each have to define as much code for
itself as the Standard RNGs, and certainly the virtual RNGs are needed
at least for research projects but the vtable calls have too much
overhead for normal use, and the inline-only RNGs are nice to have
around if there is difficulty building this library or linking to it
for some reason (or if you just don't want to link with it).
Other things to go in the library include:
statistical tests for RNG output
hash functions
statistical tests for hash functions
functions to adapt uniform random numbers to the gaussian distribution
cryptographic functions
tests for RNGs based upon automated analysis of their internal state
rather than of their output
etc
includes a variety of (pseudo-)random number generators (RNGs) and
related functionality, and I am interested in what people think the
best interface to an individual RNG instance would be. The current
interface has each RNG implemented as a class or template class with
the following methods:
//====== random number generating methods
Uint8 raw8 ();
Uint16 raw16 ();
Uint32 raw32 ();
Uint64 raw64 ();
//random bits - raw8() will return 0 to 255 with uniform probability,
//... raw16() will return 0 to 65535 with uniform probability, etc.
Uint32 randi ( Uint32 max );
Uint32 randi ( Uint32 min, Uint32 max ) {return randi(max-min) + min;}
//random integer - returns a uniform random integer less than "max"
and no less than "min".
//if no "min" is provided then the minimum is treated as zero
//note that there is no possible return value that could meet those
criteria if "min" is equal to "max"...
//... in that case, it will return any possible 32 bit value,
equivalent to raw32()
//also note that max must be greater than min...
//...if max is less than min then it will return any value that is >=
min OR < max
//that produces aproximately correct behavior in most cases when
negative signed integers
//... are used instead of the unsigned integers that these methods are
prototyped for
Uint64 randli ( Uint64 max );
Uint64 randli ( Uint64 min, Uint64 max ) {return randli(max-min) +
min;}
//random long integer - same as random integer except using 64 bit
values instead of 32 bit values
Uint32 randi_fast ( Uint32 max );
Uint32 randi_fast ( Uint32 min, Uint32 max ) {return randi_fast(max-
min) + min;}
//fast random integer - same as random integer, except optimized for
speed instead of correctness
//specifically, the distribution returned is only approximately
uniform instead of exactly uniform
//also, behavior when max == min is different - in that case, the
return value is
//... always equal min, just as if max was equal to min+1
//no 64 bit version is provided for the fast variant of randi
float randf ();
float randf (float max) {return randf() * max;}
float randf (float min, float max) {return randf() * (max-min) + min;}
//random floating point number - returns a uniform random floating
point number less than "max" and no less than "min"
//if no "min" is provided, then the minimum is treated as zero. if no
"max" is provided then the maximum is treated as 1
double randlf ();
double randlf (double max) {return randlf() * max;}
double randlf (double min, double max) {return randlf() * max;}
//random long floating point number - as random floating point
numbers, only double-precision instead of single
//====== end of random number generating methods
That's the basics for getting output from an RNG using this
interface.
I'm not at all sure about having the maximum values for randi
exclusive but the minimum values inclusive. Possibly randi (and
randi_fast and randli) should be changed to treat both max and min as
inclusive.
I'm also considering overloading to make more of those methods share a
common name, so that the compiler would automatically figure out which
to use. randi and randli could be combined, randf and randlf could be
combined, and conceivably randi and randf could be combined. However,
I'm a bit nervous about the extra ambiguity created by function
overloading, and in a few cases some of those functions would have to
get dropped (the zero parameter versions of randf and randlf in
particular). And I don't like the compile errors generated when the
compiler can't figure out which version to use.
Note this does not include any non-uniform distributions... the plan
is that for non-uniform distributions you use an external adaptor
object that takes an RNG as a parameter and uses it to generate
whatever distributions you want. That would look like this:
double a = distributions::gaussian(rng, mean, deviation);//
distributions::gaussian is a template function
They are not included inside the RNG class definitions because there
are a whole bunch of distributions that might be of interest to some
mathematician somewhere, and I don't want to clutter up the interface
with that many. But it would be possible to add one or two of the
most common distributions as methods of the RNGs if that was an
inconvenient interface to that functionality.
//====== state management methods
void seed32 (Uint32 seed);
void seed64 (Uint64 seed);
void seed (EntropyPool *seed);
//seeding functions
//seed the RNG with either a 32 bit integer, a 64 bit integer, or a
//... special object designed to help with special seeding
requirements
int serialize ( Uint8 *dest, int maxlength );
//writes the state to the destination pointer, provided that
//... the state would fit in to maxlength bytes or less
//if the state will not fit, it returns 0, otherwise it returns
//... the number of bytes written
int deserialize ( Uint8 *src, int maxlength );
//reads the state from the source pointer, provided that
//... maxlength bytes is sufficient to read the state from
//if maxlength is insufficient or if there is an error then it
//.. will return 0, otherwise it returns the # of bytes read
//====== end of state management methods
That's the basics for managing the state of an RNG using this
interface. Serialize and deserialize for saving state in a platform-
independant manner, seed for seeding, what more could you want?
Other details in the current interface:
The underlieing RNG engine definitions don't actually define all of
the standard methods... many are added by by adaptor templates that
convert the underlying RNG engine to this interface.
Some RNGs support additional special functionality (fast-forwarding,
rewinding, cryptographic checkpoints, etc).
The normal RNG interface has no virtual methods (and thus no vtable),
but you can get a virtualized version of any RNG, with all standard
interface methods available as virtual methods.
Also, there are currently 4 broad categories of ways RNGs are defined
in the library at the moment, which seems a bit excessive:
1. Standard RNGs:
Provides the complete standard interface as a simple class. Intended
for normal RNG use. No virtual methods or vtable. Only a few RNGs
algorithms are provided this way.
2. Raw RNGs:
The core of each RNG algorithm is provided as a simple class or
templated class, and it is adapted to the standard interface by a
system of templates. This is advantegous because it results in full
speed and functionality RNGs from minimal code, but it can also
produce slow compile times and possibly even extra sensitivity to
compiler bugs, so it's only really recommended for use in research
projects or other applications that may need a wide variety of RNG
implementations. A very large number of RNG algorithms are provided
this way - every algorithm included in the library, including
duplicates of those are are provided by other means.
3. Virtual RNGs:
Provides the complete standard interface through vtable, so that you
can pick which RNG you want to use at runtime. By default only a few
RNGs are available this way, but you can easily add support for any
Raw RNG via a simple template.
4. Inline-only RNGs:
Just in case you happen to like this stuff but don't want to link with
it or are unable to link with it, a very small number of RNG
algorithms are provided in an inline-only format that does not require
linking with this library. Inline-only RNGs may be missing some
functionality (like seeding from EntropyPool objects) because they are
completely stand-alone and don't rely on anything else from this
library except the integer type definitions (Uint8, Uint16, Uint32,
Uint64).
Unfortunately, I happen to be quite fond of each of those... I don't
want to have to deal with the long compile times of Raw RNGs in
ordinary applications that just happen to use RNGs, nor do I want
hundreds of RNG algorithms to each have to define as much code for
itself as the Standard RNGs, and certainly the virtual RNGs are needed
at least for research projects but the vtable calls have too much
overhead for normal use, and the inline-only RNGs are nice to have
around if there is difficulty building this library or linking to it
for some reason (or if you just don't want to link with it).
Other things to go in the library include:
statistical tests for RNG output
hash functions
statistical tests for hash functions
functions to adapt uniform random numbers to the gaussian distribution
cryptographic functions
tests for RNGs based upon automated analysis of their internal state
rather than of their output
etc