Math.random

  • Thread starter Dr J R Stockton
  • Start date
D

Dr J R Stockton

Considerable information about Math.random internals and derived
insecurities is in a 325kB PDF by Amit Klein *via* (!== at)
<http://www.trusteer.com/temporary-user-tracking-in-major-browsers>.

The flaws in the values of Math.random can in principle be overcome by
writing one's own; but the security flaws remain as long as an attacker
can use, in pages that he writes and you read, the present default
Math.random.

It's a good idea to read the newsgroup c.l.j and its FAQ. See below.
 
D

Dr J R Stockton

In comp.lang.javascript message <[email protected]
rlyn.invalid>, Tue, 9 Jun 2009 18:15:46, Dr J R Stockton
Considerable information about Math.random internals and derived
insecurities is in a 325kB PDF by Amit Klein *via* (!== at)
<http://www.trusteer.com/temporary-user-tracking-in-major-browsers>.

The flaws in the values of Math.random can in principle be overcome by
writing one's own;

FYI, I expect soon to have achieved this; for integers, the code should
handle numbers up to the megadigit (hex) range, with acceptable speed
for reasonable sizes. There is, of course, difficulty in generating a
correspondingly-good seed. But, as the seed is exposed, randoms can be
generated reproducibly. Does anyone have the relevant volume of Knuth
handy?
 
E

Evertjan.

Dr J R Stockton wrote on 19 jun 2009 in comp.lang.javascript:
In comp.lang.javascript message <[email protected]
rlyn.invalid>, Tue, 9 Jun 2009 18:15:46, Dr J R Stockton


FYI, I expect soon to have achieved this; for integers, the code should
handle numbers up to the megadigit (hex) range, with acceptable speed
for reasonable sizes. There is, of course, difficulty in generating a
correspondingly-good seed. But, as the seed is exposed, randoms can be
generated reproducibly. Does anyone have the relevant volume of Knuth
handy?

Why megadigit?

99.962468 % of the pseudo-random sequences used in clientside Javascript
are about non negative integers between -0.3 and 365.7.

The perfection of the randomness isn't even that important,
as long as an accidental repeat of the same integer is not within say 4
steps.

So the function can be simple and the seeding can be done by a subset of
+new Date().

A repeatable seed is interesting however for code testing.
 
D

Dr J R Stockton

In comp.lang.javascript message <[email protected]>
Dr J R Stockton wrote on 19 jun 2009 in comp.lang.javascript:


Why megadigit?

It is necessary to include multidigit. Using just ordinary JavaScript
arithmetic, one cannot without rounding errors produce Randoms of more
that 26 bits of value (exact multiplication is required). The
arithmetic has to be done with arrays of digits. JavaScript arrays are
bounded in size only be available memory, and ISTM that I can have the
necessary number of arrays each with 250,000 elements. The elements I
use are digits base 65536, hence a factor of 4 with some to spare. But
using that many is purely optional.

My <js-randm.htm> demonstrates such a generator more-or-less embedded in
a Form, and I an in the middle of changing its insides to a form-
independent version.
99.962468 % of the pseudo-random sequences used in clientside Javascript
are about non negative integers between -0.3 and 365.7.

Fewer that 0.037532% of the Dutch population are EjH ; but that subset
is clearly not insignificant.

The perfection of the randomness isn't even that important,
as long as an accidental repeat of the same integer is not within say 4
steps.

That depends on the application. And algorithms developed in JavaScript
can be re-implemented elsewhere.

So the function can be simple and the seeding can be done by a subset of
+new Date().

Amit Klein's cited paper shows that Math.random is commonly not as good
as one might reasonably have hoped, and also is a security leak.
Providing code for an alternative cannot fix the leak; but it can set a
good example. Seeding, however, is difficult; except that the client
can always be asked to enter a suitable seed, which can either. be used
alone or entangled with the time.

A repeatable seed is interesting however for code testing.

Indisputably.

That could be introduced in ECMA 5+ by simply saying that Math.random()
would do what it "always" has, but Math.random(Object) would use the
object to store the internal state, seed included. With different
Objects, one could have independent Random generators.
 
T

Thomas 'PointedEars' Lahn

Dr said:
Fewer that 0.037532% of the Dutch population are EjH ; but that subset
is clearly not insignificant.

What does "EjH" mean in this context?


PointedEars
 
E

Evertjan.

Dr J R Stockton wrote on 21 jun 2009 in comp.lang.javascript:

[..]
Indisputably.

That could be introduced in ECMA 5+ by simply saying that Math.random()
would do what it "always" has, but Math.random(Object) would use the
object to store the internal state, seed included. With different
Objects, one could have independent Random generators.

Something like this?

Math.random(myObject)
0..0.99999etc, where:
myObject.seed read/write,
doublevalue or null/undefined/false for auto reseeding,
0 for remember seed/step
myObject.step read/write,
integer step since last seeding, default 0 [= first step],
null/undefined/false or nextstep value for next step
cannot have value higher than next step [???]

Math.random()
Auto seeding on first call,
remember seed/step as in present version

Math.random(Null)
Auto reseeding

Math.random(-3.4,365.3)
Integer autoseeding, fast, easy implementation, perhaps not too
perfect, value interval includes endvalues
Math.random(365)
0... 365 included
 
E

Evertjan.

Thomas 'PointedEars' Lahn wrote on 21 jun 2009 in comp.lang.javascript:
What does "EjH" mean in this context?

Incomplete quote hides telling context.

Please quote correctly, Thommie. ;-) See the FAQ.

In fact about 6,1614294516327788046826863832409e-6 % of the Dutch
population is EjH and this number is changing minute by the minute,
depending on seed, fatalities and import/export.

This value probably will change to zero percent in an instant,
or become indefinite or 100 % in a major disaster.
 
D

Dr J R Stockton

In comp.lang.javascript message <[email protected]>
Dr J R Stockton wrote on 21 jun 2009 in comp.lang.javascript:

[..]
Indisputably.

That could be introduced in ECMA 5+ by simply saying that Math.random()
would do what it "always" has, but Math.random(Object) would use the
object to store the internal state, seed included. With different
Objects, one could have independent Random generators.

Something like this?


What I had in mind, and have implemented (apart from some necessary
polishing) in <URL:http://www.merlyn.demon.co.uk/js-randm.htm>, is not
greatly like your suggestions.

There, an algorithm of the form
X[n+1] = (X[n]*A + B) % C
is used, and the Object holds integers X A B & log2(C). Since X*A needs
to be exact, and a good A is likely to be of the order of log2(C-1), one
cannot using this and ordinary arithmetic make a reliable 32-bit
generator. Therefore, X A & B are now arrays of 16-bit integers (could
be 24, since 24+24<53), automatically giving the aforementioned
megadigit capability.

That's not the only possible algorithm; but AIUI it's generally
suitable, with well-chosen A B C, for non-crypto work. One could
without much difficulty implement a long shift-register generator; but
it is necessary to determine efficient taps (algorithm uses a very long
array of bits each cycle moves all bits one place to the left, and sets
the vacated LSB to the XOR of the old MSB and some of the other bits).

The function alters the value of X in the Object, and returns what
should be the result of normalising sign = +, mantissa = any 53-bit
integer, exponent is constant, such that the range is 0.0 <= R < 1.0.

One sets the initial X with either a chosen number or a random one
obtained somehow.
Math.random(myObject)
0..0.99999etc, where:
myObject.seed read/write,
doublevalue or null/undefined/false for auto reseeding,
0 for remember seed/step
myObject.step read/write,
integer step since last seeding, default 0 [= first step],
null/undefined/false or nextstep value for next step
cannot have value higher than next step [???]

Math.random()
Auto seeding on first call,
remember seed/step as in present version

Math.random(Null)
Auto reseeding

Math.random(-3.4,365.3)
Integer autoseeding, fast, easy implementation, perhaps not too
perfect, value interval includes endvalues
Math.random(365)
0... 365 included

I think 0 to 364 would be better; it goes well with zero-based arrays,
and matches other languages.


The core would be more easily implemented in another language, for
example one supporting 32-bit multiplication and 64-bit addition.
 
E

Evertjan.

Dr J R Stockton wrote on 21 jun 2009 in comp.lang.javascript:
I think 0 to 364 would be better; it goes well with zero-based arrays,
and matches other languages.
ok

The core would be more easily implemented in another language, for
example one supporting 32-bit multiplication and 64-bit addition.

Assembler and Forth would be the best and second best.

Couldn't is be done with binary shifts and and binary and/or/exor only?

That would speed up the process as multiplication has overflow checking
that is not needed in randomizing, methinks.

My binary mathematic skills are not that active.

Automatic seeding would still have to be adressed, my preferences would go
to hardware implementation by measuring spontaneous decay times of a radio
isotope with a t/2>500y.

Cyberutopia?
 
D

Dr J R Stockton

In comp.lang.javascript message <[email protected]>
Dr J R Stockton wrote on 21 jun 2009 in comp.lang.javascript:


Assembler and Forth would be the best and second best.

Or Delphi, or (one hopes) a language already used for writing JavaScript
engines.
Couldn't is be done with binary shifts and and binary and/or/exor only?

That would speed up the process as multiplication has overflow checking
that is not needed in randomizing, methinks.

My binary mathematic skills are not that active.

I think one must use at least some ordinary arithmetic when using
JavaScript to construct from pieces a 53-bit mantissa in a Number. On
other languages it can easily be done more directly - just drop the 64
bits into the mantissa of a suitable Extended value.

A code line can be removed if the internal number-length (right operand
of MOD in the description) is always a multiple of 16.

Improvements may be possible elsewhere in my code; X & 0xFFFF is faster
in FF3 than X % 0x10000 so I should test other browsers : also faster,
and always equivalent. Changed.


Automatic seeding would still have to be adressed, my preferences would go
to hardware implementation by measuring spontaneous decay times of a radio
isotope with a t/2>500y.

A well-designed PC would have a hardware random-bit generator, filling a
few kB of local RAM which would be readable in something like the manner
of the RAM of an RTC chip, or better. By the time the software is ready
to read it, there would be sufficient new bits available. That would be
readable by all languages, including JavaScript.

A purely electronic noise generator could easily enough be used, driving
circuits designed to remove both all possible and all detectable
regularity or bias.

I have seen and partly read a book "One Million Random Digits"; it was
typeset from paper tape by flexowriter; a selection of those could be
entered as a seed.
But see <URL:http://www.merlyn.demon.co.uk/humorous.htm#Rand>.
It was much older than
<http://www.amazon.co.uk/Million-Random-Digits-Normal-Deviates/dp/083303
0477/ref=sr_1_1?ie=UTF8&s=books&qid=1245697700&sr=8-1>; probably a First
Edition.

See <http://en.wikipedia.org/wiki/Random#External_links>.
 
D

Dr J R Stockton

In comp.lang.javascript message <2NSdne88j4HQyd3XnZ2dnUVZ8mdi4p2d@gigane
[Better version of Math.random]

I cannot tell whether you have recently looked at my
I think one must use at least some ordinary arithmetic when using
JavaScript to construct from pieces a 53-bit mantissa in a Number.

Two possible strategies would be:

1. work only with fractional numbers between 0 (included) and 1
(excluded), e.g. by using lines like "x -= Math.floor(x)" as needed,

2. work only with unsigned 32-bit integers, e.g. by using lines like
"x >>>= 0" as needed, and provide a function that takes two such
"random" integers, divide them by suitable constants, add the
results and truncate to deliver a fraction in [0..i).

Both using some ordinary arithmetic, of course.
While Strategy 1 may seem the more straightforward, it has the
major drawback that few PRNG algorithms work with floating point
arithmetic. The most obvious candidate is additive lagged Fibonacci,
but while its "randomness" is arguably better than the Lehmer linear
congruential method that you suggest, the results are not considered
good by current standards, and fail the more demanding tests. Still,
one might try to combine several sources, "shuffle" the output, etc.

Strategy 2 allows one to rely on a much more comprehensive repertoire
of well-researched algorithms, from Marsaglia's pragmatic KISS32
(http://groups.google.com/group/comp.lang.fortran/msg/6edb8ad6ec5421a5)
to the impressive theory behind the Mersenne Twister.

I've not been able to remember as much as I would wish to about methods
other than Lehmer. In a JavaScript context, there must be five classes
of algorithms for implementing Math.random :

1 The low-grade muck in actual browsers;
2 Lehmer with at least 53 bits and conversion to IEEE Double done
properly. That would give all multiples of 2^-53 in the range with
equal probability and gave a cycle length of at least 2^53;
3 Something in between 2 & 4;
4 Cryptographically sound, commercial grade;
5 Cryptographically sound, national security grade.

Of those, 1 should be eradicated, and 2 seems good enough for
Math.random (however, the ISO/IEC standard should say something about
expected quality, e.g. Cryptographic quality is not expected; but <as
above>).

of the others, 5 & 4 are beyond me; and 3 seems horribly like falling
between two stools (hoping you know the expression).

As an example, the following implements KISS32:

function Random() {
var x = 123456789;
var y = 362436069;
var z = 21288629;
var w = 14921776;
var c = 0;

function kiss32() {
x += 545925293;
x >>>= 0;

y ^= y << 13;
y ^= y >>> 17;
y ^= y << 5;

t = z + w + c;
z = w;
c = t >>> 31;
w = t & 0x7fffffff;

return x + y + w >>> 0;
}

this.random = function() { // provides a fraction in [0..1)
var x = (kiss32() / 67108864 + kiss32()) / 67108864;
return x - (x >>> 0);
}
}

However, as it stands, each new Random() yields exactly the same
sequence. The constructor should allow arguments to set the object's
initial state (the variables x, y, z, w and c), making sure that
the constraints are met (Marsaglia: "x can be any 32-bit integer,
y can be any 32-bit integer not 0, z and w any 31-bit integers not
multiples of 7559 c can be 0 or 1.") Further methods may be added to
save the state and reset it to a previously saved value.

They can be set, as in my cited page, by passing in an Object containing
those numbers; then, by having several Objects, one can have several
random streams. There can be an Object checker, too.


Even without extra hardware, there are enough sources of randomness on
most PCs to maintain an adequate pool of entropy, like /dev/random.

However, one must consider who is getting access to that pool. If
you've not read Amit Klein's paper (/via/
<http://www.trusteer.com/temporary-user-tracking-in-major-browsers>),
then I think you should. Essentially : if the results of Math.random
depend on *information* in the computer, then a Web page using
Math.random can be a security leak for that information. ISTM that the
pool must be much larger than at first seems necessary, if the tasty
ducks and tadpoles in it are not to be in some danger of being caught.

The problem is how to access them in ECMAScript.

That should be impossible, rather than a problem.
 
D

Dr J R Stockton

In comp.lang.javascript message <rpWdnTScPMPX2d_XnZ2dnUVZ8kNi4p2d@gigane
Dr J R Stockton :

Lehmer's algorithm with a well-chosen multiplier was the method of
choice thirty years ago. Today, it is mainly used as a one-liner to
seed other methods with less statistical bias and whose output cannot
not be as easily predicted from a few known samples.

I'm not so sure about "mainly used"; you may be underestimating
the number who know no better.


Actually, IMHO, quite adequate for its intended uses: somewhat
unpredictable visual effects, games, recreational simulations, etc.

Agreed; nevertheless, since an IEEE Double has 53 bits of resolution,
and 64-bit Lehmer is sufficiently easy to implement, and other good,
easy methods are well enough known (if not by me), it seem inexcusable
to provide anything other than an equivalent to a random selection from
the 53-bit integers then divided by 2^53. Things should be as good as
easily possible, even if the necessity is not obvious.

In Safari 4, Math.random seems to be slightly worse, in at least one
respect, than that in Safari 3.1.1.


But neither can your function RRN1(). It is actually worse than most
Math.random implementations - 32 bits vs. 48.

It is an emulation of Borland's 32-bit one, in Pascal & Delphi (but not
necessarily the latest Delphi). That might be considered of very
adequate quality when introduced, 22 years ago or earlier; but, for
similar reasons, not now. Random in Delphi should return an Extended;
and all one need do for that is to take a good random Int64 and precede
it by a positive signed word constant for the appropriate coded
exponent.

Each value would occur exactly the same number of times in a cycle -
but so would a simple counter. Apart from that, I fail to see what
probability would be equal. For instance, the lowest-order bit would
alternate quite regularly between 0 and 1.

The stated conditions were intended to be necessary, but not to be
satisfactorily sufficient.

An IEEE Double with a regular alternating pattern in its LSB, while not
ideal, is considerably better that what three recent PC browsers
provide, which appear to be IEEE Doubles with 20 or more LSBs all zero.

I fail to see the distinction. Any vendor that sells a "commercial
grade" CSPRNG which are not "national security grade" is a swindler,
since there are plenty of public domain CSPRNGs that are as secure
as anyone, even the NSA or the GCHQ, may use or fear.



I disagree on both points, as above.


There are quite decent ECMASCript implementations of excellent hash
and encryption algorithms that can be used to make even "national
security grade" CSPRNGs by the standard methods using cryptographic
primitives. E.g.: http://www.movable-type.co.uk/scripts/sha1.html
and http://www.movable-type.co.uk/scripts/aes.html. 256 bit AES in
counter mode should really be as secure as anyone would wish, or fear.
(It is approved by the NSA even for TOP SECRET classified documents.)
The main problem is that it requires quite elaborate code.



It seems to me that your placement of the stools is rather arbitrary.
I would definitely set 2 much higher, since there are many PRNGs available
which much better statistical behaviour than Lehmer's algorithm, and
more entropy.

But there seems a risk in trying to persuade browser makers to provide
anything better than what I meant in 2; they might then prefer not to
bother.

On the other hand, even 256 bit AES does not, per se, address the
problem of seeding the generator in a suitably impredictable way.

[Seeds]
They can be set, as in my cited page, by passing in an Object
containing those numbers; then, by having several Objects, one can
have several random streams. There can be an Object checker, too.

Something like this?

<g> IMHO, needs comment : // 134217728 = 2^27 </g>

Something like. Passing in the Seed suffices; my idea would pass in
also the numbers within the function, so that one piece of code could
produce various types of Random.
function Lehmer(seed) {
var x = seed; return (lehmer27() / 134217728 + lehmer27()) / 134217728;
}

I was not sure whether I ought to be able to understand whether, if the
two calls return integers in that range that have no significant
correlation, that should give an IEEE Double with all the desirable
properties = until I realised the value of 2^27. In particular, a
result at least 0.5 must be a multiple of 2^-53, in an IEEE Double; but
a smaller one can perhaps be an odd multiple of 2^-54. It might be
better to use lehmer26 and lehmer27; or to mask out a bit.


My suggestion is to replace Lehmer, e.g., by Marsaglia's KISS32. It
is not much more complicated and passes much more stringent
statistical tests. It also is much less easily predictable from
observed output. Finally, its internal state has slightly more valid
values than there are valid RFC 4122 UUIDs, which makes it at least
as resistant to accidental collisions. Provided, that is, that one
finds a way to seed it with truly random values...

There should be provided an obvious way to seed it with known values
(Delphi : assign to RandSeed); possibly as setRandSeed(X). That moves
the worry into another routine, which the user can provide - somehow.

But it is not a practical solution for the World Wide Web. The challenge
there is to maintain a large enough entropy pool with only the information
available to the browser. I think it is possible - but not by looking
up RAND corporation's One Million Random Digits, or, indeed, on relying
on voluntary user input, which tends not to be random at all.


That depends on the application. Web pages do not necessarily return
results to the server; they can instead be shown at the client end to
the user. In that case, it's the user's responsibility to enter a seed
with enough entropy for the purpose.

I wonder whether the current RAND book contains the same 1,000,000
digits as the first edition - given the estimate that probably one of
the original digits is wrong?



Is there a small set of URLs that you can recommend to Garrett Smith for
FAQ section 5.5, <http://jibbering.com/faq/index.html#randomNumber>,
ones leading to the world of good Random Number generation? FAQ
reference 1 is OK, 2 content is wrong, and you've seen 3 (which has some
such URLs).



IMHO, your thoughts should be known to those working on ECMA 262 Final
Draft 5
<http://www.ecma-international.org/publications/files/drafts/
tc39-2009-025.pdf>
which includes an address for feedback "by July 15, 2009".
 
D

Dr J R Stockton

In comp.lang.javascript message <U_ydnbAppckDzt7XnZ2dnUVZ8jdi4p2d@gigane
Dr J R Stockton :
There should be provided an obvious way to seed [the generator]
with known values (Delphi : assign to RandSeed); possibly as
setRandSeed(X). That moves the worry into another routine, which
the user can provide - somehow.

Yes, but it is not quite trivial to provide a seed that has a
reasonable probability of being *absolutely unique* over zillions of
instances in time and space.

And therefore the problem should be shifted to a different routine; one
version of that can be provided by the system, but the seed must be
transported from one to the other as the value of a variable (simple or
compound). Then one can use instead a better routine to generate the
seed; or one can re-use the same seed to get, when debugging, the same
random numbers again.

In fact, a seed generator is just a better, slower RNG.

Seed generation, seed insertion, and seed usage should be separated.
I shall give it an attempt, though, in another post.



http://en.wikipedia.org/wiki/Pseudorandom_number_generator is good
and should get one started on the subject.

http://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf gives
a comprehensive view of actual generators' performances.

But I'm not sure those references belong in the FAQ, general
pseudo-random number generation is hardly an ECMAScript issue.

Not the montreal one; but presenting "Pseudorandom_number_generator"
will indicate a route to, presumably, all else. It can replace the
current second link.
It is pretty useless, but why is it wrong?

Both say, near enough, "between 0 and 1" but 1 remarks "0 (inclusive) to
1 (exclusive)" and 2 does nothing like that.

But I use 2 and its brothers frequently (but local copy of a more
legible no CSS) version).
Yes. Would you please change the link on my name to a more definitive
version as soon as I have written one ? I forgot to declare var t in
function kiss32(), with the abominable result of making t global.
Hardly the sort of thing to recommend.

Of course. Let me know. Note : the address of this article includes
ISO 8601 YYWW.


I don't know. I rather believe it is an issue to be addressed by
user-defined objects or specialised libraries, since different
solutions may be better for different problems. But if I manage to
bring something together that the group approves of, I shall follow
the consensus.

You manifestly know far more than they would want; but pressing for all
53 effective bits of the Number to be evenly-distributed and random
could be useful. I don't know whether a 53 or 64 bit cycle length might
be preferred. And they could add that crypto-grade is not expected,
which serves the useful purpose of pointing out that better
possibilities exist.
 
D

Dr J R Stockton

In comp.lang.javascript message <JbednQAsP7H5Y97XnZ2dnUVZ8sFi4p2d@gigane
Johannes Baagoe :


Here it is: introducing the Random object. Comments are welcome.

The link in the master of my page is updated, to <that> at Google.
After discussion finishes, you could very easily put your article onto a
web site as a *.txt file; and publishing future changes would not then
require updating URLs.

Comment here is on details only.


The constructor may be called with any number of seeds of any kind -
dates, strings, numbers, events, etc. (They are converted internally
to strings.)

Conversion of new Date() to string loses randomness, as the strings lack
milliseconds. It also adds cross-browser non-reproducibility, as string
formats vary. I suggest Date Objects should be converted first to
Number, using a unary + sign,
You may, for instance, write things like

r = new Random(
new Date(),
+new Date() // if not handled internally
document.location.href,
document.documentElement.clientWidth,
document.documentElement.clientHeight
);

which should make it rather unlikely that two users ever generate the same
sequence of numbers.

For test, it is desirable that it should be possible for two colluding
user (or a self-colluding one) to get the same sequence repeatedly. But
the default should be that repeats are grossly improbable.

function kiss32() {

// IMHO, that needs to cite a reference for kiss32.

if (arg instanceof Date) {
seed += arg.getTime();
// or seed += +arg; // should be equivalent.
seed += arg.getTimezoneOffset();
That gives the *appearance* of adding minutes to milliseconds. It might
need a comment to show that it is not a mistake. World-wide, it only
adds about 5.3 bits of randomness; in a fixed location, 0 or 1 bit. It
may not be worth-while. A user can always put it in as an argument if
wanted.

Feedback on such and other issues would be much appreciated.


For test purposes, it should be possible for people on different systems
in different places to get the same results. For that, they can use an
external new Date(arguments) to simulate new Date() consistently.
Having getTimezoneOffset internally makes matters location-dependent.
 
D

Dr J R Stockton

In comp.lang.javascript message <v_GdnZ3zA6Mdh9nXnZ2dnUVZ8vidnZ2d@gigane
Dr J R Stockton :




I am not sure I understand what you mean, do my proposed Random
objects meet your specifications ?


I'm pretty sure, without actually having executed it yet, that the
version I have most recently seen does so; but not sure about the most
recent one that I had read when I wrote that.

OTOH, you have
reSeed(seed1, seed2, ...), getState(), setState(state)
but no way to generate a state without affecting the generator.

How about offering
makeState(,,,,), setState(state), getState(), reSeed(state)
where reSeed is possibly not needed if any argument of makeState(,,,,)
can be a state? I think it separates the functionality better, which
can often enable users to do things which had not been foreseen.

Analogy : Method getYear, returning 0 to 99 for 1900 to 1999, has that
fault; implementing it as getFullYear would have meant that
those wanting a number less than 100 would have needed to use
%100 (B.C. years can be disregarded here) and those wanting the
full year would always get it. Then, who wants 0 to 99? That's
OK in years 1900-1999, as years 1900-1909 are in the Distant
Past; but in the then-nearly-imminent years 2000 to 2009, it is
"00" to "99" that is needed. Add method getYY, returning a two-
digit string; and +getYY is another way of getting a number 0 to
99 if wanted.

My Date2 Object has a getYY equivalent.


This thread has bifurcated. Perhaps I should say that, while I have
broadband Web access, I still use dial-up for News and Mail. It's a
matter of what software does what with/without an operational licence.

I do not guarantee that new Random(seed1, seed2, seed3) will always
yield the same sequence on different implementations of ECMAScript.
Different implementations of the toString() method of Objects may
yield vastly differing results. If all the seeds are either Numbers
or Strings, it should be reasonably safe, though. And a state saved
with getState() should be completely transportable.

Therefore, .toString on types other than Number or String should be
avoided, where practical. As far as I know, Number.toString is browser-
independent ... but ECMA says it need not be. I imagine it's safe for
integers no bigger than 2^53.

But the main problem I wanted to address is that of providing truly
unique sequences that can be used to make UUIDs.

I rather wonder whether one should expect Math.random to do that. It
might be better to have Math.random being essentially numeric 53 or 64
bit, and to have another, slower method to produce GUID strings.
There's surely a lot to be said for having Random seedable so that it
always gives the same sequence while having no chance of thereby
affecting GUID generation.

In fact, if Math.random can be re-seeded and is used for GUIDs, it may
be possible for an attacker to force a known GUID and to achieve a
breach of security.


Both are much too short, in my opinion. Not that anyone would ever want
to generate 2^53 random numbers in ECMAScript,
...
Same response.



FYI : <js-randm.htm> now has a 32-bit routine in which the
multiplication is done with three 16-bit * 16-bit multiplications,
suitably combined without rounding, by first decapitating. That now
gives the right results, matching Pascal's Random. I won't explain more
here because it might be improved. It should be better than some
current Math.random; but it would not make a good Math.random.
 
D

Dr J R Stockton

In comp.lang.javascript message <O4GdnVpwkKgpsdvXnZ2dnUVZ8qNi4p2d@gigane
Johannes Baagoe :


Dr J R Stockton :


The problem is that apparently, some people *do* use Math.random to
produce supposedly unique IDs. I suppose they call it repeatedly
in order to generate successive groups of hexadecimal characters,
or something like that. But given that in most implementations, the
sequence of "random" numbers is *only* a function of startup time,
any two users starting at the same millisecond end up generating the
exact same IDs.

The best answer is to stop them doing it. One can provide a GUID-
generator in script independently of Math.random. One can hope that
ECMA will add a GUID-generator to the language, specified to be of
sufficient quality.


One can presumably generate entropy by timing the intervals between
keystrokes, after requesting the user to copy-type a text in mixed Welsh
and Finnish while playing the Overture to Tannhauser on one speaker and
some Stockhausen on the other, for distraction.

Given that, if random numbers are needed for arithmetic, the task
probably calls for a few sequences of a lot of them, the random number
generator should be reasonably fast. However, GUIDs are probably not
wanted at a great rate, but need more entropy. Therefore a good GUID
generator would be good for generating good seeds for numeric-random.

There is also, in my opinion, a lot to be said for having Random
*repeatedly* reseedable, e.g., in event handlers, in order to let
it collect entropy. My reSeed method does just that - please note
that is does not *replace* the existing state, it *combines* it in
a complex way with information from the new seeds.


In my approach, it is entirely up to the user

No. It is up to the coder of the web page, not the viewer thereof; an
innocuous looking GUID might hold information dependent on the value of
the seed. The local part of my message-ID, for example, encodes at
least part of the time_t of generation, and I think also the identity of
the generating system.

whether or not to
replace Math.random itself. It could easily be done, e.g., with

var myPRNG = new Random('');
Math.random = myPRNG.random;

and this would indeed provide completely predictable successive
Math.random()s, but I fail to see how introducing a new Random constructor
make matters worse than they are (one may redefine Math.random any way
one deems fit, such are the rules of ECMAScript), and I also fail to see
how this could be exploited unless the victim is uncommonly cooperative.




And same response.

Perhaps an indication of why we seem to be arguing about different
things is in your statement on js-randm.htm: "A pseudo-random sequence
generator is probably used. Such have a definite cycle length, which
cannot be more than 2^n for a generator with an n-bit seed."

Changed. But ISTM that it is common practice to use "seed" to refer to
that from which the next generation of random number is derived; the
Founder of the Line merely being the determining first.


Not so. The sequence cannot be more than 2^n for a generator with an
n-bit *state*. The length of the *seed* is quite another matter. For
instance, one implementation (*) in ECMAStript of the Mersenne
Twister is *seeded* with an arbitrary, possibly zero, number of 32-bit
unsigned integers, but its period is an amazing 2^19937 0 >if one provides much less than 623 such integers. On the other hand,
if one provides only one seed to the constructor, there can only be
at most 2^32 *different* such periods for the resulting instances.

That seems to me to be a partial seed; either it does not provide enough
bits to determine the initial state, which is otherwise indeterminate,
or sufficient implied bits are provided internally.



A generator should be initialised by a variable sufficient to determine
its entire state, and it should be possible to read that state back into
a variable. The variable must be capable of holding as many bits as the
generator can; but an instance of the variable need not appear to have
that many bits.

Example : I initialise 32-bit Lehmer demonstrations with 0, but
that's shorthand for a 64-bit IEEE Double of which only 32 bits-
worth are actually used.

There should be at least one way of generating such state-variables with
entropy-laden values; and the coder should be able to provide other
ways.

I don't think it is *necessary* to provide a routine to add entropy to
what already exists in the generator proper, since (with the above) one
can read the state, add entropy to that, and fully reinitialise the
generator with that -- and that seems a tidier approach, with good
separation of function. But it may be *convenient* to do so, perhaps by
a non-primitive routine.



And now it does not, since it now has two 32*16 multiplications instead.
As it happens, I did the same thing some time ago in an attempt to
simplify the aforementioned version of the Mersenne Twister. Here it is :

function mult32(n1, n2) {
var a0 = n1 & 0xffff,
a1 = n1 >>> 16,
b0 = n2 & 0xffff,
b1 = n2 >>> 16;
return a0 * b0 + (a0 * b1 + a1 * b0 << 16) >>> 0;
}


Slightly related : the numbers AB and CD, where each of A B C D
represent N (=many) digits can obviously be multiplied in time of the
order of (2N)^2 for calculating AC AD BC BD, plus a few times N for
additions. One can rearrange the work to take 3N^2 for multiplication,
plus a few additions/subtractions. That takes 3/4 of the time. Now
partition A into Aa and C into Cc, and the same trick done also for AD
BC BD gets to 9/16 of the time. Recurse a total of log2(N) times, and
get to (3/4)^log2(N). If interested, a Web search for Brassard
Bratley multiplication might lead to it; and one for ISBN
0-13-023169-X should do so.
 
D

Dr J R Stockton

In comp.lang.javascript message <Lt2dncfE8MSCidvXnZ2dnUVZ8jWdnZ2d@gigane
Dr J R Stockton :


Some thoughts on that :

The problem arises because the vast majority of PRNGs generate
integers, not fractions between 0 (included) and 1 (excluded). And
the ones most suitable for ECMAScript produce 32-bit integers

I disagree strongly. The ones most commonly used may be 32-bit; but
they are NOT suitable. There is no difficulty in programming 64-bit
Lehmer, and 64-bit code using the kiss32 approach, on any modern
machine; it just takes a little more work.

Analogy : I have in the past, on a primitive 8-bit machine with a 1 MHz
clock and 3125 bytes of RAM (and no real OS, bar a loader) programmed
three-channel 16-bit * 16-bit accumulative multiplication of, as I
recall, more than 256 products, with square rooting and output (granted,
it did have an 8-bit multiplier, soldered in by me).

VastRand, in <js-randm.htm>, does exact 64-bit Lehmer in JavaScript in
only a few lines of my code; it would be easier in ASM or Pascal (or, I
presume, on C/C++) on a 32-bit machine, especially if the flexibility
were omitted.

64-bit Lehmer, sagaciously converted to 53 bits, may not be the best
solution of its order of complexity; but it is surely better than what
the inferior browsers use for Math.random, when using Knuth-approved
constants.

Conversion from an integer of sufficient random bits to an IEEE Double
of 53-bit randomness is no real problem; one just inserts the best 53
bits into the mantissa, and normalises. VastRand does that, not
necessarily optimally as yet.

In Pascal/Delphi, where the ten-word Extended type exists with an eight-
byte mantissa, it's even easier; just prepend the correct sign and
exponent in a word, and call it an Extended.

, while
the best resolution one can attain in ECMASCript Numbers is 53 bits.

Following details read and largely agreed with.

Let a and b be two "random" unsigned 32-bit integers obtained by some
suitable method.

If they are obtained by successive calls of the same 32-bit method using
one 32-bit state, the second call adds no real randomness. One must
have sufficient state bits; and if they are in different generators the
generators should have co-prime lengths (except maybe if they are truly
independently seeded).
 
D

Dr J R Stockton

In comp.lang.javascript message <Bp6dnTw4NKTQH9jXnZ2dnUVZ8oxi4p2d@gigane
Johannes Baagoe :


Dr J R Stockton :


In that case, they should call the constructor with the same arguments,
like new Random('foo', 'bar') or simply new Random('').

Indeed; but it is the values of the arguments that matters, not the
written form; 'new Date()' is not the same for long. I suggest that the
examples shown should include

r = new Random("Any arbitrary fixed string will give a fixed sequence")

and the text should say that the order of the arguments matters.


Also : if you instead have a single argument which is an array of
pseudo-arguments, that at worst adds two characters [] to each call, and
it enables the initialisation to be handled as a single entity.

var arr = ["Trondheim", "Boulogne", "Cockfosters", "Spaghetti"]
var r1 = new Random(arr)
var r2 = new Random(arr)
My purpose with the given example is to show ways in which an
application programmer could make it very unlikely that two users
generate the same random numbers. If the hashing mechanism of seeds
functions adequately (which should thoroughly analysed and tested),
the biggest risk would be that they both load the same Web page at the
exactly same time in windows of exactly the same size. And if that is
not deemed safe enough, the programmer may add whatever other seeds she
wants to decrease the likelihood even more, perhaps the User-Agent string
or other results of browser sniffing or even cookies.


There is the problem: I don't know of anything that is universally
accessible to ECMAScript and more unforeseeable than the current
time. It is rather improbable that two users initialise their random
numbers at exactly the same millisecond, but it is not *grossly*
improbable. Not to mention that I do not really trust all platforms
to measure time with true millisecond granularity.


Windows XP sp3 IE7 JScript 5.7.18066 changes the value of +new Date() 64
times per second, jumping by 15 or 16 milliseconds. Ad I recall, Win98
1st Edn IE4 changed it 18.2 times a second. My page section
<URL:http://www.merlyn.demon.co.uk/js-dates.htm#Ress> refers, including
(on yellow) results for your browser.


Browser JavaScript cannot generate unique GUIDs without input by the
user (AIUI, it can open a window showing a site that shows fresh random
numbers, but it cannot see those numbers; it would have to ask for
copy'n'paste).




ISTM that we are mingling two * two = four distinct questions : the
combinations of

Generating random IEEE Doubles
Generating GUIDs

and

Doing those with ECMA 262-3 script
Considering how ECMA 262-N could be better

A scripted GUID generator has only time as a source of randomness; and
if other system information is used then there is a security leak, even
if only that an external attacker who can read the GUIDs may be able to
tell that two GUIDs probably came from the same system.

So I think it would be better not to consider implementing GUIDs in
browser JavaScript - Windows Script Host scripting is another matter,
since that can search the whole computer for entropy sources - or to do
that separately from considering what Math.random should be or could be
substituted by.

Not even that, since two users at the same location *at the same time*
are unlikely to have different TimezoneOffsets. It could only happen
if at least one of them decided to use a foreign time zone setting
on her computer.

I meant 0 or 1 bit for a single long-term user. It could also happen if
a travelling user had retained the home setting.


No problem at all, just don't use any of the mechanisms provided to
make the outputs truly *random*. The shortest way to do that is to
call the constructor with the empty string as sole argument, instead
of no argument at all : new Random('') instead of new Random().

To be pedantic, new Random(0) is shorter and should serve!
 
D

Dr J R Stockton

In comp.lang.javascript message <x6ydnWI21dMBbtvXnZ2dnUVZ8oRi4p2d@gigane
Johannes Baagoe :


Dr J R Stockton :


Why not? I am talking about the number of bits they produce at a time,
not their internal state which is always larger in the case of good
PRNGs, and often *much* larger.

OK; but it seems silly nowadays to have RNGs with many more states and
to get only 32-bit values.

Quite, but why would one bother when some of the most suitable
operations - shifts, XORs, etc - happen to operate on 32 bits
in ECMAScript, and several 32-bit well-researched PRNGS which
are straightforward to code pass all known tests with flying
colours, whereas even the best LCG fails them miserably,
and *must* fail them, for well-known theoretical reasons ?
(http://www.pnas.org/content/61/1/25.full.pdf)

You are still concentrating on systems suitable for generating GUIDs,
and possibly for crypto work. I am interested in getting an adequate
Math.random, preferably native to browsers, with the sort of performance
that a Double should give - equivalent to a fixed exponent, 53 random
mantissa bits, and a cycle length of at least 2^53, with a cycle length
over 2^64 being overkill.

An attempt to over-improve on Math.random by something that looks too
complicated may be self-defeating.

To get a non-native version, one must code in JavaScript, where logical
operations on over 32 bits are cumbersome. But one hopes that the
JavaScript engine is not written entirely in JavaScript. IIRC, the 8086
language has 16-bit shifts with carry, so that a 64-bit shift by 1
requires just 4 ops; and the 386 language may have 32-bit shifts; and
modern machines should be better than that.

Yes, I must confess that I haven't understood the code, not even to
the point of figuring out what the multiplier may be.

Well, it comes from the control marked "Mult", which has a name of Mult,
and is read into Obj.Mult. The default value is from Knuth, as in your
next paragraph.

Knuth recommends C. E. Haynes's 6364136223846793005, I know of no other
proposal. *Nobody* does 2^64 LCGs, there is simply no point. It is
like investing lots of efforts to build a bigger and better Zeppelin.



What do you mean by "real randomness" ? If the generator produces
successive numbers that are not correlated in any meaningful way,
as it should (and LCGs fail to do), what else would be needed ?

With a known algorithm, a 32-bit state and a cycle length of 2^32, the
result of the second call is knowable once the result of the first is
known.
Quite so, and 64 is probably not enough. The simplest generator I
*know* passes L'Ecuyers' BigCrunch battery of tests is his own MRG32k3a
which has a state of 192 bits. Marsaglia's KISS32, which I *suspect*
passes BigCruch as well (it passes Crunch, I haven't yet had the time
to test BigCrunch which is rather lengthy) has somewhat less than 127.

You are, of course, aiming for something much better than Math.random
needs, provided that Math.random is not used for crypto or GUIDs.

In any case, Math.random cannot be truly suitable for those, since it
gives a Double and those will want integers or strings.

That is indeed an important consideration for combined generators,
which are possibly the simplest way to make good ones.


That seems to me to be a completely different issue, am I missing
something ?

I thought that I saw a *possible* loophole to the co-prime requirement;
perhaps you know that it's not an actual loophole.
 
D

Dr J R Stockton

In comp.lang.javascript message <x6ydnZ0x1dPwldrXnZ2dnUVZ8oT_fwAA@gigane
Dr J R Stockton :


OK, but it only matters if one wants repeatability. In the more usual
case where that is exactly what one wants of avoid, it doesn't.

Better wording would be like "the results depend on the order of the
arguments".

Also : if you instead have a single argument which is an array of
pseudo-arguments, that at worst adds two characters [] to each call,
and it enables the initialisation to be handled as a single entity.
var arr = ["Trondheim", "Boulogne", "Cockfosters", "Spaghetti"]
var r1 = new Random(arr) var r2 = new Random(arr)

Well, since ECMASCript provides a standard means (the "arguments"
pseudo-Array) of handling any number of arguments, I think I shall
stick with it. It saves the trouble of explaining and enforcing
a particular number and type of seeds.

That pseudo-Array is accessible within the routine; not outside it.
Above, r1 & r2 (insert missing semi-colon) hold seed-sets which can
conveniently be re-used, as may be required in testing.

I'm not sure I understand what you mean, but my claim is that the user
*can* provide enough entropy without ever noticing it.

Agreed : the input can be unconscious (and not reproducible) or
conscious (in which case it may be bad, but can be reproduced).




I fail to see how that could be done in my scenario.

The worst kind of security leaks are those which are hardest to see.

ISTR that, during WWII, Enigma operators could often be recognised by
their Morse keying habits, which leaks information about troop
movements; and different operators would have different habits of
setting up their encoders sub-optimally, which helped the decoding
effort. Or something like that. The paper I originally cited shows
that identifying that two independent sessions are likely to be from the
same person can be of interest to the naughty.

Never heard of it ;)

EXMAScript in Internet Explorer on the PC is implemented by an engine
quite distinct from MS IE itself. The same engine is used by WSH to run
what MS call JScript, using a different DOM. WSH can also run VBScript,
a sort of Visual Basic, in that DOM - and that is much more common and
used for systems programming. That description is substantially, but
not necessarily pedantically, correct. To see a newsgroup for it,
<http://groups.google.com/group/microsoft.public.scripting.vbscript/topi
cs>.

Well, if an object can make sufficiently "random" numbers to make
UUIDs, it can surely also make random numbers to play with, can't it ?

Undoubtedly. But if it is perceived as large and complicated, it may be
ignored.

Perhaps it would be well to use a Subject line "Math.random" only for
discussing the generation of random Numbers of sufficient quality for
most purposes, and one perhaps of "Generating GUIDs" for GUIDs.


My <js-randm.htm> now includes Rand64 which is VastCalc reduced in
flexibility to just implement

X = ((X * A) + C) modulo M
A = 0x5851F42D4C957F2D ; C = 1 ; M = 2^64

with modulo being done by using 64-bit arithmetic.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,098
Messages
2,570,625
Members
47,236
Latest member
EverestNero

Latest Threads

Top