Puzzled about random initialisation

R

Robin Becker

We've been queried about the randomness of some filenames we're producing.
I worked through and seemed to satisfy myself that random is being initialised
from the system time in C
time(&now)
init_genrand(self, (unsigned long)now);

where now is a time expressed in integral seconds since the eopoch.

This would mean that our filenames are not particularly random as that would
make the sequences common if two cgi scripts started within a second of each other.

However, when I use these trivial scripts

rem doit.bat
python bimbo.py
python bimbo.py
python bimbo.py
python bimbo.py
python bimbo.py
python bimbo.py
python bimbo.py
python bimbo.py

#bimbo.py
import time, random
print time.time(), hex(random.randint(0,0xffffffffl))


I get dirrent draws for each run even when the system time doessn't differ bu
much. I'm puzzled exactly how the random generator is being initialised.

C:\TMP>timethis doit.bat

TimeThis : Command Line : doit.bat
TimeThis : Start Time : Thu Jul 08 14:25:20 2004


C:\TMP>python bimbo.py
1089293120.59 0x74e18bc1

C:\TMP>python bimbo.py
1089293120.67 0x9EECC786L

C:\TMP>python bimbo.py
1089293120.77 0x75d734d6

C:\TMP>python bimbo.py
1089293120.83 0x509afece

C:\TMP>python bimbo.py
1089293120.91 0x1edde1aa

C:\TMP>python bimbo.py
1089293120.98 0xE552E511L

C:\TMP>python bimbo.py
1089293121.06 0xC2C91369L

C:\TMP>python bimbo.py
1089293121.14 0xF739FCFEL

TimeThis : Command Line : doit.bat
TimeThis : Start Time : Thu Jul 08 14:25:20 2004
TimeThis : End Time : Thu Jul 08 14:25:21 2004
TimeThis : Elapsed Time : 00:00:00.656
 
P

Paul Rubin

Robin Becker said:
I get dirrent draws for each run even when the system time doessn't
differ bu much. I'm puzzled exactly how the random generator is being
initialised.

The rng has complicated internal state that's initialized completely
different if the seeds are even slightly different. If you want to
study the algorithm, google on "mersenne twister". The random module
is not random enough for the most demanding applications, particularly
those where someone who has studied the algorithm might be trying to
predict the output, for example online games being played for large
real prizes. But it should be good enough for most stuff like
physical simulations.
 
R

Robin Becker

Paul said:
The rng has complicated internal state that's initialized completely
different if the seeds are even slightly different. If you want to
study the algorithm, google on "mersenne twister". The random module
is not random enough for the most demanding applications, particularly
those where someone who has studied the algorithm might be trying to
predict the output, for example online games being played for large
real prizes. But it should be good enough for most stuff like
physical simulations.

I'm not sure that answers my question. The seed in question is supposed
to be determinate.

The docs say system time is used as an initial seed and my reading of
the code seems to suggest that the actual input is a single time_t value
which apparently changes once per second.

Observation suggests that we don't get the same value even when separate
python processes start in the same system time second. So my
understanding is flawed.

However, I would still like to know how random starting a python process
and taking a single randint value is? Can someone clever guess what
values will be thrown up in a specified system time second?
 
P

Paul Rubin

Robin Becker said:
I'm not sure that answers my question. The seed in question is
supposed to be determinate.

The docs say system time is used as an initial seed and my reading of
the code seems to suggest that the actual input is a single time_t
value which apparently changes once per second.

No, the input is the system timer including the microseconds, and it
may include some other stuff as well. The system timer doesn't really
change a million times a second, but it changes much faster than 1 hz.
However, I would still like to know how random starting a python
process and taking a single randint value is? Can someone clever
guess what values will be thrown up in a specified system time second?

Yes, pretty much so. If that's a concern for your application, you
should not depend on the MT algorithm even if the initial state is secret.

The patch attached to

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=917055&group_id=5470

has a much more secure (but slower) PRNG with the same interface as the
library PRNG, if you want to use that.
 
T

Tim Peters

[Robin Becker]
We've been queried about the randomness of some filenames we're producing.
I worked through and seemed to satisfy myself that random is being initialised
from the system time in C
time(&now)
init_genrand(self, (unsigned long)now);

where now is a time expressed in integral seconds since the eopoch.

If you were calling the C code directly, that would be relevant. But
you're not, Instead you're using the default instance of the Random
class Python creates for you, with the initialization *it* uses. That
code is in random.py:

def seed(self, a=None):
...
if a is None:
import time
a = long(time.time() * 256) # use fractional seconds
super(Random, self).seed(a)

time.time() typically changes value only 18.2 times per second on
Windows, so the multiplication by 256 isn't as effective as it looks
on Windows. But you'll still get a different seed on Windows if start
times are more than about 0.055 seconds apart.

All the start times in the example you gave later were at least 0.06
seconds apart, so the different outcomes are expected. Your machine
is fast enough that you might start to see duplicates on Windows if
you ran your Python instances with the -S switch (which skips a bunch
of expensive initialization).

Or, more directly,

import time, random
rd = {}
td = {}
ntries = 0
finish = time.time() + 10
while 1:
t = time.time()
if t > finish:
break
td[t] = 1
rd[random.Random().random()] = 1
ntries += 1
print len(rd), "unique random() values in", ntries, "attempts"
print len(td), "unique time.time() values"

On my WinXP box, that printed

641 unique random() values in 165923 attempts
641 unique time.time() values

Since it ran for 10 seconds, I'm seeing time.time() change about 64
times per second. I believe that's because this is a hyper-threaded
box, Whatever, if I had been able to start ~166000 instance of Python
in that time, I would have seen only ~640 different seedings across
them.
 
R

Robin Becker

Tim said:
[Robin Becker]
We've been queried about the randomness of some filenames we're producing.
I worked through and seemed to satisfy myself that random is being initialised
from the system time in C
time(&now)
init_genrand(self, (unsigned long)now);

where now is a time expressed in integral seconds since the eopoch.

...... good stuff elided
641 unique random() values in 165923 attempts
641 unique time.time() values

Since it ran for 10 seconds, I'm seeing time.time() change about 64
times per second. I believe that's because this is a hyper-threaded
box, Whatever, if I had been able to start ~166000 instance of Python
in that time, I would have seen only ~640 different seedings across
them.

thanks for all this it makes things a lot clearer. Using time.time() makes it
harder, but still not impossible to guess what values could have been generated.
The application should probably be using /dev/urandom to seed generators.
 
C

Christopher T King

thanks for all this it makes things a lot clearer. Using time.time() makes it
harder, but still not impossible to guess what values could have been generated.
The application should probably be using /dev/urandom to seed generators.

Perhaps you could try something like time.time()*os.getpid()? This will
give different processes starting at the same time different random seeds.
 
H

Heather Coppersmith

Perhaps you could try something like time.time()*os.getpid()?
This will give different processes starting at the same time
different random seeds.

That (probably) doesn't help: If I know what time a process
started (down to the kernel-clock tick), then I (probably) know
that process' pid, too.

Regards,
Heather
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,202
Messages
2,571,057
Members
47,661
Latest member
sxarexu

Latest Threads

Top