Cryptographically random numbers

Tuvas

I've actually run the tests on this one: odd as it may sound, += was
actually faster than using a list. I ran into this one a while back.
The best way to do it is to build a list from scratch, fill it, and
then join it, but I didn't have time to do it that way...
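The list-then-join approach Tuvas describes can be sketched like this (the names build_with_join, items, and g are stand-ins, not from the thread):

```python
# Collect the pieces in a list, then join them once at the end,
# instead of repeatedly concatenating with +=.
def build_with_join(items, g=str):
    pieces = []
    for x in items:
        pieces.append(g(x))
    return ''.join(pieces)

print(build_with_join([1, 2, 3]))  # -> 123
```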
 

Steven D'Aprano

Frederik said:
In 2.4 and later, += on strings does the operation in place when
possible. This means that += is often faster than append/join.

As much as we all love optimizations, I'm unhappy about this one. It used
to be a real simple rule: to append lots of strings, use append/join. That
worked for any Python you used.

But now code that runs really fast in CPython 2.4 will run like a dog in
Jython or CPython 2.3. Python's implementation independence is weakened.

Worse, because nobody really knows what "often" means (or at least,
those who know haven't said), a conscientious Python developer who wants
fast performance now has to measure all their string-appending code using
both idioms to be sure that *this particular case* is/isn't one of the
"often".

Is there some heuristic for telling when CPython can do the in-place
appending, and when it can't? If we were talking about a tiny (1% say)
performance penalty for getting it wrong, then it wouldn't matter, but the
penalty for using += when the optimization doesn't apply is severe.
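Measuring the two idioms side by side, as Steven describes, can be done with the stdlib timeit module; a minimal harness (the function names and N are made up for illustration):

```python
import timeit

N = 10000

def concat_plus():
    # Repeated += on a str: may or may not hit the in-place path.
    a = ''
    for _ in range(N):
        a += 'x'
    return a

def concat_join():
    # The append/join idiom: one allocation at the end.
    return ''.join('x' for _ in range(N))

# Absolute numbers depend on the interpreter and version,
# which is exactly the portability worry raised above.
print('+=  :', timeit.timeit(concat_plus, number=100))
print('join:', timeit.timeit(concat_join, number=100))
```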
 

Paul Rubin

Tuvas said:
I've actually done the tests on this one, it's actually faster to use
the += than a list, odd as it may sound.

Frederik explained the reason; there's an optimization in Python 2.4
that I'd forgotten about, for that specific case. It's not in earlier
versions. It's a bit fragile in 2.4:

a = ''
for x in something:
    a += g(x)

is fast, but if a is aliased, Python can't do the optimization, so

a = ''
b = a
for x in something:
    a += g(x)

is slow. Figuring out which case to use relies on CPython's reference
counting storage allocator (the interpreter keeps track of how many
pointers there are to any given object) and so the optimization may
not be feasible at all in other implementations which use different
storage allocation strategies (e.g. Lisp-style garbage collection).
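The reference counts Paul is talking about can be inspected directly with sys.getrefcount (a CPython-specific function; the extra temporary reference from the call itself cancels out when you compare counts):

```python
import sys

# Use a computed string so we don't accidentally measure a shared,
# interned object like a short literal.
a = 'spam' * 100
before = sys.getrefcount(a)   # includes the temporary ref from the call
b = a                         # alias the string, as in Paul's second example
after = sys.getrefcount(a)
print(after - before)         # -> 1: the alias added exactly one reference
```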

All in all I think it's best to use a completely different approach
(something like StringBuffer) but my effort to start a movement here
on clpy a couple months ago didn't get anywhere.
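Paul doesn't say what his StringBuffer proposal looked like, but a minimal class along those lines (the name and interface are assumptions) is essentially the append/join idiom wrapped in an object:

```python
class StringBuffer:
    """Accumulate pieces in a list; build the string only on demand."""

    def __init__(self):
        self._pieces = []

    def append(self, s):
        self._pieces.append(s)

    def __str__(self):
        # One join at the end, regardless of interpreter or version.
        return ''.join(self._pieces)

buf = StringBuffer()
for word in ('spam', 'and', 'eggs'):
    buf.append(word)
print(str(buf))  # -> spamandeggs
```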
 

Tim Hochberg

Paul said:
Frederik explained the reason; there's an optimization in Python 2.4
that I'd forgotten about, for that specific case. It's not in earlier
versions. It's a bit fragile in 2.4:

a = ''
for x in something:
    a += g(x)

is fast, but if a is aliased, Python can't do the optimization, so

a = ''
b = a
for x in something:
    a += g(x)

is slow.

Is this really true? After the first time through the loop, 'a' won't be
aliased any more since strings are immutable. After that the loops
should be equivalent. To check whether I was missing something, I tried
this out with Python 2.4.1: the following two snippets timed essentially
the same for N up to 2**20 (I didn't try any higher):

def concat1():
    a = ''
    for x in ' '*N:
        a += x
    return a

def concat2():
    a = ''
    b = a
    for x in ' '*N:
        a += x
    return a

Regards,


-tim
 

Paul Rubin

Tim said:
Is this really true? After the first time through the loop, 'a' won't
be aliased any more since strings are immutable.

Hmm, OK, there was some previous discussion of this problem and I
guess I misremembered it. Thanks. I think there is still a fairly
natural case where the optimization doesn't work. Anyone know?
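One commonly cited case (this is an assumption about CPython internals, not something guaranteed by the language) is += on an instance attribute: the attribute slot keeps its own reference to the string, so the refcount is never low enough for the in-place path and CPython falls back to copying:

```python
class Log:
    def __init__(self):
        self.text = ''

    def add(self, line):
        # self.text is also referenced from the instance dict, so
        # (assumption) CPython cannot resize the string in place here.
        self.text += line

log = Log()
for word in ('a', 'b', 'c'):
    log.add(word)
print(log.text)  # -> abc
```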
 
