Is empty string cached?

Farshid Lashkari · Feb 15, 2006

When I pass an empty string to a function is a new string object created
or does python use some global pre-created object? I know python does
this with integer objects under a certain value. For instance, in the
following code is a new string object created for each function call?

func(0,'')
func(1,'')
func(2,'')
func(3,'')

I tried the following commands in the interactive shell:
True

This leads me to believe that python does reuse existing strings, but
once the variables are removed, does the item still exist in the cache?

-Farshid

Bryan Olson · Feb 15, 2006

Farshid said:
When I pass an empty string to a function is a new string object created
or does python use some global pre-created object? I know python does
this with integer objects under a certain value. For instance, in the
following code is a new string object created for each function call?

func(0,'')
func(1,'')
func(2,'')
func(3,'')

In this case, the language implementation may either create new
strings or re-use existing ones:

for immutable types, operations that compute new values
may actually return a reference to any existing object with
the same type and value, while for mutable objects this is
not allowed.
[http://docs.python.org/ref/objects.html]
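
A minimal sketch of what that passage permits, written in Python 3 syntax (an assumption, since the thread predates it). The helper `make_empty` is hypothetical and exists only to produce an empty string at run time rather than from a literal. Equality of the two values is guaranteed by the language; whether both names refer to one object is left to the implementation, so no particular result is claimed for the identity check.

```python
def make_empty():
    # Hypothetical helper: builds an empty string at run time.
    return "".join([])

a = ''
b = make_empty()

# Guaranteed by the language: the two values compare equal.
values_equal = (a == b)

# Not guaranteed either way: an implementation may or may not hand back
# the same object for both empty strings.
same_object = (a is b)

print("equal:", values_equal)
print("same object:", same_object)
```

Code that only cares about the value should compare with `==` and treat the result of `is` as incidental.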

[...]

This leads me to believe that python does reuse existing strings, but
once the variables are removed, does the item still exist in the cache?

Either; see the same reference page.

Steve Holden · Feb 16, 2006

Farshid said:
When I pass an empty string to a function is a new string object created
or does python use some global pre-created object? I know python does
this with integer objects under a certain value. For instance, in the
following code is a new string object created for each function call?

func(0,'')
func(1,'')
func(2,'')
func(3,'')

I tried the following commands in the interactive shell:

True

This leads me to believe that python does reuse existing strings, but
once the variables are removed, does the item still exist in the cache?

It takes far too little evidence to induce belief:

regards
Steve

Farshid Lashkari · Feb 16, 2006

It takes far too little evidence to induce belief:
I don't understand the point of your last expression. Were you intending
this instead:
True

However, the following commands add to my confusion:
False

So how are string literals cached? Is there an explanation somewhere? Is
it some freaky voodoo, and I should just assume that a string literal
will always generate a new object?

Thanks,
Farshid

Steve Holden · Feb 16, 2006

Farshid said:
I don't understand the point of your last expression. Were you intending
this instead:

Yes.

However, the following commands add to my confusion:

False

So how are string literals cached? Is there an explanation somewhere? Is
it some freaky voodoo, and I should just assume that a string literal
will always generate a new object?

I really don't understand why it's so important: it's not a part of the
language definition at all, and therefore whatever behavior you see is
simply an artifact of the implementation you observe.

regards
Steve

Farshid Lashkari · Feb 16, 2006

I really don't understand why it's so important: it's not a part of the
language definition at all, and therefore whatever behavior you see is
simply an artifact of the implementation you observe.

I guess I should rephrase my question in the form of an example. Should
I assume that a new string object is created in each iteration of the
following loop?

for x in xrange(1000000):
    func(x,'some string')

Or would it be better to do the following?

stringVal = 'some string'
for x in xrange(1000000):
    func(x,stringVal)

Or, like you stated, is it not important at all?

Thanks,
Farshid

Steve Holden · Feb 16, 2006

Farshid said:
I guess I should rephrase my question in the form of an example. Should
I assume that a new string object is created in each iteration of the
following loop?

for x in xrange(1000000):
    func(x,'some string')

Or would it be better to do the following?

stringVal = 'some string'
for x in xrange(1000000):
    func(x,stringVal)

Or, like you stated, is it not important at all?

It doesn't make a lot of difference:
Disassemble classes, methods, functions, or code.

With no argument, disassemble the last traceback.

... for x in xrange(1000000):
...     func(x,'some string')""", "", 'exec'))
  1           0 SETUP_LOOP              33 (to 36)
              3 LOAD_NAME                0 (xrange)
              6 LOAD_CONST               0 (1000000)
              9 CALL_FUNCTION            1
             12 GET_ITER
             16 STORE_NAME               1 (x)

  2          19 LOAD_NAME                2 (func)
             22 LOAD_NAME                1 (x)
             25 LOAD_CONST               1 ('some string')
             28 CALL_FUNCTION            2
             31 POP_TOP
             32 JUMP_ABSOLUTE           13

... stringVal = 'some string'
... for x in xrange(1000000):
...     func(x,stringVal)""", "", 'exec'))
  1           0 LOAD_CONST               0 ('some string')
              3 STORE_NAME               0 (stringVal)

  2           6 SETUP_LOOP              33 (to 42)
              9 LOAD_NAME                1 (xrange)
             12 LOAD_CONST               1 (1000000)
             15 CALL_FUNCTION            1
             18 GET_ITER
             22 STORE_NAME               2 (x)

  3          25 LOAD_NAME                3 (func)
             28 LOAD_NAME                2 (x)
             31 LOAD_NAME                0 (stringVal)
             34 CALL_FUNCTION            2
             37 POP_TOP
             38 JUMP_ABSOLUTE           19

It just boils down to either a LOAD_CONST vs. a LOAD_NAME - either way
the string isn't duplicated.
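
For readers who want to repeat the comparison on a current interpreter, here is a rough sketch in Python 3 (an assumption: `range` replaces `xrange`, and the loop count is shortened). The helper `ops_using` is hypothetical. Opcode names and offsets differ between CPython versions, so the sketch only checks that the literal sits in each code object's constants table and lists which instructions refer to it.

```python
import dis

# The two variants from the thread, as source text. They are only
# compiled here, never executed, so `func` does not need to exist.
inline_src = "for x in range(3):\n    func(x, 'some string')\n"
hoisted_src = (
    "stringVal = 'some string'\n"
    "for x in range(3):\n"
    "    func(x, stringVal)\n"
)

inline_code = compile(inline_src, "<inline>", "exec")
hoisted_code = compile(hoisted_src, "<hoisted>", "exec")

def ops_using(code, value):
    """Hypothetical helper: opcode names of instructions whose argument is `value`."""
    return [ins.opname for ins in dis.get_instructions(code) if ins.argval == value]

# In both variants the literal is stored once in the constants table.
print("some string" in inline_code.co_consts)
print("some string" in hoisted_code.co_consts)

# Instructions that fetch the string (inline) or the name bound to it (hoisted).
inline_ops = ops_using(inline_code, "some string")
hoisted_ops = ops_using(hoisted_code, "stringVal")
print(inline_ops)
print(hoisted_ops)
```

Calling `dis.dis(inline_code)` prints the full listing for whichever interpreter is in use.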

regards
Steve

Farshid Lashkari · Feb 16, 2006

It just boils down to either a LOAD_CONST vs. a LOAD_NAME - either way
the string isn't duplicated.

Great, that's exactly what I wanted to know. Thanks Steve!

-Farshid

Peter Hansen · Feb 16, 2006

Farshid said:
However, the following commands add to my confusion:

False

So how are string literals cached? Is there an explanation somewhere? Is
it some freaky voodoo, and I should just assume that a string literal
will always generate a new object?

A few comments (which I hope are correct, but which I hope you will read
then mostly ignore since you probably shouldn't be designing based on
this stuff anyway):

1. What you see at the interactive prompt is not necessarily what will
happen in a compiled source file. Try the above test with and without
the question mark both at the interactive prompt and in source and see.

2. The reason for the difference with ? is probably due to an
optimization relating to looking up attribute names and such in
dictionaries. wtf? is not a valid attribute, so it probably isn't
optimized the same way.

2.5. I think I had a third comment like the above but after too little
sleep last night it seems it's evaporated since I began writing...

3. As Steve or someone said, these things are implementation details
unless spelled out explicitly in the language reference, so don't rely
on them.

4. If you haven't already written your code, profiled it, found it
lacking in performance, and proved that the cause is related to whether
or not you hoisted the string literal out of the loop, you're wasting
your time, and this is a good opportunity to begin reprogramming your
brain not to optimize prematurely. IMHO. FWIW.
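
One way to act on point 4 is to measure before hoisting anything. The following is only a sketch in Python 3 using the standard `timeit` module; the empty `func` and the iteration counts are assumptions chosen to keep the run short, and the numbers it prints will vary by machine and interpreter.

```python
import timeit

# A do-nothing stand-in for the thread's func (an assumption).
setup = "def func(x, s):\n    pass\n"

inline_stmt = (
    "for x in range(1000):\n"
    "    func(x, 'some string')\n"
)
hoisted_stmt = (
    "stringVal = 'some string'\n"
    "for x in range(1000):\n"
    "    func(x, stringVal)\n"
)

# Take the best of a few repeats to reduce noise from other processes.
inline_time = min(timeit.repeat(inline_stmt, setup=setup, repeat=3, number=200))
hoisted_time = min(timeit.repeat(hoisted_stmt, setup=setup, repeat=3, number=200))

print("inline literal:", inline_time)
print("hoisted name:  ", hoisted_time)
```

If the two timings are close, the choice between the two forms can be made on readability alone.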

-Peter

Farshid Lashkari · Feb 16, 2006

A few comments (which I hope are correct, but which I hope you will read
then mostly ignore since you probably shouldn't be designing based on
this stuff anyway):

Thanks for the info, Peter. My original question wasn't due to any
observed performance problems. I was just being curious.

-Farshid

Magnus Lycka · Feb 17, 2006

Farshid said:
I guess I should rephrase my question in the form of an example. Should
I assume that a new string object is created in each iteration of the
following loop?

for x in xrange(1000000):
    func(x,'some string')

Or would it be better to do the following?

stringVal = 'some string'
for x in xrange(1000000):
    func(x,stringVal)

Or, like you stated, is it not important at all?

In this particular case, it's no big deal, since you use
a literal, which is something Python knows won't change.

In general, it's semantically very different to create an
object at one point and then use a reference to that over
and over in a loop, or to create a new object over and
over again in a loop. E.g.

for x in xrange(1000000):
    func(x, str(5))

vs.

stringVal = str(5)
for x in xrange(1000000):
    func(x,stringVal)

This isn't just a matter of extra function call overhead. In
the latter case, you are telling Python that all calls to
"func(x,stringVal)" use the same objects as arguments (assuming
that there aren't any assignments to x and stringVal somewhere
else in the loop). In the former case, no such guarantee can
be made from studying the loop.
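
The difference can be observed with a small experiment, sketched here in Python 3 (an assumption). This `func` is a hypothetical stand-in that takes an extra list argument and records what it receives, and `str(12345)` replaces `str(5)` so that any special handling an interpreter might have for one-character strings does not blur the picture. What is observed is CPython behaviour, not a language guarantee.

```python
def func(x, s, seen):
    # Hypothetical stand-in: keep a reference to every argument received,
    # so no object is freed and no id can be recycled during the run.
    seen.append(s)

literal_seen = []
for x in range(1000):
    func(x, 'some string', literal_seen)

runtime_seen = []
for x in range(1000):
    func(x, str(12345), runtime_seen)

# Count the distinct objects passed in each loop.
literal_ids = {id(s) for s in literal_seen}
runtime_ids = {id(s) for s in runtime_seen}

print("distinct objects from the literal:", len(literal_ids))
print("distinct objects from str(12345): ", len(runtime_ids))
```

On CPython the literal loop passes a single object every time, because the constant is loaded from the compiled code; the second count depends on how the implementation allocates the result of `str`.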

As for interning strings, it's my understanding that current
CPython interns strings that look like identifiers, i.e. strings
that start with an ASCII letter or an underscore, followed by
zero or more ASCII letters, underscores or digits. On the other
hand, it seems id(str(5)) is persistent as well, so the current
implementation seems slightly simplified compared to the
perceived need. Anyway, this is just an implementation choice
made to improve performance, nothing to rely on.
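
When sharing one object really is wanted, interning can be requested explicitly instead of relying on what the interpreter happens to do. A short sketch in Python 3, where the function is `sys.intern` (in the Python 2 of this thread it was the built-in `intern`). The helper `build` is hypothetical and only serves to create the strings at run time.

```python
import sys

def build(parts):
    # Hypothetical helper: assemble a string at run time.
    return "".join(parts)

# Two equal strings that do not look like identifiers.
a = build(["not an ", "identifier?"])
b = build(["not an ", "identifier?"])
equal = (a == b)

# sys.intern returns the canonical copy for a given value, so interning
# two equal strings yields the same object both times.
ia = sys.intern(a)
ib = sys.intern(b)
interned_same = (ia is ib)

print("equal:", equal)
print("interned copies identical:", interned_same)
```

The identity of `a` and `b` themselves before interning is not shown, since that is exactly the part the thread advises against relying on.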
