Tuple assignment and generators?

Diez B. Roggisch

vdrab said:
Given this though, what other such beauties are lurking in the
interpreter, under the name of 'implementation accidents'? One of the
things that drew me to python is the claimed consistency and
orthogonality of both language and implementation, not sacrificing
clarity for performance, minimizing ad-hoc design hacks and weird
gotchas, etc.
In fact, I think my code contains things like "if len(arg) is 0:" and
so on, and I feel I should be able to do so given the way python treats
(claims to treat?) constant objects, even if I don't care whether the
values actually represent the same object.

Python doesn't claim that 0 is 0 == True. You are abusing the "is" operator.
The only occasions (or at least 99% of them) on which I use "is" are

if foo is None:
    ...

as None is guaranteed to be a singleton object.
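
A minimal sketch of that idiom, assuming a hypothetical helper named describe() (not from the thread). Because there is only one None object, the identity test reliably distinguishes "nothing was passed" from values such as 0 or "" that merely compare as false:

```python
def describe(value=None):
    """Return a short description, treating None as 'argument omitted'."""
    if value is None:          # identity test: there is only one None
        return "no value given"
    return "got %r" % (value,)

print(describe())      # no value given
print(describe(0))     # got 0
print(describe(""))    # got ''
```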

The thing you observe as accident is that sometimes "0 is 0" is true just
because of an optimization of number objects allocation. Such things happen
in the "real" world - other examples are string-interning in e.g. the JVM
(and I bet they have a similar scheme to boxed number object allocation as
python has).

Diez
 
vdrab

so anything you don't understand, and cannot be bothered to look up in
the documentation, just has to be an inconsistent ad-hoc weird-gotcha
design ?

Does the documentation mention that "x is y" returns True when they are
both 0 but not when they are 100001 ? If so, I stand corrected. *plonk*
away ...
s.
 
Alexandre Fayolle

Le 05-05-2006 said:
The thing you observe as accident is that sometimes "0 is 0" is true just
because of an optimization of number objects allocation. Such things happen
in the "real" world - other examples are string-interning in e.g. the JVM
(and I bet they have a similar scheme to boxed number object allocation as
python has).

String interning is available in Python too, by using the intern()
builtin function.
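
For readers on Python 3, where the builtin has moved to sys.intern(), a rough sketch of what interning buys you. The strings are built at runtime here so that the compiler has no chance to merge them on its own; the variable names are only illustrative:

```python
import sys

# Build two equal strings at runtime.
parts = ["hello", "world"]
a = " ".join(parts)
b = " ".join(parts)

print(a == b)                      # True: equal by value

# Interning maps every equal string to one canonical object,
# so identity is guaranteed for the interned references.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)    # True
```

Whether "a is b" holds before interning is left to the implementation, which is the point of the thread.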
 
Daniel Nogradi

Given this though, what other such beauties are lurking in the
so anything you don't understand, and cannot be bothered to look up in
the documentation, just has to be an inconsistent ad-hoc weird-gotcha
design ?

I think we can all safely *plonk* you now.

I was just at a point when I thought I learned something but got
confused again after trying the following and unfortunately didn't
find an answer in the docs.
134536516

So the two memory addresses are the same, but
134604252

and they are not the same (I restarted the interpreter between the two
cases). So how is this now? Sorry if it's too trivial, but I simply
don't get it.
 
Fredrik Lundh

vdrab said:
Does the documentation mention that "x is y" returns True when they are
both 0 but not when they are 100001 ?

language reference, comparisons (is operator):

"The operators is and is not test for object identity: x is y is true if and
only if x and y are the same object."

language reference, objects:

"Even the importance of object identity is affected in some sense: for
immutable types, operations that compute new values may actually
return a reference to any existing object with the same type and value,
while for mutable objects this is not allowed. E.g., after "a = 1; b = 1",
a and b may or may not refer to the same object with the value one,
depending on the implementation, but after "c = []; d = []", c and d are
guaranteed to refer to two different, unique, newly created empty lists."

(note the use of "may or may not" and "depending on the implementation")
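
A small sketch of what that passage does and does not promise. The identity of the two lists is guaranteed to differ; the identity of the two equal ints is deliberately left open, so the code below records it without relying on it:

```python
# Mutable objects: two separate list displays always give two objects.
c = []
d = []
print(c == d)        # True: same value
print(c is not d)    # True: guaranteed distinct objects

# Immutable objects: equal values may or may not share one object.
a = int("1")
b = int("1")
print(a == b)        # True: this is the only thing you may rely on
same_object = a is b # True or False, at the choice of the implementation
```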

</F>
 
Diez B. Roggisch

I was just at a point when I thought I learned something but got
confused again after trying the following and unfortunately didn't
find an answer in the docs.

134536516

So the two memory addresses are the same, but

134604252

and they are not the same (I restarted the interpreter between the two
cases). So how is this now? Sorry if it's too trivial, but I simply
don't get it.

It's an optimization scheme that will cache number objects up to a certain
value for optimized reuse. However this is impractical for larger numbers -
you only hold a table of, let's say, 1000 or so objects. Then the look up of
one of those objects is extremely fast, whereas the construction of
arbitrary numbers is somewhat more expensive.

And as "is" is the operator for testing if objects are identical and _not_
the operator for testing of equality (which is ==), the above can happen.
And it is totally irrelevant from a practical POV (coding-wise that is - it
_is_ a relevant optimization).
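
In practice this means comparing numbers with ==, never with "is". A hedged sketch, using a hypothetical helper is_empty() in place of the "if len(arg) is 0:" test mentioned earlier in the thread (recent CPython versions, 3.8 and later as far as I know, even emit a SyntaxWarning for "is" with a literal):

```python
def is_empty(arg):
    """Value comparison: correct whatever the interpreter caches."""
    return len(arg) == 0

print(is_empty([]))        # True
print(is_empty("abc"))     # False

# Two equal large ints created at runtime.
big_a = int("100001")
big_b = int("100001")
print(big_a == big_b)      # True: always
identical = big_a is big_b # unspecified: depends on the implementation
```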

Diez
 
Daniel Nogradi

I was just at a point when I thought I learned something but got
It's an optimization scheme that will cache number objects up to a certain
value for optimized reuse. However this is impractical for larger numbers -
you only hold a table of, let's say, 1000 or so objects. Then the look up of
one of those objects is extremely fast, whereas the construction of
arbitrary numbers is somewhat more expensive.

And as "is" is the operator for testing if objects are identical and _not_
the operator for testing of equality (which is ==), the above can happen.
And it is totally irrelevant from a practical POV (coding-wise that is - it
_is_ a relevant optimization).

Thanks a lot! So after all I really learned something :)
 
vdrab

language reference, objects:

"Even the importance of object identity is affected in some sense: for
immutable types, operations that compute new values may actually
return a reference to any existing object with the same type and value,
while for mutable objects this is not allowed. E.g., after "a = 1; b = 1",
a and b may or may not refer to the same object with the value one,
depending on the implementation, but after "c = []; d = []", c and d are
guaranteed to refer to two different, unique, newly created empty lists."

(note the use of "may or may not" and "depending on the implementation")

</F>

That, I knew. What I did not know, nor get from this explanation, is
that this behaviour "may" differ
not only within the same implementation, but with instances of the same
class or type (in this case, 'int'). Is this really a case of me being
too dumb or too lazy, or could it just be that this behaviour is not
all that consistent ?
v.
 
Diez B. Roggisch

vdrab said:
That, I knew. What I did not know, nor get from this explanation, is
that this behaviour "may" differ
not only within the same implementation, but with instances of the same
class or type (in this case, 'int').

"""
E.g., after "a = 1;
b = 1",
    a and b may or may not refer to the same object with the value one,
    depending on the implementation,
"""

Diez
 
Duncan Booth

Daniel said:
134536516

So the two memory addresses are the same, but

134604252

and they are not the same (I restarted the interpreter between the two
cases). So how is this now? Sorry if it's too trivial, but I simply
don't get it.
If two immutable values are the same, then the interpreter has the right to
simply reuse the same value. Apart from the guarantee that it will do this
with None everything else is left open.

The current C-Python implementation will reuse small integers but not large
integers; it also reuses some strings. It reuses the empty tuple but not
(so far as I know) any other tuples. This could change at any time and
other Python implementations may do totally different things here.

Just because you saw it reusing a small value such as 10 doesn't mean that
there cannot be other small integers with the value 10 which aren't the
same as that one. Back before Python had a separate bool type, it used to
use two different integers for 0 (and two for 1), so you could (by an
accident of implementation) tell whether a value had been created by a
comparison operator. So far as I know, there is nothing stopping the author
of an extension written in C continuing to create their own versions of
small numbers today.
 
vdrab

"""
E.g., after "a = 1;
b = 1",
a and b may or may not refer to the same object with the value one,
depending on the implementation,
"""

But when in a specific implementation this property _does_ hold for
ints having value 1, I expect the
same behaviour for ints with other values than 1.
I guess I'm kind of weird that way.
 
Paul Boddie

Tim said:
I was hoping that there was just some __foo__ property I was
missing that would have been called in the process of tuple
unpacking that would allow for a more elegant solution such
as a generator (or generator method on some object) rather
than stooping to disassembling opcodes. :)

I suppose you wanted something like this...

a, b, c, d, e = xyz # first, let's consider a plain object

...to use the iterator "protocol" to populate the tuple elements, like
this:

_iter = xyz.__iter__()
a = _iter.next()
b = _iter.next()
c = _iter.next()
d = _iter.next()
e = _iter.next()

For such plain objects, it is possible to use such a mechanism. Here's
a "countdown" iterator:

class A:
    def __init__(self, size):
        self.size = size
    def __iter__(self):
        return B(self.size)

class B:
    def __init__(self, size):
        self.n = size
    def next(self):
        if self.n > 0:
            self.n -= 1
            return self.n
        else:
            raise StopIteration

xyz = A(5)
a, b, c, d, e = xyz
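
The code above uses the Python 2 spelling of the iterator protocol (a method called next). A sketch of the same countdown for Python 3, where the method is __next__ and the builtins iter() and next() drive it; the class names Countdown and CountdownIterator are my own, chosen only for readability:

```python
class Countdown:
    """Iterable: hands out a fresh iterator each time it is asked."""
    def __init__(self, size):
        self.size = size

    def __iter__(self):
        return CountdownIterator(self.size)


class CountdownIterator:
    """Iterator: yields size-1, size-2, ..., 0 and then stops."""
    def __init__(self, size):
        self.n = size

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > 0:
            self.n -= 1
            return self.n
        raise StopIteration


# Tuple unpacking calls iter() once and then next() repeatedly.
a, b, c, d, e = Countdown(5)
print(a, b, c, d, e)   # 4 3 2 1 0

# The same steps spelled out by hand.
it = iter(Countdown(2))
first = next(it)
second = next(it)
print(first, second)   # 1 0
```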

In fact, similar machinery can be used to acquire new values from a
generator:

def g(size):
    while size > 0:
        size = size - 1
        yield size

a, b, c, d, e = g(5)

Note that if the iterator (specifically, the generator in the second
case) doesn't provide enough values, or provides too many, the tuple
unpacking operation will fail with the corresponding exception message.
Thus, generators which provide infinite or very long sequences will not
work unless you discard trailing values; you can support this either by
adding support for slicing to whatever is providing your sequences or,
if you're using generators anyway, by employing an additional
"limiting" generator:

def limit(it, size):
    while size > 0:
        yield it.next()
        size -= 1

xyz = A(5)
a, b, c, d = limit(iter(xyz), 4)

The above generator may be all you need to solve your problem, and it
may be the case that such a generator exists somewhere in the standard
library.
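
The standard library does have such a thing: itertools.islice. A short Python 3 sketch, using itertools.count() as a stand-in for any infinite generator, plus a demonstration of the unpacking failure mentioned above. Note also that in current Python 3 a hand-written limiting generator should not let StopIteration escape from next() inside its body, since that is turned into a RuntimeError; islice avoids the issue.

```python
from itertools import count, islice

# Take exactly four values from an endless iterator.
first, second, third, fourth = islice(count(), 4)
print(first, second, third, fourth)   # 0 1 2 3


def countdown(size):
    """Generator version of the countdown from the post."""
    while size > 0:
        size -= 1
        yield size


# Unpacking needs exactly as many values as there are targets.
raised = False
try:
    x, y, z = countdown(5)        # five values, three targets
except ValueError:
    raised = True
print(raised)                          # True

# Trimming with islice makes the counts match.
x, y, z = islice(countdown(5), 3)
print(x, y, z)                         # 4 3 2
```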

Paul
 
Boris Borcic

vdrab said:
But when in a specific implementation this property _does_ hold for
ints having value 1, I expect the
same behaviour for ints with other values than 1.
I guess I'm kind of weird that way.

Are you telling us that you *had* read that doc,
and tripped because it says "depending on the implementation",
when it should say "at the choice of the implementation" ?

That's indeed a bit weird, imo.
 
vdrab

Are you telling us that you *had* read that doc,
and tripped because it says "depending on the implementation",
when it should say "at the choice of the implementation" ?

no.
let's see, where to start ... ?
let's say there's a certain property P, for the sake of this loooong
discussion, something
more or less like a class or type's property of "having immutable
values, such that any instance with value X has a single, unique
representation in memory and any two instantiations of objects with
that value X are in fact references to the same object".

Then, for example, python strings have property P whereas python lists
do not:
>>> x = "test"
>>> y = "test"
>>> x is y
True
>>> x = []
>>> y = []
>>> x is y
False

Now, as it turns out, whether or not python integers have property P
_depends_on_their_value_.
For small values, they do. For large values they don't. Yes, I
understand about the interpreter optimization. I didn't know this, and
I find it neither evident nor consistent. I don't think the above post
explains this, regardless of how you read "implementation".

In fact, the whole string of replies after my initial question reminded
me of something I read not too long ago, but didn't quite understand at
the time.
source :
http://www.oreillynet.com/ruby/blog/2006/01/a_little_antiantihype.html

'''
Pedantry: it's just how things work in the Python world. The status
quo is always correct by definition. If you don't like something, you
are incorrect. If you want to suggest a change, put in a PEP,
Python's equivalent of Java's equally glacial JSR process. The
Python FAQ goes to great lengths to rationalize a bunch of broken
language features. They're obviously broken if they're frequently
asked questions, but rather than 'fessing up and saying "we're
planning on fixing this", they rationalize that the rest of the world
just isn't thinking about the problem correctly. Every once in a
while some broken feature is actually fixed (e.g. lexical scoping), and
they say they changed it because people were "confused". Note that
Python is never to blame.
'''

taking this rant with the proverbial grain of salt, I did think it was
funny.

Anyway, thanks for all the attempts to show me.
I will get it in the end.
v.
 
Diez B. Roggisch

vdrab said:
But when in a specific implementation this property _does_ hold for
ints having value 1, I expect the
same behaviour for ints with other values than 1.

That is an assumption you made. The above sentence is true for that
assumption, but also - and that is the key point here - for the current
implementation.

And to put it frankly: if you had spent only half the time it took you to
participate in this argument thinking about how one could possibly
implement the behavior you had in mind, you'd realize that it's totally
infeasible. Try stuffing 2^64 python long objects in your memory to make
that guarantee hold... And then 2^65.
I guess I'm kind of weird that way.

Maybe.

Diez
 
Sion Arrowsmith

vdrab said:
let's say there's a certain property P, for the sake of this loooong
discussion, something
more or less like a class or type's property of "having immutable
values, such that any instance with value X has a single, unique
representation in memory and any two instantiations of objects with
that value X are in fact references to the same object".

Then, for example, python strings have property P whereas python lists
do not:

Er, no:
False

Strings only get a unique instance if they are valid identifiers.
Again, it's an optimisation issue. As with ints, it
_depends_on_their_value_.
I find it neither evident nor consistent. I don't think the above post
explains this, regardless of how you read "implementation".

"Implementation dependent" => "Any behaviour you observe which is not
explicitly documented is not to be relied upon". Also, "Implementation
dependent" => "How this is implemented should be transparent and
irrelevant to the normal user". No, it's not particularly consistent.
Because it doesn't matter.
 
vdrab

oh wow... it gets better...

.... I had no clue.
I guess the take-away lesson is to steer clear from any reliance on
object identity checks, if at all possible. Are there any other such
"optimizations" one should like to know about?
v.
 
Dave Hansen

no.
let's see, where to start ... ?
let's say there's a certain property P, for the sake of this loooong
discussion, something
more or less like a class or type's property of "having immutable
values, such that any instance with value X has a single, unique
representation in memory and any two instantiations of objects with
that value X are in fact references to the same object".

IOW, property P is "(x == y) => (x is y)" (read "=>" as "implies").

Note that only immutable objects can have property P.
Then, for example, python strings have property P whereas python lists
do not:
>>> x = "test"
>>> y = "test"
>>> x is y
True
>>> x = []
>>> y = []
>>> x is y
False

Note this last relationship is _guaranteed_. Lists are not immutable,
and therefore can not have property P.
Now, as it turns out, whether or not python integers have property P
_depends_on_their_value_.

From the zen, I believe this falls out from "practicality beats
purity."
For small values, they do. For large values they don't. Yes, I

Even that's not necessarily true. The implementation is free to
always create a new immutable object, even for small values.
understand about the interpreter optimization. I didn't know this, and
I find it neither evident nor consistent. I don't think the above post
explains this, regardless of how you read "implementation".

Think about implementation for a moment. Consider the statement

x = some_arbitrary_integer()

Do you really want the interpreter to go through all the existing
integer objects in the program to see if that particular value exists,
just to guarantee that some later statement

x is y

returns True if x == y?

OK, maybe we can change the "is" operator on immutable objects such
that x is y returns True if x == y. But then you can encounter a
situation where "x is y" but "id(x) != id(y)". Now what?

Perhaps the solution would be to disable the "is" operator and "id"
function for immutable objects. But then _they_ lose generality.
There doesn't seem to be a way to win.

So it all comes down to "who cares?" Immutable objects are immutable.
You can't change them. Object identity is a non-issue.

This is not the case for mutable objects. Consider

a = [1,2,3]
b = [1,2,3]
c = a

a==b
a==c
b==c
a is not b
b is not c
c is a

c.append(4)
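
Spelled out as a runnable sketch, this is why identity matters for mutable objects: the append through c is visible through a, because they are the same list, while the merely equal list b is untouched.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a              # c is another name for the same list as a

c.append(4)

print(a)           # [1, 2, 3, 4]  (changed through the alias c)
print(b)           # [1, 2, 3]     (a separate object, unaffected)
print(a == b)      # False: the values have now diverged
print(c is a)      # True
```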
In fact, the whole string of replies after my initial question reminded
me of something I read not too long ago, but didn't quite understand at
the time.
source :
http://www.oreillynet.com/ruby/blog/2006/01/a_little_antiantihype.html
[...whinge elided...]

taking this rant with the proverbial grain of salt, I did think it was
funny.

Your original post in its entirety (ignoring the example) was "what
the...? does anybody else get mighty uncomfortable about this? "

The first response (paraphrased) was "No. Why should I? With
immutable objects, I care about ==, not is."

Your response seemed to want to cast doubt on the integrity of the
entire language: "Given this though, what other such beauties are
lurking in the interpreter, under the name of 'implementation
accidents'?"
Anyway, thanks for all the attempts to show me.
I will get it in the end.

I will ignore the double entendre, and simply hope I was of help, and
wish you luck. Regards,
-=Dave
 
Fredrik Lundh

vdrab said:
I guess the take-away lesson is to steer clear from any reliance on
object identity checks, if at all possible. Are there any other such
"optimizations" one should like to know about?

so in your little world, an optimization that speeds things up and saves
memory isn't really an optimization ?

good luck with your future career in programming.

*plonk*
 
Diez B. Roggisch

... I had no clue.

We figured that....
I guess the take-away lesson is to steer clear from any reliance on
object identity checks, if at all possible.

You've been told quite a few times before that "is" is not intended for
what you used it for.

Some people actually listen to what others tell them. Others seem to be driven by
the deep desire to make even the tiniest bit of getting-a-grasp a public
affair.

Diez
 
