Inexplicable behavior in simple example of a set in a class

Saqib Ali

I have written two EXTREMELY simple python classes. One class
(myClass1) contains a data attribute (myNum) that contains an integer.
The other class (myClass2) contains a data attribute (mySet) that
contains a set.

I instantiate 2 instances of myClass1 (a & b). I then change the value
of a.myNum. It works as expected.

Then I instantiate 2 instances of myClass2 (c & d). I then change the
value of c.mySet. Bizarrely changing the value of c.mySet also affects
the value of d.mySet which I haven't touched at all!?!?! Can someone
explain this very strange behavior to me? I can't understand it for
the life of me.

Please see below the source code as well as the output.


-------------------------- SOURCE CODE ------------------------------
import sets

class myClass1:

    myNum = 9

    def clearNum(self):
        self.myNum = 0

    def __str__(self):
        return str(self.myNum)

class myClass2:

    mySet = sets.Set(range(1,10))

    def clearSet(self):
        self.mySet.clear()

    def __str__(self):
        return str(len(self.mySet))

if __name__ == "__main__":

    # Experiment 1. Modifying values of member integers in two
    # different instances of a class.
    # Works as expected.
    a = myClass1()
    b = myClass1()
    print "a = %s" % str(a)
    print "b = %s" % str(b)
    print "a.clearNum()"
    a.clearNum()
    print "a = %s" % str(a)
    print "b = %s\n\n\n" % str(b)

    # Experiment 2. Modifying values of member sets in two different
    # instances of a class.
    # Fails inexplicably. d is not being modified, yet calling
    # c.clearSet() seems to change d.mySet's value.
    c = myClass2()
    d = myClass2()
    print "c = %s" % str(c)
    print "d = %s" % str(d)
    print "c.clearSet()"
    c.clearSet()
    print "c = %s" % str(c)
    print "d = %s" % str(d)




-------------------------- OUTPUT ------------------------------
python.exe myProg.py

a = 9
b = 9
a.clearNum()
a = 0
b = 9



c = 9
d = 9
c.clearSet()
c = 0
d = 0
 
Chris Rebert

Saqib said:
Then I instantiate 2 instances of myClass2 (c & d). I then change the
value of c.mySet. Bizarrely changing the value of c.mySet also affects
the value of d.mySet which I haven't touched at all!?!?! Can someone
explain this very strange behavior to me? I can't understand it for
the life of me.

class myClass2:

    mySet = sets.Set(range(1,10))

    def clearSet(self):
        self.mySet.clear()

    def __str__(self):
        return str(len(self.mySet))

Please read a tutorial on object-oriented programming in Python. The
official one is pretty good:
http://docs.python.org/tutorial/classes.html
If you do, you'll find out that your class (as written) has mySet as a
class (Java lingo: static) variable, *not* an instance variable; thus,
it is shared by all instances of the class, and hence the behavior you
observed. Instance variables are properly created in the __init__()
initializer method, *not* directly in the class body.

Your class would be correctly rewritten as:

class MyClass2(object):
    def __init__(self):
        self.mySet = sets.Set(range(1,10))

    def clearSet(self):
        self.mySet.clear()
    # ...rest same as before...

Cheers,
Chris
 
Peter Otten

Saqib said:
I have written two EXTREMELY simple python classes. One class
(myClass1) contains a data attribute (myNum) that contains an integer.
The other class (myClass2) contains a data attribute (mySet) that
contains a set.

I instantiate 2 instances of myClass1 (a & b). I then change the value
of a.myNum. It works as expected.

self.value = new_value

sets the attribute to a new value while

self.value.modifying_method()

keeps the old value (the object with the same id(object)), but changes the
state of that old object. All variables bound to that object will "see" that
internal change of state. As integers are "immutable" objects whose state
cannot be changed you'll never observe the behaviour described below with
them.
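
The difference between rebinding and in-place modification can be checked with `is` and `id()`. The following is only a minimal sketch; the names `shared`, `alias`, `number` and `other` are made up for illustration, and the built-in `set` type stands in for `sets.Set`:

```python
# Two names bound to the same set object.
shared = set([1, 2, 3])
alias = shared
before = id(shared)

# A modifying method changes the object's state but not its identity.
shared.add(4)
print(id(shared) == before)  # True
print(4 in alias)            # True: alias "sees" the change

# Assignment rebinds the name to a different object; alias is unaffected.
shared = set()
print(shared is alias)       # False
print(len(alias))            # 4

# Integers have no modifying methods, so only rebinding is possible.
number = 9
other = number
number = 0
print(other)                 # 9
```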

Saqib said:
Then I instantiate 2 instances of myClass2 (c & d). I then change the
value of c.mySet. Bizarrely changing the value of c.mySet also affects
the value of d.mySet which I haven't touched at all!?!?! Can someone
explain this very strange behavior to me? I can't understand it for
the life of me.

What you access as c.mySet or d.mySet is really myClass2.mySet, i.e. a
class attribute and not an instance attribute. If you want a set per
instance you have to create it in the __init__() method:
...     def __init__(self):
...         self.my_set = set(range(10))
...     def clear_set(self):
...         self.my_set.clear()
...
(set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 42]), set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
(set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 42]), set([]))
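
A self-contained version of the same idea might look like the sketch below. The class name `PerInstanceSet` and the instance names `first` and `second` are assumptions for illustration and do not come from the thread; only the two method bodies are taken from the session above.

```python
class PerInstanceSet(object):
    """Hypothetical class: every instance gets its own set in __init__."""

    def __init__(self):
        # A new set object is created each time an instance is made.
        self.my_set = set(range(10))

    def clear_set(self):
        # Empties only the set that belongs to this instance.
        self.my_set.clear()


first = PerInstanceSet()
second = PerInstanceSet()

first.my_set.add(42)   # changes only first's set
second.clear_set()     # empties only second's set

print(42 in first.my_set)   # True
print(len(first.my_set))    # 11
print(len(second.my_set))   # 0
```

Because the set is created inside `__init__()`, the two instances never share it, so neither call affects the other instance.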
 
Saqib Ali

Chris Rebert said:
Instance variables are properly created in the __init__()
initializer method, *not* directly in the class body.

Your class would be correctly rewritten as:

class MyClass2(object):
    def __init__(self):
        self.mySet = sets.Set(range(1,10))

    def clearSet(self):
        self.mySet.clear()
    # ...rest same as before...


Thanks Chris. That was certainly very helpful!!

So just out of curiosity, why does it work as I had expected when the
member contains an integer, but not when the member contains a set?
 
Chris Rebert

Saqib said:
Thanks Chris. That was certainly very helpful!!

So just out of curiosity, why does it work as I had expected when the
member contains an integer, but not when the member contains a set?

To explain that, one must first understand that name lookup on an
object looks in the following places, in order:
1. the instance itself
2. the instance's class
3. the instance's superclasses
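
The lookup order above can be observed with `vars()`, which returns the attribute dictionary of an instance or class. This is a rough sketch; `Base`, `Child`, `obj` and the attribute names are hypothetical:

```python
class Base(object):
    inherited = "from the superclass"


class Child(Base):
    shared = "from the class"


obj = Child()
obj.own = "from the instance"

# All three attributes are reachable through the instance...
print(obj.own)        # found in step 1, on the instance itself
print(obj.shared)     # found in step 2, on the instance's class
print(obj.inherited)  # found in step 3, on a superclass

# ...but each one is stored in only one place.
print("own" in vars(obj))           # True
print("shared" in vars(obj))        # False
print("shared" in vars(Child))      # True
print("inherited" in vars(Child))   # False
print("inherited" in vars(Base))    # True
```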

So, if we have:

class Foo(object):
    bar = 7

foo_inst = Foo()

then both `foo_inst.bar` and `Foo.bar` refer to the same value.

However, if we then do:

foo_inst.bar = 42

then we'll have:

foo_inst.bar == 42 and Foo.bar == 7


Now back to your actual question. In clearNum(), you do:
    self.myNum = 0
which creates a *new* instance variable that shadows the class
variable of the same name, like in my example. If you check, you'll
indeed see that myClass1.myNum is still 9 after calling clearNum().
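
That check could be done along these lines. The sketch mirrors the poster's class under the capitalised name `MyClass1`, which is an assumption made here to keep the example separate from the original code:

```python
class MyClass1(object):
    myNum = 9  # class attribute

    def clearNum(self):
        # Assignment through self creates an instance attribute
        # that shadows the class attribute of the same name.
        self.myNum = 0


a = MyClass1()
b = MyClass1()
a.clearNum()

print(a.myNum)          # 0, read from a's own attribute
print(b.myNum)          # 9, still read from the class
print(MyClass1.myNum)   # 9, the class attribute is untouched
print(vars(a))          # {'myNum': 0}
print(vars(b))          # {}
```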

By contrast, in clearSet() you do:
    self.mySet.clear()
which just mutates the existing Set object in-place. No new variable
is created, and mySet is still a class variable and thus shared by all
instances.
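
The sharing can be confirmed directly. This is a sketch only: it uses the built-in `set` type in place of `sets.Set`, and the capitalised class name `MyClass2` is an assumption:

```python
class MyClass2(object):
    mySet = set(range(1, 10))  # class attribute: one set for the whole class

    def clearSet(self):
        # Looks up mySet (found on the class) and mutates it in place.
        self.mySet.clear()


c = MyClass2()
d = MyClass2()
c.clearSet()

print("mySet" in vars(c))         # False: no instance attribute was created
print(c.mySet is d.mySet)         # True: both names reach the same object
print(c.mySet is MyClass2.mySet)  # True
print(len(d.mySet))               # 0
```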

Further reading:
http://effbot.org/zone/python-objects.htm
http://effbot.org/zone/call-by-object.htm

Cheers,
Chris
 
Chris Angelico

Saqib said:
So just out of curiosity, why does it work as I had expected when the
member contains an integer, but not when the member contains a set?

It's not integer vs set; it's the difference between rebinding and
calling a method. It's nothing to do with object orientation; the same
happens with ordinary variables:
>>> a=b=1
>>> a=2
>>> a,b
(2, 1)
>>> c=d=[]
>>> c,d
({}, [])
>>> c.append("Test")
>>> c,d
(['Test'], ['Test'])

But:
>>> c=['Foobar']
>>> c,d
(['Foobar'], ['Test'])

When you do a=2 or c=['Foobar'], you're rebinding the name to a new
object. But c.append() changes that object, so it changes it
regardless of which name you look for it by.

Chris Angelico
 
Chris Rebert

Saqib said:
So just out of curiosity, why does it work as I had expected when the
member contains an integer, but not when the member contains a set?

Chris Angelico said:
It's not integer vs set; it's the difference between rebinding and
calling a method. It's nothing to do with object orientation; the same
happens with ordinary variables:
({}, [])

Nasty typo in your pseudo-interpreter-session there...

Cheers,
Chris
 
Steven D'Aprano

Saqib said:
I have written two EXTREMELY simple python classes. One class
(myClass1) contains a data attribute (myNum) that contains an integer.
The other class (myClass2) contains a data attribute (mySet) that
contains a set.

I instantiate 2 instances of myClass1 (a & b). I then change the value
of a.myNum. It works as expected.

Then I instantiate 2 instances of myClass2 (c & d). I then change the
value of c.mySet. Bizarrely changing the value of c.mySet also affects
the value of d.mySet which I haven't touched at all!?!?!

But that is wrong -- you HAVE touched it. Look carefully: in myClass2, you
have this:

class myClass2:
    mySet = sets.Set(range(1,10))
    def clearSet(self):
        self.mySet.clear()

mySet is a class attribute, shared by ALL instances, and mySet.clear()
modifies it in place. This is exactly the same as this snippet:

>>> a = set([1, 2, 3])
>>> b = a
>>> b.clear()
>>> a
set([])
>>> b
set([])

In Python, attributes assigned in the class scope are shared between all
instances. Attributes assigned directly on self are not:

class Test:
    a = set([1, 2, 3])
    def __init__(self):
        self.b = set([1, 2, 3])

False
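
One way to see the difference between the two kinds of attribute is to compare object identity with `is`. The sketch below reuses the `Test` class; the instance names `x` and `y` are assumptions:

```python
class Test(object):
    a = set([1, 2, 3])           # class scope: one set shared by all instances

    def __init__(self):
        self.b = set([1, 2, 3])  # assigned on self: a new set per instance


x = Test()
y = Test()

print(x.a is y.a)   # True: the very same object
print(x.b is y.b)   # False: two distinct objects...
print(x.b == y.b)   # True: ...that happen to hold equal contents
```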


So why does myClass1 behave differently? Simple: look at the clear method:

class myClass1:
    myNum = 9
    def clearNum(self):
        self.myNum = 0

It assigns a new attribute, rather than modifying the object in place!
Assignment to self.myNum creates a new unshared attribute:
9

The simplest way to fix this is to move the declaration of self.mySet into
the __init__ method.
 
