Jimmy Retzlaff
I have a subclass of dict that acts kind of like Windows' file systems -
keys are case insensitive but case preserving (keys are assumed to be
strings, or at least they have to support .lower()). It's worked well
for quite a while - it used to inherit from UserDict and it has
inherited from dict since that became possible.
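To make the description concrete, here is a minimal sketch of that kind of class. It is not the actual class: it is written in Python 3 syntax, the names `CaseInsensitiveDict` and `_lowered` are hypothetical, and it covers only the basic item operations (methods such as `update`, `get` and `pop` are left out). It also assumes that a key which already exists keeps its original spelling when it is assigned again.

```python
class CaseInsensitiveDict(dict):
    """Hypothetical sketch: case-insensitive lookup, case-preserving keys."""

    def __init__(self, *args, **kwargs):
        # Maps key.lower() -> the spelling under which the key is stored.
        self._lowered = {}
        dict.__init__(self)
        # Route the initial items through __setitem__ so the index stays in sync.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        # Keep the first spelling seen for a key (an assumption of this sketch).
        stored = self._lowered.setdefault(key.lower(), key)
        dict.__setitem__(self, stored, value)

    def __getitem__(self, key):
        return dict.__getitem__(self, self._lowered[key.lower()])

    def __contains__(self, key):
        return key.lower() in self._lowered

    def __delitem__(self, key):
        stored = self._lowered.pop(key.lower())
        dict.__delitem__(self, stored)


d = CaseInsensitiveDict()
d['Key'] = 1
d['KEY'] = 2          # same entry, original spelling kept
print(list(d.items()))
```

Note that `__setitem__` depends on an instance attribute (`_lowered` here), which is why the order in which the pickle machinery calls things matters for a class like this.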
I just tried to pickle an instance of this class for the first time
using Python 2.3.2 on Windows. If I use protocols 0 (text) or 1 (binary)
everything works great. If I use protocol 2 (PEP 307) then I have a
problem when loading my pickle. Here is a small sample to illustrate:
######
import pickle

class myDict(dict):
    def __init__(self, *args, **kwargs):
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __getstate__(self):
        print '__getstate__ returning', (self.copy(), self.x)
        return (self.copy(), self.x)

    def __setstate__(self, (d, x)):
        print '__setstate__'
        print ' object already in state:', self
        print ' x already in self:', 'x' in dir(self)
        self.x = x
        self.update(d)

    def __setitem__(self, key, value):
        # Also report whether the instance attribute exists yet.
        print '__setitem__', (key, value), '- self.x exists:', hasattr(self, 'x')
        dict.__setitem__(self, key, value)

d = myDict()
d['key'] = 'value'

protocols = [(0, 'Text'), (1, 'Binary'), (2, 'PEP 307')]
for protocol, description in protocols:
    print '--------------------------------------'
    print 'Pickling with Protocol %s (%s)' % (protocol, description)
    pickle.dump(d, file('test.pickle', 'wb'), protocol)
    del d
    print 'Unpickling'
    d = pickle.load(file('test.pickle', 'rb'))
######
When run it prints:
__setitem__ ('key', 'value') - self.x exists: True
--------------------------------------
Pickling with Protocol 0 (Text)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
 object already in state: {'key': 'value'}
 x already in self: False
--------------------------------------
Pickling with Protocol 1 (Binary)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
 object already in state: {'key': 'value'}
 x already in self: False
--------------------------------------
Pickling with Protocol 2 (PEP 307)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setitem__ ('key', 'value') - self.x exists: False
__setstate__
 object already in state: {'key': 'value'}
 x already in self: False
The problem I'm having stems from the fact that the subclass's
__setitem__ is called before __setstate__ when loading a protocol 2
pickle (the subclass's __setitem__ is not called at all with protocols 0
or 1). If I don't define __getstate__/__setstate__ then I have the same
problem: the subclass's __setitem__ is called before the pickle
mechanism has created the subclass's instance variables. I need to
access one of those instance variables in my __setitem__.
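For anyone who wants to check the call order without reading printed output, here is a rough port of the sample that records the calls in a list. It is written in Python 3 syntax, which is an assumption on top of the 2.3.2 report above, and the names `MyDict`, `calls` and `load_order` are made up for this sketch. It pickles to an in-memory string with `pickle.dumps`/`pickle.loads` instead of a file.

```python
import pickle

calls = []  # log shared by the hooks below


class MyDict(dict):
    def __init__(self, *args, **kwargs):
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __getstate__(self):
        return (dict(self), self.x)

    def __setstate__(self, state):
        d, x = state
        calls.append('__setstate__')
        self.x = x
        self.update(d)

    def __setitem__(self, key, value):
        # Record whether the instance attribute exists at this point.
        calls.append('__setitem__ (x set: %s)' % hasattr(self, 'x'))
        dict.__setitem__(self, key, value)


def load_order(protocol):
    """Round-trip a MyDict and return the hooks called while loading."""
    d = MyDict()
    d['key'] = 'value'
    data = pickle.dumps(d, protocol)
    del calls[:]  # keep only what happens during the load
    pickle.loads(data)
    return list(calls)


for protocol in (0, 1, 2):
    print(protocol, load_order(protocol))
```

If the behaviour matches what I described, the list for protocol 2 should show `__setitem__` ahead of `__setstate__`, with the attribute not yet present, while protocols 0 and 1 should show only `__setstate__`.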
I suppose my question is one of practicality. I'd like my class
instances to work with all pickle protocols. Am I getting too fancy
trying to inherit from dict? Should I go back to UserDict or maybe to
DictMixin? Should I submit a bug report on this, or am I getting too
close to internals to expect a certain behavior across pickle protocols?
Thanks,
Jimmy