Pickle dict subclass instances using new protocol in PEP 307

Jimmy Retzlaff

I have a subclass of dict that acts kind of like Windows' file systems -
keys are case insensitive but case preserving (keys are assumed to be
strings, or at least they have to support .lower()). It's worked well
for quite a while - it used to inherit from UserDict and it has
inherited from dict since that became possible.

I just tried to pickle an instance of this class for the first time
using Python 2.3.2 on Windows. If I use protocols 0 (text) or 1 (binary)
everything works great. If I use protocol 2 (PEP 307) then I have a
problem when loading my pickle. Here is a small sample to illustrate:

######

import pickle

class myDict(dict):
    def __init__(self, *args, **kwargs):
        # Instance variable that __setitem__ depends on.
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __getstate__(self):
        print '__getstate__ returning', (self.copy(), self.x)
        return (self.copy(), self.x)

    def __setstate__(self, (d, x)):
        print '__setstate__'
        print ' object already in state:', self
        print ' x already in self:', 'x' in dir(self)
        self.x = x
        self.update(d)

    def __setitem__(self, key, value):
        # Report whether self.x exists yet when an item is set.
        print '__setitem__', (key, value), '- self.x exists:', 'x' in dir(self)
        dict.__setitem__(self, key, value)


d = myDict()
d['key'] = 'value'

# Round-trip the same instance through each pickle protocol.
protocols = [(0, 'Text'), (1, 'Binary'), (2, 'PEP 307')]
for protocol, description in protocols:
    print '--------------------------------------'
    print 'Pickling with Protocol %s (%s)' % (protocol, description)
    pickle.dump(d, file('test.pickle', 'wb'), protocol)
    del d
    print 'Unpickling'
    d = pickle.load(file('test.pickle', 'rb'))

######

When run it prints:

__setitem__ ('key', 'value') - self.x exists: True
--------------------------------------
Pickling with Protocol 0 (Text)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
object already in state: {'key': 'value'}
x already in self: False
--------------------------------------
Pickling with Protocol 1 (Binary)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
object already in state: {'key': 'value'}
x already in self: False
--------------------------------------
Pickling with Protocol 2 (PEP 307)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setitem__ ('key', 'value') - self.x exists: False
__setstate__
object already in state: {'key': 'value'}
x already in self: False


The problem I'm having stems from the fact that the subclass's
__setitem__ is called before __setstate__ when loading a protocol 2
pickle (the subclass's __setitem__ is not called at all with protocols 0
or 1). If I don't define __getstate__/__setstate__ then I have the same
problem: the subclass's __setitem__ is called before the pickle
mechanism creates the subclass's instance variables. I need to access
one of those instance variables in my __setitem__.
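
One possible way around the ordering, offered as a sketch rather than a confirmed fix: have the class define __reduce__ and return only (callable, args, state), so the items travel inside the state and are never replayed through __setitem__ on load. The example below is written in Python 3 syntax (an assumption; the sample above targets Python 2.3.2), and the names StatefulDict and roundtrip are hypothetical stand-ins, not part of the original class.

```python
import pickle


class StatefulDict(dict):
    """Hypothetical stand-in for the dict subclass described above."""

    def __init__(self, *args, **kwargs):
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __setitem__(self, key, value):
        # Like the real class, this needs an instance variable to exist.
        if not hasattr(self, 'x'):
            raise AttributeError('__setitem__ ran before x was restored')
        dict.__setitem__(self, key, value)

    def __reduce__(self):
        # Return (callable, args, state) only. With no dict-item iterator in
        # the tuple, the items are stored inside the state instead of being
        # replayed through __setitem__ when the pickle is loaded.
        return (self.__class__, (), (dict(self), self.x))

    def __setstate__(self, state):
        items, x = state
        self.x = x
        # dict.update does not go through the overridden __setitem__.
        dict.update(self, items)


def roundtrip(obj, protocol):
    """Pickle and unpickle obj in memory with the given protocol."""
    return pickle.loads(pickle.dumps(obj, protocol))


d = StatefulDict()
d['key'] = 'value'
d.x = 5
for protocol in (0, 1, 2):
    copy = roundtrip(d, protocol)
    print(protocol, type(copy).__name__, dict(copy), copy.x)
```

Note the trade-off in this sketch: on load the class is called with no arguments, so __init__ runs again before __setstate__, and the class has to be constructible that way.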

I suppose my question is one of practicality. I'd like my class
instances to work with all pickle protocols. Am I getting too fancy
trying to inherit from dict? Should I go back to UserDict or maybe to
DictMixin? Should I submit a bug report on this, or am I getting too
close to internals to expect a certain behavior across pickle protocols?

Thanks,
Jimmy
 
