compressed serialization module

Mark

I used pickle and found the file was saved in text format. I wonder
whether anyone is familiar with a good compact off-the-shelf module
available that will save in compressed format... or maybe an opinion
on a smart approach for making a custom one? Appreciate it! I'm a
bit of a n00b but have been looking around. I found a serialize.py
but it seems like overkill.

Mark
 
Joe Strout

Mark> I used pickle and found the file was saved in text format. I wonder
Mark> whether anyone is familiar with a good compact off-the-shelf module
Mark> available that will save in compressed format... or maybe an opinion
Mark> on a smart approach for making a custom one?

Well, here's a thought: create a zip file (using the standard zipfile
module), and pickle your data into that.

HTH,
- Joe
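
Joe's suggestion could be sketched roughly as follows. This is a minimal sketch for Python 3, not code from the thread: the function names and the member name "data.pkl" are made up for illustration. Note that it still builds the pickle as an in-memory byte string before handing it to the archive.

```python
import pickle
import zipfile

def save_to_zip(zip_path, data_obj, member_name="data.pkl"):
    """Pickle data_obj and store it as one compressed member of a zip archive."""
    payload = pickle.dumps(data_obj, pickle.HIGHEST_PROTOCOL)
    # ZIP_DEFLATED turns on compression; the default (ZIP_STORED) does not compress.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, payload)

def load_from_zip(zip_path, member_name="data.pkl"):
    """Read the named member back out of the archive and unpickle it."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        return pickle.loads(zf.read(member_name))
```

A zip archive mostly pays off when several objects should live in one file; for a single object a plain compressed stream is simpler.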
 
Mark

Thanks guys. This is for serializing to disk. I was hoping not to
have to use too many intermediate steps, but I couldn't figure out how
to pickle data into a zipfile without using either an intermediate
string or a file. That's cool; here's what I'll probably settle on
(tested). Now I just need to reverse the steps for the open function.

    def saveOjb(self, dataObj):
        fName = self.version + '_' + self.modname + '.dat'
        f = open(fName, 'w')
        dStr = pickle.dumps(dataObj)
        c = dStr.encode("bz2")
        pickle.dump(c, f, pickle.HIGHEST_PROTOCOL)
        f.close()

I'm glad to see that "encode()" is not one of the string ops on the
deprecation list (using Python 2.5).
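
For the reverse direction, the steps can simply be undone in the opposite order. The following is a sketch in Python 3 rather than the 2.5 code above: the "bz2" string codec is not available on Python 3 strings, so bz2.compress and bz2.decompress stand in for it, and save_obj / load_obj are illustrative stand-alone functions rather than the original methods.

```python
import bz2
import pickle

def save_obj(path, data_obj):
    """Same three steps as the save function: pickle, compress, pickle to file."""
    d_str = pickle.dumps(data_obj)
    c = bz2.compress(d_str)          # stands in for dStr.encode("bz2")
    with open(path, "wb") as f:      # binary mode, since the payload is binary
        pickle.dump(c, f, pickle.HIGHEST_PROTOCOL)

def load_obj(path):
    """Undo the steps in reverse: unpickle from file, decompress, unpickle."""
    with open(path, "rb") as f:
        c = pickle.load(f)
    return pickle.loads(bz2.decompress(c))
```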

Thx,
Mark
 
skip

Mark> def saveOjb(self, dataObj):
Mark>     fName = self.version + '_' + self.modname + '.dat'
Mark>     f = open(fName, 'w')
Mark>     dStr = pickle.dumps(dataObj)
Mark>     c = dStr.encode("bz2")
Mark>     pickle.dump(c, f, pickle.HIGHEST_PROTOCOL)
Mark>     f.close()

Hmmm...  Why pickle it twice?

    def saveOjb(self, dataObj):
        fName = self.version + '_' + self.modname + '.dat'
        f = open(fName, 'wb')
        f.write(pickle.dumps(dataObj, pickle.HIGHEST_PROTOCOL).encode("bz2"))
        f.close()

Skip
 
Mark

    Mark> def saveOjb(self, dataObj):
    Mark>     fName = self.version + '_' + self.modname + '.dat'
    Mark>     f = open(fName, 'w')
    Mark>     dStr = pickle.dumps(dataObj)
    Mark>     c = dStr.encode("bz2")
    Mark>     pickle.dump(c, f, pickle.HIGHEST_PROTOCOL)
    Mark>     f.close()

Hmmm...  Why pickle it twice?

    def saveOjb(self, dataObj):
        fName = self.version + '_' + self.modname + '.dat'
        f = open(fName, 'wb')
        f.write(pickle.dumps(dataObj, pickle.HIGHEST_PROTOCOL).encode("bz2"))
        f.close()

Skip


I wasn't sure whether the string object was still a string after
"encode" is called... or at least whether it's still an ASCII string,
and if not, whether it could be used with dumps. I tested your
variation and it works the same. I guess your "write" is doing the
same as my "dump", but may be more efficient. Thanks.
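
A small experiment makes the difference between the two versions visible. This is a Python 3 sketch (bz2.compress replaces the Python 2 "bz2" codec, and the helper names are invented here): the compressed result is an ordinary byte string, pickle accepts it, and the second pickling only wraps those bytes in a little extra framing.

```python
import bz2
import pickle

def compress_once(data_obj):
    """Pickle, then compress: the bytes the single-pass version writes to disk."""
    return bz2.compress(pickle.dumps(data_obj, pickle.HIGHEST_PROTOCOL))

def wrap_again(compressed):
    """Pickle the already-compressed bytes a second time, as in the first version."""
    return pickle.dumps(compressed, pickle.HIGHEST_PROTOCOL)

compressed = compress_once(list(range(1000)))
wrapped = wrap_again(compressed)

# The compressed data is a plain byte string, so pickle can handle it.
print(type(compressed).__name__)
# The second pickle adds a few bytes of framing around the same payload.
print(len(wrapped) - len(compressed))
```

So both versions round-trip; the single-pass one just skips a wrapper and one extra load step when reading.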
 
greg

Mark said:
Thanks guys. This is for serializing to disk. I was hoping to not
have to use too many intermediate steps

You should be able to use a gzip.GzipFile
or bz2.BZ2File and pickle straight into it.
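
A minimal sketch of that idea, written for Python 3, might look like this. The function names are illustrative; bz2.BZ2File and gzip.GzipFile are the standard-library classes greg mentions, and either one can be passed directly to pickle.dump and pickle.load because both behave like binary file objects.

```python
import bz2
import gzip
import pickle

def save_bz2(path, data_obj):
    """Pickle straight into a bz2-compressed file, with no intermediate string."""
    with bz2.BZ2File(path, "wb") as f:
        pickle.dump(data_obj, f, pickle.HIGHEST_PROTOCOL)

def load_bz2(path):
    """Unpickle straight from the compressed file."""
    with bz2.BZ2File(path, "rb") as f:
        return pickle.load(f)

def save_gzip(path, data_obj):
    """Same idea with gzip; only the file class changes."""
    with gzip.GzipFile(path, "wb") as f:
        pickle.dump(data_obj, f, pickle.HIGHEST_PROTOCOL)

def load_gzip(path):
    with gzip.GzipFile(path, "rb") as f:
        return pickle.load(f)
```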
 
greg

Nick said:
(Note that basic pickle protocol is likely to be more compressible
than the binary version!)

Although the binary version may be more compact to
start with. It would be interesting to compare the
two and see which one wins.
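
One way to run that comparison is sketched below in Python 3. The helper name and the sample data are invented for illustration, and the outcome will depend heavily on the data being pickled, so no result is claimed here.

```python
import bz2
import pickle

def compressed_sizes(data_obj):
    """Return {protocol: (raw_size, bz2_size)} for the text and binary protocols."""
    sizes = {}
    for proto in (0, pickle.HIGHEST_PROTOCOL):
        raw = pickle.dumps(data_obj, proto)
        sizes[proto] = (len(raw), len(bz2.compress(raw)))
    return sizes

if __name__ == "__main__":
    # Hypothetical sample data; substitute the real object to get a meaningful answer.
    sample = [{"id": i, "name": "item%d" % i, "value": i * 0.5} for i in range(1000)]
    for proto, (raw_len, packed_len) in sorted(compressed_sizes(sample).items()):
        print("protocol %d: %d bytes raw, %d bytes after bz2" % (proto, raw_len, packed_len))
```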
 
