pickle: huge memory consumption *during* pickling


Hans Georg Krauthaeuser

Dear all,

I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.

This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

System: Python 2.3.4, Win XP, 1.X GHz class PC, 512 MB RAM

Best regards
Hans Georg
 
Jean Brouwers

FWIW, we pickle data extracted from large log files. The number of
pickled objects is about 1500, the size of the pickle file is 55+ MB, and
it takes about 3 minutes to generate that file**.

This is cPickle, using protocol 2 (!), and all strings to be pickled are
interned when initially created.

/Jean Brouwers
ProphICy Semiconductor, Inc.

**) On a dual 2.4 GHz Xeon machine with 2 GB of memory running RedHat
Linux 8.0.
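
The effect of interning on pickle size can be sketched as follows. This is an illustration only, not Jean's actual code: it assumes Python 3, where the C implementation formerly called cPickle is used automatically by `pickle` and the `intern` builtin lives at `sys.intern`. The helper `make_label` and the sample data are hypothetical. Pickle memoizes objects by identity, so a string object that appears many times is written once and then referenced, while equal but distinct string objects are each written in full.

```python
import pickle
import sys


def make_label():
    # Build the string at run time so every call returns a new object
    # (hypothetical helper, only for this illustration).
    return "".join(["channel", "_", "A"])


# 1000 equal strings that are all different objects.
distinct = [make_label() for _ in range(1000)]

# 1000 references to one interned string object.
shared = [sys.intern(make_label()) for _ in range(1000)]

size_distinct = len(pickle.dumps(distinct, protocol=2))
size_shared = len(pickle.dumps(shared, protocol=2))

# The interned list pickles smaller because repeated objects are
# emitted as short memo references instead of full string data.
print(size_distinct, size_shared)
```

Whether interning helps in a given application depends on how often the same string values repeat in the data.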
 
Gerrit

Hans said:
This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

Are you sure you are using cPickle as opposed to pickle?

regards,
Gerrit Holl.

--
Weather in Twenthe, Netherlands 11/11 19:25:
-1.0°C wind 0.4 m/s None (57 m above NAP)
--
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
-Dwight David Eisenhower, January 17, 1961
 
Nick Craig-Wood

Hans Georg Krauthaeuser said:
I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.

This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

You've probably got lots of instances of a single class... We managed
to cut the memory requirements to 1/3 in a similar situation by using
new-style classes (inherit from object) and defining __slots__ for just
a single class!

Hans Georg Krauthaeuser said:
My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

This may be important, I don't know!
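
The __slots__ suggestion above can be sketched as follows. This is a minimal illustration, not the code from either poster's application: the class names `SampleDict` and `SampleSlots` and their attributes are hypothetical, and the size figures rely on `sys.getsizeof`, which is specific to CPython and only approximates per-instance memory.

```python
import sys


class SampleDict(object):
    # Ordinary class: every instance carries its own attribute dict.
    def __init__(self, freq, value):
        self.freq = freq
        self.value = value


class SampleSlots(object):
    # Fixed attribute layout: instances have no per-instance __dict__.
    __slots__ = ("freq", "value")

    def __init__(self, freq, value):
        self.freq = freq
        self.value = value


plain = SampleDict(1.0e9, 0.25)
slotted = SampleSlots(1.0e9, 0.25)

# Rough per-instance footprint (CPython only).
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print(plain_size, slotted_size)

# The trade-off: attributes not listed in __slots__ cannot be added.
try:
    slotted.extra = 1
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```

The saving per instance is small, so it only matters when there are very many instances of the class, which is the situation described above.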
 
Jean Brouwers

Good point. Double check first that you use

- import cPickle

- cPickle.dump(<obj>, <file>, 2) # note, protocol 2

/Jean Brouwers
ProphICy Semiconductor, Inc.
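
A runnable version of the check above might look like this. It is a sketch under assumptions: it uses Python 3's `pickle` (which picks up the C implementation automatically) in place of the Python 2 `cPickle` module, and the sample `data` dict and the file name `autosave.pkl` are made up for the example. Protocol 2 is a binary format, so the file has to be opened in binary mode, which matters on Windows in particular.

```python
import os
import pickle
import tempfile

# Stand-in for the measured data (hypothetical).
data = {"frequencies": list(range(1000)), "label": "run-1"}

path = os.path.join(tempfile.mkdtemp(), "autosave.pkl")

# Binary mode ("wb") is required for protocol 2.
with open(path, "wb") as f:
    pickle.dump(data, f, 2)  # note, protocol 2

with open(path, "rb") as f:
    restored = pickle.load(f)

# Compare the text protocol (0) with the binary protocol 2.
size_text = len(pickle.dumps(data, 0))
size_binary = len(pickle.dumps(data, 2))
print(size_text, size_binary)
```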
 
Tony Clarke

Hans Georg Krauthaeuser said:
Dear all,

I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.
(Snip)
My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

System: Python 2.3.4, Win XP, 1.X GHz class PC, 512 MB RAM

Best regards
Hans Georg

The tutorial books I read (including the Python Bible, I think) said
that pickle shouldn't be used for large objects, so I try to limit it
to smaller objects in small applications. I always wondered what they
meant by large objects, maybe this is an illustration of that? :)

Would it not be possible to save your data as a file on your disk (or
use a class method to write the stored data out to a file)? You could
always reload it from there for further use. Or split the class into
several smaller ones, each of which might be pickled more efficiently?

Regards

Tony Clarke
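
One way to read the suggestion of splitting the data into smaller pieces is to append each new measurement to a file as its own pickle record, so that an autosave only writes the new data instead of re-pickling everything. The sketch below assumes that the measurements can be stored independently of each other; the functions `append_record` and `load_records`, the record layout, and the file name are all hypothetical.

```python
import os
import pickle
import tempfile


def append_record(path, record):
    # Append one record as a separate pickle at the end of the file.
    with open(path, "ab") as f:
        pickle.dump(record, f, protocol=2)


def load_records(path):
    # Read pickles back one by one until the end of the file.
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break
    return records


path = os.path.join(tempfile.mkdtemp(), "measurements.pkl")

# Simulate three autosave steps, each writing only the new data.
for step in range(3):
    append_record(path, {"step": step, "field": [0.1 * step, 0.2 * step]})

loaded = load_records(path)
print(len(loaded))
```

Because every record is pickled separately, objects shared between records are stored once per record, so this approach suits data that is naturally a sequence of independent measurements.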
 
Hans Georg Krauthaeuser

Yes, I'm sure that I'm using cPickle.

But, I don't use protocol 2. I will try that and post the difference.

Thanks for the hint.

Hans Georg
 
Hans Georg Krauthaeuser

Nick said:
....
You've probably got lots of instances of a single class...

You are right. My data are class objects taken from a C++ class (via SWIG).

Nick said:
We managed to cut the memory requirements to 1/3 in a similar situation
by using new-style classes (inherit from object) and defining __slots__
for just a single class!

Interesting, I hadn't noticed __slots__ before.

In the SWIG wrapper I found that my objects inherit from _object,
which is defined as:

    import types
    try:
        _object = types.ObjectType
        _newclass = 1
    except AttributeError:
        class _object : pass
        _newclass = 0
    del types

So these are new-style classes.

Now I have to see how to get SWIG to generate __slots__ ...

Thanks,

Hans Georg
 
