Python resident memory retention & Evan Jones' improvements

Matt Ernst

My apologies in advance if this has been addressed before. Google does
not presently seem to return search results for this group from more
than a couple of months ago.

I have some long-running Python processes that slowly increase in
resident memory size, and whose resident size goes down only when they
are restarted. I spent hours with gc and heapy but was unable to
identify obvious culprits. I eventually tracked the problem down to
buffering data in a queue for later processing. Putting items on the
queue increases resident size, but getting them off never brings it
back down. In fact I see the same behavior when I use plain lists
instead of queue objects.
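
Stripped down, the pattern looks roughly like this (a sketch, with the
counts made up for illustration; resident size can be checked
externally with top at each prompt):

"""
import Queue

q = Queue.Queue()
for i in xrange(1000000):
    q.put([])                 # buffer items for later processing
raw_input("queue full ")      # resident size is now tens of megabytes
while not q.empty():
    q.get()                   # drain the queue
raw_input("queue drained ")   # resident size has not gone back down
"""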

I thought Evan Jones altered Python to deal with this very problem,
and that the change went into the 2.5 release.

Here is Tim Peters announcing the change:
http://mail.python.org/pipermail/python-dev/2006-March/061991.html

He included this simple test program to show the improvement:
"""
x = []
for i in xrange(1000000):
    x.append([])
raw_input("full ")
del x[:]
raw_input("empty ")
"""

If you look at resident size in the "full" stage, the interpreter has
grown to tens of megabytes. If you look at it in the "empty" stage, it
goes back down to less than 10 megabytes. But if you run this trivial
variation on the same program, memory use goes up and stays up:

"""
x = []
for i in xrange(1000000):
    x.append([])
raw_input("full ")
del x[:]
for i in xrange(1000000):
    x.append([])
del x[:]
raw_input("empty ")
"""

At the "empty" prompt resident memory size has not decreased. I see
this pattern of behavior in CPython 3.1.1, 2.6.3, 2.5.2, and Jython
2.5.1. I have tested under 32- and 64-bit Intel Linux.
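
In case the measurement method matters: a minimal Linux-only sketch of
reading resident size from inside the script, assuming the usual
/proc/self/status layout:

"""
def resident_kb():
    # Linux-only: return this process's resident set size (VmRSS) in kB
    for line in open('/proc/self/status'):
        if line.startswith('VmRSS:'):
            return int(line.split()[1])
    return None

print resident_kb()
"""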

At this point I suspect that I am not going to be able to force my
long-running processes to shrink their resident size, since I can't
force it in much simpler tests. I am curious about why it happens
though. That the second program should retain a larger resident memory
footprint than the first is (to me) quite surprising.
 
Andrew MacIntyre

Matt Ernst wrote:

{...}
At this point I suspect that I am not going to be able to force my
long-running processes to shrink their resident size, since I can't
force it in much simpler tests. I am curious about why it happens
though. That the second program should retain a larger resident memory
footprint than the first is (to me) quite surprising.

There are two things you need to be aware of in this situation:

- not all of Python's memory is allocated through Python's specialised
malloc() - int and float objects in particular (in 2.x at least) are
allocated directly from privately managed free lists, and any
allocation requiring more than 256 bytes is directed to the platform
malloc(). Any memory not allocated via Python's malloc() is not
subject to the memory release facility referred to above.

Python 2.6 does improve the management of memory consumed by int and
float objects via the garbage collector.

- while Python attempts to maximally utilise memory arenas to improve
the chances of being able to free them, Python's malloc() does not do
any compaction of memory (i.e. it never moves live allocations within
or between arenas), and garbage collection doesn't do this either. So
fragmented allocations can cause the retention of nearly empty arenas.

I suspect that what you see with the second test script above is caused
by some sort of fragmentation.
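
As a rough sketch of the general effect (not necessarily your exact
case): if most of a large batch of small objects is freed but a
scattered subset survives, nearly every arena still holds at least one
live object, so few arenas ever become empty enough to be returned:

"""
x = []
for i in xrange(1000000):
    x.append([])
keep = x[::1000]        # hold on to every 1000th list (1000 survivors)
del x[:]                # drop references to the other 999,000 lists
raw_input("sparse ")    # resident size stays high: the survivors are
                        # scattered across the arenas, pinning them
"""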

My recollection of Evan's objective with his work was to deal with the
case where a long-running process creates lots of objects on startup
but, once initialised, no longer needs most of them. Without his
modification, that memory would have been retained unused for the
remaining life of the process. The change also helps with cyclic
bursts of object creation/deletion.

But there are circumstances where it doesn't kick in.

Getting a deeper understanding of your issue will require deeper
debugging. I have done this at times by building Python with a wrapper
around malloc() (and friends) to log memory allocation activity.

Matt Ernst

{...}
I suspect that what you see with the second test script above is caused
by some sort of fragmentation.

Thank you for the explanation. Since my real application uses many
small objects, it makes sense that memory cannot be reclaimed in the
absence of compaction.

Matt
 
