memory leak problem with arrays

sonjaa

Hi

I'm new to programming in Python, and I hope this is the right place for this problem.

I've created a cellular automata program in python with the numpy array
extensions. After each cycle/iteration the memory used to examine and
change the array as determined by the transition rules is never freed.
I've tried using "del" on every variable possible, but that hasn't
worked. I've read all the forums for helpful hints on what to do, but
nothing has worked so far. I've even tried the "python memory
verification" (beta) program, which did point to numpy.dtype and
numpy.ndarray as increasing objects, before the whole computer crashed.


I can supply the code if needed. I'm desperate because this is part of
my thesis, and if I can't get this fixed, I'll try another programming
language.

thanks in advance
Sonja
 
Serge Orlov

sonjaa said:
Hi

I'm new to programming in Python, and I hope this is the right place for this problem.

I've created a cellular automata program in python with the numpy array
extensions. After each cycle/iteration the memory used to examine and
change the array as determined by the transition rules is never freed.
I've tried using "del" on every variable possible, but that hasn't
worked.

Python keeps track of the number of references to every object. If the
object has more than one reference by the time you use "del", the object
is not freed; only the number of references is decremented.

Print the number of references for all the objects you think should be
freed after each cycle/iteration; if it is not equal to 2, that means you
are holding extra references to those objects. You can get the number of
references to any object by calling sys.getrefcount(obj).
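
For example, a minimal sketch (the array and the names here are just
placeholders):

    import sys
    import numpy

    a = numpy.zeros((500, 500))
    print sys.getrefcount(a)   # 2: the name "a" plus getrefcount's own argument
    b = a                      # bind a second name to the same object
    print sys.getrefcount(a)   # 3: the extra reference keeps the object alive
    del b
    print sys.getrefcount(a)   # back to 2, so "del a" would now free the array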
 
sonjaa

Serge said:
Python keeps track of the number of references to every object. If the
object has more than one reference by the time you use "del", the object
is not freed; only the number of references is decremented.

Print the number of references for all the objects you think should be
freed after each cycle/iteration; if it is not equal to 2, that means you
are holding extra references to those objects. You can get the number of
references to any object by calling sys.getrefcount(obj).

Thanks for the info. I used this on several variables/objects and
discovered that little counters, i.e. k = k + 1, have many references to
them, up to 10,000+.
Is there a way to free them?

regards
Sonja
 
John Machin

Thanks for the info. I used this on several variables/objects and
discovered that little counters, i.e. k = k + 1, have many references to
them, up to 10,000+.
Is there a way to free them?

If (for example) k refers to the integer object 10, all that means is
that there are 10,000+ references to that one integer object. The
references will be scattered throughout your data structures somewhere.

Caveat: I'm not a numpy user. Now read on:

I would have thought [by extrapolation from the built-in "array" module]
that numpy would allow you to "declare" a homogeneous array of integers
which would be stored as raw machine integers, not Python object integers,
in which case you would not be getting 10,000+ references to whatever "k"
refers to.
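
Something like this sketch should show the difference (untested by me,
given the caveat above; the dtype name is an assumption):

    import sys
    import numpy

    before = sys.getrefcount(10)

    # a homogeneous numpy array stores raw machine integers, so filling it
    # creates no new references to the Python integer object 10
    arr = numpy.zeros(10000, numpy.int32)
    arr[:] = 10
    print sys.getrefcount(10) - before   # roughly 0

    # a Python list of the same size really does hold 10000 references
    lst = [10] * 10000
    print sys.getrefcount(10) - before   # roughly 10000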

Suggested approaches: read numpy manual, publish relevant parts of your
code, wait for a numpy guru to appear.

HTH,
John
 
Robert Kern

sonjaa said:
Hi

I'm new to programming in Python, and I hope this is the right place for this problem.

I've created a cellular automata program in python with the numpy array
extensions. After each cycle/iteration the memory used to examine and
change the array as determined by the transition rules is never freed.
I've tried using "del" on every variable possible, but that hasn't
worked. I've read all the forums for helpful hints on what to do, but
nothing has worked so far. I've even tried the "python memory
verification" (beta) program, which did point to numpy.dtype and
numpy.ndarray as increasing objects, before the whole computer crashed.

Please post to numpy-discussion:

http://www.scipy.org/Mailing_Lists

We will need to know the version of numpy which you are using. There used to be
a bug that sounds like this, but it was fixed some time ago. Also, please try to
narrow your program down to the smallest piece of code that runs and still
displays the memory leak.

Thank you.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Serge Orlov

sonjaa said:
Thanks for the info. I used this on several variables/objects and
discovered that little counters, i.e. k = k + 1, have many references to
them, up to 10,000+.
Is there a way to free them?

Although it looks suspicious, even if you manage to free it you will
gain only 12 bytes. I think you should concentrate on fatter
objects ;)
 
sonjaa

Serge said:
Although it looks suspicious, even if you manage to free it you will
gain only 12 bytes. I think you should concentrate on fatter
objects ;)


Sent a message to the NumPy forum as per Robert's suggestion.
An update after implementing the suggestions:

After doing this I see that the iterative counters used to collect
occurrences, and the nested loop counters (ii & jj) seen in the code
example below, are the culprits, with the worst ones over 1M:

for ii in xrange(0, 40):
    for jj in xrange(0, 20):
        try:
            nc = y[a+ii, b+jj]
        except IndexError:
            nc = 0

        if nc == "1" or nc == "5":
            news = news + 1
            if news == 100:
                break
            else:
                pass
            y[a+ii, b+jj] = 4
        else:
            pass


The version of Python I'm using is 2.4.3 and the version of NumPy is 0.9.8.

thanks again for all the help
Sonja
 
Fredrik Lundh

After doing this I see that the iterative counters used to collect
occurrences, and the nested loop counters (ii & jj) seen in the code
example below, are the culprits, with the worst ones over 1M:

for ii in xrange(0, 40):
    for jj in xrange(0, 20):
        try:
            nc = y[a+ii, b+jj]
        except IndexError:
            nc = 0

        if nc == "1" or nc == "5":
            news = news + 1
            if news == 100:
                break
            else:
                pass
            y[a+ii, b+jj] = 4
        else:
            pass

what's "y" in this example?

</F>
 
Carl Banks

sonjaa said:
I've created a cellular automata program in python with the numpy array
extensions. After each cycle/iteration the memory used to examine and
change the array as determined by the transition rules is never freed.

Are you aware that slicing shares memory? For example, say you defined
a grid to do the automata calculations on, like this:

grid = numpy.zeros([1000,1000])

And then, after running it, you took a tiny slice as a region of
interest, for example:

roi = grid[10:20,10:20]

Then deleted grid:

del grid

Then stored roi somewhere, for example:

run_results.append(roi)

If you do this, the memory for the original grid won't get freed.
Although grid was deleted, roi still contains a reference to the whole
1000x1000 array, even though it's only a tiny slice of it. Your poorly
worded description--no offense--of what you did suggests that this is a
possibility in your case. I recommend you try to create a new array
out of any slices you make, like this (but ONLY if the slice doesn't
depend on the memory being shared):

roi = numpy.array(grid[10:20,10:20])

This time, when you del grid, there is no object left referencing the
array data, so it'll be freed.
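
If you want to check whether an array is sharing another array's memory,
inspecting its "base" attribute is one way; a sketch, assuming a numpy
recent enough to expose it:

    import numpy

    grid = numpy.zeros([1000, 1000])
    roi = grid[10:20, 10:20]                 # a view into grid's memory
    print roi.base is grid                   # True: roi pins the whole buffer

    copy = numpy.array(grid[10:20, 10:20])   # an independent copy
    print copy.base is grid                  # False: copy owns its own memory
    del grid                                 # roi still keeps the big buffer
                                             # alive; copy does not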

This might not be your problem. Details are important when asking
questions, and so far you've only given us enough to speculate with.

Carl Banks
 
sonjaa

Fredrik said:
After doing this I see that the iterative counters used to collect
occurrences, and the nested loop counters (ii & jj) seen in the code
example below, are the culprits, with the worst ones over 1M:

for ii in xrange(0, 40):
    for jj in xrange(0, 20):
        try:
            nc = y[a+ii, b+jj]
        except IndexError:
            nc = 0

        if nc == "1" or nc == "5":
            news = news + 1
            if news == 100:
                break
            else:
                pass
            y[a+ii, b+jj] = 4
        else:
            pass

what's "y" in this example?

</F>

"y" is a 500x500 array.
 
sonjaa

Carl said:
sonjaa said:
I've created a cellular automata program in python with the numpy array
extensions. After each cycle/iteration the memory used to examine and
change the array as determined by the transition rules is never freed.

Are you aware that slicing shares memory? For example, say you defined
a grid to do the automata calculations on, like this:

grid = numpy.zeros([1000,1000])

And then, after running it, you took a tiny slice as a region of
interest, for example:

roi = grid[10:20,10:20]

Then deleted grid:

del grid

Then stored roi somewhere, for example:

run_results.append(roi)

If you do this, the memory for the original grid won't get freed.
Although grid was deleted, roi still contains a reference to the whole
1000x1000 array, even though it's only a tiny slice of it. Your poorly
worded description--no offense--of what you did suggests that this is a
possibility in your case. I recommend you try to create a new array
out of any slices you make, like this (but ONLY if the slice doesn't
depend on the memory being shared):

roi = numpy.array(grid[10:20,10:20])

This time, when you del grid, there is no object left referencing the
array data, so it'll be freed.

This might not be your problem. Details are important when asking
questions, and so far you've only given us enough to speculate with.

Carl Banks

I believe I understand your post. I don't think I was slicing the
array; I was only changing the values of the array.

I will try your suggestion and let you know how it goes

thanks
Sonja
 
Serge Orlov

sonjaa said:
Sent a message to the NumPy forum as per Robert's suggestion.
An update after implementing the suggestions:

After doing this I see that the iterative counters used to collect
occurrences, and the nested loop counters (ii & jj) seen in the code
example below, are the culprits, with the worst ones over 1M:

That means you have over 1M integers in your program. How did that happen
if you're using numpy arrays? If I allocate a numpy array of one
million bytes, it does not create one million Python integers, whereas a
Python list of 1M integers really does hold 1M references:
>>> import numpy
>>> a = numpy.zeros((1000000,), numpy.UnsignedInt8)
>>> import sys
>>> sys.getrefcount(0)
632
>>> b = [0] * 1000000
>>> sys.getrefcount(0)
1000632

But that doesn't explain why your program doesn't free memory. By the
way, are you sure you have enough memory for one iteration of your
program?
 
sonjaa

Hi Fredrik

The array was created by reading in values from an ASCII file.

Also, I've implemented the suggestions, but nothing has worked to date.
And yes, I have enough memory for one iteration; the app usually runs
out of memory around the 12th iteration.

I can send a working version of the app, and the two associated
ASCII files, if anyone is interested.

-Sonja
 
