Program eating memory, but only on one machine?

Per B. Sederberg

Hi Everybody:

I'm having a difficult time figuring out a memory use problem. I
have a python program that makes use of numpy and also calls a small C
module I wrote because part of the simulation needed to loop and I got
a massive speedup by putting that loop in C. I'm basically
manipulating a bunch of matrices, so nothing too fancy.

That aside, when the simulation runs, it typically uses a relatively
small amount of memory (about 1.5% of my 4GB of RAM on my linux
desktop) and this never increases. It can run for days without
increasing beyond this, running many many parameter set iterations.
This is what happens both on my Ubuntu Linux machine with the
following Python specs:

Python 2.4.4c1 (#2, Oct 11 2006, 20:00:03)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> numpy.__version__
'1.0rc1'

and also on my Apple MacBook with the following Python specs:

Python 2.4.3 (#1, Apr 7 2006, 10:54:33)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

Well, that is the case on two of my test machines, but not on the one
machine that I really wish would work: my lab's cluster, which would
give me a 20-fold increase in the number of processes I could run. On
that machine, each process is using 2GB of RAM after about an hour
(and the cluster's MOM daemon eventually kills them). I can watch the
process eat RAM at each iteration and never relinquish it. Here's the
Python spec of the cluster:

Python 2.4.4 (#1, Jan 21 2007, 12:09:48)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-49)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> numpy.__version__
'1.0.1'

It also showed the same issue with the April 2006 release of Python 2.4.3.

I have tried using the gc module to force garbage collection after
each iteration, but it made no difference. I've done many
newsgroup/Google searches for known issues, but found none. The only
major difference I can see is that our cluster is stuck on a really
old version of gcc that comes with the Red Hat Enterprise release
installed there, but I found no reports of memory issues online.

So, does anyone have any suggestions for how I can debug this problem?
If my program ate up memory on all machines, then I would know where
to start and would blame some horrible programming on my end. This
just seems like a less straightforward problem.
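For reference, here is a minimal sketch (in modern Python) of the kind of per-iteration check I'm doing, using the standard-library resource module; note that ru_maxrss is in kilobytes on Linux but bytes on macOS, and run_iteration here is just a hypothetical stand-in for one parameter-set iteration:

```python
import gc
import resource

def rss_kb():
    # Peak resident set size so far; kilobytes on Linux, bytes on macOS
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def run_iteration():
    # Hypothetical stand-in for one parameter-set iteration
    data = [0] * 100000
    return sum(data)

for i in range(3):
    run_iteration()
    gc.collect()  # force a collection so growth isn't just deferred GC
    print("iteration %d: peak RSS %d" % (i, rss_kb()))
```

A number that climbs steadily on one machine but stays flat on another is exactly the symptom.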

Thanks for any help,
Per
 

Wolfgang Draxinger

Per said:
Python 2.4.4c1 (#2, Oct 11 2006, 20:00:03)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.

Doesn't eat up.
Python 2.4.3 (#1, Apr 7 2006, 10:54:33)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more
information.

Doesn't eat up.
Python 2.4.4 (#1, Jan 21 2007, 12:09:48)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-49)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.

Eats up memory
So, does anyone have any suggestions for how I can debug this
problem?

Have a look at the version numbers of the GCC used. Probably
something in your C code fails when built with GCC 3.x.x. It's
unlikely to be Python itself eating the memory; it's probably your C
module. The gc module won't help here, since Python's garbage
collector can't track memory allocated inside your C module.
If my program ate up memory on all machines, then I would know
where to start and would blame some horrible programming on my
end. This just seems like a less straightforward problem.

GCC 3.x.x ships different runtime libraries than GCC 4.x.x; I would
look in that direction.
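A quick way to compare the toolchains is to ask each interpreter which compiler built it, using the standard-library platform module:

```python
import platform

# Compare these across the desktop, the MacBook and the cluster to see
# the GCC 3.x vs GCC 4.x difference directly.
print("compiler:", platform.python_compiler())
print("platform:", platform.platform())
print("python  :", platform.python_version())
```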

Wolfgang Draxinger
 

Per B.Sederberg

Wolfgang Draxinger said:
Have a look at the version numbers of the GCC used. Probably
something in your C code fails when built with GCC 3.x.x. It's
unlikely to be Python itself eating the memory; it's probably your C
module. The gc module won't help here, since Python's garbage
collector can't track memory allocated inside your C module.


GCC 3.x.x ships different runtime libraries than GCC 4.x.x; I would
look in that direction.

Thank you for the suggestions. Since my C module is such a small part of the
simulation, I can comment out the call to that module completely (though I
am still loading it) and fill in what its results would have been with
random values. Sadly, the program still eats up memory on our cluster.

Still, it could be something related to Python having been compiled with the older GCC.

I'll see if I can make a really small example program that eats up memory on our
cluster. That way we'll have something easy to work with.

Thanks,
Per
 

Wolfgang Grafen

I had a similar problem with an extension module on Solaris years ago.
My problem at that time:
I requested memory, released it, requested more memory in the next step,
and so on.

The reason the memory appeared to be eaten up:
An answer from this group was that the operating system doesn't return
freed memory to the system, because it assumes you will need it again
soon. The memory is only given back when the process ends.

The solution was to request memory up front for the largest array the
process would ever demand, and that worked for me.
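A minimal sketch of that preallocation strategy with numpy (MAX_N is an assumed upper bound on the array size any iteration needs):

```python
import numpy as np

MAX_N = 1_000_000           # assumed upper bound for any iteration
buf = np.empty(MAX_N)       # allocate once, up front, and reuse

def iteration(n):
    work = buf[:n]          # a view into buf: no new allocation per call
    work[:] = np.arange(n)  # fill with this iteration's data
    return work.sum()

print(iteration(10))  # → 45.0
```

Because every iteration works inside the same buffer, the process's memory footprint stays fixed at the size of the largest request.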

Regards

Wolfgang
 

Per B.Sederberg

Per B.Sederberg said:
I'll see if I can make a really small example program that eats up memory on
our cluster. That way we'll have something easy to work with.

Now this is weird. I figured out the bug: it turns out that every time you
call numpy.setmember1d in the latest stable release of numpy, it uses up a
ton of memory and never releases it.

I replaced every instance of setmember1d with my own function below, and I
see zero increase in memory. It's not the most efficient code, but it gets
the job done...


from numpy import zeros

def ismember(a, b):
    # True wherever an element of a also appears in b
    ainb = zeros(len(a), dtype=bool)
    for item in b:
        ainb = ainb | (a == item)
    return ainb
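For anyone who wants to try it, here is the same replacement as a self-contained snippet with a quick sanity check:

```python
import numpy as np

def ismember(a, b):
    # True wherever an element of a also appears in b
    ainb = np.zeros(len(a), dtype=bool)
    for item in b:
        ainb = ainb | (a == item)
    return ainb

mask = ismember(np.array([1, 2, 3, 4, 5]), np.array([2, 4]))
print(mask.tolist())  # → [False, True, False, True, False]
```

It loops over b in Python, so it's only sensible when b is small relative to a, which is my case.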

I'll now go post this problem on the numpy forums.

Best,
Per
 

Robert Kern

Per said:
Now this is weird. I figured out the bug: it turns out that every time you
call numpy.setmember1d in the latest stable release of numpy, it uses up a
ton of memory and never releases it.

Hmm. With a recent checkout from SVN, I don't see any memory increase.


In [15]: from numpy import *

In [16]: ar1 = arange(1000000)

In [17]: ar2 = arange(3, 7)

In [18]: import itertools

In [19]: for i in itertools.count(1):
   ....:     if not i % 1000:
   ....:         print i
   ....:     x = setmember1d(ar1, ar2)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
