Help! define_method leaking procs...

Ara.T.Howard

Just because a process frees all its memory before it terminates doesn't
mean there isn't a leak. That doesn't help too much if you run out of
memory before the process is done. I've seen 2 definitions: a) when a
program isn't freeing memory that isn't used anymore, b) when a program
loses all references to memory that isn't being freed. From the ruby script
level for this example, by both of these definitions it is a memory leak.
From ruby itself, ruby probably still has references to the memory, so (b)
may not call it a "memory leak". To me (a) is the right definition because
it describes symptoms, not the mechanism. If any program keeps growing in
memory and it shouldn't, I'd say it has a memory leak. It doesn't matter
how it comes about and whether or not the program could free the memory if
it was fixed, it is still "leaking" memory.

by your definition the code i was working with last week is 'leaky': it
malloc's a 4GB grid (yes that's right) and does nearest neighbor filling on
it. if you think about it for a few moments you'll realize that only a small
window of pixels needs to be in memory at once and, after a pixel has been
written, it will never be needed again, except very soon after as a neighbor
itself, thus it need not remain in memory. the code didn't run - it ran out of
memory. there is no leak, just too large a memory requirement. changing the
malloc/write/free to mmap/munmap fixed the problem - proving there was no leak,
since allocating/deallocating a different way allowed the code to run, and
recognizing that the code had not lost the ability (pointer) to free the
memory, it just hadn't yet.

the key to 'a' above is "isn't used anymore." all ruby knows is that your
proc __might__ need __anything__, for instance:

harp:~ > cat a.rb
def scope
  a, b = 4, 2
  lambda{|s| eval "p #{ s }" }
end

scope[ ARGV.shift ]

harp:~ > ruby a.rb a
4

harp:~ > ruby a.rb b
2

so it must, according to the present promised behaviour of lambda, preserve
everything. so it seems that it cannot be a leak according to 'a' because it's
t.b.d. whether the memory will be used or not, and it cannot be a leak according
to 'b' because references are clearly not lost. an interpreter cannot manage
memory as efficiently as a very bright programmer - it can do it better than
most though.
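
The same point can be shown without eval. The following is only a minimal sketch, assuming a Ruby recent enough to have Binding#local_variable_get (2.1 or later, so not the interpreter discussed in this thread); the names make_closure, big and n are made up for illustration. It shows that a lambda's binding still exposes a local variable the block never mentions.

```ruby
# Hypothetical helper: returns a lambda that only ever uses `n`.
def make_closure
  big = Array.new(1_000) { |i| i }   # never referenced inside the lambda
  n = 7
  lambda { n * n }
end

pr = make_closure

# The binding captured by the lambda covers the whole method scope,
# so the unused array is still reachable through it.
captured = pr.binding.local_variables
big_size = pr.binding.local_variable_get(:big).size

puts captured.inspect
puts big_size
```

As long as pr is alive, the array stays reachable through its binding, which is why the interpreter has to keep it.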

fork/drb and co-processes are good tools for cases like this. it's easy to
set up a memory sandbox in ruby that you can pass objects to and return objects
from and which is guaranteed to free all memory (on exit).
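
One way such a sandbox might look, as a rough sketch only: it assumes a platform where fork is available (so not Windows), and in_sandbox is a made-up helper name, not an existing library call. The block runs in a child process, its result comes back through a pipe via Marshal, and everything the child allocated goes away when the child exits.

```ruby
# Hypothetical helper: run the block in a forked child and return its
# (marshalable) result to the parent.
def in_sandbox
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))  # send the result back to the parent
    writer.close
    exit!(0)                           # leave without running at_exit hooks
  end

  writer.close
  data = reader.read                   # read until the child closes its end
  reader.close
  Process.wait(pid)
  Marshal.load(data)
end

# The big intermediate array only ever exists in the child process.
result = in_sandbox { (1..100_000).to_a.sum }
puts result
```

The result has to be something Marshal can dump, so procs and IO objects cannot be returned this way.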

in any case, it seems the issue is that people may not always like what lambda
does, but it seems like it's doing it properly.
If you want to assign blame to ruby, the problem is not GC.
The problem is that closures hold strong references to all
variables in the scope whether or not they are used. One of my
solutions was to make references (in Proc#binding) to currently
unused variables (by the closure) "weak", so that they wouldn't
prevent GC.

agreed - but no 'leak'.

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| anything that contradicts experience and logic should be abandoned.
| -- h.h. the 14th dalai lama
===============================================================================
 
Ara.T.Howard

The other leaking examples were technically quadratic (O(n**2)) where they
should have been O(n). The above one is also O(n). Take out the "s = '42' *
i and" and you'll see the same memory usage. I don't see an issue in the
above example. It's just that the memory usage was so small for anything
below n=1024, that it probably didn't have to grow much beyond what ruby had
allocated to it when it started up.

you are correct : never send email at 1am ;-)

so, no leak from define_method then.


-a

Eric Mahurin

--- ts said:
E> to memory that isn't being freed. From the ruby script level
E> for this example, by both of these definitions it is a memory
E> leak. From ruby itself, ruby probably still has references to
E> the memory, so (b) may not call it a "memory leak". To me (a)
E> is the right definition because it describes symptoms not the
E> mechanism. If any program keeps growing in memory and it
E> shouldn't, I'd say it has a memory leak. It doesn't matter how
E> it comes about and whether or not the program could free the
E> memory if it was fixed, it is still "leaking" memory.

No please.

You are writing stupid program, don't expect that ruby correct what you
write. ruby is a programming language, first learn it before trying to
blame it.

The above paragraph doesn't even blame ruby. This thread
started with a real ruby program "leaking" memory. I gave a
possible reason as that closures hold strong references to all
variables in their scope (with a leaky example). Matz also
responded to this thread giving the exact same possibility. He
has also discussed addressing this issue in another thread
(where he said eval could be a keyword). Does he need to go
learn the language better too?




ts

E> The above paragraph doesn't even blame ruby. This thread
E> started with a real ruby program "leaking" memory. I gave a
E> possible reason as that closures hold strong references to all
E> variables in their scope (with a leaky example).

This is really the first time that you use a language which has closure ?


Guy Decoux
 
Eric Mahurin

--- ts said:
E> The above paragraph doesn't even blame ruby. This thread
E> started with a real ruby program "leaking" memory. I gave a
E> possible reason as that closures hold strong references to all
E> variables in their scope (with a leaky example).

This is really the first time that you use a language which
has closure ?

Why do you feel the need to get personal? And I noticed you
deleted the sentences where Matz discussed the same issue.

To answer your question though - No. I've used perl for over 10
years before coming to ruby. I think perl is where closures
started. And here is the same type of example with perl:

perl -e '@f=();$n=2**$ARGV[0];for my $i (0..$n)
{my($a)=[0..$i];push(@f,sub{$i*$i})};
system("cat /proc/$$/status|grep VmSize")' 14
VmSize: 12812 kB

The memory usage was O(n).

Compare this to the case where it should blow up because
closure references the large array:

perl -e '@f=();$n=2**$ARGV[0];for my $i (0..$n)
{my($a)=[0..$i];push(@f,sub{$a;$i*$i})};
system("cat /proc/$$/status|grep VmSize")' 14
Killed

So perl only keeps variables that it uses with the closure.
Ruby keeps everything because of eval and Proc#binding. But, I
still think this needs to be addressed.





ts

E> To answer your question though - No. I've used perl for over 10
E> years before coming to ruby. I think perl is where closures
E> started. And here is the same type of example with perl:

Bad reference, change your reference.


Guy Decoux
 
Ara.T.Howard

E> To answer your question though - No. I've used perl for over 10
E> years before coming to ruby. I think perl is where closures
E> started. And here is the same type of example with perl:

Bad reference, change your reference.

can you show us what you mean guy?

-a

ES

Eric said:
So your definition of "memory leak" says that it is not a memory
leak if the unused memory can still be freed by the program.

Yes. As far as I knew, this was the only sensible definition.
If there is a problem, it is not a memory leak (as most or all
C programmers understand the term anyway); changing the term might
yield a more successful debate.
With the example I gave, you could still free the memory using
the Proc#binding with eval to assign the unused variables to
nil. But, here is a slightly modified example (using
#define_method) where you lose this access:

ruby -e '
n=2**13;(1..n).each{|i|
a=(1..i).to_a;self.class.send(:define_method,:"f#{i}"){i*i}};GC.start;
IO.readlines("/proc/#{Process.pid}/status").grep(/VmSize/).display'
VmSize: 172028 kB

As far as I know, you can't access any of those a's after you
get out of the each loop. I call that a memory leak by any
definition.

Here is the fixed version:

ruby -e '
n=2**13;(1..n).each{|i|
a=(1..i).to_a;self.class.send(:define_method,:"f#{i}"){i*i};a=nil};GC.start;
IO.readlines("/proc/#{Process.pid}/status").grep(/VmSize/).display'
VmSize: 11504 kB
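
The effect of the a=nil workaround can also be checked without looking at VmSize. This is only a sketch: it uses plain lambdas instead of define_method (a method defined that way gives no access to its binding), it assumes Ruby 2.1 or later for Binding#local_variable_get, and make is a made-up helper name.

```ruby
# Hypothetical helper: builds a closure next to a local array `a`,
# optionally clearing `a` before the scope is left.
def make(i, clear)
  a = (1..i).to_a        # stands in for the large array in the one-liner
  pr = lambda { i * i }  # the closure never mentions `a`
  a = nil if clear       # the workaround: drop the reference by hand
  pr
end

# What is still reachable through each closure's binding?
kept    = make(5, false).binding.local_variable_get(:a)
cleared = make(5, true).binding.local_variable_get(:a)

puts kept.inspect
puts cleared.inspect
```

Without the assignment the array is still held by the closure's scope; with it, only nil is left for the closure to hold on to.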

E
 
Eric Mahurin

--- ES said:
Yes. As far as I knew, this was the only sensible definition.
If there is a problem, it is not a memory leak (as most or all
C programmers understand the term anyway); changing the term might
yield a more successful debate.

Try a google on "memory leak". The first several definitions
you'll find are more general and not limited to languages that
don't have automatic GC. Just because you have a perfect
language with perfect GC doesn't mean you can't write a program
that "leaks" memory - over time uses more and more memory when
it doesn't need it. I think the language/GC/libs should try to
plug as many holes as possible, but you still won't be able to
prevent all memory leak bugs of programs written in the
language.




Yohanes Santoso

Randy Kramer said:
I'm really not trying to start a flamewar. Some of the statements made in
this thread seem completely contrary to what I thought I knew, and
counter-intuitive as well.

Based on what you wrote below, the statements I made are in
agreement with what you thought you knew for the most part.
I didn't/don't think so. Unused memory allocations that are not freed while
the process is running seem to be the very definition of a memory
leak:

I can go with this definition. The difference between this and mine is
the word 'unused'. In a language with GC, 'unused' is an expensive
(and difficult) property to detect automatically. If the GC is
aggressive, the number of unused allocations is likely to be
smaller than with a conservative GC, and Ruby has a conservative
GC. There is a trade-off between being aggressive & more expensive
vs. conservative & cheaper.
Result of Googling [define:"memory leak"]:

I agree with all these statements, and they do not contradict my
statement either.

In addition, I'd like to emphasize this:
Memory leaks are often thought of as failures to release unused memory by a
computer program. Strictly speaking, it is just unnecessary memory
consumption.

unused allocation is just an alias for unnecessary memory consumption.
A memory leak occurs when the program loses the ability to free
the memory.
en.wikipedia.org/wiki/Memory_leak

No evidence yet that the GC loses the ability to free the
memory. Proving this amounts to proving that an allocation is not
freed by the time the process ends. Which is exactly my statement.
I'd call it a memory leak even if the allocation is freed when the process
ends. Consider processes that (should) run continuously.

Then this really falls under the category 'unnecessary memory
consumption' as per the reference you gave.

That may be true (I really don't know)--but:
* if increasing VmSize corresponds to a slowdown in my machine, I'd call it
a pretty good "indicator"

Try tracing through leak not_immed. You will see this peculiar
variable called 'hole'. hole holds a pointer to the last allocation
performed. It's there to show that allocated memory is not necessarily
returned to OS.

By the time it gets to the 'unwinding' part, there are exactly 8194
allocations (8193 for array&elements, 1 for hole). By the time the
execution returns back to main and displays the vminfo, there are
exactly 8193 free(). There is still 1 allocation that is not
freed. Yet, the VmSize shows that the other 8193 free() does not
result in the memory being returned to the OS.

What I was saying to Eric Mahurin was that the ruby gc, even being
conservative as it is, could have free()ed all the unnecessary
allocations, yet there could still be one allocation that would result
in none of the freed allocations being returned to the OS. In this
case, a program really cannot do anything.

So a large VmSize value could have meant anything: it could have meant
that the gc is too conservative, it could have meant the gc is buggy,
it could have meant the gc has free()ed all allocations save one, it
could have meant ..., etc.
* what can you use as a better indicator

object count would be a better indicator.

def count_objects
  object_count = 0
  ObjectSpace.each_object { object_count += 1 }
  object_count
end

Had Eric used this, I would not have been involved at all in this
thread.
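
As a rough illustration of using an object count as the indicator, here is a sketch that counts instances of a single class instead of every object. Tracked and live_count are made-up names. The count after releasing the references is printed but not promised: because the GC is conservative, some instances may survive GC.start.

```ruby
# Hypothetical class whose instances we want to keep an eye on.
class Tracked; end

# ObjectSpace.each_object returns the number of objects it visited.
def live_count(klass)
  ObjectSpace.each_object(klass) { }
end

held = Array.new(50) { Tracked.new }
while_held = live_count(Tracked)      # all 50 are referenced, so all are live

held = nil                            # drop the references
GC.start
after_release = live_count(Tracked)   # may or may not have reached 0

puts "while held: #{while_held}, after release: #{after_release}"
```

A count that keeps climbing across iterations of a loop that should be steady-state is a more direct hint than VmSize that something is still being referenced.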
Out of curiosity (and the intent to try to avoid such), which OS?

My first RL experience with this was in 1997 while working with
HP-UX. Memory allocated was never returned to the OS. If you were to
modify leak so that it free()s hole immediately and run it on that OS,
the VmSize would not decrease, although the VmRes would (see below).
better clarify your statement--are you saying that in some OS, VmSize does
not decrease even if a program is stopped?

VmSize depends on the existence of the process; as such, if the
process dies, VmSize no longer exists either. But I'm guessing that your
real question is whether the OS will reclaim the memory when a process
dies. On every *NIX OS, it does. Some versions of win95
(win3.11 too?) do not. Make a program that allocates some memory, kill
the program, run it again, kill it again, until you see allocation
failure.
Again, seems like the very definition of a memory leak.

'unnecessary memory consumption' :)
It is true that if the memory allocated to a process is returned to the OS
when the process dies, that gives you a workaround for the memory leak, but
it is still a memory leak.

Yeah, this depends on your definition of memory leak. It is an example
where terms have taken on new meanings (incorporating the concept of
'unnecessary memory consumption'). Another example would be the word
'scalable' which today would be taken as 'high-performant' (thanks
zedas) by many people, or a more extreme example would be 'hacker'
which has come to mean the opposite, 'the bad guy'.
occasionally close kmail and epiphany and get back a lot of memory (and make
my system run a lot better.)

In *NIX OS, what you should really be watching is not VmSize but
rather VmRes[ident], which shows you how much RAM your program is
really taking. The diff of VmSize-VmRes would be in swap where it
could just stay there until the process ends.

This certainly is true in long-running processes. For example, on my
desktop system which has been running for 3 months (and some
applications have been running that long too: emacs, kde, konq), the
total VmSize of all processes is around 3GB, way more than the amount
of RAM I have (around 700MB), yet those apps still respond crisply,
and the total VmRes of all processes is still under 400M.
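
For anyone who wants to watch both numbers from inside a script, here is a small sketch. It assumes Linux, where /proc/<pid>/status reports the resident size under the name VmRSS; vm_fields is a made-up helper name, and on systems without that file it simply returns an empty hash.

```ruby
# Hypothetical helper: read VmSize and VmRSS (in kB) for a process.
def vm_fields(pid = Process.pid)
  path = "/proc/#{pid}/status"
  return {} unless File.readable?(path)   # not Linux, or no such process

  File.readlines(path).each_with_object({}) do |line, fields|
    fields[$1] = $2.to_i if line =~ /\A(VmSize|VmRSS):\s+(\d+) kB/
  end
end

fields = vm_fields
puts "VmSize: #{fields['VmSize']} kB, VmRSS: #{fields['VmRSS']} kB"
```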
regards,
Randy Kramer

Regards,
YS.
 
Lyndon Samson

Have you thought about a ruby leak detector?

You could monitor object allocation with define_finalizer and a wrapped new.
Sort of like replacing new/delete in C++
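
A minimal sketch of what such a detector might look like, under stated assumptions: AllocationTracker, LIVE and Widget are made-up names, keyword arguments are not forwarded by this simple wrapper, and finalizers only run whenever the GC actually collects an object, so the counters can lag behind.

```ruby
# Hypothetical tracker: classes that extend it get a wrapped `new`
# that counts allocations and registers a finalizer to count them down.
module AllocationTracker
  LIVE = Hash.new(0)

  # Built in its own method so the proc cannot capture the new object;
  # a finalizer that references its object would keep it alive forever.
  def self.finalizer_for(class_name)
    proc { LIVE[class_name] -= 1 }
  end

  def new(*args, &block)
    obj = super                       # the real Class#new
    LIVE[name] += 1
    ObjectSpace.define_finalizer(obj, AllocationTracker.finalizer_for(name))
    obj
  end
end

class Widget
  extend AllocationTracker
end

widgets = Array.new(3) { Widget.new }
live_now = AllocationTracker::LIVE["Widget"]
puts "live Widgets: #{live_now}"
```

Dumping LIVE at intervals would then show which classes keep accumulating instances.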
 
Randy Kramer

Based on what you wrote below, the statements I made are in
agreement with what you thought you knew for the most part.

---<good stuff snipped>---

Thanks for the reply--it will probably take me a little while to digest this.
After I do, if I have any questions/comments, I'll write again.

regards,
Randy Kramer
 
Jamis Buck

Have you thought about a ruby leak detector?

You could monitor object allocation with define_finalizer and a
wrapped new.
Sort of like replacing new/delete in C++

We have identified and fixed the leak, and all is well now. Thanks,
everyone, for your suggestions.

If you're curious, the problem was fixed by calling undef_method on
all the dynamically added methods before doing remove_const on the
class. We also went through each class and removed all
instance_variables from the class. Doing this seems to have allowed
the Proc objects to be garbage collected.
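
In outline, that kind of teardown might look like the sketch below. This is not the actual Rails change; Models, Dummy, the method names and @reflections are stand-ins invented for the example.

```ruby
# Stand-in namespace and class; the real code dealt with model classes.
module Models; end
Models.const_set(:Dummy, Class.new)
klass = Models::Dummy

# Simulate what the framework did: add methods via define_method
# (each one holds on to a Proc) and stash state on the class itself.
added = [:thing, :thing=]
added.each { |m| klass.send(:define_method, m) { |*args| args } }
klass.instance_variable_set(:@reflections, {})

# Teardown: undefine the dynamically added methods, clear the class-level
# instance variables, and only then remove the constant.
added.each { |m| klass.send(:undef_method, m) }
klass.instance_variables.each do |ivar|
  klass.send(:remove_instance_variable, ivar)
end
Models.send(:remove_const, :Dummy)

puts Models.const_defined?(:Dummy)
```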

- Jamis
 
Eric Mahurin

--- Jamis Buck said:
On Oct 16, 2005, at 6:57 PM, Lyndon Samson wrote:
We have identified and fixed the leak, and all is well now. Thanks,
everyone, for your suggestions.

If you're curious, the problem was fixed by calling undef_method on
all the dynamically added methods before doing remove_const on the
class. We also went through each class and removed all
instance_variables from the class. Doing this seems to have allowed
the Proc objects to be garbage collected.

Would you mind showing us a snippet of code that demonstrates
the problem? Maybe there are some circular references?




Jamis Buck

Would you mind showing us a snippet of code that demonstrates
the problem? Maybe there are some circular references?

Well... here's a small script that duplicates the problem, but it
requires that it be run from the root directory of a Rails
application that has a model class called Dummy. And Dummy must have
an association to some other model class.

require './config/environment'

module ObjectSpace
  # count the live objects of the given class or module
  def self.count(mod=Object)
    count = 0
    ObjectSpace.each_object(mod) { count += 1 }
    count
  end
end

20.times do
  GC.start

  # load the model, then unload the loaded classes again
  Dummy.find(1)
  Dependencies.clear
  ActiveRecord::Base.reset_subclasses
  Dependencies.remove_subclasses_for(ActiveRecord::Base)

  procs = ObjectSpace.count(Proc)

  puts "procs: #{procs}"
end

To set this up, you could do:

* gem install rails
* rails memleak
* cd memleak
* script/generate model Dummy
* script/generate model Thing
* edit Dummy so that it has_one :thing
* edit config/database.yml to point to a database
* add a dummies table
* insert a record into the dummies table
* run the script given above

Running against Rails 0.13.1, you should see the number of procs grow
with each request. Running against the beta gems, you'll see the
number remain constant.

- Jamis
 
