debugging memory use and GC

snacktime

What is a good way to find out which objects are not being GC'd? I am
seeing a strange pattern I can't figure out. The app is handling
large files and will use up to 150 MB or so of memory, and then when I
call GC.start it goes back down to around 8 MB. But after a few cycles
memory stops being reclaimed.

Chris
 
David Vallner


Debug, set a breakpoint after GC when you expect the anomaly to occur,
inspect ObjectSpace?
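
One way to act on this without a debugger is to snapshot per-class object counts after a forced GC and compare two snapshots. This is only a minimal sketch: the helper names `live_counts` and `count_growth` and the `Leaky` class are made up for illustration, and the counts include every object the interpreter itself keeps alive.

```ruby
# Count live objects per class after forcing a GC run.
def live_counts
  GC.start
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Return the classes whose live instance count grew between two snapshots,
# mapped to the size of the increase.
def count_growth(before, after)
  growth = {}
  after.each do |klass, n|
    delta = n - before[klass]
    growth[klass] = delta if delta > 0
  end
  growth
end

class Leaky; end # hypothetical class standing in for the app's objects

before = live_counts
held = Array.new(50) { Leaky.new } # still referenced, so GC cannot free them
after = live_counts

# Print the five classes that grew the most between the snapshots.
count_growth(before, after).sort_by { |_, d| -d }.first(5).each do |klass, d|
  puts "#{klass}: +#{d}"
end
```

Taking a snapshot at the same point in each processing cycle should show which classes keep growing, which narrows down where to look for the surviving references.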

David Vallner


 
Rick DeNatale

Of course, what you really want to know is not just what's not getting
GCed but WHY.

This can be a difficult problem. You really want to find the
reference paths from root objects.

Some GC'd languages like Smalltalk have methods on Object to get a list
of everything which references it. I haven't seen such a facility in
Ruby. The problem with this is that calling such a method
generates additional references to the objects referencing the object,
etc. This kind of Heisenberg effect makes building a tool to find
reference paths difficult.
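
To make "reference path" concrete, here is a rough sketch of a breadth-first search from a root you already suspect (a cache, a registry) down to a target object. It is an illustration under heavy assumptions, not a real tool: `children_of`, `reference_path` and `Holder` are made-up names, and it only follows instance variables, Array elements and Hash keys/values. It cannot see local variables, globals, closures or references held at the C level.

```ruby
# Objects directly reachable from obj via instance variables,
# Array elements, or Hash keys and values.
def children_of(obj)
  return [] unless Object === obj # skip BasicObject-only instances
  kids = obj.instance_variables.map { |iv| obj.instance_variable_get(iv) }
  case obj
  when Array then kids.concat(obj)
  when Hash  then obj.each { |k, v| kids << k << v }
  end
  kids
end

# Breadth-first search: returns the chain of objects leading from root
# to target (both included), or nil if target is not reachable this way.
def reference_path(root, target)
  seen = {}.compare_by_identity # identity-based, so == and hash are not called
  queue = [[root, [root]]]
  until queue.empty?
    obj, path = queue.shift
    return path if obj.equal?(target)
    next if seen[obj]
    seen[obj] = true
    children_of(obj).each { |kid| queue << [kid, path + [kid]] }
  end
  nil
end

class Holder # hypothetical object that keeps a payload alive
  def initialize(payload)
    @payload = payload
  end
end

payload = "big buffer"
cache = { :recent => [Holder.new(payload)] }
path = reference_path(cache, payload)
puts path.map { |o| o.class }.join(" -> ") if path
```

The search itself allocates arrays that point at the objects it visits, which is exactly the Heisenberg effect described above, so it is best run once on a suspect and then thrown away.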
 
Rick DeNatale

If you suspect some objects, add a finalizer using
ObjectSpace#add_finalizer and put some trace in it.

Of course adding a finalizer won't be much help in debugging why an
object is not being GCed, since the finalizer will never be invoked.
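
For reference, a traced finalizer might look like the sketch below. It uses ObjectSpace.define_finalizer, which is the call I am sure exists; `FINALIZED`, `trace_finalizer` and `watch` are names invented for the example. The one thing it can tell you is indirect: if a label never shows up in the log, that object was not collected.

```ruby
FINALIZED = [] # hypothetical log of finalizer messages

# Build the finalizer proc in its own method so that its closure captures
# only the label and the log, never the watched object itself. A proc that
# closes over the object would keep it alive forever.
def trace_finalizer(label, log)
  proc { |object_id| log << "#{label} (id #{object_id}) finalized" }
end

# Attach a tracing finalizer to obj and hand obj back to the caller.
def watch(obj, label, log = FINALIZED)
  ObjectSpace.define_finalizer(obj, trace_finalizer(label, log))
  obj
end

def make_buffers
  # The buffers go out of scope when this method returns.
  3.times { |i| watch("x" * 1024, "buffer #{i}") }
  nil
end

make_buffers
GC.start
puts FINALIZED # may print nothing: the GC gives no promise about when it runs finalizers
```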
 
ara.t.howard

it would be expensive, but i wonder if dumping the objects in objectspace
might be useful - since Marshal.dump already follows all references it seems
like a custom _dump method on object which could add themselves to a tree
might do the trick. in other words, if you dumped an object with a global tree
in context then all objects being dumped as a result would add themselves to
this tree. after the dump, you simply keep a copy of the tree...

just a thought...

-a
 
Rick DeNatale

Not sure. What I was suggesting was that the real goal is to somehow
root out the reference path or paths which are keeping an object from
being reclaimed, without making additional references.

Another issue is that ObjectSpace.each_object can give you objects
which aren't really alive:

rick@frodo:/public/rubyscripts$ cat gctest.rb
class Foo
  def initialize
    @iv = "bar"
  end
end

def make_foo
  p Foo.new
end

GC.enable

make_foo

ObjectSpace.each_object {|f| p f if Foo === f }
ObjectSpace.garbage_collect
puts "after gc"
ObjectSpace.each_object {|f| p f if Foo === f }
puts "done"
rick@frodo:/public/rubyscripts$ ruby gctest.rb
#<Foo:0xb7dc1804 @iv="bar">
#<Foo:0xb7dc1804 @iv="bar">
after gc
#<Foo:0xb7dc1804 @iv="bar">
done

I've played around with various versions of this, like
each_object(Foo) and that instance of Foo with no apparent references
to it seems to be sticking around for some reason.

I instantiated Foo in the make_foo method to make sure that it wasn't
still in the current stack frame.

This really goes to show that the guarantee the GC makes is that it
will not free live objects, not that it will free dead ones ASAP.

It also shows why you shouldn't rely on finalization as part of
application/system logic, since you never know when, or even if it
will be called.
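
If the aim is just to watch whether a suspect object goes away without the watching itself keeping it alive, a weak reference is one option. The sketch below uses WeakRef from the standard library; the `SuspectList` class and its methods are invented for the example. As with the test above, an object that is reported as alive may simply not have been collected yet, so only repeated survival across many cycles is a strong hint.

```ruby
require 'weakref'

# Tracks suspect objects through weak references, so that the tracking
# does not itself prevent them from being collected.
class SuspectList
  def initialize
    @refs = {} # label => WeakRef
  end

  # Register obj under a label. Returns nil so that the caller does not
  # accidentally hold on to a return value pointing at obj.
  def track(label, obj)
    @refs[label] = WeakRef.new(obj)
    nil
  end

  # Labels of tracked objects that are still alive after a forced GC run.
  def survivors
    GC.start
    @refs.select { |_, ref| ref.weakref_alive? }.keys
  end
end

suspects = SuspectList.new
kept = "still referenced by a local variable"
suspects.track("kept", kept)
suspects.track("temporary", "nobody else refers to this string")

p suspects.survivors # "kept" is listed; "temporary" may or may not be
```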
 
Joel VanderWerf

I wrote a patch for ruby 1.6/1.7 that would search for all ways of
reaching an object from root objects in objectspace:

http://redshift.sourceforge.net/debugging-GC/

Usage info at:

http://redshift.sourceforge.net/debugging-GC/gc-patch.txt

Nobu updated it for CVS as of 12 Aug 2005:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/151854

I've found this to be useful only once or twice, but in those rare cases
it can be very helpful...
 
ara.t.howard

Not sure. What I was suggesting was that the real goal is to somehow root
out the reference path or paths which are keeping an object from being
reclaimed, without making additional references.
<snip>

right. i was thinking of something like this:

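harp:~ > cat a.rb
def really_stupid_reference_finder obj
  # give obj a singleton _dump that aborts any Marshal.dump which reaches it.
  # the tag is a symbol: newer rubies match catch/throw tags by identity, so
  # two separate 'referer' string literals would never match each other
  begin
    class << obj
      def _dump(*_) throw :referer, true end
    end
  rescue TypeError
    nil
  end
  # dump every object in turn; a dump that gets thrown out of reached obj
  ObjectSpace.each_object do |candidate|
    next if candidate == obj
    referer = catch :referer do
      begin
        Marshal.dump candidate
      rescue TypeError
        false
      end
      false
    end
    return candidate if referer
  end
  return nil
ensure
  GC.start
end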



a = [b = '42']

referer = really_stupid_reference_finder b
p referer
p referer == a

referer = really_stupid_reference_finder [ 'new_array' ]
p referer
p referer == a



harp:~ > ruby a.rb
["42"]
true
nil
false


-a
 
Joel VanderWerf

<snip>

right. i was thinking of something like this:

harp:~ > cat a.rb
def really_stupid_reference_finder obj

That's a nice idea!

One problem is that a referrer can be something other than an object: a
ruby global var, a C global var, a local var. Or it can be an object
that is not dumpable, such as a proc binding.

But the throw/dump combo is a great trick to remember...
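
For objects that cannot be dumped, a cruder variant is to skip Marshal and look only at direct references. The sketch below is an assumption-laden illustration (`refers_to?` and `direct_referrers` are made-up names): it checks instance variables, Array elements and Hash keys/values, so it still misses every non-object referrer listed above, such as globals, locals and C-level references.

```ruby
# True if candidate points directly at target through an instance variable,
# an Array element, or a Hash key or value. The check avoids building
# temporary arrays that contain target, since those would themselves
# become referrers while the scan is running.
def refers_to?(candidate, target)
  candidate.instance_variables.each do |iv|
    return true if candidate.instance_variable_get(iv).equal?(target)
  end
  case candidate
  when Array
    candidate.each { |e| return true if e.equal?(target) }
  when Hash
    candidate.each { |k, v| return true if k.equal?(target) || v.equal?(target) }
  end
  false
rescue StandardError
  false # objects that refuse introspection are simply skipped
end

# Every object in ObjectSpace that refers directly to target.
def direct_referrers(target)
  found = []
  ObjectSpace.each_object(Object) do |candidate|
    next if candidate.equal?(target)
    found << candidate if refers_to?(candidate, target)
  end
  found
end

b = '42'
a = [b]
referrers = direct_referrers(b)
p referrers.any? { |r| r.equal?(a) }
```

Unlike the dump trick, this returns all direct referrers rather than the first object whose dump happens to reach the target, and it never calls _dump on anything.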

What I'd really like to see is a general object graph traversal
mechanism that can be used to help implement marshal and other dumpers,
gc tools, etc. Several (3 or 4) years ago, matz said he was moving in
this direction...[1]

----

[1] See:

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/ruby/marshal.c?cvsroot=src

and grep for vjoel:

* marshal.c (w_object): T_DATA process patch from Joel VanderWerf
<[email protected]>. This is temporary hack; it remains
undocumented, and it will be removed when marshaling is
re-designed.

The hack is still there (as of 1.8.5, anyway), still undocumented, and
still useful.

The original discussion about why it is useful starts at:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/34037
 
Joel VanderWerf

This is the thread where matz said he was looking at a more general
traversal mechanism to support marshal and other purposes:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/34335

Maybe it is still "vapor"...
 
ara.t.howard

now __that__ is good to know!

cheers.

-a
 