Memory Profiler for Heap Analysis

Moritz Wissenbach

Hi,

I have an application (an XML editor) that uses an unreasonable amount of
Java heap space.
I have tried profiling with TPTP (Eclipse) and jmap/jhat, and they do give
me memory histograms and instance counts, but what I REALLY want is to see
the total size of a class's instances INCLUDING the referenced objects.

Is there a profiler that will traverse the heap and find the "recursive"
size of objects?
Because knowing that char-Arrays use up most memory is not exactly helpful.

Please help. I've searched the web and found nothing, but I have been
assured that something like this exists.

TIA,

Moritz
 
Robert Klemme

I don't think something like that exists. Assuming it does, how does it
properly calculate sizes? How many levels does it follow when counting?
How do you count objects that are referenced by multiple other instances,
etc.?

Having said that, I was pretty satisfied with OptimizeIT, which also does
not do recursive analysis but lets you easily find the places in the code
where all those objects were allocated, and lets you walk the object graph.
I haven't used it in a while, so maybe today there are other (free?) tools
around that serve similar purposes.

Kind regards

robert
 
Moritz Wissenbach

Hi Robert,
> I don't think something like that exists. Assuming something like this
> exists, how does it properly calculate sizes? How many levels does it
> follow when counting?

Until it hits a cycle?

> How do you count objects that are referenced by multiple other
> instances etc.

Add the object's size to both instances...?

Well, I was sceptical too as to whether this was even possible, but when
I asked for this feature in the TPTP (Eclipse profiler) group, they replied:

> This is a valid and useful use-case for memory profiling. While
> technically possible (both in theory and in practice), it is not
> currently in the plan for TPTP.

So I thought this might be implemented, if not in a free, then in a
commercial tool.

Of course it's not a trivial (or probably fast for that matter) task, but
I guess you could somehow work around the problems.

I don't want a 100% accurate size count, just an overview of which
classes use up most memory. I find that kind of hard to see with the
standard "heap histogram", but maybe I'm missing something.

Does anyone have any other suggestions? As stated, I have an application
(an editor) that uses too much memory (about 21 MB for each MB of file
loaded), and I want to see where all of this goes (DOM, graphical
representation, etc.).

Moritz
 
Robert Klemme

> Until it hits a cycle?

If it does not?

> Add the object's size to both instances...?

Remember that n is not restricted to 2; it could be much higher.

> Well, I was sceptical too as to whether this was even possible, but
> when I asked for this feature in the TPTP (Eclipse profiler) group, they
> replied
>
> So I thought this might be implemented, if not in a free, then in a
> commercial tool.
>
> Of course it's not a trivial (or probably fast for that matter) task,
> but I guess you could somehow work around the problems.

I assume the problems are solvable more on a theoretical level. I don't
see this problem as practically solvable. But maybe I'm missing something;
the TPTP folks surely have a better understanding of the matter.

> I don't want a 100% accurate size count, just an overview of which
> classes use up most memory. I find that kind of hard to see with the
> standard "heap histogram", but maybe I'm missing something.
>
> Does anyone have any other suggestions? As stated, I have an application
> (an editor) that uses too much memory (about 21 MB for each MB of file
> loaded), and I want to see where all of this goes (DOM, graphical
> representation, etc.).

DOM is a good candidate. If I were you, I'd create an application
object model, throw away the DOM parsing and create a SAX parser which
directly creates the model. Just my 0.02 EUR...
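
As a rough illustration of that idea, here is a minimal sketch of a SAX handler that builds a small application model directly, with no DOM in between. The `Item` model class and the `ModelBuilder` handler are made-up names for this example, not part of any real editor, and a real model would keep only the fields the application actually needs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical application model: element name, text and children only.
final class Item {
    final String name;
    final List<Item> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    Item(String name) {
        this.name = name;
    }
}

// SAX handler that fills the model while the parser streams through the file.
final class ModelBuilder extends DefaultHandler {
    private final Deque<Item> open = new ArrayDeque<>(); // currently open elements
    private Item root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Item item = new Item(qName);
        if (open.isEmpty()) {
            root = item;                      // first element is the document root
        } else {
            open.peek().children.add(item);   // attach to the innermost open element
        }
        open.push(item);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    // Parses an XML string and returns the root of the model.
    static Item parse(String xml) {
        ModelBuilder handler = new ModelBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.root;
    }
}
```

For example, `ModelBuilder.parse("<doc><p>hi</p><p>there</p></doc>")` returns an `Item` named `doc` with two `p` children. Whether this saves memory depends entirely on how much leaner the model is than the DOM nodes it replaces.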

Kind regards

robert
 
Moritz Wissenbach

> If it does not?

Well, then all the better: it's a proper tree and at some point you'll
have traversed it completely.

> Remember that n is not restricted to 2, could be much higher.

What you're interested in is the size of a certain object, say A. You
go ahead and look at what other objects it points to. Say it points to
object B. Since A points to B, B's size needs to be added to A's size. It
doesn't matter what other objects point to B: it will always be referenced
from A and is thus part of A no matter what.
On the other hand, I'm looking for things to REMOVE from the program. So
if you removed A in the above scenario while something else still pointed
to B, B would remain in memory, and counting it towards A would be
misleading.
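
To make the two notions concrete, here is a minimal sketch on a toy object graph. It is not a profiler: `HeapNode`, `HeapSizes` and the byte sizes are invented for illustration, and a real tool would read the graph from a heap dump. `deepSize` adds up everything reachable from A (a visited set stops at cycles and counts shared objects once), while `retainedSize` counts only what would become unreachable if A were removed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object: its own size plus outgoing references.
final class HeapNode {
    final String name;
    final long shallowSize;
    final List<HeapNode> refs = new ArrayList<>();

    HeapNode(String name, long shallowSize) {
        this.name = name;
        this.shallowSize = shallowSize;
    }
}

final class HeapSizes {
    // All nodes reachable from the roots, pretending that 'skip' does not exist
    // (pass null to skip nothing). The 'seen' set handles cycles and sharing.
    static Set<HeapNode> reachable(List<HeapNode> roots, HeapNode skip) {
        Set<HeapNode> seen = new HashSet<>();
        Deque<HeapNode> todo = new ArrayDeque<>();
        for (HeapNode root : roots) {
            if (root != skip && seen.add(root)) {
                todo.push(root);
            }
        }
        while (!todo.isEmpty()) {
            HeapNode current = todo.pop();
            for (HeapNode target : current.refs) {
                if (target != skip && seen.add(target)) {
                    todo.push(target);
                }
            }
        }
        return seen;
    }

    static long totalSize(Set<HeapNode> nodes) {
        long sum = 0;
        for (HeapNode n : nodes) {
            sum += n.shallowSize;
        }
        return sum;
    }

    // "Recursive" size: A plus everything A can reach, each object counted once.
    static long deepSize(HeapNode a) {
        return totalSize(reachable(Collections.singletonList(a), null));
    }

    // Retained size: memory that would become unreachable if A were removed.
    static long retainedSize(List<HeapNode> roots, HeapNode a) {
        return totalSize(reachable(roots, null)) - totalSize(reachable(roots, a));
    }
}
```

Take a graph where root points to A and C, both A and C point to B, and A and D point to each other, with sizes root=8, A=16, B=100, C=16, D=32. The deep size of A is 148 (A, B and D), but its retained size is only 48 (A and D), because B stays alive through C.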

Oh well. If we continue discussing, we could as well implement it. Are you
in? ;)

> DOM is a good candidate. If I were you, I'd create an application
> object model, throw away the DOM parsing and create a SAX parser which
> directly creates the model. Just my 0.02 EUR...

I think the things we want to achieve (online validation for example) do
require a DOM...

It's just frustrating. I have the heap dump in front of my eyes, I can see
all the classes of the DOM, still I can't figure out just how much of the
memory they cost in total.

Moritz
 
Lew

Moritz said:
> Well, then all the better: it's a proper tree and at some point you'll
> have traversed it completely.
>
> The point you're interested in is the size of a certain object, say A.
> You go ahead and look at what other objects it points to. Say it points
> to object B. Since A points to B, B's size needs to be added to A's size.
> It doesn't matter what other objects point to B, for it will always be
> referenced from A, thus, is part of A no matter what.
> On the other hand: I'm looking for things to REMOVE from the program.
> So, if you would remove A in the above scenario, B would still remain in
> memory, so it would be misleading.

Strictly speaking, B-the-object's size is not part of A; A only holds a
reference to B, not the object itself.
 
Robert Klemme

> Well, then all the better: it's a proper tree and at some point you'll
> have traversed it completely.

This still does not address the issue of aliasing properly.

> The point you're interested in is the size of a certain object, say A.

The point I am trying to make is that this concept does not exist with
regard to /referenced/ objects. The only size you can reasonably well
detect is the size of the object itself.

> You go ahead and look at what other objects it points to. Say it points
> to object B. Since A points to B, B's size needs to be added to A's size.
> It doesn't matter what other objects point to B, for it will always be
> referenced from A, thus, is part of A no matter what.
> On the other hand: I'm looking for things to REMOVE from the program.
> So, if you would remove A in the above scenario, B would still remain in
> memory, so it would be misleading.
>
> Oh well. If we continue discussing, we could as well implement it. Are
> you in? ;)

LOL - no.

> I think the things we want to achieve (online validation for example) do
> require a DOM...

If it is a general XML editor then yes, maybe. If you are editing a
specific XML format, then no.

> It's just frustrating. I have the heap dump in front of my eyes, I can
> see all the classes of the DOM, still I can't figure out just how much
> of the memory they cost in total.

Did you try the memory profiler from Eclipse's TPTP? How did it go?

Kind regards

robert
 
Moritz Wissenbach

> Strictly speaking, B-the-object's size is not part of A; A only holds a
> reference to B, not the object itself.

Of course. But then the size of A is not representative of memory usage.
 
Moritz Wissenbach

> The point I am trying to make is that this concept does not exist with
> regard to /referenced/ objects. The only size you can reasonably well
> detect is the size of the object itself.

Point taken. That was never the question, but more:
- given you know what you are doing
- and are able to tune some parameters
- is there a reasonable approximation?

I thought (and still think) there must be a way.

> If it is a general XML editor then yes, maybe. If you are editing a
> specific XML format, then no.

General XML editor. I have to look into it some more, but a 20:1 ratio of
memory usage to document size seems a little much. Maybe that's what they
mean by "Java is memory intensive".

> Did you try the memory profiler from Eclipse's TPTP? How did it go?

Well, maybe I didn't see something, but it basically leaves me with
"Most memory is used by class char[]". Well, thanks!

I could then manually check what references those char[]s, but I don't
think I'm going to do that 150,000 times.

Moritz
 
Lew

Hm, people keep making this point.
> Point taken. That was never the question, but more:
>
> General XML editor. I have to look into it some more, but 20/1 Memory
> Usage/Document Size seems a little much. Maybe that's what they mean by
> "Java is memory intensive"

No, it isn't. "They" are referring to the JVM footprint.

The overhead you're talking about is DOM overhead, not Java.
 
Moritz Wissenbach

> No, it isn't. "They" are referring to the JVM footprint.
>
> The overhead you're talking about is DOM overhead, not Java.

Well, I meant the general style that comes with Java, for example using
the collection classes (HashMaps) which will add their share to memory
usage.
I guess I'm still surprised that everything in the implementation seems
ok/necessary and still 1 MB blows up to 20 when loading a file.

Moritz
 
Lew

Moritz said:
> Well, I meant the general style that comes with Java, for example using
> the collection classes (HashMaps) which will add their share to memory
> usage.

It would be different in what other language?

Moritz said:
> I guess I'm still surprised that everything in the implementation seems
> ok/necessary and still 1 MB blows up to 20 when loading a file.

That's the price of all those links between nodes, and all that metadata, and
whatnot. It's not the language, it's the algorithm. This is still not what
"they" mean by "Java needs a lot of memory." This is what "they" mean by "DOM
needs a lot of memory."

There's no question that the memory cost is high. If that's a problem,
consider SAX or StAX parsing instead - it's much more PARSimonious. Cover
what you need in one pass and it's also extremely fast.
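
For a feel of what "one pass" looks like, here is a minimal StAX sketch that counts elements by name without ever building a tree. `ElementCounter` is a made-up name for this example; counting is only a placeholder for whatever the application actually needs to extract.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class ElementCounter {
    // Streams through the document once and counts start tags by local name.
    // Memory use depends on the number of distinct names, not the document size.
    static Map<String, Integer> count(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    counts.merge(reader.getLocalName(), 1, Integer::sum);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return counts;
    }
}
```

Calling `ElementCounter.count("<doc><p/><p/><q/></doc>")` gives one `doc`, two `p` and one `q`. An editor that needs random access to the whole document cannot work this way, which is the trade-off being discussed here.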
 
Juha Laiho

Moritz Wissenbach said:
> I have an application (xml-editor) that uses up an unreasonable amount of
> java heap space.
> I have tried profiling with tptp (eclipse) and jmap/jhat, and they do give
> me memory histograms and instance counts, but what I REALLY want is to see
> the total size of a classes instances INCLUDING the referenced objects.

That looks pretty much like what YourKit does (among other things).

http://www.yourkit.com/

(disclaimer: I have no affiliation with the company)
 