JK
Hi everyone,
I have a stand-alone application that spends (or should spend)
most of its time waiting for network packets. However, it
consistently uses upwards of 20% of the CPU time on a
1.8GHz Pentium M/2GB RAM/WinXP laptop.
I have run it for 24 hours with hprof sampling enabled:
java -agentlib:hprof=cpu=samples my.MainClass
(If I run it with "cpu=times" it runs far too slowly to
react promptly to the network events of interest, so profiling
actual CPU time usage is not feasible.)
The resulting profile data shows that the top CPU users are:
rank   self  accum      count  trace method
   1 37.44% 37.44%  142475377 300285 sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0
   2 12.48% 49.91%   47492232 300210 java.net.PlainDatagramSocketImpl.receive0
   3 12.48% 62.39%   47492137 300348 java.net.PlainSocketImpl.socketAccept
   4 12.48% 74.87%   47492136 300347 java.net.PlainDatagramSocketImpl.receive0
   5 12.48% 87.35%   47492125 300342 sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0
   6 12.48% 99.83%   47490216 300355 sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0
   7  0.17% 100.00%    650911 300344 java.net.NetworkInterface.isUp0
And even after a 24-hour run, using 20% of the CPU the whole
time, these are the *only* stack traces that show up in the
profile data. Since my selects are called with 1000ms
timeouts, and datagrams arrive at the socket only perhaps
a couple of times per second at most (verified via Ethereal),
I would expect most of the time in all of these methods to be
spent sleeping. I cannot see any reason why socketAccept(),
for example, would be using anything close to even 1% of the
CPU; I would expect that call to spend most of its time blocked
in the Windows kernel. And NetworkInterface.isUp() is called
only every ten seconds, and returns almost instantaneously;
its presence in the profiling data baffles me. I see nothing
here that could account for 20% CPU utilization.
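One way to cross-check the sampling data would be to read per-thread CPU time directly from the JVM through java.lang.management, which should cost far less than "cpu=times". The following is only a minimal sketch, assuming the JVM reports thread CPU time as supported; the class name ThreadCpuSnapshot and the 5-second default interval are placeholders for illustration, not part of the application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreadCpuSnapshot {

    /**
     * Returns the CPU time (in nanoseconds) consumed so far by each live
     * thread, keyed by thread id. Returns an empty map if the JVM does not
     * support thread CPU time measurement.
     */
    public static Map<Long, Long> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> result = new LinkedHashMap<Long, Long>();
        if (!mx.isThreadCpuTimeSupported()) {
            return result;
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread has died
            if (cpuNanos >= 0) {
                result.put(id, cpuNanos);
            }
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical default: compare two snapshots taken 5 seconds apart.
        long intervalMs = args.length > 0 ? Long.parseLong(args[0]) : 5000L;
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        Map<Long, Long> before = snapshot();
        Thread.sleep(intervalMs);
        Map<Long, Long> after = snapshot();

        for (Map.Entry<Long, Long> e : after.entrySet()) {
            Long start = before.get(e.getKey());
            long deltaNanos = e.getValue() - (start == null ? 0L : start);
            ThreadInfo info = mx.getThreadInfo(e.getKey());
            if (info == null) {
                continue; // thread exited between the snapshot and the lookup
            }
            // A thread that really sleeps in select() or accept() should
            // show a CPU delta close to zero over the interval.
            System.out.println(info.getThreadName() + " [" + info.getThreadState()
                    + "] used " + (deltaNanos / 1000000L) + " ms CPU");
        }
    }
}
```

As written, main() only observes the JVM it runs in, so the snapshot logic would have to be called from a background thread inside the application (or the same MXBean read remotely over JMX) to say anything about the selector and socket threads.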
So, questions:
(0) Am I completely misinterpreting this data?
(1) I would expect hprof to ignore samples taken from
blocked and/or sleeping threads. Is that the case?
(2) Is there a profiler other than hprof that I would be
better-advised to use? I need one that will not have too
serious an impact on performance, since when a packet
arrives my software must react to it and send a response
within approximately 100 ms. Also, open-source is vastly
preferable, since if I have to buy a commercial profiler,
it is going to take me weeks to get that approved.
Any input would be appreciated.
Many thanks,
-- JK