futex high system utilization

Jaydeep Chovatia

Hi,

In my C++ (Linux) application I am seeing high CPU utilization (almost 90%), and the sys-cpu:user-cpu ratio is 8:2. I then ran the "strace" command and found that around 80% of the time is being spent in the "futex" system call. Please see the strace snippet here:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 80.66  100.007743        1567     63806     22761 futex
  9.15   11.340274       96925       117         4 restart_syscall
  5.60    6.939598        4401      1577           poll

Initially I thought this was caused by the pthread calls I use in the application (pthread_mutex_lock, pthread_mutex_unlock, pthread_cond_timedwait, pthread_cond_wait, etc.). To check, I overrode those functions using the LD_PRELOAD functionality and found that not many threads are waiting there (mutex->__data.__nusers stays below 5 most of the time), so these calls do not seem to be part of the problem.
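A rough way to cross-check the LD_PRELOAD result from inside the application: on Linux an uncontended mutex is normally taken entirely in user space, and the futex system call is only needed when a lock attempt finds the mutex already held. The sketch below counts those contended attempts. `CountingMutex`, `acquisitions()`, `contended()` and `example_usage()` are hypothetical names introduced for illustration, and it is an assumption that the locks of interest can be replaced by such a wrapper around `std::mutex`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical wrapper (not from the original post): counts how often a
// lock attempt finds the mutex already held. Contended attempts are the
// ones that can end up in the futex system call.
class CountingMutex {
public:
    void lock() {
        acquisitions_.fetch_add(1, std::memory_order_relaxed);
        // try_lock() is allowed to fail spuriously, so the contended count
        // is an estimate, not an exact figure.
        if (!mutex_.try_lock()) {
            contended_.fetch_add(1, std::memory_order_relaxed);
            mutex_.lock();  // slow path: block until the mutex is free
        }
    }

    void unlock() { mutex_.unlock(); }

    std::uint64_t acquisitions() const {
        return acquisitions_.load(std::memory_order_relaxed);
    }

    std::uint64_t contended() const {
        return contended_.load(std::memory_order_relaxed);
    }

private:
    std::mutex mutex_;
    std::atomic<std::uint64_t> acquisitions_{0};
    std::atomic<std::uint64_t> contended_{0};
};

// Usage example: works with std::lock_guard like any other mutex type.
inline void example_usage() {
    CountingMutex m;
    for (int i = 0; i < 3; ++i) {
        std::lock_guard<CountingMutex> guard(m);
        // critical section would go here
    }
    assert(m.acquisitions() == 3);
}
```

If `contended()` stays small compared with the number of futex calls strace reports, the futex traffic is probably coming from somewhere other than these locks, for example from timed waits.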

I have now run out of ideas on how to find the code location or call that is causing this high futex utilization. Any help on this would be appreciated.

Thank you,
Jaydeep
 
Jorgen Grahn

> Hi,
>
> In my C++ (Linux) application I am seeing high CPU utilization
> (almost 90%), and the sys-cpu:user-cpu ratio is 8:2. I then tried
> to run the "strace" command and found that around 80% of the time is
> being spent in the "futex" system call,

I'm not sure you can use strace to find that out -- strace makes
syscalls much more expensive.

> please see snippet of strace here:

Please break your paragraphs into lines, so I don't have to do it for
you.

> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  80.66  100.007743        1567     63806     22761 futex
>   9.15   11.340274       96925       117         4 restart_syscall
>   5.60    6.939598        4401      1577           poll
>
> Initially I thought this is because of pthread_lock, unlock,
> condition timedwait, condition wait, etc. system calls I am using in
> the application, to prove that I have overridden pthread_lock, unlock,
> condition timedwait, condition wait, unlock, etc. using LD_PRELOAD
> functionality and concluded that not many threads are waiting here
> (mutex->__data.__nusers remains less than 5 most of the time), so it
> doesn't seem that these sys calls are part of the problem.
>
> Now I am running out of clue about how to find out code
> location/system call which is causing this futex high utilization, any
> help on this would be appreciated.

This is a question which you should ask in some Linux newsgroup.

I have only one suggestion to offer: try ltrace too. It traces
library calls instead of system calls. Both strace and ltrace are
very useful on Linux.

/Jorgen
 
Miquel van Smoorenburg

> In my C++ (Linux) application I am seeing high CPU utilization (almost
> 90%), and the sys-cpu:user-cpu ratio is 8:2. I then tried to run

Is this a machine that has been running for a few weeks at least,
and runs ntpd? If so, the system might be suffering from a
leap-second-adjustment bug. As root, run date -s "`date`" or
reboot, then see if your problem is gone.
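To see whether the problem is there before and gone after, one option is to probe how a timed wait behaves. A commonly reported symptom of the 2012 leap-second bug was that short timeouts against the wall clock expired at once, so wait loops spun instead of sleeping. Treat that as an assumption about this case, not a diagnosis. The sketch below waits on a condition variable that nobody signals and counts how often the wait loop wakes up. `probe_timed_wait` and `WaitProbeResult` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper types (not from the thread).
struct WaitProbeResult {
    long predicate_checks;  // times the wait loop re-evaluated its predicate
    long long elapsed_ms;   // time spent, measured with the monotonic clock
};

// Waits with an absolute wall-clock deadline on a condition variable that
// is never notified. The predicate always returns false, so the call only
// ends when the deadline has passed. Each wakeup re-runs the predicate.
WaitProbeResult probe_timed_wait(std::chrono::milliseconds timeout) {
    assert(timeout.count() > 0);

    std::mutex m;
    std::condition_variable cv;
    long checks = 0;

    const auto start = std::chrono::steady_clock::now();
    const auto deadline = std::chrono::system_clock::now() + timeout;

    std::unique_lock<std::mutex> lock(m);
    cv.wait_until(lock, deadline, [&checks] {
        ++checks;
        return false;  // never satisfied: only the timeout ends the wait
    });

    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    return WaitProbeResult{checks, static_cast<long long>(elapsed.count())};
}
```

On a healthy system `predicate_checks` should stay very small for a 100 ms timeout (spurious wakeups can add a few). A count in the thousands would suggest the underlying wait is returning immediately and the loop is burning CPU in futex calls.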

Mike.
 
Jorgen Grahn

> Is this a machine that has been running for a few weeks at least,
> and runs ntpd? If so the system might be suffering from a
> leap-second-adjustment bug .. as root run date -s "`date`" or
> reboot, then see if your problem is gone.

Do you have any URL for that? I was aware of the recent leap second,
but haven't heard about that triggering ntpd bugs. Also I'm always
interested in new ways to botch time calculations in software :)

/Jorgen
 
Miquel van Smoorenburg

> Do you have any URL for that? I was aware of the recent leap second,
> but haven't heard about that triggering ntpd bugs. Also I'm always
> interested in new ways to botch time calculations in software :)

It isn't a bug in ntpd. It's just that if you run ntpd, it would
have told the kernel to handle the extra 2012-06-30 23:59:60 leap second,
and that triggered a bug in the kernel's timer implementation.
It crashed older kernels, and on newer kernels it caused applications
using futexes to use 100% CPU.
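For anyone who wants to look at what the kernel has been told about leap seconds, the state can be read without root through adjtimex(2) with `modes` set to 0. This is a minimal sketch. `clock_state_name` and `read_clock_state` are hypothetical helper names, and the result only shows the leap-second state the kernel reports. It cannot tell whether the timer bug itself has been triggered.

```cpp
#include <sys/timex.h>

#include <cassert>
#include <cstring>

// Maps the return value of adjtimex() to a readable description.
const char* clock_state_name(int state) {
    switch (state) {
        case TIME_OK:    return "TIME_OK: clock synchronized, no leap second pending";
        case TIME_INS:   return "TIME_INS: leap second insertion scheduled";
        case TIME_DEL:   return "TIME_DEL: leap second deletion scheduled";
        case TIME_OOP:   return "TIME_OOP: leap second insertion in progress";
        case TIME_WAIT:  return "TIME_WAIT: leap second has occurred";
        case TIME_ERROR: return "TIME_ERROR: clock not synchronized";
        default:         return "unknown";
    }
}

// Reads the kernel clock state without changing anything (modes = 0).
// Returns the adjtimex() result, or -1 if the call failed. On success,
// *leap_insert_flag tells whether the STA_INS status bit is set.
int read_clock_state(bool* leap_insert_flag) {
    assert(leap_insert_flag != nullptr);

    struct timex tx;
    std::memset(&tx, 0, sizeof tx);
    tx.modes = 0;  // query only

    const int state = adjtimex(&tx);
    *leap_insert_flag = (state != -1) && ((tx.status & STA_INS) != 0);
    return state;
}
```

A caller would print `clock_state_name(read_clock_state(&flag))` next to the flag. Around a leap second, TIME_INS, TIME_OOP or TIME_WAIT would be the interesting states to watch for.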

See for example

http://www.devand.com/index.php/technology/12-tech/11-leap

or Google for "linux leap second bug 2012".

Mike.
 
