SIGKILL

cerr · Mar 17, 2010

Hi There,

I had to add a portion of code to an application which had
been considered stable (before my addition). Now the QA guy
came back to me saying that he's seeing a SIGKILL after a while
(several hours) since my code addition. The code I added simply writes
a string (PIDMessageBuf - declared private) and a timestamp generated
at runtime into a text file, a la:

Code:

if (gpsDataObj->getGPSLatitude() < 49.1937 &&
    gpsDataObj->getGPSLatitude() > 49.1292) {
  struct tm *ptr;
  time_t sec;
  time(&sec);
  char tmpFileName[500];

  ptr = localtime(&sec);
  sprintf(tmpFileName, "/var/log/PIDdata%d%d%d",
          (ptr->tm_year + 1900),
          (ptr->tm_mon + 1),
          ptr->tm_mday);
  ofstream PIDfile(tmpFileName, ios::app);
  if (PIDfile.is_open()) {
    PIDfile << PIDMessageBuf << ", "
            << ptr->tm_year + 1900 << "/" << ptr->tm_mon + 1 << "/" << ptr->tm_mday << ", "
            << ptr->tm_hour << ":" << ptr->tm_min << ":" << ptr->tm_sec << endl;
    PIDfile.close();
  }
  else {
    cout << "Couldn't open " << tmpFileName << endl;
  }
}

I cannot see how this code would lead to a SIGKILL, anyone?
Oh by the way, this is running in a threaded while(1) loop that comes
around once a second.
Any hints or suggestions are greatly appreciated!
Thank you!

Ian Collins · Mar 17, 2010

Get them to either a) send you a core or b) run the code in a debugger
and give you a shout when it aborts.

cerr · Mar 17, 2010

Well, it was run in gdb, but gdb doesn't say anything other than SIGKILL
either, and after the SIGKILL you can't get a backtrace because the app
has terminated...
How would I get a core dump that I can do something with?

Fred · Mar 17, 2010

What is PIDMessageBuf? Where is it defined? How is it defined?
Is it defined?

What happens if you output to cout instead of PIDfile?

cerr · Mar 17, 2010

What is PIDMessageBuf? Where is it defined? How is it defined?
Is it defined?

It is declared as a private std::string in the header.

What happens if you output to cout instead of PIDfile?

Worth a try....

cerr · Mar 17, 2010

Now I got a "child terminated with signal 9" on the shell... what
does that mean, any clues?

Seems to be something from pthread, but uhm... :-?
Thanks,

Ian Collins · Mar 18, 2010

Now I got a "child terminated with signal 9" on the shell.... what
does that mean, any clues?

Try comp.unix.programmer (hint: man signal)

AnonMail2005 · Mar 18, 2010

Since you mentioned threads, I do believe that localtime is not thread
safe. It uses an internal static variable and returns a pointer to
that. Check out the localtime_r function instead.
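
As a rough sketch of that suggestion (not the original poster's code): localtime_r() writes into a struct tm supplied by the caller instead of a shared static buffer, so each thread can work on its own copy. The helper names to_local_tm and daily_log_name are made up for illustration, and the file name is zero-padded here, which differs from the %d%d%d format in the original post.

```cpp
#include <cstdio>
#include <ctime>
#include <string>
#include <time.h>   // localtime_r() is POSIX, not ISO C++

// Hypothetical helper: thread-safe conversion into a caller-owned struct tm.
// Returns false if the conversion failed.
bool to_local_tm(std::time_t when, std::tm& out) {
    return localtime_r(&when, &out) != nullptr;
}

// Hypothetical helper: builds the per-day log file name from broken-down time.
// Zero padding is an assumption made here to keep the names unambiguous.
std::string daily_log_name(const std::tm& tm) {
    char name[64];
    std::snprintf(name, sizeof name, "/var/log/PIDdata%04d%02d%02d",
                  tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday);
    return name;
}
```

A caller would declare a local std::tm, fill it with to_local_tm(std::time(nullptr), now), and pass it to daily_log_name(now); no pointer into shared library storage is kept anywhere.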

HTH

cerr · Mar 18, 2010

Hey, thanks for pointing that out! I replaced all occurrences of
localtime() (there are plenty of them) with localtime_r() and back it
goes to QA, gotta see if that leads to an improvement. *crossing my
fingers*
Thanks dude!

Michael Doubez · Mar 18, 2010

You make a random change and send back the program to QA?

Wouldn't you prefer to locate the bug and be sure you made the right
change?
I expect a good valgrind run would have given you the answer.

James Kanze · Mar 18, 2010

Get them to either a) send you a core or b) run the code in a debugger
and give you a shout when it aborts.

SIGKILL doesn't give a core. I don't even know if you can do
anything with it in the debugger. And it almost always comes
from outside the process. I'd guess that there's something
monitoring the processes in the environment, which decides that
his process is up to no good, so kills it. Maybe his
modification makes some monitoring software think it's a virus.
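
One way to see why a handler cannot help here: POSIX does not let a process catch, block, or ignore SIGKILL, and sigaction() refuses to install a handler for it. A minimal sketch, assuming a POSIX system; can_install_handler and ignore_signal are made-up names:

```cpp
#include <signal.h>

// Hypothetical no-op handler, only used to have something to install.
extern "C" void ignore_signal(int) {}

// Hypothetical helper: returns true if a handler could be installed
// for the given signal number.
bool can_install_handler(int signo) {
    struct sigaction sa {};
    sa.sa_handler = ignore_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    // For SIGKILL (and SIGSTOP) this call fails and sets errno to EINVAL.
    return sigaction(signo, &sa, nullptr) == 0;
}
```

Calling can_install_handler(SIGTERM) is expected to succeed, while can_install_handler(SIGKILL) is expected to fail, which is why the process gets no chance to log anything before it disappears.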

James Kanze · Mar 18, 2010

It means SIGKILL, what your QA guy is telling you. SIGKILL is
signal 9. man pages are your friends. "man 7 signal", or
http://manpages.courier-mta.org/htmlman7/signal.7.html

According to Posix, it's not guaranteed. (But practically all
Unix systems do agree here.)

As you can see in the convenient table in the middle of the
man page, signal 9 is called … drumroll … SIGKILL. Who woulda
thunk it?

One thing about C++ -- bugs may not necessarily manifest
themselves right away. For various reasons, which are too
boring to go into, stomping on a few random bytes of memory,
or dereferencing an uninitialized pointer, may go completely
unnoticed at first. But, a long time later, an innocuous
change elsewhere in the program -- a few lines of added code,
or a few lines of removed code -- subtly changes the contents
of your compiled program in such a manner that the new
internal memory layout of the code, or a slightly different
heap allocation pattern, suddenly makes those few stomped
bytes of memory be something important. Result: an ugly crash,
and you're staring at the innocent bit of code that you just
changed, and wondering how the FRAK could that possibly change
anything?

A SIGKILL is *not* a crash, at least not on a normal Unix. A
SIGKILL is what you send when you want a program to terminate,
and it refuses to do so otherwise. Of course, it's quite
possible to generate a signal 9 from your own code. It's just
very unlikely to happen accidentally.
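
For what the shell's "terminated with signal 9" message corresponds to at the API level, here is a small sketch (assuming a POSIX system; signal_that_killed_child is a hypothetical name). A parent forks a child, the child sends itself a signal, and the parent reads the terminating signal out of the wait status. A watchdog killing the process from outside would leave the same kind of status behind.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that sends `signo` to itself.
// Returns the signal that terminated the child, 0 if the child exited
// normally, or -1 if fork() or waitpid() failed.
int signal_that_killed_child(int signo) {
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        kill(getpid(), signo);
        _exit(0);  // only reached if the signal did not terminate the child
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the result against the symbolic constant SIGKILL rather than the literal 9 sidesteps the question of whether the number is guaranteed.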

Ian Collins · Mar 18, 2010

SIGKILL doesn't give a core. I don't even know if you can do
anything with it in the debugger.

Good spot, I overlooked that!

Jorgen Grahn · Mar 20, 2010

You make a random change and send back the program to QA?
Wouldn't you prefer to locate the bug and be sure you made the right
change?

Depends on how his QA is organized, I guess. But personally I'd rather
waste my own time than someone else's with long-shots.

I expect a good valgrind would have given you the answer.

Except for the special meaning of SIGKILL mentioned elsewhere in the
thread. There is almost certainly a watchdog of some kind somewhere
in the system, and his job is to (a) find it, (b) check its logs, (c)
find out what it assumes about the processes it watches.

/Jorgen

cerr · Mar 22, 2010

I got valgrind going, finally... but I'm having trouble reading its
output. I get stuff back like this, e.g.:
==18816== at 0x41C37DF: ??? (in /lib/tls/i686/cmov/libc-2.10.1.so)
==18816== by 0x41C33CF: strtol (in /lib/tls/i686/cmov/libc-2.10.1.so)
==18816== by 0x41C0740: atoi (in /lib/tls/i686/cmov/libc-2.10.1.so)
==18816== by 0x8058469: gpsnmeareader::storeGPSdata(gpsnmeareader::NMEA_STR) (gpsnmeareader.cpp:306)
==18816== by 0x8057F89: gpsnmeareader::run() (gpsnmeareader.cpp:257)
==18816== by 0x804E9F9: TSPThread::StartThread(void*) (tspthread.cpp:37)
==18816== by 0x417F80D: start_thread (in /lib/tls/i686/cmov/libpthread-2.10.1.so)
==18816== by 0x425F8DD: clone (in /lib/tls/i686/cmov/libc-2.10.1.so)
==18816== Address 0x436d29d is 13 bytes inside a block of size 15 free'd
==18816== at 0x402454D: operator delete(void*) (vg_replace_malloc.c:346)
==18816== by 0x40D935C: std::string::_Rep::_M_destroy(std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.13)
==18816== by 0x40DAD6B: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (in /usr/lib/libstdc++.so.6.0.13)
==18816== by 0x80583B1: gpsnmeareader::storeGPSdata(gpsnmeareader::NMEA_STR) (gpsnmeareader.cpp:302)
==18816== by 0x8057F89: gpsnmeareader::run() (gpsnmeareader.cpp:257)
==18816== by 0x804E9F9: TSPThread::StartThread(void*) (tspthread.cpp:37)
==18816== by 0x417F80D: start_thread (in /lib/tls/i686/cmov/libpthread-2.10.1.so)
==18816== by 0x425F8DD: clone (in /lib/tls/i686/cmov/libc-2.10.1.so)

Now what does that mean? All I can read is "Address 0x436d29d is 13
bytes inside a block of size 15 free'd" but I have no clue where
0x436d29d is...

Well, there clearly seem to be issues in
gpsnmeareader.cpp, but how do I dig further?

Thanks,

cerr · Mar 22, 2010

==18816== by 0x8058469: gpsnmeareader::storeGPSdata(gpsnmeareader::NMEA_STR) (gpsnmeareader.cpp:306)

It seems you are trying to access some std::string data buffer here

==18816== by 0x80583B1: gpsnmeareader::storeGPSdata(gpsnmeareader::NMEA_STR) (gpsnmeareader.cpp:302)

Which has been destroyed here (possibly at scope exit). I would take a
very careful look at the function residing in gpsnmeareader.cpp lines
302-306.

The diagnostics seem quite good so the error should be obvious when you
look at the source code. Of course, there is no guarantee valgrind has
actually spotted the problem, but in general I have found it quite up to the
task.

Hm okay, found out that atoi() isn't thread safe and replaced all
occurrences with strtol(), and now I'm seeing things like
==27829== 7 errors in context 5 of 13:
==27829== Thread 5:
==27829== Invalid read of size 1
==27829== at 0x41C36BD: ??? (in /lib/tls/i686/cmov/libc-2.10.1.so)
==27829== by 0x41C33CF: strtol (in /lib/tls/i686/cmov/libc-2.10.1.so)

What's that about? strtol is thread safe, it says... I'm not sure...

Michael Doubez · Mar 23, 2010

IMO you got it wrong. This kind of error has nothing to do with atoi()
not being thread safe in your implementation (I wonder why atoi is not
thread safe, because of the locale?).

What valgrind told you is that you are accessing a memory location that
has been freed. If it is a string you may have an issue with COW but
I doubt it.

Try other options of valgrind or even get gdb to break on the
suspicious line.
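
The code of storeGPSdata() is not shown in the thread, so the following is only a guess at the kind of pattern that produces such a report: a pointer obtained from c_str() that outlives the std::string it came from and is then handed to atoi() or strtol(). The sketch below (parse_int is a hypothetical helper) keeps the string alive for the whole conversion and also checks for conversion errors, which atoi() cannot report.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Pattern that valgrind would flag (shown as a comment on purpose):
//   const char* p = field.substr(0, 4).c_str();  // temporary string destroyed here
//   int v = atoi(p);                              // reads memory that was freed

// Hypothetical helper: parses a whole field as a base-10 integer.
// The caller's string outlives the call, so the buffer stays valid.
bool parse_int(const std::string& field, long& out) {
    if (field.empty()) return false;
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(field.c_str(), &end, 10);
    if (errno == ERANGE) return false;                        // out of range for long
    if (end != field.c_str() + field.size()) return false;   // no digits or trailing junk
    out = value;
    return true;
}
```

If the real code does something different, the same rule still applies: whatever owns the character buffer has to live at least as long as the pointer passed to the C function.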


cerr · Mar 23, 2010

Okay, now I get tons of messages from valgrind, which really worries me.
No one has ever run this piece of code through valgrind, as valgrind
hadn't compiled on the target platform till the other day. That's
scary.
Ok, now the first error looks like this (the messages above it seem to be
init messages where stuff is read from or redirected to):
==30794== Conditional jump or move depends on uninitialised value(s)
==30794== at 0x40C0F88: std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const (in /usr/lib/libstdc++.so.6.0.13)
==30794== by 0x40C120C: std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const (in /usr/lib/libstdc++.so.6.0.13)
==30794== by 0x40D1434: std::ostream& std::ostream::_M_insert<long>(long) (in /usr/lib/libstdc++.so.6.0.13)
==30794== by 0x40D15C3: std::ostream::operator<<(int) (in /usr/lib/libstdc++.so.6.0.13)
==30794== by 0x804CFFD: BlackBox::prepareFileandHeader() (blackbox.cpp:208)
==30794== by 0x804D98F: BlackBox::start(std::string, logger*, int, int, char const*) (blackbox.cpp:335)
==30794== by 0x806AE9E: PRGDaemon::work() (prgdaemon.cpp:359)
==30794== by 0x806ADB3: PRGDaemon::runWork() (prgdaemon.cpp:339)
==30794== by 0x804C608: main (prg.cpp:74)
Now, I understand "Conditional jump or move depends on uninitialised
value(s)" but how would I figure out where this is happening? This is
just a whole lot of information and I'm not quite clear on how to read or
interpret this... a little support is appreciated!

Thanks a lot!

Jorgen Grahn · Mar 25, 2010

This message comes from deep inside the standard library. You are calling it
through std::ostream::operator<<(int) at blackbox.cpp line 208. You can
safely assume that the standard library works correctly, so all errors,
if any, must be in your code. In this case about the only thing which can
be uninitialized is the int you are passing. Find out where it is coming
from, and make sure it is initialized properly. For example, if it is
from a memory block allocated with malloc(), you can change the malloc()
call to calloc() (after finding out why this memory part is unused, of
course).
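
To illustrate that advice with a sketch (the real BlackBox class is not shown in the thread, so FileHeader and header_line are invented names): giving the int a defined value at the point of declaration means operator<<(int) never sees an indeterminate value, whichever code path created the object.

```cpp
#include <sstream>
#include <string>

// Hypothetical record standing in for whatever blackbox.cpp line 208 prints.
struct FileHeader {
    int recordCount = 0;  // default member initializer: never indeterminate
};

// Hypothetical helper: formats the field through operator<<(int),
// the same call that appears in the valgrind trace.
std::string header_line(const FileHeader& header) {
    std::ostringstream out;
    out << "records=" << header.recordCount;
    return out.str();
}
```

Without the initializer, a default-constructed FileHeader on the stack would hold whatever bytes were there before, and valgrind would report the read inside the library just as in the trace.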

A few things to keep in mind:

- It's not a given that valgrind, on a platform to which it was ported
"the other day", works perfectly, or that the libraries don't give
any warnings. I believe it comes with a big database of warnings which
are harmless and should not be shown to the user -- for a certain
libc, processor, OS and so on.

- I note that you've lost sight of the original bug -- the SIGKILL.
Fixing bugs valgrind shows you is honorable work, but it is rarely top
priority.

/Jorgen
