Books for advanced C++ debugging

jacob navia

I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely
buggy.

The reason is that the C++ standard says that modifying memory through a
pointer to a type other than the one the memory was declared with
is "undefined"...

Well, gcc has an option (that I discovered later)

-Wstrict-aliasing

that should warn you if you are doing something wrong. Since that
warning is enabled when -Wall is used, I thought we were protected,
and filed a bug report when we discovered that gcc generated code that
reads a value from uninitialized memory.


Of course, we wanted to make things easy for the gcc folks, and
wasted a lot of time isolating the bug, and then we sent it
to gcc's bug database.

The answer was simply

"Your code has aliasing problems. This is not the place to
educate you about C/C++"

We forgot to pass the code snippet through -Wall and the gcc folks
apparently did not understand that this snippet wasn't the problem

Great.

But how come that gcc doesn't emit the slightest warning?

Well, we discovered that -Wall does enable the -Wstrict-aliasing but
that option has a "scale" i.e. you can say

-Wstrict-aliasing=1
up to
-Wstrict-aliasing=5

When you set -Wall you enable -Wstrict-aliasing=3.

OK. We increased the level to 5 (lengthening the compile time of our
software, which is already a staggering 20 minutes on a 4-core machine).

Still, gcc doesn't emit A SINGLE WARNING!

Now, how can we discover this?

The problem is that this code was written well before the new C++
standard was written. It was written at a time when doing this was
correct (with 32-bit pointers):

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as if it were a single 64-bit
integer.

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bug?

I thought there could be a book with *advanced* C++ debugging but a
Google search, then an Amazon search yielded nothing
but books for beginners or user manuals of Visual C++ debugger
written in a book form.

Is there a combination of gcc warnings (that is NOT included in -Wall,
since we already have that) that could be useful here?

Is there a tool somewhere that could diagnose this problem?

And last but not least: Is there a good book on C++ debugging?

Thanks in advance



jacob navia
 
Vaclav Haisman

jacob navia wrote, On 9.7.2009 22:42:
I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely
buggy.

The reason is that the C++ standard says that using a pointer of another
type to change memory as the type declared that this memory should be
is "undefined"...

Well, gcc has an option (that I discovered later)

-Wstrict-aliasing

that should warn you if you are doing something wrong. Since that
warning is enabled when -Wall is used, I thought we were protected,
and filed a bug report when we discovered that gcc generated code that
reads a value from uninitialized memory.


Of course, we wanted to make things easy to the gcc folks, and
wasted a lot of time isolating the bug, and then we sent it
to gcc's bug database.

The answer was simply

"Your code has aliasing problems. This is not the place to
educate you about C/C++"

We forgot to pass the code snippet through -Wall and the gcc folks
apparently did not understand that this snippet wasn't the problem

From your description lower in the email, it seems to me that your code is
really the problem.

Great.

But how come that gcc doesn't emit the slightest warning?

Violating aliasing rules is UB, and UB does not require diagnostics.

Well, we discovered that -Wall does enable the -Wstrict-aliasing but
that option has a "scale" i.e. you can say

-Wstrict-aliasing=1
up to
-Wstrict-aliasing=5

When you set -Wall you enable -Wstrict-aliasing=3.

OK. We increased the level to 5 (lengthening the compile time of our
software that is already a staggering 20 minutes in a 4 core machine)

Just 20 minutes? That's nothing.

Still, gcc doesn't emit A SINGLE WARNING!

The levels exist because the higher levels can give false positives. And even
then they are not exhaustive. The compiler cannot see or recognize each and
every violation of the aliasing rules.

Now, how can we discover this?

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.

That was never correct, not since the first C standard, which dates from 1990. You
were just lucky to get away with it. And before you think about trying to
solve this using a union of the structure and some 64-bit integer: no, that is
not allowed either.
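
For contrast, a minimal sketch of a well-defined way to view two 32-bit words as one 64-bit value, by copying the object representation with `std::memcpy`. The names `TwoWords`, `load_as_u64` and `store_from_u64` are hypothetical, and the struct uses two 32-bit integers instead of pointers so that the sketch behaves the same on 32-bit and 64-bit hosts:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the legacy struct: two 32-bit fields.
struct TwoWords {
    std::uint32_t a;
    std::uint32_t b;
};

// The trick only makes sense if there is no padding between the members.
static_assert(sizeof(TwoWords) == sizeof(std::uint64_t),
              "sketch assumes TwoWords is exactly 64 bits wide");

// Well-defined: copy the bytes of the struct into a 64-bit integer.
inline std::uint64_t load_as_u64(const TwoWords& w) {
    std::uint64_t bits;
    std::memcpy(&bits, &w, sizeof bits);
    return bits;
}

// Well-defined: copy the bytes of a 64-bit integer back over both members.
inline void store_from_u64(TwoWords& w, std::uint64_t bits) {
    std::memcpy(&w, &bits, sizeof bits);
}

// The pattern under discussion, shown only for contrast (undefined behaviour):
//   std::uint64_t bits = *reinterpret_cast<std::uint64_t*>(&w);
```

The numeric value returned by `load_as_u64` still depends on byte order, so this removes the aliasing violation but not the platform dependence.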

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

The standard has not changed since 1998. Aren't 10+ years enough to learn the
language you are using? :)

I thought there could be a book with *advanced* C++ debugging but a
Google search, then an Amazon search yielded nothing
but books for beginners or user manuals of Visual C++ debugger
written in a book form.

Is there a combination of gcc warnings (that is NOT included in Wall
since we already have that) that could be useful here?

Is there a tool somewhere that could diagnose this problem?

And last but not least: Is there a good book in C++ debugging?

I don't think there is anything specific about C++ debugging. You just need
to know the language well enough.
 
Ian Collins

jacob said:
I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely
buggy.

1) write decent unit tests.
2) compile (and run the tests) with more than one compiler.

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.

This has nothing to do with the standard and everything to do with the
platform. 16 and 64 bit system any standard and your code would fail on
them.
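
The platform assumption can also be written down so that the compiler checks it. A minimal sketch, in which the `kFitsIn64Bits` constant and the `LEGACY_PACKS_TWO_POINTERS` macro are hypothetical names invented for this example:

```cpp
#include <cstdint>

struct twoPointers {
    void *a;
    void *b;
};

// True only where the legacy trick can even fit: 32-bit pointers and no
// padding, so that the whole struct is exactly 64 bits wide.
constexpr bool kFitsIn64Bits =
    sizeof(void *) == 4 && sizeof(twoPointers) == sizeof(std::uint64_t);

// Hypothetical opt-in: define LEGACY_PACKS_TWO_POINTERS in builds that still
// contain the 64-bit-integer trick, so other platforms fail at compile time.
#ifdef LEGACY_PACKS_TWO_POINTERS
static_assert(kFitsIn64Bits,
              "twoPointers cannot be treated as one 64-bit integer here");
#endif
```

This only documents the size assumption. It does not make the aliasing itself legal.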

3) compile (and run the tests) on more than one platform.

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

Don't hack.
 
Ian Collins

Ian said:
1) write decent unit tests.
2) compile (and run the tests) with more than one compiler.


This has nothing to do with the standard and everything to do with the
platform. 16 and 64 bit system any standard and your code would fail on
them.

Make that "16 and 64 bit systems pre-date any C++ standard"
 
Pascal J. Bourguignon

jacob navia said:
I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be
completely buggy.

The reason is that the C++ standard says that using a pointer of another
type to change memory as the type declared that this memory should be
is "undefined"...

Yes, that's what the standard says, but the compilers don't try to
catch this 'error'.

[...] The answer was simply

"Your code has aliasing problems. This is not the place to
educate you about C/C++"

Quite an "attitude", isn't it.

Now, how can we discover this?

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

Quite easily, by tagging the data.

I thought there could be a book with *advanced* C++ debugging but a
Google search, then an Amazon search yielded nothing
but books for beginners or user manuals of Visual C++ debugger
written in a book form.

Is there a combination of gcc warnings (that is NOT included in Wall
since we already have that) that could be useful here?

I wouldn't hold my breath.

Is there a tool somewhere that could diagnose this problem?

It's done by the Zeta-C compiler (since the target is the
LispMachine). Of course, today it might be easier to build a time
machine than to find a LispMachine with the Zeta-C compiler, and
anyways, it doesn't solve the problem of C++.


Perhaps one of the C/C++ interpreters is doing this type check. Try
them.

C INTERPRETERS:
CINT - http://root.cern.ch/root/Cint.html
EiC - http://eic.sourceforge.net/
Ch - http://www.softintegration.com
[ MPC (Multi-Platform C -> Java compiler) - http://www.axiomsol.com ]


Otherwise, your best chance would be to patch them, or gcc (or
lcc-win32), to generate tagged data and implement run-time type
checks.
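
As a rough library-level illustration of tagged data with a run-time type check (a sketch only, not a compiler patch), one could imagine something like the following. The `TaggedCell` class, its 16-byte capacity and its use of `std::logic_error` are all hypothetical choices made up for this example:

```cpp
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <typeinfo>

// Hypothetical raw storage cell that remembers the type last stored in it.
class TaggedCell {
public:
    // Copy the bytes of `value` into the cell and record its type as the tag.
    template <typename T>
    void store(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be stored");
        static_assert(sizeof(T) <= sizeof(bytes_), "value too large for cell");
        std::memcpy(bytes_, &value, sizeof(T));
        tag_ = &typeid(T);
    }

    // Return the stored value, or throw if T is not the type that was stored.
    template <typename T>
    T load() const {
        if (tag_ == nullptr || *tag_ != typeid(T))
            throw std::logic_error("TaggedCell: type mismatch on read");
        T value;
        std::memcpy(&value, bytes_, sizeof(T));
        return value;
    }

private:
    unsigned char bytes_[16] = {};
    const std::type_info* tag_ = nullptr;  // nullptr until the first store
};
```

Storing an `int*` and then loading it back as an `int*` succeeds, while loading it as a 64-bit integer throws. Legacy code would have to be rewritten to go through such a wrapper, which is exactly the cost a tagging compiler would avoid.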

Note that array overflows and invalid pointer dereferences are the same
sort of bug, and should likewise be checked at run time. The C and
C++ standards explicitly say that dereferencing a pointer outside of its
pointed-to array is undefined; even holding a pointer outside of its
array limits (plus 1) is undefined...

char a[5];
char* p=a; // valid
p+=4; // valid
*p; // valid
p++; // valid
*p; // undefined
p++; // undefined

The problem is that C compiler writers don't bother writing the
run-time checks that would detect these bugs, much less doing the type
inference that would be needed to detect a small number of them at
compilation time.

And last but not least: Is there a good book in C++ debugging?

Well, before writing a good book for C++ debugging, writing a good C++
debugger would be in order, don't you think?
 
Pascal J. Bourguignon

Vaclav Haisman said:
That was never correct, not since the first C standard which is 1990.

Indeed. And it is the shame of the C compiler industry that in 19 years
it has not produced a single implementation that can detect this
error automatically.

You were just lucky getting away with it. And before you think about trying to
solve this using union of the structure and some 64bit integer, no, that is
not allowed either.

This is not what is asked here. What is asked is some help from the
compiler, so we can detect these errors (either at compilation time or
at run time).

That the standard says it's "Undefined behavior" to allow for small
barebone C implementations on small systems such as a PDP-11, so be it.
But this shouldn't prevent C compiler providers from offering more
sophisticated and helpful compilers on the multi-{giga-{byte,hertz},core}
machines we have today.
 
Bart van Ingen Schenau

I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely
buggy.

And that comes as a surprise to a compiler writer?
I could probably write completely buggy code that is nevertheless
accepted without any complaint by lcc-win.

But how come that gcc doesn't emit the slightest warning?

Probably because your code used a type-cast operation, which the
compiler rightfully interprets as a message saying "Don't bother
complaining about this. I know what I am doing."

Well, we discovered that -Wall does enable the -Wstrict-aliasing but
that option has a "scale" i.e. you can say

-Wstrict-aliasing=1
up to
-Wstrict-aliasing=5

When you set -Wall you enable -Wstrict-aliasing=3.

OK. We increased the level to 5 (lengthening the compile time of our
software that is already a staggering 20 minutes in a 4 core machine)

Still, gcc doesn't emit A SINGLE WARNING!

Now, how can we discover this?

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
        void *a;
        void *b;

};

And you could manipulate that data as it would be a single 64 bit
integer.

That has never been legal, and at best resulted in implementation-
defined behaviour.
To perform the re-interpretation, you need a type-cast. And as stated
before, the compiler then assumes you are aware of all the potential
problems, including the aliasing problems.

Thanks in advance

jacob navia

Bart v Ingen Schenau
 
LR

jacob said:
The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.


Can you show an example of how you do that? Because it's not clear to me
from what you posted what the exact problem was.

Have you tried lint? You can try it online.
http://www.gimpel-online.com/OnlineTesting.html

What other compilers have you tried your code on? Some may give better
diagnostics than others. You can try Comeau's compiler online too.
http://www.comeaucomputing.com/tryitout/


LR
 
 
jacob navia

The standard has not changed since 1998. Aren't 10+ years enough to learn the
language you are using? :)

I did not write this code, I am just trying to make it work.
Thanks for your helpful message. This confirms the attitude of
many here, as if they never had any bugs of course.

GCC emits code that reads from an uninitialized place. This means that
it took us 2 weeks to get to the root of this problem.

But (of course) we are the stupid ones who "do not know how to write
C++"

The code base is around 7-8MB of C++
 
jacob navia

Don't hack.

Sure sure. How helpful. This is a HUGE code base of MB and MB of
C++. I did not write this code. It is my job to make it work, that's
all.

Obviously I am being blamed for asking a question, since asking
questions is obviously a NO NO here.

(If you ask a question it means you do not know everything,
contrary to the gurus here)

"Don't hack"

And how can I know if in those MBs of code there is a hack?

That was my question. Now, please answer THAT, and if you can't
I hope you can at least keep your mouth SHUT!
 
Victor Bazarov

jacob said:
[..]
That was my question. Now, please answer THAT, and if you can't
I hope you can at least keep your mouth SHUT!

There is no need to crawl into the bottle, jacob. Let's not start
telling anybody whose answers we don't like to keep their mouths shut,
shall we? This is a free forum, folks post what they see fit, and
flames don't help accomplish your goals, do they?

You're frustrated beyond your usual level, and that's not so difficult
to discern. Every once in a while we get a piece of code to maintain
and it turns out to be a hack. Annoying? You bet. Infuriating
sometimes. Sometimes after beating my head against the wall, I ask,
"why me? What did I do to deserve it?" And it often turns out that I
was given that code because people trusted me to sort it out. And that
there was nobody else close-by who could do it.

I have no particular knowledge about debugging aliasing problems to
share with you, sorry. But my approach to debugging has usually been to
replace the pieces of code that don't work (or those I don't understand)
with something I do understand and know as working. Try that. I don't
know if there is a book of recipes like that, but I don't think you have
time to study. You need to get it working.

Divide and conquer. Figure out which pieces are OK, leave them as is.
You probably already know what causes problems, see if you can replace
it keeping the same interface. Perhaps you need to introduce some
interface (to emulate the behavior of the questionable piece of code).

Ask more specific questions about C++, and you will have better answers.
But you already knew that, didn't you?

V
 
Nick Keighley

I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely
buggy.

Doesn't that apply to all languages and all compilers?

The reason is that the C++ standard says that using a pointer of another
type to change memory as the type declared that this memory should be
is "undefined"...

This applies to C as well. Can your compiler diagnose
such problems in C?

<snip>
 
Ian Collins

jacob said:
Sure sure. How helpful. This is a HUGE code base of MB and MB of
C++. I did not write this code. It is my job to make it work, that's
all.

Which is why I suggested three techniques I use, which you chose to snip.

Obviously I am being blamed for asking a question, since asking
questions is obviously a NO NO here.

No, you are just a little paranoid.

And how can I know if in those MBs of code there is a hack?

That was my question. Now, please answer THAT, and if you can't
I hope you can at least keep your mouth SHUT!

I seldom type with my mouth open, it attracts unwelcome bugs!

As others have said, your code base must contain casts to do what you say
it's doing. Finding those would be a good start. Then try the three
techniques I suggested, which you chose to snip.
 
Jerry Coffin

[ ... ]
Now, how can we discover this?

It's certainly going to be nontrivial.

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
void *a;
void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.

Sorry, but there never was such a time. Even the original C standard
specified that (for example) there could be padding between members
of a struct, so your code would give undefined results. The same was
true before there was a standard for C, though obviously there wasn't
any "official" document to state it.

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

The standard has never changed in this respect, and it follows the
example of the C standard, which codified the existing practice that
your code gave undefined results.

[ ... ]
Is there a combination of gcc warnings (that is NOT included in
Wall since we already have that) that could be useful here?

I don't use gcc much, and use gdb even less, so I can't give you
much help that's specific to it.

If I had to do this, I think I'd insert another member between the
two you have right now:

struct twoPointers {
void *a;
int ignore;
void *b;
};

Then in the debugger I'd set a breakpoint on any write to 'ignore'.
No existing code should use that member directly, so anything that
writes to it is essentially certain to be doing so via some sort of
undefined behavior, and needs to be fixed.
 
Anand Hariharan

On Jul 10, 5:05 am, Pascal J. Bourguignon wrote:
(...)
Note that array overflows and invalid pointer dereferences are the same
sort of bug, and should likewise be checked at run time. The C and
C++ standards explicitly say that dereferencing a pointer outside of its
pointed-to array is undefined; even holding a pointer outside of its
array limits (plus 1) is undefined...

Trying to read the value of an uninitialised variable results in UB as
well.

char a[5];
char* p=a; // valid
p+=4; // valid
*p;   // valid
p++;  // valid
*p;   // undefined
p++;  // undefined

The first *p that you state as valid results in undefined behaviour
because 'a' is not initialised.

- Anand
 
Ian Collins

Jerry said:
If I had to do this, I think I'd insert another member between the
two you have right now:

struct twoPointers {
void *a;
int ignore;
void *b;
};

Then in the debugger I'd set a breakpoint on any write to 'ignore'.
No existing code should use that member directly, so anything that
writes to it is essentially certain to be doing so via some sort of
undefined behavior, and needs to be fixed.

Good tip Jerry!

Putting the dummy member first should also work and this would also help
debug incorrect size assumptions when moving from 32 to 64 bit.
 
Jerry Coffin

Good tip Jerry!

Putting the dummy member first should also work and this would also help
debug incorrect size assumptions when moving from 32 to 64 bit.

Thanks. For that matter, you could perfectly well add both...
 
Joshua Maurice

jacob navia said:
"Your code has aliasing problems. This is not the place to
educate you about C/C++"
[snip]

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

I'm sorry that you're working with code which violates the C89
standard and the C++ standard. Someone has to fix it, and that someone
appears to be you. I don't have much to add beyond that which has been
mentioned already in this thread as to strategies to accomplish this.
Either way, expecting help from gcc developers from a false bug report
is unreasonable. Maybe in a feature request ... :)

jacob navia said:
I thought there could be a book with *advanced* C++ debugging but a
Google search, then an Amazon search yielded nothing
but books for beginners or user manuals of Visual C++ debugger
written in a book form.

Is there a combination of gcc warnings (that is NOT included in Wall
since we already have that) that could be useful here?

I wouldn't hold my breath.

Is there a tool somewhere that could diagnose this problem?

It's done by the Zeta-C compiler (since the target is the
LispMachine).  Of course, today it might be easier to build a time
machine than to find a LispMachine with the Zeta-C compiler, and
anyways, it doesn't solve the problem of C++.

Perhaps one of the C/C++ interpreters is doing this type check.  Try
them.

C INTERPRETERS:
    CINT - http://root.cern.ch/root/Cint.html
    EiC - http://eic.sourceforge.net/
    Ch - http://www.softintegration.com
    [ MPC (Multi-Platform C -> Java compiler) - http://www.axiomsol.com ]

Otherwise, your best chance would be to patch them, or gcc (or
lcc-win32), to generate tagged data and implement run-time type
checks.

Note that array overflows and invalid pointer dereferences are the same
sort of bug, and should likewise be checked at run time. The C and
C++ standards explicitly say that dereferencing a pointer outside of its
pointed-to array is undefined; even holding a pointer outside of its
array limits (plus 1) is undefined...
[snip]

The problem is that C compiler writers don't bother writing the
run-time checks that would detect these bugs, much less doing the type
inference that would be needed to detect a small number of them at
compilation time.

You seem to be taking the opinion that compilers should catch all
undefined behavior. C++ is not Java. C++'s stated primary design goals
include
- runtime performance comparable with assembly
- don't pay for what you don't use
- portable
- easy to write code / programmer productivity (with less relative
emphasis on this one IMHO)

With these design goals in mind, it is not reasonable to expect a
compiler to catch all possible undefined behavior or errors. To do
that would necessarily restrict the language so that it's less
comparable to assembly in speed and/or you start paying for things you
don't use.

In the C and C++ community, the assumption is that the programmer
knows what he's doing, and with that assumption, you can (relatively)
easily write really fast and portable code.

That someone hasn't written a "debugging" compiler which catches all
possible violations of the standard, as a debugging tool only, is
indeed a shame if true. However, Valgrind comes to mind as a useful tool
in this area. Also, various versions of MSVC do have optional runtime
bounds checking and other runtime checking. Finally, C interpreters,
whose existence you reference in your post, can catch all such misuse
that occurs at runtime. Thus, it appears that the tools whose absence you
bemoan do indeed exist, and I am confused by your
self-contradictions.
 
