gcc 4.8 and SPEC benchmark


Johannes Bauer

Hi group,

with the 4.8 release, gcc announces that it will break some code whose
correct compilation relied on undefined behavior (UB):
http://gcc.gnu.org/gcc-4.8/changes.html

Namely, SPEC 2006 is broken under that release. An explanation of the
assumptions that gcc makes is given here:
http://blog.regehr.org/archives/918

From a language standpoint, gcc is doing perfectly fine: garbage in,
garbage out. My question is: why does a benchmark like SPEC (which is
quite popular) contain code that actually invokes UB? It sounds like
a recipe for disaster. This apparently also affects some real-world
H.264 code (ffmpeg et al.). Is there a reason for this? Why?
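
For the curious, the offending pattern in the h264ref benchmark looks
roughly like this (paraphrasing the example from the linked blog
post):

    int d[16];

    int SATD(void)
    {
        int satd = 0, dd, k;
        /* The increment reads d[++k] one final time when ++k == 16.
         * That out-of-bounds read is undefined behavior, so gcc 4.8
         * may assume k < 16 always holds, delete the exit test, and
         * turn this into an infinite loop. */
        for (dd = d[k = 0]; k < 16; dd = d[++k])
            satd += (dd < 0 ? -dd : dd);
        return satd;
    }

If I read the 4.8 changes page correctly, the new assumption can be
switched off with -fno-aggressive-loop-optimizations.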

Best regards,
Johannes

--
At least not publicly!
Ah, the newest and to this day most ingenious coup of our great
cosmologists: the Secret Prediction.
- Karl Kaos on Rüdiger Thomas in dsa <[email protected]>
 

Eric Sosman

[...]
From a language-standpoint, gcc is doing perfectly fine: Garbage in,
garbage out. My question is: why does a benchmark like SPEC (which is
quite popular) consist of code that actually includes UB? [...]

My question is, "Why would bugs in benchmarks surprise anyone?"
Or, "Why would anyone expect that turning code into a benchmark
would rid it of bugs?"

The SPEC benchmarks were derived from real-world programs, those
real-world programs were probably not bug-free, some of those bugs
survived SPECification (and others may have been introduced in the
process), so the SPEC benchmarks have bugs. <Shrug.>

It seems to me your question is just a special case of "Why
does code have bugs?"
 

Johannes Bauer

It seems to me your question is just a special case of "Why
does code have bugs?"

Not really -- with normal bugs, you don't "fix" the compiler, you fix
the buggy code. Not in this case:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53265

It basically "fixes" the compiler to not optimize. This blows my mind.

Regards,
Johannes

--
At least not publicly!
Ah, the newest and to this day most ingenious coup of our great
cosmologists: the Secret Prediction.
- Karl Kaos on Rüdiger Thomas in dsa <[email protected]>
 

Eric Sosman

Not really -- with normal bugs, you don't "fix" the compiler, you fix
the buggy code. Not in this case:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53265

Basically "fixes" the compiler to not optimize. This blows my mind.

Sorry for misunderstanding the thrust of your question. As
to why the GCC folks chose to kludge the compiler to let the buggy
code slide by, it seems more of a "marketing" decision than a
technical one. The new compiler exposed bugs that had (apparently)
gone undetected for a long time, in code that is both popular and
very difficult to change (fixing SPEC might invalidate a huge
pile of already-published measurements, or at least make them
incomparable with new ones). Or, "Hey, everybody: This new GCC
produces faster code than the old one. To see how much faster,
just run the SPEC benchmarks, oh, hey, wait, ..."

Compilers, like other programs, exist to satisfy a set of
needs, both technical and non-technical. Sometimes those needs
are in conflict.
 

Ken Brody

Sorry for misunderstanding the thrust of your question. As
to why the GCC folks chose to kludge the compiler to let the buggy
code slide by, it seems more of a "marketing" decision than a
technical one. The new compiler exposed bugs that had (apparently)
[...]

Consider, for example, when the IBM-PC was first introduced, there were bugs
in the BIOS. Clones, for the sake of "100% compatibility", purposely
included those same bugs in their BIOS, lest some program fail to run
because it depended on the buggy behavior.
 

glen herrmannsfeldt

Well, at the higher optimization levels it is not unusual for there
to be cases where the optimization shouldn't be done, but the compiler
has a difficult time detecting those cases.

A popular optimization is to move invariant expressions out of loops,
assuming that the expression will be executed a large number of times.

It is not always possible to decide if the move is safe.
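
For instance (a minimal sketch with made-up names; a real compiler
does this on intermediate code, not source):

    #include <stddef.h>

    /* x * y is invariant across iterations, so the compiler may
     * rewrite scale() into scale_hoisted(), computing it once. */
    void scale(int *a, size_t n, int x, int y)
    {
        for (size_t i = 0; i < n; i++)
            a[i] = x * y;
    }

    void scale_hoisted(int *a, size_t n, int x, int y)
    {
        int tmp = x * y;              /* hoisted out of the loop */
        for (size_t i = 0; i < n; i++)
            a[i] = tmp;
    }

    /* But hoisting is not always safe.  If n can be 0 and the
     * invariant expression can trap, moving it out introduces a
     * fault the original program never executed: */
    void divide_all(int *a, size_t n, int b, int c)
    {
        for (size_t i = 0; i < n; i++)
            a[i] = b / c;             /* unsafe to hoist unless the
                                         compiler proves n > 0 */
    }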
Sorry for misunderstanding the thrust of your question. As
to why the GCC folks chose to kludge the compiler to let the buggy
code slide by, it seems more of a "marketing" decision than a
technical one. The new compiler exposed bugs that had (apparently) [...]

Consider, for example, when the IBM-PC was first introduced, there were bugs
in the BIOS. Clones, for the sake of "100% compatibility", purposely
included those same bugs in their BIOS, lest some program fail to run
because it depended on the buggy behavior.

Q: What is the difference between a bug and a feature?

A: A feature is documented.

Since the BIOS's commented assembly code was published (and
copyrighted), it was well documented. Still, some of the features were
less useful than they might have been.

There have been a number of cases where new hardware (or emulation in
hardware or software) had to reproduce bugs in the original, or failed
for lack of them.

Some stories I remember had to do with incompatibilities between the
8080, Z80, and NEC V20 in 8080-emulation mode. Some, I believe, had to
do with flag bits that were left undefined in the 8080 but whose
undocumented behavior some programs (especially early MS BASIC)
depended on.

In the case of the IBM PC, many programs didn't use the BIOS calls, but
went directly to hardware, mostly for speed reasons. Microsoft Flight
Simulator was the favorite test case for clone hardware. So, not only do
you have to implement the BIOS bugs, but the hardware bugs as well.

The 20-bit address space of the 8088 would wrap when
(segment << 4) + offset was too big, but the 80286 and later, with a
24-bit address bus, would not wrap. Hardware (the so-called A20 gate)
was added to 80286 and later machines to zero address bit 20 when
running in the appropriate mode.

Certainly seems to me a bug for software to depend on address wrapping,
but a hardware fix was needed.
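
To make the wrap concrete, here is a sketch in C rather than 8088
assembly (the helper names are mine):

    #include <stdint.h>
    #include <stdio.h>

    /* 8088: 20 address lines, so segment:offset wraps modulo 1 MiB. */
    static uint32_t phys_8088(uint16_t seg, uint16_t off)
    {
        return (((uint32_t)seg << 4) + off) & 0xFFFFF;
    }

    /* 80286 with all address lines active: no wrap; FFFF:FFFF
     * reaches 0x10FFEF, just past the 1 MiB boundary. */
    static uint32_t phys_80286(uint16_t seg, uint16_t off)
    {
        return ((uint32_t)seg << 4) + off;
    }

    int main(void)
    {
        /* FFFF:0010 wraps to 00000 on the 8088 but is 100000 on
         * the 286 -- unless the A20 gate zeroes bit 20. */
        printf("8088:  %05X\n", (unsigned)phys_8088(0xFFFF, 0x0010));
        printf("80286: %06X\n", (unsigned)phys_80286(0xFFFF, 0x0010));
        return 0;
    }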

-- glen
 

Jorgen Grahn

Well, at the higher optimization levels it is not unusual for there
to be cases where the optimization shouldn't be done, but the compiler
has a difficult time detecting those cases.

A popular optimization is to move invariant expressions out of loops,
assuming that the expression will be executed a large number of times.

It is not always possible to decide if the move is safe.

Simple -- if it's not possible to decide, you don't do the optimization.

But I think you're really talking about optimizations which may not be
optimizations. The code still works, but performance-wise it was a
bad idea. *That* will always be something compilers suffer from.
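
For instance (a made-up sketch): speculatively hoisting an expensive
computation is perfectly legal if it has no side effects, but if the
path that needs it is cold, the "optimized" code is slower:

    #include <stdbool.h>

    static int expensive(void)        /* stand-in for costly work */
    {
        int v = 0;
        for (int i = 0; i < 1000000; i++)
            v += i;
        return v;
    }

    static bool rare(int i)           /* almost always false */
    {
        return i == 123456789;
    }

    int before(int n)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (rare(i))
                sum += expensive();   /* paid only on the cold path */
        return sum;
    }

    int after_hoist(int n)
    {
        int tmp = expensive();        /* now paid on every call */
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (rare(i))
                sum += tmp;
        return sum;
    }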

/Jorgen
 

Ken Brody

Consider, for example, when the IBM-PC was first introduced, there were bugs
in the BIOS. Clones, for the sake of "100% compatibility", purposely
included those same bugs in their BIOS, lest some program fail to run
because it depended on the buggy behavior.

Q: What is the difference between a bug and a feature?

A: A feature is documented.
:)

[...]
Some stories I remember had to do with incompatibilities between the
8080, Z80, and NEC V20 in 8080-emulation mode. Some, I believe, had to
do with flag bits that were left undefined in the 8080 but whose
undocumented behavior some programs (especially early MS BASIC)
depended on.
[...]

Then there's the case where Intel specifically said "reserved for
future use" about interrupt vector 5. Microsoft decided to use that
interrupt for the "print screen" function.

Lo and behold, the 286 came out and used that vector for a CPU
exception (the BOUND range exceeded fault). Now, every MS-DOS system
had to put code at that vector that would check the instruction that
caused the trap: if it was an "INT 5" opcode, go to the print-screen
code; otherwise, go to the fault handler.

Given that Microsoft documented their use of INT 5, I guess this qualifies
as a "feature"? :)
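
For illustration, here is roughly the dispatch such a handler
performs, sketched as portable C rather than real-mode assembly (the
names are mine). A software INT pushes the address of the *next*
instruction, while the 286 BOUND fault pushes the address of the
faulting instruction itself, so looking at the two bytes just before
the saved IP tells the two apart:

    #include <stdint.h>
    #include <stdbool.h>

    /* code: image of the interrupted code segment;
     * ret_ip: the IP value the trap pushed on the stack. */
    static bool was_software_int5(const uint8_t *code, uint16_t ret_ip)
    {
        return ret_ip >= 2
            && code[ret_ip - 2] == 0xCD    /* INT imm8 opcode */
            && code[ret_ip - 1] == 0x05;   /* immediate = 5 */
    }

    /* The real handler would then either jump to the BIOS
     * print-screen routine or fall through to the fault handler. */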
 

russell.gallop

It seems to me your question is just a special case of "Why
does code have bugs?"

I think it's even worse than "benchmarks are code, so they have bugs."
All code has bugs, but the fact that benchmark code is preserved and
kept the same for a long period of time means that it lags behind the
tools for finding bugs in a way that actively developed code doesn't.

I'd like to pretend I can spot all the bugs in code I write, but I'm
probably more dependent on the tools I use.
 
