Who can explain this bug?


Tim Rentsch

Noob said:
You're quick to castigate gcc!

Only after running some test cases.
Can you provide the test-case showing the problem you speak of?

I ran several simple test cases, including the ones
involving use of volatile that I showed else-thread. The
version of gcc I was using is not as recent as the one
mentioned up-thread, but recent enough that extrapolation
seemed reasonable. (And I note with satisfaction that my
advice about using volatile was proved correct.) I think if
you try to generate some test cases yourself they will be
easy enough to find, although that may depend on exactly
which version of gcc is used. It's possible newer versions
of gcc have eliminated this class of problems entirely;
however, knowing that gcc has had this problem historically,
and verifying a particular test case myself, was, I thought,
sufficient evidence to focus attention on that possibility.
 

Tim Rentsch

mathog said:
This little test program shows the same issue:

#include <stdio.h>
#include <stdlib.h>

/* Compare a/b against the running maximum and update it. */
void atest(double *ymax, double a, double b){
    double tmp;

    tmp = a/b;
    if(*ymax <= tmp){
        *ymax = tmp;
        printf("True\n");
    }
    else {
        printf("False\n");
    }
}

int main(void){
    double ymax = 0;
    double a, b;

    while(1){
        if(fscanf(stdin,"%lf %lf",&a,&b) != 2) break;  /* stop on EOF or bad input */
        if(!b) break;                                  /* stop on b == 0 */
        atest(&ymax,a,b);
    }
    return 0;
}

11 23
True
11 23
False
11 19
True
11 19
False
etc.

As suggested elsewhere in this thread, making tmp "volatile"
does resolve the issue. Volatile is supposed to prevent
compilers from optimizing away calculations that need to be
repeated. Here, for some reason, it forces the result into the
variable tmp. Why? The compiler then knows that 'tmp' may
change unpredictably between where it is set and where it is
read, but why does that imply that the desired value at the
second position is the calculated value rather than something
written into tmp from a register in a hardware device (or
whatever) outside of this program's control?

The other answers you got weren't especially informative,
so I'll take a shot at it.

First, I recommend forgetting what you think you know about
volatile. It may be right or it may be wrong, but either
way it isn't helping you understand what is happening here.

In a nutshell, what is happening is this. The variable 'tmp'
has a location in storage. The code generator notices that a
value is stored (from a register) into tmp, then later loaded
again (let's say into the same register). The optimizer tries
to improve the code by not bothering to load the value again,
since it's already there. This kind of optimization is very
common, and happens all the time with other variable types.
(Probably you know all this part already, but I thought I
would put it in for context.)

However, for floating point types, including in particular
the type double, the register is wider than the place in
memory where 'tmp' resides. So the register can retain
extra information which could not be retained if the value
were put into the 'tmp' memory location. So re-using the
value in the register can behave differently than storing
and reloading.

Normally the difference between these choices is small enough
that it usually doesn't matter. Moreover, on the x86, the
cost of doing "the right thing" is high enough that there is a
significant motivation to "cut corners", as it were, especially
since "it hardly ever matters". That kind of reasoning is
what led gcc to make the decisions it did regarding how to
optimize this sort of code.

To return to your question -- the reason 'volatile' makes a
difference is that volatile INSISTS the compiler deal with the
actual memory location for 'tmp', no matter what the compiler
may think otherwise. There are various reasons for having
'volatile' in the language, and also various reasons for using
it in certain circumstances, but as far as the compiler goes
the primary consequence of a volatile access is this: the
generated code must reference the actual memory location
involved, no matter _what_ the compiler might think about
what other code might have "the same effect".

So that's why using 'volatile' forces the value to be (stored
and then) fetched from the variable rather than being kept in
a register.
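
Concretely, in the program posted up-thread, the fix amounts to
nothing more than this (an untested sketch, assuming an x86-style
target whose floating-point registers are wider than a double in
memory):

/* Drop-in replacement for the atest() above. */
void atest(double *ymax, double a, double b){
    volatile double tmp;    /* every access must go through the
                               actual memory location of tmp */

    tmp = a/b;              /* the store discards any extra precision */
    if(*ymax <= tmp){       /* the reload reads back the narrowed value */
        *ymax = tmp;
        printf("True\n");
    }
    else {
        printf("False\n");
    }
}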
 

glen herrmannsfeldt

(snip on keeping extra precision)
There's an implication there (perhaps unintended) that discarding
excess precision wasn't required until C99. That conclusion
doesn't jibe with other, pre-C99, documents.
In particular, if we look at comments in Defect Reports, and
also how the wording in this area has changed over time (note
for example n843.htm, and compare n1256 to C99), it's clear that
narrowing behavior was intended all along.
What's more, assignment has _always_ implied removing any extra
range and precision, because in the abstract machine we actually
store the value in an object, and then on subsequent accesses
convert the stored representation to a value. Under the as-if
rule, any information beyond what the object representation can
represent must be lost.

I agree for -O0, and probably -O1, but at the higher optimization
levels one of the things you expect is to keep values in registers
throughout loops, and in general don't evaluate expressions more
often than necessary.

One idea behind optimization is that the compiler knows more than
the programmer, such that the programmer can write things in the
most readable form without worrying about how slow it is.
Bottom line: assignment of floating point types always requires
removing any extra range and precision, even in C90.

-- glen
 

Tim Rentsch

glen herrmannsfeldt said:
(snip on keeping extra precision)


I agree for -O0, and probably -O1, but at the higher optimization
levels one of the things you expect is to keep values in registers
throughout loops, and in general don't evaluate expressions more
often than necessary.

Whatever you might expect, such optimizations (with wide
floating-point registers) are not allowed in conforming
implementations of ISO C, and that has been true since 1990.
One idea behind optimization is that the compiler knows
more than the programmer, such that the programmer can
write things in the most readable form without worrying
about how slow it is.

Even if the qualifying assumption is true, the reasoning is
irrelevant, because optimized code is still required to
observe the abstract semantics of the language. The
"optimization" you are talking about is not faithful to the
abstract semantics of ISO C, and ergo is not conforming.
 

glen herrmannsfeldt

(snip)
Whatever you might expect, such optimizations (with wide
floating-point registers) are not allowed in conforming
implementations of ISO C, and that has been true since 1990.

I suppose reordering statements is also disallowed by the
standard. Moving things outside loops, and such. Fine, then
don't use the high optimization modes.
Even if the qualifying assumption is true, the reasoning is
irrelevant, because optimized code is still required to
observe the abstract semantics of the language. The
"optimization" you are talking about is not faithful to the
abstract semantics of ISO C, and ergo is not conforming.

I first knew about optimizing compilers in Fortran, where they
were doing it since before C existed. Now, Fortran does seem
to be less restrictive on floating point, but much of it people
live with because the speed-up is worthwhile. If it reduces the
run time from six days to five days, people will do it.

Some people like to do scientific programming in C that would
otherwise be done in Fortran. They also expect reasonable optimization.

Seems to me that compilers should state that with -O3 they don't
conform, and you have the choice to use it or not.

-- glen
 

Stephen Sprunk

I don't know the ins and outs of the standards here, but the
intention is that gcc will follow the standards for floating point
regardless of the -O level, unless you specifically tell it that you
are happy with non-conforming behaviour through "-ffast-math" or
related flags.

It has been known for a decade or two that GCC does this even without
-ffast-math, yet nobody has fixed it, which makes your statement of
intent suspect.
Optimisation level flags in themselves should never change the
behaviour of a program - just its size and speed.

It is also well-known that optimization often results in different
manifestations of undefined behavior. ITYM that optimization shouldn't
affect conformance, which I think we can all agree with.

S
 

James Kuyper

(snip on keeping extra precision)

I've never owned a copy of C89; they didn't become cheap enough to
justify buying one until after I'd already switched to C99, and
therefore no longer needed a copy of C89. Therefore, I can't be sure
whether or not C89 can be read as permitting or prohibiting excess
precision. However, I'm fairly certain, from what I've read, that it did
not explicitly address the issue. That would explain why Tim has to use
such subtle arguments as he has, rather than being able to simply cite a
specific clause that would clearly be violated by retaining excess
precision.

I'd be interested in seeing an example of code that, when compiled by
gcc with -std=c99 (and without -fexcess-precision=fast), fails to
satisfy the C99 standard's very lax requirements on the accuracy of
floating point operations, due to incorrect use of excess precision. gcc
does not pre-#define __STDC_IEC_559__, even when compiling for hardware
compliant with IEC 60559, thereby exempting it from all but the most
basic of the standard's accuracy requirements. I would be surprised if
gcc's developers mishandled any case that was simple enough to be
meaningfully constrained by such lax requirements.
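
For anyone who wants to check a particular setup, a trivial probe for
that predefine (written off the top of my head; nothing gcc-specific
is assumed) is:

#include <stdio.h>

int main(void){
#ifdef __STDC_IEC_559__
    puts("__STDC_IEC_559__ defined: IEC 60559 conformance is claimed");
#else
    puts("__STDC_IEC_559__ not defined: only the lax accuracy rules apply");
#endif
    return 0;
}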
 

mathog

Tim said:
There are various reasons for having
'volatile' in the language, and also various reasons for using
it in certain circumstances, but as far as the compiler goes
the primary consequence of a volatile access is this: the
generated code must reference the actual memory location
involved, no matter _what_ the compiler might think about
what other code might have "the same effect".

That was the explanation I was after.

Thanks,

David Mathog
 

James Kuyper

Undefined behaviour is, by definition, undefined. It should not be a
surprise if the actual effect varies according to optimisation level -
it should not be a surprise if it varies from run to run, or in any
other way.

But we do agree that optimisation levels should not affect the
observable behaviour of well-defined code.


In general, all unspecified behavior can change with optimization level
- which is the whole point, since one key thing that the standards
leaves unspecified is the speed with which your code executes. However,
unspecified behavior also includes implementation-defined behavior.
There's a lot of room for unexpected consequences, without having to
invoke undefined behavior. In principle, INT_MAX could change with
optimization level. This all follows from the simple fact that the
standard doesn't talk about optimization levels.
 

Tim Rentsch

glen herrmannsfeldt said:
(snip)

I suppose reordering statements is also disallowed by the
standard. Moving things outside loops, and such. Fine, then
don't use the high optimization modes.

Rather than opine in ignorance, why not read what the language
definition actually admits in this regard?

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

Not official standard documents, but either is close enough
to give accurate answers about source transformations.
I first knew about optimizing compilers in Fortran, where they
were doing it since before C existed.

The original Fortran didn't have a defining document. Whatever
the compiler did was how the language worked. The original C was
that way too, a problem compounded, in the absence of a standard,
by the variety of choices made by different implementations. If
you yearn for "the good old days", you might read Fred Brooks's
you yearn for "the good old days", you might read Fred Brooks's
comments in The Mythical Man-Month about System/360 and its
architectural definition, contrasted with the previous generation
of IBM machines.
Now, Fortran does seem to be less restrictive on floating
point, but much of it people live with because the speed-up is
worthwhile. If it reduces the run time from six days to five
days, people will do it.

What you mean is some people will do it. No doubt some will.
Others want more reliable guarantees.
Some people like to do scientific programming in C that would
otherwise be done in Fortran. They also expect reasonable
optimization.

Another over-generalization. Plus you're implying that what
optimization ISO C does allow is not "reasonable". It seems
more likely that your comment springs from ignorance of just
what ISO C does allow, and also why.
Seems to me that compilers should state that with -O3 they don't
conform, and you have the choice to use it or not.

As it is stated, this suggestion is hopelessly naive.
 

Tim Rentsch

James Kuyper said:
(snip on keeping extra precision)

I've never owned a copy of C89; they didn't become cheap enough
to justify buying one until after I'd already switched to C99, and
therefore no longer needed a copy of C89. Therefore, I can't be
sure whether or not C89 can be read as permitting or prohibiting
excess precision. However, I'm fairly certain, from what I've
read, that it did not explicitly address the issue. That would
explain why Tim has to use such subtle arguments as he has, rather
than being able to simply cite a specific clause that would
clearly be violated by retaining excess precision. [snip]

A few comments.

First, C90 is explicit in identifying operations that behave
differently with respect to extra range and precision. What it
does not do is spell out the consequences of that distinction in
plain, impossible-to-misunderstand language of the kind that
M. Kuyper prefers.

Second, part of my comments were complicated by having to look at
other documents, because a larger issue is being addressed, namely,
the general question of conversion, not just storage. Here the C90
document is more ambiguous, and to produce support for the general
statement I think other sources need to be looked at besides just
the C90 document. In that sense I agree with the statements above,
at least in spirit, for the issue of conversion generally. But
only for that, not for what happens when storing during assignment.

Third, specifically with regard to storing and accessing, actually
my reasoning there was more subtle than it needed to be. Only
values and the results of expressions fall under the umbrella of
possibly having extra range and precision; objects do not. Because
storing into an object is not part of the universe of cases that
might have extra range and precision, it simply cannot happen,
and there is no reason to redundantly exclude it.
 

Tim Rentsch

David Brown said:
It has been known for a decade or two that GCC does this even without
-ffast-math, yet nobody has fixed it, which makes your statement of
intent suspect.

[snip]

In the case of the standard's requirements about precision, I expect
it would be a significant effort trying to follow these requirements
while still generating reasonable code. The only way that would work
reliably on all hardware that supports higher internal precision is to
have lots of extra stores to memory then re-load the values.
Obviously this is going to mean a serious hit on floating point
performance.

These comments seem overly simplistic. Given everything else a
good optimizing compiler has to do, putting in a few extra forced
register spills shouldn't be very hard. Furthermore extra stores
would be necessary only on processors that don't have a suitable
rounding instruction -- any serious compiler that targets multiple
platforms will do processor-specific code generation. The idea
that the performance cost will be "serious" may not even be true,
let alone obvious. I think you're assuming that the cost of doing
the data movement is what will dominate, but I don't think that is
necessarily true; local caches cut the cost of data movement way
down, and the rounding operation itself is not a trivial operation
(mainly I think because some edge cases need some care).

Also it's worth pointing out that when these things (necessarily)
happen is subject to programmer control, by changing how the code
in question is written.
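
For instance, the "forced spill" amounts to nothing more than an
explicit store and reload through a double-sized object. A sketch of
my own (not any compiler's actual code; gcc's -ffloat-store option
asks for roughly this treatment everywhere):

#include <stdio.h>

/* Narrow a value to double by forcing it through memory.  On targets
   whose registers are wider than a double (x87, say), the store
   discards the extra range and precision. */
static double force_store(double x){
    volatile double slot = x;   /* store: excess precision is dropped */
    return slot;                /* reload: the 64-bit value comes back */
}

int main(void){
    double a = 11, b = 23;
    double tmp = force_store(a / b);
    printf("%d\n", tmp == force_store(a / b));   /* compares consistently */
    return 0;
}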
So what you have here is a standards requirement to produce poorer
quality results that means more work for the compiler to generate
bigger and slower code. You can argue that following the standards
gives a bit more consistency in results across architectures, but as
far as I understand it the IEEE specs for floating point do not
require bit-perfect repeatability in all cases - so even with full
standards conformance you do not have 100% consistency.

I think you're operating from a couple of false assumptions here.
One is that enforcing a standard precision (as opposed to some
unknown "extended" precision) on assignment necessarily gives
poorer quality results. The other is that conformance to floating
point standards is motivated (primarily) by reproducibility. Often
it is not nearly as important that calculations "do the right
thing" as that they "do NOT do the wrong thing". The rule for
assignment may not produce a more _accurate_ result, but I believe
it produces a more _dependable_ result.
In my opinion (which may not count for much, as I seldom have need
for high precision floating point work), code that relies on
limiting precision or "unreliable" operations such as floating point
equality tests, is risky and should be avoided. If you need such
features, you are better off using a library such as gnu mpfr (of
which I have no experience).

I think you've learned the wrong lesson. Operations on floating
point values don't behave the same way as they do in mathematics
on real values, but they do have rules for what behavior(s) are
allowed. The people who decided how these things should work
aren't stupid people: the rules are not arbitrary but are there
to allow computation that is fast, accurate, and dependable. To
get that, however, it's better to learn the rules, and work within
that framework, rather than treating floating point operations as
inherently mysteriously fuzzy.
Undefined behaviour is, by definition, undefined. It should not be a
surprise if the actual effect varies according to optimisation level -
it should not be a surprise if it varies from run to run, or in any
other way.

But we do agree that optimisation levels should not affect the
observable behaviour of well-defined code.

The comment about conformance is slightly different, and in
particular a slightly stronger statement. That is so because
defined-ness is not a binary condition. Do you see that?
 

Tim Rentsch

James Kuyper said:
In general, all unspecified behavior can change with optimization
level - which is the whole point, since one key thing that the
standards leaves unspecified is the speed with which your code
executes. However, unspecified behavior also includes
implementation-defined behavior. There's a lot of room for
unexpected consequences, without having to invoke undefined
behavior. In principle, INT_MAX could change with optimization
level. This all follows from the simple fact that the standard
doesn't talk about optimization levels.

The comment about changing INT_MAX is silly. INT_MAX must be
constant across a given implementation: a different value of
INT_MAX means a different implementation. Certainly a compiler
could take an "optimization" flag to mean a different value of
INT_MAX and become a different implementation, but that is no
different from saying it could take an "optimization" flag and
choose to compile Pascal rather than C. Neither one has anything
to do with variability due to implementation-defined behavior.
Unspecified (but not implementation-defined) behavior may change
within an implementation. Implementation-defined behavior may
change between implementations, but not within a particular
implementation.
 

Tim Rentsch

David Brown said:
David Brown said:
[snip]

But we do agree that optimisation levels should not affect the
observable behaviour of well-defined code.

The comment about conformance is slightly different, and in
particular a slightly stronger statement. That is so because
defined-ness is not a binary condition. Do you see that?

Yes, I can see that difference. And I have also had it pointed out to
me that there are things defined by the standards, things declared as
undefined, and things that are "implementation-defined" - i.e., they are
well-defined and the compiler must be consistent in generating correct
code for them, but the standards don't give all the details of /how/
they are defined.

I don't know how to make sense of the last clause there. Behavior
of C programs (as delineated by the ISO standard) falls into one of
four categories: undefined, unspecified, implementation-defined,
defined. Undefined behavior imposes no requirements. Defined
behavior is completely specified. Implementation-defined behavior
means the implementation must make a choice from a set of allowed
(and defined) behaviors, and then stick to that choice. Unspecified
behavior means there is a set of allowed, and defined behaviors,
and any of those behaviors may be adopted in any given instance.
Do you mean the ISO standard specifies only behavior, not how the
behavior is to be carried out? That is true, but that's the point
of language definition - to define the behavior of programs, not
how that behavior will be actualized.
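
To make the categories concrete, here is a small example of my own
(the particular constructs are just common illustrations, not taken
from anything discussed above):

#include <stdio.h>

static int counter = 0;
static int bump(void){ return ++counter; }

int main(void){
    /* defined: the result of 1 + 1 is completely specified */
    printf("%d\n", 1 + 1);

    /* implementation-defined: the result of right-shifting a negative
       value -- the implementation must document its choice and then
       stick to it */
    printf("%d\n", -8 >> 1);

    /* unspecified: the order in which the two bump() calls are
       evaluated -- either "1 2" or "2 1" is allowed, and no
       consistency is required from one instance to the next */
    printf("%d %d\n", bump(), bump());

    /* undefined: e.g. signed overflow (INT_MAX + 1) -- no requirements
       at all, which is why it is deliberately not exercised here */
    return 0;
}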

(I snipped most of the earlier context, thinking it's better
to respond to these comments standalone.)
Regarding floating point, I realise the IEEE folks made the rules
the way they are for reasons (presumably, but not necessarily, for
good technical reasons). And I understand that for some types of
work, it is more important that an implementation follows a given
set of rules exactly than what these rules actually are.

But for other types of work, you do want to treat floating point
as "mysteriously fuzzy". It's a "best effort" sort of thing. You
don't care about NaNs, denormals, rounding modes, etc. You just
want the best approximation you can reasonably get at reasonable
speed, and you accept that there can be rounding errors on the
last couple of digits on big calculations. When you want more
accurate results, you switch to "long doubles", when you want
faster results you use "single" (depending on the platform, of
course). You get used to oddities like "0.1 + 0.2" being not
quite equal to "0.3". And if you want to do complex work where
rounding, order of calculations, etc., can have a significant
effect (such as inverting a large matrix), you use a pre-written
library by someone who understands these details.

I believe that most floating point code, and most programmers,
fall into this category - and would get better results with
"-ffast-math" than with strict IEEE conformance.

I understand what you're saying. I don't agree with it. More to
the point though, you haven't offered any kind of evidence or
supporting arguments for why anyone else should agree with it.
Are you offering anything more than an opinion? Why should
anyone accept your opinion of how floating point should behave
over, say, the opinions of the IEEE 754 group? To offer an
analogy, your views sound like saying novice drivers shouldn't
bother with seatbelts. On the contrary, it is only people who
know what they are doing who should work outside what are thought
(by experts) to be best practices.
But of course I also think the compiler should be able to produce
code that follows the C and IEEE rules as closely as possible.

The question is what should the compiler do by default, in the
absence of explicit flags? Should it make "fast and easy" code to
suit most uses? Should it follow the C standards as closely as
possible? Should it find a reasonable middle ground, and make
everyone equally unhappy?

Most developers are better served by compilers that are completely
conforming rather than partially conforming. For floating-point,
the benefits of more portable and more reliable behavior far
outweigh the difference in performance or accuracy (taking 64-bit
doubles and 80-bit internals as representative). Don't take my
word for it -- I encourage anyone who is interested to do some
experiments measuring how different compiler options affect
accuracy and/or performance. (It's much harder to measure the
costs of using non-standard behavior. Often, though, the
effects in such cases are closer to "catastrophic" than "minor".)
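
(A minimal harness for such an experiment, of my own devising, might
look like the following; rebuild it with different option sets, say
-O2, -O2 -ffast-math, and -O2 -fexcess-precision=standard, and
compare both the printed sum and the elapsed time.)

#include <stdio.h>
#include <time.h>

int main(void){
    double sum = 0.0;
    clock_t t0 = clock();
    for(long i = 1; i <= 50000000L; i++)
        sum += 1.0 / (double)i;        /* partial sum of the harmonic series */
    clock_t t1 = clock();
    printf("sum = %.17g, seconds = %.3f\n",
           sum, (double)(t1 - t0) / CLOCKS_PER_SEC);
    return 0;
}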

Consequent to the above, I believe the community is better served
by compilers that are completely conforming -- not partially
conforming -- in their default mode. If someone wants to work
outside the standard language definition, then by all means, more
power to them, and I'm all in favor of compiler options to give a
variety of non-standard choices. But for the community as a
whole, it's better if the choice of using non-standard behavior
is the exception, not the rule.
 

glen herrmannsfeldt

Tim Rentsch said:
(snip)

(I snipped most of the earlier context, thinking it's better
to respond to these comments standalone.)
I understand what you're saying. I don't agree with it. More to
the point though, you haven't offered any kind of evidence or
supporting arguments for why anyone else should agree with it.
Are you offering anything more than an opinion?

Cray sold many machines for many millions of dollars that gave
approximate answers to many floating point operations.

They even sold one that had a non-commutative floating-point multiply.

Now, I suppose if you add up the value of all the IEEE 754 compatible
machines it will total more than all the Cray machines, but how many
are actually used for floating point number crunching?
Why should
anyone accept your opinion of how floating point should behave
over, say, the opinions of the IEEE 754 group? To offer an
analogy, your views sound like saying novice drivers shouldn't
bother with seatbelts. On the contrary, it is only people who
know what they are doing who should work outside what are thought
(by experts) to be best practices.

OK, how about the opinion of the market? See what people actually
buy, and assume that is related to what they want.

(snip)

-- glen
 

Tim Rentsch

glen herrmannsfeldt said:
Cray sold many machines for many millions of dollars that gave
approximate answers to many floating point operations.

They even sold one that had a non-commutative floating-point multiply.

Now, I suppose if you add up the value of all the IEEE 754 compatible
machines it will total more than all the Cray machines, but how many
are actually used for floating point number crunching?

Irrelevant to the point under discussion, which has to do with
conformance to the C standard, not IEEE 754 specifically (that
is an example, but only an example).
OK, how about the opinion of the market? See what people actually
buy, and assume that is related to what they want.

Based on that we should all use the Microsoft dog**** compilers.
No thanks.
 

glen herrmannsfeldt

(snip)
(snip, then I wrote)
Irrelevant to the point under discussion, which has to do with
conformance to the C standard, not IEEE 754 specifically (that
is an example, but only an example).

It isn't just that they aren't IEEE 754.

http://ed-thelen.org/comp-hist/CRAY-1-HardRefMan/CRAY-1-HRM.html

read especially about the floating point multiply and reciprocal
approximation. (There is no divide.)

Instead of generating the full product and rounding, they only generate
part of the product and round. Reasonably often there is no carry from
the part not generated, but not always.

As the product term truncation is not symmetrical, multiply might
not be commutative.

(There is only 64-bit single precision, with a 48-bit significand,
in hardware. Double precision with 95 bits is available in software.)

Reciprocal approximate is accurate to 30 bits. An additional refinement
step gets to 47 bits. As that is one less than the 48 bits available, it
seems that the last bit is sometimes wrong.
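
As I understand it, the refinement step is the usual Newton-Raphson
reciprocal iteration. Roughly, as an illustration of the idea in
ordinary C (not Cray code, and using double only for convenience):

/* One Newton step for a reciprocal: r' = r*(2 - d*r).  Each step
   roughly doubles the number of correct bits, which is how a
   ~30-bit approximation becomes a ~47-bit one. */
static double refine_reciprocal(double d, double approx){
    return approx * (2.0 - d * approx);
}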

-- glen
 

Tim Rentsch

glen herrmannsfeldt said:
(snip)
(snip, then I wrote)
Irrelevant to the point under discussion, which has to do with
conformance to the C standard, not IEEE 754 specifically (that
is an example, but only an example).

It isn't just that they aren't IEEE 754.

http://ed-thelen.org/comp-hist/CRAY-1-HardRefMan/CRAY-1-HRM.html

read especially about the floating point multiply and reciprocal
approximation. (There is no divide.) [snip elaboration]

Still irrelevant to the point under discussion: what matters is
not what the hardware does, but what C compilers do.

Incidentally, the Cray-1 came out before K&R was published,
which is to say 14 years before the first ISO C standard.

The Cray-1 was a marvel of design and floating-point performance
when it came out. But it isn't just a coincidence this approach
to doing floating-point has fallen by the wayside.
 

Seebs

However, for floating point types, including in particular
the type double, the register is wider than the place in
memory where 'tmp' resides. So the register can retain
extra information which could not be retained if the value
were put into the 'tmp' memory location. So re-using the
value in the register can behave differently than storing
and reloading.

There was a really lovely example of this I encountered recently. Someone
had come up with a test case which failed only on one or two CPUs at a
particular optimization level.

It ultimately turned out that the problem was that a test was computing
a value that was roughly equal to x / (x * 20) for some smallish x, thus, very
close to 0.05, then multiplying it by 200, which might or might not yield
10. Then this value was subtracted from 15 and converted to int. The result
was usually 5, but occasionally 4. The problem, it turns out, is that the
value was more than .05 by an amount small enough that it got rounded away
by the store.
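
Reconstructed from memory, it looked something like this (the
variable names and the particular value of x are mine, not the
original test's):

#include <stdio.h>

int main(void){
    double x = 3.0;                     /* "some smallish x" */
    double ratio = x / (x * 20.0);      /* mathematically 0.05, but it
                                           may round a hair above or
                                           below that */
    double scaled = ratio * 200.0;      /* might or might not be 10.0 */
    int result = (int)(15.0 - scaled);  /* truncates toward zero:
                                           usually 5, occasionally 4 */
    printf("%d\n", result);
    return 0;
}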

So if you changed it from 15 - N == 5 to 25 - N == 15, it "worked". And I
suspect that if the test case had been changed from 15 - N == 5 to 10 - N ==
0, it would have "failed" on many more systems.

But fundamentally, the test case was broken. This was completely obvious
once I'd spent an hour or so studying it.

-s
 

Tim Rentsch

David Brown said:
David Brown said:
On 25/04/13 07:58, Tim Rentsch wrote:
[substantial snipping]

Are you offering anything more than an opinion? [snip]

I offer only my opinion here - I have no direct evidence.
[snip elaboration]

I'm all for people offering their opinions (distinguished as
such). I do so myself. But people who offer only their opinion,
and nothing in the way of evidence or some sort of supporting
reasoning, just aren't very interesting, because they don't
really contribute anything to the discussion.
I agree on what you say about experts. /I/ can choose to use
"-ffast-math" with my code, and I can offer that as an
/opinion/ to other people. But you would not want me to decide
on the default policy on gcc compiler flags!

I'm pretty sure I agree with that last sentence. :)
Most developers are better served by compilers that are completely
conforming rather than partially conforming. For floating-point,
the benefits of more portable and more reliable behavior far
outweigh the difference in performance or accuracy (taking 64-bit
doubles and 80-bit internals as representative). Don't take my
word for it -- I encourage anyone who is interested to do some
experiments measuring how different compiler options affect
accuracy and/or performance. (It's much harder to measure the
costs of using non-standard behavior. Often, though, the
effects in such cases are closer to "catastrophic" than "minor".)

This all depends on the type of program and the type of target in
question. For the sorts of systems I work with, /enforcing/
conforming behaviour in floating point work would be catastrophic.
I work on embedded systems, and don't often use floating point. But
when I do use it, speed and code size is vital - following IEEE
standards regarding rounding, [snip elaboration]

You are using 'conforming' here in a way that makes me think you
don't understand the different ways the Standard uses the term.
A conforming _implementation_ is one that follows all the rules
laid out by the ISO standard. These rules allow great latitude;
they don't require IEEE floating point, for example. There are
two uses of 'conforming' as applied to programs. There is a very
restricted class of programs called 'strictly conforming'. A C
program being strictly conforming basically means it will produce
identical output on ANY conforming implementation. This class is
very narrow. In fact it is so narrow that I'm not sure it is
even possible to write a strictly conforming program that uses
floating-point in any non-trivial way, and if it is possible it
certainly isn't easy. At the other end of the spectrum is a
'conforming' program (ie, without the 'strictly' adverb). A
conforming program is one that is accepted by SOME conforming
implementation (not all, but at least one). This class is very
wide, and admits things that look almost nothing like C programs,
to say nothing about what their outputs might be. For example,
the program source

#define __PASCAL__ 1

... here the source continues written in the Pascal
programming language ...

could be a conforming C program. (It isn't, because no conforming
implementation accepts it, but an implementation could be written
that accepts it, and still is a conforming implementation.)

My comment above is about conforming _implementations_. Your
comment is about _programs_. These two things (ignoring the
semantic glitch around "conforming") are not incompatible.
For example, your program source could have

#pragma floating point ala David Brown

and then do floating-point operations however you want, yet still
be within the bounds of what a conforming implementation is allowed
to do. Do you see how important this difference is? My comment is
about the behavior of implementations, not the programs they
translate; we can use only conforming implementations without
being forced to use one floating-point system or another, or even
_any_ pre-defined set of floating-point systems.
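
One easy way to see how much latitude a conforming implementation has
in this area is simply to ask it. A small probe (assuming a C99
hosted implementation; the macros are all standard <float.h> ones):

#include <float.h>
#include <stdio.h>

int main(void){
    /* FLT_EVAL_METHOD: 0 = evaluate in the operand type, 1 = float
       promotes to double, 2 = everything promotes to long double,
       -1 = indeterminable */
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    printf("DBL_DIG = %d, LDBL_DIG = %d\n", DBL_DIG, LDBL_DIG);
    return 0;
}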
For more mainstream programming, I think it is common to treat
floating point as approximate numbers with a wide dynamic
range. For such applications, the differences in functionality
between strictly conforming floating point and "-ffast-math"
are going to be very minor - unless there are a lot of
calculations, in which case the speed difference might be
relevant. (But as noted above, this is just an opinion.)

Here again I think you are confusing the ideas of an implementation
being conforming and a program being "conforming". Depending on IEEE
floating-point (not counting depending on it in a way that makes no
difference) is guaranteed to make a program NOT be 'strictly
conforming", as the Standard defines the term. An implementation can
be conforming yet still offer a wide variety of floating-point
systems.
Obviously there are other types of application for which
non-conforming behaviour could be catastrophic. If there were
no need of the full set of IEEE floating point rules, then the
rules would not be there in the first place.

Now that I've written out this more careful differentiation of
how "conforming" is used, it might be helpful to go back and
re-read the earlier comments. Probably that will clear up
most of the confusion but if some further thoughts come up
I'd be interested to hear them.
 
