Only sometimes; and it's a valid optimization.
No, that is completely wrong. Valid optimizations must not break correct
code as -ffast-math does. This is why -ffast-math is not enabled by any -On
setting. Read the documentation.
Specifically, in this case, the results are identical.
They may appear identical on your machine with your compiler but that is
certainly not true in general. In particular, the -ffast-math option breaks
this program on my computer, giving incorrect results.
Mostly, in my experience, you start
to lose precision with -ffast-math when you start doing things beyond
simple arithmetic, such as sqrt() and cos(), or when you get into the
realm of overflows and NaNs.
In case anybody is curious, the Intel compiler yields results similar to
VS and to GCC with SSE3 enabled (but no -ffast-math), which are the
expected results:
icl /Ox /QxP /Qipo /Qunroll-aggressive smooth.cpp
That took about 7400 ms for me. With:
icl /Ox /QxP /Qipo /Qprec-div- /Qunroll-aggressive smooth.cpp
the time dropped to 1100 ms (ICC's /Qprec-div- is similar in spirit to
GCC's -ffast-math).
Following are 3 source files and a Makefile. I used MinGW GCC 3.4.5.
You will want to implement your own tick()/tock() functions; the
windows.h #include is only for those. The output, for me, is:
$ ./smooth.exe
no -ffast-math: 8796.27
-ffast-math: 923.052
1e-014
delta: 0
They are precisely equal.
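
For anyone without windows.h, here is a minimal sketch of portable
tick()/tock() replacements. These are not the original helpers: the
signatures, the millisecond return value and the use of std::chrono
(which needs a C++11 compiler, so not GCC 3.4.5) are all assumptions.

```cpp
#include <chrono>

// Hypothetical stand-ins for the tick()/tock() helpers mentioned above.
// The originals relied on windows.h and are not reproduced here.
namespace {
std::chrono::steady_clock::time_point g_start;
}

// Record the current time as the start of a measurement.
void tick() {
    g_start = std::chrono::steady_clock::now();
}

// Return the number of milliseconds elapsed since the last tick().
double tock() {
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - g_start;
    return elapsed.count();
}
```

Call tick() just before the loop being timed and print tock() right
after it. steady_clock is used rather than system_clock so that clock
adjustments during the run cannot distort the measurement.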
Compiling with -ffast-math makes 25% of the results incorrect on this
machine (AMD Athlon(tm) 64 X2 Dual Core Processor 4400+, g++ 4.2.3):
$ g++ -O3 test1.cpp -o test1
$ ./test1 >output.txt
$ g++ -O3 -ffast-math test1.cpp -o test1
$ ./test1 >output2.txt
$ diff output.txt output2.txt | wc
21242 42480 615958
Note: I replaced "rand()" in "fill" with "i" to make the program
deterministic.
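
A sketch of what that change amounts to. The signature of fill() here
is hypothetical, since the real function is not reproduced in this
message; only the substitution of "i" for "rand()" comes from the note
above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical signature: the real fill() from the test program is not
// shown here. Each element is set to its own index, so every run (and
// every build) starts from exactly the same input.
void fill(std::vector<double>& data) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = static_cast<double>(i);  // previously rand()
    }
}
```

With a fixed input, any difference between the two output files can
only come from the compiler flags.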
Here are some of the differing results (the five correct results first,
then the corresponding five from the -ffast-math build):
49361.00000000000000000000
49362.00000000000000000000
49363.00000000000000000000
49364.00000000000000000000
49365.00000000000000000000
49360.99999999998544808477
49361.99999999997817212716
49362.99999999995634425431
49363.99999999992724042386
49364.99999999989813659340
As you can see, enabling -ffast-math really does break this program. As I
said, this is not a valid optimization.
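
To illustrate how results can change at all, here is a small sketch of
one transformation that -ffast-math permits: regrouping floating-point
additions. I am not claiming this is what changed the output of
test1.cpp; the function names and input values are made up for the
example. The volatile temporaries pin each grouping in place so that
both orderings can be compared in a single build.

```cpp
// Add three values as (a + b) + c. The volatile temporary stops the
// compiler from regrouping or folding the expression.
double sum_left(double a, double b, double c) {
    volatile double t = a + b;
    return t + c;
}

// Add the same three values as a + (b + c).
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left returns 1 (a and b
cancel exactly, then 1 is added), while sum_right returns 0 (the 1 is
far below the rounding granularity of -1e30 and is lost before the
cancellation). The two are equal in real arithmetic but not in
floating point, so a compiler allowed to pick either grouping can
change the answer.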