c compilation - gcc vs visual c

kumarchi · May 9, 2008

hello:

I recently compiled a numerically intensive C project under Cygwin gcc
3.4.4 and Microsoft Visual C. The platform is an Intel T2400 1.83 GHz
dual-core laptop.

the numerical stuff is both floating point and integer intensive

With gcc, the optimized build (-O3) was about 30% faster than the
non-optimized build.

With Visual C 2005, the optimized build (the standard 'Release' setting)
vs the non-optimized one ('build') was a whopping 8x performance gain.

but the most surprising thing was that the optimized Visual C build was
2x faster than the optimized gcc build.

Is anybody else seeing the same thing? If this is true, the Microsoft C
compiler is in a different league altogether.

I have not been successful so far compiling under MinGW. Will it make a dent?
2x is hard to overcome.

jacob navia · May 9, 2008

hello:

I recently compiled a numerically intensive C project under Cygwin gcc
3.4.4 and Microsoft Visual C. The platform is an Intel T2400 1.83 GHz
dual-core laptop.

the numerical stuff is both floating point and integer intensive

With gcc, the optimized build (-O3) was about 30% faster than the
non-optimized build.

With gcc, the higher the optimization level, the slower the code
often gets. Use -O2.

With Visual C 2005, the optimized build (the standard 'Release' setting)
vs the non-optimized one ('build') was a whopping 8x performance gain.

This is because the non-optimized build injects a lot of
checking code to catch bugs, so it is slower than strictly necessary.
For instance, it checks at function exit whether the stack was overwritten.

but the most surprising thing was that the optimized Visual C build was
2x faster than the optimized gcc build.

The Intel compiler is even better than Microsoft's. It is the best compiler
for the Intel architecture. Period.

Is anybody else seeing the same thing? If this is true, the Microsoft C
compiler is in a different league altogether.

Obviously Microsoft leaves gcc far behind, and I have been seeing this
for at least 8-9 years.

I have not been successful so far compiling under MinGW. Will it make a dent?
2x is hard to overcome.

MinGW is just gcc using the Microsoft run-time library. Do not expect
anything better or worse.

It is not surprising that gcc is slower than Microsoft, since the
people behind each project have vastly different objectives and
budgets to implement them.

Gcc runs on many platforms and architectures.
Microsoft runs on one platform exclusively.

Gcc implements standards like C99 (modulo small problems); Microsoft
implements only Microsoft environments (.NET, etc.). Microsoft is
still at C89 level.

Ulrich Eckhardt · May 9, 2008

jacob said:
Microsoft runs on one platform exclusively.

Sorry, but that's untrue. The platforms I know are IA32, Intel's and AMD's
64 bit platforms, MIPS, ARM, SH and maybe some more. Note that the latter
are used for MS' embedded platform.

Uli

micans · May 9, 2008

kumarchi said:
hello:

I recently compiled a numerically intensive C project under Cygwin gcc
3.4.4 and Microsoft Visual C. The platform is an Intel T2400 1.83 GHz
dual-core laptop.

the numerical stuff is both floating point and integer intensive

With gcc, the optimized build (-O3) was about 30% faster than the
non-optimized build.

It's probably not going to help you, but as a point of interest I have
sometimes found -Os (optimize for size) to work better with gcc.

Stijn

kumarchi · May 9, 2008

The problem is that for my type of product I have to recommend the Windows
platform, because based on gcc's performance an apples-to-apples Linux
platform will run 2x slower.

moi · May 9, 2008

The problem is that for my type of product I have to recommend the Windows
platform, because based on gcc's performance an apples-to-apples Linux
platform will run 2x slower.

Did you enable the -march=cpu-type and -msse[2] code generation options?

HTH,
AvK

Ian Collins · May 9, 2008

The problem is that for my type of product I have to recommend the Windows
platform, because based on gcc's performance an apples-to-apples Linux
platform will run 2x slower.

Then try the Intel compiler, which is also cross platform.

jacob navia · May 9, 2008

The problem is that for my type of product I have to recommend the Windows
platform, because based on gcc's performance an apples-to-apples Linux
platform will run 2x slower.

You can use the Intel compiler under Linux. Your code will be
faster than under Windows/MSVC.

Of course do not tell your customers about Intel/Windows.

Antoninus Twink · May 9, 2008

The problem is that for my type of product I have to recommend the Windows
platform, because based on gcc's performance an apples-to-apples Linux
platform will run 2x slower.

If speed is that important to you, why don't you hand-optimize the
assembly?

cr88192 · May 9, 2008

hello:

I recently compiled a numerically intensive C project under Cygwin gcc
3.4.4 and Microsoft Visual C. The platform is an Intel T2400 1.83 GHz
dual-core laptop.

the numerical stuff is both floating point and integer intensive

With gcc, the optimized build (-O3) was about 30% faster than the
non-optimized build.

With Visual C 2005, the optimized build (the standard 'Release' setting)
vs the non-optimized one ('build') was a whopping 8x performance gain.

but the most surprising thing was that the optimized Visual C build was
2x faster than the optimized gcc build.

likely reason:
MS focuses a lot more on specific optimizations, and on tweaking performance
for specific targets;
gcc, however, supports many targets, and tends to use far more generic code
generation (they try more to leverage fancy general-purpose optimizations,
rather than arch-specific tweaks for various special cases).

in any case, gcc fairly often produces fairly silly code (even with
optimizations). sadly, even with a very braindead lower-compiler design
(a hacked-over stack machine) and optimizations focusing more on
"common special cases", it is not too hard to match or somewhat exceed gcc's
performance...

IMO, the 'O' options may well be Obfuscate rather than Optimize...

actually, one of the better ways of optimizing would likely be to implement
a kind of abstract combinatorial tester, which would basically search the
space of possible optimizations and look for the ones with the lowest
simulated cost. sadly though, this will not work so well in the face of
usage patterns, which require actually running the code (the general option
could treat a very common case like an uncommon case, ...).

in something like a VM, it could be possible to use a kind of genetic
evolver for adapting functions (initially, it compiles functions
generically; for any function it detects is using a significant portion of
the time, it starts mutating it in an attempt to improve the general
performance). later, if/when a "final" version is desired, it uses the
versions of the functions found to be most effective.

note that this would likely be confined to the realm of low-level
optimization, with what are typically the biggest time wasters (general
algorithmic issues) being beyond the scope of such a tool...

the simplest approach, however (and the one I currently use in my compiler),
is to basically just test the compiler, and to focus on fixing any obvious
issues in the output (silly code).

the compiler machinery itself in my case, at this level, is little more than
a very large and elaborate mass of decision trees (no fancy transforms
or general optimizer machinery, more just operations dispatched through a
maze of function calls).

this approach seems to work well enough IME...

Is anybody else seeing the same thing? If this is true, the Microsoft C
compiler is in a different league altogether.

that, or most of us are not that concerned with raw performance (vs having
a compiler we are not obligated to pay for...).

nonetheless, MS has at least a decent compiler in these regards...

I have not been successful so far compiling under MinGW. Will it make a dent?
2x is hard to overcome.

well, with gcc, it is hard to do much better...

as noted, MSVC and Intel are good options...

kumarchi · May 10, 2008

Thanks to all of you for responding. I was totally unprepared for such a
vast performance difference (2x, MSVC vs gcc), and my code is not at all
special (no UI, no complicated classes, etc.). It simply does lots of
floating-point array operations (mainly through FFTs) and normal integer
operations.

I used the -O3 flag, and in my case so far it seems to be better than -O2.
MSVC by default uses its own O2.

I cannot believe such a blatant difference will go unnoticed for long.

In our situation the Intel compiler is not an option, primarily
because we have a standardized DLL plug-in architecture, so
"if Windows, use MSVC" applies.

kumarchi · May 10, 2008

The one silver lining in this affair is that, because my code is simple, it
should be possible for a compiler guru to zero in on the fundamental
issues and fix gcc.

MSVC cannot have some voodoo magic on simple code like that.

Pardon my layman's understanding!!

cr88192 · May 10, 2008

Niz said:
You're using a version of GCC which is 3 years old. Perhaps try one of
the newer versions with all the relevant optimisations and then see if the
difference is still so great. Macs have GCC 4.0.1 (soon to be 4.2.2) as their
base compiler under Leopard.

sadly, I don't expect gcc to be steadily and rapidly picking up efficiency.
this is by no means a new project, so any performance improvement
is likely to be fairly minor.

Although as others have said, the Intel compiler is king of the hill for
producing fast Intel x86 and x86_64 code.

wonder why that is?...

I think:
they make CPUs, so they have more than a few good ideas for how to optimize
them;
they only have to worry about a select few archs (x86, x86-64, and IA64), of
which, they can likely get away using very optimized backends (namely: a
specialized backend for each arch);
they get good money for all this, and have plenty of funds to devote;
....

so, it would be saying a lot if they did not have a compiler which produced
good output...

meanwhile, as for gcc:
it is written by people with apparently more than a few weird ideas WRT
processor efficiency;
it has to target many archs, and use most of the same machinery between a
variety of them;
a good portion of the developers are hobbyists (not that hobbyists can't be
motivated, but many of them likely have other concerns as well);
....

nonetheless, they are doing pretty well, and gcc is still a fairly good
compiler...

Willem · May 10, 2008

kumarchi wrote:
) Thanks to all of you for responding. I was totally unprepared for such a
) vast performance difference (2x, MSVC vs gcc), and my code is not at all
) special (no UI, no complicated classes, etc.). It simply does lots of
) floating-point array operations (mainly through FFTs) and normal integer
) operations.
)
) I used the -O3 flag, and in my case so far it seems to be better than -O2.
) MSVC by default uses its own O2.

Have you tried gcc's -march=native setting, along with possibly
-msse2 and/or -mfpmath=sse, or some other i386-specific settings ?

Normally, gcc will compile a binary so that it will run on any i386,
not only the current machine.

) I cannot believe such a blatant difference will go unnoticed for long.
)
) In our situation the Intel compiler is not an option, primarily
) because we have a standardized DLL plug-in architecture, so
) "if Windows, use MSVC" applies.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Chris H · May 10, 2008

jacob navia said:
It is not surprising that gcc is slower than Microsoft, since the
people behind each project have vastly different objectives and
budgets to implement them.

Gcc runs on many platforms and architectures.
Microsoft runs on one platform exclusively.

So Gcc is not likely to be good on any platform as the people who
develop for other platforms will specialise in that and beat GCC?

Gcc implements standards like C99

CRAP

GCC implements GNU-c and adds extensions for SOME parts of C99

(modulo small problems); Microsoft
implements only Microsoft environments (.NET, etc.). Microsoft is
still at C89 level.

Microsoft along with every other compiler INCLUDING GCC is at C95 with
SOME parts of C99 implemented.

GCC is no more C99 than any other compiler.

Walter Roberson · May 10, 2008

Chris H said:
Microsoft along with every other compiler INCLUDING GCC is at C95 with
SOME parts of C99 implemented.

GCC is no more C99 than any other compiler.

In past posts, people have said that Comeau's compiler combined with
the Dinkumware libraries is true C99. Certainly dinkumware.com
advertises their library as being fully conforming to standard C99.

jacob navia · May 10, 2008

Chris said:
So Gcc is not likely to be good on any platform as the people who
develop for other platforms will specialise in that and beat GCC?

This is very likely indeed. A specialized compiler for a given platform
has fewer problems and can take advantage of many particular
optimizations that a more general-purpose compiler can't use.

CRAP

GCC implements GNU-c and adds extensions for SOME parts of C99

Apparently you can't just say

"I disagree". No. You have to yell

CRAP!!!

Obviously you are right with "Some parts of C99" but those "some parts"
are almost 99% of the job...

Microsoft along with every other compiler INCLUDING GCC is at C95 with
SOME parts of C99 implemented.

I disagree. Microsoft has made no effort at all to implement C99. The only
parts they did were // comments and accepting "long long". I am not aware
of any other parts of C99 that they implement.

GCC is no more C99 than any other compiler.

It is more advanced in its implementation of C99 than lcc-win.

kumarchi · May 10, 2008

guys:
I have zeroed in and created a simple test program. This program just
has floating-point addition and integer addition. It does 20 loops x
1 million times. In the release version of Visual C it takes 0 time; gcc -O3
takes 6 secs on my machine.

This cannot be rocket science; there seems to be some fundamental
deficiency in gcc. I will treat this as a bug. This should have
serious implications for Linux platforms.

Here is the code; test it for yourself.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Runs the floating-point and integer arithmetic for i = 1 .. times-1
   and returns the last floating-point result. */
static double loop (long times)
{
    long i = 0;
    double a = 0;

    for (i = 1; i < times; i++)
    {
        double x1 = i - 1;
        double x2 = i;
        double y = 0;
        long n = 0;

        y = x1 + x2;
        n = i + i - 1;

        y = x1 * x2 * y;

        a = y;
    }

    return a;
}

int main (int argc, char **argv)
{
    long times = 0;   /* long, to match the cast below and the %ld in printf */
    long i = 0;
    time_t t = 0;
    time_t t1 = 0;
    double dt = 0;
    long lcnt = 20;
    double a = 0;

    times = (long) (1e9);

    /*
    if (argc > 1)
    {
        times = atoi (argv[1]);

        times *= 1e6;
    }

    if (argc > 2)
        lcnt = atoi (argv[2]);

    if (lcnt < 20)
        lcnt = 20;
    */
    time (&t);

    /* use lcnt here so the loop count printed below is the one actually run */
    for (i = 0; i < lcnt; i++)
    {
        a = loop (times);
        /* you need this for visual c to show any elapsed time
        printf ("\n %g \n", a);
        */
    }

    time (&t1);

    dt = difftime (t1, t);

    printf ("\n times=%ld loops=%ld dtime = %g \n", times, lcnt, dt);

    exit (0);
}

moi · May 10, 2008

guys:
I have zeroed in and created a simple test program. This program just
has floating-point addition and integer addition. It does 20 loops x
1 million times. In the release version of Visual C it takes 0 time; gcc -O3
takes 6 secs on my machine.

This cannot be rocket science; there seems to be some fundamental
deficiency in gcc. I will treat this as a bug. This should have serious
implications for Linux platforms.

Here is the code; test it for yourself.

With gcc 4.1.2 and -O3 on a 686,
the whole function is inlined and eliminated,
leading to:

$ time ./a.out
times=1000000000 loops=20 dtime = 0

real 0m0.001s
user 0m0.000s
sys 0m0.003s
$
In this case, there is no difference in the generated code when
-march=i686 -msse2 are added to the -O3 flag.

I guess you'll have to invent a better benchmark.

AvK

Sean G. McLaughlin · May 10, 2008

I have zeroed in and created a simple test program. This program just
has floating-point addition and integer addition. It does 20 loops x
1 million times. In the release version of Visual C it takes 0 time; gcc -O3
takes 6 secs on my machine.

Here, "cc -O2 try.c" resulted in dtime=0. But "cc try.c" took dtime=280.
Mind you, here "cc" is GCC 4.2.3.

This cannot be rocket science; there seems to be some fundamental
deficiency in gcc. I will treat this as a bug.

Be a good idea to raise this issue with the GCC developers, in that case.
