Don said:
On another issue, the code was adapted from Stroustrup's fine article
"Learning Standard C++ as a New Language", and running the C code against
the C++ code, my average running times on the program were as follows:
C version:
Unoptimized: 25 secs.
Optimized: 26 secs.
C++ version:
Unoptimized: 75 secs.
Optimized: 35 secs.
Take these numbers with a grain of salt. A number of other optimizations
should be considered. For example, when both the C and C++ versions are
linked statically on amd64, the times are nearly the same.
AMD Athlon(tm) MP 2400+
gcc version 4.0.0 20050102
(user time in seconds)
  C++
    Unoptimized: 11.4
    Optimized:    7.0
  C
    Unoptimized:  8.6
    Optimized:    7.6
model name : AMD Opteron(tm) Processor 248
gcc-3.4.2 amd64
(user time in seconds)
  C++
    Unoptimized:  6.9
    Optimized:    3.4
  C
    Unoptimized:  3.8
    Optimized:    2.9
model name : AMD Athlon(tm) MP 2400+
stepping : 1
cpu MHz : 2000.085
gcc version 4.0.0 20050102 (experimental)
$ text_rdr_mkr #integers
Enter a number: 5000000
$ g++ -O0 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 48.9781
11.430u 0.240s 0:11.66 100.0% 0+0k 0+0io 262pf+0w
$ g++ -O2 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 48.9781
7.040u 0.270s 0:07.30 100.1% 0+0k 0+0io 259pf+0w
$ g++ -O3 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 48.9781
6.900u 0.320s 0:07.19 100.4% 0+0k 0+0io 259pf+0w
$ gcc -O0 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 88
8.560u 0.180s 0:08.74 100.0% 0+0k 0+0io 130pf+0w
$ gcc -O2 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 88
7.630u 0.240s 0:07.89 99.7% 0+0k 0+0io 130pf+0w
$ gcc -O3 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 88
7.580u 0.280s 0:07.85 100.1% 0+0k 0+0io 130pf+0w
model name : AMD Opteron(tm) Processor 248
stepping : 10
cpu MHz : 2191.059
gcc-3.4.2
$ g++ -O0 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 49.0195
6.866u 0.130s 0:07.01 99.7% 0+0k 0+0io 0pf+0w
$ g++ -O2 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 49.0195
3.365u 0.105s 0:03.47 99.7% 0+0k 0+0io 0pf+0w
$ g++ -O3 -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 49.0195
3.342u 0.105s 0:03.44 100.0% 0+0k 0+0io 0pf+0w
$ gcc -O0 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 48
3.787u 0.084s 0:03.87 99.7% 0+0k 0+0io 0pf+0w
$ gcc -O2 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 48
2.908u 0.076s 0:02.98 99.6% 0+0k 0+0io 0pf+0w
$ gcc -O3 -o text_rdr_c text_rdr_c.c
$ time text_rdr_c
number of elements = 5000000, median = 49, mean = 48
2.908u 0.080s 0:02.99 99.6% 0+0k 0+0io 0pf+0w
Other optimizations
32bit
$ g++ -fPIC -O3 -finline-limit=5000 -static -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 48.9781
5.240u 0.270s 0:05.50 100.1% 0+0k 0+0io 102pf+0w
$ gcc -fPIC -O3 -finline-limit=5000 -static -o text_rdr text_rdr_c.c
$ time ./text_rdr_c
number of elements = 5000000, median = 49, mean = 88
7.740u 0.250s 0:07.97 100.2% 0+0k 0+0io 130pf+0w
64bit
$ g++ -fPIC -O3 -finline-limit=5000 -static -o text_rdr text_rdr.cpp
$ time ./text_rdr
Number of elements = 5000000, median = 49, mean = 49.0195
3.007u 0.106s 0:03.11 99.6% 0+0k 0+0io 0pf+0w
$ gcc -fPIC -O3 -finline-limit=5000 -static -o text_rdr text_rdr_c.c
$ time ./text_rdr_c
number of elements = 5000000, median = 49, mean = 48
3.107u 0.105s 0:03.21 99.6% 0+0k 0+0io 0pf+0w
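To put "the times are the same" into numbers, the user times from the transcripts can be compared as C++/C ratios. The sketch below is only an illustration: `Pair`, `kPairs`, `cpp_to_c_ratio` and `roughly_equal` are names invented here, and the 5% tolerance mentioned afterwards is an arbitrary choice. The static C figures deserve extra caution, since those transcripts write the binary to text_rdr but time ./text_rdr_c.

```cpp
#include <cmath>

// One C++/C pair of user times in seconds, copied from the transcripts above.
struct Pair {
    const char* label;
    double cpp_user;
    double c_user;
};

// "static" is shorthand for the -fPIC -O3 -finline-limit=5000 -static runs.
const Pair kPairs[] = {
    {"32bit -O3",    6.900, 7.580},
    {"32bit static", 5.240, 7.740},
    {"64bit -O3",    3.342, 2.908},
    {"64bit static", 3.007, 3.107},
};

// C++ user time divided by C user time; below 1.0 means the C++ build was faster.
double cpp_to_c_ratio(const Pair& p) {
    return p.cpp_user / p.c_user;
}

// True when the ratio is within `tolerance` of 1.0, i.e. the two
// times differ by no more than that fraction of the C time.
bool roughly_equal(const Pair& p, double tolerance) {
    return std::fabs(cpp_to_c_ratio(p) - 1.0) <= tolerance;
}
```

With a tolerance of 0.05, only the 64bit static pair counts as equal (ratio about 0.97). The plain 64bit -O3 pair comes out near 1.15 in favour of C, and the 32bit static pair near 0.68 in favour of C++.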