(e-mail address removed) <[email protected]> wrote:
Thanks. Useful to compare our results.
Adding results for Chris Thomasson's code, from pastebin as referenced
in Message-ID: <[email protected]>
I've done the same and I get very different results for Chris's code
and for yours. Yours is the fastest on long strings, presumably
because you use strstr. Chris's is next fastest on long strings.
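For anyone following along, here is a minimal sketch of what a strstr-based replace can look like. It is not any poster's actual code; the name strstr_replace and the two-pass count-then-copy layout are my own assumptions, chosen only to show why leaning on the library's strstr can pay off on long inputs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration, not code from this thread.
   Returns a malloc'd copy of s with every occurrence of old replaced
   by rep, or NULL if allocation fails. The caller frees the result. */
char *strstr_replace(const char *s, const char *old, const char *rep)
{
    assert(s != NULL && old != NULL && rep != NULL);

    size_t slen = strlen(s);
    size_t oldlen = strlen(old);
    size_t replen = strlen(rep);

    /* An empty pattern matches nowhere useful: return a plain copy. */
    if (oldlen == 0) {
        char *copy = malloc(slen + 1);
        if (copy != NULL)
            memcpy(copy, s, slen + 1);
        return copy;
    }

    /* Pass 1: count non-overlapping matches so the output can be
       allocated once. */
    size_t count = 0;
    for (const char *p = s; (p = strstr(p, old)) != NULL; p += oldlen)
        count++;

    size_t outlen = slen - count * oldlen + count * replen;
    char *out = malloc(outlen + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy the text between matches, then the replacement. */
    char *dst = out;
    const char *src = s;
    const char *hit;
    while ((hit = strstr(src, old)) != NULL) {
        size_t n = (size_t)(hit - src);
        memcpy(dst, src, n);
        dst += n;
        memcpy(dst, rep, replen);
        dst += replen;
        src = hit + oldlen;
    }
    strcpy(dst, src); /* tail after the last match, including the NUL */
    return out;
}
```

Both passes spend most of their time inside strstr and memcpy, which the C library is free to optimise heavily. That may be part of why this style does well on the 4004-byte inputs, though it scans the input twice.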
Yes, though it is reasonably predictable. For example, Chris's
code is good for long strings and slow for short ones.
I used -O3 throughout. I'll post the results with -O2 as well, but in my
case I get faster times in all cases using -O3 (gcc 4.4.1).
<snip>
I assume these results are for 4004 byte long strings with 2 short
replacements. If so, here are my comparative timings:
thomasson (O2) 4.08 seconds
cmt_replace(`cat d4004`, "[]", "xx"):
4162900 calls in 4.000s is 960.8ns/call (9.608e-07s/call)
thomasson (O3) 4.08 seconds
cmt_replace(`cat d4004`, "[]", "xx"):
4202100 calls in 4.000s is 951.9ns/call (9.519e-07s/call)
fast_replace(`cat d4004`, "[]", "xx"):
1133100 calls in 4.000s is 3.53µs/call (3.53e-06s/call)
fast_replace(`cat d4004`, "[]", "xx"):
1143900 calls in 4.000s is 3.497µs/call (3.497e-06s/call)
You get little difference between Chris's code and mine, but I get a
huge factor. Could just be a difference in hardware or cache/memory
configuration.
rh_replace2(`cat d4004`, "[]", "xx"):
515300 calls in 4.000s is 7.763µs/call (7.763e-06s/call)
rh_replace2(`cat d4004`, "[]", "xx"):
517900 calls in 4.000s is 7.723µs/call (7.723e-06s/call)
io_x (O2) 18.05 seconds [1]
io_x (O3) **** seconds (segfault) [2]
nilges (O2) 7.72 seconds
en_replace(`cat d4004`, "[]", "xx"):
499900 calls in 4.000s is 8.002µs/call (8.002e-06s/call)
en_replace(`cat d4004`, "[]", "xx"):
486300 calls in 4.000s is 8.226µs/call (8.226e-06s/call)
w_replace(`cat d4004`, "[]", "xx"):
492100 calls in 4.001s is 8.13µs/call (8.13e-06s/call)
w_replace(`cat d4004`, "[]", "xx"):
493100 calls in 4.000s is 8.112µs/call (8.112e-06s/call)
[2] These results may be misleading -- I wasn't sure how to
generate an executable from the mix of C and assembly and more or
less tried things until I got a clean compile/link, with:
nasm -felf -o replacer.o replacer.s
gcc -o tester -Wall -pedantic -std=c99 -On tester.c replacer.o
(n=2 or n=3)
blmblm-1-C (O2) 10.78 seconds
blmblm-1-user (O2) 35.97 seconds
blmblm-2-C (O2) 9.60 seconds
blmblm-2-user (O2) 35.10 seconds
blmblm-3-C (O2) 7.86 seconds
blm_replace(`cat d4004`, "[]", "xx"):
3162500 calls in 3.998s is 1.264µs/call (1.264e-06s/call)
blm_replace(`cat d4004`, "[]", "xx"):
3212200 calls in 3.998s is 1.245µs/call (1.245e-06s/call)
Again a difference. Your code is super fast, at least on my hardware.
Full results using -O2 this time and the same set of tests to see
variation in times:
blm_replace(`cat d4004`, "[]", "xx"):
3162500 calls in 3.998s is 1.264µs/call (1.264e-06s/call)
rh_replace2(`cat d4004`, "[]", "xx"):
516300 calls in 4.000s is 7.748µs/call (7.748e-06s/call)
cmt_replace(`cat d4004`, "[]", "xx"):
4181100 calls in 4.000s is 956.6ns/call (9.566e-07s/call)
w_replace(`cat d4004`, "[]", "xx"):
492100 calls in 4.001s is 8.13µs/call (8.13e-06s/call)
en_replace(`cat d4004`, "[]", "xx"):
499900 calls in 4.000s is 8.002µs/call (8.002e-06s/call)
fast_replace(`cat d4004`, "[]", "xx"):
1140500 calls in 4.000s is 3.507µs/call (3.507e-06s/call)
blm_replace(`cat d4004`, "{}", "xx"):
3774700 calls in 3.999s is 1.06µs/call (1.06e-06s/call)
rh_replace2(`cat d4004`, "{}", "xx"):
518400 calls in 4.000s is 7.716µs/call (7.716e-06s/call)
cmt_replace(`cat d4004`, "{}", "xx"):
4253000 calls in 4.000s is 940.5ns/call (9.405e-07s/call)
w_replace(`cat d4004`, "{}", "xx"):
495700 calls in 4.000s is 8.07µs/call (8.07e-06s/call)
en_replace(`cat d4004`, "{}", "xx"):
511000 calls in 4.000s is 7.829µs/call (7.829e-06s/call)
fast_replace(`cat d4004`, "{}", "xx"):
1040400 calls in 4.000s is 3.844µs/call (3.844e-06s/call)
blm_replace(`cat wap.txt`, "and", "xx"):
200 calls in 5.107s is 2.553e+04µs/call (0.02553s/call)
rh_replace2(`cat wap.txt`, "and", "xx"):
400 calls in 4.125s is 1.031e+04µs/call (0.01031s/call)
cmt_replace(`cat wap.txt`, "and", "xx"):
200 calls in 4.542s is 2.271e+04µs/call (0.02271s/call)
w_replace(`cat wap.txt`, "and", "xx"):
400 calls in 4.273s is 1.068e+04µs/call (0.01068s/call)
en_replace(`cat wap.txt`, "and", "xx"):
400 calls in 4.915s is 1.229e+04µs/call (0.01229s/call)
fast_replace(`cat wap.txt`, "and", "xx"):
400 calls in 3.420s is 8550µs/call (0.00855s/call)
blm_replace(`cat wap.txt`, "ZZZ", "xx"):
200 calls in 4.479s is 2.24e+04µs/call (0.0224s/call)
rh_replace2(`cat wap.txt`, "ZZZ", "xx"):
600 calls in 3.821s is 6369µs/call (0.006369s/call)
cmt_replace(`cat wap.txt`, "ZZZ", "xx"):
200 calls in 4.383s is 2.192e+04µs/call (0.02192s/call)
w_replace(`cat wap.txt`, "ZZZ", "xx"):
500 calls in 3.379s is 6758µs/call (0.006758s/call)
en_replace(`cat wap.txt`, "ZZZ", "xx"):
600 calls in 3.945s is 6576µs/call (0.006576s/call)
fast_replace(`cat wap.txt`, "ZZZ", "xx"):
1000 calls in 4.105s is 4105µs/call (0.004105s/call)
blm_replace("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
6237100 calls in 3.999s is 641.2ns/call (6.412e-07s/call)
rh_replace2("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
25080600 calls in 4.000s is 159.5ns/call (1.595e-07s/call)
cmt_replace("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
7199800 calls in 4.000s is 555.6ns/call (5.556e-07s/call)
w_replace("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
21167000 calls in 4.000s is 189ns/call (1.89e-07s/call)
en_replace("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
10014700 calls in 4.000s is 399.4ns/call (3.994e-07s/call)
fast_replace("abzzefzzijlmzzpqrzzuvzzyz", "zz", "xx"):
18143800 calls in 4.000s is 220.5ns/call (2.205e-07s/call)
blm_replace("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
33012700 calls in 3.999s is 121.1ns/call (1.211e-07s/call)
rh_replace2("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
35563900 calls in 4.000s is 112.5ns/call (1.125e-07s/call)
cmt_replace("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
18117800 calls in 4.000s is 220.8ns/call (2.208e-07s/call)
w_replace("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
40621400 calls in 4.000s is 98.47ns/call (9.847e-08s/call)
en_replace("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
26174300 calls in 4.000s is 152.8ns/call (1.528e-07s/call)
fast_replace("abcdefghijlmnopqrstuvwxyz", "zz", "xx"):
41656400 calls in 4.000s is 96.02ns/call (9.602e-08s/call)
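
For reference, figures of the form "N calls in T s is X/call" can be produced by a loop along the following lines. This is only a sketch under my own assumptions, not the harness that generated the numbers above: the names time_calls, report and sample_work are hypothetical, the batch size of 100 is a guess based on the call counts all being multiples of 100, and clock() measures CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

typedef void (*bench_fn)(void);

/* Hypothetical stand-in workload; a real test would call the replace
   function under test and free its result. */
static volatile unsigned long sink;
void sample_work(void) { sink++; }

/* Call fn in batches of `batch` until at least `budget` seconds of CPU
   time have elapsed. Returns the number of calls made and stores the
   measured time in *elapsed. */
unsigned long time_calls(bench_fn fn, double budget,
                         unsigned long batch, double *elapsed)
{
    assert(fn != NULL && batch > 0 && elapsed != NULL);

    unsigned long calls = 0;
    double secs;
    clock_t start = clock();

    do {
        for (unsigned long i = 0; i < batch; i++)
            fn();
        calls += batch;
        /* Check the clock once per batch so the timing overhead is
           small relative to the work being measured. */
        secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    } while (secs < budget);

    *elapsed = secs;
    return calls;
}

/* Print a summary line; per-call time is simply elapsed / calls. */
void report(const char *label, unsigned long calls, double elapsed)
{
    double per_call = elapsed / (double)calls;
    printf("%s:\n%lu calls in %.3fs is %.4gns/call (%.4gs/call)\n",
           label, calls, elapsed, per_call * 1e9, per_call);
}
```

Typical use would be time_calls(sample_work, 4.0, 100, &elapsed) followed by report("sample_work", calls, elapsed). Because the clock is read only between batches, a slow function on a large input overshoots the budget by up to one batch, which would be consistent with the wap.txt runs above taking noticeably more than 4 seconds.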