qak said:
Somehow, I can't reply to the other replies individually (the reply
button is greyed out and their icons have turned into red diamonds;
closed or deleted, maybe?). If I post another message, the thread is
lost. So here is my general reply:
1) My endianness (hard to change) is wrong; it is just an example, and
a quick lookup gave the wrong order.
2) Some 'experts' don't seem to realize 'cmp' is a single assembly
instruction. memcmp, strcmp, strncmp... even after optimization, all turn
into many 'cmp'(s): cmp to see if the pointer is null, cmp to see if the
pointer is aligned, cmp to see if it can compare a word or dword at a time,
cmp to see if the end is reached... How can they match assembly speed?
3) Without alignment, the 'cmp' still returns the correct result, only slower.
4) To christian.bau: I'm looking for a better solution, a good macro maybe?
Thanks to all who participated.
Let's see if we can reach some useful conclusions on the question.
I assume you are interested primarily in program speed, and either
not concerned, or less concerned, with generated code size.
As others have pointed out, it's likely any performance gain here
will be down in the noise relative to many other issues.
Having said that, if you have decided it's important, the only
sure way to answer the question is to try out different approaches
and measure.
Different techniques will have different performance results on
different hardware and under different compilers, so measurements
should be done on a representative set of platforms.
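
As a rough sketch of what such a measurement could look like: the
harness below times one candidate test over a repeating set of inputs.
The names word_test_fn, is_test_memcmp and time_word_test are made up
for illustration, and clock() is only a coarse timer, so treat this as
a starting point rather than a finished benchmark.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by every candidate test being measured. */
typedef int (*word_test_fn)(const char *p);

/* One candidate: the library-call version. Returns 1 if p starts with "Test". */
int is_test_memcmp(const char *p)
{
    return memcmp(p, "Test", 4) == 0;
}

/*
 * Calls fn `iterations` times, cycling through the n_inputs strings in
 * `inputs`, and returns the elapsed CPU time in seconds as reported by
 * clock(). The number of matches is stored in *hits so that the calls
 * have an observable result.
 */
double time_word_test(word_test_fn fn, const char *const *inputs,
                      size_t n_inputs, unsigned long iterations,
                      unsigned long *hits)
{
    unsigned long i, count = 0;
    clock_t start = clock();

    for (i = 0; i < iterations; i++) {
        if (fn(inputs[i % n_inputs]))
            count++;
    }
    *hits = count;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The input set should mirror the data the real program sees: mostly
mismatches on the first letter, mostly matches, or whatever mix applies.
Run each candidate through the same harness on each compiler and
platform of interest.
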
Which approach is best (using "best" somewhat guardedly) depends
on some things you haven't brought up. Some examples: is this
test going to be an isolated test, or will there be lots of
comparisons against other word choices? Will the comparisons
typically succeed or typically fail? If they typically fail,
what is the breakdown of how many initial characters match?
Starting at the end, if this is one isolated test, and most
comparisons have a mismatch on the first letter, the method
you describe as "slow and long" will likely be fastest or
nearly fastest. (To clarify - this is what my tests show on
two different platforms, but don't take my word for it, try
measuring yourself.)
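
For reference, a minimal sketch of a character-by-character test,
assuming that "slow and long" refers to comparing the four letters one
at a time (the function name is_test_bytewise is invented here):

```c
/*
 * Returns 1 if the four bytes at p are 'T', 'e', 's', 't', else 0.
 * The && operators short-circuit, so evaluation stops at the first
 * mismatching byte.
 */
int is_test_bytewise(const char *p)
{
    return p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't';
}
```

When the first letter usually differs, this costs a single byte compare
per call, which fits the observation above. It also stops at the
terminating null of a shorter string, so it never reads past the end.
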
If this is one isolated test, but tests typically succeed
or match on at least two initial characters, you might get
a slight performance advantage from using one of the full
word comparison approaches (compared to the "slow and long"
method). Unlikely to be worthwhile, but again the only
way to be sure is to measure.
If the goal is not to test for a single word, but to do lots
of comparisons against, eg, a table of words, this can be
done quickly, simply, and safely using unions, like this:
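    #include <stdint.h>   /* C99; otherwise supply a suitable typedef for uint32_t */
    #include <string.h>   /* memcpy */

    union char4_unsigned32 {
        char s[4];
        uint32_t u;
    };

    /* Each literal fills s[] exactly; no terminating null is stored. */
    union char4_unsigned32 four_letter_words[] = {
        { "Test" },
        { "Fast" },
        { "Slow" },
        { "More" },
        { "Less" },
    };

    /* Returns the index of the matching word, or the table size if none matches. */
    int
    find_word( const char *p ){
        union char4_unsigned32 pu;
        uint32_t u;
        int i;
        int i_limit = sizeof four_letter_words / sizeof four_letter_words[0];

        memcpy( pu.s, p, 4 );   /* copy first, so p needs no particular alignment */
        u = pu.u;
        for( i = 0; i < i_limit; i++ ){
            if( u == four_letter_words[i].u ) break;
        }
        return i;
    }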
The code shown above is safe and portable (assuming a suitable
typedef for uint32_t), and all comparisons are done using a
simple 32-bit-word compare. The loop should run very fast,
and the setup overhead is small.
Given all the above, it's pretty unlikely that comparing against
inline integer constants (either as hex/decimal numbers or
character constants) will be the method of choice. But, if you
do choose to go down that path, it's probably a good idea to
generate the constants programmatically, eg, by generating a
header file, and refer to the constants symbolically in your
program. Note that using the string literal approach, eg,
    *(int*)p == *(int*)"Test"
produced (in my trials) performance results very close to
using straight integer constants, so using string literals
might be preferable so the code is easier to write and
understand.
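
A hedged sketch of both ideas in this paragraph follows. It uses
memcpy() rather than pointer casts to load the four bytes, which
sidesteps the alignment and aliasing questions that *(int*)p raises,
and compilers commonly reduce a fixed 4-byte memcpy to a single load,
though that is worth checking in the generated code. The names
load_word32, is_test_word and format_word_define are invented for
illustration, and p is assumed to point to at least four readable
bytes.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Loads four bytes from p into a uint32_t, in the machine's own byte order. */
uint32_t load_word32(const char *p)
{
    uint32_t u;
    memcpy(&u, p, sizeof u);
    return u;
}

/* The same test as the string literal cast, written without the casts. */
int is_test_word(const char *p)
{
    return load_word32(p) == load_word32("Test");
}

/*
 * Formats one header line, "#define <name> 0x<8 hex digits>u", for a
 * four-letter word. The value is computed on the machine running this
 * code, so its byte order matches that machine. Returns what snprintf
 * returns.
 */
int format_word_define(char *buf, size_t size,
                       const char *name, const char *word)
{
    return snprintf(buf, size, "#define %s 0x%08lxu\n", name,
                    (unsigned long)load_word32(word));
}
```

A small generator program could call format_word_define() for each
word and write the lines to a header, so nobody has to work out the
byte order by hand. When cross-compiling, the generator has to run on
(or emulate the byte order of) the target, not the build host.
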
In my performance tests, memcmp() and strncmp() were both slower
than all other methods tested, by factors ranging from two to
five, depending on platform and number of leading characters
matching the target word ("Test"). Again, don't take my word
on any of these results - take measurements yourself to be sure.
General performance advice: starting out, write the simplest
and most obvious code you can think of, and worry about these
kinds of micro-issues only after the program is running. Then
if you still think performance is a problem, go back and take
measurements as part of evaluating different approaches. I
suspect you'll be surprised by the results (I was, even for
the simple set of tests that I did).
Remember when doing measurements to take into account the
relative frequencies of which paths are taken, and also
the expected mix of different platforms on which the
program will run. Different platforms often will favor
different approaches, and different approaches may be
faster or slower depending on whether the test is likely
to succeed or likely to fail, especially if it fails
early.
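
One simple way to fold path frequencies into a comparison is a
weighted average of the per-path timings. The sketch below assumes
you have already measured a cost for each path (for example, "fails
on the first letter" and "matches") and estimated how often each
occurs; expected_cost is a made-up helper name.

```c
#include <stddef.h>

/*
 * Weighted average cost over n paths: the sum of freq[i] * cost[i].
 * freq[] holds the relative frequency of each path and is expected
 * to sum to 1; cost[] holds the measured time for that path.
 */
double expected_cost(const double *freq, const double *cost, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        total += freq[i] * cost[i];
    return total;
}
```

With purely hypothetical numbers, a path taken 90% of the time at 2
time units and one taken 10% of the time at 10 units gives 2.8 units
per call. The same weighting can be applied across platforms if the
mix of platforms is known.
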
Good luck!