matching assembly speed for small string comparison

Tim Rentsch · Aug 31, 2013

Barry Schwarz said:
qak said:

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.


I'd second memcmp. Try it and measure the speed (not of the test --
that will be quite hard and not very informative -- but of your
program). A lot of maintainable C code has been sacrificed on the altar
of efficiency.

If you still want to avoid the standard idiom (memcmp) then you can
avoid the gruesome constant simply by using a string literal:

if (*(int32_t *)p == *(int32_t *)"Test") ...

Using int32_t makes the intent a little clearer.


How do you ensure that the value of p and the address of the
string literal are suitably aligned for int? If either is not,
the statement invokes undefined behavior.

Actually both sides are undefined behavior regardless of
alignment, because effective type conditions are not met.
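
One way to sidestep both the alignment and the effective type objections is to copy the four bytes into a real uint32_t object with memcpy and compare the integers. The following is only a sketch: the helper name starts_with_test is made up for illustration, and it assumes the caller guarantees at least four readable bytes at p.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the four bytes at p are 'T','e','s','t'.
   Assumption: p points to at least 4 readable bytes. */
static int starts_with_test(const char *p)
{
    uint32_t have, want;

    memcpy(&have, p, sizeof have);       /* no alignment requirement on p */
    memcpy(&want, "Test", sizeof want);  /* same byte order as the data   */
    return have == want;
}
```

Because both sides are loaded the same way, host byte order does not matter here. Whether this is any faster than memcmp is something only measurement can answer.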

glen herrmannsfeldt · Aug 31, 2013

(snip, I wrote)

It's in the standard, but the result is, as you'd expect, implementation
defined.

I think it's quite common to support some reasonable interpretation of
multi-character constants up to the width of int (the type of a '...'
constant), but in every case that I can remember (only a handful, mind
you) the resulting constant is "big-endian", i.e. '1234' has the value
('1' << 24) + ('2' << 16) + ('3' << 8) + '4'. That probably won't do
for the OP, quite apart from the other issues.

I didn't make enough of it in my post, but memcmp is the way to go --
nothing else comes close in terms of correctness, portability and
clarity. Other tricks might need to be used in special circumstances,
but that will be the result of timing tests.

I would probably do multi-character constants over inline assembler,
but, yes, memcmp() is the best way.

-- glen

Ian Collins · Aug 31, 2013

glen said:
(snip, I wrote)

I would probably do multi-character constants over inline assembler,
but, yes, memcmp() is the best way.

Or strncmp.

One compiler I tried (Sun cc) generates the same code as for the OP's
slow and long case with strncmp but calls memcmp. gcc uses a loop for both.

glen herrmannsfeldt · Sep 1, 2013

James Kuyper said:
On 08/31/2013 05:46 PM, Keith Thompson wrote:
(snip)

If you can arrange for the memory to be correctly aligned from the
beginning, it's probably no slower, and may be faster. If you have to
copy the pointed-at object into a correctly aligned buffer, it's
definitely slower.

Specifically, many compilers generate inline code for these.
There is no function call overhead in that case.

It is a little less obvious that the inline code is optimal for
specific known size cases. But only a little.

-- glen

Eric Sosman · Sep 1, 2013

Eric Sosman said:
Eric Sosman said:

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

On many systems, int requires alignment (frequently a multiple of 4).
String literals and character arrays do not have a similar
requirement. Any attempt to convert a char* to an int* when the value
of the char* does not meet the alignment for an int* will invoke
undefined behavior.


Let's also note that even if the unaligned comparison
actually compares, it almost certainly does *not* do what
the O.P. wanted.

(Hint: Is the O.P.'s machine likely to be Big-Endian?)


I'd expect the byte ordering of a string literal to match the byte
ordering of whatever 'Test' means in the assembly instruction:

cmp [esi], 'Test'

I'm not sure which CPU the OP is using, but if it's x86 then alignment
likely isn't going to be a problem (though of course the code will be
non-portable).

What's the byte ordering of 0x54657374? *That's* where
qak's "optimization" is very likely wrong: If his machine is
Little-Endian (as seems likely from the one line of assembly
exhibited), then 0x54657374 compares equal to the first four
characters of "tseT". Not only is his code "hard to change,"
it's also "hard to get right in the first place."
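
The byte-order point can be checked without relying on the host's endianness by assembling the 32-bit value by hand in each order. This is an illustrative sketch; load_be32 and load_le32 are hypothetical helper names, not anything from the original post.

```c
#include <stdint.h>

/* Hypothetical helper: first byte becomes the most significant byte,
   which is what a big-endian 32-bit load would produce. */
static uint32_t load_be32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] <<  8) |  (uint32_t)b[3];
}

/* Hypothetical helper: first byte becomes the least significant byte,
   which is what a little-endian 32-bit load (e.g. on x86) would produce. */
static uint32_t load_le32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;
    return  (uint32_t)b[0]        | ((uint32_t)b[1] <<  8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

With ASCII character values, load_be32("Test") is 0x54657374, while a little-endian load yields 0x54657374 only for the bytes "tseT".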

It's said that optimization is the art of getting the wrong
answer sooner, and qak's code is a perfect example. qak: Use
memcmp(); life's too short to waste on insignificances.

(An old acquaintance spoke of removing cigarette butts and
bottle tops from the beach, "so the sand would be nice and clean
around the whale carcases." That's probably what qak is up to.)

qak · Sep 1, 2013

Specifically, many compilers generate inline code for these.
There is no function call overhead in that case.

It is a little less obvious that the inline code is optimal for
specific known size cases. But only a little.

-- glen

Somehow, I can't not reply to all other replies (the reply button is
greyed out and their icons turn to red diamond, closed or deleted
maybe?). If I post an other message, the thread is lost. So here is the
my general reply:
1) My endiantness (hard to changed) is wrong, it is just an example and
quickly look up give the wrong order.
2) Some 'experts' don't seem to realize 'cmp' is a single assemply
instruction, all memcmp, strcmp, strncmp... even after optimized turn to
many 'cmp'(s) : cmp to see if the pointer is null, cmp to see if the
pointer is aligned, cmp to see if it can compare word or dword at a time,
cmp to see if the end is reached... how can they match assemply speed?
3) Without alignment, the 'cmp' return correctly, only slower.
4) to christian.bau, I'm looking for better solution, a good macro maybe?
Thanks all who participate.

Eric Sosman · Sep 1, 2013

Somehow, I can't not reply to all other replies (the reply button is
greyed out and their icons turn to red diamond, closed or deleted
maybe?). If I post an other message, the thread is lost. So here is the
my general reply:
1) My endiantness (hard to changed) is wrong, it is just an example and
quickly look up give the wrong order.

The fact that you got it wrong should be cautionary. If even
a simple example has a bug, how will you fare when things get more
complex?

2) Some 'experts' don't seem to realize 'cmp' is a single assemply
instruction, all memcmp, strcmp, strncmp... even after optimized turn to
many 'cmp'(s) : cmp to see if the pointer is null, cmp to see if the
pointer is aligned, cmp to see if it can compare word or dword at a time,
cmp to see if the end is reached... how can they match assemply speed?

First, the "single assemply [sic] instruction" you exhibited
will not in fact do the job. It will perform the comparison and
set a flag somewhere -- and then proceed merrily to the next
instruction, just as if nothing had happened. You need at least
one further instruction to get the program to do This versus That
depending on the outcome of the comparison. In short, your "single
assembly instruction" is not equivalent to *any* C `if' construct;
it is equivalent to something like

memcmp(p, "Test", 4);

... with no test. Many compilers will recognize that this statement
has no effect and will optimize it away (typically with a warning),
leaving *zero* instructions in the code -- *much* faster than your
"single assembly instruction."

But that's a quibble: We all know that there'll be a conditional
branch or some such, and probably an unconditional branch as well
if there's an `else'. What everyone's trying to tell you is that
your effort is almost certainly misplaced. How many nanoseconds do
you hope to save, and how many nanoseconds have you *already* spent
in hopes of saving them?

Let's just imagine that it takes you one minute to look up the
four characters in an ASCII chart, type in their hex values, proofread
them, and double-check the endianness (since you've already shown that
endianness is a trap for you, omitting the double-check would be Really
Dumb). Okay, one minute. Now let's suppose that this effort saves
ONE HUNDRED NANOSECONDS per comparison, wow! You will *break even*
after

Six
Hundred
Million

comparisons.
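
The break-even figure is simple arithmetic: the time spent divided by the time saved per comparison. A small sketch of that calculation follows; the function name and its parameters are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: number of comparisons needed before the time saved
   equals the programmer time spent. */
static int64_t break_even_comparisons(int64_t effort_seconds,
                                      int64_t saved_ns_per_compare)
{
    const int64_t ns_per_second = 1000000000;
    return effort_seconds * ns_per_second / saved_ns_per_compare;
}
```

Calling break_even_comparisons(60, 100) reproduces the one-minute, hundred-nanosecond case above.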

Now, six hundred million is not a Really Enormous number of
comparisons: You'd expect about that many when sorting twenty-five
million items, for example. But remember: Your optimization only
applies when comparing to a *constant*, so scenarios like big sorts
just aren't applicable. I put it to you that either (1) your code
will never even come close to making six hundred million string-
vs-constant comparisons, or (2) you've got an infinite loop.

What the `experts' are trying to tell you is that there are
almost certainly other parts of your program whose optimization
will make much more difference than this one could ever hope to.

BartC · Sep 1, 2013

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

You can get fast short-string compares by forgetting about strings and just
using ints:

if (a == 'ABCD') ...

However C seems to have a problem turning 'ABCD' into an int value in a
consistent, portable manner. (Maybe there are macros that can be created but
if it involves having to write stuff like ('A'<<24)+('B'<<16)... then forget
it!).

But if 'ABCD' is handled properly on the systems you're interested in, then
that's simplest. Endian-ness may or may not be relevant. (How does ASM
denote 'ABCD' anyway?)

And are strings always going to be 4 characters? I think shorter strings
will be OK, but if some will be longer, or you might be comparing short with
long, then it might be simpler to just stick with strcmp(), or use some
entirely different scheme depending on what you're trying to do.

Keith Thompson · Sep 2, 2013

BartC said:
qak said:

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.


You can get fast short-string compares by forgetting about strings and just
using ints:

if (a == 'ABCD') ...

However C seems to have a problem turning 'ABCD' into an int value in a
consistent, portable manner.

The value of such a character constant is implementation-defined. On
one system I tested, the byte order of 'ABCD' is the opposite of the
byte order of "ABCD".

And you'll have problems on systems where sizeof (int) < 4.

I wouldn't consider using 'ABCD' in any code intended to be portable.
I wouldn't willingly use it even in non-portable code.

(Maybe there are macros that can be created but
if it involves having to write stuff like ('A'<<24)+('B'<<16)... then forget
it!).

You only have to write that once. If you want to reject the idea
because of that, you can.

But if 'ABCD' is handled properly on the systems you're interested in, then
that's simplest. Endian-ness may or may not be relevant. (How does ASM
denote 'ABCD' anyway?)

I don't know what you mean by "handled properly".

And are strings always going to be 4 characters? I think shorter strings
will be OK, but if some will be longer, or you might be comparing short with
long, then it might be simpler to just stick with strcmp(), or use some
entirely different scheme depending on what you're trying to do.

The OP presented this:

cmp [esi], 'Test'


as the assembly code he's trying to replicate in C, but hasn't told us
what CPU he's using (I think "esi" is x86-specific, but I'm not
certain), or how his assembly treats 'Test'.

Siri Cruise · Sep 2, 2013

qak said:
qak said:

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.


You can get fast short-string compares by forgetting about strings and just
using ints:

if (a == 'ABCD') ...

This is an old CDC programming trick. A lot of fields were 1, 2, 3, 7, or 10
character strings that were compared as integers. It also meant when you dumped
memory in characters and octal, you could easily find and decode these values.

However C seems to have a problem turning 'ABCD' into an int value in a
consistent, portable manner. (Maybe there are macros that can be created but
if it involves having to write stuff like ('A'<<24)+('B'<<16)... then forget
it!).

Apple also uses this technique from way back in 1984. A compiler that works with
Macintosh systems will do this consistently.

And are strings always going to be 4 characters? I think shorter strings
will be OK, but if some will be longer, or you might be comparing short with
long, then it might be simpler to just stick with strcmp(), or use some
entirely different scheme depending on what you're trying to do.

You can't do switches on strings.
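
Packed integer tags, unlike strings, can appear as case labels, provided the packing is an integer constant expression. The sketch below uses a hypothetical TAG4 macro that takes four character constants; it assumes the input has at least four readable bytes and only looks at those four.

```c
#include <stdint.h>

/* Hypothetical macro: pack four characters, first one most significant.
   With character constants as arguments this is an integer constant
   expression, so it is usable in a case label. */
#define TAG4(a, b, c, d)                          \
    (((uint32_t)(unsigned char)(a) << 24) |       \
     ((uint32_t)(unsigned char)(b) << 16) |       \
     ((uint32_t)(unsigned char)(c) <<  8) |       \
      (uint32_t)(unsigned char)(d))

/* Assumption: s points to at least 4 readable bytes. */
static int classify(const char *s)
{
    switch (TAG4(s[0], s[1], s[2], s[3])) {
    case TAG4('T', 'e', 's', 't'): return 1;
    case TAG4('F', 'a', 's', 't'): return 2;
    default:                       return 0;
    }
}
```

Since the same macro builds both the switch operand and the labels, the result does not depend on the host's byte order.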

BartC · Sep 2, 2013

Keith Thompson said:
The value of such a character constant is implementation-defined. On
one system I tested, the byte order of 'ABCD' is the opposite of the
byte order of "ABCD".

That might not matter, if you're only interested in comparing the whole
string.

And you'll have problems on systems where sizeof (int) < 4.

Naturally, you would have to use an int size that is 4 chars long (or even 8
chars).

You only have to write that once. If you want to reject the idea
because of that, you can.

It will usually be more than once, as there could be hundreds of these
constants. And normal development involves deleting, adding and modifying
these all the time. Maybe there are editor tricks that can be used to
replicate the pattern (but it still looks terrible). If it was possible to
have a macro that looked like INTSTR("ABCD") then it wouldn't be so bad (but
I think not, and it would need different versions for different lengths).

I don't know what you mean by "handled properly".

By ending up with an int value that looks like 0x41424344 or 0x44434241 or
something along those lines. Just a unique representation of a 4-char string
that can be stored in a single int value. Shorter strings would need to be
justified towards the lsb (but I think that happens anyway otherwise 'A'
would be 0x41000000).

Clearly treating such multi-char constants as int values would be extremely
useful. C would have allowed these, but I suspect that a number of
implementations that did things slightly differently meant it couldn't be
standardised. (FWIW my gcc stores 'ABCD' as 0x41424344, but it gives a
warning that I don't know how to disable. And it doesn't like 'ABCDEFGH'.)

-
Bartc

James Kuyper · Sep 2, 2013

That might not matter, if you're only interested in comparing the whole
string.

I suppose that's true, but only because you used the word "might" -
which acknowledges that the flip side of your statement is also true: it
might matter, even if you're only interested in comparing the whole
string. If the integer constant that you need to use to match "ABDC" in
this fashion is 'DCAB', it matters very much whether or not you're aware
of that fact.

It will usually be more than once, as there could be hundreds of these
constants.

Why would you have to write the macro more than once, just because
you'll be using it hundreds of times?

And normal development involves deleting, adding and modifying
these all the time. Maybe there are editor tricks that can be used to
replicate the pattern (but it still looks terrible). If it was possible to
have a macro that looked like INTSTR("ABCD") then it wouldn't be so bad (but
I think not, and it would need different versions for different lengths).

Why do you think that wouldn't be possible?
#define INTSTR(string) ((string[0]<<24) + (string[1]<<16) + ...

It depends upon the implementation-specific byte order of 'int', and
needs rewriting to be useful on systems where INT_MAX is too small, but
you've already conceded those portability issues. The result wouldn't be
an integer constant expression, which restricts its potential uses.
However, I hadn't thought that the intended use would be one that
requires an ICE.
When applied to a string literal, it is an integer expression with a
value that can be computed at compile time, and I would hope that many
compilers would actually do so. I haven't bothered checking; I would
never have any use for something so thoroughly unportable.
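
A complete version of such a macro might look like the sketch below. It is one possible spelling, not necessarily what the poster had in mind: it uses uint32_t rather than int to avoid shifting into a sign bit, casts through unsigned char so that characters with negative values pack cleanly, and parenthesises every shift because + and | are evaluated in a different order from << unless grouped explicitly. The helper is_test is hypothetical.

```c
#include <stdint.h>

/* Pack the first four characters of s, first character most significant.
   Assumption: s has at least 4 accessible characters. */
#define INTSTR(s)                                  \
    (((uint32_t)(unsigned char)(s)[0] << 24) |     \
     ((uint32_t)(unsigned char)(s)[1] << 16) |     \
     ((uint32_t)(unsigned char)(s)[2] <<  8) |     \
      (uint32_t)(unsigned char)(s)[3])

/* Hypothetical use in an ordinary (non-constant) context. */
static int is_test(uint32_t a)
{
    return a == INTSTR("Test");
}
```

Applied to a string literal this is not an integer constant expression, so it cannot be used in a case label, but an optimizing compiler may still fold it to a constant.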

Tim Rentsch · Sep 2, 2013

qak said:
Somehow, I can't not reply to all other replies (the reply button is
greyed out and their icons turn to red diamond, closed or deleted
maybe?). If I post an other message, the thread is lost. So here is the
my general reply:
1) My endiantness (hard to changed) is wrong, it is just an example and
quickly look up give the wrong order.
2) Some 'experts' don't seem to realize 'cmp' is a single assemply
instruction, all memcmp, strcmp, strncmp... even after optimized turn to
many 'cmp'(s) : cmp to see if the pointer is null, cmp to see if the
pointer is aligned, cmp to see if it can compare word or dword at a time,
cmp to see if the end is reached... how can they match assemply speed?
3) Without alignment, the 'cmp' return correctly, only slower.
4) to christian.bau, I'm looking for better solution, a good macro maybe?
Thanks all who participate.

Let's see if we can reach some useful conclusions on the question.

I assume you are interested primarily in program speed, and either
not concerned, or less concerned, with generated code size.

As others have pointed out, it's likely any performance gain here
will be down in the noise relative to many other issues.

Having said that, if you have decided it's important, the only
sure way to answer the question is try out different approaches
and measure.

Different techniques will have different performance results on
different hardware and under different compilers, so measurements
should be done on a representative set of platforms.

Which approach is best (using "best" somewhat guardedly) depends
on some things you haven't brought up. Some examples: is this
test going to be an isolated test, or will there be lots of
comparisons against other word choices? Will the comparisons
typically succeed or typically fail? If they typically fail,
what is the breakdown of how many initial characters match?

Starting at the end, if this is one isolated test, and most
comparisons have a mismatch on the first letter, the method
you describe as "slow and long" will likely be fastest or
nearly fastest. (To clarify - this is what my tests show on
two different platforms, but don't take my word for it, try
measuring yourself.)

If this is one isolated test, but tests typically succeed
or match on at least two initial characters, you might get
a slight performance advantage from using one of the full
word comparison approaches (compared to the "slow and long"
method). Unlikely to be worthwhile, but again the only
way to be sure is measure.

If the test is not to find a single word, but doing lots
of comparisons against, eg, a table of words, this can be
done quickly, simply, and safely using unions, like this:

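#include <stdint.h>
#include <string.h>

union char4_unsigned32 {
    char s[4];
    uint32_t u;
};

/* Each string exactly fills s[4]; there is no room for a terminator. */
union char4_unsigned32 four_letter_words[] = {
    { "Test" },
    { "Fast" },
    { "Slow" },
    { "More" },
    { "Less" },
};

/* Returns the index of the matching word, or the table size if none. */
int
find_word( const char *p ){
    union char4_unsigned32 pu;
    uint32_t u;
    int i;
    int i_limit = sizeof four_letter_words / sizeof four_letter_words[0];

    memcpy( pu.s, p, 4 );   /* p must point to at least 4 readable bytes */
    u = pu.u;
    for( i = 0; i < i_limit; i++ ){
        if( u == four_letter_words[i].u ) break;
    }
    return i;
}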

The code shown above is safe and portable (assuming a suitable
typedef for uint32_t), and all comparisons are done using a
simple 32-bit-word compare. The loop should run very fast,
and the setup overhead is small.

Given all the above, it's pretty unlikely that comparing against
inline integer constants (either as hex/decimal numbers or
character constants) will be the method of choice. But, if you
do choose to go down that path, it's probably a good idea to
generate the constants programmatically, eg, by generating a
header file, and refer to the constants symbolically in your
program. Note that using the string literal approach, eg,

*(int*)p == *(int*)"Test"

produced (in my trials) performance results very close to
using straight integer constants, so using string literals
might be preferable so the code is easier to write and
understand.

In my performance tests, memcmp() and strncmp() were both slower
than all other methods tested, by factors ranging from two to
five, depending on platform and number of leading characters
matching the target word ("Test"). Again, don't take my word
on any of these results - take measurements yourself to be sure.

General performance advice: starting out, write the simplest
and most obvious code you can think of, and worry about these
kinds of micro-issues only after the program is running. Then
if you still think performance is a problem, go back and take
measurements as part of evaluating different approaches. I
suspect you'll be surprised by the results (I was, even for
the simple set of tests that I did).

Remember when doing measurements to take into account the
relative frequencies of which paths are taken, and also
the expected mix of different platforms on which the
program will run. Different platforms often will favor
different approaches, and different approaches may be
faster or slower depending on whether the test is likely
to succeed or likely to fail, especially if it fails
early.

Good luck!

Tim Rentsch · Sep 2, 2013

James Kuyper said:
If you can arrange for the memory to be correctly aligned from the
beginning, it's probably no slower, and may be faster. If you have to
copy the pointed-at object into a correctly aligned buffer, it's
definitely slower.

Actual measurements show otherwise.

Tests run on two different platforms show memcmp() and strncmp()
both running slower by factors of at least 2.5 compared to
copying into an aligned buffer and then doing a single 32-bit
comparison.
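
For anyone who wants to repeat this kind of measurement, a rough harness could look like the following sketch. The function names are made up, the data layout (n consecutive 4-byte records) is an assumption, and clock() is a coarse timer, so large n and repeated runs are needed for meaningful numbers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Count records equal to "Test" using memcmp.
   Assumption: data holds n consecutive 4-byte records. */
static long count_memcmp(const char *data, size_t n)
{
    long hits = 0;
    size_t i;

    for (i = 0; i < n; i++)
        hits += memcmp(data + 4 * i, "Test", 4) == 0;
    return hits;
}

/* Same count, copying each record into a uint32_t and comparing words. */
static long count_word(const char *data, size_t n)
{
    uint32_t want, have;
    long hits = 0;
    size_t i;

    memcpy(&want, "Test", sizeof want);
    for (i = 0; i < n; i++) {
        memcpy(&have, data + 4 * i, sizeof have);
        hits += have == want;
    }
    return hits;
}

/* Run one counting function and report the CPU time it took in seconds. */
static double seconds_for(long (*fn)(const char *, size_t),
                          const char *data, size_t n, long *hits)
{
    clock_t start = clock();

    *hits = fn(data, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Checking that both functions return the same count guards against timing a broken variant. Results will vary with compiler and optimization level, since a compiler may inline a fixed-size memcmp.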

BartC · Sep 2, 2013

James Kuyper said:
On 09/02/2013 04:51 AM, BartC wrote:

I suppose that's true, but only because you used the word "might" -
which acknowledges that the flip side of your statement is also true: it
might matter, even if you're only interested in comparing the whole
string. If the integer constant that you need to use to match "ABDC" in
this fashion is 'DCAB', it matters very much whether or not you're aware
of that fact.

That's why I said to think of it as int operations rather than string ones.
So the implementer has to be aware of the multi-char format and make sure
both sides of == use the same.

Why would you have to write the macro more than once, just because
you'll be using it hundreds of times?

Well, OK, the << and + will be in the macro. You just have to write or edit
code such as MACRO('A','B','C','D') many times.

Why do you think that wouldn't be possible?
#define INTSTR(string) ((string[0]<<24) + (string[1]<<16) + ...

When applied to a string literal, it is an integer expression with a
value that can be computed at compile time, and I would hope that many
compilers would actually do so. I haven't bothered checking; I would
never have any use for something so thoroughly unportable.

Writing just "X"[0] in places where a constant is expected gives an error on
gcc. If gcc doesn't support something then I would call *that* unportable!
Without this, making such a macro that yields a compile-time constant is not
going to work.

James Kuyper · Sep 2, 2013

Well, OK, the << and + will be in the macro. You just have to write or edit
code such as MACRO('A','B','C','D') many times.

I suppose if you want to define it to be used that way, it would be
rather inconvenient, which is one reason why I wouldn't do so.

Why do you think that wouldn't be possible?
#define INTSTR(string) ((string[0]<<24) + (string[1]<<16) + ...


When applied to a string literal, it is an integer expression with a
value that can be computed at compile time, and I would hope that many
compilers would actually do so. I haven't bothered checking; I would
never have any use for something so thoroughly unportable.


Writing just "X"[0] in places where a constant is expected gives an error on
gcc. If gcc doesn't support something then I would call *that* unportable!

As I said in text that you've clipped, it isn't an integer constant
expression. Therefore, using it in a context where one is required would
be a constraint violation. But if(a == INTSTR("Test")) is NOT such a
context.

Places where constant expressions are required include the size of a
bit-field, the value of an enumeration constant, the argument of an
_Alignas() specifier, the subscript of an array element designator,
the initializer for an object of static or thread local storage
duration, the first argument of a _Static_assert(), a case label, and
a #if or #elif directive. The only ones of those that sound like
plausible contexts for this macro are the initializer and the case label.

qak · Sep 3, 2013

If this is one isolated test...

Even for once, we should use the more efficient method.
Knowledge can be re-apply else where.
I can stand code like this:
if(!strcmp(p, "This"))...
else if(!strcmp(p, "That"))...
else if(!strcmp(p, "Other"))...
Note that 'Other' doesn't fit into integer but if I do first 4, eliminate
majority of cases, then one more test to see if it is zero terminated, or
some short string... 8 bytes or even 12 bytes we still win (2.5 times as
your timing test in other message), not to mention with the advance of 64
bits...

*(int*)p == *(int*)"Test"

This is brilliant, master. Thank you very much, never occur to me we could
do that.

BartC · Sep 3, 2013

qak said:
Even for once, we should use the more efficient method.
Knowledge can be re-apply else where.
I can stand code like this:
if(!strcmp(p, "This"))...
else if(!strcmp(p, "That"))...
else if(!strcmp(p, "Other"))...
Note that 'Other' doesn't fit into integer but if I do first 4, eliminate
majority of cases, then one more test to see if it is zero terminated, or
some short string... 8 bytes or even 12 bytes we still win (2.5 times as
your timing test in other message), not to mention with the advance of 64
bits...

This is brilliant, master. Thank you very much, never occur to me we could
do that.

Watch out for:

* Alignment (if it might be a problem on machines you're likely to run this on)

* p pointing to strings shorter than 3 characters, because there will be
garbage following the nul terminator (but I think an ==/!= compare will
still work provided the right-hand-side is at least 3 characters so doesn't
have garbage of its own).

* (There is also the unlikely situation where p points to a short string at
the end of a memory block, and accessing a full 4 bytes causes a memory
violation.)

* p containing strings longer than 4 characters, because "Testa" will
wrongly match "Test"

* Also that the right-hand-side is likely to be a memory reference rather
than a constant such as 'ABCD', which might adversely affect the timing, if
it is as critical as you make out. (But if the processor is an x86, it could
well be faster, because what is fast or slow is unintuitive.)
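
Several of these pitfalls (short strings, reading past the terminator, "Testa" matching "Test") go away if keys are first copied into a zero-padded fixed-size buffer and compared as one integer. The sketch below uses a hypothetical pack8 helper and 64-bit words, so it is only an exact equality test for keys of at most 8 characters.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy up to 8 characters of s into a zero-padded
   buffer, stopping at the terminator, and return them as one integer.
   It never reads past the NUL. Longer strings are truncated to 8. */
static uint64_t pack8(const char *s)
{
    char buf[8] = {0};
    uint64_t v;
    size_t i;

    for (i = 0; i < sizeof buf && s[i] != '\0'; i++)
        buf[i] = s[i];
    memcpy(&v, buf, sizeof v);
    return v;
}
```

The byte-by-byte packing loop has its own cost, so this pays off mainly when each key is packed once and then compared against many candidates. Whether it beats strcmp in a given program still has to be measured.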

Tim Rentsch · Sep 3, 2013

qak said:
If this is one isolated test...


Even for once, we should use the more efficient method.
[snip elaboration]

Let's try this again.

There is no such thing as "the" most efficient method. Depending
on circumstances different choices, including the "slow and long"
method, will exhibit better performance than other approaches,
and may end up being a better choice overall.

There is no substitute for measuring actual performance.

Putting in effort for negligible performance gains is a
waste of time and a mark of a poor programmer.

In almost all cases coding choices should be limited to code
that is safe and widely portable, especially when the safe
and widely portable methods have nearly identical performance
characteristics as the "faster" versions do.

If you still feel a need to spend time on such micro-level
performance questions, read through comments in the thread
again until the lessons sink in.

This is brilliant, master. Thank you very much, never occur to me we
could do that.

It wasn't my idea, it came from Ben Bacarisse in a direct
response to your original posting. You haven't been paying
attention. Please add "pay attention when you read" to
the list of development practices you should be following.

glen herrmannsfeldt · Sep 3, 2013

(snip)

Even for once, we should use the more efficient method.
Knowledge can be re-apply else where.
I can stand code like this:
if(!strcmp(p, "This"))...
else if(!strcmp(p, "That"))...
else if(!strcmp(p, "Other"))...
Note that 'Other' doesn't fit into integer but if I do first 4, eliminate
majority of cases, then one more test to see if it is zero terminated, or
some short string... 8 bytes or even 12 bytes we still win (2.5 times as
your timing test in other message), not to mention with the advance of 64
bits...

Depending how many there are to test, it is common to use a hash
table to look up the string values.

Then again, converting the strings to 32 bit integers is, more or
less, hashing.

-- glen
