about STREQ

Leor Zolman

Oops: should be !strcmp( ... ), of course.

That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

int streq(const char *s1, const char *s2)
{
    return !strcmp(s1, s2);
}

and be done with it.
-leor
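
For anyone following along, here is a minimal sketch of that kind of plain wrapper in use. The names `same_text` and `wants_help` are hypothetical, picked only for illustration; the point is just that strcmp returns 0 on a match, and the wrapper turns that into a "true means equal" result.

```c
#include <string.h>

/* Hypothetical helper: 1 when both strings hold the same characters, else 0.
 * strcmp itself returns 0 for equal strings, which is the "reverse" sense. */
int same_text(const char *s1, const char *s2)
{
    return strcmp(s1, s2) == 0;
}

/* Hypothetical caller: reads in the natural sense, with no negation. */
int wants_help(const char *arg)
{
    return same_text(arg, "-h") || same_text(arg, "--help");
}
```

Being a real function, it evaluates each argument exactly once, so there are no macro surprises to worry about.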
 
kal

Thomas Matthews said:
On many systems, the execution time saved by this expression
is negligible compared to the actual speed and observed speed
of a program. It is called premature optimization.

This is true today, but it may not have been when the code
referred to by the OP (Original Poster) was written.

Even now there are instances where optimizations, even small
ones, are essential.

Since the code is implemented as a macro and the define is
presumably included in a header file, its impact on readability
is much less than it would otherwise be. I would pause a moment
to tip my hat to him who went before me and wrote that code.

<OT>
Try bubble sort (whose code is far more readable than that
of binary sort) on an array of, say, 1 million entries with
today's FAST computers.
</OT>
 
Mabden

Leor Zolman said:
That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

Agreed. This stuff is bad enough for those of us who consider ourselves
experts (notice how I weasel my way into that group), now have a newbie try
to make sense of it...

I mean, if you're worried about optimizations, why are you calling strcmp()
at all...
Mabden wrote:
At least make the macro:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + 1, (b) + 1) == 0)


Plus, it would only really help if you are reasonably certain that the two
strings will *usually* differ in the first character, so it would have to be
a specific data group. Hence my comment about calling the macros with the
second character (since you already know the first ones match, and if you
know the data that well - like part numbers or something - you would know
there's going to be a second char)
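
A rough sketch of the "skip the first character" idea without assuming anything about the data. The names `REST_EQ` and `part_numbers_match` are made up for this example. The extra '\0' test is there because, if both strings are empty, the first characters match and strcmp would otherwise be handed pointers past the terminators.

```c
#include <string.h>

/* Hypothetical macro: compare the first characters inline, then pass the
 * remainder of each string to strcmp. The '\0' test stops strcmp from
 * reading past the terminator when both strings are empty.
 * Both arguments are evaluated more than once, so they must be free of
 * side effects. */
#define REST_EQ(a, b) \
    (*(a) == *(b) && (*(a) == '\0' || strcmp((a) + 1, (b) + 1) == 0))

/* Hypothetical caller: plain pointers, so repeated evaluation is harmless. */
int part_numbers_match(const char *a, const char *b)
{
    return REST_EQ(a, b);
}
```

If you really do know every string has at least one character, the guard can go, but then the macro is only as safe as that knowledge.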
 
Leor Zolman

Plus, it would only really help if you are reasonably certain that the two
strings will *usually* differ in the first character, so it would have to be
a specific data group. Hence my comment about calling the macros with the
second character (since you already know the first ones match, and if you
know the data that well - like part numbers or something - you would know
there's going to be a second char)

And, in that case, you could go hog wild with something like:

#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

But life's just too short.
-leor
 
Guest

Leor said:
#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario
 
Neil Cerutti

This is true today, but it may not have been when the code
referred to by the OP (Original Poster) was written.

Even now there are instances where optimizations, even small
ones, are essential.

Since the code is implemented as a macro and the define is
presumably included in a header file, its impact on readability
is much less than it would otherwise be. I would pause a moment
to tip my hat to him who went before me and wrote that code.

<OT>
Try bubble sort (whose code is far more readable than that
of binary sort) on an array of, say, 1 million entries with
today's FAST computers.
</OT>

Choosing an appropriate algorithm is more important than
optimization of that algorithm.
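
To put rough numbers on that, here is a small sketch that counts element comparisons for a bubble sort and for the library's qsort. The function names are hypothetical, and qsort is only a stand-in for "a better algorithm": the C standard does not promise any particular complexity for it, so treat the comparison as typical rather than guaranteed.

```c
#include <stddef.h>
#include <stdlib.h>

static unsigned long qsort_comparisons;  /* bumped by the qsort callback */

/* qsort callback for ints that also counts how often it is called. */
static int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    ++qsort_comparisons;
    return (a > b) - (a < b);
}

/* Plain bubble sort; returns the number of element comparisons made. */
unsigned long bubble_sort_count(int *v, size_t n)
{
    unsigned long count = 0;
    size_t end, i;
    for (end = n; end > 1; --end) {
        for (i = 1; i < end; ++i) {
            ++count;
            if (v[i - 1] > v[i]) {
                int tmp = v[i - 1];
                v[i - 1] = v[i];
                v[i] = tmp;
            }
        }
    }
    return count;
}

/* Sorts with the library qsort; returns the number of callback calls. */
unsigned long qsort_count(int *v, size_t n)
{
    qsort_comparisons = 0;
    qsort(v, n, sizeof v[0], compare_ints);
    return qsort_comparisons;
}
```

This bubble sort always makes n(n-1)/2 comparisons, which is 499,500 for 1,000 elements and about half a trillion for a million. No amount of fiddling with the inner comparison fixes that.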
 
Arthur J. O'Dwyer

That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

What you're missing (well, I'm sure you're not really missing it,
but you're glossing over it) is that this "chicanery" is not scattered
through the OP's code, but rather stuck behind a very sensibly-named
macro in a sensible part of the program. The programmer never needs
to know how it works, any more than he needs to know how 'qsort' is
optimized to deal with special cases of *its* input. They're both
library functions, conceptually, and if you don't want to know why
the macro works, nobody's forcing you to do all those !!s in your head.
:)
The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

int streq(const char *s1, const char *s2)
{
    return !strcmp(s1, s2);
}

and be done with it.

Except for the namespace invasion, this is decent advice. This
is why in my programs I always include two macros right up at the
top:

#define steq(x,y) (!strcmp(x,y))
#define stneq(x,y) (!steq(x,y))

If I wanted, I could change that to the OP's

#define steq(x,y) (*(x)==*(y) && !strcmp(x,y))

in theory without any loss of sleep. In practice, that would
lose me a lot of sleep, because I know that I use 'steq' heavily
to parse arguments out of 'argv', and there's always the chance
I might have written somewhere

if (steq(argv[i], "--output-file") && steq(argv[++i], "-"))
    OutputFile = stdout;

(I doubt it, though, because that would lose the filename if it
weren't "-", and that seems like a silly thing to do.)
This double-evaluation is the biggest danger of the OP's macro; the
tricky negations and "chicanery" have nothing to do with it as far
as I'm concerned.

-Arthur
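
A small sketch of the double-evaluation problem, for the record. `STEQ_FAST`, `steq_fn` and the two `index_after_*` functions are names invented for this example; the macro body is the first-character pre-check discussed in this thread.

```c
#include <string.h>

/* The pre-check macro under a hypothetical name. x and y each appear
 * twice in the expansion. */
#define STEQ_FAST(x, y) (*(x) == *(y) && !strcmp((x), (y)))

/* Function form: each argument expression is evaluated exactly once. */
static int steq_fn(const char *x, const char *y)
{
    return *x == *y && !strcmp(x, y);
}

/* Returns the index left behind after one macro test that uses ++i. */
int index_after_macro(void)
{
    const char *args[] = { "prog", "-", "out.txt", "extra" };
    int i = 0;
    /* ++i runs once for the first-character test and, because that test
     * succeeds here, a second time for the strcmp argument. */
    (void)STEQ_FAST(args[++i], "-");
    return i;
}

/* Same test through the function: ++i runs once. */
int index_after_function(void)
{
    const char *args[] = { "prog", "-", "out.txt", "extra" };
    int i = 0;
    (void)steq_fn(args[++i], "-");
    return i;
}
```

With the macro the index ends up at 2, and strcmp is looking at a different argument from the one whose first character was tested. Worse, when the first characters differ the second increment never happens, so how far the index moves depends on the data.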
 
Leor Zolman

Leor said:
#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario

Please have some more coffee, then read the last part of the post I was
replying to and consider it in context (I notice you left out the line I
wrote just before showing that code...)

Thanks,
-leor
 
Guest

Leor said:
Leor said:
#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario

Please have some more coffee,

Yes, I do...
then read the last part of the post I was
replying to and consider it in context
(I notice you left out the line I
wrote just before showing that code...)

Read: OK!

Don't mention it.

- Dario
 
Michael Wojcik

The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

There's also the macro which I believe Peter van der Linden gives in
_Expert C Programming_ (though I can't seem to find it there), along
the lines of:

#define CMPSTR(s1, op, s2) (strcmp(s1, s2) op 0)

which is used as in:

if (CMPSTR(word, ==, "hello"))
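
A short sketch of how a macro along those lines behaves with other relational operators. The helper `order_of` is hypothetical, and the extra parentheses around the arguments are my addition.

```c
#include <string.h>

/* The operator is passed as the middle argument and applied to
 * strcmp's result, so the call reads roughly like "s1 op s2". */
#define CMPSTR(s1, op, s2) (strcmp((s1), (s2)) op 0)

/* Hypothetical helper: -1, 0 or 1 according to the ordering of a and b. */
int order_of(const char *a, const char *b)
{
    if (CMPSTR(a, <, b))
        return -1;
    if (CMPSTR(a, >, b))
        return 1;
    return 0;
}
```

Each argument appears only once in the expansion, so unlike the pre-check macros in this thread it does not evaluate anything twice.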
 
Mabden

Michael Wojcik said:
There's also the macro which I believe Peter van der Linden gives in
_Expert C Programming_ (though I can't seem to find it there), along
the lines of:

I think we are now getting into the realm of redefining the C language.

If the result of all these fancy macros is to rewrite strcmp() then I think
we need to step back and realize that all the C books have a page on
strcmp() but none have CMPSTR or STREQ or STRNEQ or LONG_STR_EQ or whatever.

What is the point of having a terse, manageable language like C and
cluttering it up with crappy macros that only save one character comparison?
Perhaps there _was_ a time when this was a viable, necessary activity. It no
longer is.

Stick to the known functions. Profile your code if it's slow. Then spend the
$150 to upgrade the damn machine, you cheap ass bastard!
 
Leor Zolman

What you're missing (well, I'm sure you're not really missing it,
but you're glossing over it) is that this "chicanery" is not scattered
through the OP's code, but rather stuck behind a very sensibly-named
macro in a sensible part of the program. The programmer never needs
to know how it works, any more than he needs to know how 'qsort' is
optimized to deal with special cases of *its* input. They're both
library functions, conceptually, and if you don't want to know why
the macro works, nobody's forcing you to do all those !!s in your head.
:)

Okay, once you have a debugged, correct macro (not like the "optimized"
version of STREQ we've been talking about here, since the unintended
side-effects issue relegates anything like that to the fringes), no one
would need to look at its implementation.

However, I was thinking more in terms of the cost of using implementation
techniques like this in new development. Until you can abstract it away,
you'll be paying the price to develop it. And eventually /someone/ will
probably be put in the position of having to understand it again, for one
reason or another, and then the price would only go up...
Except for the namespace invasion, this is decent advice.

Can you elaborate on "namespace invasion" here? Sorry, I don't know what
you mean.
-leor
 
Sam Dennis

Leor said:
Can you elaborate on "namespace invasion" here?

Functions beginning with str and a lowercase letter are reserved for
future expansion of the standard library (<string.h> and <stdlib.h>,
but streq has external linkage here, so it'll be undefined behaviour
regardless.)

There are a few other such names and namespaces listed under `Future
library directions' in the Standard. (is|to)[a-z] and E[A-Z0-9] are
particularly noteworthy, along with mem[a-z], also for <string.h>.
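
As a minimal sketch of staying clear of that pattern: the name `str_eq` below is just one hypothetical choice. The reserved names are "str" followed directly by a lowercase letter, so an underscore after "str" falls outside them.

```c
#include <string.h>

/* "streq" matches the reserved pattern str[a-z]; "str_eq" does not,
 * because the character after "str" is an underscore. */
int str_eq(const char *s1, const char *s2)
{
    return strcmp(s1, s2) == 0;
}
```

Any other prefix that is not on the reserved list would do just as well.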
 
Leor Zolman

Leor said:
Can you elaborate on "namespace invasion" here?

Functions beginning with str and a lowercase letter are reserved for
future expansion of the standard library (<string.h> and <stdlib.h>,
but streq has external linkage here, so it'll be undefined behaviour
regardless.)

There are a few other such names and namespaces listed under `Future
library directions' in the Standard. (is|to)[a-z] and E[A-Z0-9] are
particularly noteworthy, along with mem[a-z], also for <string.h>.

Thanks, I just remembered about that this morning. I don't know why I have
such a mental block on that particular aspect of the Standard; perhaps it
just seems completely counter-intuitive to me for it to reserve /any/
arbitrary "ordinary-looking" sequence of initial characters. Well, at least
this time it will probably have finally sunk in...
-leor
 
Michael Wojcik

I think we are now getting into the realm of redefining the C language.

No, since no one has suggested adding any of these to the standard.
We're discussing using the C language, of which function-type macros
are a part.
If the result of all these fancy macros is to rewrite strcmp() then I think
we need to step back and realize that all the C books have a page on
strcmp() but none have CMPSTR or STREQ or STRNEQ or LONG_STR_EQ or whatever.

While some of the macros posted attempt to eliminate strcmp calls in
some cases, I haven't seen one that rewrote strcmp.

And some C books do discuss macros that wrap strcmp. That's what the
text you quoted from my post says, in fact.
What is the point of having a terse, manageable language like C and
cluttering it up with crappy macros that only save one character comparison?

The macro I posted had nothing to do with "sav[ing] one character
comparison". Did you read it? Is this comment in any way relevant
to my post?

And the point of macros in C is and has always been to simplify
development and maintenance of source code. Using macros for this
purpose is not trivial and there is much disagreement on how best
to do it, but that is the point. A macro aims to give a more
meaningful name to a value or a (hopefully short) segment of code;
as such, it should provide more information than what it replaces,
and thereby *increase* terseness and manageability, the goals you
claim for C.
Stick to the known functions.

I'd like to see how you'd implement a significant project in C
with only the standard library functions. No functions of your
own, nothing added by the implementation.

I have no love for the "avoid a call if the first character differs"
macro that started this thread - if a program makes sufficient
calls to strcmp that it becomes necessary to optimize some of them
away, it's almost certainly a candidate for redesign. But using
that as an argument to eliminate strcmp wrappers entirely is silly.
 
Mabden

I have no love for the "avoid a call if the first character differs"
macro that started this thread - if a program makes sufficient
calls to strcmp that it becomes necessary to optimize some of them
away, it's almost certainly a candidate for redesign. But using
that as an argument to eliminate strcmp wrappers entirely is silly.

Ah, good then we agree.
 
Richard Bos

Arthur J. O'Dwyer said:

No, he said alpha_numeric_. That is, a-z plus A-Z plus 0-9 is 62 chars.
In 1/26 of the cases, yes. In 25/26 of the cases, no, strcmp will
never get called, because the initial characters will differ. Thus
we are trading the cost of (26 comparisons and one call to strcmp) for
the cost of (26 calls to strcmp). It's likely that this is a good
trade, I think, although as the cost of a function call gets cheaper,
it becomes less and less of a good trade.

Especially since strcmp() is simple enough to be a likely candidate for
inlining.

Richard
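
The "26 comparisons and one call versus 26 calls" arithmetic can be checked with a small counting sketch. Everything here is illustrative: the key table, `counted_strcmp`, `calls_plain` and `calls_prechecked` are hypothetical names, and the table has exactly one key per initial letter. It counts calls only; it says nothing about real timings, least of all when strcmp gets inlined.

```c
#include <stddef.h>
#include <string.h>

static unsigned long call_count;  /* strcmp calls actually made */

/* strcmp wrapper that counts its invocations. */
static int counted_strcmp(const char *a, const char *b)
{
    ++call_count;
    return strcmp(a, b);
}

/* 26 keys, one per initial letter (illustrative data, not from the thread). */
static const char *const keys[26] = {
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliett", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu"
};

/* Tests every key with strcmp alone; returns the number of calls made. */
unsigned long calls_plain(const char *wanted)
{
    size_t i;
    call_count = 0;
    for (i = 0; i < 26; ++i)
        (void)(counted_strcmp(keys[i], wanted) == 0);
    return call_count;
}

/* Tests every key with a first-character check before strcmp. */
unsigned long calls_prechecked(const char *wanted)
{
    size_t i;
    call_count = 0;
    for (i = 0; i < 26; ++i)
        (void)(*keys[i] == *wanted && counted_strcmp(keys[i], wanted) == 0);
    return call_count;
}
```

Looking up "mike" costs 26 calls the plain way and 1 call with the pre-check, plus 26 single-character comparisons.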
 
