Need to implement strdup, strnicmp and stricmp

jamihuq

I'm trying to use strdup, strnicmp and stricmp on an OS that doesn't
implement them in its string.h header. Does someone have
implementations of these functions, and could you please post them?

Thanks
Jami
 
Ian Malone

jamihuq said:
I'm trying to use strdup, strnicmp and stricmp on an OS that doesn't
implement them in its string.h header. Does someone have
implementations of these functions, and could you please post them?

glibc appears to implement strdup, although it shouldn't be taxing to
write your own. I'd guess from the names that the other two are
case-insensitive comparisons. Running both strings through tolower()
and then calling the normal strcmp/strncmp should work. That is
probably slower than a locale-aware comparison, and it needs a
duplicating step if the original strings must not be modified.
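
A minimal sketch of that approach, assuming a hosted C library with malloc, tolower and strcmp. The names dup_string, lower_in_place and cmp_lowered_copies are made up for illustration and are not part of any library. Working on copies leaves the inputs untouched, at the cost of two allocations per comparison.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a malloc'd copy, or NULL. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Lowercase a string in place, using the current locale's tolower. */
static void lower_in_place(char *s)
{
    for (; *s != '\0'; ++s) {
        *s = (char)tolower((unsigned char)*s);
    }
}

/*
 * Hypothetical case-insensitive compare built from lowered copies.
 * Stores the strcmp-style result in *result and returns 0,
 * or returns -1 if either copy could not be allocated.
 */
int cmp_lowered_copies(const char *s1, const char *s2, int *result)
{
    char *a = dup_string(s1);
    char *b = dup_string(s2);
    int status = -1;

    if (a != NULL && b != NULL) {
        lower_in_place(a);
        lower_in_place(b);
        *result = strcmp(a, b);
        status = 0;
    }
    free(a);   /* free(NULL) is a no-op */
    free(b);
    return status;
}
```

The caller owns whatever dup_string returns and has to free it.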
 
pete

jamihuq said:
I'm trying to use strdup, strnicmp and stricmp on an OS that doesn't
implement them in its string.h header. Does someone have
implementations of these functions, and could you please post them?

This is the current form of my case insensitive versions
of strcmp and strncmp:

#include <ctype.h>   /* toupper */
#include <stddef.h>  /* size_t */

int str_ccmp(const char *s1, const char *s2)
{
    for (;;) {
        if (*s1 != *s2) {
            /* Raw bytes differ: compare again with case folded. */
            int c1 = toupper((unsigned char)*s1);
            int c2 = toupper((unsigned char)*s2);

            if (c2 != c1) {
                return c2 > c1 ? -1 : 1;
            }
        } else {
            if (*s1 == '\0') {
                return 0;
            }
        }
        ++s1;
        ++s2;
    }
}

int str_cncmp(const char *s1, const char *s2, size_t n)
{
    for (;;) {
        if (n-- == 0) {
            return 0;
        }
        if (*s1 != *s2) {
            /* Raw bytes differ: compare again with case folded. */
            int c1 = toupper((unsigned char)*s1);
            int c2 = toupper((unsigned char)*s2);

            if (c2 != c1) {
                return c2 > c1 ? -1 : 1;
            }
        } else {
            if (*s1 == '\0') {
                return 0;
            }
        }
        ++s1;
        ++s2;
    }
}
 
Kenneth Brody

pete said:
This is the current form of my case insensitive versions
of strcmp and strncmp:
[...]

I just ran into a similar situation. The Linux box doesn't have a
stricmp() function. Rather, it has an identical function called
strcasecmp().

#define stricmp(str1,str2) strcasecmp(str1,str2)

There is probably an equivalent to strnicmp() as well.

Are either of these functions part of the C standard?

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
 
pete

Kenneth Brody wrote:
The Linux box doesn't have a stricmp() function.
Rather, it has an identical function called strcasecmp().

#define stricmp(str1,str2) strcasecmp(str1,str2)

There is probably an equivalent to strnicmp() as well.

Are either of these functions part of the C standard?

No.

http://www.open-std.org/JTC1/SC22/WG14/www/docs/

n869 is very good for function descriptions
and comes in a text version.

Some people prefer to use n1124 these days.
 
Robert Gamble

Kenneth said:
pete said:
This is the current form of my case insensitive versions
of strcmp and strncmp:
[...]

I just ran into a similar situation. The Linux box doesn't have a
stricmp() function. Rather, it has an identical function called
strcasecmp().

#define stricmp(str1,str2) strcasecmp(str1,str2)

There is probably an equivalent to strnicmp() as well.

Are either of these functions part of the C standard?

No. strcasecmp is a POSIX extension and stricmp appears to be of
Microsoft origin. strcasecmp is more likely to be portable although,
as demonstrated elsethread, it is trivial to write your own version.
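
As a sketch of that mapping: on POSIX systems strcasecmp and strncasecmp are declared in <strings.h>, and Microsoft's C runtime provides _stricmp and _strnicmp in <string.h>. The wrapper names my_stricmp and my_strnicmp are invented here so that nothing in the implementation's str* namespace gets defined. On a platform with neither family, a hand-written comparison would still be needed.

```c
#if defined(_WIN32)
#include <string.h>   /* _stricmp, _strnicmp (Microsoft C runtime) */
#define my_stricmp(a, b)     _stricmp((a), (b))
#define my_strnicmp(a, b, n) _strnicmp((a), (b), (n))
#else
#include <strings.h>  /* strcasecmp, strncasecmp (POSIX) */
#define my_stricmp(a, b)     strcasecmp((a), (b))
#define my_strnicmp(a, b, n) strncasecmp((a), (b), (n))
#endif
```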

Robert Gamble
 
Robert Gamble

pete said:
No.

http://www.open-std.org/JTC1/SC22/WG14/www/docs/

n869 is very good for function descriptions
and comes in a text version.

Some people prefer to use n1124 these days.

Aside from format preferences (txt versus pdf), is there any reason at
all to prefer n869 over n1124? n1124 contains, for free, the same
thing you would pay for by purchasing 9899:1999 plus TC1, TC2, and
other DR corrections all in one document. The reluctance some people
seem to have about recommending it over n869 puzzles me. Is this due
solely to formatting reasons?

Robert Gamble
 
Flash Gordon

Robert said:
Aside from format preferences (txt versus pdf), is there any reason at
all to prefer n869 over n1124? n1124 contains, for free, the same
thing you would pay for by purchasing 9899:1999 plus TC1, TC2, and
other DR corrections all in one document. The reluctance some people
seem to have about recommending it over n869 puzzles me. Is this due
solely to formatting reasons?

Probably the main reason is that most people will be using an
implementation that can be C89 compliant (possibly with the need of
appropriate options) but not C99 compliant.
 
Robert Gamble

Flash said:
Probably the main reason is that most people will be using an
implementation that can be C89 compliant (possibly with the need of
appropriate options) but not C99 compliant.

n869 is the last public C99 draft...

Robert Gamble
 
Chris Torek

pete said:
This is the current form of my case insensitive versions
of strcmp and strncmp:
[snippage]

int c1 = toupper((unsigned char)*s1);
int c2 = toupper((unsigned char)*s2);

I would convert both to lowercase myself. In German, the eszett
character (ß -- code point 0xdf -- in ISO Latin 1) is lowercase
but has no uppercase equivalent (it has to be written as SS). I
suspect most toupper()s will leave it alone, so the code will
"work right" anyway; and it is possible there are other languages
that have an uppercase character with no lowercase equivalent; but
the existence of this one example is enough for me to favor
conversion to lowercase.
 
Eric Sosman

Chris Torek wrote on 06/29/06 17:52:
pete said:
This is the current form of my case insensitive versions
of strcmp and strncmp:
[snippage]

int c1 = toupper((unsigned char)*s1);
int c2 = toupper((unsigned char)*s2);


I would convert both to lowercase myself. In German, the eszett
character (ß -- code point 0xdf -- in ISO Latin 1) is lowercase
but has no uppercase equivalent (it has to be written as SS). I
suspect most toupper()s will leave it alone, so the code will
"work right" anyway; and it is possible there are other languages
that have an uppercase character with no lowercase equivalent; but
the existence of this one example is enough for me to favor
conversion to lowercase.

The snipped part of the code has already established
that *s1 != *s2. Things might be different if he were
doing a wholesale case conversion followed by an ordinary
strcmp(), but that's not the, er, case.
 
pete

Robert said:
Aside from format preferences (txt versus pdf), is there any reason at
all to prefer n869 over n1124?

No.

I quote ISO/IEC 9899:1999 (E) when I have to,
for example when the topic is "negative zero".
 
pete

Chris said:
This is the current form of my case insensitive versions
of strcmp and strncmp: [snippage]
int c1 = toupper((unsigned char)*s1);
int c2 = toupper((unsigned char)*s2);

I would convert both to lowercase myself. In German, the eszett
character (ß -- code point 0xdf -- in ISO Latin 1) is lowercase
but has no uppercase equivalent (it has to be written as SS). I
suspect most toupper()s will leave it alone, so the code will
"work right" anyway; and it is possible there are other languages
that have an uppercase character with no lowercase equivalent; but
the existence of this one example is enough for me to favor
conversion to lowercase.

The standard says:

[#3] If the argument is a character for which islower is
true and there are one or more corresponding characters, as
specified by the current locale, for which isupper is true,
the toupper function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.

... which means that toupper returns the eszett argument unchanged.

I don't understand what consequences you see
in the worst possible case.
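
A small sketch of why the posted loop is unaffected, assuming the default "C" locale, where islower is true only for a through z. The helper name same_ignoring_case is invented for illustration. In that locale the code point 0xdf passes through toupper unchanged, and two such bytes match before toupper is consulted at all, because the raw values are already equal.

```c
#include <ctype.h>

/* Hypothetical helper: do two character values match once case is ignored? */
int same_ignoring_case(unsigned char a, unsigned char b)
{
    /* Equal bytes match without consulting toupper at all. */
    if (a == b) {
        return 1;
    }
    /* Otherwise fold both with toupper, as the posted functions do. */
    return toupper(a) == toupper(b);
}
```

In other locales the result of toupper(0xdf) depends on the locale's tables, so this only illustrates the "C" locale case.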
 
SM Ryan

# I'm trying to use strdup, strnicmp and stricmp on an OS that doesn't
# implement them in its string.h header. Does someone have
# implementations of these functions, and could you please post them?

char *strdup(char *s) {
    return strcpy(malloc(strlen(s)+1),s);
}

int stricmp(char *s,char *t) {
    int cc;
    do cc = tolower(*s++) - tolower(*t++); while (!cc && s[-1]);
    return cc;
}

int strnicmp(char *s,char *t,int n) {
    int cc;
    if (n==0) return 0;
    do cc = tolower(*s++) - tolower(*t++); while (!cc && s[-1] && --n>0);
    return cc;
}
 
Richard Heathfield

SM Ryan said:
# I'm trying to use strdup, strnicmp and stricmp on an OS that doesn't
# implement them in its string.h header. Does someone have
# implementations of these functions, and could you please post them?

char *strdup(char *s) {

Invasion of implementation namespace. Make s a const char * instead.

    return strcpy(malloc(strlen(s)+1),s);

Undefined behaviour if malloc fails.

int stricmp(char *s,char *t) {

Invasion of implementation namespace.

    int cc;
    do cc = tolower(*s++) - tolower(*t++); while (!cc && s[-1]);

Possible underflow issue here if *s is CHAR_MIN and *t is positive. Also, the
negative offset will upset some maintenance programmers. Finally, UB if *s
or *t is negative, since tolower requires a value that is representable as
an unsigned char (or EOF).

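#include <ctype.h>

int cmpistr(const char *s, const char *t)
{
    unsigned char c, d;
    int diff = 0;
    while(diff == 0 && *s != '\0' && *t != '\0')
    {
        /* Fold case via unsigned char so tolower never sees a negative value. */
        c = (unsigned char)tolower((unsigned char)*s++);
        d = (unsigned char)tolower((unsigned char)*t++);
        diff = (c > d) - (c < d);
    }

    if(0 == diff)
    {
        /* At least one string has ended; the longer one compares greater. */
        diff = (*s != '\0') - (*t != '\0');
    }

    return diff;
}
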
int strnicmp(char *s,char *t,int n) {
int cc;
if (n==0) return 0;
do cc = tolower(*s++) - tolower(*t++); while (!cc && s[-1] && --n>0);
return cc;

Much the same comments as above apply here too.
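
For the length-limited case, a sketch along the same lines, assuming tolower from <ctype.h> and a size_t count rather than int. The name cmpistrn is made up to stay out of the implementation's namespace, and the lowercase folding follows the current locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical bounded, case-insensitive compare of at most n characters. */
int cmpistrn(const char *s, const char *t, size_t n)
{
    while (n-- > 0) {
        /* Convert through unsigned char so tolower never sees a negative value. */
        int c = tolower((unsigned char)*s++);
        int d = tolower((unsigned char)*t++);

        if (c != d) {
            return (c > d) - (c < d);
        }
        if (c == '\0') {
            break;  /* both strings ended together */
        }
    }
    return 0;
}
```

A count of zero compares nothing and returns 0, matching what strncmp does.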
 
Chris Torek

Chris said:
I would [use tolower()] myself [on concern for a lowercase character
that lacks an uppercase equivalent].

The standard says:

[#3] If the argument is a character for which islower is
true and there are one or more corresponding characters, as
specified by the current locale, for which isupper is true,
the toupper function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.

... which means that toupper returns the eszett argument unchanged.

My C89 standard is not easily accessible, but I *think* this text
is new in C99 (or perhaps C95). C89's locale support was not very
well tacked-down, as I recall.

Of course, as Eric Sosman pointed out, your original code was
completely safe anyway.
 
Keith Thompson

Chris Torek said:
Chris said:
I would [use tolower()] myself [on concern for a lowercase character
that lacks an uppercase equivalent].

The standard says:

[#3] If the argument is a character for which islower is
true and there are one or more corresponding characters, as
specified by the current locale, for which isupper is true,
the toupper function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.

... which means that toupper returns the eszett argument unchanged.

My C89 standard is not easily accessible, but I *think* this text
is new in C99 (or perhaps C95). C89's locale support was not very
well tacked-down, as I recall.

Here's what C90 says:

7.3.2.2 The toupper function

Synopsis
#include <ctype.h>
int toupper(int c);

Description
The toupper function converts a lowercase letter to the
corresponding uppercase letter.

Returns
If the argument is a character for which islower is true and
there is a corresponding character for which isupper is true,
the toupper function returns the corresponding character;
otherwise, the argument is returned unchanged.
 
pete

Chris said:
Chris said:
I would [use tolower()] myself [on concern for a lowercase character
that lacks an uppercase equivalent].

The standard says:

[#3] If the argument is a character for which islower is
true and there are one or more corresponding characters, as
specified by the current locale, for which isupper is true,
the toupper function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.

... which means that toupper returns the eszett argument unchanged.

My C89 standard is not easily accessible, but I *think* this text
is new in C99 (or perhaps C95). C89's locale support was not very
well tacked-down, as I recall.

Well, it doesn't use the word "locale", here:

ISO/IEC 9899: 1990
7.3.2.2 The toupper function
Returns
If the argument is a character for which islower is true
and there is a corresponding character for which isupper is true,
the toupper function returns the corresponding character;
otherwise, the argument is returned unchanged.

Of course, as Eric Sosman pointed out, your original code was
completely safe anyway.

Thank you, to both of you.
 
