Warnings with gcc

Randy Howard

jacob navia wrote
(in article said:
strdup is in string.h in my linux system with
gcc version 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)

By now you should know very well why this is an issue,
particularly in clc. I suspect you do, but intentionally don't
mention it. Disappointing.
 
Keith Thompson

Hello,
I get a warning when I compile:
#include <string.h>

int main (int argc, char ** argv)
{
char * s;
s = strdup("a string");
return 0;
}

As others have mentioned, strdup() is not a standard C function. If
you want to maximize the portability of your code, it's easy enough
to roll your own:

char *dupstr(const char *s)
{
char *result = malloc(strlen(s) + 1);
if (result != NULL) {
strcpy(result, s);
}
return result;
}

(This requires #include directives for <string.h> and <stdlib.h>.)

On the other hand, strdup() is part of the POSIX standard. If you're
already using other POSIX-specific features that aren't part of
standard C, you might as well use strdup() as well; just don't tell
your compiler to restrict itself to strict ANSI/ISO C.
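If you do want to keep strict ISO mode switched on and still call strdup(), one common approach is a POSIX feature test macro. The following is only a sketch: it assumes a system whose headers honour _POSIX_C_SOURCE (the value 200809L requests POSIX.1-2008, where strdup() is part of the base specification), and copy_greeting() is a made-up name for illustration.

```c
/* Must come before the first #include to have any effect. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper that relies on the POSIX declaration of strdup(). */
char *copy_greeting(void)
{
    return strdup("a string");     /* caller must free() the result */
}
```

On systems without POSIX headers this does not help, and the hand-rolled dupstr() remains the portable choice.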
 
Jordan Abel

As others have mentioned, strdup() is not a standard C function. If
you want to maximize the portability of your code, it's easy enough
to roll your own:

char *dupstr(const char *s)
{
char *result = malloc(strlen(s) + 1);
if (result != NULL) {
strcpy(result, s);
}
return result;
}

Note that storing the length and using memcpy may be more efficient.
This is also what FreeBSD does for strdup. This also accounts for the
[dubious] possibility that the length of the string may overflow size_t.
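A minimal sketch of the memcpy variant described here, assuming the same contract as dupstr() above. The name dupstr_memcpy is hypothetical, and this is not a copy of any particular library's source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; same contract as the dupstr() quoted above. */
char *dupstr_memcpy(const char *s)
{
    size_t size = strlen(s) + 1;   /* bytes to copy, including the '\0' */
    char *result = malloc(size);

    if (result != NULL) {
        memcpy(result, s, size);   /* length already known: no second scan for '\0' */
    }
    return result;
}
```

The length is computed once and used for both the allocation and the copy, so the two can never disagree about how many bytes are involved.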
 
Keith Thompson

Jordan Abel said:
As others have mentioned, strdup() is not a standard C function. If
you want to maximize the portability of your code, it's easy enough
to roll your own:

char *dupstr(const char *s)
{
char *result = malloc(strlen(s) + 1);
if (result != NULL) {
strcpy(result, s);
}
return result;
}

Note that storing the length and using memcpy may be more efficient.
This is also what FreeBSD does for strdup. This also accounts for the
[dubious] possibility that the length of the string may overflow size_t.

It might be more efficient, but there's no *fundamental* reason to
expect that it will be. Both strcpy() and memcpy() do a linear (O(n))
scan of the input string; strcpy() looks for a '\0' character, and
memcpy() scans until it's copied the specified number of bytes.

I suppose strcpy() might be slower because it has to examine each
byte, and accessing a byte might be slower than testing a word-sized
counter. Also, memcpy() might be optimized in ways that strcpy()
might not be, for example copying a word at a time rather than a byte
at a time.

There are cases where scanning to the end of a string looking for the
'\0' is clearly less efficient than an alternative, such as building
up a string with multiple calls to strcat(). This isn't one of them;
the performance issues here are more subtle.
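To illustrate the strcat() case: each strcat() call has to walk the destination from the start to find its '\0', so building a string from n pieces rescans the earlier pieces again and again. A sketch of the usual alternative, which remembers where the string currently ends, might look like this. The name concat_all is hypothetical, and the caller is assumed to supply a large enough buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenates n strings into dst, which the caller
   guarantees is large enough. Returns dst. */
char *concat_all(char *dst, const char **parts, size_t n)
{
    char *end = dst;               /* always points at the terminating '\0' */
    size_t i;

    *end = '\0';
    for (i = 0; i < n; i++) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len + 1);   /* copy including the '\0' */
        end += len;                       /* no rescan of what was already written */
    }
    return dst;
}
```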

This could be an argument in favor of using the system-provided
strdup() function, if there is one, since it might be optimized for
the underlying hardware in ways that a portable implementation can't
be.
 
Mark McIntyre

strdup is in string.h in my linux system with
gcc version 2.96

In order for this implementation to conform, compilation with strict
ISO compatibility would have to exclude the declaration.

As indeed it does - which is why the OP has the problem he does.

It's also why posting platform-specific answers here is a bad idea.
Your post can only have further confused the OP. Jacob, please take
note of this.
 
Netocrat

[dubious] possibility that the length of the string may overflow size_t.

I don't believe it's possible to portably construct a string whose length
overflows size_t. How would you do it?
 
Flash Gordon

Netocrat said:
[dubious] possibility that the length of the string may overflow size_t.

I don't believe it's possible to portably construct a string whose length
overflows size_t. How would you do it?

Untested:
#include <stdlib.h>

#define BIGNUM1 32767
#define BIGNUM2 32767

int main(void)
{
unsigned long i,j;
char *bigstring=calloc(BIGNUM1,BIGNUM2);
char *ptr=bigstring;
if (bigstring) {
for (i=1; i<=BIGNUM1; i++) {
/* on the last pass, leave the final byte as the '\0' stored by calloc */
for (j=0; j<((i==BIGNUM1)?BIGNUM2-1:BIGNUM2); j++) {
*ptr++ = 'A';
}
}
/* do whatever */
free(bigstring);
}
}

No guarantee that this will actually generate such a string, since in all
probability calloc will fail, but with suitable values of BIGNUM1 and
BIGNUM2 for your system I believe the above could theoretically generate
a string with a length greater than the largest value of size_t.

For the best chance you want to try and allocate only just enough space
for it to be larger than can be expressed in size_t.
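Whether a given pair of values would actually push the product past what size_t can express can be checked without performing the multiplication. This is only a sketch: product_overflows is a hypothetical name, and it uses (size_t)-1 for the largest size_t value so that it does not depend on a SIZE_MAX macro.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if nmemb * size cannot be represented
   in a size_t, 0 otherwise. */
int product_overflows(size_t nmemb, size_t size)
{
    size_t max = (size_t)-1;       /* largest size_t value; works in C89 too */

    return size != 0 && nmemb > max / size;
}
```

Dividing instead of multiplying avoids the wraparound that unsigned arithmetic would otherwise hide.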
 
pete

Netocrat said:
[dubious] possibility that the
length of the string may overflow size_t.

I don't believe it's possible to
portably construct a string whose length
overflows size_t. How would you do it?

Portably? You can't.

The description of the standard string functions
is that they are for:
"manipulating arrays of character type and other
objects treated as arrays of character type."
... despite the fact
that the definition of "string" is broader than that.

I don't see much point in worrying about strings
longer than (size_t)-1.
 
S.Tobias

pete said:
Netocrat said:
[dubious] possibility that the
length of the string may overflow size_t.

I don't believe it's possible to
portably construct a string whose length
overflows size_t. How would you do it?

Portably? You can't.

The description of the standard string functions
is that they are for:
"manipulating arrays of character type and other
objects treated as arrays of character type."
... despite the fact
that the definition of "string" is broader than that.

I don't see much point in worrying about strings
longer than (size_t)-1.
#include <...>
char *p = calloc(2, SIZE_MAX);
assert(p);
for (p1=p, i=0; i<2; ++i)
for (j=0; j<SIZE_MAX; ++j)
if (!(i == 1 && j == SIZE_MAX-1))
*p1++ = 'x';
/*code not checked, but you get the idea*/
strlen(p); /* ??? */
 
Netocrat

Netocrat said:
[dubious] possibility that the length of the string may overflow size_t.

I don't believe it's possible to portably construct a string whose length
overflows size_t. How would you do it?

Untested:
#include <stdlib.h>

#define BIGNUM1 32767
#define BIGNUM2 32767

You presumably intend that SIZE_MAX is equal to or only slightly larger
than 32767.
int main(void)
{
unsigned long i,j;
char *bigstring=calloc(BIGNUM1,BIGNUM2);

char *ptr=bigstring;
if (bigstring) {
for (i=1; i<=BIGNUM1; i++) {
for (j=0; j<((i==BIGNUM1)?BIGNUM2-1:BIGNUM2); j++) {
*ptr++ = 'A';

calloc returns an array of objects. This code treats that array as a
single object, but pointer arithmetic is only defined within an object (or
one past the end).

That can be easily solved using indexing instead of pointer arithmetic:

bigstring[j] = 'A';
}
}
/* do whatever */
free(bigstring);
}
}

No guarantee that this will actually generate such a string, since in all
probability calloc will fail, but with suitable values of BIGNUM1 and
BIGNUM2 for your system I believe the above could theoretically generate
a string with a length greater than the largest value of size_t.

Pete's already pointed out that the functions of <string.h> operate on
arrays of character type.

Aside from that, there's defect report #266
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_266.htm) which deals
with sizeof applied to a static two-dimensional array of size greater than
SIZE_MAX and concludes that "The program is not strictly conforming
because it exceeds an environmental limit."

Looking at the description of strlen in both C89 and C99 drafts, I see no
mention of how strings with length greater than SIZE_MAX are dealt with,
so I imagine that the same reasoning would apply. Such a string would
violate an environmental limit.
 
Flash Gordon

Netocrat said:
Netocrat said:
On Tue, 22 Nov 2005 20:40:19 +0000, Jordan Abel wrote:

[dubious] possibility that the length of the string may overflow size_t.
I don't believe it's possible to portably construct a string whose length
overflows size_t. How would you do it?
Untested:
#include <stdlib.h>

#define BIGNUM1 32767
#define BIGNUM2 32767

You presumably intend that SIZE_MAX is equal to or only slightly larger
than 32767.

Did C89 have SIZE_MAX? I could not see it mentioned in my copy of K&R2.
Anyway, I intend for the *product* of the two numbers to be slightly
larger than SIZE_MAX, so select them appropriately for your environment.
calloc returns an array of objects. This code treats that array as a
single object, but pointer arithmetic is only defined within an object (or
one past the end).

I believe that is not a problem for a char pointer. An array is an
object and you can use a char pointer to step through any object. Note
the definition of object in the standard, which is, "region of data
storage in the execution environment, the contents of which can
represent values" which, as far as I can see, applies to the region of
data storage pointed to by the pointer returned by calloc.
That can be easily solved using indexing instead of pointer arithmetic:

bigstring[j] = 'A';


This would require changing the type of bigstring.
Pete's already pointed out that the functions of <string.h> operate on
arrays of character type.

The definition of string I can see in section 7.1.1 of N1124 does not
mention arrays, it says:
| A string is a contiguous sequence of characters terminated by and
| including the first null character. The term multibyte string is
| sometimes used instead to emphasize special processing given to
| multibyte characters contained in the string or to avoid confusion
| with a wide string. A pointer to a string is a pointer to its initial
| (lowest addressed) character. The length of a string is the number of
| bytes preceding the null character and the value of a string is the
| sequence of the values of the contained characters, in order.

What I generate above meets that definition as far as I can see.
Aside from that, there's defect report #266
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_266.htm) which deals
with sizeof applied to a static two-dimensional array of size greater than
SIZE_MAX and concludes that "The program is not strictly conforming
because it exceeds an environmental limit."

Definitely true if you use a declared array rather than a calloced
block. I would also argue that as an array is an object (see my
arguments above) it also applies to a block created by calloc.
Looking at the description of strlen in both C89 and C99 drafts, I see no
mention of how strings with length greater than SIZE_MAX are dealt with,
so I imagine that the same reasoning would apply. Such a string would
violate an environmental limit.

Either that or it is undefined behaviour "by the omission of any explicit
definition of behavior." if you pass such a string to strlen or one of
the other functions that returns an index of type size_t. However, your
suggestion of exceeding an environmental limit would also, IMHO, apply
to my call to calloc.

I should have put the above by where I defined the constants, rather
than down here.
 
Netocrat

[can a string with length > SIZE_MAX be portably created; and if so what
is the result of calling strlen on it]

[whose product is intended to be greater than SIZE_MAX]
Did C89 have SIZE_MAX?

It did not.
I believe that is not a problem for a char pointer.

N1124, 6.5.6#8 (on pointer arithmetic)

| If both the pointer operand and the result point to elements of the
| same array object, or one past the last element of the array object,
| the evaluation shall not produce an overflow; otherwise, the
| behavior is undefined. If the result points one past the last
| element of the array object, it shall not be used as the operand of a
| unary * operator that is evaluated.

The "array object" is an array of char (the type to which ptr points),
whereas calloc returns an array of arrays of char. As soon as the ptr
reaches one beyond the end of the first array and is dereferenced, it
is breaking the "shall" condition; and once it reaches two beyond the
end of the first array, it is explicitly undefined behaviour.
An array is an
object and you can use a char pointer to step through any object.

C&V?

[calloc returns an object]

Agreed.
That can be easily solved using indexing instead of pointer arithmetic:

bigstring[j] = 'A';


This would require changing the type of bigstring.


Right: char (*bigstring)[BIGNUM2];
Pete's already pointed out that the functions of <string.h> operate on
arrays of character type.

The definition of string I can see in section 7.1.1 of N1124 does not
mention arrays, [quote omitted]
What I generate above meets that definition as far as I can see.

Agreed, and the quote pete referred to allows for "objects treated as
arrays of char type".
Definitely true if you use a declared array rather than a calloced
block. I would also argue that as an array is an object (see my
arguments above) it also applies to a block created by calloc.

Although the DR's conclusion also says:

| Translation limits do not apply to objects whose size is determined
| at runtime.
Either that or it is undefined behaviour "by the omission of any explicit
definition of behavior."

That seems like better reasoning given what I quoted above from the DR re
translation limits not applying to runtime objects.
if you pass such a string to strlen or one of
the other functions that returns an index of type size_t. However, your
suggestion of exceeding an environmental limit would also, IMHO, apply
to my call to calloc.

The standard doesn't seem to require it; likewise DR 266. Regarding the
result of applying sizeof to such an oversize (determined at runtime)
object, the DR seems to say nothing though.

[...]
 
Flash Gordon

Netocrat said:
[can a string with length > SIZE_MAX be portably created; and if so what
is the result of calling strlen on it]

[whose product is intended to be greater than SIZE_MAX]
Did C89 have SIZE_MAX?

It did not.

That's what I thought, and why I did not use it.
N1124, 6.5.6#8 (on pointer arithmetic)

| If both the pointer operand and the result point to elements of the
| same array object, or one past the last element of the array object,
| the evaluation shall not produce an overflow; otherwise, the
| behavior is undefined. If the result points one past the last
| element of the array object, it shall not be used as the operand of a
| unary * operator that is evaluated.

The "array object" is an array of char (the type to which ptr points),
whereas calloc returns an array of arrays of char. As soon as the ptr
reaches one beyond the end of the first array and is dereferenced, it
is breaking the "shall" condition; and once it reaches two beyond the
end of the first array, it is explicitly undefined behaviour.

I disagree because an array is an object so this applies from N1124,
6.3.2.3 Pointers:
| 7 A pointer to an object or incomplete type may be converted to a
| pointer to a different object or incomplete type. If the resulting
| pointer is not correctly aligned for the pointed-to type, the
| behavior is undefined. Otherwise, when converted back again, the
| result shall compare equal to the original pointer. When a pointer
| to an object is converted to a pointer to a character type, the
| result points to the lowest addressed byte of the object. Successive
| increments of the result, up to the size of the object, yield
| pointers to the remaining bytes of the object.

See above, and the definition of an object in N1124, 3.14:
| object
| region of data storage in the execution environment, the contents of
| which can represent values
[calloc returns an object]

Agreed.

Since you agree calloc returns an object, 6.3.2.3 applies when you
convert it to a pointer to a character type.
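A small sketch of what that paragraph permits: walking every byte of an object through a pointer to a character type, whatever the object's declared type. The name count_zero_bytes is hypothetical, and the sizes used with it are assumed to be ordinary ones that fit comfortably in size_t.

```c
#include <stddef.h>

/* Hypothetical helper: counts the zero bytes in any object by walking it
   through a pointer to unsigned char. */
size_t count_zero_bytes(const void *object, size_t size)
{
    const unsigned char *byte = object;
    size_t i, zeros = 0;

    for (i = 0; i < size; i++) {
        if (byte[i] == 0) {
            zeros++;
        }
    }
    return zeros;
}
```

Applied to a block freshly returned by calloc, every byte should be counted, since calloc initialises the space to all bits zero.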

Although the DR's conclusion also says:

| Translation limits do not apply to objects whose size is determined
| at runtime.

Ah well.
That seems like better reasoning given what I quoted above from the DR re
translation limits not applying to runtime objects.

Indeed, which is why I mentioned it :)
The standard doesn't seem to require it; likewise DR 266. Regarding the
result of applying sizeof to such an oversize (determined at runtime)
object, the DR seems to say nothing though.

Well then, if that is the case then I would guess the calloc call is
required to fail or return a block that could be used for an oversize
string, it's just that functions like strlen are not required to work
with the oversize string.
 
Netocrat

Netocrat wrote:
[which part of the standard legitimises stepping with a char pointer
through memory returned by a call to calloc with two non-zero arguments?]
[...]
Since you agree calloc returns an object, 6.3.2.3 applies when you
convert it to a pointer to a character type.

That's what I was looking for.

[...]
I would guess the calloc call is
required to fail or return a block that could be used for an oversize
string, it's just that functions like strlen are not required to work
with the oversize string.

So Jordan's advice was sound (use memcpy rather than strcpy after
malloc'ing memory based on the return of strlen).
 
Flash Gordon

Netocrat said:
Netocrat wrote:
[which part of the standard legitimises stepping with a char pointer
through memory returned by a call to calloc with two non-zero arguments?]
[...]
Since you agree calloc returns an object, 6.3.2.3 applies when you
convert it to a pointer to a character type.

That's what I was looking for.
:)
[...]
I would guess the calloc call is
required to fail or return a block that could be used for an oversize
string, it's just that functions like strlen are not required to work
with the oversize string.

So Jordan's advice was sound (use memcpy rather than strcpy after
malloc'ing memory based on the return of strlen).

Indeed.

It's my habit in any case, because of the processors I've used where
memcpy could be implemented significantly more efficiently than strcpy
(you could tell the processor to repeat the next instruction n times,
with the next instruction being copy a byte, and where a byte was the
same size as the data bus). No guarantee that it is more efficient on
other processors of course, or even on that processor (the
implementation is allowed to provide a needlessly inefficient memcpy),
but it is why I developed the habit of always using memcpy IF I know in
advance how much data is available to copy, even when copying strings.
 
