Unexpected behaviour of malloc

sindica

I am using DevC++ 4.0 lately, which uses Mingw port of GCC, on a
WinXP. I am surprised to see the malloc behaviour which is not
consistent with the documentation. See the program and its output
below.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
char *str1;
char *str2;
str1 = "Hello world";
printf("Length of str1 %d\n", strlen(str1));
str2 = (char *) malloc(strlen(str1));
printf("Length of str2 %d\n", strlen(str2));

strncpy(str2, str1, strlen(str1));
printf("Length of str2 after %d\n", strlen(str2));


return 0;
}

Output is
Length of str1 11
Length of str2 3
Length of str2 after 17

From what I understand from the explanation of malloc, "Length of
str2" should be 11. And how come "Length of str2 after" is 17?

Is it because malloc allocates a minimum chunk of memory (3 here) and
later grows it on demand? OK, agreed, but it grows well beyond the
demand (it grows up to 17 instead of 11).

~saraca
 
Lew Pitcher

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

(e-mail address removed) wrote:
| I am using DevC++ 4.0 lately, which uses Mingw port of GCC, on a
| WinXP. I am surprised to see the malloc behaviour which is not
| consistent with the documentation. See the program and its output
| below.
|
| #include <stdio.h>
| #include <stdlib.h>
|
| int main(int argc, char *argv[])
| {
| char *str1;
| char *str2;
| str1 = "Hello world";
| printf("Length of str1 %d\n", strlen(str1));
| str2 = (char *) malloc(strlen(str1));
| printf("Length of str2 %d\n", strlen(str2));
|
| strncpy(str2, str1, strlen(str1));
| printf("Length of str2 after %d\n", strlen(str2));
|
|
| return 0;
| }
|
| Output is
| Length of str1 11
| Length of str2 3
| Length of str2 after 17
|
| From what i understand from the explanation of malloc, "Length of
| str2" should be 11.

Not the way you compute it, it shouldn't.

malloc() returns a pointer to an uninitialized piece of memory that you can
store a string in.

strlen() counts the number of characters in a character array, stopping at, and
excluding, the first '\0' character.

Since you don't initialize the memory that malloc() gives you before you use
strlen() to count the number of characters, you are counting the non-zero
characters that are the residual garbage in the character array pointed to by
the value that malloc() returned. There is no guarantee that there will be a
'\0' character anywhere in that array, or even beyond it, and no guarantee that
the run-time will prevent strlen() from exceeding the bounds of the memory that
was allocated by malloc(). This means that you are potentially counting /beyond/
the confines of the memory allocated by malloc(), or counting short within the
memory block.

In other words, your method is wrong, and your results are wrong because of it.

There are no predetermined 'expected results' for your computation, so 3 is as
valid as 11.

| And how come "Length of str2 after" is 17.

See above. You fouled up the computation with the same design flaw.

One further note; in C, strings are terminated by a '\0' character that isn't
counted by strlen(). Thus, your str1 contains 12 characters, even though
strlen() says that there are 11.

This means that your malloc() is one byte too short to contain the '\0' that's
needed for your third test ("Length of str2 after") to work properly.


| Is it because malloc allocates a minimum chunk of memory (3 here)

No. See above.

[snip]

- --
Lew Pitcher
IT Consultant, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)

iD8DBQFBr0WcagVFX4UWr64RAmSXAKCldZ0/lVk9Gu/Qg/Yq24EezLYSGgCg3Vpy
3uA0cgS7BWXbhV0K7PyR9oU=
=TFwh
-----END PGP SIGNATURE-----
 
dandelion

> I am using DevC++ 4.0 lately, which uses Mingw port of GCC, on a
> WinXP. I am surprised to see the malloc behaviour which is not
> consistent with the documentation. See the program and its output
> below.
>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
> char *str1;
> char *str2;
> str1 = "Hello world";
> printf("Length of str1 %d\n", strlen(str1));
> str2 = (char *) malloc(strlen(str1));
>
> printf("Length of str2 %d\n", strlen(str2));

Invoking strlen on an uninitialized string (as you do) may give you *any*
size and still be correct. Nobody knows what's in the string, after all.

> strncpy(str2, str1, strlen(str1));

Since you provided strncpy with a maximum length of 11 (strlen("Hello
world")), the terminating '\0' (which denotes the end of the string) is not
copied. Hence str2 is not zero-terminated and the second strlen will count
until it encounters a '\0'. It might as well have printed any other number,
since that basically depends on what happens to be in memory from str2 + 11
onward.

Try strlen(str1) + 1 instead.
> printf("Length of str2 after %d\n", strlen(str2));
>
> From what i understand from the explanation of malloc, "Length of
> str2" should be 11.

Nope. malloc does not automathemagically copy any contents anywhere. It
returns an *uninitialized* chunk of memory (or NULL, of course).

> And how come "Length of str2 after" is 17.
> Is it because malloc allocates a minimum chunk of memory (3 here) and
> later grow to the demand?

Nope.

> Ok agreed, but it grows well beyond the demand (It grows up to 17 instead
> of 11).

For the reasons mentioned above. It seems your understanding of C is not
yet complete. I would suggest buying a good (tutorial) book on the subject.
 
Martin Ambuhl

> I am using DevC++ 4.0 lately, which uses Mingw port of GCC, on a
> WinXP. I am surprised to see the malloc behaviour which is not
> consistent with the documentation. See the program and its output
> below.
>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
> char *str1;
> char *str2;
> str1 = "Hello world";
> printf("Length of str1 %d\n", strlen(str1));
> str2 = (char *) malloc(strlen(str1));
> printf("Length of str2 %d\n", strlen(str2));

/* DON'T DO THIS! You have not initialized the memory to which str2
points. You have no guarantee that there is a '\0' anywhere in the
allocated memory. */

> strncpy(str2, str1, strlen(str1));
> printf("Length of str2 after %d\n", strlen(str2));

/* DON'T DO THIS! You have not copied the '\0' terminating str1 into the
memory to which str2 points. You have no guarantee that there is a '\0'
anywhere in the allocated memory. */

> return 0;
> }
>
> Output is
> Length of str1 11
> Length of str2 3
> Length of str2 after 17
>
> From what i understand from the explanation of malloc, "Length of
> str2" should be 11.

Why should it be? If you want the string to be terminated properly,
copy the terminating '\0'.

> And how come "Length of str2 after" is 17.
> Is it because malloc allocates a minimum chunk of memory (3 here)

It has nothing to do with malloc. The 3 is an accident: it could have
been 0 or 6000000 or 42 as easily.

> and later grow to the demand? Ok agreed, but it grows well beyond the
> demand (It grows up to 17 instead of 11).

Learn how to do the simple thing of terminating strings properly before
wading into arcana that you don't understand. Try your code with
*str2 = 0;
before the
printf("Length of str2 %d\n", strlen(str2));
and with
strncpy(str2, str1, strlen(str1));
changed to
strncpy(str2, str1, 1+strlen(str1));
or
strcpy(str2, str1);
 
Charlie Gordon

> I am using DevC++ 4.0 lately, which uses Mingw port of GCC, on a
> WinXP. I am surprised to see the malloc behaviour which is not
> consistent with the documentation. See the program and its output
> below.
>
> #include <stdio.h>
> #include <stdlib.h>

#include <string.h> as well !

> int main(int argc, char *argv[])
> {
> char *str1;
> char *str2;
> str1 = "Hello world";

const char *str1 would be advisable, but the standard is lame, and the compiler
defaults favor sloppiness.

> printf("Length of str1 %d\n", strlen(str1));
> str2 = (char *) malloc(strlen(str1));

Three (!) regulars pointed out the unnecessary cast, but none noticed that the
size is one byte too short !
str2 = strdup(str1); is what the OP wants, but the C99 people were too lame to
introduce useful de facto standard stuff into the standard instead of
martyrizing flies with tons of int*_t types, iso646.h crap, fucking digraphs,
zillions of stupid PRI*** and SCN*** macros...

> printf("Length of str2 %d\n", strlen(str2));

Of course str2 points to uninitialized memory, so this is undefined behaviour.

> strncpy(str2, str1, strlen(str1));

Here we go again : it is a miracle, there is nothing wrong with this gem !
except the OP has no clue about the consequences, and I am getting tired of
explaining why strncpy should be deprecated altogether. NEVER USE strncpy !
> printf("Length of str2 after %d\n", strlen(str2));

strlen(str2) yields undefined behaviour.

> return 0;
> }
>
> Output is
> Length of str1 11
> Length of str2 3
> Length of str2 after 17

It could just as well crash !
 
Method Man

> #include <string.h> as well !

I was going to comment on this as well. The OP's code seems to have compiled
ok without the header, but how? Is this (handling library includes) totally
implementation specific?

> Three (!) regulars pointed out the unnecessary cast, but none noticed that the
> size is one byte too short !

I think Lew Pitcher discussed that at the bottom of his post. It's an
important point to mention though.
 
Thomas Stegen

Method Man said:
> I was going to comment on this as well. The OP's code seems to have compiled
> ok without the header, but how? Is this (handling library includes) totally
> implementation specific?

No, just stupid. The headers only contain declarations, no definitions
(usually recommended practice at least). The libraries are linked in
later. If there then is no prototype for a function in scope the
compiler generates code as if the function was declared to return int
(regardless of what it actually returns) and takes arguments with the
types of the actual arguments (regardless of what the formal arguments
are).

This is stupid, has always been stupid and will always be stupid. So
this no longer happens in the newest standard (which hardly anyone
implements).
 
Chris Croughton

> I was going to comment on this as well. The OP's code seems to have compiled
> ok without the header, but how? Is this (handling library includes) totally
> implementation specific?

Some implementations have lots of functions in stdlib.h which are also
defined elsewhere (often using a system-specific include file which is
included by both stdlib.h and other headers). At least one version of
GCC, for instance, also pulls in stdint.h within stdlib.h (via one of
the system headers it needs). stdio.h also often pulls in string.h (and
very often pulls in stdlib.h), and lots of things pull in limits.h and
stddef.h (and NULL gets defined all over the place).

You can't guarantee that you get /only/ the functions etc. the standard
says are in a particular header; you should be able to guarantee that
you get /at least/ those functions. In particular, memory functions often
get included in both string.h (many traditional compilers had them
there) and in stdlib.h. (It's odd that the memory allocation functions
and the string conversion functions are in stdlib.h but the copy and
comparison functions are in string.h; some implementors just put both
sets in both headers.)

Chris C
 
dandelion

> Three (!) regulars pointed out the unnecessary cast, but none noticed that the
> size is one byte too short !

Not sure whether you count me as a "regular", but I did notice. I should have
commented on it more clearly, though, instead of merely writing "Try
strlen(str1) + 1". After all, the "target audience" in this case is an
obvious newbie. So the criticism is entirely in order and appreciated.
 
CBFalconer

Chris Croughton said:
.... snip ...

> Some implementations have lots of functions in stdlib.h which are also
> defined elsewhere (often using a system-specific include file which is
> included by both stdlib.h and other headers). At least one version of
> GCC, for instance, also pulls in stdint.h within stdlib.h (via one of
> the system headers it needs). stdio.h also often pulls in string.h (and
> very often pulls in stdlib.h), and lots of things pull in limits.h and
> stddef.h (and NULL gets defined all over the place).

Not if the system is compliant. Then no functions not listed in
the standard as being defined in that header will appear (with the
exception of names in the implementor's namespace, which is why you
don't use such names).
 
Peter Shaggy Haywood

Groovy hepcat (e-mail address removed) was jivin' on 2 Dec 2004 08:23:51
-0800 in comp.lang.c.
Unexpected behaviour of malloc's a cool scene! Dig it!
> printf("Length of str1 %d\n", strlen(str1));
                         ^^     ^^^^^^^^^^^^
In addition to what others have told you, this line is wrong.
strlen() returns a size_t, an implementation defined unsigned integer
type; but you use a %d conversion specifier, which requires a signed
int argument. Instead, either cast the strlen() call to int or, better
still, cast it to unsigned long (to be sure you don't lose precision)
and use the %lu conversion specifier.

printf("Length of str1 %lu\n", (unsigned long)strlen(str1));
> str2 = (char *) malloc(strlen(str1));
> printf("Length of str2 %d\n", strlen(str2));
                         ^^     ^^^^^^^^^^^^
Same here.

> strncpy(str2, str1, strlen(str1));
> printf("Length of str2 after %d\n", strlen(str2));
                               ^^     ^^^^^^^^^^^^
And here.

--

Dig the even newer still, yet more improved, sig!

http://alphalink.com.au/~phaywood/
"Ain't I'm a dog?" - Ronny Self, Ain't I'm a Dog, written by G. Sherry & W. Walker.
I know it's not "technically correct" English; but since when was rock & roll "technically correct"?
 
