gcc inline memcpy

absurd

Hello,

I have the following simple code.

#include <string.h>

int main()
{
char dst[50];
char src[50];
memcpy(dst, src, 50);
return dst[1];
}

When I use gcc to compile it without any optimization, I can see a function call to memcpy in the executable (with objdump). With optimization, the memcpy seems to be inlined.

I checked string.h. It has only the declaration of memcpy, not the implementation. The implementation is in glibc. My understanding is that in order to inline, gcc has to see the function implementation when it compiles. How does gcc inline memcpy here? Does it have special treatment for the functions in the standard library?

Thanks.
 
gwowen

How does the gcc inline memcpy here ?
Does it have special treatment to the functions in the standard library ?

Yes, exactly this, at least for some functions from the standard
library. If the header is included, GCC can (and frequently does)
replace certain functions (memcpy, mempcpy, memmove, memset, strcpy,
stpcpy, strncpy, strcat and strncat among them) with optimised builtin
versions. This behaviour can be prevented with -fno-builtin (or, for
more fine-grained control, '-fno-builtin-memcpy').
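
To see the effect described above, a minimal sketch along these lines can be compiled twice and the generated assembly compared. The function name copy_fixed and the file name example.c are made up for illustration; only memcpy and the -fno-builtin-memcpy flag come from the discussion. Whether a call to memcpy is actually emitted depends on the GCC version, the target and the optimisation level.

```c
#include <string.h>

/* Hypothetical helper (not from the thread): copies a fixed 50-byte
 * block. Because the size is a compile-time constant, an optimising
 * GCC build may expand the copy inline instead of calling memcpy.
 * The caller must pass a pointer to at least 50 readable bytes. */
char copy_fixed(const char *src)
{
    char dst[50];
    memcpy(dst, src, sizeof dst);
    return dst[1];
}

/*
 * To compare the generated code (output varies by version and target):
 *   gcc -O2 -S example.c                       builtin expansion allowed
 *   gcc -O2 -fno-builtin-memcpy -S example.c   builtin expansion disabled
 */
```

Searching the two assembly files for "memcpy" is a quick way to check whether the call survived.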
 
Victor Bazarov

I have the following simple code.

You know, your code actually (a) is C and (b) has undefined behavior
since you didn't initialize your memory.
#include<string.h>

int main()
{
char dst[50];
char src[50];
memcpy(dst, src, 50);
return dst[1];
}

When I use gcc to compile it without any optimization, I can see a
function call to memcpy is in the executable (with objdump). With
optimization, the memcpy seems to be inlined.

I checked string.h. It has only the declaration of memcpy, not
implementation. The implementation is in glibc. My understanding is
in order to inline, gcc has to see the function implementation when
it compiles. How does the gcc inline memcpy here ? Does it have
special treatment to the functions in the standard library ?

Probably. For all we know, when it sees 'memcpy' function and the
optimizations are turned on, it generates special code to replace the
function call. It's allowed to do that. What "inline" means is not
explicitly (and fully) specified by the Standard.

V
 
Paul N

I have the following simple code.
#include<string.h>

int main()
{
    char dst[50];
    char src[50];
    memcpy(dst, src, 50);
    return dst[1];
}

You know, your code actually (a) is C

It looks like valid C++ code to me. I don't think the fact that it is
also valid code in another language really justifies not discussing it
here.
and (b) has undefined behavior
since you didn't initialize your memory.

Are you sure about this? I was under the impression that, if a piece
of memory could be read at all, it could be read safely by treating it
as an array of unsigned char, and while I'm less sure, I thought it
could also be read as an array of char as well. Is this not so?
 
Andrew Cooper

I have the following simple code.
#include<string.h>

int main()
{
char dst[50];
char src[50];
memcpy(dst, src, 50);
return dst[1];
}

You know, your code actually (a) is C

It looks like valid C++ code to me. I don't think the fact that it is
also valid code in another language really justifies not discussing it
here.
and (b) has undefined behavior
since you didn't initialize your memory.

Are you sure about this? I was under the impression that, if a piece
of memory could be read at all, it could be read safely by treating it
as an array of unsigned char, and while I'm less sure, I thought it
could also be read as an array of char as well. Is this not so?

As far as this piece of code goes, the result is perfectly well defined.
It just has an unpredictable result.

dst and src are both 50-byte char arrays on the stack, so by the time
you get to the call to memcpy, your stack is known good memory. There is
just no guarantee made about the contents of the bytes there.

Chances are that they will be 0'd, but no guarantee.

~Andrew
 
Joshua Maurice

On 7/12/2012 11:28 AM, absurd wrote:
#include <string.h>

int main()
{
char dst[50];
char src[50];
memcpy(dst, src, 50);
return dst[1];
} [...]
and (b) has undefined behavior
since you didn't initialize your memory.

Are you sure about this? I was under the impression that, if a piece
of memory could be read at all, it could be read safely by treating it
as an array of unsigned char, and while I'm less sure, I thought it
could also be read as an array of char as well. Is this not so?

As far as this piece of code goes, the result is perfectly well defined.
It just has an unpredictable result.

dst and src are both 50-byte char arrays on the stack, so by the time
you get to the call to memcpy, your stack is known good memory. There is
just no guarantee made about the contents of the bytes there.

Yes and no. While I hate disagreeing with Victor Bazarov, as he's usually right, I believe he's wrong here. As best as I can determine, Victor is referencing the rules:
1- An uninitialized and unwritten auto (stack) object has an indeterminate value.
2- An indeterminate value may be a trap representation.
3- Reading a trap representation is undefined behavior.

However, I think Victor is wrong because he forgot a couple exceptions to that:
4- char and unsigned char do not have trap representations.
5- C++03, 3.9 Types / 2, among other sections of the standard, seems to strongly state that memcpy similarly ignores trap representations.

Still, I'm not sure if Andrew or other readers of the thread have a proper appreciation for how reading indeterminate values can blow up (e.g. undefined behavior) so I appreciate Victor's comment in part.
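
For readers who would rather avoid the question altogether, one option is to give the source buffer a determinate value before copying. The sketch below is a variation on the original program, not something posted in the thread, and the function name copy_initialized is hypothetical.

```c
#include <string.h>

/* Hypothetical variant of the original program: src is zero-initialised,
 * so every byte that is copied and later read has a determinate value. */
int copy_initialized(void)
{
    char dst[50];
    char src[50] = {0};   /* all 50 elements are set to zero */

    src[1] = 'A';
    memcpy(dst, src, sizeof dst);
    return dst[1];        /* the value stored in src[1] above */
}
```

With the buffer initialised, the trap-representation discussion no longer applies to this program, whichever reading of the standard is correct.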

PS: first post with the new Google Groups UI. Let's see how bad it garbles this.
 
Nobody

I checked string.h. It has only the declaration of memcpy, not
implementation. The implementation is in glibc. My understanding is in
order to inline, gcc has to see the function implementation when it
compiles. How does the gcc inline memcpy here ? Does it have special
treatment to the functions in the standard library ?

gcc doesn't hard-code such optimisations. The glibc headers conditionally
define macros or inline functions which use the __builtin_* functions
which are built-in to gcc.

E.g. <string.h> may include <bits/string2.h> and/or <bits/string3.h>,
which may use the __builtin_* functions.

Here, if I compile your test program with "-O2 -E", the preprocessed
output includes an inline declaration of memcpy() from bits/string3.h
which is just a wrapper around __builtin___memcpy_chk().

[The *_chk versions pass an extra parameter, the result of applying
__builtin_object_size() to the destination parameter. This allows buffer
overruns to be detected in the case where the destination is an array
whose size is known at compile time.]
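
A rough sketch of what such a wrapper boils down to is shown below. It calls the two builtins directly rather than reproducing the glibc header, and the function name checked_copy_demo is hypothetical. It assumes GCC or a compatible compiler such as Clang.

```c
#include <stddef.h>

/* Hypothetical demo, assuming GCC or a compatible compiler. */
int checked_copy_demo(void)
{
    char dst[50];
    char src[50] = {0};
    src[1] = 'B';

    /* __builtin_object_size(dst, 0) is the size of the destination as
     * far as the compiler can tell, or (size_t)-1 if it cannot tell,
     * in which case the check below can never fire.
     *
     * __builtin___memcpy_chk behaves like memcpy but takes that size
     * as a fourth argument. If the copy length exceeded the known
     * destination size, the program would be aborted at run time
     * instead of overflowing dst. */
    __builtin___memcpy_chk(dst, src, sizeof src,
                           __builtin_object_size(dst, 0));

    return dst[1];
}
```

Here the copy length matches the destination size, so the copy behaves like a plain memcpy.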
 
Jorgen Grahn

On 7/12/2012 11:28 AM, absurd wrote:
#include <string.h>

int main()
{
char dst[50];
char src[50];
memcpy(dst, src, 50);
return dst[1];
} [...]
and (b) has undefined behavior
since you didn't initialize your memory.
^^^^^
....

PS: first post with the new Google Groups UI. Let's see how bad it
garbles this.

They have a new one *again*? Well, I see nothing obviously wrong
except for two things:

- that weird escape sequence above, not present in the parent article
(which I suspect had a plain ASCII ')

- my newsreader still doesn't like lines longer than ~76 characters,
making me less likely to respond to such postings (here I had to
manually reformat just one paragraph)

/Jorgen
 
