memcpy() vs. for() performance


Xenos

But how do you know when the transfer is complete, then? I assume that
even in synchronous mode, using DMA for large transfers can be beneficial.
DMA engines usually generate an interrupt or have a status register or
similar to indicate completion.
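
For what it's worth, a minimal sketch of the polling approach might look
like the code below; the register name, address, and bit layout are
hypothetical, not taken from any real DMA controller:

#include <stdint.h>

/* Hypothetical memory-mapped DMA status register and "done" bit;
   real names, addresses and bit layouts are device-specific. */
#define DMA_STATUS  (*(volatile uint32_t *)0x40001000u)
#define DMA_DONE    (1u << 0)

static void wait_for_dma_completion(void)
{
    /* Busy-wait until the controller sets its "done" bit.  An
       interrupt-driven design would instead set a flag in the ISR
       and let the caller block until it is raised. */
    while ((DMA_STATUS & DMA_DONE) == 0)
        ;
}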
 

Case -

Dan said:
ALWAYS use memcpy(), NEVER use for loops, unless you have empirical
evidence that your memcpy() is very poorly implemented.

A well implemented memcpy() can use many tricks to accelerate its
operation.

Thanks Dan, I've moved over to always using memcpy(). And as
you say in a later post, it's shorter/more elegant too; this is
an important thing (for me) too.

gcc is smart enough to inline memcpy calls for short memory blocks
when optimisations are enabled:

fangorn:~/tmp 273> cat test.c
#include <string.h>

void foo(int *p, int *q)
{
    memcpy(q, p, 2 * sizeof *p);
}
fangorn:~/tmp 274> gcc -O2 -S test.c
fangorn:~/tmp 275> cat test.s
        .file   "test.c"
        .text
        .p2align 4,,15
.globl foo
        .type   foo, @function
foo:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %edx
        movl    12(%ebp), %ecx
        movl    (%edx), %eax
        movl    %eax, (%ecx)
        movl    4(%edx), %eax
        movl    %eax, 4(%ecx)
        popl    %ebp
        ret
        .size   foo, .-foo
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.3.3"

Even if you have no clue about x86 assembly, you can easily see that there
is no memcpy call in the code generated by gcc for this function. One
more reason to prefer memcpy to for loops.

Yes, this clearly makes the point!
 

Barry Schwarz

#include <string.h>

#define SIZE 100
#define USE_MEMCPY

int main(void)
{
    char a[SIZE];
    char b[SIZE];
    int n;

    /* code 'filling' a[] */

#ifdef USE_MEMCPY
    memcpy(b, a, sizeof(a));
#else
    for (n = 0; n < sizeof(a); n++)
    {
        b[n] = a[n];
    }
#endif

    return 0;
}

While the two techniques are equivalent for char, they are not for any
type where sizeof(type) is not 1. You can change the limit check in
the for loop from n<sizeof(a) to n<SIZE to eliminate this restriction.
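
To make the restriction concrete, here is a small sketch (not part of the
original example) using an int array: with the byte count from sizeof(a)
the loop would run off the end of both arrays, so the bound has to be the
element count, while memcpy still takes a size in bytes.

#include <string.h>

#define SIZE 100

void copy_ints(void)
{
    int a[SIZE];
    int b[SIZE];
    size_t n;

    /* code 'filling' a[] */

    /* Wrong for int: sizeof(a) is SIZE * sizeof(int) bytes, so this
       loop would index far past the end of both arrays:
       for (n = 0; n < sizeof(a); n++) b[n] = a[n];             */

    /* Correct: bound the loop by the element count ...         */
    for (n = 0; n < SIZE; n++)
        b[n] = a[n];

    /* ... while memcpy still takes the size in bytes.          */
    memcpy(b, a, sizeof(a));
}
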
Case asked:
Any (general) ideas about when (depending on SIZE) to use
memcpy(), and when to use for()?

The call to memcpy has a certain amount of overhead. The break-even
point is when this overhead balances out the "extra efficiency" that
may be built into memcpy. The only practical way to tell is to run
some tests.
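
As a rough starting point, such a test might look like the sketch below;
the iteration count and the use of clock() are arbitrary choices, and an
optimising compiler may elide copies whose results are never used, so the
destination buffer is read at the end:

#include <stdio.h>
#include <string.h>
#include <time.h>

#define SIZE 100
#define ITERATIONS 1000000L

int main(void)
{
    static char a[SIZE], b[SIZE];
    clock_t start;
    long i;
    int n;

    /* code 'filling' a[] */

    start = clock();
    for (i = 0; i < ITERATIONS; i++)
        memcpy(b, a, sizeof(a));
    printf("memcpy:   %.3f s\n", (double)(clock() - start) / CLOCKS_PER_SEC);

    start = clock();
    for (i = 0; i < ITERATIONS; i++)
        for (n = 0; n < SIZE; n++)
            b[n] = a[n];
    printf("for loop: %.3f s\n", (double)(clock() - start) / CLOCKS_PER_SEC);

    /* Read the result so the copies are less likely to be optimised away. */
    return b[0];
}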



 

Case

Dan said:
ALWAYS use memcpy(), NEVER use for loops, unless you have empirical
evidence that your memcpy() is very poorly implemented.

A well implemented memcpy() can use many tricks to accelerate its
operation.

I did some tests myself, and found out that this is only true
when the block size is fixed/known. Neither GCC nor Sun CC will
'inline/optimize' the memcpy() when the size is a variable.
Unfortunately, at many places in my code, the size is variable.
Although my understanding of this issue has increased, I must admit
this was a flaw in my initial question: an oversimplification.

I'd be interested to hear comments/insights about this variable
case.
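
For reference, a small sketch of the two situations being compared;
whether either call is expanded inline is entirely up to the compiler
and its options:

#include <string.h>

void copy_fixed(int *dst, const int *src)
{
    /* Size known at compile time: gcc -O2 may expand this into a
       few load/store instructions, as in the listing earlier. */
    memcpy(dst, src, 2 * sizeof *src);
}

void copy_variable(int *dst, const int *src, size_t n)
{
    /* Size known only at run time: the compiler typically emits a
       call to the library memcpy instead of open-coding the copy. */
    memcpy(dst, src, n * sizeof *src);
}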

Case
 

Dan Pop

Case said:
I did some tests myself, and found out that this is only true
when the block size is fixed/known. Neither GCC nor Sun CC will
'inline/optimize' the memcpy() when the size is a variable.
Unfortunately, at many places in my code, the size is variable.
Although my understanding of this issue has increased, I must admit
this was a flaw in my initial question: an oversimplification.

I'd be interested to hear comments/insights about this variable
case.

It would be *very* helpful if you didn't mix up things. Inlining is one
thing and providing a highly optimised library version of memcpy is a
completely different one.

When the size is unknown at compile time (or too large), the compiler
cannot or won't inline the memcpy call; it will call the library version.
But the library version can still be much faster than the code generated
by the compiler from a for loop. Especially when dealing with arrays of
characters.
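
Purely as an illustration of the kind of trick an optimised library
memcpy can use (this is not how any particular implementation actually
does it), a copy routine can move a machine word at a time when alignment
allows, instead of one char per iteration:

#include <stddef.h>

/* Simplified word-at-a-time copy.  It assumes the regions do not
   overlap and that both pointers are suitably aligned for unsigned
   long; a real memcpy also handles misalignment, tiny sizes, etc. */
static void *copy_words(void *dst, const void *src, size_t n)
{
    unsigned long *d = dst;
    const unsigned long *s = src;
    unsigned char *dc;
    const unsigned char *sc;

    while (n >= sizeof *d) {        /* bulk of the data, one word at a time */
        *d++ = *s++;
        n -= sizeof *d;
    }

    dc = (unsigned char *)d;        /* remaining tail, byte by byte */
    sc = (const unsigned char *)s;
    while (n--)
        *dc++ = *sc++;

    return dst;
}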

If you want ultimate answers, benchmark the two versions yourself.
Keep in mind that the results cannot be extrapolated to other implementations.

Dan
 

Case

Dan said:
It would be *very* helpful if you didn't mix up things. Inlining is one
thing and providing a highly optimised library version of memcpy is a
completely different one.

I know the difference. What the compiler does looks like (in my eyes)
a form of inlining (the function call is replaced). But at the same
time the code that is inserted is highly optimized for the particular
block size; it's not just inserting a standard piece of memcpy code.
That's why I wrote 'inline/optimize', and quoted the expression to
mark it as not to be taken too literally, because it's a combination.

Dan said:
When the size is unknown at compile time (or too large), the compiler
cannot or won't inline the memcpy call; it will call the library version.

If I had to make a choice between the two, I would call it
optimization. I'm surprised that you seem to prefer the
term inlining. Why?

Dan said:
But the library version can still be much faster than the code generated
by the compiler from a for loop. Especially when dealing with arrays of
characters.

Agreed. And, for simplicity, I'd rather use one way all the time,
instead of context-dependently (either at code time or even at run time)
choosing between a couple of alternatives. Otherwise this will
easily fall within the famous 97%.

Dan said:
If you want ultimate answers, benchmark the two versions yourself.
Keep in mind that the results cannot be extrapolated to other implementations.

Yep, one other good reason to always use memcpy(). However, how did
the saying go again ... "Never say always!" :)

Thanks,

Case
 

Dan Pop

Case said:
If I had to make a choice between the two, I would call it
optimization. I'm surprised that you seem to prefer the
term inlining. Why?

Because this is the specific name of that particular optimisation.
What is so difficult to understand?

As I said, inlining is NOT the only way an implementation can optimise
a memcpy call. There are plenty of optimisations that can be applied
to the library version of memcpy (especially if it's not written in C).

Case said:
Yep, one other good reason to always use memcpy(). However, how did
the saying go again ... "Never say always!" :)

Another failed attempt at humour...

Dan
 

Keith Thompson

Would that be vitreous humor or aqueous humor :)

checking for [OT] tag ... ok

Yes, the eye certainly lens itself to puns. But enough of this
ocularity. If the jokes get any cornea, I'll give you 40 lashes.
 

Case -

Dan said:
Only when a large enough number of beholders perceive it as such.

No, on the contrary! Needing only the personal (i.e.,
individual) observation is at the heart of the original
'beholder-saying'.

Case
 

Dan Pop

Case said:
No, on the contrary! Needing only the personal (i.e.,
individual) observation is at the heart of the original
'beholder-saying'.

Which is why it doesn't apply to humour ;-)

Dan
 
