Why is prefix increment faster than postfix increment?

Christian Bau

Branimir Maksimovic said:
In case that sizeof(x) == 1 , I agree.


Well, I don't need to, cause I don't use memcpy to assign variables.

In other words, you are a complete bullshitter.
 
peter koch

Christian Bau wrote:
[snip]
Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));
[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour and is just utterly contrived and useless. You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".
In short, you have described yourself and your skills wonderfully in
two short posts.

/Peter
 
Old Wolf

Greg said:
Not necessarily. If one can show that Y performs every operation that
X performs, and then has to perform additional operations outside of
that set and that require a measurable amount of time to complete,
then one would have successfully proven that X is faster than Y.

Not true, unless the additional operations are independent of the
X operations.

For example, if you apply the same logic to a file system, then
appending data to a file should increase the amount of space
required to store a file. But for many filesystems that is not
true.

Similar possibilities apply to the CPU case. Maybe the extra
operation fits within some timing interval that had to happen
anyway. Maybe the extra instruction means the whole operation
can be done with different assembly instructions that work out
faster. Maybe the CPU's pipelining is better in one case than
the other. Etc.
 
Greg

Old Wolf said:
Not true, unless the additional operations are independent of the
X operations.

For example, if you apply the same logic to a file system, then
appending data to a file should increase the amount of space
required to store a file. But for many filesystems that is not
true.

Similar possibilities apply to the CPU case. Maybe the extra
operation fits within some timing interval that had to happen
anyway. Maybe the extra instruction means the whole operation
can be done with different assembly instructions that work out
faster. Maybe the CPU's pipelining is better in one case than
the other. Etc.

If the additional operations follow the ones in common, then it would
be difficult to see how executing those instructions would be able to
speed up the previous set of instructions that have already executed.

But even if the additional instructions came before or were
interspersed with the ones in common, the only way that the additional
instructions would not add time to the procedure would be if the
program could execute two instructions in less time than it could
execute one of those instructions. [Note that the one instruction must
also be one of the two executed in the comparison]

On a macro scale, because similar operations can be composed of
different sub-operations, adding an operation may make an existing one
faster. But as the granularity of the operations becomes finer, at a
certain point every operation is independent of another and each
executes in constant time.

Greg
 
William Hughes

peter said:
Christian Bau wrote:
[snip]
Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));
[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) > 1)?

and is just utterly contrived and useless.

Contrived yes. Most simple examples of complex behaviour
are contrived. Useless no. Indeed this code is not meant
to be used but the *example* of a "bigger operation"
(copying x bytes rather than x-1 bytes) that might reasonably
be expected to execute faster is useful indeed.

A related question. Is it ever better to use an int
variable, even when a char is big enough?

[For a less useful example consider a perverse
implementation (e.g. the DS2K) which introduces a
delay of say 20 minutes, seemingly at random. If the "smaller"
operation incurs the delay, but the "bigger" does not, then
the larger operation will be faster. While this
is correct, such an implementation cannot be considered
reasonable.]
You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".


The poster claimed undefined behaviour, then when challenged
claimed ignorance (and gave a stupid excuse for this
ignorance). The term "complete bullshitter" seems an accurate
description.

- William Hughes
 
Jordan Abel

peter said:
Christian Bau wrote:
[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));
[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)

For example, you could end up with a trap representation in x, say a
signalling NaN of some kind. In any case, you're not guaranteed
anything useful about the value you might get.
 
Branimir Maksimovic

William said:
peter said:
Christian Bau wrote:
[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));
[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)
and is just utterly contrived and useless.

Contrived yes. Most simple examples of complex behaviour
are contrived. Useless no. Indeed this code is not meant
to be used but the *example* of a "bigger operation"
(copying x bytes rather than x-1 bytes) that might reasonably
be expected to execute faster is useful indeed.

A related question. Is it ever better to use an int
variable, even when a char is big enough?

[For a less useful example consider a perverse
implementation (e.g. the DS2K) which introduces a
delay of say 20 minutes, seemingly at random. If the "smaller"
operation incurs the delay, but the "bigger" does not, then
the larger operation will be faster. While this
is correct, such an implementation cannot be considered
reasonable.]
You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".


The poster claimed undefined behaviour, then when challenged
claimed ignorance (and gave a stupid excuse for this
ignorance). The term "complete bullshitter" seems an accurate
description.

No, I didn't claim undefined behavior.
I claimed that the first case would probably produce a hardware
exception and the second one would probably work.
As memcpy is defined to be a copy operation of n characters from
memory location to memory location, behavior is undefined only
when "to" and "from" overlap.
The original message claimed that the compiler can be smart enough
to recognize the use case and, according to the situation, apply
different semantics than those specified by the code.
This leads him to the conclusion that the produced code will be faster
when the compiler applies assignment semantics rather than memcpy
semantics.
This is just a wrong example, but if we observe this:
double x[2];y=0.;
memcpy((char*)x+1,&y,sizeof(y));
double t = *(double*)((char*)x+1); /* depends on
hardware tolerance to alignment */
memcpy(x,&y,sizeof(y));
t = *x;

It is obvious that the second case will always be faster than, or at
least as fast as, the first case, even if memcpy has to copy the same
number of bytes and use RAM instead, or sizeof (x) == 1.

Greetings, Bane.
 
Branimir Maksimovic

Branimir said:
double x[2];y=0.;

double x[2],y = 0.;
memcpy((char*)x+1,&y,sizeof(y));
double t = *(double*)((char*)x+1); /* depends on
hardware tolerance to alignment */
memcpy(x,&y,sizeof(y));
t = *x;

It is obvious that second case will be always faster or at least
equal then first case, even if memcpy have to copy same number of bytes
and use ram instead or sizeof (x) ==1 .

sizeof (y) == 1;
 
Keith Thompson

Jordan Abel said:
peter said:
Christian Bau wrote:
[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));

[snip explanation that second memcpy might be faster]

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)

for example you could end up with a trap representation in x. say, a
signalling nan of some kind. and in any case you're not guaranteed
anything useful about the value you might get

Even if the memcpy() stores a trap representation in x, there's no
undefined behavior until you try to read x as a double. The quoted
code doesn't do that.
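
The point about trap representations can be illustrated with a minimal sketch. The function name below is hypothetical and not from the thread. It performs the partial copy and then inspects the result only through unsigned char, so the destination is never read as a double, which is the condition described above for avoiding undefined behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy all but the last byte of *src into *dst,
 * then count how many leading bytes of *dst match *src.
 * Both objects are examined only through unsigned char, so *dst is
 * never read as a double. dst and src must be distinct objects. */
size_t matching_prefix_after_partial_copy(double *dst, const double *src)
{
    const unsigned char *d = (const unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t n = sizeof *src - 1;
    size_t i;

    memcpy(dst, src, n);
    for (i = 0; i < n && d[i] == s[i]; i++)
        ;
    return i;
}
```

Called with two distinct double objects, the function should return sizeof (double) - 1, since memcpy is required to copy exactly that many bytes.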

BTW, please keep your text down to about 72 columns so it doesn't
overflow an 80-column screen when quoted. My newsreader lets me
reformat quoted text easily, but others might not.
 
Mark McIntyre

No I didn't claim undefined behavior.
I claimed that first case would probably produce hardware exception
and second one would probably work.

Given that neither will produce any such thing, and given your very
aggressive attitude in subsequent postings, "bullshitter" seems
entirely reasonable.
The original message claimed

(stuff that's completely irrelevant to your claim that the code quoted
will cause an exception).
 
William Hughes

Jordan said:
peter said:
Christian Bau wrote:

[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));

[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)

for example you could end up with a trap representation in x. say, a signalling
nan of some kind.

And as this could only cause a problem if x were subsequently read
as a double, there is no undefined behaviour above.
and in any case you're not guaranteed anything useful about
the value you might get

You are guaranteed that the first sizeof (x) - 1 bytes starting at &x
are the same as those starting at &y. This may be useful
(e.g. if you are treating x and y as arrays of characters).

True, you are not guaranteed that x is meaningful as a double,
but so what? It might be, but this is beside the point:
the original example was not meant as an example of useful code.

- William Hughes
 
William Hughes

Branimir said:
William said:
peter said:
Christian Bau wrote:

[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));

[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)
and is just utterly contrived and useless.

Contrived yes. Most simple examples of complex behaviour
are contrived. Useless no. Indeed this code is not meant
to be used but the *example* of a "bigger operation"
(copying x bytes rather than x-1 bytes) that might reasonably
be expected to execute faster is useful indeed.

A related question. Is it ever better to use an int
variable, even when a char is big enough?

[For a less useful example consider a perverse
implementation (e.g. the DS2K) which introduces a
delay of say 20 minutes, seemingly at random. If the "smaller"
operation incurs the delay, but the "bigger" does not, then
the larger operation will be faster. While this
is correct, such an implementation cannot be considered
reasonable.]
You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".


The poster claimed undefined behaviour, then when challenged
claimed ignorance (and gave a stupid excuse for this
ignorance). The term "complete bullshitter" seems an accurate
description.

No I didn't claim undefined behavior.
I claimed that first case would probably produce hardware exception

And this differs from undefined behaviour how?
(are you claiming implementation defined behaviour?)

Anyway, you have yet to even attempt to justify your
claim that the first case would probably produce
a hardware exception.

- William Hughes
 
Branimir Maksimovic

William said:
Branimir said:
William said:
peter koch wrote:
Christian Bau wrote:

[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));

[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)

and is just utterly contrived and useless.

Contrived yes. Most simple examples of complex behaviour
are contrived. Useless no. Indeed this code is not meant
to be used but the *example* of a "bigger operation"
(copying x bytes rather than x-1 bytes) that might reasonably
be expected to execute faster is useful indeed.

A related question. Is it ever better to use an int
variable, even when a char is big enough?

[For a less useful example consider a perverse
implementation (e.g. the DS2K) which introduces a
delay of say 20 minutes, seemingly at random. If the "smaller"
operation incurs the delay, but the "bigger" does not, then
the larger operation will be faster. While this
is correct, such an implementation cannot be considered
reasonable.]

You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".


The poster claimed undefined behaviour, then when challenged
claimed ignorance (and gave a stupid excuse for this
ignorance). The term "complete bullshitter" seems an accurate
description.

No I didn't claim undefined behavior.
I claimed that first case would probably produce hardware exception

And this differs from undefined behaviour how?
(are you claiming implementation defined behaviour?)

If implementation is allowed to use floating point registers
for memcpy, then yes implementation defined behavior.
Anyway, you have yet to even attempt to justify your
claim that the first case would probably produce
a hardware exception.

In case that implementation use FPU registers for
memcpy of floating point variables that would be
normal. It is irrelevant how many bytes are copied.

Question is: Are such implementations conformant?
eg:
double x, y; // produces FPU exception if x, y get a trap value?
memcpy(&x, &y, sizeof(x)); // produces exception if FPU registers are used
                           // and y has a trap representation value,
                           // which is non-conformant as I understand
                           // memcpy semantics

Conclusion: if FPU registers are allowed to be used
for memcpy, then it is normal to allow hardware exceptions
during memcpy.
The compiler wouldn't care whether memcpy produces an exception or not
in that case.

Greetings, Bane.
 
Christian Bau

peter koch said:
Christian Bau wrote:
[snip]
Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));
[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour and is just utterly contrived and useless. You
also have my sympathy when you call a poster who suggests using
assignment to assign for a "complete bullshitter".
In short, you have described yourself and your skills wonderfully in
two short posts.

Seems our IQs differ by about 30 points. Let's just disagree about the
direction.
 
Michael Mair

Branimir said:
In case that sizeof(x) == 1 , I agree.

Which is nothing more than you did before.

Well, I don't need to, cause I don't use memcpy to assign variables.

Beside the point.

--- C99 ---
7.21.2.1 The memcpy function
Synopsis
1 #include <string.h>
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

Description
2 The memcpy function copies n characters from the object pointed to by
s2 into the object pointed to by s1. If copying takes place between
objects that overlap, the behavior is undefined.

Returns
3 The memcpy function returns the value of s1.
------------

The only way to safely and portably access the representation of an
object is bytewise (unsigned char). memcpy() does exactly that.

The first memcpy() operation can be replaced by:

size_t i;
unsigned char *p1 = (unsigned char *) &y;
unsigned char *p2 = (unsigned char *) &x;

for (i = 0; i < (sizeof x - 1); i++)
    *(p2++) = *(p1++);

A conforming implementation has to do this right; you are thinking
of actual hardware and concluding that it cannot work.
Still, the "as if" rule has to hold, the operation has to work. There
must not be any repercussions as long as x is not accessed afterwards.


Cheers
Michael
 
Christian Bau

Jordan Abel said:
peter said:
Christian Bau wrote:

[snip]

Not necessarily. If one can show that Y performs every operation that X
performs, and then has to perform additional operations outside of that
set and that require a measurable amount of time to complete, then one
would have successfully proven that X is faster than Y.

One would have proven no such thing.

Consider this:

double x, y;
memcpy (&x, &y, sizeof (x) - 1);
memcpy (&x, &y, sizeof (x));

[snip explanation that second memcpy might be faster]

Hi Christian

Just a marvellous example you gave us - code that in C++ (and C) causes
undefined behaviour

What is the undefined behaviour (assume sizeof (x) >1)

for example you could end up with a trap representation in x. say, a
signalling nan of some kind. and in any case you're not guaranteed
anything useful about the value you might get

You can end up with a trap representation in x, but that doesn't in
itself invoke undefined behavior. There would be undefined behavior if
you would later on access x as a double value, but that wasn't done. You
could printf () the individual bytes from x. You could memset () seven
bytes in y to zeroes, then copy those seven bytes back from x and y
would be restored to its original value. When writing a double to a
binary stream or file, it is quite likely that a memcpy similar to this
one will happen: Assume your standard library uses a 512 byte buffer to
write to binary streams, all but sizeof (double) - 1 bytes are used up
in a buffer, and you write another double: sizeof (double) - 1 bytes
will be copied to the buffer, the buffer will be flushed, and another
byte will be copied.
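
The round trip described above (save all but one byte of y in x, wipe those bytes in y, copy them back) can be sketched as follows. This is a minimal illustration under assumptions: the function name is hypothetical, and sizeof (double) - 1 stands in for the "seven bytes" of an implementation where sizeof (double) == 8. Neither x nor the intermediate state of y is ever read as a double.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if y holds its original bytes again
 * after the partial save, wipe and restore. */
int round_trip_restores(double original)
{
    double x = 0.0, y = original;
    size_t n = sizeof y - 1;

    memcpy(&x, &y, n);   /* save all but the last byte of y in x */
    memset(&y, 0, n);    /* wipe those bytes in y; last byte untouched */
    memcpy(&y, &x, n);   /* copy the saved bytes back */

    /* Compare representations bytewise rather than as double values. */
    return memcmp(&y, &original, sizeof y) == 0;
}
```

The final comparison uses memcmp so that the check itself stays at the level of object representations.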

This code was not supposed to do something particularly useful - it was
supposed to give a clear example where "doing more work" is faster than
"doing less work". Which is exactly what it did.
 
Christian Bau

Branimir Maksimovic said:
No I didn't claim undefined behavior.

Absolutely correct, you never claimed that.
I claimed that first case would probably produce hardware exception
and second one would probably work.

None of the cases will produce any hardware exception. Both are
completely legitimate uses of memcpy. The first one is a bit unusual,
the second one is a bit clumsy as the effect could have been achieved
much easier, but both are absolutely legitimate.
As memcpy is defined to be copy operation of n characters from
memory location to memory location, behavior is undefined only
when "to" and "from" overlap.
The original message claimed that compiler can be smart enough
to recognize use case and according to situation, apply different
semantics then those specified by code.

The compiler wouldn't "apply different semantics", the compiler would
detect that the effect of memcpy can be achieved much quicker and
therefore generate much better code.
This leads him to conclusion that produced code will be faster when
compiler applies assignment semantics then memcpy semantics.
This is just wrong example, but if we observe this:
double x[2];y=0.;
memcpy((char*)x+1,&y,sizeof(y));
double t = *(double*)((char*)x+1); /* depends on
hardware tolerance to alignment */

I would recommend writing ((char *) x) + 1 instead of (char *) x + 1,
so that (1) everyone knows what the expression means without having to
look up the precedence of cast operators, and (2) everyone knows that
what you wrote is what you meant.
memcpy(x,&y,sizeof(y));
t = *x;

It is obvious that second case will be always faster or at least
equal then first case, even if memcpy have to copy same number of bytes
and use ram instead or sizeof (x) ==1 .

In this case, the first assignment to t will have undefined behavior.
There are implementations where it will crash, there are others where it
will set t to the same value as y, just very slowly, but it is
undefined behavior.

Since the compiler can easily detect that this is undefined behavior, it
is free to do whatever it likes - for example, not doing the memcpy and
the initialisation of t at all. Which will make the first case run
_faster_ than the second case.

x is of type double. In common implementations, sizeof (x) == 8. sizeof
(double) == 1 would be extremely unusual.
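
For comparison, here is a minimal sketch of how the same data movement can be done without dereferencing a misaligned double pointer. The function name is hypothetical. Both the store and the load go through memcpy, which works on bytes and therefore has no alignment requirement.

```c
#include <string.h>

/* Hypothetical helper: store a double at an odd offset inside a byte
 * buffer and read it back with memcpy. No misaligned double pointer is
 * ever dereferenced. */
double unaligned_round_trip(double value)
{
    unsigned char buf[2 * sizeof(double)] = {0};
    double out;

    memcpy(buf + 1, &value, sizeof value);  /* write at offset 1 */
    memcpy(&out, buf + 1, sizeof out);      /* read back bytewise */
    return out;
}
```

Compilers for hardware that tolerates unaligned loads commonly turn such a memcpy into a single load, so this form need not cost anything, although that is an optimisation and not a guarantee.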
 
Branimir Maksimovic

Michael said:
Which is nothing more than you did before.



Beside the point.

--- C99 ---
7.21.2.1 The memcpy function
Synopsis
1 #include <string.h>
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

Description
2 The memcpy function copies n characters from the object pointed to by
s2 into the object pointed to by s1. If copying takes place between
objects that overlap, the behavior is undefined.

Returns
3 The memcpy function returns the value of s1.
------------

The only way to safely and portably access the representation of an
object is bytewise (unsigned char). memcpy() does exactly that.

The first memcpy() operation can be replaced
size_t i;
unsigned char *p1= (unsigned char*) &y;
unsigned char *p2= (unsigned char*) &x;

for (i = 0; i < (sizeof x - 1); i++)
*(p2++) = *(p1++);

A conforming implementation has to do this right; you are thinking
of actual hardware and concluding that it cannot work.
Still, the "as if" rule has to hold, the operation has to work. There
must not be any repercussions as long as x is not accessed afterwards.

Thank you for proving my point. memcpy can't have x=y semantics
in any way. It can only have same final effect, but paths are
different as x=y is allowed to produce hardware exception
but memcpy(&x,&y,sizeof(x)); is not


Greetings, Bane.
 
Christian Bau

Branimir Maksimovic said:
If implementation is allowed to use floating point registers
for memcpy, then yes implementation defined behavior.

memcpy has some defined meaning, defined by the C Standard (and the C++
Standard uses the same definition). The implementation is free to do
whatever it likes, as long as it guarantees that the results will be the
same as required.

If I have variables

double x, y;

and a call

memcpy (&x, &y, sizeof (x));

then _in this particular case_ the effect of the memcpy case happens to
be exactly the same as the effect of

(void) (x = y)

(not on every possible implementation, but in many implementations. The
implementation would have to know for example that assigning NaN's or
negative zeroes or denormalised numbers etc. doesn't change the bit
pattern, and doesn't cause any side effects like hardware exceptions).

So if the implementation knows all that, then in this particular case it
can use floating point registers for copying these bytes instead of
calling memcpy.
Question is: Are such implementations conformant?
eg:
double x, y; // produces FPU exception if x, y get a trap value?
memcpy(&x, &y, sizeof(x)); // produces exception if FPU registers are used
                           // and y has a trap representation value,
                           // which is non-conformant as I understand
                           // memcpy semantics

Conclusion: if FPU registers are allowed to be used
for memcpy then it is normal to allow hardware exceptions
during memcpy.

No, this is exactly the wrong way round: If the assignment of trap
values would raise hardware exceptions, then the compiler _wouldn't_ be
allowed to use floating-point registers for memcpy. memcpy is _not_
allowed to raise an exception in this situation.

The compiler is allowed to do _anything_ as long as you can't detect the
difference by observing what the program does. If memcpy would raise a
hardware exception, then you could observe that, so memcpy isn't allowed
to do that.
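
A small sketch of the observable-equivalence argument, with a hypothetical function name: it performs the copy once by assignment and once by memcpy, then compares the two results bytewise. It assumes an implementation where assigning an ordinary double value preserves its bit pattern, which is the condition stated above. It is not claimed to hold for every value on every implementation.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if x = y and memcpy(&x, &y, sizeof x)
 * leave the same bytes in the destination for the given value. */
int assignment_matches_memcpy(double y)
{
    double a, b;

    a = y;                      /* copy by assignment */
    memcpy(&b, &y, sizeof b);   /* copy by memcpy */

    return memcmp(&a, &b, sizeof a) == 0;
}
```

When the two are indistinguishable in this sense, a compiler is free to emit the same instructions for both. When they are not (for example if assignment would alter or trap on some bit patterns), it has to keep memcpy bytewise.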
 
Branimir Maksimovic

Christian said:
This is just wrong example, but if we observe this:
double x[2];y=0.;
memcpy((char*)x+1,&y,sizeof(y));
double t = *(double*)((char*)x+1); /* depends on
hardware tolerance to alignment */

I would recommend to write ((char *) x) + 1 instead of (char *) x + 1,
so that (1) everyone knows what the expression means without having to
look up the precedence of cast operators, and (2) everyone knows that
what you wrote is what you meant.
memcpy(x,&y,sizeof(y));
t = *x;

It is obvious that second case will be always faster or at least
equal then first case, even if memcpy have to copy same number of bytes
and use ram instead or sizeof (x) ==1 .

In this case, the first assignment to t will have undefined behavior.
There are implementations where it will crash, there are others where it
will be set t to the same value as y, just very slowly, but it is
undefined behavior.

Only on implementations where the alignment requirement for a type
is not met.
This is a basic thing for implementing memory allocators.
memcpy works in all cases because it is defined that char is
aligned on any address.
If that weren't the case, then no memory allocator could be written
in C or C++ without causing undefined behavior.
Remember that objects are defined as a sequence of bytes.
So when you convert an object to void*, it is plain raw memory
of object-size bytes. You can place there anything which is smaller
or equal and meets the right alignment.

Greetings, Bane.
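
The alignment point can be sketched as follows. This is an illustration under assumptions and not code from the thread: the helper names are hypothetical, it uses C11 `_Alignof`, and it assumes (as on mainstream platforms) that the integer value of a pointer converted to `uintptr_t` reflects its alignment. Memory from malloc has no declared type, so a double may be stored in it once the address is suitably aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round p up to the next multiple of align. */
static unsigned char *align_up(unsigned char *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t pad = (align - addr % align) % align;
    return p + pad;
}

/* Returns 1 if a double could be stored in and read from raw memory,
 * 0 if allocation failed or the value did not survive. */
int store_double_in_raw_memory(double value)
{
    unsigned char *raw = malloc(4 * sizeof(double));
    unsigned char *slot;
    double *d;
    int ok;

    if (raw == NULL)
        return 0;

    /* Skip one byte on purpose, then realign before storing a double. */
    slot = align_up(raw + 1, _Alignof(double));
    d = (double *)slot;
    *d = value;
    ok = (*d == value);

    free(raw);
    return ok;
}
```

Without the align_up step, the cast at raw + 1 would be the misaligned access discussed earlier in the thread.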
 
