removing a loop cause it to go at half the speed?

tom fredriksen

Sorry, for some reason I was thinking my results were in minutes. But
it makes sense now :)
Gcc v3.41 yes, but DJGPP uses its own libc. I keep forgetting that DJGPP
has its own libc which is different from GNU libc. That could _easily_ be
a factor. Also, CPU speeds increase by roughly a factor of 2 per generation
(2), so 1/4 of 44 ~ 11.

(2 runs, w/loop)
gcc -O2
Elapsed time (ms): 42790.000000
Elapsed time (ms): 43450.000000

(2 runs, no loop)
gcc -O2
Elapsed time (ms): 44160.000000
Elapsed time (ms): 44050.000000

(2 runs, w/loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 44600.000000
Elapsed time (ms): 43340.000000

(3 runs, no loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 39600.000000
Elapsed time (ms): 42840.000000
Elapsed time (ms): 40970.000000

After some head banging I realise that your numbers are comparatively
correct.

I find the reason to be twofold: 1) CPU, 2) compiler/system. The Athlon is
quite different from, and newer than, a K6; that should explain some of it.
The other reason is that, because of the CPU and system, the compiler
optimises differently, leading the Athlon/Linux test to behave differently
than on a K6/Win.

If you replace the rand expression with a guaranteed integer
expression, the numbers for the two tests are equal. So there is
something in the rand expression causing it to behave radically
differently. I looked at the generated assembler and I didn't find it was
that different, but still the execution was different... interesting.

Another thing I just noticed is that if I write the program to contain
both versions at the same time, there is no difference; both are at 7
seconds. But if I remove the float version it's back up to 1 second. I
am going mad now...

There is definitely something going on with the rand statement causing
it to affect the entire program.

data[c]=1.0+(unsigned int) (2000000000.0*rand()/(RAND_MAX+1.0));

The complete code is at the bottom; I fixed it to be like yours.
Can anybody test this code on an equivalent system and on one that is
different but newer than a K6? My system is Linux 2.6.3 with gcc 3.3.2.

It might be time to try a linux NG to see if someone with a similar system
to yours is getting the same issue...

Linux NG? never heard of it.. do explain...

/tom

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>


void test_orig()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    struct timeval start_time;
    struct timeval end_time;
    int c,d;
    double t1, t2;

    for(c=0; c<count; c++) {
        data[c]=1.0+(unsigned int) (2000000000.0*rand()/(RAND_MAX+1.0));
    }

    gettimeofday(&start_time, NULL);

    for(d=0; d<50000; d++) {
        for(c=0; c<count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1=(start_time.tv_sec*1000)+(start_time.tv_usec/1000.0);
    t2=(end_time.tv_sec*1000)+(end_time.tv_usec/1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2-t1);
    printf("Total: %u\n", total);
}


void test_int()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    unsigned int data2[count];
    struct timeval start_time;
    struct timeval end_time;
    int c,d;
    double t1, t2;

    for(c=0; c<count; c++) {
        data2[c] = data[c];
    }


    gettimeofday(&start_time, NULL);

    for(d=0; d<50000; d++) {
        for(c=0; c<count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1=(start_time.tv_sec*1000)+(start_time.tv_usec/1000.0);
    t2=(end_time.tv_sec*1000)+(end_time.tv_usec/1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2-t1);
    printf("Total: %u\n", total);

    for(c=0; c<100; c++) {
        printf("data2: %u ", data2[c]);
    }
    printf("\n");
}



int main(int argc, char *argv[])
{
    /* test_orig(); */

    test_int();

    return(0);
}
 
tom fredriksen

Richard said:
tom fredriksen said:


Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation and is
read by an lvalue expression that does not have character type, the
behavior is undefined."

That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.

/tom
 
Chris Dollin

tom said:
That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.

It says using such a value /produces/ undefined behaviour: if you do
it, you can no longer appeal to the C standard for the remaining
behaviour of your code. An implementation can do /anything it likes/.

Now, you may happen to know - or believe - that your implementation
will do something harmless. And you may be right. But this is not
because of the semantics of C, but because of some implementation-specific
behaviour, which in turn may be accidental.

When we say that such-and-such is undefined behaviour, we mean that
the behaviour of the program can no longer be predicted from the
semantics of C, and that the standard allows /anything/, even
unreasonable or unphysical behaviour. Luckily, most implementations
are constrained by physics, otherwise we'd all require much bigger
nostrils.
 
tom fredriksen

Chris said:
It says using such a value /produces/ undefined behaviour: if you do
it, you can no longer appeal to the C standard for the remaining
behaviour of your code. An implementation can do /anything it likes/.

Now, you may happen to know - or believe - that your implementation
will do something harmless. And you may be right. But this is not
because of the semantics of C, but because of some implementation-specific
behaviour, which in turn may be accidental.

When we say that such-and-such is undefined behaviour, we mean that
the behaviour of the program can no longer be predicted from the
semantics of C,

Can you then explain to me how it is that you think the behaviour of my
code can not be predicted? Are you actually telling me that just because
the array is uninitialised my program will fail/have undefined
behaviour, independent of how I use the array? lol:)

Before you answer, you should know that the reason for the problem is
that the rand() statement causes the compiler to output float operations
instead of integer operations. I tested it by replacing the statement in
the loop with a guaranteed integer statement. So the undefined behaviour
argument has exploded in its own face. See code below.

/tom

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>


void test_orig()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    struct timeval start_time;
    struct timeval end_time;
    int c,d;
    double t1, t2;

    for(c=0; c<count; c++) {
        data[c]=1.0+(unsigned int) (2000000000.0*rand()/(RAND_MAX+1.0));
    }

    gettimeofday(&start_time, NULL);

    for(d=0; d<50000; d++) {
        for(c=0; c<count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1=(start_time.tv_sec*1000)+(start_time.tv_usec/1000.0);
    t2=(end_time.tv_sec*1000)+(end_time.tv_usec/1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2-t1);
    printf("Total: %u\n", total);
}


void test_int()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    unsigned int data2[count];
    struct timeval start_time;
    struct timeval end_time;
    int c,d;
    double t1, t2;

    for(c=0; c<count; c++) {
        data2[c] = data[c];
    }


    gettimeofday(&start_time, NULL);

    for(d=0; d<50000; d++) {
        for(c=0; c<count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1=(start_time.tv_sec*1000)+(start_time.tv_usec/1000.0);
    t2=(end_time.tv_sec*1000)+(end_time.tv_usec/1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2-t1);
    printf("Total: %u\n", total);

    for(c=0; c<100; c++) {
        printf("data2: %u ", data2[c]);
    }
    printf("\n");
}



int main(int argc, char *argv[])
{
    /* test_orig(); */

    test_int();

    return(0);
}


/tom
 
Chris Dollin

tom fredriksen wrote:

(long quotes)
Can you then explain to me how it is that you think the behaviour of my
code can not be predicted?

Certainly.

When you don't initialise the data array, the line:

total += data[c]

references an indeterminately-valued object. This produces undefined
behaviour. Therefore, the further behaviour of your code cannot be
predicted /from the C standard/: you must use additional information.
 
tom fredriksen

Chris said:
>> Can you then explain to me how it is that you think the behaviour of my
>> code can not be predicted?
>
> Certainly.
>
> When you don't initialise the data array, the line:
>
> total += data[c]
>
> references an indeterminately-valued object.

Correct.

> This produces undefined behaviour.
> Therefore, the further behaviour of your code cannot be
> predicted /from the C standard/: you must use additional information.

You see, even though the standard defines it as undefined behaviour,
that does not in practice make it undefined behaviour. It is only defined
as such because the writers of the standard are not in a position to guess
such a variable's value or use for all possible future programs.

I think there is no point continuing the discussion. Because if you only
look at it literally, as you do, then you are correct, I don't dispute
that. But if you consider the pragmatics of it, in addition to the
literal meaning, which you should when you are programming, then you are
wrong.

/tom
 
Chris Dollin

tom said:
> Chris said:
>>> Can you then explain to me how it is that you think the behaviour of my
>>> code can not be predicted?
>>
>> Certainly.
>>
>> When you don't initialise the data array, the line:
>>
>> total += data[c]
>>
>> references an indeterminately-valued object.
>
> Correct.
>
>> This produces undefined behaviour.
>> Therefore, the further behaviour of your code cannot be
>> predicted /from the C standard/: you must use additional information.
>
> You see, even though the standard defines it as undefined behaviour,
> does not practically make it undefined behaviour.

In what way have I not been explicit about this?

"the further behaviour of your code cannot be predicted /from the C
standard/: you must use additional information."

"Undefined behaviour" in this newsgroup /means/ "behaviour left undefined
by the [relevant] C standard". So, for portability, /avoid/ UB, because
even if you know (or think you know) what will happen on the machine you
happen to be running on today, you don't know what will happen tomorrow,
or on a different machine elsewhere.

> It is only defined as such because the writers of the standard are
> not in a position to guess such a variable's value or use for all
> possible future programs.

It's defined as such because different implementations have been known to
do different things with this, or similar, situations, it's very hard
to define the exact line to cross, and even if you /could/ it probably
wouldn't help.

Here's a well-known type of undefined behaviour:

int eg() { int i = 17; return i++ + ++i; }

An interesting variety of "answers" are available, all correct.
> I think there is no point continuing the discussion. Because if you only
> look at it literally, as you do, then you are correct, I don't dispute
> that. But if you consider the pragmatics of it, in addition to the
> literal meaning, which you should when you are programming, then you are
> wrong.

The pragmatics are

* avoid code with undefined behaviour
* if you can't, justify it with additional standards
* if you can't, have sanity checks and document it
 
tom fredriksen

Chris said:
tom fredriksen wrote:
>>> total += data[c]

Explain to me how this statement causes undefined behaviour and how its
behaviour will be different on different architectures.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.

The only thing happening here is an addition of two binary values from a
memory address to another memory address or register, as defined in
the language specification for the "+=" operator.

I can imagine a system where the compiler or the architecture causes a
program using an uninitialised variable to abort or raise an exception /
interrupt etc. If that's what you mean then agreed, that can happen, but
I don't think it's a good idea.

/tom
 
Vladimir S. Oka

tom said:
> Chris said:
>> tom fredriksen wrote:
>>> total += data[c]
>
> Explain to me how this statement causes undefined behaviour and how its
> behaviour will be different on different architectures.
> The difference here is whether data[c] has a value set by me instead of
> a random value from some historic use of that memory address.

Why are you so sure the same memory area would have been used before,
and even if it was, that it's not going to contain something that's a
trap representation for the type that `data` above is? That, exactly,
is why it's *undefined* behaviour. As far as C Standard is concerned,
the memory may have just been shipped from Vladivostok, after being
used on a Russian nuclear submarine.

> The only thing happening here is an addition of two binary values from a
> memory address to another memory address or register, as defined in
> the language specification for the "+=" operator.

One of which may be a trap representation for the type...

> I can imagine a system where the compiler or the architecture causes a
> program using an uninitialised variable to abort or raise an exception /
> interrupt etc. If that's what you mean then agreed, that can happen, but
> I don't think it's a good idea.

Stop and think. There may exist bit patterns that are invalid (trap
representation) for a certain type, even if they are perfectly valid
for some other type. Your `data` array, which I believe is of a type that
may have trap representation(s), may have been allocated the space
containing just such a bit pattern (e.g. from previous use by `unsigned
char` which cannot have trap values, padding bits notwithstanding).
 
Chris Dollin

tom said:
> Chris said:
>> tom fredriksen wrote:
>>> total += data[c]
>
> Explain to me how this statement causes undefined behaviour

/because the standard says so/.

Really. If `data[c]` has not been given a value (in a way sanctioned
by the standard), then the Standard says that all bets are now off,
it can't say anything about the future behavior of the code.

That's all that "undefined behaviour" means (according to the C
standard). It means that the Standard doesn't specify the future
behaviour of the code - it's not defined, it's un-defined, the
implementation is at liberty to prescribe the behaviour in any
way it likes without fear of contradiction.

> and how its
> behaviour will be different on different architectures.

It will be different if the architecture, or the implementation,
says so.

> The difference here is whether data[c] has a value set by me instead of
> a random value from some historic use of that memory address.

Well, no. The difference is that the implementation can do what it
likes. It could arrange to set all the elements to 0, or -1, or
0xdeadbeef, or __TRAP before you access it. The implementation
could arrange that the array is given its own memory segment
without read access, and only grant read access when at least
one write has been given. The compiler could spot that you'll
access uninitialised memory and, just before you do so, plant
code:

puts( "you were going to read uninitialised memory!" );
exit( 17 );

> The only thing happening here is an addition of two binary values from a
> memory address to another memory address or register, as defined in
> the language specification for the "+=" operator.

.... which says that anything is allowed to happen at this point.

> I can imagine a system where the compiler or the architecture causes a
> program using an uninitialised variable to abort or raise an exception /
> interrupt etc. If that's what you mean

It's one of the things that is /permitted/.

> then agreed, that can happen, but I don't think it's a good idea.

But people who think it /is/ a good idea can use implementations
that /do/ do it, and those implementations are conformant (in that
respect).

Myself, I'd like to be /able/ to use an implementation that spots
bad reads and bad writes; it's nice to know I'm allowed to.
 
tom fredriksen

Chris said:
> tom said:
>> Chris said:
>>> tom fredriksen wrote:
>>>> total += data[c]
>>
>> Explain to me how this statement causes undefined behaviour
>
> It will be different if the architecture, or the implementation,
> says so.

That's not really an argument, it's just a contradiction. Please give a
factual example of how it can cause undefined behaviour.

>> The difference here is whether data[c] has a value set by me instead of
>> a random value from some historic use of that memory address.
>
> Well, no. The difference is that the implementation can do what it
> likes. It could arrange to set all the elements to 0, or -1, or
> 0xdeadbeef, or __TRAP before you access it.


What is a __TRAP on, e.g., an x86 or a PPC? In the eyes of a CPU which
deals with 32-bit integer datatypes, it is just another 32-bit value.

> The implementation
> could arrange that the array is given its own memory segment
> without read access, and only grant read access when at least
> one write has been given. The compiler could spot that you'll
> access uninitialised memory and, just before you do so, plant
> code:
>
> puts( "you were going to read uninitialised memory!" );
> exit( 17 );

Could you point me to any system that does anything like this for the
given example?

/tom
 
Chris Dollin

tom said:
> Chris said:
>> tom said:
>>> Chris Dollin wrote:
>>>> tom fredriksen wrote:
>>>>> total += data[c]
>>>
>>> Explain to me how this statement causes undefined behaviour
>>
>> It will be different if the architecture, or the implementation,
>> says so.
>
> That's not really an argument, it's just a contradiction.

It's not a contradiction. Look, you're asking the wrong question:
how could it happen on some machine? /It doesn't matter./ What
matters is that the behaviour is not constrained by the standard;
the implementation is free to choose what to do.

> Please give a
> factual example of how it can cause undefined behaviour.

You seem to think that "undefined behaviour" is something specific.
It isn't. /Defined/ behaviour is specific. The "cause" of undefined
behaviour is doing something that the standard says produces
undefined behaviour.

>>> The difference here is whether data[c] has a value set by me instead of
>>> a random value from some historic use of that memory address.
>>
>> Well, no. The difference is that the implementation can do what it
>> likes. It could arrange to set all the elements to 0, or -1, or
>> 0xdeadbeef, or __TRAP before you access it.
>
> What is a __TRAP on, e.g., an x86 or a PPC?

Whatever the implementation says it is (if it exists).

> In the eyes of a CPU which
> deals with 32-bit integer datatypes, it is just another 32-bit value.

[Just in passing ... your data array was an `int`, right? So it can be
a 16-bit integer, not a 32-bit one.]

Or just another 64-bit value with 32 value bits, 31 padding bits,
and a single "unassigned" bit set and read appropriately. Slower,
less compact, but handy for debugging.
> Could you point me to any system that does anything like this for the
> given example?

No. Does that matter? Won't you believe that that latitude is granted
without a system to hand that actually exploits it?

[A quick Google suggests that running under Valgrind or Saber-C would
come plausibly close.]

A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...
 
bert

Chris said:
> . . .
> A final note: if `data[]` is uninitialised, `total += data[c]` might
> very easily overflow . . .

At last, someone has picked up on what I posted very early in
this thread. Overflow interrupts need to be serviced, and this
consumes extra TIME which, depending on the implementation,
may be charged to the process in which they occurred.
 
tom fredriksen

Chris said:
You seem to think that "undefined behaviour" is something specific.
It isn't. /Defined/ behaviour is specific. The "cause" of undefined
behaviour is doing something that the standard says produces
undefined behaviour.

No, I don't think that. But undefined behaviour in my book is behaviour
that is not defined; if you tell the computer to do something, then it's
defined. The compiler controls the undefined behaviour, so it's not
undefined anymore, portability or not.

Let's just agree that we to some extent disagree; we have different
viewpoints. I agree that in the standard it's undefined, but in my
opinion it is defined because of the above statement.

/tom
 
tom fredriksen

Chris said:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...

I forgot to comment on this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow; what happens with the OF bit then?
In my program that was of no concern, so the "undefined behaviour" is
irrelevant.

/tom
 
Richard Bos

tom fredriksen said:
No I don't think that. But undefined behaviour in my book is behaviour
not defined, if you tell the computer to do something then its defined.
The compiler controls the undefined behaviour so, therefore its not
undefined anymore, portability or not.

Lets just agree that we to some extent disagree, we have different
viewpoints. I agree that in the standard its undefined, but in my
opinion it is defined because of the above statement.

The whole issue simply boils down to this:

There are more implementations, tom fredriksen, in Heaven and Earth,
than are dreamt of in thy philosophy.

And if you think that you can rely on always having an implementation
that does what your limited experience on desktop personal computers
tells you it "should" do, get your code away from my computer. It cannot
be trusted.

Richard
 
Richard Bos

tom fredriksen said:
Chris said:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...

I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?

There is no overflow bit. Your program is immediately terminated with
the status E_INT_OVERFLOW.

Oh, not on your implementation? Gosh, implementations being different -
what _is_ this world coming to...

Richard
 
Chris Dollin

tom said:
Chris said:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...

I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?

There need not be an overflow bit. There might, for example, be an
immediate exception.

> In my program that was of no concern, so the "undefined behaviour" is
> irrelevant.

Not if you're running on an overflow-trapping implementation it isn't.

It's like this: playing games with uninitialised data isn't safe.
In some cases, you get away with it, but the more you push it, the
more likely that the revolving blades will chop out your (hopefully,
metaphorical) eyes.

Pragmatics say: don't /do/ that.
 
Rod Pemberton

Actually, I think Tom's interpretation for his implementation is correct.
Take another look:

> C89 draft:
>
> * Undefined behavior --- behavior, upon use of a nonportable or
> erroneous program construct, of erroneous data,

The construct and data aren't erroneous for his implementation. But they
may be elsewhere.

> or of
> indeterminately-valued objects,

They aren't indeterminately-valued for his implementation.

> for which the Standard imposes no
> requirements.

Therefore, valid for his implementation. Maybe UB elsewhere.

> C99 final:
>
> "Certain object representations need not represent a value of the object
> type. If the stored value of an object has such a representation

The object types do represent a value of the object type in his
implementation.

> and is
> read by an lvalue expression that does not have character type,

It has a character type in his implementation.

> the
> behavior is undefined."

Therefore, not undefined for his implementation, but may be UB elsewhere.

The key is that his code is correct for his environment.


Rod Pemberton
 
Rod Pemberton

tom fredriksen said:
Before you answer, you should know that the reason for the problem is
that the rand() statement causes the compiler to output float operations
instead of integer operations. I tested it by replacing the statement in
the loop with a guaranteed integer statement. So the undefined behaviour
argument has exploded in its own face. See code below.

/tom

Good! You found the problem. I think that concludes with my prior
statements being correct. :)


tom fredriksen said:
I tested that but it gave no difference either.


Later,

Rod Pemberton
 
