how to avoid loop overhead of for(), while()

  • Thread starter Pushkar Pradhan
  • Start date
P

Pushkar Pradhan

I want to time my matrix multiply code (in MFLOPS). I want to run the
code 100,000 times or some other big number.
This can be done 2 ways (for and while loops):

timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.
So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}
Pushkar Pradhan
 
J

John Tsiombikas (Nuclear / the Lab)

Pushkar said:
timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.
So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}
Pushkar Pradhan

ehm... and how exactly do you think that statement would be implemented
if it existed apart from keeping a counter initialized to 100000 and
decremented and tested for equality with 0 for each loop? do you know of
any magic that the rest of us miss?

-- Nuclear / the Lab --
 
R

Rick

Pushkar said:
I want to time my matrix multiply code (in MFLOPS). I want to run the
code 100,000 times or some other big number.
This can be done 2 ways (for and while loops):

timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.

A good compiler should be (and usually is) smart enough to know about
the comparisons and increments and optimize this away. Even if it
doesn't, these comparisons and integer increments hardly ever take up
extra CPU cycles compared to what your mm() function will be taking
(flops will take more cycles and so while a single flop is underway, the
processor will be able to do a number of integer operations and
comparisons). Effectively, there will be *no* overhead involved in using
a loop for timing your function. Literally *all* scientific code there
is out there uses loops for measuring performance of functions. You
should have no concern about overheads at all.

So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}

Even if there was a way to do this, this would require an implicit
comparison on the number of times the code has been executed, and
keeping a count by incrementing it each time, thus having the same effect.

-Rick
 
K

Kyle S. Shim

Run two routines measuring the time respectively.

1. first routine
timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

time1 = difftime(timer2, timer1);

2. second routine - n must be declared with the volatile type qualifier
timer1 = time(NULL);
for(n = 0; n < 100000; n++);

timer2 = time(NULL);

time2 = difftime(timer2, timer1);

The net time to call mm() 100000 times could be calculated....
exectime = time2 - time1.

But the results may differ every time you run.


: I want to time my matrix multiply code (in MFLOPS). I want to run the
: code 100,000 times or some other big number.
: This can be done 2 ways (for and while loops):
:
: timer1 = time(NULL);
: for(n = 0; n < 100000; n++)
: mm();
: timer2 = time(NULL);
:
: exectime = difftime(timer2, timer1);
:
: However, this means there will be a comparison of "n" on each
: iteration and increment operation each time. I want to avoid this, if
: you use while() same problem.
: So is there someway to avoid the loop overhead, i.e. tell it:
: do 100000 times{
: mm
: }
: Pushkar Pradhan
 
M

Martijn

Kyle said:
1. first routine
timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

time1 = difftime(timer2, timer1);

/* highrestime is assumed to be an implementation of a highresolution timer
*/
totaltime = 0;
for ( ... )
{
timer = highrestime(NULL);
mm;
totaltime += highrestime(NULL) - timer;
}

of course you would need a real high res timer, or this still wouldn't work
properly.

Good luck,
 
D

Derk Gwen

# However, this means there will be a comparison of "n" on each
# iteration and increment operation each time. I want to avoid this, if
# you use while() same problem.
# So is there someway to avoid the loop overhead, i.e. tell it:
# do 100000 times{
# mm
# }

There may be vector hardware on your machine, but C doesn't have a standard way to
access that. Some variants of C do, C* I think is one, as well other languages like
Fortran 90.
 
M

Mike Wahler

Pushkar Pradhan said:
I want to time my matrix multiply code (in MFLOPS). I want to run the
code 100,000 times or some other big number.
This can be done 2 ways (for and while loops):

timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.
So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}
Pushkar Pradhan

Despite the fact that the loop counting you show
above will not likely have a noticeable impact on
your timing, I had fun writing this:


#include <stdio.h>
#include <stdlib.h>

#define PROTOTYPE "void mm();\n"
#define CALL " mm();\n"
#define ITERATIONS 100000

int main()
{
const char *code[] =
{
"#include <stdio.h>\n",
"#include <time.h>\n",
"\n",
PROTOTYPE,
"\n",
"int main(void)\n",
"{\n",
" time_t timer1 = time(NULL);\n",
" time_t timer2;\n",
" double exectime = 0;\n",
"\n",
"@",
"\n",
" timer2 = time(NULL);\n",
" exectime = difftime(timer2, timer1);\n",
" printf(\"Execution time %f seconds\\n\", exectime);\n",
" return 0;\n",
"}\n"
};

const int ret[] = {EXIT_SUCCESS, EXIT_FAILURE};

const size_t lines = sizeof code / sizeof *code;
size_t line = 0;

const unsigned long iters = ITERATIONS;
unsigned long iter = 0;

int call = 0;
int err = 0;

FILE *file = fopen("timeit.c", "w");

if(file)
for(line = 0; line < lines; ++line)
{
if(call = (*(code[line]) == '@'))
{
for(iter = 0; iter < iters; ++iter)
if(fputs(CALL, file) < 0)
break;
}
else
if(fputs(code[line], file) < 0)
break;

if(call && iter < iters)
break;
}
else
fprintf(stderr, "%s\n", "Cannot open output");

if(file && fclose(file) || (err = line < lines))
fprintf(stderr, "%s (line %lu)\n",
"Error writing output", (unsigned long)++line);

return ret[err];
}


Compile, link, and run this, compile the output
(file "timeit.c"), and link it with your 'mm()'
function. Of course you're still paying time
overhead for the call to 'mm()'. If you have
the source, you can apply this same method,
replacing the call to 'mm()' with its source,
and make a *really* big "timeit.c" file.

:)

-Mike
 
K

Keith Thompson

I want to time my matrix multiply code (in MFLOPS). I want to run the
code 100,000 times or some other big number.
This can be done 2 ways (for and while loops):

timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.
So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}

There's an optimization called loop unrolling that can eliminate some
of the overhead at the expense of larger code. For example:

timer1 = time(NULL);
for(n = 0; n < 10000; n++) {
mm();
mm();
mm();
mm();
mm();
mm();
mm();
mm();
mm();
mm();
}
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

Consult your compiler's documentation; it's quite possible that you
can persuade the compiler to do this for you. (gcc, for example, as a
"-funroll-loops" option.)
 
M

Minti

Kyle S. Shim said:
Run two routines measuring the time respectively.

1. first routine
timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

time1 = difftime(timer2, timer1);

2. second routine - n must be declared with the volatile type qualifier

I am not sure if declaring n to be a 'volatile' would make sure that n
would be incremented precisely 100,000 times rather than the block

for ( n = 0; n < 100000; n ++ )

being ignored completly.
 
M

Martijn

Minti said:
I am not sure if declaring n to be a 'volatile' would make sure that n
would be incremented precisely 100,000 times rather than the block

for ( n = 0; n < 100000; n ++ )

being ignored completly.

If you'd change the loop to call an empty function (and make sure the
compiler doesn't optimize it out), you would also take care of the overhead
of calling the function.
 
J

Joe Wright

Mike said:
[ snip ]

Compile, link, and run this, compile the output
(file "timeit.c"), and link it with your 'mm()'
function. Of course you're still paying time
overhead for the call to 'mm()'. If you have
the source, you can apply this same method,
replacing the call to 'mm()' with its source,
and make a *really* big "timeit.c" file.

:)

-Mike

Doggone it, that was going to be my answer. :)
 
A

Alan Balmer

There's an optimization called loop unrolling that can eliminate some
of the overhead at the expense of larger code. For example:

Yes. The other proposals all have a problem in that the processor's
handling of cache will probably have more effect than any other
factor. (In spite of the fact that cache isn't topical in c.l.c <g>.)
Completely unrolling the loop minimizes both the cache effects and the
overhead.
 
E

Eric Sosman

Pushkar said:
I want to time my matrix multiply code (in MFLOPS). I want to run the
code 100,000 times or some other big number.
This can be done 2 ways (for and while loops):

timer1 = time(NULL);
for(n = 0; n < 100000; n++)
mm();
timer2 = time(NULL);

exectime = difftime(timer2, timer1);

However, this means there will be a comparison of "n" on each
iteration and increment operation each time. I want to avoid this, if
you use while() same problem.
So is there someway to avoid the loop overhead, i.e. tell it:
do 100000 times{
mm
}

No problem:

timer1 = time(NULL);
mm(); /* 1 */
mm(); /* 2 */
/* 99997 additional lines left as an exercise */
mm(); /* 100000 */
timer2 = time(NULL);

However, it's probably not necessary to go to this much
trouble. mm() will probably take a good deal more time than
the loop's increment-test-branch overhead, so the presence
of the latter won't contaminate your timings noticeably. In
the more general case where mm() is replace with something
very fast, possibly on the same order of speed as the loop
machinery, you can time the loop with and without the "guts"
and subtract one timing from the other.
 
N

nrk

Mike said:
#define ITERATIONS 100000

const unsigned long iters = ITERATIONS;

Just out of curiosity, why the #define and then the const constant as well?
Is this for one of those situations where the value of PI might change? :)

-nrk.
 
M

Mike Wahler

nrk said:

Note: (30 lines were snipped)
Just out of curiosity, why the #define and then the const constant as well?
Is this for one of those situations where the value of PI might change?
:)

No, it's so I can quickly change the value, not
needing to hunt for it, by having the #define
at the top of the file. In a "real" program,
I'd have probably put it in a header, but that
seems silly for a single macro in a toy program.

-Mike
 
N

nrk

Mike said:
Note: (30 lines were snipped)

:)

No, it's so I can quickly change the value, not
needing to hunt for it, by having the #define
at the top of the file. In a "real" program,
I'd have probably put it in a header, but that
seems silly for a single macro in a toy program.

-Mike

Ok, Now I am confused. What is the utility of the const constant? Can't you
encode the type information in the #define as well? I know I must be
missing something, but I won't rest till someone beats me with a wet noodle
and makes me realize what it is :)

-nrk.
 
M

Mike Wahler

nrk said:
Ok, Now I am confused. What is the utility of the const constant?

It's a real object which can be type-checked.
Can't you
encode the type information in the #define as well?

Yes.

I know I must be
missing something, but I won't rest till someone beats me with a wet noodle
and makes me realize what it is :)

The constant could indeed have been expressed with a macro,
but my ingrained 'coding practice' causes me to use a const
object instead.

-Mike
 
N

nrk

Mike Wahler wrote:

The constant could indeed have been expressed with a macro,
but my ingrained 'coding practice' causes me to use a const
object instead.

Thank you. I know that this is just a style issue, but I am trying to build
up a case in favor of one or the other for my own coding style. So, if
you'd be so kind as to provide some of your reasons for preferring this
style...

-nrk.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,083
Messages
2,570,591
Members
47,212
Latest member
RobynWiley

Latest Threads

Top