Optimization

S

Spidey

is there any ulternative metho to optimize the following piece of code
(for loop)


main()
{
int i,j;
for(i=0;i<40000000;i++)
j=1;
}

just wanna know how to optimize the for loop; i know the code has no
logic...........
but i heard that there is a way to optimize by restructuring the aboce
code. The j=1 assignment must be kept inside the loop.
 
S

Skarmander

Spidey said:
is there any ulternative metho to optimize the following piece of code
(for loop)


main()

int main(void), please.
{
int i,j;
for(i=0;i<40000000;i++)

40000000 is well out of the range provided by the standard for an int.
Make it a long.
j=1;
}

just wanna know how to optimize the for loop; i know the code has no
logic...........
but i heard that there is a way to optimize by restructuring the aboce
code. The j=1 assignment must be kept inside the loop.

Your post makes no sense. The loop you've shown is equivalent to doing
nothing at all, and the only statement it contains is not to be moved
outside of the loop. There's nothing to optimize. A good compiler will
optimize this by generating code for "i = 40000000; j = 1", and that's
assuming the variables aren't optimized out of existence too.

You can't "restructure" what isn't there. You may be thinking of the
machine language instructions that might be generated by a compiler in
order to implement this loop. That's a question for another newsgroup
that discusses your compiler/platform, however. Also, no compiler worth
its salt will generate faster code if you implement the same loop with a
different loop construct, be it a while, a do-while or a goto.

S.
 
M

MCheu

is there any ulternative metho to optimize the following piece of code
(for loop)


main()
{
int i,j;
for(i=0;i<40000000;i++)
j=1;
}

just wanna know how to optimize the for loop; i know the code has no
logic...........
but i heard that there is a way to optimize by restructuring the aboce
code. The j=1 assignment must be kept inside the loop.

That's kind of the problem. Once you get some clue about the logic,
then you can look at what's going on and see if it's doing anything
that's not necessary and whether you can make some assumptions along
the way if you already know the answer to some calculations. In this
case, since the for loop doesn't do anything, you can cut out the for
loop and skip to the end:

int main (void)
{
long int i=39999999;
int j=1;
return 0;
}

Still does nothing, but it'll do it a bit faster.

Right now, you're just setting j=1 about 40,000,000 times, which is
kind of nuts.
 
M

Michael Mair

MCheu said:
That's kind of the problem. Once you get some clue about the logic,
then you can look at what's going on and see if it's doing anything
that's not necessary and whether you can make some assumptions along
the way if you already know the answer to some calculations. In this
case, since the for loop doesn't do anything, you can cut out the for
loop and skip to the end:

int main (void)
{
long int i=39999999; ITYM 40000000
int j=1;
return 0;
}

Still does nothing, but it'll do it a bit faster.

Right now, you're just setting j=1 about 40,000,000 times, which is
kind of nuts.
---------------------------------------------
Thanks.


MCheu
 
R

Richard Harter

is there any ulternative metho to optimize the following piece of code
(for loop)


main()
{
int i,j;
for(i=0;i<40000000;i++)
j=1;
}

just wanna know how to optimize the for loop; i know the code has no
logic...........
but i heard that there is a way to optimize by restructuring the aboce
code. The j=1 assignment must be kept inside the loop.

As a preliminary remark loop optimization gains very little unless the
loop body is quite short. The reason for this is that all that loop
optimization gains is reducing the cost of the loop test and increment
which is dominated by the cost of the loop body. Furthermore it is
often the case that a good compiler can do the optimizations itself.

That said: There are a couple of common loop optimizations, counting
down to zero, and loop unrolling.

The reason for counting down to zero is that many machines have a
decrement and branch instruction or the equivalent, which means that
the loop test can be a single instruction. Even if it doesn't, it is
generally cheaper to test against zero/nonzero than it is to compare
against a value. To illustrate with your code:

int main ()
{
long i,j;
for (i=N; --i >= 0;)
{
LOOP_BODY(i);
}
}

where N is the number of times the loop is to be executed and
LOOP_BODY is the loop body, both defined elsewhere so that we don't
confuse ourselves with irrelevancies.

It turns out that there are a variety of ways of counting down. Thus
the above code counts down from N-1 to 0 and terminates with i being
negative; the value of i being tested is the same as that used in the
loop body. (This may or may not matter.) Some alternatives for the
for body are:

for (i = N; i-- > 0;)
for (i = N; i>0; i--)
for (i = N; i-- != 0;)
for (i = N; i != 0; i--)

The important thing here is to make sure that the loop body is
formulated so that counting down works.

A second trick is loop unrolling. The basic idea here is that the
loop body code is to be repeated several times within the loop. This
eliminate a large percentage of the loop tests. There are two gotchas
to take into consideration. The first is that we have some left over
iterations if N is not a multiple of the loop unrolling factor. The
second is that the loop index doesn't change between instances of the
loop body within the loop.

There are various ways to handle the left over iterations. A
convenient way is to use two loops, one to handle the left overs and
the other to use that part of N that is divisible by the loop
unrolling factor. I find it convenient to make the loop unrolling
factor a power of two and use the general formula:

for (i = (N&7) ; --i >= 0;) /* remainder loop */
{
LOOP_BODY;
}
for (i = (N>>3) ; --i >= 0;) /* main loop */
{
LOOP_BODY; LOOP_BODY;
LOOP_BODY; LOOP_BODY;
LOOP_BODY; LOOP_BODY;
LOOP_BODY; LOOP_BODY;
}

Generally speaking LOOP_BODY should not depend on i.

Richard Harter, (e-mail address removed)
http://home.tiac.net/~cri, http://www.varinoma.com
I started out in life with nothing.
I still have most of it left.
 
C

Christian Bau

As a preliminary remark loop optimization gains very little unless the
loop body is quite short. The reason for this is that all that loop
optimization gains is reducing the cost of the loop test and increment
which is dominated by the cost of the loop body. Furthermore it is
often the case that a good compiler can do the optimizations itself.

That said: There are a couple of common loop optimizations, counting
down to zero, and loop unrolling.

The reason for counting down to zero is that many machines have a
decrement and branch instruction or the equivalent, which means that
the loop test can be a single instruction. Even if it doesn't, it is
generally cheaper to test against zero/nonzero than it is to compare
against a value. To illustrate with your code:

int main ()
{
long i,j;
for (i=N; --i >= 0;)
{
LOOP_BODY(i);
}
}

where N is the number of times the loop is to be executed and
LOOP_BODY is the loop body, both defined elsewhere so that we don't
confuse ourselves with irrelevancies.

And any compiler worth its money will do that much better if you leave
your loops alone and let the compiler do its job.
It turns out that there are a variety of ways of counting down. Thus
the above code counts down from N-1 to 0 and terminates with i being
negative; the value of i being tested is the same as that used in the
loop body. (This may or may not matter.) Some alternatives for the
for body are:

for (i = N; i-- > 0;)
for (i = N; i>0; i--)
for (i = N; i-- != 0;)
for (i = N; i != 0; i--)

The important thing here is to make sure that the loop body is
formulated so that counting down works.

The most important thing is to write readable code; the compiler is much
better at micro-optimisations than you will ever be.
A second trick is loop unrolling. The basic idea here is that the
loop body code is to be repeated several times within the loop. This
eliminate a large percentage of the loop tests. There are two gotchas
to take into consideration. The first is that we have some left over
iterations if N is not a multiple of the loop unrolling factor. The
second is that the loop index doesn't change between instances of the
loop body within the loop.

And any compiler worth its money will do that much better if you leave
your loops alone and let the compiler do its job.
 
M

MCheu

ITYM 40000000

Took me a while to figure out your acronym ITYM="I Think You Mean."

You're right. I forgot the final iteration that finally kicks you out
of the loop, so by the end of the loop, it would be 40,000,000.
 
T

Tatu Portin

Skarmander said:
int main(void), please.


40000000 is well out of the range provided by the standard for an
int. Make it a long.


During my course in programming, I've never seen a 16-bit int, even it
is said in (old) books that size of int is 16-bit. Only 16-bit
environment I know is 16-bit dos, but that can be regarded as
obsolete (DJGPP is 32-bit environment and has 32-bit ints).

You should use limits.h if portability is important.
 
G

Gordon Burditt

40000000 is well out of the range provided by the standard for an
During my course in programming, I've never seen a 16-bit int, even it
is said in (old) books that size of int is 16-bit. Only 16-bit
environment I know is 16-bit dos, but that can be regarded as
obsolete (DJGPP is 32-bit environment and has 32-bit ints).

The C Standard is not an "old book". And portability doesn't
depend on the systems *you've* seen. (How much work have you
done with embedded systems?)
You should use limits.h if portability is important.

<limits.h> really won't help much determining what types you need
to use to hold data. C doesn't have declarations like:

integer<0..65535> x;


Gordon L. Burditt
 
S

Skarmander

Tatu said:
Skarmander wrote:

During my course in programming, I've never seen a 16-bit int, even it
is said in (old) books that size of int is 16-bit. Only 16-bit
environment I know is 16-bit dos, but that can be regarded as
obsolete (DJGPP is 32-bit environment and has 32-bit ints).
"There are more things in heaven and earth, Horatio, then are dreamt of
in your philosophy."

These (non-DOS) systems exist. Even today. And if it's really no skin
off your back to write your program in such a way that it won't break
should it end up on such a system, why not do it?
You should use limits.h if portability is important.
That makes no sense. The standard guarantees 40000000 will fit in a
long, possibly with room to spare. Therefore "long" is a perfectly
portable solution, and "int" is not. <limits.h> wouldn't buy you
anything extra.

This is the old "you should use exact sizes for your integers if
portability is important" argument, which is just another way of stating
"you should fix the exact platform you expect to run on to get
portability". This sort of portability is hollow: your application will
be very easy to transfer to platforms that meet your exact
specifications, and practically impossible to port to those that don't.
This is almost never warranted.

S.
 
C

Christian Bau

Tatu Portin said:
During my course in programming, I've never seen a 16-bit int, even it
is said in (old) books that size of int is 16-bit. Only 16-bit
environment I know is 16-bit dos, but that can be regarded as
obsolete (DJGPP is 32-bit environment and has 32-bit ints).

You should use limits.h if portability is important.

Since "long" is a perfectly fine type to handle numbers of that size, in
a perfectly portable way, it is just plain stupid to use "int" in this
situation.

"I've never seen this" is an excuse, but using it as a reason for bad
coding is seriously unprofessional.
 
K

Keith Thompson

Christian Bau said:
Since "long" is a perfectly fine type to handle numbers of that size, in
a perfectly portable way, it is just plain stupid to use "int" in this
situation.

"I've never seen this" is an excuse, but using it as a reason for bad
coding is seriously unprofessional.

I think "just plain stupid" is a bit overstated. On a system where
int is 32 bits, long is 64 bits, and operations on int are
significantly faster than operations on long, it makes sense to use
int here if performance is sufficiently important.

Of course, this argues for using one of the typedefs in <stdint.h>,
not for using int directly. (If your implementation doesn't provide
<stdint.h>, there are C90-compatible versions out there.)
 
?

=?ISO-8859-1?Q?=22Nils_O=2E_Sel=E5sdal=22?=

Keith said:
I think "just plain stupid" is a bit overstated. On a system where
int is 32 bits, long is 64 bits, and operations on int are
significantly faster than operations on long, it makes sense to use
int here if performance is sufficiently important.
Says who ?
Some 64bit systems have 32 bit longs (e.g. win64), while 64 bit
operations are generally faster on 64 bit arcitectures. Though
marginally on e.g. amd64.

If you want/can, use int_fast32_t from the c99 stdint.h
 
K

Keith Thompson

Nils O. Selåsdal said:
Says who ?

Says I.
Some 64bit systems have 32 bit longs (e.g. win64), while 64 bit
operations are generally faster on 64 bit arcitectures. Though
marginally on e.g. amd64.

Then such systems don't meet the (hypothetical) conditions I stated,
do they? *If* int is 32 bits, long is 64 bits, and int is
significantly faster than long, then it can make sense to use int when
you need a 32-bit type.
If you want/can, use int_fast32_t from the c99 stdint.h

Of course, which is why I mentioned <stdint.h> in the portion of my
article that you didn't quote.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,174
Messages
2,570,941
Members
47,476
Latest member
blackwatermelon

Latest Threads

Top