which loop is faster

P

petermcmillan_uk

Jack said:
On 3 Jan 2005 04:11:39 -0800, "EventHelix.com" <[email protected]>
wrote in comp.lang.c:

Either figure out how to do one of the following:

1. Get the new Google groups reply to quote context.

2. Quote the context yourself.

3. If you can't do either of the above, use a real newsreader. Free
Agent, Mozilla Thunderbird, and Gravity are all free and quite good.


What's a 'cache'? Where is it defined in the C standard? I write C
code for quite a few platforms that don't have any such thing.

So what's your answer if the OP is writing code for a platform without
a cache?

Hmm, I dunno whether you're joking? The cache is a type of memory,
there are quite a few caches within the computer. In this case the
important caches are the ones CPU caches. Have a look at
http://www.pantherproducts.co.uk/Articles/CPU/CPU Cache.shtml for
more info.
 
M

Mike Wahler

Hmm, I dunno whether you're joking?

He's not.
The cache is a type of memory,
there are quite a few caches within the computer.

1. The topic here is the C language, not computers.
2. The C language has no notion of 'cache'
2. Not all computers use 'caches'
In this case the
important caches are the ones CPU caches. Have a look at
http://www.pantherproducts.co.uk/Articles/CPU/CPU Cache.shtml for
more info.

No need. The hardware described there might include
'caches', but that's nothing to do with C.

-Mike
 
M

Mike Wahler

Yes, mathematically dis is true.

When it comes to relams of C compilers its totally up to the compiler how it
generates the machine instructions to do the same. and dat also depends on,
for which target machine you are going to generate dat code. Its properties
affect the binary code generation like : word length, chache - if any the
processor has, address n data buses (von newman vs. harvard architecture) n
of couse, how compiler optimize the code while exploiting all these
available facilites.

One important thing - Standard doesn't make any assumptions about it.

It is obvious by closely analysing the two loops that memory access will be
more in first case
[100]
...[10]

as compared to
[10]
...[100]

'coz inner loop breaks 10 times more and intialization has to be performed
for loop control variable 10 times more, assuming, all other things being
constant.

Is that make sense?

-Neo

OK, I just had to find out the truth,

The truth for your particular platform, implementation,
and current configuration of such.
so I compiled the program and
looked at the disassemly. The results seem to be pretty much the same.

For your current setup.
Take a look :

No need.

[snip]
That's one way, and if it's done the other way then the only difference
will be the parameters used for the for looping.

On your system.
The thing that makes
it loop is jmp, and in both cases it will jmp the same number of times.
The other parts of the for loops have exactly the same instructions,
but with different parameters.
I think we can conclude that neither of these loops will be faster.

On your system. On others, the results could be completely different.

-Mike
 
N

Neo

sushant said:
hi all,

suppose i have 2 loops one inside the other, like this

1) for(i=0;i<=100;i++)
{
for(j=0;j<=10;j++)
{
some code;
}
}

2) for(i=0;i<=10;i++)
{
for(j=0;j<=100;j++)
{
some code;
}
}

so which loops will work faster the 1st one or the 2nd one.

thnx in advance
sushant

2nd will be faster.
O'kay lets try to analyze it.

As per K&R2 (Page 60):
"The for statement
for(expr1; expr2; expr3)
statement
is equivalent to
expr1;
while(expr2) {
statement
expr3;
}

Grammatically, the three components of a for loop are expressions. Most
commonly, expr1 and expr3 are assignment or function calls and expr2 is a
relational expression."

In 1st one "expr1" will be executed 100 times, because inner loop breaks
more frequently. In 2nd case "expr1" will be executed 10 times only. So, all
things being equal (like generated target code) execution time will be more
in 1st case.

O'kay What about
for(i=0; i<1000; i++)
{
some code;
}
Will it run faster then any of the above versions?

Does it make sense, its all Standard C...

-Neo
 
D

Derrick Coetzee

sushant said:
suppose i have 2 loops one inside the other, like this

> [ snipped: two nested loops, same with loops reversed ]

so which loops will work faster the 1st one or the 2nd one.

A good compiler will perform loop exchange where it's possibly and
beneficial on that platform. In other words, both run just as quickly
with a good compiler (because it transforms one into the other).
 
M

Mark McIntyre

(snip example of nested loops)
2nd will be faster.

There's no way to say that.
O'kay lets try to analyze it.

You can't, unless you know what "some code" inside the inner loop is. A
modern optimising compiler can optimise both these to be identical, if it
can determine there are no side-effects of the inner code. Many compilers
might even unroll both loops entirely and execute 1000 evaluations of 'some
code'.

And then of course modern CPUs can often run such loops as parallel
processes in different pipelines. Again this might (or might not) mean that
version X was faster than version Y.

This is all in concordance with the 'as if' principle which allows the
compiler and/or hardware to rearrange your code how it likes, as long as
the result is the same as if it hadn't.

(of a once-unrolled loop)
Will it run faster then any of the above versions?

Yes. No. Sometimes. Never.
 
N

Neo

Mark McIntyre said:
(snip example of nested loops)


There's no way to say that.


You can't, unless you know what "some code" inside the inner loop is. A
modern optimising compiler can optimise both these to be identical, if it
can determine there are no side-effects of the inner code. Many compilers
might even unroll both loops entirely and execute 1000 evaluations of
'some
code'.

And then of course modern CPUs can often run such loops as parallel
processes in different pipelines. Again this might (or might not) mean
that
version X was faster than version Y.

This is all in concordance with the 'as if' principle which allows the
compiler and/or hardware to rearrange your code how it likes, as long as
the result is the same as if it hadn't.

(of a once-unrolled loop)


Yes. No. Sometimes. Never.

O'kay! that may be, or may not be TRUE, for generated code
or a particular implementation/hardware combinattion.

Also as per C Standard -
5.1.2.3 Program execution
1 The semantic descriptions in this International Standard describe the
behavior of an
abstract machine in which issues of optimization are irrelevant.
[-snip-]
3 In the abstract machine, all expressions are evaluated as specified by the
semantics.
An actual implementation need not evaluate part of an expression if it can
deduce that
its value is not used and that no needed side effects are produced
(including any
caused by calling a function or accessing a volatile object).

But as per language semantics, as also abstract machine states (for loop)
6.6.5.3 The for statement
1 Except for the behavior of a continue statement in the loop body, the
statement
for ( clause­1 ; expression­2 ; expression­3 ) statement
and the sequence of statements
{
clause­1 ;
while ( expression­2 ) {
statement
expression­3 ;
}
}
are equivalent (where clause­1 can be an expression or a declaration). 114
2 Both clause­1 and expression­3 can be omitted. If either or both are an
expression,
they are evaluated as a void expression. An omitted expression­2 is replaced
by a
nonzero constant.

"clause 1" will make the difference, becase it will be evaluated every time
loop is entered.

-Neo
 
I

infobahn

2 Both clause­1 and expression­3 can be omitted. If either or
both are an expression, they are evaluated as a void expression.
An omitted expression­2 is replaced by a nonzero constant.

"clause 1" will make the difference, becase it will be evaluated every time
loop is entered.

It needn't be, for the reasons you pointed out. If it can be
optimised out legally, then it won't make any difference at all.
 
T

Thomas Matthews

Lawrence said:
This is suspicious because it will loop 101 times,




and this will loop 11 times. If you expected 100 and 10 then make the
comparison a < rather than a <= . That's common and normal in C as we tend
to count from 0 a lot especially when array indexing is concerned. A 100
element array has elements indexed from 0 to 99 and trying to access an
element at position 100 would be an error.




It is impossible to say. It will depend on your compiler (and the options
you use with it), the processor used to run the code and not least on what
exactly "some code" is. If "some code" uses i and/or j then the two
samples are not equivalent and it doesn't make much sense to talk about
which is faster. That's also the case if i and/or j are used after the
loops.

Lawrence

Hi, nice to see you back, if you are the real "god".

Is the compiler allowed to optimize the two loops into one?
Given: (Example 1)
unsigned int i, j;
for (i = 0; i < 10; ++i)
{
for (j = 0; j < 100; ++j)
{
Some_Code();
}
}

According to my elementary computer science knowledge, the
statement:
Some_Code();
is executed 1000 times. I believe this would be the same
as: (Example 2)
unsigned int k;
for (k = 0; k < 1000; ++k)
{
Some_Code();
}

Is there any rule in the standard preventing the compiler from
converting Example 1 to Example 2 (i.e. factoring out the
nested loop to just one loop)?

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library
 
J

Joona I Palaste

Thomas Matthews said:
Lawrence Kirby wrote:
(snip)


Hi, nice to see you back, if you are the real "god".

Lawrence, next time someone asks you if you're a god, you say yes!

--
/-- Joona Palaste ([email protected]) ------------- Finland --------\
\-------------------------------------------------------- rules! --------/
"The day Microsoft makes something that doesn't suck is probably the day they
start making vacuum cleaners."
- Ernst Jan Plugge
 
C

CBFalconer

Joona said:
Thomas Matthews <[email protected]> scribbled the following:

(snip)


Lawrence, next time someone asks you if you're a god, you say yes!

No, ignore them, because they have insufficient faith, and are thus
doomed. True worshippers know the truth and have no doubts.
 
J

Joe Wright

CBFalconer said:
No, ignore them, because they have insufficient faith, and are thus
doomed. True worshippers know the truth and have no doubts.

Amen. There is a god and his name is fred.
 
J

Joona I Palaste

No, ignore them, because they have insufficient faith, and are thus
doomed. True worshippers know the truth and have no doubts.

You failed to get the reference, didn't you?
 
L

Lawrence Kirby

....

Hi, nice to see you back, if you are the real "god".


Err, thanks. :)
Is the compiler allowed to optimize the two loops into one?
Given: (Example 1)
unsigned int i, j;
for (i = 0; i < 10; ++i)
{
for (j = 0; j < 100; ++j)
{
Some_Code();
}
}

That depends on what is in Some_Code() and whether it uses i and j. If it
is literally a function call with no arguments as shown here then yes it
can do that. It can do anything it likes as long as the program generates
correct output when run successfully.

Lawrence
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,159
Messages
2,570,883
Members
47,416
Latest member
ReeceDorri

Latest Threads

Top