Loop Efficiency

M

Martin

Hi,

Just a question of efficiency really. Here is a bit of a psudo noddy
example but I was wondering about how this might work in reality;

int[][] myArray = new int[1000][1000];

for(int i = 0; i < MaxValue(myArray); i++){

.....do something

}


private int MaxValue(int[][] anArray){

int max = 0;

for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

if(anArray[x][y] > max){

max = anArray[x][y];

}

} // end y

} // end x

return max;

} // end MaxValue

Please excuse any typos, I only intend to illustrate what queries I
have.
I have the following questions about efficiency;
1) Every time the i loop is run, will it have to run the MaxValue()
method? I am expecting that it does but I am unsure.
2) I have read that comparing values from arrays are slower than
primitive type variables. Would something like this be more
appropriate for the MaxValue() method;

for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

int tempValue = anArray[x][y];

if(tempValue > max){

max = tempValue;

}

} // end y

} // end x

3) Is there a performance difference between loops, e.g. for or while
loops

I know normally these efficiency considerations dont matter when
dealing with little amounts of data etc, however I have a very large
program I am developing to deal with Satellite Imagery, so the time
taken to do some of these things grows quite considerably.

Thank you,

Martin
 
J

Joshua Cranmer

Martin said:
I have the following questions about efficiency;
1) Every time the i loop is run, will it have to run the MaxValue()
method? I am expecting that it does but I am unsure.
Yes.

2) I have read that comparing values from arrays are slower than
primitive type variables. Would something like this be more
appropriate for the MaxValue() method;

No. The slow part is array access, which can be heavily optimized if the
code undergoes JIT. Indeed, during JIT, probably only one array access
will be made.
3) Is there a performance difference between loops, e.g. for or while
loops

You will not see a performance increase by switching between a for and a
while loop.
I know normally these efficiency considerations dont matter when
dealing with little amounts of data etc, however I have a very large
program I am developing to deal with Satellite Imagery, so the time
taken to do some of these things grows quite considerably.

"Premature optimization is the root of all evil." -- Donald Knuth
"More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason - including
blind stupidity." -- W. A. Wulf

In short, don't try to micro-optimize until you know where the
bottlenecks are. You probably don't know where the bottleneck is, even
if you have good intuition.
 
D

Daniel Pitts

Martin said:
Hi,

Just a question of efficiency really. Here is a bit of a psudo noddy
example but I was wondering about how this might work in reality;

int[][] myArray = new int[1000][1000];

for(int i = 0; i < MaxValue(myArray); i++){

.....do something

}


private int MaxValue(int[][] anArray){

int max = 0;

for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

if(anArray[x][y] > max){

max = anArray[x][y];

}

} // end y

} // end x

return max;

} // end MaxValue

Please excuse any typos, I only intend to illustrate what queries I
have.
I have the following questions about efficiency;
1) Every time the i loop is run, will it have to run the MaxValue()
method? I am expecting that it does but I am unsure.
2) I have read that comparing values from arrays are slower than
primitive type variables. Would something like this be more
appropriate for the MaxValue() method;

for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

int tempValue = anArray[x][y];

if(tempValue > max){

max = tempValue;

}

} // end y

} // end x

3) Is there a performance difference between loops, e.g. for or while
loops

I know normally these efficiency considerations dont matter when
dealing with little amounts of data etc, however I have a very large
program I am developing to deal with Satellite Imagery, so the time
taken to do some of these things grows quite considerably.

Thank you,

Martin
You could always run a microbench...

However, I would suggest this form anyway:
for (int[]row: anArray) {
for (int value: row) {
if (value > max) {
max = value;
}
}
}

Much cleaner, clearer, and possibly the fastest. (Need to bench to check)
 
D

Daniel Pitts

Martin said:
Hi,

Just a question of efficiency really. Here is a bit of a psudo noddy
example but I was wondering about how this might work in reality;

int[][] myArray = new int[1000][1000];

for(int i = 0; i < MaxValue(myArray); i++){

.....do something

}

MaxValue (which should be called maxValue, BTW) will be called EVERY
time. Instead, you might consider these two idioms:

for (int i = maxValue(myArray)-1; i >=0; --i) {
doSomething(i);
}

or
final int max = maxValue(myArray);
for (int i = 0; i < max; ++i) {
doSomething(i);
}

Or, to make it a little more modular:

public void doSomethingForAllLessThan(int max) {
for (int i = 0; i < max; ++i) {
doSomething(i);
}
}

public void doSomethingForAllLessThenMax() {
doSomethingForAllLessThan(maxValue(myArray));
}
 
R

Roedy Green

1) Every time the i loop is run, will it have to run the MaxValue()
method? I am expecting that it does but I am unsure.

I think it would execute it on every time round, unless the optimiser
were smart enough to realise it had to be invariant.
--
Roedy Green Canadian Mind Products
http://mindprod.com

One path leads to despair and utter hopelessness. The other,
to total extinction. Let us pray we have the wisdom to choose correctly.
~ Woody Allen .
 
R

Roedy Green

for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

normally you would use variables i,j or row, col rather than x,y. x,y
implies physical floating point measures or pixels. This is a
convention that dates all the way back to FORTRAN.

--
Roedy Green Canadian Mind Products
http://mindprod.com

One path leads to despair and utter hopelessness. The other,
to total extinction. Let us pray we have the wisdom to choose correctly.
~ Woody Allen .
 
R

Roedy Green

2) I have read that comparing values from arrays are slower than
primitive type variables. Would something like this be more
appropriate for the MaxValue() method;

An answer you get by an experiment will be many times more accurate
than the opinion of someone on usenet.

Most of the time these things don't matter because the compilers and
optimisers are getting so smart.
--
Roedy Green Canadian Mind Products
http://mindprod.com

One path leads to despair and utter hopelessness. The other,
to total extinction. Let us pray we have the wisdom to choose correctly.
~ Woody Allen .
 
E

Eric Sosman

Roedy said:
for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

normally you would use variables i,j or row, col rather than x,y. x,y
implies physical floating point measures or pixels. This is a
convention that dates all the way back to FORTRAN.

Oh, pshaw. As everyone knows, both `i' and `j' denote
the square root of minus one, according to conventions that
date from long before FORTRAN, before even "order code."
 
A

Arne Vajhøj

Eric said:
Roedy said:
for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

normally you would use variables i,j or row, col rather than x,y. x,y
implies physical floating point measures or pixels. This is a
convention that dates all the way back to FORTRAN.

Oh, pshaw. As everyone knows, both `i' and `j' denote
the square root of minus one, according to conventions that
date from long before FORTRAN, before even "order code."

That they denote squareroot of -1 does not make it
obvious to use them as loop counters.

Being default of type integer in Fortran and being regularly
used as Matrix indices do make it obvious to use them as
loop counters.

Arne
 
N

Nigel Wade

Roedy said:
for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){

normally you would use variables i,j or row, col rather than x,y. x,y
implies physical floating point measures or pixels. This is a
convention that dates all the way back to FORTRAN.

I wouldn't call it a convention. Rather a laziness which you would hope real
FORTRAN programmers grow out of once they realize that relying on IMPLICIT
INTEGER and IMPLICIT REAL can get them into all sorts of trouble. IMPLICIT NONE
is a very useful statement in the FORTRAN language, to be used liberally in
every program unit. If nothing else, it means that the variable DO10I is
undefined...
 
A

Arne Vajhøj

Nigel said:
Roedy said:
for(int x = 0; x < anArray.length; x++){

for(int y = 0; y < anArray[0].length; y++){
normally you would use variables i,j or row, col rather than x,y. x,y
implies physical floating point measures or pixels. This is a
convention that dates all the way back to FORTRAN.

I wouldn't call it a convention. Rather a laziness which you would hope real
FORTRAN programmers grow out of once they realize that relying on IMPLICIT
INTEGER and IMPLICIT REAL can get them into all sorts of trouble. IMPLICIT NONE
is a very useful statement in the FORTRAN language, to be used liberally in
every program unit. If nothing else, it means that the variable DO10I is
undefined...

Very true.

But IMPLICIT came in Fortran 77 and IMPLICIT NONE is de facto
Fortran 77 (technically Fortran 90).

So conventions formed in the 20 years before.

Arne
 
A

Arne Vajhøj

Eric said:
Arne said:
[...]
But IMPLICIT came in Fortran 77 and IMPLICIT NONE is de facto
Fortran 77 (technically Fortran 90).

IMPLICIT is certainly older than 1977 -- I used it in
FORTRAN IV in the early 1970's and maybe the late 1960's.

According to Wikipedia then it was added to the standard
in Fortran 77.

It may very well have existed in Fortran dialects before
that - often a new standard just reflects what is
de facto standard out in the wild.
I don't recall whether IMPLICIT NONE was around then, though.

Wikipedia claims it was added to 77 as a military standard
and ANSI'fied in 90.
I *do* recall that FORTRAN II had no kind of IMPLICIT.

I don't - I started with Fortran 77.
Haven't written or read any FORTRAN in years and years,
not since it changed its name to Fortran. Probably wouldn't
recognize it nowadays, with all its facial piercings and
baggy trousers.

Some claim that 90/95/2003 are a completely new language.

Arne
 
J

John W Kennedy

Eric said:
Arne said:
[...]
But IMPLICIT came in Fortran 77 and IMPLICIT NONE is de facto
Fortran 77 (technically Fortran 90).

IMPLICIT is certainly older than 1977 -- I used it in
FORTRAN IV in the early 1970's and maybe the late 1960's.

According to Wikipedia then it was added to the standard
in Fortran 77.

It may very well have existed in Fortran dialects before
that - often a new standard just reflects what is
de facto standard out in the wild.
I don't recall whether IMPLICIT NONE was around then, though.

Wikipedia claims it was added to 77 as a military standard
and ANSI'fied in 90.
I *do* recall that FORTRAN II had no kind of IMPLICIT.

I don't - I started with Fortran 77.

I started with FORTRAN II. The IMPLICIT statement was not in the
original FORTRAN IV for the 709x, but was added by IBM to FORTRAN IV for
the S/360.

IMPLICIT NONE, however, came later.
 
A

Arne Vajhøj

A

Andreas Leitgeb

John W Kennedy said:
I started with FORTRAN II. The IMPLICIT statement was not in the
original FORTRAN IV for the 709x, but was added by IBM to FORTRAN IV for
the S/360.
IMPLICIT NONE, however, came later.

Just curious. Are you talking of IMPLICIT as the fortran statement,
to customize the default-typing rules, or as the principle of starting
letter based typing itself? Or did these two enter Fortran at the same
time, anyway?
 
P

Philipp

Hi,

Just a question of efficiency really. Here is a bit of a psudo noddy
example but I was wondering about how this might work in reality;

Although, as many have repeatedly pointed out, premature optimization
is a bad thing, devising intelligent algorithms is a good thing. For
the sake of it, I have benchmarked your idea on my machine (full code
see below).

The JIT does _not_ a good job at optimizing the max() method away
although the array is untouched. Here are some results:

C:\test>"C:\Program Files\Java\jdk1.6.0_02\bin\java.exe" -server -
version
java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b06)
Java HotSpot(TM) Server VM (build 1.6.0_02-b06, mixed mode)

C:\test>"C:\Program Files\Java\jdk1.6.0_02\bin\java.exe" -server
ArraySpeedTest
Start
Algo 1 took 24828ms for 10000 iterations
Algo 2 took 0ms for 10000 iterations
Algo 1 took 25484ms for 10000 iterations
Algo 2 took 0ms for 10000 iterations
Algo 1 took 26422ms for 10000 iterations
Algo 2 took 0ms for 10000 iterations
Algo 1 took 25000ms for 10000 iterations
Algo 2 took 0ms for 10000 iterations

If _you_ know that the array will not be modified by your methods or
only in a way which does not change its maximum, it is much more
efficient to not call max() repeatedly in the loop.

In case you will modify the array in the loop, there is a solution. As
you probably know, you can often make a CPU / Memory trade-of. In this
case, you could for example implement a class ArrayWithMax which holds
the array as a private member and also has an int member which
indicates the maximum. Using setters and keeping the array completely
private (clone on get of none-primitive values), this max value could
be kept updated throughout the life time of the class. Thus obtaining
the max of the array results in a simple get on a class field. Fast!


The JIT often comes up as a magical beast which can optimize code
aggressively. Can anyone recommend some reading on what and how the
hotspot optimizes?

Phil


----- code for benchmark below -----

import java.util.Random;

public class ArraySpeedTest {

private int[][] array = new int[1000][1000];
private static final Random rnd = new Random();

public static void main(String[] args) {
ArraySpeedTest ast = new ArraySpeedTest();
ast.run();
}

private void run(){
System.out.println("Start");
makeData();
iterations();
iterations2();
iterations();
iterations2();
iterations();
iterations2();
iterations();
iterations2();
}

private void makeData(){
for(int[] row: array){
for(int i = 0; i < row.length; i++){
row = rnd.nextInt(10000);
}
}
// defining the max to 10000
array[rnd.nextInt(array.length)][rnd.nextInt(array[0].length)] =
10000;
}

private int max(int[][] array){
int max = Integer.MIN_VALUE;
for (int[]row: array) {
for (int value: row) {
if (value > max) {
max = value;
}
}
}
return max;
}

private void iterations(){
long start = System.currentTimeMillis();

int count = 0;
for (int i = 0; i < max(array); i++) {
count++;
}
System.out.println("Algo 1 took "+(System.currentTimeMillis()-
start)+"ms for " + count + " iterations");
}

private void iterations2(){
long start = System.currentTimeMillis();

int count = 0;
int max = max(array);
for (int i = 0; i < max; i++) {
count++;
}
System.out.println("Algo 2 took "+(System.currentTimeMillis()-
start)+"ms for " + count + " iterations");
}
}
 
T

Tom Anderson

In case you will modify the array in the loop, there is a solution. As
you probably know, you can often make a CPU / Memory trade-of. In this
case, you could for example implement a class ArrayWithMax which holds
the array as a private member and also has an int member which indicates
the maximum. Using setters and keeping the array completely private
(clone on get of none-primitive values), this max value could be kept
updated throughout the life time of the class. Thus obtaining the max of
the array results in a simple get on a class field. Fast!

Hmm. What's the setter - do you set the entire array at once, or
individual elements? If the latter, this class is harder to implement than
you might think.

Consider the array {1, 2, 3}. The maximum is 3. I set element 0 to 4. The
maximum is now 4. So far, so good. I then set element 0 to 1. How does the
array work out what its new maximum element is?

In a simple implementation, every time you set the value of the current
largest element to something smaller, you have to do a complete sweep of
the array. You might think you can save yourself a bit of work in that
case by tracking the two, three, or N largest elements, but that just
shifts the problem to dealing with updates to the smallest tracked
element.

You could track every element of the array in size order, using a sorted
list, but that obviously involves doubling your storage use. You could
also track them using some kind of heap structure, which would use just as
much storage but be faster.

Is there any way to maintain both the array order and the size order at
once? I've just come across something called a treap:

http://en.wikipedia.org/wiki/Treap

Which might do the job if you use the array index as the tree key,
although i'm not sure. Even then, lookup is O(log n) instead of O(1), with
a pretty bad constant too.

In short, computing the max by doing a scan before using it repeatedly is
probably the best idea!

Actually, going back to the ArrayWithMax idea, another thing you could do
would be to compute the maximum semi-lazily: track the maximum as far as
you can, but if the maximum element gets reduced, just mark the maximum as
unknown. Then, when someone asks for it, do a full scan and record it
again. As long as reduction to the maximum element is rare, this will be
very fast, and in the worst case, it's no worse than doing a full scan
when you need it. So forget all that fancy data structure stuff and try
that!
The JIT often comes up as a magical beast which can optimize code
aggressively. Can anyone recommend some reading on what and how the
hotspot optimizes?

That's a really good question. Sadly, not one i have an answer to.

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,228
Members
46,818
Latest member
SapanaCarpetStudio

Latest Threads

Top