Loop Optimization, Array Alignment

Rajeev

Hello,

I'm using gcc 3.4.2 on a Xeon (P4) platform, all kinds of speed optimizations
turned on.

For the following loop
R=(evaluate here); // float
N=(evaluate here); // N min=1 max=100 median=66
for (i=0;i<N;i++){
R+=A[i]*B[i]*K; // all variables are float=4 bytes
}

Q.1. Is there any advantage to having the arrays A,B,C aligned to 16 bytes?

Q.1b. If yes, I can make them aligned (non-trivial since A[1]:A[N] is part
of a much bigger array, but I can do it), but I don't know how to tell
the compiler that I have aligned these arrays. How do I do that?

Q.2. Is there an advantage to using arrays or pointers, e.g.
float *pA=A,*pB=B;
for (i=0;i<N;i++){
R+=(*pA++)*(*pB++)*K; // all variables are float=4 bytes
}

Q.3. Will gcc take *K out of the loop? (It may change the single-precision
computed result, e.g. if R starts off much bigger than the contribution.)

float RL=0;
for (i=0;i<N;i++){
RL+=A[i]*B[i]; // all variables are float=4 bytes
}
R+=(RL*K);

Thanks in advance for any help,

-rajeev-
 
Mark A. Odell

Rajeev wrote:

> I'm using gcc 3.4.2 on a Xeon (P4) platform, all kinds of speed
> optimizations turned on.
>
> For the following loop
> R=(evaluate here); // float
> N=(evaluate here); // N min=1 max=100 median=66
> for (i=0;i<N;i++){
> R+=A[i]*B[i]*K; // all variables are float=4 bytes
> }
>
> Q.1. Is there any advantage to having the arrays A,B,C aligned to 16
> bytes?

Might be, but that's not a C issue; it's platform-specific and off-topic in
comp.lang.c.

> Q.1b. If yes, I can make them aligned (non-trivial since A[1]:A[N] is
> part of a much bigger array, but I can do it), but I don't know how to
> tell the compiler that I have aligned these arrays. How do I do
> that?
>
> Q.2. Is there an advantage to using arrays or pointers, e.g.
> float *pA=A,*pB=B;
> for (i=0;i<N;i++){
> R+=(*pA++)*(*pB++)*K; // all variables are float=4 bytes
> }

Shouldn't be, but that's not a C issue; it's platform-specific and
off-topic in comp.lang.c.

> Q.3. Will gcc take *K out of the loop? (It may change the single-precision
> computed result, e.g. if R starts off much bigger than the
> contribution.)
>
> float RL=0;
> for (i=0;i<N;i++){
> RL+=A[i]*B[i]; // all variables are float=4 bytes
> }
> R+=(RL*K);

This is a gcc question and off-topic in comp.lang.c.
 
Dan Pop

Rajeev said:

> I'm using gcc 3.4.2 on a Xeon (P4) platform, all kinds of speed optimizations
> turned on.

If these details are relevant to your questions, cross-posting to
comp.lang.c was a gross mistake.

Dan
 
Paul Hsieh

> I'm using gcc 3.4.2 on a Xeon (P4) platform, all kinds of speed optimizations
> turned on.
>
> For the following loop
> R=(evaluate here); // float
> N=(evaluate here); // N min=1 max=100 median=66
> for (i=0;i<N;i++){
> R+=A[i]*B[i]*K; // all variables are float=4 bytes
> }
>
> Q.1. Is there any advantage to having the arrays A,B,C aligned to 16 bytes?

The Intel compiler might be assisted by such an alignment, because it
can use the packed SSE vector instructions to implement this
operation. I am not aware of any other x86-based compiler that can
automatically vectorize like this.

> Q.1b. If yes, I can make them aligned (non-trivial since A[1]:A[N] is part
> of a much bigger array, but I can do it), but I don't know how to tell
> the compiler that I have aligned these arrays. How do I do that?

You're probably right; you can't. Even the Intel compiler relies on
deduction to know that an array or pointer is aligned. It will not be
able to deduce it from attempts to hack the array offset to fit the
alignment.
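
For readers with newer tools than the ones discussed in this thread, here is a minimal sketch of how an alignment promise can be expressed. It assumes GCC 4.7 or later (not the 3.4.2 from the question), where the `__builtin_assume_aligned` extension exists; the `aligned` attribute requests alignment for storage you declare yourself. The names `dot_scaled_aligned` and `is_aligned16` are hypothetical, and whether the compiler actually emits packed SSE code still depends on its version and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Storage declared with a requested 16-byte alignment (GCC attribute). */
static float A[100] __attribute__((aligned(16)));
static float B[100] __attribute__((aligned(16)));

/* Hypothetical helper: run-time check that a pointer is 16-byte aligned. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical helper: returns the sum of a[i]*b[i]*k.
   The caller promises that a and b are 16-byte aligned;
   behaviour is undefined if that promise is false. */
float dot_scaled_aligned(const float *a, const float *b, size_t n, float k)
{
    /* GCC extension (assumed available: GCC 4.7+ or a compatible compiler). */
    const float *pa = __builtin_assume_aligned(a, 16);
    const float *pb = __builtin_assume_aligned(b, 16);
    float r = 0.0f;

    for (size_t i = 0; i < n; i++)
        r += pa[i] * pb[i] * k;
    return r;
}
```

If the slice of interest starts in the middle of a larger array, as in the question, a check such as `is_aligned16(&A[1])` before calling the function is one way to avoid breaking the promise.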

> Q.2. Is there an advantage to using arrays or pointers, e.g.
> float *pA=A,*pB=B;
> for (i=0;i<N;i++){
> R+=(*pA++)*(*pB++)*K; // all variables are float=4 bytes
> }

No. If there is an advantage to doing it one way or another, the
compiler should be good enough to do the transformation from one form
to the other internally.
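
To make the two forms concrete, here is a small sketch of the same loop written both ways. The function names `sum_indexed` and `sum_pointer` are made up for illustration, and the loop variables are passed as parameters only so that the example is self-contained. Both functions perform the same operations in the same order, so they return the same value.

```c
#include <stddef.h>

/* Array-index form of the loop from the question. */
float sum_indexed(const float *A, const float *B, size_t N, float K, float R)
{
    for (size_t i = 0; i < N; i++)
        R += A[i] * B[i] * K;
    return R;
}

/* Pointer form. Each declarator needs its own '*':
   "float *pA, pB;" would declare pB as a plain float. */
float sum_pointer(const float *A, const float *B, size_t N, float K, float R)
{
    const float *pA = A, *pB = B;

    for (size_t i = 0; i < N; i++)
        R += (*pA++) * (*pB++) * K;
    return R;
}
```

Comparing the assembly the compiler generates for the two functions (for example with gcc's -S option) is a direct way to check whether it treats them identically on a given platform.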

> Q.3. Will gcc take *K out of the loop? (It may change the single-precision
> computed result, e.g. if R starts off much bigger than the contribution.)
>
> float RL=0;
> for (i=0;i<N;i++){
> RL+=A[i]*B[i]; // all variables are float=4 bytes
> }
> R+=(RL*K);

No. The compiler (regardless of which one) can't do this. This is
actually numerically different from your original loop. You need to
do this manually, as shown here, in order to get the reduction in
operation count. If the variables were integers, then in
theory a compiler could perform the optimization as you have done it.
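
A small sketch of why the two versions are numerically different, under the assumption that float arithmetic is evaluated in single precision (for example SSE code generation rather than x87). The function names are hypothetical. Near 1e8 the spacing between adjacent float values is 8, so adding 1.0f to R = 1e8f leaves R unchanged every time, while the hoisted version first collects the small terms in RL and adds them to R in one step.

```c
#include <stddef.h>

/* Original form: every small term is rounded against the current R. */
float sum_in_loop(const float *A, const float *B, size_t N, float K, float R)
{
    for (size_t i = 0; i < N; i++)
        R += A[i] * B[i] * K;
    return R;
}

/* Hoisted form: accumulate the products separately, scale once by K,
   then add the total to R in a single rounding step. */
float sum_hoisted(const float *A, const float *B, size_t N, float K, float R)
{
    float RL = 0.0f;

    for (size_t i = 0; i < N; i++)
        RL += A[i] * B[i];
    return R + RL * K;
}
```

With A[i] = B[i] = 1.0f, K = 1.0f, N = 100 and R = 1e8f, the first function returns R unchanged while the second returns a larger value. A compiler makes this kind of change on its own only if it has been told it may reorder floating-point arithmetic (gcc's -ffast-math family of options), and even then it is not guaranteed to.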
 
pete

Rajeev wrote:

> Q.2. Is there an advantage to using arrays or pointers, e.g.
> float *pA=A,*pB=B;
> for (i=0;i<N;i++){
> R+=(*pA++)*(*pB++)*K; // all variables are float=4 bytes
> }

You can simplify the loop counting.

i = N;
while (i-- != 0) {
R += *pA++ * *pB++ * K;
}
 
Rajeev

Paul Hsieh wrote:

>> Q.3. Will gcc take *K out of the loop? (It may change the single-precision
>> computed result, e.g. if R starts off much bigger than the contribution.)
>>
>> float RL=0;
>> for (i=0;i<N;i++){
>> RL+=A[i]*B[i]; // all variables are float=4 bytes
>> }
>> R+=(RL*K);
>
> No. The compiler (regardless of which one) can't do this. This is
> actually numerically different from your original loop. You need to
> do this manually as shown here in order to leverage the operation
> count reduction optimization. If the variables were integers, then in
> theory a compiler could perform the optimization as you have done it.


Paul and Pete,

Thank you both for your informative responses. When trying to optimize,
there are just so many things one can play with and try; it really helps a
non-expert like myself to get clarity on even a few issues, so I can focus
on the others.

Regards,
-rajeev-
 
pete

kal said:

> Why not the following?
>
> T = 0;
> i = N;
> while (i-- != 0) {
> T += *pA++ * *pB++;
> }
> R += T * K;

That seems fine to me.
I'll restate the original conditions:

> For the following loop
> R=(evaluate here); // float
> N=(evaluate here); // N min=1 max=100 median=66
> for (i=0;i<N;i++){
> R+=A[i]*B[i]*K; // all variables are float=4 bytes
> }
 
