Rajeev
Hello,
I'm using gcc 3.4.2 on a Xeon (P4) platform, with all kinds of speed
optimizations turned on.
For the following loop
R=(evaluate here); // float
N=(evaluate here); // N min=1 max=100 median=66
for (i=0;i<N;i++){
R+=A[i]*B[i]*K; // all variables are float=4 bytes
}
Q.1. Is there any advantage to having the arrays A,B,C aligned to 16 bytes ?
Q.1b. If yes, I can make them aligned (non-trivial since A[1]:A[N] is part
of a much bigger array, but I can do it), but I don't know how to tell
the compiler that I have aligned these arrays. How do I do that ?
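For the simpler case where the arrays are declared directly, a minimal sketch of how alignment can be requested from gcc is shown here. The buffer names and sizes are hypothetical, and is_aligned16 is only an illustrative helper. This does not cover the harder case above, where A[1]:A[N] is a slice of a much bigger array and only a pointer is available.

```c
#include <stdint.h>

/* Hypothetical buffers; the aligned attribute asks gcc to place
   each array on a 16-byte boundary. */
static float A[100] __attribute__((aligned(16)));
static float B[100] __attribute__((aligned(16)));

/* Illustrative helper: returns 1 if p sits on a 16-byte boundary,
   0 otherwise. Useful for checking a pointer into a larger array. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}
```

Calling is_aligned16(A) at run time confirms the placement. Later GCC releases (not 3.4.2) also provide __builtin_assume_aligned for pointers whose alignment the compiler cannot see from a declaration.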
Q.2. Is there an advantage to using arrays or pointers, eg
float *pA=A,*pB=B;
for (i=0;i<N;i++){
R+=(*pA++)*(*pB++)*K; // all variables are float=4 bytes
}
Q.3. Will gcc take *K out of the loop ? (It may change the single precision
computed result, eg if R starts off much bigger than the contribution.)
float RL=0;
for (i=0;i<N;i++){
RL+=A[i]*B[i]; // all variables are float=4 bytes
}
R+=(RL*K);
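To make the Q.3 comparison concrete, the two forms can be written as standalone functions and run on the same data to see whether the single-precision results differ. This is only a sketch: the function names and parameter lists are made up for illustration, and it says nothing about what gcc itself does with the first loop.

```c
#include <stddef.h>

/* Form 1: multiply by K inside the loop. */
float sum_scaled_inside(const float *A, const float *B,
                        float K, float R, size_t N)
{
    size_t i;
    for (i = 0; i < N; i++) {
        R += A[i] * B[i] * K;
    }
    return R;
}

/* Form 2: accumulate the plain products in RL, then apply K once.
   The additions happen in a different order and magnitude, so the
   rounded result may differ from form 1. */
float sum_scaled_outside(const float *A, const float *B,
                         float K, float R, size_t N)
{
    float RL = 0.0f;
    size_t i;
    for (i = 0; i < N; i++) {
        RL += A[i] * B[i];
    }
    return R + RL * K;
}
```

Feeding both functions real data, with R much larger than the individual contributions, would show how big the difference gets in practice.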
Thanks in advance for any help,
-rajeev-