Gernot Frisch
Hi,
I have 2 C code snippets that produce the same result. However, A is 2x
faster than B on my PC (x86) but 1.5x slower on my PDA (StrongARM @
206 MHz).
// Startup conditions + types
pSrc = new unsigned short[320*240];
pDst = new unsigned short[320*240];
register unsigned short x, y, *ldst;
short xptch = 320, yptch = -1;
dst = pDst + 319;
src = pSrc;
// A:
unsigned long* pDisplay = (unsigned long*)dst; // declare the 32-bit write pointer
for(x=0; x<240; x++)
{
for(y=0; y<160; y++)
{
*pDisplay++ = (*(src-240)<<16) | *(src); // Process 4 bytes at once
src-=480;
}
src+=76801; // (320*240+1): get a row ahead + 320 lines down to the bottom
}
// B:
for (y = 0; y < 320; y++ )
{
ldst = dst; // Get current line address
for (x = 0; x < 240; x++ )
{
*(ldst) = *src++; // one pixel right on src
ldst += xptch; // add a pixel to the right on dest
}
dst += yptch; // add a line to dst buffer
}
Can someone explain this to me? And better: how can I make this really fast?
Using ASM? I need an optimized version for an ARM processor.
I think example B shows fairly obviously what the code is meant to do.
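To pin down the mapping, here is a minimal self-contained sketch of the rotation that B performs, together with a variant that writes the destination sequentially and stores two pixels per 32-bit word, which is the idea behind A. The function names and the constants SRC_W, SRC_H and DST_W are made up for illustration, the geometry is read off the loop bounds above, and the packed variant assumes a little-endian target. Nothing here has been timed, so it says nothing about which one is faster on x86 or ARM.

```cpp
#include <cstdint>
#include <cstring>

// Assumed geometry, taken from the loop bounds in B: the source is
// 240 pixels wide and 320 lines tall, the destination is 320 x 240.
enum { SRC_W = 240, SRC_H = 320, DST_W = 320 };

// Reference version, same mapping as B:
// source pixel (x, y) lands at destination column 319 - y, row x.
void rotate_ref(const std::uint16_t* src, std::uint16_t* dst)
{
    for (int y = 0; y < SRC_H; ++y)
        for (int x = 0; x < SRC_W; ++x)
            dst[x * DST_W + (DST_W - 1 - y)] = src[y * SRC_W + x];
}

// Variant in the spirit of A: walk the destination in memory order and
// combine two neighbouring destination pixels into one 32-bit store.
// Assumes little-endian byte order (low half-word at the lower address).
void rotate_packed(const std::uint16_t* src, std::uint16_t* dst)
{
    for (int r = 0; r < SRC_W; ++r) {           // destination row
        for (int c = 0; c < DST_W; c += 2) {    // destination column pair
            const std::uint16_t lo = src[(SRC_H - 1 - c) * SRC_W + r];
            const std::uint16_t hi = src[(SRC_H - 2 - c) * SRC_W + r];
            const std::uint32_t two = lo | (std::uint32_t(hi) << 16);
            // memcpy avoids alignment and aliasing problems of a pointer cast.
            std::memcpy(&dst[r * DST_W + c], &two, sizeof two);
        }
    }
}
```

Calling rotate_ref(pSrc, pDst) and rotate_packed(pSrc, pDst2) on the same source should fill both destination buffers identically, which makes the first function a handy check for any faster version.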
Thank you in advance,
--
-Gernot
int main(int argc, char** argv) {printf
("%silto%c%cf%cgl%ssic%ccom%c", "ma", 58, 'g', 64, "ba", 46, 10);}
________________________________________
Looking for a good game? Do it yourself!
GLBasic - you can do
www.GLBasic.com