Paging Fault and Loops order

E

elvira_wang

Hi,
i was studying the virtual memory and the page fault concept in
particularly [see,
http://www.cs.man.ac.uk/~rizos/CS2051/2001-02/lect11.pdf ]

Considering this concept and the two versions code given below. Assume
that the matrix "a" is stored row by row , and considering that the
program is given only 1024 frames, version A below leads to 1024*1024
page fault whereas version B leads to only 1024 Page Fault, thus run
faster.

I have never paid attention to this issue. as a programmer do we have
to? is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

thanks

int a[1024][1024];
/* version A */
for (j=0; j<1024; j++)
for (i=0;i<1024; i++)
A[j]=0;


/* version B */
for (i=0; i<1024; i++)
for (j=0;j<1024; j++)
A[j]=0;
 
P

Pascal Bourguignon

Hi,
i was studying the virtual memory and the page fault concept in
particularly [see,
http://www.cs.man.ac.uk/~rizos/CS2051/2001-02/lect11.pdf ]

Considering this concept and the two versions code given below. Assume
that the matrix "a" is stored row by row , and considering that the
program is given only 1024 frames, version A below leads to 1024*1024
page fault whereas version B leads to only 1024 Page Fault, thus run
faster.

I have never paid attention to this issue. as a programmer do we have
to? is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

In standard C, I think the compiler is prohibed to switch from one
version to another. A very good C compiler could provide a warning on
version A.

int a[1024][1024];
/* version A */
for (j=0; j<1024; j++)
for (i=0;i<1024; i++)
A[j]=0;


/* version B */
for (i=0; i<1024; i++)
for (j=0;j<1024; j++)
A[j]=0;



--
__Pascal Bourguignon__ http://www.informatimago.com/

ATTENTION: Despite any other listing of product contents found
herein, the consumer is advised that, in actuality, this product
consists of 99.9999999999% empty space.
 
W

Walter Roberson

i was studying the virtual memory and the page fault concept in
particularly [see,
http://www.cs.man.ac.uk/~rizos/CS2051/2001-02/lect11.pdf ]
Considering this concept and the two versions code given below. Assume
that the matrix "a" is stored row by row , and considering that the
program is given only 1024 frames, version A below leads to 1024*1024
page fault whereas version B leads to only 1024 Page Fault, thus run
faster.
I have never paid attention to this issue. as a programmer do we have
to?
Yes.

is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

Not usually, no.

*Some* compilers might be able to do it in the special case
of assigning a constant: in particular, some compilers designed
to compile into parallel operations must be able to use
their "code motion" and "strength reduction" analysis to determine
how to minimize the cache effects. A compiler that was smart about
cache effects might even introduce a padding column so that
a[x][0] and a[x+1][0] are on different cache lines. Compilers
are *allowed* to be smart about such things, but by no means
should you assume it.


Consider this situation:

double a[1024][1024] already initialized to something

/* version C*/
double total = 0.; for (j=0;j<1024;j++) for (i=0;i<1024;i++) total += A[j];

/* version D*/
double total = 0.; for (i=0;i<1024;i++) for (j=0;j<1024;j++) total += A[j];

This cannot be reordered, not unless the compiler can pre-determine that
all of the values are the same or that the matrix is symmetric. If
it cannot determine one of those conditions, then it must do the work
in the order specified, because adding doubles in a different order
can lead to different round-off errors.

The parallel compilers I mentioned earlier would usually have options
that control the amount of latittude they have for reordering operations
where the reordering might end up with a different result due to
round-off.
int a[1024][1024];
/* version A */
for (j=0; j<1024; j++)
for (i=0;i<1024; i++)
A[j]=0;

/* version B */
for (i=0; i<1024; i++)
for (j=0;j<1024; j++)
A[j]=0;
 
J

Joe Pfeiffer

I have never paid attention to this issue. as a programmer do we have
to? is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

I've never seen a compiler smart enough to reorder loops to reduce
page faults in any cases whatever. I can tell you from personal
experience that if you got it wrong, the big old disk drives on a DEC
VAX 11/780 really would start moving around on the floor.
 
G

Greg Lindahl

is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

Yes, this is a common optimization in modern optimizing compilers.

-- greg
 
M

Maxim Yegorushkin

Greg said:
Yes, this is a common optimization in modern optimizing compilers.

How is this optimization called?

Does gcc do that? If so, which command option turn it on/off?
 
A

Alex Fraser

Pascal Bourguignon said:
i was studying the virtual memory and the page fault concept in
particularly [see,
http://www.cs.man.ac.uk/~rizos/CS2051/2001-02/lect11.pdf ]

Considering this concept and the two versions code given below. Assume
that the matrix "a" is stored row by row , and considering that the
program is given only 1024 frames, version A below leads to 1024*1024
page fault whereas version B leads to only 1024 Page Fault, thus run
faster.

I have never paid attention to this issue. as a programmer do we have
to? is the compiler intelligent enough to order the loop execution in
the most effective way, i.e. switch automatically version A like code
to version B

In standard C, I think the compiler is prohibed to switch from one
version to another. A very good C compiler could provide a warning on
version A.

A compliant C compiler can do anything it likes provided it is impossible to
tell the difference between what the standard specifies the result must be
and the actual result. In the case the OP mentioned:
int a[1024][1024];
/* version A */
for (j=0; j<1024; j++)
for (i=0;i<1024; i++)
A[j]=0;

/* version B */
for (i=0; i<1024; i++)
for (j=0;j<1024; j++)
A[j]=0;


(I ignore the fact the above mistakenly uses 'A' instead of 'a')

If, as is likely, assigning 0 to an int sets all bits to zero, either of
these versions /could/ be compiled to the same as:

memset(a, 0, sizeof a);

I don't know if any compilers are "clever enough" to realise this though.

Alex
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,816
Latest member
SapanaCarpetStudio

Latest Threads

Top