Extent of the "as-if" rule

Keith Thompson

Sidney Cadot said:
Another thing is that I think the standard's way of defining a "side
effect" (by enumeration of cases) is flawed. This is a bit like
defining mammals as "primates, whales, furry animals, ... (and so
on)", which works fine until you find a platypus.

A good example, only slightly damaged by the fact that platypuses do
have fur.

Surely, there has to be a more generic way of defining a side effect.

If the definition is too generic, it could include modifying memory
(which could be observed even if the program isn't running under a
debugger).
 
Keith Thompson

Keith Thompson said:
Side effects include "modifying a file". In a Unix filesystem, a
directory can be treated as a file; so can the physical device
containing the filesystem.

This is admittedly stretching the point.

After I posted this, I realized that a swap file or partition (if
there is one) can also be treated as a file, so modifying a variable
could conceivably "modify a file".

Intuitively, I think an fopen() call that updates a timestamp should
be considered a side effect, but modifying a non-volatile variable
shouldn't, even if it causes a write to a swap file. The trick is
figuring out how to state it.
 
Dan Pop

Keith Thompson said:
Side effects include "modifying a file".

In the context of the C standard, opening a file for read access and
closing it doesn't modify its contents. The C standard blissfully
ignores any timestamping issues.

Dan
 
Dan Pop

Keith Thompson said:
After I posted this, I realized that a swap file or partition (if
there is one) can also be treated as a file, so modifying a variable
could conceivably "modify a file".

A swap file or partition is not a file, in the sense of the C standard.
A program with no side effects can cause the contents of the swap file
to be changed by the mere fact that it is partly or totally swapped out.

Dan
 
Dan Pop

In said:
Jack Klein wrote:
...

No - separate times are kept for the creation date, last access, and the
last modification.

Nope, the creation date is not stored anywhere. There are three
timestamps associated with each file:

time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last inode change */

The field st_atime is changed by file accesses, e.g. by
exec(2), mknod(2), pipe(2), utime(2) and read(2) (of more
than zero bytes). Other routines, like mmap(2), may or may
not update st_atime.

Note, in the context of this discussion, that open() doesn't change
st_atime, a genuine read() call is needed for that.

The field st_mtime is changed by file modifications, e.g.
by mknod(2), truncate(2), utime(2) and write(2) (of more
than zero bytes). Moreover, st_mtime of a directory is
changed by the creation or deletion of files in that
directory. The st_mtime field is not changed for changes
in owner, group, hard link count, or mode.

The field st_ctime is changed by writing or by setting
inode information (i.e., owner, group, link count, mode,
etc.).

Dan
 
Niklas Matthies

Jack Klein wrote:
...

No - separate times are kept for the creation date, last access, and
the last modification.

The first one is not the creation date, but the last inode change date.
See `man 2 stat`.

-- Niklas Matthies
 
Michael Wojcik

Moreover, if the file name corresponds to a fifo or named-pipe,
simply opening the file for reading can have an effect.

Opening a file for reading can have an effect even for regular files.
Consider a stock SysV kernel as described by Bach. A file-open
request can result in allocation of one or more in-core inodes and
so forth, and will (if successful) alter the process's file
descriptor table and so on.

These are implementation details of which a conforming program is
unaware, but they could have consequences in the environment. So
I think the original question still stands: does the as-if rule
apply to changes the program causes in the environment which are
not visible to a conforming program?
 
Michael Wojcik

The point is that the change you're describing (the last accessed date)
is *not* visible to a standard C program, and an implementation could
therefore claim to be conforming even if it removed the open. I think
we'd all agree that such an implementation would have serious QoI
issues.

However, it's entirely possible to imagine an implementation where a
conforming program could indirectly detect an effect of opening a
file for reading.

For example, consider an implementation which creates a file (in the
sense of "a filesystem object which can be opened using fopen") in a
temporary area for each file a program opens. These temporary files
are named using a predictable convention. A conforming program
could potentially determine how many files it had opened by attempting
to fopen (and then immediately fclose) these temporary files and seeing
how many such fopens succeeded.

That ought to work on a Linux system with the proc filesystem mounted,
for example, though I haven't actually tried it.

Such a program would get a different result before and after the
hypothetical fopen if that fopen were not optimized away. If it were
optimized away, of course, the program would get the same result
before and after the fopen. (Unless the implementation were clever
enough to understand the operation of the count-my-open-files
function and simulate the correct result - that is, extend the as-if
behavior to cover this aspect as well.)

Such a program would not be strictly conforming, since (to be useful)
it would have to produce output that depended on unspecified behavior,
but it could be conforming, as far as I can tell.
 
Douglas A. Gwyn

Sidney said:
Another thing is that I think the standard's way of defining a "side
effect" (by enumeration of cases) is flawed. This is a bit like defining
mammals as "primates, whales, furry animals, ... (and so on)", which
works fine until you find a platypus.
Surely, there has to be a more generic way of defining a side effect.

The problem is, there are numerous actual side effects, not all
of which are deemed essential for conformance purposes. I
alluded to some of them (timing, code size, etc.) in my previous
posting.
 
Dan Pop

Opening a file for reading can have an effect even for regular files.
Consider a stock SysV kernel as described by Bach. A file-open
request can result in allocation of one or more in-core inodes and
so forth, and will (if successful) alter the process's file
descriptor table and so on.

These are implementation details of which a conforming program is
unaware, but they could have consequences in the environment. So
I think the original question still stands: does the as-if rule
apply to changes the program causes in the environment which are
not visible to a conforming program?

Nope, the as-if rule is defined exclusively in terms of the C abstract
machine, where the semantics of a file are as defined in the C standard.

Dan
 
Sidney Cadot

It is purely pedantic because no existing compiler will remove the call
to fopen ().

That is true as far as I know. However, many functions /are/ optimized
away by modern compilers (e.g. abs(), or strlen() of a string literal),
so it isn't too far-fetched to suppose this is within reach.

On the other hand: If you are a compiler writer and you want to remove
this kind of call, then you have to _prove_ that the C Standard allows
it. If you are an application programmer and you want to make sure that
the call is not removed, then you could write

volatile FILE* p = fopen ("my file", "options");

and that will make it damned hard for the compiler writer to optimise
the call away.

It would be utterly silly if this kind of thing didn't qualify as a side
effect, wouldn't you agree?

Best regards,

Sidney
 
Sidney Cadot

Keith said:
A good example, only slightly damaged by the fact that platypuses do
have fur.

Ah, what a pity.

If the definition is too generic, it could include modifying memory
(which could be observed even if the program isn't running under a
debugger).

Yes. It is intuitively quite clear what "side effect" should mean, but
it's hard to put it in words. However, that's exactly what members of
the Standards committee are for.

To me it is also intuitively clear that opening a file should count as a
side effect. The fact that it's highly questionable that this stance is
backed by the standard means, IMHO, that the standard is wrong here.

Best regards,

Sidney
 
Sidney Cadot

Dan said:
Nope, the as-if rule is defined exclusively in terms of the C abstract
machine, where the semantics of a file are as defined in the C standard.

As far as I can tell, the C standard does not define semantics of a
file. Surprisingly, I don't see "file" defined either.

So why not view a file as 'contents' + 'attributes'; in that case,
opening a file which may change attributes counts as "modifying the file".

Best regards,

Sidney
 
Wojtek Lerch

Christian Bau said:
On the other hand: If you are a compiler writer and you want to remove
this kind of call, then you have to _prove_ that the C Standard allows
it. If you are an application programmer and you want to make sure that
the call is not removed, then you could write

volatile FILE* p = fopen ("my file", "options");

and that will make it damned hard for the compiler writer to optimise
the call away.

Not if "my file" is an invalid filename and the compiler recognizes that and
generates code that sets p to NULL without calling any function.
 
Christian Bau

Wojtek Lerch said:
Not if "my file" is an invalid filename and the compiler recognizes that and
generates code that sets p to NULL without calling any function.

In that case you could very well argue that the call has no side
effects, even if calling fopen would usually have side effects.
 
Martin Dickopp

Sidney Cadot said:
To me it is also intuitively clear that opening a file should count as a
side effect.

That's the problem with intuitions: To me it is intuitively clear that
opening a file should *not* count as a side effect (in the context of the
C language).

The fact that it's highly questionable that this stance is backed by the
standard means, IMHO, that the standard is wrong here.

I disagree. There's a reason why the scope of the C standard is limited.

In the real world, that's not a problem. Few (if any) C implementations
claim to implement ISO C and nothing else. Usually they try to follow
additional standards, some of which (e.g. POSIX) do indeed define file
timestamps, so that in the context of these standards, opening a file for
reading does have a well-defined side effect and can therefore not be
optimized away.

Martin
 
Stewart Brodie

Martin Dickopp said:
That's the problem with intuitions: To me it is intuitively clear that
opening a file should *not* count as a side effect (in the context of the
C language).

Section 5.1.2.3 of the standard defines what a side effect is (a change in
the state of the execution environment). Opening a file may have some effect
on the execution environment in which the program is executing, in which case
it is a side effect, or it may not, in which case it would appear not to be a
side effect.

You could have magic filenames whose contents are encoded in the filename
(like data: URLs - which never seemed to catch on, for some reason).

f = fopen("data:Hello world\n", "r");

If the fopen was successful and 'f' now points to a FILE object that
represents this pseudo-file, does this count as a change in the execution
environment, and hence a side effect? I'm not sure.
 
Sidney Cadot

Martin said:
That's the problem with intuitions: To me it is intuitively clear that
opening a file should *not* count as a side effect (in the context of the
C language).

I have no intuition 'in the context of the C language' about files,
whatsoever. To me handling files is the job of the operating system and
the C standard ought to be non-explicit in what are and are not side
effects when doing /anything/ with a file.

A possible approach that just occurs to me would be to allow "volatile"
as a function attribute, meaning that the function is effectively
declared to communicate with the world outside the abstract machine,
with all bets off regarding side effects (no optimizations possible).

I disagree. There's a reason why the scope of the C standard is limited.

In the real world, that's not a problem. Few (if any) C implementations
claim to implement ISO C and nothing else. Usually they try to follow
additional standards, some of which (e.g. POSIX) do indeed define file
timestamps, so that in the context of these standards, opening a file for
reading does have a well-defined side effect and can therefore not be
optimized away.

Dan Pop would probably argue that such a 'well-defined side effect' is
still not relevant to the compiler. C99 states exactly three things that
are side effects (for the compiler), and opening a file for reading is
not one of them, whether you do POSIX or not.

Best regards,

Sidney
 
Martin Dickopp

Sidney Cadot said:
Dan Pop would probably argue that such a 'well-defined side effect' is
still not relevant to the compiler.

If he argued that this side effect is not relevant to the C standard
conformance of the compiler, I would certainly agree with him.

C99 states exactly three things that are side effects (for the
compiler), and opening a file for reading is not one of them, whether
you do POSIX or not.

A compiler which optimized away opening a file for reading would not be
conforming to the POSIX standard. It might still be conforming to the C
standard, of course.

Martin
 
Wojtek Lerch

Christian Bau said:
In that case you could very well argue that the call has no side
effects, even if calling fopen would usually have side effects.

OK, that was a bad example. As a matter of fact, it only got posted because
I pushed the wrong button...

But imagine an implementation that doesn't use an OS or a real filesystem,
and its standard I/O functions operate on fake files implemented as data
structures in memory. The programmer can set up the initial contents of the
fake filesystem by putting them in a special header file that is included by
the implementation's <stdio.h>. After exit() has closed all streams, it
prints out the new contents of the fake filesystem. Is there anything in
the standard that makes it impossible for such an implementation to be
conforming?

The compiler automatically replaces a return statement in main() with a call
to exit(). The calls to fopen() and exit() get inlined, and so does the
fclose() call that exit() makes to close your stream. The optimizer notices
that the fclose() call reverses everything that your fopen() call did, and
optimizes them both out. Does this sound impossible or wrong?
 
