Kelsey Bjarnason
[snips]
Let's move this away from C for a moment. We'll examine two hypothetical
languages, Foo and Bar, each with a function baz(). In the standard
document for language Foo, the behaviour of function baz() is completely
and unequivocally defined. What does this tell us of the function baz()
in the language Bar?
Right, it tells us *nothing*. Why is that? Oh, yes, because we're not
defining baz as some universal cross-language function, we're defining it
solely and strictly within the context of the Foo language.
We can also ask what effect, if any, Bar's definition of baz() has upon
the operation of baz in language Foo. The answer is simple: none at
all, because however Bar defines baz is irrelevant; all that matters when
using language Foo is *Foo's* definition of the function.
So let's define it:
1.1.3.5 The baz function
Synopsis:
uses IO;
intvar baz(nil);
Description:
If the end of file indicator of the standard input stream is not set and
a next character is present, baz obtains that character as an unsigned
charvar converted to an intvar and advances the associated file position
indicator, if defined.
If the end of file indicator of the standard input stream is set or if
the stream is at end of file, the end of file indicator for the stream is
set and the baz function returns EOF. Otherwise, the function returns
the next character from the standard input stream. If a read error
occurs, the error indicator for the stream is set and the baz function
returns EOF.
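For those who think better in code, the definition above can be modelled
in C. This is only a sketch under stated assumptions: Foo is
hypothetical, so struct foo_stream, FOO_EOF, the broken flag and the name
baz_from are all invented for illustration, and the stream is an
in-memory buffer rather than a real standard input.

```c
#include <assert.h>
#include <stddef.h>

#define FOO_EOF (-1)   /* assumed value; the definition only names EOF */

/* Hypothetical in-memory stand-in for Foo's standard input stream. */
struct foo_stream {
    const unsigned char *data;  /* characters waiting to be read */
    size_t len;                 /* number of characters in data */
    size_t pos;                 /* file position indicator */
    int eof;                    /* end of file indicator */
    int error;                  /* error indicator */
    int broken;                 /* assumption: nonzero simulates a read error */
};

/* Does what the definition of baz says, and touches nothing but the stream. */
int baz_from(struct foo_stream *in)
{
    assert(in != NULL);

    if (in->eof || in->pos >= in->len) {
        in->eof = 1;            /* set the end of file indicator */
        return FOO_EOF;
    }
    if (in->broken) {
        in->error = 1;          /* set the error indicator */
        return FOO_EOF;
    }
    /* unsigned charvar converted to an intvar; advance the position */
    return (int)in->data[in->pos++];
}
```

Fed a stream holding the two characters 'A' and 0xFF, it returns 'A',
then 255, then FOO_EOF with the end of file indicator set. Nothing else
in the program is read or written.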
Very good. Now, could you kindly point out where, in that definition,
baz is either allowed to or prevented from modifying every intvar variable
in the program by assigning each a value a third of its current value?
Nothing in the definition of baz allows for such behaviour, obviously,
but then where is such behaviour prevented? Where does it say,
explicitly, that this is not an allowed behaviour?
It doesn't. Why should it? There is absolutely no need for it to
explicitly say this behaviour is not allowed, not because the behaviour
is allowed, but because the definition of what baz does does *not*
include any allowance for such behaviour.
In short, the definition of baz tells us what it does, and by doing so,
implicitly prevents any other behaviour, such as modifying the values of
every intvar in the program.
Moreover, by intent, the definition of baz not only implicitly but
*explicitly* disallows such behaviour, as the intent of the document in
its entirety is to define, as absolutely as possible, the operation of
the functions and the language defined. By intent, the behaviour defined
by baz is complete and explicit: it does this *and no more*, and an
implementation which does something else - such as modifying the values
of every intvar in the program when baz is called - is thus non-
conforming.
There is no need for it to explicitly say this is a disallowed behaviour;
it is sufficient to define the behaviour that *is* allowed. It does so,
and that behaviour does *not* allow for the modification of the intvars
in the program.
Nor - and here's the apparent sticking point for some folks - does it
allow for the underlying OS, or another process, or some other factor to
modify all the intvars when baz is invoked. The behaviour defined is how
the implementation *must* behave in order to be conforming; no allowance
whatsoever is made for the OS, other applications, thumb-fingered idiots
or acts of Zeus to alter this behaviour.
If the underlying system is sufficiently perverse that it *would* modify
all those intvars on an invocation of baz, it follows that the
implementer must take whatever steps are necessary to preserve the
existing values of all the intvars before continuing with the operation
of baz, as failure to do so means baz is not operating in the defined
manner.
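As a rough C sketch of what "whatever steps necessary" could look like:
the names intvars, hostile_read and baz_shielded are hypothetical, and
the three-element array merely stands in for "every intvar in the
program". The point is only that the shielding is the implementation's
job, not the program's.

```c
#include <assert.h>
#include <string.h>

#define NVARS 3
int intvars[NVARS] = { 3, 30, 300 };   /* stand-in for "every intvar" */

/* Hypothetical perverse system call: delivers a character, but also
   cuts every intvar to a third of its value as a side effect. */
int hostile_read(void)
{
    for (int i = 0; i < NVARS; i++)
        intvars[i] /= 3;
    return 'x';
}

/* baz as the implementer would have to provide it on such a system. */
int baz_shielded(void)
{
    int saved[NVARS];
    int c;

    memcpy(saved, intvars, sizeof saved);   /* preserve existing values */
    c = hostile_read();
    memcpy(intvars, saved, sizeof saved);   /* undo the system's meddling */

    /* postcondition: only the return value is visible to the program */
    assert(memcmp(saved, intvars, sizeof saved) == 0);
    return c;
}
```

A program calling baz_shielded sees the character and nothing else; the
perversity of the underlying system stays hidden.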
In short, there is no allowance in the definition of baz for any
operation whatsoever other than that which is expressly and explicitly
defined. Even if the underlying OS is perverse, or even actively
hostile, it is the implementation's responsibility to provide a baz which
works as defined and does not include any extraneous behaviour outside
the scope of that definition.
Obviously, there will be some cases where this is in fact simply not
possible. If the machine loses power, the guarantees are off. If the
machine has faulty memory, variable values may be modified and baz is not
expected to cope with this. In the case of "normal operation", however -
that is, with proper power, properly functioning equipment and the like -
baz's definition simply does not allow for any behaviour other than what
it describes, such as modifying the values of all the intvars.
Yet, for some reason, we have people arguing that the definition of baz
*should* and *does* allow for such things. The reasoning? Apparently,
the reasoning is that such things are not expressly forbidden. Since baz
is not *expressly* forbidden to modify every intvar in the program, then
obviously this should be an acceptable behaviour.
To me that doesn't make any sense. To me that means the standard
document must explicitly and expressly cover every possible contingency.
It must, for example, contain clauses such as the following:
baz must not return EOF unless the standard input stream is at end of
file or an error occurs, even if the system time is 12:03:01 AM.
baz must not return EOF unless the standard input stream is at end of
file or an error occurs, even if the system date is 01/11/2008.
baz must not return EOF unless the standard input stream is at end of
file or an error occurs, even if the program's source file has 173
non-blank, non-comment lines.
baz must not return EOF unless the standard input stream is at end of
file or an error occurs, even if the programmer's name is "Fred".
Such a document would be impossible to create, impossible to maintain,
impossible to use, yet that is the very argument being offered here: that
if the behaviour is not explicitly disallowed, then it must be allowed.
I say this argument is specious. It is sufficient to define how the
function is intended to operate, with the understanding that any
operation other than the definition provided constitutes a failure, a non-
conformance condition in the implementation.
That sort of document is possible to create, to maintain, to use. In
fact, it is comparatively simple (not to belittle the efforts of the
committees); one defines the behaviour of the function(s) and any
behaviour other than what is defined, or expressly allowed by clauses
such as "implementation defined" and the like, is simply disallowed: it
works this way, or it is broken.
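The same idea can be sketched as a conformance check in C. Rather than
listing everything a function must not do, the check compares the result
against the defined behaviour and confirms that state the definition
never mentions is left alone. Everything here (watched, good_baz,
bad_baz, conforms) is a hypothetical illustration, not part of any real
test suite.

```c
#include <assert.h>
#include <string.h>

#define NWATCHED 4
int watched[NWATCHED] = { 9, 90, 900, 9000 };  /* state the definition never mentions */

int good_baz(void) { return 'a'; }                   /* does this and no more */
int bad_baz(void)  { watched[0] /= 3; return 'a'; }  /* does something extra */

/* Returns 1 if fn returned `expected` and left the watched state alone,
   0 otherwise: it works this way, or it is broken. */
int conforms(int (*fn)(void), int expected)
{
    int before[NWATCHED];
    int result;

    assert(fn != NULL);
    memcpy(before, watched, sizeof before);
    result = fn();
    return result == expected
        && memcmp(before, watched, sizeof before) == 0;
}
```

Here conforms(good_baz, 'a') yields 1 and conforms(bad_baz, 'a') yields
0, without the check containing a single clause about thirds, dates or
programmers named Fred.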
So let's look at free, as defined in the C standard. Where does it
explicitly disallow the function modifying the value of every int
variable in the program?
It doesn't. There is no such clause, no such allowance. Any
implementation which allowed this sort of behaviour would be regarded as
broken - yet there is no explicit disallowance of such behaviour. Why,
then, would such behaviour be considered broken? On what basis do we
determine that this is incorrect behaviour?
We determine it by the definition of free, which defines what it is
intended to do, a definition which does not involve any modification of
int variables in the program. It is not necessary to explicitly disallow
this behaviour; it is sufficient to define the intended behaviour, with
the understanding that any behaviour other than what is defined is not
permissible.
So, where do we see, in the definition of free, any explicit statement
that free cannot hand the memory back to the OS, for other purposes? We
don't. However, we don't need to, either, any more than we need to see a
guarantee that it won't modify all the int variables in the program. In
defining the guarantee between the implementation and the program, the
definition expressly states what the behaviour is, and that definition
does not include modifying every int variable in the program, nor does it
include rendering the memory _not_ available for further allocation by
the C program.
Yes, the wording used is inexplicit, and on that basis one might argue
that "further allocation" includes "by the OS or another application",
except that the definition of free, the guarantees it makes, are not
being made for the OS or another application, they're being made for the
C program using the free function. There is no need for an explicit
statement against allowing the memory to be handed back to the OS; it is
sufficient to define the behaviour that is intended. Such behaviour *is*
defined, and it does not include handing the memory back to the OS,
unless the implementation can guarantee that the memory is available for
further allocation by the C program, the very thing it is making its
guarantees to.
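The reading argued for here can be put as a small C probe. It is a
sketch of the argument only: reuse_after_free is a hypothetical helper,
the standard does not promise that any particular malloc call succeeds,
and nothing in the code can observe where the memory went in between.
On this reading, under normal operation, the second request should not
fail merely because free handed the block to someone else.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if a block of n bytes can be allocated again after being
   freed, 0 if the second request fails, -1 if the first one fails. */
int reuse_after_free(size_t n)
{
    void *first;
    void *second;

    assert(n > 0);              /* malloc(0) is implementation-defined */

    first = malloc(n);
    if (first == NULL)
        return -1;
    free(first);                /* "made available for further allocation" */

    second = malloc(n);         /* the same program asks again */
    if (second == NULL)
        return 0;
    free(second);
    return 1;
}
```

Whether the implementation kept the block itself or gave it to the OS
and fetched it back is invisible to the caller; all the program sees is
whether the memory is available to it again.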
That may not be the intent of the standard committee, but that is
irrelevant. It may be their intent that main only return an int, but if
the standard allows for main to return an array of long doubles, this is
a legal and legitimate behaviour unless and until the standard is
modified to say otherwise.
It may well be their intent that free does, in fact, return the memory
back to the OS, for further allocation "by whomever", and this is a
perfectly reasonable intent, even arguably the most sensible intent. It
is simply not a behaviour allowed by the wording as written, meaning that
unless and until changed, the standard simply does not allow such
behaviour.
Where in the standard is it defined that "made available for further
allocation" means anything more than that free has to unblock the memory
now freed? To disallow giving the memory back to the OS, there must be
something in the standard that explicitly forbids it. So where in the
standard is it required that this is the case? The standard requires so
many things explicitly, and it would have done that for free too - but
there is nothing about that.