C
Chris Dollin
Ben said:Chris Dollin said:Ben said:Because I have a liking for denotational semantics, given a C program
X which has some UB elements in it, I think E[X] = _|_ [1].
UB isn't bottom -- that's a different sense of undefined.
Not entirely different. I know it is somewhat different which is why
I only said that this is how I *think* about UB. The standard says
what UB means (not what programs having UB mean, of course, just what
the term means) and I am not suggesting an alternative. I am saying
how I think of it and why I think that is scary enough.
(To see why, consider our old friend `i = i++;`. If UB was bottom, then
this expression would compute bottom; it would not terminate; any
implementation that gave any value to `i` and continued would be
non-conformant.
E[X] = _|_ does not (always) mean X does not terminate. In the lambda
calculus, non-termination is the most common way to get _|_, but
there are others (E["tail []"] = _|_ in most semantics, for example).
I'd have thought that in /most/ semantics, `tail []` is an error value
or invokes a continuation.
Machines implement bottom as non-termination; I'm almost completely
sure of this, but it has been a long time.
To see what it means in the fullest sense, one has to see what you do
without it. Formulations of denotation semantics without _|_ are
possible but they reply heavily on the theory of partial functions.
In other words the "meaning" functions end up having nothing to say at
all about certain programs. _|_ is "thereof we must be silent" in set
theory. It makes all the functions total by having a symbol for
silence.
Yes. (It does more than this, of course, but it does /at least/ that.)
To capture C's notion of UB, where /any/ behaviour
is legal, I think you'd need to do something different again, perhaps
represent answers as sets of values, typical with one element, "the"
value, but possible several, eg capturing different evaluation orders.
Then an actual implementation would be legal if it computed any of the
answers in the set, and UB would allow the set to contain all possible
values /of the allowed types/.
Which, since demons aren't in the sets used to describe C, means that
UB cannot result in nasal demons, just as DB /can/ -- because demons,
nasal or otherwise, are not part of /the C abstract machine/. No?
)
This is exactly the problem. If you try to specify "anything" you
have to specify the universal set for the domain of discourse. What
is the total set of behaviours?[1]
Whatever the domain of answers says it is: you can choose.
The C standard prescribes the behaviour of the C abstract machine.
Behaviour outside that machine is invisible to it. Invisible behaviour
is not constrained by the standard at all.
If you include just those reachable through some C abstract machine
(no matter how non-deterministic) you will end up specifying the UB.
OK, it will be a very lax spec (maybe even "all program objects and
streams will be in an indeterminate state")
In my formulation, it would be "all states are in the set of allowable
states". I'd have to go and digest the standard to see if "indeterminate"
was like a value or a constraint on a value.
but you will not satisfy the nasal fans -- although I'd be quite happy
with it.
I'm beginning to believe that nasality, being invisible to the standard,
isn't a true (even if good) capturing of UB!
Consider this question: which part of the standard prohibits `i += 1;`
from causing a demon to fly out of your nose?