Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?


Michael Tsang

Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
pointer is defined to "crash the program with SIGSEGV".

Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
simply wraps around so we can say that the behaviour is defined to round on
x86 CPUs.
 

Alf P. Steinbach

* Michael Tsang:
Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
pointer is defined to "crash the program with SIGSEGV".

Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
simply wraps around so we can say that the behaviour is defined to round on
x86 CPUs.

Your question, from the subject line, is

"Knowing the implementation, are all undefined behaviours become
implementation-defined behaviours?"

And it's cross-posted to [comp.lang.c] and [comp.lang.c++].

At least for C++ the answer is a definite maybe: theoretically it depends on the
implementation.

In practice the answer is a clearer "no", because it's practically impossible
for an implementation to clearly define all behaviors, in particular pointer
operations and use of external libraries.



Cheers & hth.,

- Alf
 

Seebs

Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
pointer is defined to "crash the program with SIGSEGV".

Not necessarily.

Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
simply wraps around so we can say that the behaviour is defined to round on
x86 CPUs.

That's not rounding, that's wrapping.

But no, it's not the case. These are not necessarily *defined* -- they may
merely be typical side-effects that are not guaranteed or supported.

Modern gcc can do some VERY strange things if you write code which might
dereference a null pointer. (For instance, loops which check whether a
pointer is null may have the test removed because, if it were null, it
would have invoked undefined behavior to dereference it...)

-s
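
To illustrate the wrapping point: since the wraparound is only a typical side effect and not a guarantee, portable code has to detect overflow before the addition happens. A minimal sketch, assuming only <limits.h>; the name checked_add is made up for this illustration and is not from any library.

```c
#include <limits.h>

/*
 * Hypothetical helper: stores a + b in *out and returns 1 when the sum
 * fits in an int; returns 0 and leaves *out untouched otherwise.
 * The range test runs before the addition, so no signed overflow occurs.
 */
int checked_add(int a, int b, int *out)
{
    if (b > 0 && a > INT_MAX - b)
        return 0;               /* sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return 0;               /* sum would drop below INT_MIN */
    *out = a + b;
    return 1;
}
```

By contrast, a test written after the fact, such as a + 1 < a, can only be true once overflow has already happened, so an optimiser is allowed to treat it as always false.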
 

Malcolm McLean

"Undefined behaviour" doesn't mean "exists in some metaphysical state
of indefiniteness" but "the C standard imposes no requirements on the
program's behaviour (and therefore the program is incorrect)". There
was a huge thread about this a few years back on gets.

So typically dereferencing null will have the same effect each time any
particular program is run, probably the same effect on any particular
platform. Dereferencing a wild pointer may have different effects,
particularly on a multi-tasking machine where exact pointer values
vary from run to run.
 

Robert Fendt

dereference a null pointer. (For instance, loops which check whether a
pointer is null may have the test removed because, if it were null, it
would have invoked undefined behavior to dereference it...)

Sorry to interrupt, but since when is checking a pointer value
for 0 the same as dereferencing it? Checking a pointer treats the
pointer itself as a value, and comparison against 0 is one of
the few things that are _guaranteed_ to work with a pointer
value. So if GCC really would remove a check of the form

if(!pointer)
    do_something(*pointer);

or even

if(pointer == 0)
    throw NullPointerException;

then GCC would be very much in violation of the standard. And
produce absolutely useless code, as well. What's the point of
having pointers in a language if you wouldn't even be able to
perform basic operations on them?

Regards,
Robert
 

Alf P. Steinbach

* Richard Heathfield:
Thread's subject line: Knowing the implementation, are all undefined
behaviours become implementation-defined behaviours?

No. For example, consider a stack exploit on gets(). There are systems
on which the behaviour could be absolutely anything at all, depending on
user input!6\b$10be5c39no carrier

:)


Cheers,

- Alf
 

Bo Persson

Robert said:
Sorry to interrupt, but since when is checking a pointer value
for 0 the same as dereferencing it? Checking a pointer treats the
pointer itself as a value, and comparison against 0 is one of
the few things that are _guaranteed_ to work with a pointer
value. So if GCC really would remove a check of the form

if(!pointer)
    do_something(*pointer);

or even

if(pointer == 0)
    throw NullPointerException;

then GCC would be very much in violation of the standard. And
produce absolutely useless code, as well. What's the point of
having pointers in a language if you wouldn't even be able to
perform basic operations on them?

Yes, but there are cases where the compiler can determine that the
pointer is ALWAYS null or not-null, and remove code that would execute
otherwise. For example:

*pointer = 42;
if(pointer == 0)
    throw NullPointerException;

is known never to throw the exception!


Bo Persson
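
The ordering issue can be sketched from the other direction: if the test comes before the dereference, no path reaches the dereference with a null pointer, and the compiler has no grounds to treat the test as dead code. A minimal sketch in C; read_or_default is a hypothetical name used only for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns *p, or fallback when p is null.
 * The test comes first, so every path that reaches the dereference has
 * already established that p is non-null.
 */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL)
        return fallback;
    return *p;
}
```

Swapping the two statements (reading *p into a local first, then testing p) is the shape that lets the check be removed.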
 

Ersek, Laszlo

Checking a pointer treats the
pointer itself as a value, and comparison against 0 is one of
the few things that are _guaranteed_ to work with a pointer
value.

No, evaluating an invalid pointer is undefined behavior.

{
    void *p;

    p = malloc(1);
    free(p);
    p;      /* UB */
    !p;     /* UB */
    0 != p; /* UB */
}

See the C99 Rationale 6.3.2.3 Pointers for an informative (not
normative) description.

I believe that in this paragraph:

----v----
Regardless how an invalid pointer is created, any use of it yields
undefined behavior. Even assignment, comparison with a null pointer
constant, or comparison with itself, might on some systems result in an
exception.
----^----

"any use" denotes "any evaluation", and "assignment" means "assignment
FROM the invalid pointer". I'm fairly sure the following is valid:

{
    int *ip;

    ip = malloc(sizeof *ip);
    free(ip);
    sizeof ip;
    sizeof *ip;
    ip = 0;
    ip;
    !ip;
    0 != ip;
}

Cheers,
lacos
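
One common way to stay clear of the problem described above is to overwrite the pointer with a null pointer as part of freeing it, so that any later test reads a valid value. A minimal sketch, assuming an int allocation; release_int is a hypothetical helper, not a standard function.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: frees *pp and overwrites it with a null pointer,
 * so later tests of the variable read a valid (null) value instead of an
 * indeterminate one. Passing a pointer that is already null is harmless,
 * because free(NULL) does nothing.
 */
void release_int(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```

This only resets the one variable passed in; any other copies of the old pointer value remain invalid and must not be evaluated.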
 

Richard Tobin

Malcolm McLean said:
Dereferencing a wild pointer may have different effects,
particularly on a multi-tasking machine where exact pointer values
vary from run to run.

It's not a general characteristic of multi-tasking systems that
pointer values vary from run to run. Virtual memory has traditionally
been used to give all instances of a program indistinguishable address
spaces, and addresses will usually be the same.

Recently for security reasons some operating systems have started to
deliberately randomise the locations of, for example, shared
libraries, so pointers are now more likely to vary. (Fortunately this
can usually be disabled for debugging.)

-- Richard
 

Robert Fendt

Yes, but there are cases where the compiler can determine that the
pointer is ALWAYS null or not-null, and remove code that would execute
otherwise. For example:

*pointer = 42;
if(pointer == 0)
    throw NullPointerException;

is known never to throw the exception!

Yes, that's static optimisation. Nothing wrong with that.
However, the posting I was commenting on explicitly described
something different:

This would mean nothing else than the compiler removing
nullpointer checks solely on the grounds that a nullpointer
cannot be de-referenced legally. So the compiler would see a
pointer dereference, and decide "then it can't be null anyway,
since it's used later". And that's just bull, sorry.

Yes, if there's an unconditional pointer dereference and
_afterwards_ a check for null, the compiler could take this as a
hint that said pointer has been checked for null before the first
dereference and thus remove the superfluous check. So if you had
something like this:

MyType& obj = *pointer;
if (!pointer)
    throw NullPointerException;

Since the dereference happens _before_ the check, the program
has already entered the domain of undefined behaviour, and the
check is moot (even if one has not 'used' the object reference
in any other way). If the author of the previous posting meant
that, then I agree (though I have doubts whether GCC really
optimises this aggressively). But in that case his comment was at
least not very clear.

Regards,
Robert
 

Ben Bacarisse

Robert Fendt said:
Yes, if there's an unconditional pointer dereference and
_afterwards_ a check for null, the compiler could take this as a
hint that said pointer has been checked for null before the first
dereference and thus remove the superfluous check. So if you had
something like this:

MyType& obj = *pointer;
if (!pointer)
    throw NullPointerException;

Since the dereference happens _before_ the check, the program
has already entered the domain of undefined behaviour, and the
check is moot (even if one has not 'used' the object reference
in any other way). If the author of the previous posting meant
that, then I agree (though I have doubts whether GCC really
optimises this aggressively).

gcc does exactly that (with certain options). I think this is the
nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19

The pointer use was ever so slightly less obvious but it led gcc to
conclude that the following test could be removed.

Given the cross-post, I should say that I have no idea if gcc does
this for the exact case you cite (which is C++) but I wanted to point
out that similar things are done.

<snip>
 

Robert Fendt

gcc does exactly that (with certain options). I think this is the
nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19

It certainly looks that way. That's a nasty bugger to spot.

Given the cross-post, I should say that I have no idea if gcc does
this for the exact case you cite (which is C++) but I wanted to point
out that similar things are done.

Yes, I did not notice this whole thread had been crossposted to
comp.lang.c; a more appropriate example would then have been a
sizeof(*pointer) or something. Since sizeof in that case relies
only on static type information, one could assume it should work
whether the pointer is null or not. But the dereference itself
already makes the whole program ill-formed (in case of a
nullpointer).

Regards,
Robert
 

James Kanze

And thus spake Ben Bacarisse
Sun, 14 Feb 2010 13:41:23 +0000:
It certainly looks that way. That's a nasty bugger to spot.

Either the pointer can be null, or it cannot. If it can be
null, the first unit test which tests it with null should cause
a crash. If it cannot, then the test that g++ would have
removed is superfluous, and removing it shouldn't change
anything.

There are many other cases of undefined behavior which do affect
optimizations, however. Consider an expression like: f((*p)++,
(*q)++). Given this, the compiler "knows" that p and q do not
reference the same memory (since if they did, it would be
undefined behavior), which means that in other code in the
function, the compiler might have cached *p, and knows that it
doesn't have to update or purge its cached value if there is a
write through *q.

Yes, I did not notice this whole thread had been crossposted
to comp.lang.c; a more appropriate example would then have
been a sizeof(*pointer) or something. Since sizeof in that
case relies only on static type information, one could assume
it should work whether the pointer is null or not. But the
dereference itself already makes the whole program ill-formed
(in case of a nullpointer).

Dereferencing a null pointer is only undefined behavior if the
code is actually executed. Something like sizeof(
f(*(MyType*)0) ) is perfectly legal, and widely used in some
template idioms (although I can't think of a reasonable use for
it in C).
 

Malcolm McLean

Dereferencing a null pointer is only undefined behavior if the
code is actually executed.  Something like sizeof(
f(*(MyType*)0) ) is perfectly legal, and widely used in some
template idioms (although I can't think of a reasonable use for
it in C).

Nulls are dereferenced to produce the offsetof macro hack in C.
 

Ersek, Laszlo

Nulls are dereferenced to produce the offsetof macro hack in C.

No, they are not.

I guess you mean something like this:

#define offsetof(type, member_designator) \
    ((size_t)&((type *)0)->member_designator)

Let's deal first with the conversion of the final pointer to size_t:

C99 6.3.2.3 Pointers, p6: "Any pointer type may be converted to an
integer type. Except as previously specified, the result is
implementation-defined. If the result cannot be represented in the
integer type, the behavior is undefined. The result need not be in the
range of values of any integer type."

Then wrt. dereferencing the null pointer:

C99 6.6 Constant expressions, p9: "An address constant is a null
pointer, [...]; it shall be created explicitly using the unary &
operator or an integer constant cast to pointer type, or [...]. The
[...] member-access . and -> operators, the address & and indirection *
unary operators, and pointer casts may be used in the creation of an
address constant, but the value of an object shall not be accessed by
use of these operators."

Perhaps this is relevant too:

C99 6.5.3.2 Address and indirection operators, p3: "[...] If the operand
is the result of a unary * operator, neither that operator nor the &
operator is evaluated and the result is as if both were omitted, except
that the constraints on the operators still apply and the result is not
an lvalue. [...]"

Cheers,
lacos
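
Whatever the status of the hand-written macro, user code does not need it: <stddef.h> already provides offsetof, and the implementation is free to define it in whatever way is safe for that compiler. A minimal sketch of using the standard macro; struct record and value_offset are hypothetical names for this illustration.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct record {
    char tag;
    double value;
};

/* Byte offset of the 'value' member, obtained via the standard macro. */
size_t value_offset(void)
{
    return offsetof(struct record, value);
}
```

The exact offset depends on the implementation's padding rules, so the sketch deliberately does not state an expected number.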
 

Ben Bacarisse

James Kanze said:
On Feb 14, 1:54 pm, Robert Fendt wrote:

Dereferencing a null pointer is only undefined behavior if the
code is actually executed. Something like sizeof(
f(*(MyType*)0) ) is perfectly legal, and widely used in some
template idioms (although I can't think of a reasonable use for
it in C).

For a non-literal null, it is quite common:

new_ptr = realloc(old_ptr, new_length * sizeof *new_ptr);

will work regardless of the state of new_ptr (null, well-defined or
indeterminate).

[I know you know this: I am simply illustrating the point with a
common idiom.]
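
The idiom can be wrapped up as a small function to show that the pointer named inside sizeof is never read. A minimal sketch, assuming an int array; resize_ints is a hypothetical helper, and it does not guard against new_length * sizeof *new_ptr overflowing or against new_length being zero.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: resizes an int array to new_length elements.
 * new_ptr is still uninitialised when sizeof *new_ptr is computed; that
 * is fine because sizeof uses only the operand's type here and does not
 * evaluate it.
 * Returns NULL on failure, in which case old_ptr is still valid.
 */
int *resize_ints(int *old_ptr, size_t new_length)
{
    int *new_ptr;

    new_ptr = realloc(old_ptr, new_length * sizeof *new_ptr);
    return new_ptr;
}
```

Passing NULL as old_ptr makes realloc behave like malloc, so the same helper covers the initial allocation.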
 

Seebs

Sorry to interrupt, but since when is checking a pointer value
for 0 the same as dereferencing it?

It's not.

But if you dereference a pointer at some point, a check against it can
be omitted. If, that is, that dereference can happen without the check.

So imagine something like:

ptr = get_ptr();

while (ptr != 0) {
    /* blah blah blah */
    ptr = get_ptr();
    x = *ptr;
}

gcc might turn the while into an if followed by an infinite loop, because
it *knows* that ptr can't become null during the loop, because if it did,
that would have invoked undefined behavior.

And there are contexts where you can actually dereference a null and not
get a crash, which means that some hunks of kernel code can become infinite
loops unexpectedly with modern gcc. Until the kernel is fixed, which I
believe it has been.

-s
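
For comparison, the loop above can be restructured so that the only dereference sits behind the loop test, which leaves nothing for the compiler to infer from undefined behaviour. This is a sketch under assumptions: the thread never shows get_ptr, so the version here (walking a NULL-terminated array through a hypothetical struct ptr_source) is invented purely to make the example self-contained.

```c
#include <stddef.h>

/* Hypothetical pointer source: walks a NULL-terminated array. */
struct ptr_source {
    int *const *next;
};

/* Returns the next pointer, or NULL once the terminator is reached. */
int *get_ptr(struct ptr_source *src)
{
    int *p = *src->next;

    if (p != NULL)
        src->next++;    /* stay on the terminator once it is reached */
    return p;
}

/* Sums the pointed-to values; each dereference is guarded by the loop test. */
long sum_all(struct ptr_source *src)
{
    long total = 0;

    for (int *ptr = get_ptr(src); ptr != NULL; ptr = get_ptr(src))
        total += *ptr;
    return total;
}
```

Here every new value of ptr is tested before it is used, so the test stays meaningful on every iteration.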
 

Seebs

Either the pointer can be null, or it cannot. If it can be
null, the first unit test which tests it with null should cause
a crash. If it cannot, then the test that g++ would have
removed is superfluous, and removing it shouldn't change
anything.

Unless you're in a context where dereferencing null exhibits the undefined
behavior of giving you access to a block of memory.

Dereferencing a null pointer is only undefined behavior if the
code is actually executed. Something like sizeof(
f(*(MyType*)0) ) is perfectly legal, and widely used in some
template idioms (although I can't think of a reasonable use for
it in C).

Implementation of offsetof(), too, although that's not exactly safe.

-s
 

Ben Bacarisse

Malcolm McLean said:
Nulls are dereferenced to produce the offsetof macro hack in C.

Then I would say that it is not an example of what James was talking
about. In his C++ example, no null pointer is dereferenced.

Obviously there is a terminology issue here in that you might want to
say that sizeof *(int *)0 is a dereference of a null pointer because,
structurally, it applies * to such a pointer; but I would rather
reserve the word dereference for an /evaluated/ application of * (or []
or ->). I'd go so far as to say that any other use is wrong.
 

Thad Smith

Michael said:
Dereferencing a NULL pointer is undefined behaviour,

Actually, dereferencing a null pointer _results in_ behavior undefined by
Standard C.

In answer to your subject line question "Knowing the implementation, are all
undefined behaviours become implementation-defined behaviours?", no.

In Standard C "implementation-defined behavior" means that the implementation
documents the behavior. Even if the behavior is consistent for a particular
implementation, it may not be documented.
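
For contrast, here is a small sketch of something that genuinely is implementation-defined: right-shifting a negative signed value (C99 6.5.7p5). The program is valid on every conforming implementation, and each implementation has to document which result it gives. The function name is hypothetical.

```c
/*
 * Right-shifting a negative signed value is implementation-defined in
 * C99 (6.5.7p5): the program is valid, and the implementation must
 * document which result it produces.
 * Returns 1 if this implementation sign-extends, 0 otherwise.
 */
int right_shift_sign_extends(void)
{
    return (-1 >> 1) == -1;
}
```

Which value comes back is for the implementation's documentation to say; the sketch only shows that asking the question is legal, unlike the undefined cases discussed in this thread.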
 
