How bad it is to dereference a null pointer


Armen Tsirunyan

Hi all,
Consider the following code:
<CODE>
#include <iostream>

struct X
{
void f()
{
std::cout << "f() called" << std::endl;
}
};

int main()
{
X* p = 0;
p->f();
X x = *p;
x.f();
}
</CODE>

It works (on MSVC 9.0) and prints "f() called" two times. Well,
logically it should, because in X::f() 'this' is not used and
therefore theoretically 'this' could be null. The copy construction
also logically succeeds because its implicit definition is empty. On the
other hand, if X had a member, the copy construction X x = *p would
fail, because the "this" of *p would be needed.

Now, my question is,
1. does the standard (the current one) say anything about
dereferencing null pointers?
2. Does the above code result in undefined behaviour?
 

Alf P. Steinbach /Usenet

* Armen Tsirunyan, on 30.09.2010 17:03:
Now, my question is,
1. does the standard (the current one) say anything about
dereferencing null pointers?

It mentions in non-normative text (as I recall) that it's UB to dereference a
nullpointer, in general. That's somewhere right at the start, I think in the
definitions of terms. However, even if there's no normative text that says that
outright, absence of such text is just due to being an unnecessary clarification.

The standard makes a special exception for a typeid expression, where you can
safely dereference a nullpointer.

That explicit exception wouldn't be necessary if dereferencing nullpointers
wasn't UB in general.

2. Does the above code result in undefined behaviour?

In abundance.


Cheers & hth.,

- Alf
 

Armen Tsirunyan

It mentions in non-normative text (as I recall) that it's UB to dereference a
nullpointer, in general. That's somewhere right at the start, I think in the
definitions of terms. However, even if there's no normative text that says that
outright, absence of such text is just due to being an unnecessary clarification.

The standard makes a special exception for a typeid expression, where you can
safely dereference a nullpointer.

That explicit exception wouldn't be necessary if dereferencing nullpointers
wasn't UB in general.


In abundance.

Cheers & hth.,

- Alf

Yeah, typeid is an exception, but the standard says it throws, not that
it is OK. It's just defined behavior of throwing an exception. But in
sizeof, I guess it would be defined and throw nothing, right?
Anyway, what's wrong with allowing a call to any member function that
doesn't use 'this' through a null pointer?
More formally, I found this
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232
Is this proposal included in C++0x?
Thanks,
Armen.
 

Johannes Schaub (litb)

Armen said:
Yeah, typeid is an exception, but the standard says it throws, not that
it is OK. It's just defined behavior of throwing an exception. But in
sizeof, I guess it would be defined and throw nothing, right?

In "sizeof", things are not evaluated. If you don't evaluate an expression,
you will never figure out that you dereferenced a null pointer instead of a
pointer to an actual object (the critical point is that you need to have an
lvalue refer to an object or function. You can only find out whether it does
by evaluation). So yeah, it's fine to dereference a null pointer within a
sizeof expression.
 

Goran

It mentions in non-normative text (as I recall) that it's UB to dereference a
nullpointer, in general. That's somewhere right at the start, I think in the
definitions of terms. However, even if there's no normative text that says that
outright, absence of such text is just due to being an unnecessary clarification.

The question then becomes whether calling a member function on a null
object reference counts as dereferencing the object (problem being, I
would think, that the mere absence of a * or a -> does not count as
dereferencing it, e.g. with sizeof or offsetof).

Goran.
 

Goran

Putting standard and theory aside, what about practice? First, if
object's method does not touch object's data, why isn't it static, or
even a free function? IOW, there's a design error in your code.
Second, what if you start doing these things and then add a member to
X? IOW, there's a maintenance error in your code.

I am reacting because good practice is more important than theory/
standard; only when good practice seems to disagree with theory/
standard is there a problem, or a place for a question. And you didn't
obey good practice ;-)

Goran.
 

Alf P. Steinbach /Usenet

* Goran, on 01.10.2010 06:29:
The question then becomes whether calling a member function on a null
object reference counts as dereferencing the object (problem being, I
would think, that the mere absence of a * or a -> does not count as
dereferencing it, e.g. with sizeof or offsetof).

That does not make sense.

There is no such thing as a "null object reference".

There is no such thing as "dereferencing the object".

sizeof and offsetof are not examples of "absence of a * or a ->".

So in summary, I can't make heads or tails of what you intend to communicate here.


Cheers,

- Alf
 

Armen Tsirunyan

Putting standard and theory aside, what about practice? First, if
object's method does not touch object's data, why isn't it static, or
even a free function? IOW, there's a design error in your code.
Second, what if you start doing these things and then add a member to
X? IOW, there's a maintenance error in your code.

I am reacting because good practice is more important than theory/
standard; only when good practice seems to disagree with theory/
standard is there a problem, or a place for a question. And you didn't
obey good practice ;-)

Goran.

This piece of code was invented by me for pure theoretical purposes,
so don't teach me good practice here :).
Now, has anyone even looked at the link I gave? There seems to be a
proposal to add a null lvalue thing which would be a valid thing unless
you perform an lvalue-to-rvalue conversion to it. So, I was wondering
if this thing is in the new c++0x standard draft or not.
Thanks,
Armen
 

Goran

That does not make sense.

There is no such thing as a "null object reference".

Heh, true. I was thinking something along the lines of:

TYPE& r = *(TYPE*)0;

I used word "reference" because there's no "calling a member function
on a pointer". There's calling it on an object that pointer points to,
which I poorly named object reference (a __null__ one, since it's a
null pointer).

There is no such thing as "dereferencing the object".

Whoops, that's true, my bad. Should have been "dereferencing the
pointer".

sizeof and offsetof are not examples of "absence of a * or a ->".

I was thinking e.g.:

TYPE* p=0;
sizeof(*p);
offsetof(p->field);

sizeof and offsetof are not examples of "absence of a * or a ->".

My bad again, that should have been "presence", not absence.

OK, so, to correct myself...

The question then becomes whether calling a member function on a null
object __pointer__ counts as dereferencing the __pointer__ (problem
being, I would think, that the mere __presence__ of a * or a -> does
not count as dereferencing it, e.g. with sizeof or offsetof).

Sorry about the confusion.

Goran.
 

Goran

This piece of code was invented by me for pure theoretical purposes,
so don't teach me good practice here :).

Sorry, that really wasn't my intention. I was rather thinking "even if
this works __and__ is legal, it should not be done; why even bother
asking?"

Hence the post.

Goran.
 

Armen Tsirunyan

The question then becomes whether calling a member function on a null
object __pointer__ counts as dereferencing the __pointer__ (problem
being, I would think, that the mere __presence__ of a * or a ->  does
not count as dereferencing it, e.g. with sizeof or offsetof).

Sorry about the confusion.

Goran.

You have a good point there. Calling a member function via a
pointer which is a null pointer is not dereferencing, is it?
Well, naturally it __involves__ dereferencing if the member function
cannot be made static, but what if it can? Is the wording in the
standard clear in this respect?
 

Goran

You have a good point there. Calling a member function via a
pointer which is a null pointer is not dereferencing, is it?
Well, naturally it __involves__ dereferencing if the member function
cannot be made static, but what if it can? Is the wording in the
standard clear in this respect?

No idea about any of your questions. By looking around the word
"dereference" in 03 draft, I didn't find anything of interest. But
then, I am not much of a standard-dweller.

Goran.
 

JaredGrubb

The question then becomes whether calling a member function on a null
object __pointer__ counts as dereferencing the __pointer__ (problem
being, I would think, that the mere __presence__ of a * or a ->  does
not count as dereferencing it, e.g. with sizeof or offsetof).

Sorry about the confusion.

First, "0" is a valid memory address and is not treated differently by
the compiler. For example, here's a write-up on how you can make Linux
create an object at address 0, and then you can write to it and call
it and whatever else you want to do.

http://blog.ksplice.com/2010/03/null-pointers-part-i/

There is nothing special about "0"; most OS/architectures/runtime
environments will not allocate objects at 0 because it's incredibly
"helpful" to have our programs crash when they pretend to have objects
at 0 -- in almost every case, this is a programming error and bad
things are going to happen.

Second, sizeof and offsetof do not dereference anything; they compute
values from *types*. So if p points to a type P, then sizeof(*p) and
sizeof(P) are equivalent; further if p2 is another pointer of type P,
sizeof(*p2) has the same value as the other two. That value is
computed at compile-time by the compiler and the constant value is
poked into the binary code itself. So at runtime, there is no pointer
left; it's just a number.

Jared
 

Luc Danton

First, "0" is a valid memory address and is not treated differently by
the compiler. For example, here's a write-up on how you can make Linux
create an object at address 0, and then you can write to it and call
it and whatever else you want to do.

http://blog.ksplice.com/2010/03/null-pointers-part-i/

There is nothing special about "0"; most OS/architectures/runtime
environments will not allocate objects at 0 because it's incredibly
"helpful" to have our programs crash when they pretend to have objects
at 0 -- in almost every case, this is a programming error and bad
things are going to happen.

0 _is_ a special pointer value in C (and consequently, in C++). Please
note that I am _not_ disagreeing with you on your other points: there
are OS/architecture/runtime environments that behave as you describe. But
consider that C and C++ have a very simple notion of addresses: &a is the
address of an object a and that's about it (not even sure if the
Standard uses this terminology). There is however no notion of 'address
space' and the like, and thus an 'address 0' has no meaning in C or C++.
As you pointed out however, where it is defined it might be fine; but
that's not covered by the C and C++ Standards.

Furthermore, yes, a compiler _is_ free to treat 0 differently. The C++
Standard mandates that every pointer initialized from or assigned to 0
have the same special representation (actually there is one
representation per pointer type). An implementation could behave as
follows without triggering any assert and still be conforming:

int* pint = 0; // int null pointer
char* pchar = 0; // char null pointer

intptr_t pint_rep;
std::memcpy(&pint_rep, &pint, sizeof pint);
assert( pint_rep != 0 ); // integral zero

intptr_t pchar_rep;
std::memcpy(&pchar_rep, &pchar, sizeof pchar);
assert( pchar_rep != 0 ); // integral zero
assert( pchar_rep != pint_rep );

There are plenty of good items regarding null pointers on the C FAQ:
http://c-faq.com/null/index.html

such as:
5.5 How should NULL be defined on a machine which uses a nonzero bit
pattern as the internal representation of a null pointer?
5.17 Seriously, have any actual machines really used nonzero null
pointers, or different representations for pointers to different types?
5.19 How can I access an interrupt vector located at the machine's
location 0? If I set a pointer to 0, the compiler might translate it to
some nonzero internal null pointer value.
 

Johannes Schaub (litb)

Luc said:
0 _is_ a special pointer value in C (and consequently, in C++). Please
note that I am _not_ disagreeing with you on your other points: there
are OS/architecture/runtime environments that behave as you describe. But
consider that C and C++ have a very simple notion of addresses: &a is the
address of an object a and that's about it (not even sure if the
Standard uses this terminology). There is however no notion of 'address
space' and the like, and thus an 'address 0' has no meaning in C or C++.
As you pointed out however, where it is defined it might be fine; but
that's not covered by the C and C++ Standards.

0 is not a pointer value. It's an integer value. As he said, there is
nothing magic about "0". The null pointer conversion will yield a special
pointer value when it is converted to a pointer type.

Furthermore, yes, a compiler _is_ free to treat 0 differently. The C++
Standard mandates that every pointer initialized from or assigned to 0
have the same special representation (actually there is one
representation per pointer type).

Though I don't really see any practical use, there is no requirement in the
Standard that "every pointer initialized from or assigned to 0 have the same
special representation" even if we talk about only one pointer type. The
only requirement is that all such pointers compare equal.

The following probably makes no sense, but consider an implementation that
has signed addresses, and which uses signed magnitude as representation and
0x0 as an internal representation of a null pointer. Another interpretation
could be -0x0 (highest bit set). I think that such a scenario could very
well be possible in practice.
 

Johannes Schaub (litb)

Armen said:
You have a good point there. Calling a member function via a
pointer which is a null pointer is not dereferencing, is it?
Well, naturally it __involves__ dereferencing if the member function
cannot be made static, but what if it can? Is the wording in the
standard clear in this respect?

The Standard says "If E1 has the type 'pointer to class X,' then the
expression E1->E2 is converted to the equivalent form '(*(E1)).E2'; the
remainder of 5.2.5 will address only the first option (dot)". See 5.2.5/3.
 

James Kanze

On Oct 1, 6:59 am, Goran wrote:

[...]
In the past (I'm not sure about C++0x), calling a non-static
member function (the expression o.f()) involves evaluating the
expression to the left of the dot. If a -> is used, it is the
equivalent of (*p).f(), so *p is evaluated (if the entire
expression is evaluated). The result of *p is an lvalue, which
must refer to an object or a function---otherwise, the behavior
is undefined.

This undefined behavior only occurs if the expression is
"evaluated". Expressions which are operands of sizeof, for
example, are never evaluated; nor are expressions in a flow path
which is never executed. (In some ways, this can be considered
"run-time" undefined behavior. A compiler can't refuse to
compile code which contains it unless it can prove that the code
will be executed for all possible input.)

First, "0" is a valid memory address and is not treated
differently by the compiler.

A "null constant expression" converts to a pointer which is
guaranteed not to point to a valid object. The expression "0"
is a null constant expression. And the compiler is required to
treat it differently.

For example, here's a write-up on how you can make Linux
create an object at address 0, and then you can write to it
and call it and whatever else you want to do.

The author of the article doesn't seem to know C (or C++). But
the issue is irrelevant with regard to C++ -- when you start
playing games with specific system functions, you enter into the
realm of undefined behavior.

There is nothing special about "0";

There is according to the language standard.

most OS/architectures/runtime environments will not allocate
objects at 0 because it's incredibly "helpful" to have our
programs crash when they pretend to have objects at 0 -- in
almost every case, this is a programming error and bad things
are going to happen.

Most OS's today will arrange things so that the null pointer
constant 0 can be the address 0; it makes things a lot easier
for the compiler. But that's neither here nor there.

Second, sizeof and offsetof do not dereference anything; they
compute values from *types*.

The expressions in sizeof and typeid (if the type is not
polymorphic) are not evaluated, so there is no problem. The
macro offsetof doesn't even take a pointer, so there's no way of
giving it a null pointer.
 

James Kanze

On 02/10/2010 17:56, JaredGrubb wrote:

[...]
0 _is_ a special pointer value in C (and consequently, in C++).

No. 0 is an integral constant expression, not a pointer. It
converts into any pointer type, and the result of that
conversion is a pointer which cannot point to any valid object.

Note that this is only true for integral constant expressions
evaluating to 0. Given the following:

int i = 0;
char* p1 = reinterpret_cast<char*>(i);
char* p2 = reinterpret_cast<char*>(0);
assert(p1 == p2); // May fail!

[...]
Furthermore, yes, a compiler _is_ free to treat 0 differently.

It is required to treat 0 differently.

The C++ Standard mandates that every pointer initialized from
or assigned to 0 have the same special representation
(actually there is one representation per pointer type).

Just a nit, but the standard doesn't require that. All that it
requires is that all pointers initialized with 0 compare equal.
(Of course, the easiest way for them to compare equal is for
them to have the same representation, and I can't think of any
reason why a compiler might use different representations.)

An implementation could behave as follows without triggering
any assert and still be conforming:

int* pint = 0; // int null pointer
char* pchar = 0; // char null pointer

intptr_t pint_rep;
std::memcpy(&pint_rep, &pint, sizeof pint);
assert( pint_rep != 0 ); // integral zero

intptr_t pchar_rep;
std::memcpy(&pchar_rep, &pchar, sizeof pchar);
assert( pchar_rep != 0 ); // integral zero
assert( pchar_rep != pint_rep );

Exactly.
 

James Kanze

Luc Danton wrote:

[...]
0 is not a pointer value. It's an integer value. As he said, there is
nothing magic about "0". The null pointer conversion will yield a special
pointer value when it is converted to a pointer type.

And that is magic:). There's nothing magic about the value 0,
in itself, but there is magic when it's an integral constant
expression evaluating to 0.
 
