Null reference

David W

I'm almost tearing my hair out. A colleague claimed that a null reference can exist, like
this:

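#include <cstdio>   // printf
#include <cstddef>  // NULL

void f( int& p )
{
    printf( "%d\n", p );   // a crash is likely to show up here, where the reference is used
}

int main (int argc, char *argv[])
{
    int* p = NULL;

    f( *p );   // undefined behaviour: the null pointer is dereferenced here

    return 0;
}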

I pointed him to the relevant part of the C++ standard that forbids this in a well-defined
program, but he countered that in the "real world" a reference can be "null". An argument
then ensued in which he managed to raise concerns with two other colleagues because the
crash in the above code is likely to occur where the reference is used, which could be
anywhere, not where the null pointer is dereferenced. One (who has had little experience
with C++) even said that references "shouldn't be used in safety critical code". The other
became concerned because he'd always assumed that a reference has to refer to a valid
object, and now he wonders whether he should test for a "null reference", at least with an
assert, wherever a function takes a reference parameter.

I've tried everything I can think of to return to some sanity, but to no avail. I've told
them:
- That the problem in the code above is not the reference but the dereference of a null
pointer, which is a normal bug that everyone knows to avoid.
- That I can't recall actually coming across a "null reference" bug in a real program in
over a decade of using C++.
- That there are a million things you can do that can cause undefined behaviour, so why be
specifically concerned about this one?
- That there are a multitude of ways that a bug can manifest itself long after the code
that caused it executes (e.g., keep a pointer to an object that is subsequently deleted),
so why be specifically concerned about this one?

They are still not convinced that the "null reference" is not a potential problem that
deserves some attention, and I'm looking for ideas as to what else I can say to get it off
the radar completely, where it belongs (that's if the people here agree with me of
course).

David
 
Alf P. Steinbach

It's formally undefined behavior to dereference a null pointer.

But it can happen, and the result of using that reference is that Bad
Things Happen.

Another way to obtain an invalid reference is to destroy an object that
some other part of the code holds a reference to (the simplest way to do
this is to return a reference to a local variable), which yields a
dangling reference.

References don't buy you technical safety: null-references and dangling
references can exist -- but null-references can't exist in a valid
program, and dangling references can't be used in a valid program.

Instead of technical safety, references buy you simplicity and clear
/communication of intent/, i.e. this is intended to never be null, plus
the possibility of unified notation (e.g. indexing, which can be applied
in template code), all of which in turn buys you safety and productivity.

When your function is passed a null-reference or dangling reference you
know that it's an error in the calling code. When a null-pointer or
dangling pointer occurs you don't necessarily know that it's an error in
the calling code. Perhaps you need to handle null-pointers (and the
result is messy checking and deciding what to do or not in that case,
which complicates things, and leads to more of the same, more bugs).
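
As a rough sketch of that difference (the helper names `text_length_or_zero` and `text_length` are made up here for illustration, they are not from any post in this thread): the pointer version has to decide what a null argument means, while the reference version has no such case to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: null is a possible input, so the function must
// decide what it means (here: treat it as an empty string).
std::size_t text_length_or_zero(const std::string* s)
{
    return s ? s->size() : 0;
}

// Hypothetical helper: the reference parameter says an object is required.
// A caller that cannot supply one has a bug on its side of the call.
std::size_t text_length(const std::string& s)
{
    return s.size();
}

// Usage sketch.
void length_example()
{
    std::string word("doctor");
    assert(text_length_or_zero(&word) == 6);
    assert(text_length_or_zero(nullptr) == 0);
    assert(text_length(word) == 6);
}
```

The extra branch in the pointer version is the kind of checking that, as far as I can tell, the paragraph above is warning about.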

Summing up, your first colleague was right that null-references can
exist, but not that they can exist in a valid program. And you were
right that that situation almost never occurs in practice. Because
apart from simpler notation, the point of a reference is to communicate
that it's intended to never be null or otherwise invalid, so nobody will
try to set it to null-reference.

Your colleague who maintained that references shouldn't be used in
safety-critical code got it exactly backwards.
 
David W

Alf P. Steinbach said:
It's formally undefined behavior to dereference a null pointer.

But it can happen, and the result of using that reference is that Bad
Things Happen.

Another way to obtain an invalid reference is to destroy an object that
some other part of the code holds a reference to (the simplest way to do
this is to return a reference to a local variable), which yields a
dangling reference.

References don't buy you technical safety: null-references and dangling
references can exist -- but null-references can't exist in a valid
program, and dangling references can't be used in a valid program.

Instead of technical safety, references buy you simplicity and clear
/communication of intent/, i.e. this is intended to never be null, plus
the possibility of unified notation (e.g. indexing, which can be applied
in template code), all of which in turn buys you safety and productivity.

Good point.

When your function is passed a null-reference or dangling reference you
know that it's an error in the calling code. When a null-pointer or
dangling pointer occurs you don't necessarily know that it's an error in
the calling code. Perhaps you need to handle null-pointers (and the
result is messy checking and deciding what to do or not in that case,
which complicates things, and leads to more of the same, more bugs).

Summing up, your first colleague was right that null-references can
exist,

Well, actually this started when I asked an interviewee what the differences were between
pointers and references, and he said that one difference was that a reference couldn't be
null. After the interview my colleague claimed that he was wrong. In the context of the
interview - pure C++ questions without considering the usual implementation on actual
compilers - I would say that my colleague was wrong.

but not that they can exist in a valid program. And you were
right that that situation almost never occurs in practice. Because
apart from simpler notation, the point of a reference is to communicate
that it's intended to never be null or otherwise invalid, so nobody will
try to set it to null-reference.

Good point.

Your colleague who maintained that references shouldn't be used in
safety-critical code got it exactly backwards.

Thank you.

David
 
Frederick Gotham

David W posted:

They are still not convinced that the "null reference" is not a
potential problem that deserves some attention, and I'm looking for
ideas as to what else I can say to get it off the radar completely,
where it belongs (that's if the people here agree with me of course).


You are correct. Your colleagues are wrong. Plain and simple.

(1) It's undefined behaviour to dereference a null pointer.
(2) It's undefined behaviour to have a null reference.

What you and your colleagues need to debate is NOT whether a null reference
is a good thing, but rather whether you want to write portable, Standard
C++-compliant code.

At the moment, it is NOT portable, Standard C++-compliant code.

If you want some sort of null reference maybe try something like:


struct AlignedByte {

    char unsigned * const p;

    AlignedByte() : p(new char unsigned) {}

    ~AlignedByte() { delete p; }

} aligned_byte;

template<class T>
inline T &NullRef()
{
    return reinterpret_cast<T&>(*aligned_byte.p);
}

template<class T>
inline bool IsNull(T &ref)
{
    return reinterpret_cast<char unsigned const*>(&ref) == aligned_byte.p;
}

/* Here comes the usage demonstration code */

#include <string>
using std::string;

void SomeFunc(string &arg)
{
    if (IsNull(arg)) return;

    arg += "success";
}

int main()
{
    string obj = "doctor";

    SomeFunc(obj);

    SomeFunc( NullRef<string>() );
}
 
Frederick Gotham

Frederick Gotham posted:
string obj = "doctor";


Oops, I was typing quickly and got carried away. I would never use that form
of initialisation for a class type, but rather:

string obj("doctor");
 
Michiel.Salters

Frederick said:
David W posted:


You are correct. Your colleagues are wrong. Plain and simple.

(1) It's undefined behaviour to dereference a null pointer.
(2) It's undefined behaviour to have a null reference.

No, there is no such thing as a "null reference", so it cannot be undefined
behavior to have one. There are no such things as invisible pink unicorns
in ISO C++ either, and therefore they aren't undefined behavior either.

Once you have undefined behavior in a C++ program, it stops being a
program that you can describe in standard C++ terms. "reference" in
this context is definitely a standard term, and doesn't apply anymore.
The best way to describe what actually happens at that point is
assembly terminology. E.g. one can talk about a SEGV.

HTH,
Michiel Salters
 
Julián Albo

David said:
void f( int& p )
{
printf( "%d\n", p );
}

int main (int argc, char *argv[])
{
int*p= NULL;

f( *p );

return 0;
}
in safety critical code". The other became concerned because he'd always
assumed that a reference has to refer to a valid object, and now he
wonders whether he should test for a "null reference", at least with an
assert, wherever a function takes a reference parameter.

A better solution can be:

#include <cassert>

void f (int * p)
{
    assert (p != 0); // Or throw an exception, or both
    int & ref= * p;
    (....)
}

With constructs like that, you isolate the parts of the code that use
references from the parts where pointers might be used without care.

Of course, it is better never to use pointers without care.
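
The "throw an exception" variant mentioned in the comment might look roughly like this. It is only a sketch: `update_from_pointer`, `double_it` and `add_one` are hypothetical names, and the choice of `std::invalid_argument` is an assumption, not something the thread prescribes.

```cpp
#include <cassert>
#include <stdexcept>

// Inner code takes references only and never tests for null.
// (double_it and add_one are hypothetical names for illustration.)
void double_it(int& value) { value *= 2; }
void add_one(int& value)   { value += 1; }

// Boundary function: the one place where a pointer is accepted.
// It is validated once, before it is turned into a reference.
void update_from_pointer(int* p)
{
    if (p == nullptr)
        throw std::invalid_argument("update_from_pointer: null pointer");
    int& ref = *p;   // p was checked on the lines above
    double_it(ref);
    add_one(ref);
}

// Usage sketch.
void boundary_example()
{
    int n = 20;
    update_from_pointer(&n);
    assert(n == 41);
}
```

Everything behind the boundary function can then take references without repeating the check.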
 
Noah Roberts

David said:
They are still not convinced that the "null reference" is not a potential problem that
deserves some attention, and I'm looking for ideas as to what else I can say to get it off
the radar completely, where it belongs (that's if the people here agree with me of
course).

This is just something I've come to expect in the world of
professional programming. I've often run into brick walls in
arguments I think should be cut and dry. For instance whether to make
use of exceptions (argument against is that they are "slow" and cost
extra even when not fired - yet we have to use the stdlib so we pay all
costs anyway...I digress), or making use of boost and the stdlib (I
used to have to justify every use of the stdlib classes vector and
string to the lead - less and less now but it is still annoying..we
still don't use boost because we don't use "3rd party libraries" (ask
me how many wheels I've invented)).

I've just grown to expect that people have their own opinions and
quite often those opinions are based on outdated facts or just plain
ignorance of the real facts. It's not worth fighting beyond explaining
the facts and letting the decision get made. If they want to pay you
to go around and change all references to pointers and place asserts
then so be it. If they want to cripple their ability to write secure
code to get secure code then it's not your place to do otherwise unless
you are the lead; just like I'm often crippling the compiler's ability
to optimize and being slow to develop by reinventing "fast" code all
the damn time. I try to convince them that I'm not smarter than the
compiler writers or the people who wrote our implementation's stdlib
but they consistently expect me to be and consistently keep me from
using good abstractions because they have some idea that they won't be
efficient - no, no profiling occurs.

When someone asks you why your code is inefficient and error prone you
can explain that you could have done better if you weren't stuck with a
crippled version of the language...assuming you can show this. Just
document every case where you could have done a better or faster job
given the right tools. You might also consider getting your resume
together again before you get a bunch of bad habits in you; I've been
putting mine together as I want to find some place where there are
people I can learn from and I feel I have reached an upper limit of
learning at my current job.

Teamwork is nice. I like working in teams instead of alone; for the
most part I like the team I am in. But sometimes you have to do stupid
things because that is what the team decides...even though that might
just be one or two people with bigger salaries than you and the rest of
the team. So, I've said my piece and you seem to have said yours. The
other side's argument might be dumb, and in this case it certainly is,
but there isn't much that can be done...they don't seem convincible in
either of our situations. That's life.
 
Noah Roberts

Noah said:
they don't seem convincible in
either of our situations. That's life.

Of course, you could ask them how you are supposed to overload
operators without using references. If they want to remove references
then you can't and you can mention the crippling of the entire std
library (most containers and algorithms will be unusable). If they
want to use references only where it has to be done then ask them what
steps will need be taken to assure that their "safety concerns" are
addressed in those rather common cases. Once that is done ask them why
they don't just use those steps anywhere a reference is used instead of
making policy that they are never used.

By making them face the fact that they have to use references or
reinvent the entire standard library you should be a long way toward
winning them over. Then by making them explain what would be needed to
make references safe they will likely convince themselves that they
already are and you'll start checking your pointers before turning them
into references.

If that doesn't work, nothing will.
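
To make the operator point concrete, here is a small sketch of the conventional signatures. `Counter` is a hypothetical type invented for this example; the point is only that a copy constructor, chained compound assignment and stream insertion are all written in terms of references.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical value type used only for illustration.
struct Counter {
    int value;

    explicit Counter(int v) : value(v) {}

    // A copy constructor has to take its argument by reference;
    // taking it by value would itself require a copy.
    Counter(const Counter& other) : value(other.value) {}

    // Compound assignment conventionally returns *this by reference
    // so that calls can be chained.
    Counter& operator+=(const Counter& rhs)
    {
        value += rhs.value;
        return *this;
    }
};

// Stream insertion takes and returns the stream by reference,
// because streams cannot be copied.
std::ostream& operator<<(std::ostream& os, const Counter& c)
{
    return os << c.value;
}

// Usage sketch.
void operator_example()
{
    Counter a(1);
    Counter b(2);
    (a += b) += b;          // chaining relies on the returned reference
    std::ostringstream out;
    out << a << ' ' << b;   // chaining relies on the returned stream reference
    assert(out.str() == "5 2");
}
```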
 
David W

Frederick Gotham said:
David W posted:




You are correct. Your colleagues are wrong. Plain and simple.

(1) It's undefined behaviour to dereference a null pointer.
(2) It's undefined behaviour to have a null reference.

What you and your colleagues need to debate is NOT whether a null reference
is a good thing,

We aren't debating that. No one wants to use a null reference. The debate is about the
possibility of creating one by accident and whether any effort should be made in
preventing it or detecting it. I have no problem with an assert to ensure that a pointer
is not null where it's dereferenced in places where the preceding code doesn't guarantee
that it's not null. But I have a big problem with wasting time and thought on references
as possibly being null or otherwise invalid. It's just too unlikely to bother with. If you
worry about null references then there are thousands of other equally unlikely potential
bugs you should worry about as well.

David
 
Noah Roberts

David said:
We aren't debating that. No one wants to use a null reference. The debate is about the
possibility of creating one by accident and whether any effort should be made in
preventing it or detecting it. I have no problem with an assert to ensure that a pointer
is not null where it's dereferenced in places where the preceding code doesn't guarantee
that it's not null. But I have a big problem with wasting time and thought on references
as possibly being null or otherwise invalid. It's just too unlikely to bother with. If you
worry about null references then there are thousands of other equally unlikely potential
bugs you should worry about as well.

The point is that there is only one way to create a null reference and
that is to dereference a null pointer. It is the dereferencing of the
null pointer that is the problem and at such a point the code is
_already_ generating undefined behavior. The test for null should be
at the point where the bug occurs, at the dereference of the pointer,
NOT at the reference. For one thing, at the later point the entire
thing is in an undefined state so the test for null is pretty much
pointless....it could pass and then proceed to blow up and still be
standard conformant.

Any program that has defined behavior will have a valid, non-null,
reference. Any program that has a null reference has been in an
undefined state for longer than that reference has existed. Catching a
null reference does not help you find the bug because, as was stated by
your opposition, the bug is somewhere else. Catch it where it happens,
not where you feel its effects.

In other words, where is the bug?

int * x = 0;
int & y = *x;
y = 5;

Now, insert a test where it _belongs_:

int * x = 0;
assert ( x );
int & y = *x;
y = 5;
 
David W

Noah Roberts said:
Of course, you could ask them how you are supposed to overload
operators without using references.

We haven't even got on to discussing where you can't do without references. Only one
person has recommended not using references (so far), and he's a firmware guy who rarely
writes C++ at all. I hope it doesn't go any further.

If they want to remove references
then you can't and you can mention the crippling of the entire std
library (most containers and algorithms will be unusable). If they
want to use references only where it has to be done then ask them what
steps will need be taken to assure that their "safety concerns" are
addressed in those rather common cases.

I imagine they will want to do something like:
void f( const int &n )
{
    assert(&n);
    // ...
}

It might look simple and harmless, but what bothers me is the thinking that leads one to
want to do this. One of the advantages of using a reference is not having to worry about
checking for null! I'm also concerned about what other influences this thinking might
have, such as possibly having a policy of not using references unless absolutely necessary
for fear of someone forgetting the assert. It also makes an assumption about the
compiler's implementation of references (an entirely reasonable one I admit, but I'm still
not comfortable with it).

(I hasten to point out that the code in my first post was my colleague's "proof" that a
null reference is possible; I wouldn't have used printf to make a C++ point!)

Once that is done ask them why
they don't just use those steps anywhere a reference is used instead of
making policy that they are never used.

By making them face the fact that they have to use references or
reinvent the entire standard library you should be a long way toward
winning them over. Then by making them explain what would be needed to
make references safe they will likely convince themselves that they
already are and you'll start checking your pointers before turning them
into references.

If that doesn't work, nothing will.

Thanks for your input.

David
 
David W

Noah Roberts said:
The point is that there is only one way to create a null reference and
that is to dereference a null pointer. It is the dereferencing of the
null pointer that is the problem and at such a point the code is
_already_ generating undefined behavior. The test for null should be
at the point where the bug occurs, at the dereference of the pointer,
NOT at the reference.

Exactly. I made this point in the strongest possible terms at the outset, by quoting the
standard:
(8.3.2/4) ... A reference shall be initialized to refer to a valid object or function.
[Note: in particular, a null reference cannot exist in a well-defined program, because the
only way to create such a reference would be to bind it to the "object" obtained by
dereferencing a null pointer, which causes undefined behavior. ...]

For one thing, at the later point the entire
thing is in an undefined state so the test for null is pretty much
pointless....it could pass and then proceed to blow up and still be
standard conformant.

Yes, but my colleagues are thinking about our practical circumstances. They know how the
compiler works and are more concerned with that than with what the standard says, which
they regard as theoretical and not reflecting reality. They know the program will work as
expected until the reference is used. They are right, so it's difficult to argue against
this.

Any program that has defined behavior will have a valid, non-null,
reference. Any program that has a null reference has been in an
undefined state for longer than that reference has existed. Catching a
null reference does not help you find the bug because, as was stated by
your opposition, the bug is somewhere else. Catch it where it happens,
not where you feel its effects.

In other words, where is the bug?

int * x = 0;
int & y = *x;
y = 5;

Now, insert a test where it _belongs_:

int * x = 0;
assert ( x );
int & y = *x;
y = 5;

I agree. They might argue that you can't guarantee that everyone will remember the assert
every time, but if I'm lucky perhaps that won't occur to them.
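
One possible answer to the "someone will forget the assert" objection is to route every pointer-to-reference conversion through a single helper. This is only a sketch: `checked_deref` is a hypothetical name, and throwing `std::invalid_argument` instead of asserting is an assumption made so that the failure path can be exercised.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: the single place where a pointer is turned into a
// reference, so the null check cannot be forgotten at individual call sites.
template <class T>
T& checked_deref(T* p)
{
    if (p == nullptr)
        throw std::invalid_argument("checked_deref: null pointer");
    return *p;
}

// Usage sketch, mirroring the x / y snippet quoted above.
void checked_deref_example()
{
    int x = 0;
    int& y = checked_deref(&x);   // checked here, where the dereference happens
    y = 5;
    assert(x == 5);
}
```

A code review rule such as "no bare `*p` when binding a reference" would then be easier to enforce than "remember the assert".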

David
 
Frederick Gotham

David W posted:
Exactly. I made this point in the strongest possible terms at the
outset, by quoting the standard:
(8.3.2.4) ... A reference shall be initialized to refer to a valid
object or function. [Note: in particular, a null reference cannot exist
in a well-defined program, because the only way to create such a
reference would be to bind it to the "object" obtained by dereferencing
a null pointer, which causes undefined behavior. ...]

Beautiful.



Yes, but my colleagues are thinking about our practical circumstances.
They know how the compiler works and are more concerned with that than
with what the standard says, which they regard as theoretical and not
reflecting reality. They know the program will work as expected until
the reference is used. They are right, so it's difficult to argue
against this.


OK, so their code works on a given platform -- BRILLIANT.

As far as the Standard is concerned, it's non-portable.

Any decision to continue using "null references" is a conscious decision to
flout the Standard (unless you plan on submitting a proposal... ?)

So what's more important:

(1) Using null references.
(2) Abiding by the Standard.
 
Ian Collins

David said:
The point is that there is only one way to create a null reference and
that is to dereference a null pointer. It is the dereferencing of the
null pointer that is the problem and at such a point the code is
_already_ generating undefined behavior. The test for null should be
at the point where the bug occurs, at the dereference of the pointer,
NOT at the reference.


Exactly. I made this point in the strongest possible terms at the outset, by quoting the
standard:
(8.3.2.4) ... A reference shall be initialized to refer to a valid object or function.
[Note: in particular, a null reference cannot exist in a well-defined program, because the
only way to create such a reference would be to bind it to the "object" obtained by
dereferencing a null pointer, which causes undefined behavior. ...]

For one thing, at the later point the entire
thing is in an undefined state so the test for null is pretty much
pointless....it could pass and then proceed to blow up and still be
standard conformant.


Yes, but my colleagues are thinking about our practical circumstances. They know how the
compiler works and are more concerned with that than with what the standard says, which
they regard as theoretical and not reflecting reality. They know the program will work as
expected until the reference is used. They are right, so it's difficult to argue against
this.

What exactly does the working of this particular version of your compiler
have to do with creating a null reference?

They would be better off preventing the condition arising than
attempting to detect it. The standard isn't theoretical, there is only
one way to create a null reference. Just ensure you have comprehensive
unit tests.
 
David W

Ian Collins said:
What exactly does the working of this particular version of your compiler
have to do with creating a null reference?

They expect that references will be implemented as pointers in all future versions, and
that binding a reference to a dereferenced pointer will not actually perform the
dereference, and they are almost certainly right. That's the problem, really. There's no
other reasonable implementation and they know it.

They would be better off preventing the condition arising than
attempting to detect it. The standard isn't theoretical, there is only
one way to create a null reference. Just ensure you have comprehensive
unit tests.

Yes, checking at the point of dereference should be sufficient.

David
 
Guest

Frederick said:
David W posted:




You are correct. Your colleagues are wrong. Plain and simple.

(1) It's undefined behaviour to dereference a null pointer.
(2) It's undefined behaviour to have a null reference.

Just a nit, but isn't it possible to get a (platform-specific) null
reference without dereferencing a null pointer like this?

#include <cstring>
int main() {
    struct { int &r, m; } s = { s.m };
    std::memset(&s, '\0', sizeof(s));
    /* s.r is probably a null reference now, even if you can't portably
       rely on that */
    // ...
}

And if you do it like this, and don't ever use s.r in any way, at what
point is the behaviour undefined?
 
Alf P. Steinbach

* Harald van Dijk:
Just a nit, but isn't it possible to get a (platform-specific) null
reference without dereferencing a null pointer like this?

#include <cstring>
int main() {
struct { int &r, m; } s = { s.m };

No, s is not of aggregate type. But that's just a nit... ;-) Put in a
default constructor and leave out the initializer.

std::memset(&s, '\0', sizeof(s));
/* s.r is probably a null reference now, even if you can't portably
rely on that */
// ...
}

And if you do it like this, and don't ever use s.r in any way, at what
point is the behaviour undefined?

Undefined behavior is undefined even if a compiler defines it.

Naturally, because otherwise there would be /no/ undefined behavior: any
construct that has formally undefined behavior results in some concrete
behavior on a concrete system, and that concrete behavior is a
definition of the behavior on that system.

However, a great many people have difficulty understanding this. For
example, there is an ongoing thread in [comp.std.c++] right now, where
this is the subject of a sub-thread. I think my explanation above, a
reduction to the absurd of the opposite position, is so simple that
anybody should understand it, but if not, I'm sure others will step in.
 
Noah Roberts

David said:
Yes, checking at the point of dereference should be sufficient.

It is, quite frankly, the only valid way. Just look at the hackery
required to check a "null" reference. You can't say assert( ref ).
You have to say assert( &ref ). This is just plain nonsensical. Can
they guarantee that will always work the way they expect and doesn't
cause undefined behavior of its own? Remember, the program is
_already_ in an undefined state if that assert fails and God only knows
what kind of behavior you will experience. There are things an
implementation could do that would cause you to crash before the assert
ever gets a chance to fire.

You don't assert that a reference is in a real location just like you
don't assert that a variable is in a real location. If your team is
really concerned that there could be a lot of null references in their
programs then they are writing erroneous code and not using pointers
correctly. There are NUMEROUS steps they should be taking already that
would make a null reference _impossible_ to occur and if they are
worried, then they are not taking those steps.

Your team is depending on undefined behavior _when_they_don't_have_to_.
It is already more correct and well-defined to assert a pointer's
validity while it is a pointer and this is when it should be being
done. Instead your team wishes to rely on undefined behavior to check
that pointer's validity later, when it is no longer a pointer??!!

You should be filling out your resume. Get out of that place.
 
