Testing if a pointer is valid

ImpalerCore

Writing them in-line would be more error-prone.  Making them functions
lets you give each operation a name.

Right, encapsulating the operation as a function assigns a semantic
name to an underlying operation.  It is a textbook case of information
hiding.  The value of the routine is that if the internal structure
name changes for any reason, the name change only needs to be done to
the access routines, not everywhere in the code.  It can provide more
readability in the general case where you may find
'struct->what->does->this->mean'.  The compiler is typically clever
enough in most cases to optimize the function call away.

How would a macro be better than an inline function?

I find a macro to be more beneficial when one wants compiler
granularity to enable/disable the function.  For access routines to a
data structure, definitely not.

Actually, that's not entirely true. I use macros to access generic
containers with void* where the type information is passed on to
another function that uses 'sizeof (type)'.

\code snippet
void* gc_array_front( struct c_array* array )
{
  void* p = NULL;

  if ( array->size ) {
    p = array->buffer;
  }

  return p;
}

#define c_array_front( array, type ) \
  ( (type*)(gc_array_front( (array) )) )
\endcode

If I store the 'sizeof (type)' in the array structure itself, it's
possible to assert that the internal element size assigned when the
array was allocated matches the size of the type used in the access
function.

\code snippet
void* gc_array_front( struct c_array* array, size_t size )
{
  void* p = NULL;

  c_return_value_if_fail( array != NULL, NULL );
  c_return_value_if_fail( size == array->element_size, NULL );

  if ( array->size ) {
    p = array->buffer;
  }

  return p;
}

#define c_array_front( array, type ) \
  ( (type*)(gc_array_front( (array), sizeof (type) )) )
\endcode

This can generate a constraint violation for c_array_front( ar, char )
if the original container array contains objects of type 'double', but
it cannot distinguish between invalid casting between 'int' and
'float' (if 'sizeof (int) == sizeof (float)').

It's a very special case though :)
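For anyone who wants to try the size check, here is a minimal self-contained sketch. The struct layout, the allocator and c_array_free are assumptions made for illustration (the real container is not shown in the post), and the c_return_value_if_fail macro is replaced by a plain 'if'.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal container: only the fields the check needs. */
struct c_array {
    void  *buffer;        /* element storage */
    size_t size;          /* number of elements in use */
    size_t element_size;  /* sizeof (type) recorded at allocation */
};

/* Assumed allocator: records the element size for later checks. */
static struct c_array *gc_array_alloc(size_t element_size, size_t n)
{
    struct c_array *array = malloc(sizeof *array);
    if (array == NULL)
        return NULL;
    array->buffer = calloc(n ? n : 1, element_size);
    if (array->buffer == NULL) {
        free(array);
        return NULL;
    }
    array->size = n;
    array->element_size = element_size;
    return array;
}

static void c_array_free(struct c_array *array)
{
    if (array != NULL) {
        free(array->buffer);
        free(array);
    }
}

/* Returns NULL when the caller's type size does not match the stored one. */
static void *gc_array_front(struct c_array *array, size_t size)
{
    if (array == NULL || size != array->element_size)
        return NULL;
    return array->size ? array->buffer : NULL;
}

#define c_array_new(type, n) \
    (gc_array_alloc(sizeof (type), (n)))

#define c_array_front(array, type) \
    ((type *)gc_array_front((array), sizeof (type)))
```

With an array created by c_array_new( double, 4 ), c_array_front( ar, double ) returns the buffer, while c_array_front( ar, char ) returns NULL because the sizes differ. Two types of equal size still slip through, as noted above.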
 
jacob navia

On 19/09/11 23:30, Ben Bacarisse wrote:
I doubt that very much. Computers are predictable beasts, and you will
have these numbers dotted around in memory. They could get copied by
buggy code or they may simply be left lying around in deceptive places:

mytype *f(void)
{
    mytype t;
    /* ... */
    return &t;
}

A function that uses the return from f may well find the magic number
still in place.


Sure, but most compilers nowadays emit a warning when they see such
code.

Look, I do NOT have a solution for finding ALL bugs. If I did,
I would be a multimillionaire already and wouldn't be here...

I am not saying the technique has no merit

Thanks. That is all I am saying actually. It is NOT *THE*
silver bullet

(I happen to have been
looking at some code that does exactly this a couple of days ago) but
the probability of false positives is very unlikely to be as low as you
hope.

It is very low, for well chosen magic numbers WITH versions of free()
that clean up the released memory.
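To make the technique under discussion concrete, here is a rough sketch of a magic-number check. The struct, the function names and the magic value are all made up for illustration; nothing here is taken from an actual library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary illustrative value; any hard-to-guess 64-bit constant would do. */
#define HANDLE_MAGIC UINT64_C(0x9E3779B97F4A7C15)

struct handle {
    uint64_t magic;   /* HANDLE_MAGIC while the object is live */
    int      value;
};

static struct handle *handle_new(int value)
{
    struct handle *h = malloc(sizeof *h);
    if (h != NULL) {
        h->magic = HANDLE_MAGIC;
        h->value = value;
    }
    return h;
}

/* Heuristic only: a stale copy of a live object passes this test too. */
static int handle_looks_valid(const struct handle *h)
{
    return h != NULL && h->magic == HANDLE_MAGIC;
}

/* Clear the magic number. The volatile access is there so the compiler
   does not drop the store as dead when it is followed by free(). */
static void handle_scrub(struct handle *h)
{
    *(volatile uint64_t *)&h->magic = 0;
}

static void handle_free(struct handle *h)
{
    if (h != NULL) {
        handle_scrub(h);
        free(h);
    }
}
```

A live handle passes handle_looks_valid() and a scrubbed one does not. A by-value copy made while the object was live (struct handle stale = *h;) keeps passing after the original is scrubbed, which is the kind of false positive raised earlier in the thread. Calling the check on a pointer that has already been passed to free() is undefined behaviour, so the sketch cannot say anything about that case.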
 
Ian Collins

Actually, that's not entirely true. I use macros to access generic
containers with void* where the type information is passed on to
another function that uses 'sizeof (type)'.

\code snippet
void* gc_array_front( struct c_array* array )
{
  void* p = NULL;

  if ( array->size ) {
    p = array->buffer;
  }

  return p;
}

#define c_array_front( array, type ) \
  ( (type*)(gc_array_front( (array) )) )
\endcode

If I store the 'sizeof (type)' in the array structure itself, it's
possible to assert that the internal element size assigned when the
array was allocated matches the size of the type used in the access
function.

\code snippet
void* gc_array_front( struct c_array* array, size_t size )
{
  void* p = NULL;

  c_return_value_if_fail( array != NULL, NULL );
  c_return_value_if_fail( size == array->element_size, NULL );

  if ( array->size ) {
    p = array->buffer;
  }

  return p;
}

#define c_array_front( array, type ) \
  ( (type*)(gc_array_front( (array), sizeof (type) )) )
\endcode

This can generate a constraint violation for c_array_front( ar, char )
if the original container array contains objects of type 'double', but
it cannot distinguish between invalid casting between 'int' and
'float' (if 'sizeof (int) == sizeof (float)').

It's a very special case though :)

Sometimes somewhat disparagingly (but accurately) known as "poor man's
templates" :)
 
Ike Naar

On 19/09/11 22:50, Ike Naar wrote:

Well, the chances of hitting a 64 bit number by chance for a well chosen
magic number are 1 in 2^64...

Quite low really

Can you give some details about how your magic numbers are chosen?
Can they cope with, for instance, the following situation:

SomeType *a = malloc(10000 * sizeof *a);
if (a != NULL)
{
    SomeType *p = a + 5000; /* p now a good pointer */
    *p = SomeGoodValue();   /* good data and good magic number
                               at location p */
    free(a);
    *p; /* should fail, because p is now a bad pointer; will it fail?
           has free() modified the data or magic number at location p? */
    /* ... */
    *p; /* what if the data at location p has become garbled, but the
           magic number has not? will you accept the bad data? */
}

or perhaps this situation:

SomeType a[1], b[2];    /* suppose a+2 and b+1 are
                           the same memory location */
b[1] = SomeGoodValue(); /* good value and magic number at b[1] */
a[2]; /* dereferencing bad pointer, but will that be detected?
         after all, there's good data and a good checksum there */
 
jacob navia

On 20/09/11 00:43, Ike Naar wrote:
Can you give some details about how your magic numbers are chosen?
Can they cope with, for instance, the following situation:

SomeType *a = malloc(10000 * sizeof *a);
if (a != NULL)
{
    SomeType *p = a + 5000; /* p now a good pointer */
    *p = SomeGoodValue();   /* good data and good magic number
                               at location p */
    free(a);
    *p; /* should fail, because p is now a bad pointer; will it fail?
           has free() modified the data or magic number at location p? */
    /* ... */
    *p; /* what if the data at location p has become garbled, but the
           magic number has not? will you accept the bad data? */
}

Most versions of free() destroy the memory contents. If they don't, they
should be replaced ASAP, at least in a debug setting.

or perhaps this situation:

SomeType a[1], b[2];    /* suppose a+2 and b+1 are
                           the same memory location */
b[1] = SomeGoodValue(); /* good value and magic number at b[1] */
a[2]; /* dereferencing bad pointer, but will that be detected?
         after all, there's good data and a good checksum there */

Sorry, but I do not have a solution for fixing all possible bugs. If I
did, I would be richer than the owners of Google.


What do you expect? Case two is not really a library problem: you
pass it a correct pointer to a correct type, it can't possibly
fix that bug!
 
Ben Bacarisse

jacob navia said:
On 19/09/11 23:30, Ben Bacarisse wrote:

Sure, but most compilers nowadays emit a warning when they see such
code.

No compiler can emit a warning in all such cases. Obviously I did not
want to suggest that this simple function was a problem -- it's just
illustrative of the fact that the chance of seeing a valid magic number
is not anything like as low as 1 in 2**64.

Look, I do NOT have a solution for finding ALL bugs. If I did,
I would be a multimillionaire already and wouldn't be here...

Quite. I was commenting on your suggestion about the probabilities, and
making no comment about it not finding all bugs.

<snip>
 
Jorgen Grahn

Just as a note, I have found that using a void * pointer as a handle
is not all that great an idea. Better is to use a small struct that
is type specific. Then you can type check and, with a small bit of
thought, do type specific sanity checks. YMMV.

(Or "incomplete types", or whatever they are called, when you deal with
pointers to 'struct Foo' without having seen the definition of struct
Foo.)

Yes -- I'd be much more annoyed by a library which has a weak
interface with respect to types and constness, than by a library
which doesn't check that my pointers go somewhere meaningful.
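A small sketch of what the incomplete-type approach can look like, with the header part and the implementation part shown in one file for brevity. The names (struct Foo, foo_create and friends) are hypothetical.

```c
#include <stdlib.h>

/* --- what a public header might expose --- */
struct Foo;                               /* incomplete type: layout hidden */

struct Foo *foo_create(int x);
int         foo_get(const struct Foo *f); /* const-correct accessor */
void        foo_destroy(struct Foo *f);

/* --- what the library's own .c file would contain --- */
struct Foo {
    int x;
};

struct Foo *foo_create(int x)
{
    struct Foo *f = malloc(sizeof *f);
    if (f != NULL)
        f->x = x;
    return f;
}

int foo_get(const struct Foo *f)
{
    return f->x;
}

void foo_destroy(struct Foo *f)
{
    free(f);
}
```

Client code that only sees the declarations cannot touch the members, and passing a pointer to some other struct type where a struct Foo * is expected draws a diagnostic from the compiler. A void * handle would accept that call silently.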

/Jorgen
 
Ian Collins

Most versions of free() destroy the memory contents. If they don't they
should be replaced ASAP at least in a debug setting.

Really? I've yet to see one that does, the overhead would be dramatic
for large blocks. Not to mention knowing how to link a different free
from the standard library for debugging.

I did write an allocator that invalidated the memory segment on free,
which was an extremely effective (and efficient) debug aid. It was
limited by the 386 LDT to 4096 concurrent allocations, but that was
plenty in the application in question.
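For comparison, a much cruder debug aid than segment invalidation is a wrapper that overwrites a block with a fill pattern before releasing it. The sketch below is only that simpler variant, not the LDT-based allocator described above; the dbg_* names and the 0xDD fill byte are assumptions, and max_align_t needs C11.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DBG_POISON 0xDD   /* arbitrary fill byte */

/* Header stored in front of each block; the union keeps the payload
   suitably aligned for any object type. */
union dbg_header {
    size_t      size;
    max_align_t align;
};

static void *dbg_malloc(size_t size)
{
    union dbg_header *h;

    if (size > SIZE_MAX - sizeof *h)
        return NULL;
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                 /* payload starts after the header */
}

/* Size that was requested for a block returned by dbg_malloc(). */
static size_t dbg_size(const void *p)
{
    return ((const union dbg_header *)p - 1)->size;
}

/* Overwrite the payload. Split out so the effect can be observed while
   the block is still allocated. */
static void dbg_poison(void *p)
{
    memset(p, DBG_POISON, dbg_size(p));
}

/* Note: an optimizer may drop a memset that is immediately followed by
   free(); a real debug build would need to prevent that. */
static void dbg_free(void *p)
{
    if (p != NULL) {
        dbg_poison(p);
        free((union dbg_header *)p - 1);
    }
}
```

The cost is one pass over the block per free, which is the overhead objected to above for large blocks. Reading the block after dbg_free() is still undefined behaviour; the pattern only makes stale data more likely to look wrong.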
 
Keith Thompson

Ian Collins said:
Really? I've yet to see one that does, the overhead would be dramatic
for large blocks. Not to mention knowing how to link a different free
from the standard library for debugging.
[...]

Experiment shows that the glibc implementation of free() doesn't
modify the contents of the released memory. (Of course the behavior
of my experimental program is undefined.)
 
Kleuskes & Moos

Kleuskes & Moos said:
On 9/18/2011 20:32, Keith Thompson wrote:
(e-mail address removed) (Richard Harter) writes: [...]
Any language in which it is possible for a pointer to be invalid but
which provides no way to test for validity is fundamentally flawed.

What existing language has pointers and is not "fundamentally
flawed"?

Indeed. Testing a pointer by simple inspection can tell you whether it
is null or not. If it is null it is not valid. If not you simply can't
tell.

I actually consider NULL a valid value for a pointer, and of course
checking for NULL pointers, if called for, is not what i had in mind.

The phrase "valid pointer" is context-dependent.

Aye. But "valid value for a pointer" is not.

NULL is not valid as
the operand of the unary "*" operator; it's perfectly valid as the
argument of free().

True, but that's not what i said. 0 is a valid value for an integer, while
it's an invalid (rhs) operand for '/' and '%'.

 
Kleuskes & Moos

On 19/09/11 19:42, Kenneth Brody wrote:
Well, it depends where the point of failure is.

Suppose your library is expecting a pointer to "a" and is passed a
pointer to "b", a similar structure but with slightly different layout.

1) The program crashes deep inside the library when processing the
pointer data. You have a stack trace that tells you:

0x488776AAD()
0x446655FF4()
MyFunction() myfunction.c 476
main() main.c 876

Now what?

Point the debugger (by JTAG if need be) at the stackframe containing
'MyFunction' and check the arguments used against the parameters required *).

2) When you pass a bad pointer the library opens a dialog box:
"Invalid argument to function Library_input(), argument two"

If the library already invoked undefined behavior (by crashing)
the state of your program is also undefined and opening a dialog
box may well result in another crash, thus hiding the details of
the original crash and preventing debugging most effectively.

I would, personally, object to using any library that does that.

You see?

Nope.

A crash isn't always a good thing.

A crash is never a good thing, but it's preferable to continuing
in an undefined state.

*) simple... The _real_ nasties require whipping out logic-analyzer
and squinting at the traffic on the address and data buses and/or
other data-lines, but they don't usually involve prefab libs. If you're
_really_ unlucky, there's only an old-fashioned oscilloscope or even two
leds available. Been there, done that and earned a decent living doing it.

 
Ike Naar

Sorry but I do not have a solution for fixing all possible bugs. If I
would I would be richer than the owners of Google.

That sounds a lot more realistic than believing 64-bit
magic numbers reduce the chances of false positives to 2^-64.
 
jacob navia

On 20/09/11 09:16, Kleuskes & Moos wrote:
Point the debugger (by JTAG if need be) at the stackframe containing
'MyFunction' and check the arguments used against the parameters required *).

Yes, sure. You see two arguments. Are the pointers bad? The first comes
from an argument to this function, where was that created?

Up the stack.

Get the docs for the call to the library. Ahhh the two arguments
look ok... Maybe this is the consequence of a bug I corrected yesterday.

Let's come back to the situation before those changes.

Pull that version. Ahh Crash also, it wasn't that.
Let's look better at the documentation for that damned library.

Ahhhh the arguments are reversed, damn it!

I found it after 1 lost day. Well, that is developing.


If the library already invoked undefined behavior (by crashing)
the state of your program is also undefined and opening a dialog
box may well result in another crash, thus hiding the details of
the original crash and preventing debugging most effectively.

No. There isn't (precisely) any crash. The library cleanly reports a
failure in your program.

I would, personally, object to using any library that does that.

Nope.

Of course, you like software that crashes like that. OK.

A crash is never a good thing, but it's preferable to continuing
in an undefined state.

Maybe, maybe not. It depends on the importance of the library in the
software.

For instance, if you are doing a GPS application, a crash in the
graphics module that scrolls the map is not that important,
the GPS can continue to work... The main application is a module
that warns the driver if he/she goes beyond the speed limit for the
road. The map is just "to be nice".
 
Kleuskes & Moos

On 20/09/11 09:16, Kleuskes & Moos wrote:
Yes, sure. You see two arguments. Are the pointers bad? The first comes
from an argument to this function, where was that created?

How should I know? It's your example and I'm no psychic.

Up the stack.

Get the docs for the call to the library. Ahhh the two arguments look
ok... Maybe this is the consequence of a bug I corrected yesterday.

In that case you haven't fixed it.

Let's come back to the situation before those changes.

Pull that version. Ahh Crash also, it wasn't that. Let's look better at
the documentation for that damned library.

Ahhhh the arguments are reversed, damn it!

I found it after 1 lost day. Well, that is developing.

True, and if that took you a full day, don't apply for a job in the
firm i work for.

No. There isn't (precisely) any crash. The library cleanly reports a
failure in your program.

The premise of your example was "The program crashes deep inside the
library when processing the pointer data". You're shifting goalposts,
unless you have secretly invented some mathemagical way of detecting
bad pointers. I'm sure everybody would be greatly interested.

Of course, you like software that crashes like that. OK.

Yes, for the reasons already mentioned upthread several times.

Maybe, maybe not. It depends on the importance of the library in the
software.

Always. The one thing worse than a crash is an undefined state in which
your software might do *anything*. In one of the applications i worked on,
it could kill an unsuspecting person, while a crash would cause a
watchdog-timer to trigger, restarting the app and signaling the problem.

For instance, if you are doing a GPS application, a crash in the
graphics module that scrolls the map is not that important, the GPS can
continue to work... The main application is a module that warns the
driver if he/she goes beyond the speed limit for the road. The map is
just "to be nice".

So? Place the GPS-code in one process and the map in another, use a
watchdog-timer to restart. Software separated logically, problem solved.
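As a rough illustration of that separation, here is a POSIX-only sketch in which a parent process runs the non-critical part in a child and restarts it when it dies. Everything in it (run_supervised, flaky_map_renderer, the restart limit) is invented for the example, and it is only a process-level stand-in: a real watchdog timer also catches hangs, which this does not.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*worker_fn)(int attempt);

/* Run worker in a child process, restarting it after a crash or a
   non-zero exit. Returns the number of restarts that were needed, or -1
   if fork/waitpid failed or the worker never finished cleanly. */
static int run_supervised(worker_fn worker, int max_restarts)
{
    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(attempt));   /* child: never returns to caller */

        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;           /* clean finish */
        /* killed by a signal or failed: fall through and restart */
    }
    return -1;
}

/* Hypothetical non-critical module that crashes on its first run. */
static int flaky_map_renderer(int attempt)
{
    if (attempt == 0)
        abort();                      /* simulated crash */
    return 0;
}
```

The crash stays confined to the child's address space, so the supervising process (the GPS code in the example above) never continues in a corrupted state of its own.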

Now consider your approach in a pace-maker going haywire due to your continuing
in an undefined state and placing random pulses on the atriums and ventricles.
I'm very sure neither the patient, nor the MD, your boss or even you will be
very happy with the result*).

So, no. Your approach is anything but desirable.

*) in fact, there's a last ditch electronic defense prohibiting pulses being sent
too fast, but that's no absolute guarantee. Ill timed pulses may still provoke a
cardiac arrest or other serious problems.

 
Malcolm McLean

Always. The one thing worse than a crash is an undefined state in which
your software might do *anything*. In one of the applications i worked at,
it could kill an unsuspecting person, while a crash would cause a
watchdog-timer to trigger, restarting the app and signaling the problem.

That's not always true. There's no point deliberately crashing a video
game, for instance.

The hard problem is whether or not to allow an emergency save. If you
don't allow it, the user will lose all the work he did before the last
save. If you do allow it, you might save corrupted data which could
cause far more serious problems further down the line.
 
Kleuskes & Moos

That's not always true. There's no point deliberately crashing a video
game, for instance.

Better than continuing in an undefined state and opening your platform up
for possible exploits (e.g. buffer overrun). Besides, "deliberately crashing"
differs from "crashing on a bad pointer", which was the subject at hand.

Apart from that, there are few persons who would consider a video-game to be
critical for anything. Worst that can happen is a frustrated user demolishing
an innocent keyboard and making an ass of himself on YouTube.

The hard problem is whether or not to allow an emergency save. If you
don't allow it, the user will lose all the work he did before the last
save. If you do allow it, you might save corrupted data which could
cause far more serious problems further down the line.

The latter is sufficient reason _not_ to allow that, unless you can guarantee
the data to be saved isn't corrupted. The proper way to do things _like_ that
is to generate a dump _in_a_separate_file_, which might later be used to
restore at least some of the previous state.
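A minimal sketch of the separate-file idea, assuming a hypothetical emergency_dump() helper and a ".dump" suffix chosen only for illustration. The point is that the last good save is never opened for writing.

```c
#include <stdio.h>
#include <string.h>

/* Write len bytes of possibly suspect state to "<save_path>.dump".
   The file at save_path itself is never opened, so the last good save
   stays intact. Returns 0 on success, -1 on failure. */
static int emergency_dump(const char *save_path, const void *data, size_t len)
{
    char dump_path[512];
    FILE *fp;
    size_t written;
    int n;

    n = snprintf(dump_path, sizeof dump_path, "%s.dump", save_path);
    if (n < 0 || (size_t)n >= sizeof dump_path)
        return -1;                    /* path too long for the buffer */

    fp = fopen(dump_path, "wb");
    if (fp == NULL)
        return -1;

    written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```

Whether the dump can be trusted is a separate question; any recovery code that reads it back should validate the contents before using them.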

But things like that only come into play if your app isn't really critical to
anyone and you have a damned good reason for doing so.

 
Malcolm McLean

The latter is sufficient reason _not_ to allow that, unless you can guarantee
the data to be saved isn't corrupted. The proper way to do things _like_ that
is to generate a dump _in_a_separate_file_, which might later be used to
restore at least some of the previous state.

But things like that only come into play if your app isn't really critical to
anyone and you have a damned good reason for doing so.

Often you don't know how the app will be used. For instance a
spreadsheet might be used for an undergraduate research project, or it
might hold details of cancer patients in a drugs trial. The
requirements are the same.

Now let's say the program detects an invalid pointer. If it crashes
out, it destroys maybe six hours work by the undergraduate. That's a
major irritation. He won't use that package again and you've lost a
customer. If you offer to save, of course you'll warn "this data may
be corrupted". One would hope that the cancer trial people have
procedures. But a harassed clerk has just spent half an hour entering
new drug data. She chooses the save option, looks at the data, it
seems OK. 99% of the time, she'll be right to accept it. Will the
cancer trial people really catch that one? Is it your responsibility,
as spreadsheet programmer?
 
Kenny McCormack

Joe Pfeiffer said:
And in which those "pointers" have semantics somewhat different from
what C calls pointers. And, I expect, semantics somewhat different from
what most readers of this newsgroup infer from the word pointers.

Quite so.

I know of a language that has "pointers" - they call them pointers and use
the same "*" and "&" syntax that C uses - but these pointers are entirely safe
to use.

--
Is God willing to prevent evil, but not able? Then he is not omnipotent.
Is he able, but not willing? Then he is malevolent.
Is he both able and willing? Then whence cometh evil?
Is he neither able nor willing? Then why call him God?
~ Epicurus
 
ImpalerCore

Sometimes somewhat disparagingly (but accurately) known as "poor man's
templates" :)

Yeah, but sometimes "poor man's templates" can be still pretty good,
at least in C. While the macro technique sure doesn't beat C++
templates with its compile-time type verification, you can still
abstract a lot of the container grunt work that you see over and over
again that doesn't depend on type. From my experiments, using a macro
API to abstract access to type information was miles better than
sprinkling manual type-casting of void* pointers. It allows one to
have an opportunity to do some semblance of type-checking.

For the auto-resizing array, you can provide some type checking since
the container is localized in a single struct. Using a macro, one can
do something like the following.

\code snippet
struct c_array* gc_array_alloc( size_t sz, size_t n, const char* name )
{
  struct c_array* array = NULL;
  ...

  /* alloc and assign 'name' to 'typename' struct member */
}

void* gc_array_front( struct c_array* array, const char* name )
{
  void* p = NULL;

  c_return_value_if_fail( array != NULL, NULL );
  c_return_value_if_fail( strcmp( name, array->typename ) == 0, NULL );

  if ( array->size ) {
    p = array->buffer;
  }

  return p;
}

#define c_array_new( type, n ) \
  (gc_array_alloc( sizeof (type), (n), #type ))

#define c_array_front( array, type ) \
  ( (type*)(gc_array_front( (array), #type )) )
\endcode

I found this kind of setup to be quite fragile and computationally
expensive. You can try to improve upon strcmp with a string
comparison that ignores spaces, but it still can't deal with
typedefs. For this reason, I settled on using 'sizeof' as a hack to
verify that the type argument matches the array's internal type (at
least in size).

If there was a compiler extension that assigned a unique id to a
compiler type (kind of like a compile-time typeid), and it handled
typedefs properly as well, one could architect low cost run-time type
checking constraints into the generic array interface fairly easily,
ideally solving most of the type-mismatch issues with 'void*'. It
might even be cheap enough to use it in distributed containers like
linked lists and trees, provided users would use a macro API instead
of manual casting.
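Lacking such an extension, one workaround is to hand-assign ids and look them up by token pasting. The sketch below is only that: the C_TYPE_ID_* macros, the struct layout and the my_index typedef are invented for the example, and it works only for type names that are a single token (so not 'unsigned long' or 'struct foo').

```c
#include <stddef.h>
#include <stdlib.h>

/* Hand-assigned ids, one macro per supported single-token type name. */
#define C_TYPE_ID_int     1
#define C_TYPE_ID_float   2
#define C_TYPE_ID_double  3

/* A typedef has to be registered by hand as an alias of its base type. */
typedef int my_index;
#define C_TYPE_ID_my_index  C_TYPE_ID_int

#define C_TYPE_ID(type)  C_TYPE_ID_##type

/* Hypothetical minimal container carrying the id instead of a name. */
struct c_array {
    void  *buffer;
    size_t size;
    int    type_id;
};

static struct c_array *gc_array_alloc(size_t sz, size_t n, int type_id)
{
    struct c_array *array = malloc(sizeof *array);
    if (array == NULL)
        return NULL;
    array->buffer = calloc(n ? n : 1, sz);
    if (array->buffer == NULL) {
        free(array);
        return NULL;
    }
    array->size = n;
    array->type_id = type_id;
    return array;
}

static void c_array_free(struct c_array *array)
{
    if (array != NULL) {
        free(array->buffer);
        free(array);
    }
}

/* An integer compare replaces the strcmp on the stringized type name. */
static void *gc_array_front(struct c_array *array, int type_id)
{
    if (array == NULL || type_id != array->type_id)
        return NULL;
    return array->size ? array->buffer : NULL;
}

#define c_array_new(type, n) \
    (gc_array_alloc(sizeof (type), (n), C_TYPE_ID(type)))

#define c_array_front(array, type) \
    ((type *)gc_array_front((array), C_TYPE_ID(type)))
```

For an array created with c_array_new( int, 2 ), both c_array_front( ar, int ) and c_array_front( ar, my_index ) succeed, while c_array_front( ar, float ) returns NULL even where sizeof (int) == sizeof (float). An unregistered type name fails to compile, since the pasted identifier is undefined. The obvious cost is that the id table has to be maintained by hand.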

Best regards,
John D.
 
