Just how delicate are freed pointers?

James Kanze

|> > Because on some machine architectures, pointers may be validated
|> > at the time that they are loaded into a machine register instead
|> > of being validated each time they attempt to access memory.

|> I have heard that a lot before, but never ran across such
|> architecture. Could you at least give me one example?

Some implementations of C on the 80286.

|> Let's assume this is the case: is NULL a valid pointer? I don't
|> think so. It can't be dereferenced.

It is a valid pointer value. That doesn't mean that you can dereference
it. One past the end pointers are also valid pointer values.

|> Yet, it is clearly impossible that the following statement be not
|> valid or undefined behavior, because 1. it would make pointer
|> programming very difficult and 2. it would cause havoc in trillions
|> of lines of code all over the world:

|> p = NULL;

|> What do you think? Do you think that the "NULL" pointer is a special
|> case on the aforementioned architectures and therefore it's still ok
|> to assign it to some pointer variable?

It's up to the implementation, but they must do something to ensure that
loading a NULL pointer will not trigger an exception.

Back when the 80286 was a recent architecture, most of the
implementations I used mapped the page 0 to a page full of 0's, so that
dereferencing the null pointer actually worked. If they did this, of
course, there was no problem copying a null pointer.
 
James Kanze

|> Malcolm wrote:
|> > > The OP's original code is fine, although a little convoluted
|> > > about the order of operations.

|> > > The value of `a' will _not_ be changed by the compiler, the C
|> > > standard, _any_ remotely conforming implementation of `free()'
|> > > or the architecture (x86, Sparc, Power PC, etc.) the code is run
|> > > under.

|> > > You can do anything you want with the value of `a' with the
|> > > following restrictions:

|> > > 1. You cannot use it as a value to the `free()'
|> > > function again,
|> > > 2. You cannot de-reference it.

|> > You should read a thread before responding.

|> I did read the thread.

|> > As others have pointed out, this is true for the vast majority of
|> > implementations, but some will have trap values that trigger when
|> > an invalid pointer (other than NULL) is used in a calculation.
|> > This means that comparing a to b is illegal, as is incrementing a
|> > or doing anything else with it.

|> Sorry, the above is utter nonsense.

It's what the C standard says.

|> Pointer arithmetic is always legal no matter what the state of the
|> _value_ of that pointer. This is a fact.

Not according to the C standard.

[...]

|> It's not clear to me that some posters know to distinguish the
|> operations of "pointer arithmetic" and "dereferencing the result" of
|> such arithmetic...

It's not clear to me that some posters know C, as opposed to the
particular implementations they happen to use today.
 
Stephen L.

James said:
|> > Because on some machine architectures, pointers may be validated
|> > at the time that they are loaded into a machine register instead
|> > of being validated each time they attempt to access memory.

|> Back when the 80286 was a recent architecture, most of the
|> implementations I used mapped the page 0 to a page full of 0's, so that
|> dereferencing the null pointer actually worked. If they did this, of
|> course, there was no problem copying a null pointer.

You are saying (please correct me if I'm wrong) that
copying a pointer containing the NULL value is the
same as dereferencing the pointer...

You're confusing a NULL pointer, that when dereferenced
happens to point to a 0 (your words, not mine), and the
NULL pointer itself - you're treating them identically
in your post! If a pointer happens to _point_ to a 0,
that does not define the pointer as a NULL pointer.
You're also assuming that the value for a NULL pointer
is 0 - it depends on the hardware architecture.
You can google for architectures where a NULL pointer
is not represented by 0.

Copying a pointer, even if its value is the NULL value,
(also known as "pointer arithmetic") is _not_ the
same operation as dereferencing that pointer.

There's an interesting link in one of my other
posts along this subject worth reading.


HTH,

Stephen
 
James Kanze

|> James Kanze wrote:


|> > |> > Because on some machine architectures, pointers may be
|> > |> > validated at the time that they are loaded into a machine
|> > |> > register instead of being validated each time they attempt
|> > |> > to access memory.

|> <snip>

|> > Back when the 80286 was a recent architecture, most of the
|> > implementations I used mapped the page 0 to a page full of 0's, so
|> > that dereferencing the null pointer actually worked. If they did
|> > this, of course, there was no problem copying a null pointer.

|> You are saying (please correct me if I'm wrong) that copying a
|> pointer containing the NULL value is the same as dereferencing the
|> pointer...

No. I'm saying that at the time, most implementations allowed
dereferencing a null pointer, and that of course, if you support
dereferencing it (as an extension), then you can obviously copy it
(which the standard requires).

|> You're confusing a NULL pointer, that when dereferenced
|> happens to point to a 0 (your words, not mine), and the
|> NULL pointer itself - you're treating them identically
|> in your post!

I'm not confusing anything. I'm just explaining what happened on some
particular implementations I once used.

|> If a pointer happens to _point_ to a 0,
|> that does not define the pointer as a NULL pointer.

If the implementation defines all bits to zero as the representation of
a null pointer, then it does. That happened to be the case in the
implementation in question.

I'm aware that this isn't always the case, but the exception is rare
enough that I would have mentioned it had it applied here.

|> You're also assuming that the value for a NULL pointer is 0 - it
|> depends on the hardware architecture. You can google for
|> architectures where a NULL pointer is not represented by 0.

I am aware of architectures where the null pointer is not represented by
all bits being 0. (The integral constant 0 IS a null pointer. Always.
But I'm pretty sure that this isn't what you are trying to say.) I've
even programmed on some, although not in C.
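
A minimal sketch of that distinction, assuming nothing about any particular implementation: the constant 0 always converts to a null pointer, while whether the null pointer's bytes are all zero is left to the implementation. The function name `null_is_all_bits_zero` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if this implementation stores a null char* as all-bits-zero,
   0 otherwise. Looking at an object's bytes with memcmp is well defined;
   the answer itself is implementation-specific. */
int null_is_all_bits_zero(void)
{
    char *p = 0;                    /* the integer constant 0 becomes a null pointer */
    unsigned char zeros[sizeof p];

    assert(p == NULL);              /* holds on every conforming implementation */
    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

On most current platforms this probably returns 1, but the language does not promise it.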

To be clearer, on the implementations I used in the early 80's (when the
80286 was around), the implementation made a conscious decision to allow
dereferencing null pointers. It was a frequent case -- much of the
original software for Unix at the time counted on NULL behaving
like "", for example. The standard decided not to support this use;
even Kernighan and Ritchie said it was wrong. But it happened to work in
Unix version 6 and version 7, and a lot of software was written which
counted on it. (I happened to be present when we suppressed this
"feature" in our Unix port, by modifying the kernel so that the page 0
was never mapped. We had to back out of the modification, because many
of the Unix tools started core dumping on us, and we didn't have time
to correct them before that release.)

|> Copying a pointer, even if its value is the NULL value, (also known
|> as "pointer arithmetic") is _not_ the same operation as
|> dereferencing that pointer.

I never said it was. You have to be able to read the pointer to be able
to do either; there are pointers which you cannot read, and there are
pointers which you can read, but cannot dereference (null pointers and
one past the end pointer).
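
For the second category (readable but not dereferenceable), a short sketch may help. It assumes an ordinary array; `count_elements` is a hypothetical helper, not something from the thread. The one-past-the-end pointer is copied and compared, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the elements in [first, last). 'last' may be a one-past-the-end
   pointer: it is only compared against, never dereferenced. */
size_t count_elements(const int *first, const int *last)
{
    size_t n = 0;
    const int *p;

    assert(first != NULL && last != NULL);
    for (p = first; p != last; ++p)
        ++n;
    return n;
}
```

With `int v[4];`, the call `count_elements(v, v + 4)` is fine, whereas evaluating `*(v + 4)` would not be.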
 
Guillaume

Some implementations of C on the 80286.

Ok, I seem to remember the "selector" scheme and the descriptor tables
for the protected mode of 80x286 and above.

Loading an undefined selector in a selector register could indeed raise
an exception, although most of the compiler implementations I have known
would bypass this exception either by masking it or by silently handling
it at run-time. Some other implementations would perform some
computation on pointers before actually loading a selector register,
ensuring beforehand that it was a valid one.

Thus, I haven't ever seen an implementation on which assigning an
undefined "address" to a pointer would raise an exception. But of
course, there might have been some. I have looked in the C99 standard
and haven't really found a definite answer on potential problems
with assigning pointers, though. If you find the appropriate section,
feel free to point me to it.
 
Arthur J. O'Dwyer

In case b had been allocated sometime before in the same function,
then this is a potential memory leak

Of course it isn't. A memory leak is when you lose all pointers
to an allocated block of memory. (Hint: if a equals b, then a has
the same value as b.)
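
A minimal sketch of the aliasing case, assuming the code in question simply had two pointers to one block (`alias_then_free` is a hypothetical name, not code from the thread):

```c
#include <assert.h>
#include <stdlib.h>

/* 'a' and 'b' refer to the same block, so releasing it through either
   pointer frees the only allocation: nothing is leaked.
   Returns 0 on success, -1 if malloc fails. */
int alias_then_free(size_t n)
{
    char *a, *b;

    assert(n > 0);
    a = malloc(n);
    if (a == NULL)
        return -1;
    b = a;        /* second pointer to the same block */
    free(b);      /* block released; neither a nor b may be used now */
    return 0;
}
```

A leak would need the last remaining pointer to a live block to be overwritten or go out of scope before the block is freed.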

-Arthur
 
Christian Bau

Stephen L. said:
The OP's original code is fine, although a
little convoluted about the order of operations.

The value of `a' will _not_ be changed by the
compiler, the C standard, _any_ remotely conforming
implementation of `free()' or the architecture
(x86, Sparc, Power PC, etc.) the code is run under.

You can do anything you want with the value of
`a' with the following restrictions:

1. You cannot use it as a value to the `free()'
function again,
2. You cannot de-reference it.

Interesting that you quote a post that explains quite nicely _why_
accessing "a" after "free (a)" is wrong, and then you say it is ok.
Well, it is not. You will mostly get away with it in this case, but it
is wrong. It will go wrong in the following cases, and perhaps in
others:

1. Unusual hardware.
2. C interpreter.
3. Brutally optimising C compiler.
 
Christian Bau

pete said:
Leor said:
Hmmm. Assuming a was valid before the free() and b still is at the
point of the comparison, I don't get why you think this is unsafe.
Personally, it seems strange to test the value of a pointer after it
has been freed (and your version is more conventional in that sense),
but since we're only testing the /pointer/ and not trying to access
the memory it is pointing to, why wouldn't the OP's code be "safe"?
-leor

N869
7.20.3 Memory management functions
[#1]
The value of a pointer
that refers to freed space is indeterminate.

So the correct thing for a compiler to do after a call to "free (a)" would be
to treat a the same as an uninitialised variable (uninitialised
variables have indeterminate values as well), giving a warning when you
try to use it and so on.
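
One way to stay clear of the problem, sketched under the assumption that the comparison is actually needed: do it before the free, then overwrite the pointer. The helper name `free_and_report_alias` is invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *pp and reports whether it pointed to the same place as 'other'.
   The comparison is made while *pp still holds a valid value; afterwards
   *pp is set to NULL so that later reads of it are well defined. */
int free_and_report_alias(char **pp, const char *other)
{
    int same;

    assert(pp != NULL);
    same = (*pp == other);   /* compare BEFORE the free */
    free(*pp);
    *pp = NULL;              /* replace the indeterminate value */
    return same;
}
```

If the function returns 1, the caller's own copy of `other` referred to the freed block too and should be discarded or reassigned rather than read.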
 
Christian Bau

Because on some machine architectures, pointers may be validated
at the time that they are loaded into a machine register instead
of being validated each time they attempt to access memory.

I have heard that a lot before, but never ran across such architecture.
Could you at least give me one example?

Let's assume this is the case: is NULL a valid pointer?
I don't think so. It can't be dereferenced.

Yet, it is clearly impossible that the following statement be not valid
or undefined behavior, because 1. it would make pointer programming very
difficult and 2. it would cause havoc in trillions of lines of code
all over the world:

p = NULL;

What do you think? Do you think that the "NULL" pointer is a special
case on the aforementioned architectures and therefore it's still ok
to assign it to some pointer variable?

It's very simple: The C Standard says that the value of a after free (a)
is indeterminate (except when a was a null pointer; in that case it is
still a null pointer). That has a very specific meaning; it means that
any use of the value leads to undefined behavior. On the other hand, a
null pointer is _not_ an indeterminate value. It can be used legally in
many situations, like assignment, comparison with == or != (but not with
<=, for example).

When the C Standard says it is indeterminate it is quite pointless to
think what hardware could do what kind of strange things. You just need
an optimising compiler which may assume that your code doesn't invoke
undefined behavior, and therefore can make any kind of assumptions.

For example, after calling free (a) any use of a is undefined behavior
unless a was a null pointer. Therefore if you use a after calling free
(a) an optimising compiler may assume that a is a null pointer. So if
you compare a == b after calling free (a), the compiler can change this
to b == NULL.
 
Christian Bau

Stephen L. said:
I did read the thread.


Sorry, the above is utter nonsense.

Pointer arithmetic is always legal
no matter what the state of the _value_
of that pointer. This is a fact.

This is not a fact, this is ignorance and bullshit. Go and buy yourself
a copy of the C Standard; you can find it at www.ansi.com for $18.
 
Christian Bau

Some implementations of C on the 80286.

Ok, I seem to remember the "selector" scheme and the descriptor tables
for the protected mode of 80x286 and above.

Loading an undefined selector in a selector register could indeed raise
an exception, although most of the compiler implementations I have known
would bypass this exception either by masking it or by silently handling
it at run-time. Some other implementations would perform some
computation on pointers before actually loading a selector register,
ensuring beforehand that it was a valid one.

Thus, I haven't ever seen an implementation on which assigning an
undefined "address" to a pointer would raise an exception. But of
course, there might have been some. I have looked in the C99 standard
and haven't really found a definite answer on potential problems
with assigning pointers, though. If you find the appropriate section,
feel free to point me to it.

If you use any variable with an indeterminate value, then you get
undefined behavior. At that point, _anything_ can happen. Good
optimising compilers will also make optimisations based on the
assumption that your code doesn't invoke undefined behavior; if you make
that assumption wrong then again anything can happen.
 
Gordon Burditt

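As others have pointed out, this is true for the vast majority of
implementations, but some will have trap values that trigger when
an invalid pointer (other than NULL) is used in a calculation.

Sorry, the above is utter nonsense.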

Pointer arithmetic is always legal
no matter what the state of the _value_
of that pointer. This is a fact.

No, it isn't. It is perfectly legal for this code to abort:

#include <stdlib.h>
....
char *a;

a = malloc(1);
a; /* no smegfault here */
free(a);
a; /* smegfault here */

Loading the value of "a" into a register on a segmented platform
(e.g. x86 in 16-bit or 32-bit protected mode, where pointers include
a 16-bit segment number), which includes loading the segment portion
of the pointer into a segment register, can cause a processor trap
if there is no mapping for the segment portion. Notice that I'm
*NOT* dereferencing the pointer.

Few implementations work this way because it's a pain in the butt and
a lot of sloppy code out there won't work. But it's allowed per
ANSI C.

However, when any pointer (containing an
invalid value) is dereferenced, then each
architecture/implementation will behave
differently, sometimes with no error/warning
diagnostic at all (just some odd program
behavior).

It's not clear to me that some posters know
to distinguish the operations of "pointer
arithmetic" and "dereferencing the result"
of such arithmetic...

The point here is that both pointer arithmetic and dereferencing
the result can *BOTH* cause the same undefined behavior. Or
perhaps different undefined behavior.

Gordon L. Burditt
 
CBFalconer

Christian said:
.... snip ...

For example, after calling free (a) any use of a is undefined
behavior unless a was a null pointer. Therefore if you use a
after calling free (a) an optimising compiler may assume that
a is a null pointer. So if you compare a == b after calling
free (a), the compiler can change this to b == NULL.

Splutter. Hogwash. Bullshit. Merde. Schiesse. :) I think
you mean that such a change is one of the many undefined
behaviours available.
 
James Kanze

|> > Some implementations of C on the 80286.

|> Ok, I seem to remember the "selector" scheme and the descriptor
|> tables for the protected mode of 80x286 and above.

|> Loading an undefined selector in a selector register could indeed
|> raise an exception, although most of the compiler implementations I
|> have known would bypass this exception either by masking it or by
|> silently handling it at run-time. Some other implementations would
|> perform some computation on pointers before actually loading a
|> selector register, ensuring beforehand that it was a valid one.

|> Thus, I haven't ever seen an implementation on which assigning an
|> undefined "address" to a pointer would raise an exception. But of
|> course, there might have been some. I have looked in the C99
|> standard and haven't really found a definite answer on potential
|> problems with assigning pointers, though. If you find the
|> appropriate section, feel free to point me to it.

The most efficient way to copy a pointer on a 80286 would be:

LES AX,ptrSource
MOV ptrDest,AX
MOV ptrDest+2,ES

I've seen compilers which did this, although I forget off hand which
ones. (The Intel compilers, maybe?)
 
Chris Torek

(I removed the referent for the first "this" pronoun below, but I
think the idea still comes through.)

I've heard this stated before, and I completely understand that loading a
/garbage/ address into an address register under one of these architectures
is bad ju-ju.

Correct.

What makes me wonder in the case of the OP's scenario, however, is
my (admittedly, perhaps outdated) recollection of how memory
allocation works in the C libraries.

The C standards do not dictate how the underlying implementation
has to work -- so "how it works" underneath (whether past or present)
is at best a guideline.

Under the last systems where I took notice of such things, once
memory had been obtained from the system (on Unix, it was via an
sbrk() call, I believe), it remained allocated to the process.

Generally true. But these days some Unix and Unix-like systems
have effectively removed sbrk() entirely (there is always some
compatibility module of course), preferring to use mmap() to obtain
"anonymous" (swap-device-backed) virtual memory. The advantages
are that the memory need not be contiguous, can be mapped with
varying permissions -- various data areas might be marked "read-only
yet executable" or "read/write and *not* executable" -- and can be
altered or returned to the OS piecemeal if desired.

So I was thinking in terms of "once hardware-validated, always
hardware-validated" during the run of that process...even on those
sensitive architectures. AFAIK, this may no longer be true on modern
systems, though...anyone care to set me straight?

Suppose, then, you have a hardware architecture that verifies some
part(s) of a pointer, and a C support library in which, in at least
some cases, free() hands underlying page(s) back to the OS, which
then revokes all access. In this case, the bits stored in a pointer
"pre-free" represent a valid location, while those same bits stored
in the same pointer "post-free" no longer represent a valid location.

If the pointer-verification operation is performed on simple pointer
comparisons and copies, such comparisons and copies will indeed
trap the now-illegal value:

char *a, *b;
...
a = malloc(N);
b = a;
...
free(a);
if (a == b) ... /* invalid */
/* or */
b = a; /* invalid */

If the pointer-verification operation is applied to pointer-following
operations, things like:

free(a);
use(*a); /* invalid */

will trap. This "dereference trap" is considerably more common,
and is available on a number of Unix-like systems that use mmap()
and have "debug" versions of malloc() and free().

The C standards permit the pointer-verification step to occur
at any or all of these places. Most implementations, for various
good reasons, only do the last one -- but at least for debugging
purposes, the earlier verifications could also be useful. Even
on hardware that does not provide such verification automatically,
the compiler could insert "verify" steps for you.
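
Such a "verify" step can also be imitated in library code. What follows is only a sketch under stated assumptions: `trk_malloc`, `trk_free` and `trk_is_live` are hypothetical names, the table size is arbitrary, `uintptr_t` is assumed to be available (it is optional in C99), and a non-null pointer is assumed never to convert to 0. The key is taken as an integer while the pointer is still valid, so checking the key after the free does not read a freed pointer value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TRK_MAX 64                     /* arbitrary table size for the sketch */

static uintptr_t trk_live[TRK_MAX];    /* 0 marks an empty slot (assumption) */

/* Allocates n bytes and records the block as live. */
void *trk_malloc(size_t n)
{
    void *p;
    size_t i;

    assert(n > 0);
    p = malloc(n);
    if (p == NULL)
        return NULL;
    for (i = 0; i < TRK_MAX; ++i) {
        if (trk_live[i] == 0) {
            trk_live[i] = (uintptr_t)p;
            return p;
        }
    }
    free(p);                           /* table full: refuse the request */
    return NULL;
}

/* The "verify" step: was this key handed out and not yet freed? */
int trk_is_live(uintptr_t key)
{
    size_t i;

    if (key == 0)
        return 0;
    for (i = 0; i < TRK_MAX; ++i)
        if (trk_live[i] == key)
            return 1;
    return 0;
}

/* Frees p if it is a live tracked block. Returns 0 on success, -1 otherwise.
   p must still hold a valid value here; passing an already freed pointer
   would itself be a use of an indeterminate value. */
int trk_free(void *p)
{
    uintptr_t key = (uintptr_t)p;      /* converted while p is still valid */
    size_t i;

    for (i = 0; i < TRK_MAX; ++i) {
        if (key != 0 && trk_live[i] == key) {
            trk_live[i] = 0;
            free(p);
            return 0;
        }
    }
    return -1;
}
```

A caller would save `(uintptr_t)p` before calling `trk_free(p)` and consult `trk_is_live` with that saved key afterwards, instead of touching `p` again.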
 
Leor Zolman

Leor said:
Hmmm. Assuming a was valid before the free() and b still is at the
point of the comparison, I don't get why you think this is unsafe.
Personally, it seems strange to test the value of a pointer after it
has been freed (and your version is more conventional in that sense),
but since we're only testing the /pointer/ and not trying to access
the memory it is pointing to, why wouldn't the OP's code be "safe"?
-leor

N869
7.20.3 Memory management functions
[#1]
The value of a pointer
that refers to freed space is indeterminate.

Ok, thanks! That makes it pretty clear. In the copy of the C99 Standard I
have (ISO/IEC 9899, "Second edition", 1999-12-01) they've moved that to the
last line of 6.2.4/2.

In appendix J.2 ("The behavior is undefined in the following circumstances"
followed by 11 pages of circumstances), I did find this item (page 501):

The value of a pointer that refers to space deallocated by a
call to the free or realloc function is used (7.20.3).

The interesting thing is the reference is back to 7.20.3, where you saw the
statement back in the draft. It looks to me as if they moved the statement
but perhaps forgot to update the note in J.2.
-leor
 
John Cochran

SNIP...
I've heard this stated before, and I completely understand that loading a
/garbage/ address into an address register under one of these architectures
is bad ju-ju. What makes me wonder in the case of the OP's scenario,
however, is my (admittedly, perhaps outdated) recollection of how memory
allocation works in the C libraries. Under the last systems where I took
notice of such things, once memory had been obtained from the system (on
Unix, it was via an sbrk() call, I believe), it remained allocated to the
process. So I was thinking in terms of "once hardware-validated, always
hardware-validated" during the run of that process...even on those
sensitive architectures. AFAIK, this may no longer be true on modern
systems, though...anyone care to set me straight?

And I've dealt with a system where, once you passed a pointer to free(), the
memory freed was returned to the system as a whole and could then be
allocated to any process that desired memory (the Amiga used this method).
 
Leor Zolman

Thanks, Chris. I've now (by Pete) also been pointed to the wording in the
Standard (6.2.4/2) that evidently leaves no wiggle room for allowing /any/
use of a free'd pointer. And it looks like what you've described in your
post would justify that.
-leor
 
