Keith Thompson
jimjim said: Hi, I am still here
g is passed by value to free( ) and this is why it continues to have the
same value -pointing to the same memory location- after free( ) is called (I
hope I've got it right). What may cause the pointer to assume an
indeterminate value?
In C, the safest approach is to think of a pointer as an opaque
abstract entity that supports certain operations defined by the C
standard. If you happen to know something about how machine addresses
work in assembly language, that's great. 99% of the time a C pointer
will be implemented as a machine address, and operations on it will
work as you would expect. That knowledge can help you to understand
how C pointers work, and why they're defined as they are -- but the
language standard allows them to behave in ways that don't necessarily
match the "obvious" behavior of machine addresses.
A pointer object, like any object in C, has a value that can be viewed
as a sequence of unsigned bytes, and that can be displayed, for
example, in hexadecimal. There's no guarantee about what those bytes
are going to look like, but it can be instructive to examine them if
you want a more concrete example of how an implementation might work.
In the following program, I convert a pointer value to unsigned int
and display its value in hexadecimal. This happens to work as
"expected" on the platform I'm using, but don't count on this being
portable. The program assumes that it makes sense to display the
result in 8 hexadecimal digits; this happens to be true on the system
I'm using, but again, it's not guaranteed (there are plenty of systems
with 64-bit pointers, and other sizes are possible).
#include <stdio.h>
#include <stdlib.h>

/*
 * Warning: This program invokes undefined behavior.
 */
int main(void)
{
    double *ptr;

    ptr = malloc(sizeof *ptr);
    *ptr = 9.25;
    printf("After malloc(): ptr = 0x%08x, *ptr = %g\n",
           (unsigned)ptr, *ptr);
    free(ptr);
    printf("After free(): ptr = 0x%08x, *ptr = %g\n",
           (unsigned)ptr, *ptr);
    *ptr = 111.125;
    printf("After assignment: ptr = 0x%08x, *ptr = %g\n",
           (unsigned)ptr, *ptr);
    return 0;
}
When I compile and execute this program, it gives me the following
output:
After malloc(): ptr = 0x000209c0, *ptr = 9.25
After free(): ptr = 0x000209c0, *ptr = 9.25
After assignment: ptr = 0x000209c0, *ptr = 111.125
Now let's examine in a bit more detail what's going on here, both in
terms of the underlying hardware and in terms of the C language
standard.
The call to malloc() allocates memory space for a double object; we
assign the resulting address to the pointer object ptr. (In a
real-world program we'd want to check whether the malloc() call
succeeded, and probably bail out if it didn't.)
We then assign a value to the double object that ptr points to, and we
display (in a non-portable manner) the value of ptr and of what it
points to.
As it happens, pointers on the system I'm using are represented as
machine addresses (actually virtual addresses). malloc() allocated 8
(sizeof(double)) bytes of memory starting at address 0x000209c0.
The C runtime system has reserved that chunk of memory, guaranteeing
that it belongs to this program, that we can read and write it, and
that no other object overlaps it.
Now we call free(). By doing so, we're informing the runtime system
that we no longer need that chunk of memory, that we won't try to use
it again, and that the runtime system is now free to reallocate it for
other purposes. There's no guarantee about what the runtime system
will actually do with that chunk of memory; it could well remain
unused for the remainder of the execution of this program. Or it
could be immediately reallocated for use as temporary storage. By
calling free(), we're telling the runtime system that we don't care
what happens to that chunk of memory; we're done with it. (We're
*not* asking the runtime system to prevent us from trying to access it
again.)
But we still have the pointer value. Since free()'s argument is
passed by value, the variable ptr is referenced, but not modified. It
still contains the same bit pattern, 0x000209c0. (There's been some
debate about whether a sufficiently clever implementation might be
allowed to modify the value of ptr, but we'll assume that it can't.)
So what does 0x000209c0 mean? After the call to malloc(), it was the
address of a chunk of memory that we owned. After the call to free(),
in terms of the underlying hardware, it's still the address of the
same chunk of memory; the only difference is that it's memory that we
no longer own. The runtime system is free to do what it likes with
that chunk of memory, including marking it as read-only. If it
happens to do so, the assignment "*ptr = 111.125;" will likely crash
the program, triggering a segmentation fault or something similar.
But if it happens not to do anything with it immediately, attempts to
access it may still work.
Ok, so the chunk of memory allocated at 0x000209c0 is off-limits after
the call to free(). Attempts to refer to it may happen to work (as
they did when I ran my sample program), but they could just as easily
blow up.
But what about the contents of ptr itself? ptr isn't stored at
0x000209c0, it just points to it. The pointer object is in the local
stack frame of our main program. We should still be able to do
anything we like with the value of ptr, as long as we don't try to
dereference it, right?
Well, yes and no.
On the machine level, on *most* real-world implementations, that's
true. A pointer value is just a bunch of bits, and even though we no
longer own the memory it points to, we still own the pointer itself,
and we can still do things like compare it for equality to another
pointer value.
But the C language standard deliberately doesn't guarantee that.
What happens when we compare two pointer values? On some (probably
most) systems, we're just executing a machine instruction that does a
bit-by-bit comparison of the two values and tells us whether they're
equal. On others, though, there may be special machine instructions
for operating on address values. The program might load the values
into special-purpose address registers before comparing them, and the
very act of loading values into these registers might check whether
the pointers are valid, and trigger a trap if they aren't.
Here's another sample program:
#include <stdio.h>
#include <stdlib.h>

/*
 * Warning: This program invokes undefined behavior.
 */
int main(void)
{
    double *ptr;
    double *copy;

    ptr = malloc(sizeof *ptr);
    if (ptr != NULL) {
        printf("malloc() succeeded\n");
    }
    copy = ptr;
    if (copy == ptr) printf("After malloc, ptr and copy are equal\n");
    else printf("After malloc, ptr and copy are unequal (???)\n");
    free(ptr);
    if (copy == ptr) printf("After free, ptr and copy are still equal\n");
    else printf("After free, ptr and copy are unequal (???)\n");
    return 0;
}
and here's the output I got when I ran it:
After malloc, ptr and copy are equal
After free, ptr and copy are still equal
On the system I'm using, referring to the value of the pointer after
calling free() doesn't cause any problems; it works just as you might
expect. On a system with the kind of special handling of address
values that I described above, referring to the value of ptr after
calling free() could cause a trap as the value is loaded into a
special address register. Either system could have a conforming C
implementation; either behavior is consistent with the C language
standard, even though the latter might be surprising to many
programmers.
Before the call to free(), the bit pattern 0x000209c0 represents a
valid pointer value. After the call to free(), that same bit pattern
no longer represents a valid pointer value, and any attempt to
reference that value, even without dereferencing it, causes undefined
behavior.
You can't really learn about this kind of thing by running sample
programs and seeing whether they happen to work. Running any number
of sample programs is likely to give you the false impression that the
language makes guarantees that it really doesn't.
I still can't understand this :-(
The sizeof operator is a special case. Unless the operand is a
variable length array, the operand of sizeof is not evaluated. (Why?
Because the standard says it's not evaluated. Why does the standard
say so? Because there's no need to evaluate the operand; the compiler
needs to know the type of the operand, not its value, and the result
can be, and is, determined during compilation.) The expression
(sizeof *g) doesn't evaluate *g, so it doesn't cause any of the
problems that you might encounter if you did try to evaluate *g.
[...]
If there were a standardised code of conduct for use in comp.lang.c, it
would definitely have described this as inappropriate behaviour
(and who am I to judge you, eh?)
I don't understand why you have a problem with what I wrote above.
ERT made an absurd statement; I was mildly sarcastic in pointing out
the absurdity.
This is true. I clicked on the wrong newsgroup.
Fair enough; mistakes happen.
However, Robert was kind enough to convert my code to C and answer my
question in terms of C, in which I am also interested. I was wondering how
it is possible to free( ), dereference the pointer and still access the data
which I had assigned before. His example answered exactly this. Then Keith
told me that I should not refer to or dereference a pointer; it may cause
undefined behaviour. The bottom line is that many and I have learned a lot -
which is the whole point of newsgroups- even though I posted my question to
the wrong newsgroup. There is no need for people to get upset and be rude.
This is what Robert wants to communicate.
If that were all that Robert wants to communicate, we wouldn't have a
problem with him. But he repeatedly posts things here that have the
effect of disrupting this newsgroup. He often does so in a way that
*looks* like he's being kind and helpful to novices, but while doing
so he often posts subtle misinformation, some of which we're unable to
refute effectively because it's off-topic and outside our area of
expertise. When his misinformation is within the scope of
comp.lang.c, many of us feel obligated to spend the considerable time
necessary to correct his errors -- time that we'd much rather spend
doing something more constructive. Some of us are unwilling to let
his statements stand without response, because we're afraid that some
people will assume he's correct.
If you're so inclined, you might want to take a look through the
archives at groups.google.com. Look for things that ERT has posted,
and look at how we've responded to him. (There's a lot of it.) Some
of the responses are admittedly overreactions, but on the whole I
think we've done as good a job as can be expected in an anarchic forum
like this. If you can suggest a more effective way of dealing with
his behavior, we'd love to hear about it. But you really need to have
some understanding of ERT's history in comp.lang.c to understand why
we respond to him as we do.
And by the way, welcome to comp.lang.c. I hope you find it useful,
and I'm sorry that your most recent foray here has dumped you into the
middle of this brouhaha.