[union] Pointers to inherited structs are valid ?

  • Thread starter Maciej Labanowicz

Tim Rentsch

Philip Lantz said:

The undefined behavior occurred when &ptr->bar was executed (with ptr
equal to NULL). [snip]

Actually just slightly before -- when ptr is a null pointer,
evaluating ptr->bar is already undefined behavior.
 

Geoff

Shao Miller said:
On 1/16/2013 20:32, Geoff wrote: [...]
Microsoft documents their compiler trap values:
Value Name Description
------ -------- -------------------------
0xCD Clean Memory Allocated memory via malloc or new but never
written by the application.
[snip]
0xFEEEFEEE OS fill heap memory, which was marked for usage,
but wasn't allocated by HeapAlloc() or LocalAlloc().
Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.

I think all-bits-zero is one null pointer value representation, but I
was talking about "trap representations" in practice (as opposed to a
discussion of those that depend on padding bits). In Windows NT
kernel-land, more often than not I see that when a null pointer is
trapped, it's actually _not_ all-bits-zero; differing in the LSB. The
debugger still calls it a null pointer.

I'm not sure that when Windows traps in the manner you speak of that it's a NULL
pointer exception. It's most likely an x86 protected mode exception being
reported due to dereference of a pointer outside the process allocated virtual
address space. This mechanism is outside the purview of the C standard.

Can you post a simple code example that exhibits the behavior you describe?
 

Shao Miller

There's nothing fuzzy about any of this. A pointer value is a null
pointer if and only if it compares equal to NULL.

Ok, I miscommunicated, then. I agree with your notion of a null pointer.
No.

Ignoring for the moment the fact that you can't create or access
that value without having invoked undefined behavior:

Minor nit-pick: Why not? One can modify the object representation.
Wouldn't we have to decide that it's a trap representation before
suggesting undefined behaviour?
The expression `(void*)0x0000000C == NULL` yields 0 (false).
This is not a "false negative"; it's a perfectly correct result
indicating that `(void*)0x0000000C` is not a null pointer.

Which is why I wrote "relative to the goal of indirect access."
If you want to determine whether a given pointer value may be safely
dereferenced, comparing it to NULL is not the way to do that.

So true! Windows NT has 'MmIsAddressValid' and some other methods.
However, this isn't recommended. It seems pretty common in NT that
there are functions taking an "optional" pointer value. The cheapest
way to test would be 'if (!ptr)'. But if the caller passes 0x0000000C,
we most likely will crash.
Please be clear: what exactly are you claiming?

Nothing much; sorry. Let me try to make some fun definitions, deriving
from the C ones (plus an omitted definition of "debugger"):

- Debug representation: An object representation which provides
information to a debugger. A debug representation can represent the
same possibilities as an indeterminate value (an unspecified value or a
trap representation).

- Nasty representation: A trap representation for a given type which,
when read by an lvalue expression with that type, causes the program to
terminate and provides an implementation-defined prompt for a debugging
opportunity with a debugger.

- VA32: Any pointer type with a representation that is 32 bits and
which has no trap representations.

- Unsafe pointer: Any pointer having a VA32 type and having a value
that does not refer to any object. If such a pointer value is used in
an attempt to access the stored value of a pointed-to object, the
behaviour is undefined.

- Lull pointer: An unsafe pointer having a value that compares
unequal with a null pointer and having a debug representation which,
when interpreted by a debugger, suggests (but does not guarantee) that a
recent operation expected a non-null pointer and was provided with a
null pointer. Such a pointer can lull a program into a false sense of
safety for future operations because it compares unequal with a null
pointer. Nevertheless, a lull pointer may be used for all purposes,
except that the note in the description for "unsafe pointer" still applies.

After Geoff's summary of what we might call "debug representations," I
thought I remembered 0x0000000C being another one. I wondered if it
might carry status information, in particular. Mr. Philip Lantz'
explanation makes it likely that it is what we might call a "lull
pointer". I shouldn't've argued with you that it wasn't a trap
representation; that was a mistake. All I meant was that it didn't
resemble what we might call a "nasty representation".

A lull pointer is an unsafe pointer. An unsafe pointer has a VA32 type.
A VA32 type has no trap representations. A nasty representation is a
trap representation. So: A lull pointer cannot have a nasty
representation, but still has a debug representation.

But who cares? :cool:}
 

Shao Miller

On 1/16/2013 20:32, Geoff wrote:
[...]
Microsoft documents their compiler trap values:
Value Name Description
------ -------- -------------------------
0xCD Clean Memory Allocated memory via malloc or new but never
written by the application.

[snip]
0xFEEEFEEE OS fill heap memory, which was marked for usage,
but wasn't allocated by HeapAlloc() or LocalAlloc().
Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.

I think all-bits-zero is one null pointer value representation, but I
was talking about "trap representations" in practice (as opposed to a
discussion of those that depend on padding bits). In Windows NT
kernel-land, more often than not I see that when a null pointer is
trapped, it's actually _not_ all-bits-zero; differing in the LSB. The
debugger still calls it a null pointer.

I'm not sure that when Windows traps in the manner you speak of that it's a NULL
pointer exception. It's most likely an x86 protected mode exception being
reported due to dereference of a pointer outside the process allocated virtual
address space. This mechanism is outside the purview of the C standard.

Can you post a simple code example that exhibits the behavior you describe?

I thought it was obvious that the mechanism was via page fault and that
the bits of the attempted address were examined in order to determine
some useful information about the nature of a recent problem. All I was
wondering was if 0x0000000C had a particular meaning for debugging, like
your values above.

Mr. Philip Lantz suggested that the origin of such a thing is from a
pointer resulting from a computation involving a null pointer. (See
below.) Having read his post, this seems pretty obvious to be the
likely case, to me.
 

Shao Miller

I don't recall Geoff saying that anything was unexpected. That's your
phrase that you admit you have not defined. I can't fathom what you
think is unexpected behaviour or what point that would make. Should I
be able to work it out?

No, I'm explaining the nature of my mistake. "I thought it would be
obvious." <- Wrong.
Geoff's post looks simple and correct: the standard permits trap
representations that produce undefined behaviour so an implementation is
permitted to use special values to trigger interesting effects (either
in a debugger or elsewhere).

I agree that his post does look that way and that C does allow for that.
There's a subtle bit here, though: If we're discussing a program, once
that program invokes undefined behaviour, anything goes. However, if
I'm not mistaken, with Windows NT, if you have a pointer value which
does not point to an object, the program will continue to operate as per
the C semantics until such a time (if ever) as that pointer might be
used for indirect access.

So it is imprecise to say that such a value has a trap representation,
because the behaviour is still well-defined. Otherwise, the last
sentence of N1570's 6.5.3.2p4 is redundant:

"If an invalid value has been assigned to the pointer, the behavior
of the unary * operator is undefined.102)"

But anyway, a pointer value pointing to no object can still be "trapped"
by Windows during indirection and can still provide useful information
to a Windows debugger. It just does not precisely match C's trap
representation, for behavioural differences.
No I read in context and try to respond in context as well. I don't
think I missed the context.

Well then that's my mistake.
It could be an unspecified value (you are explicitly using C terms here)
if it is a valid pointer value. It may well be. Is it? It may equally
well not be. It could be either a trap representation or an unspecified
value but you seem to suggest that it is one not the other. You seem to
suggest that one possibility is more likely than the other for reasons
that are spurious. The best evidence that it might not be a trap
representation is that it's a valid pointer, but you give no evidence for
that -- quite the contrary in fact.

I'd say that it is not a trap representation.

Take Windows NT's 'IRP' structure. It has a sub-member called
'Tail.Overlay.DriverContext', which is an array of 4 'void *'. This is
one of _very_few_ places where a driver can associate information with
an IRP, and is extremely valuable for that reason.

The implementation defines the results of casting an appropriately-sized
integer value to a 'void *', so such an integer can be "passed" via this
mechanism. We _certainly_ would not wish to believe that this results
in undefined behaviour, so we certainly would not wish to believe that
such a result is a trap representation.
B: "Here are Microsoft's documented [non-C] trap values..."

If this was Geoff, he correctly called them "trap representation"
(rather than values) and he referred to "they" meaning the C committee.
He was talking about C trap representations not "[non-C] trap values".

This was in regards to the lower part of his post where he specifically
says "Microsoft documents their compiler trap values".
Me: "Nice summary! Say, isn't there another one, 0x0000000C, as a
[non-C] trap for [non-C] null pointers? I don't remember..."

C: "Oh no, that wouldn't be a [C] null pointer."

Me: "Yes, well I was talking about B's [non-C] trap values. I see a
Microsoft debugger catch these things and call them [non-C] null
pointers."

I think you missed Geoff's point.

I doubt it.
 

Keith Thompson

Shao Miller said:
Ok, I miscommunicated, then. I agree with your notion of a null pointer.


Minor nit-pick: Why not? One can modify the object representation.
Wouldn't we have to decide that it's a trap representation before
suggesting undefined behaviour?

Good point. Yes, you can modify the representation of a pointer object
without undefined behavior:

int *p;
*(uintptr_t*)&p = 0x0000000C;

Some, but not all, operations that result in such a pointer value have
undefined behavior.
Which is why I wrote "relative to the goal of indirect access."

Comparing a pointer to NULL, if you don't know in advance that it's
either a null pointer or a valid pointer to some object, does not meet
"the goal of indirect access". The phrase "relative to the goal of
indirect access" is meaningless in this context, or just plain wrong.
You might as well talk about "comparing an integer to zero relative to
the goal of determining whether it's even".
If you want to determine whether a given pointer value may be safely
dereferenced, comparing it to NULL is not the way to do that.

So true! Windows NT has 'MmIsAddressValid' and some other methods.
However, this isn't recommended. It seems pretty common in NT that
there are functions taking an "optional" pointer value. The cheapest
way to test would be 'if (!ptr)'. But if the caller passes 0x0000000C,
we most likely will crash.
Please be clear: what exactly are you claiming?

Nothing much; sorry. Let me try to make some fun definitions, deriving
from the C ones (plus an omitted definition of "debugger"):
[snip]

- VA32: Any pointer type with a representation that is 32 bits and
which has no trap representations.

What makes you think that 32-bit pointers on MS Windows have no
trap representations? `(void*)0x0000000C` almost certainly is a
trap representation for the implementation in question. (I'm using
the phrase "trap representation" as the C standard uses it; I lack
interest in any other meanings *unless* some documentation uses
that exact phrase with a different meaning.)
- Unsafe pointer: Any pointer having a VA32 type and having a value
that does not refer to any object. If such a pointer value is used in
an attempt to access the stored value of a pointed-to object, the
behaviour is undefined.

Such a pointer value is either a null pointer or a trap representation.

[...]
But who cares? :cool:}

I no longer do. Whatever relevant claims you were making, you've now
quietly backed away from them. We could have avoided wasting a great
deal of time if you'd done so sooner.
 

glen herrmannsfeldt

(snip, someone wrote)
(then I wrote)
I think you've misread what glen wrote (memcpy vs. memcmp?). I believe
he meant something like this:
uintptr_t x = 0x0000000C;
void *ptr;
memcpy(&ptr, &x, sizeof ptr);
if (ptr == NULL) { ... }
Yes.

But I don't think glen's suggestion is particularly relevant to
the current discussion. It would be for an implementation in which
pointer-to-integer conversion did something other than just copying
the bits, but I don't believe the implementation we're dealing with
(Microsoft's) does that.

Yes, but if you wanted to test that, even if you don't believe
it very likely, seems to me that would work.

I might also test 0x0C000000 in case someone got the endianness
wrong.
The standard permits all sorts of exotic pointer behavior:
null pointers with representations other than all-bits-zero,
pointer conversions that do something other than just copying
or reinterpreting the representation, equality operators that
do something other than a simple bit-by-bit comparison, etc.
And any of those things could lead to a pointer with the same
representation as the integer 0x0000000C being a null pointer.
As far as I know, Microsoft's implementation doesn't do any of
these things; a pointer value with an all-bits-zero representation
is the one and only null pointer for a given pointer type, a pointer
whose representation looks like 0x0000000C is not a null pointer, and
nothing in Microsoft's documentation refers to it as a null pointer.
(Code that dereferences a pointer with that representation is
reasonably assumed, by the debugger, to be the result of referring
to a member of a structure "pointed to" by a null pointer.)

As far as I know, too, but then I never looked. Well, I haven't
used any MS compilers for long enough that I didn't have any
reason to look.
Or perhaps I've misunderstood the point Shao Miller is trying
to make.

Does seem that it could be useful for a (person using a) debugger
to know where an unexpected NULL pointer came from. The cost isn't
all that high, though you would probably have to do it consistently.
(Someone might link to a non-debug version of a library).

-- glen
 

glen herrmannsfeldt

(snip, someone wrote)
I thought it was obvious that the mechanism was via page fault and that
the bits of the attempted address were examined in order to determine
some useful information about the nature of a recent problem. All I was
wondering was if 0x0000000C had a particular meaning for debugging, like
your values above.

If anyone ever used large model (48 bit pointers, with a 16 bit segment
selector and 32 bit offset) on 80386 and later processors, there are
many interesting things one could do.
Mr. Philip Lantz suggested that the origin of such a thing is from a
pointer resulting from a computation involving a null pointer. (See
below.) Having read his post, this seems pretty obvious to be the
likely case, to me.

I have no idea how MS does the page tables, though.

If the whole page with high bits zero is invalid, one might be able
to do some interesting things.

-- glen
 

Ben Bacarisse

Shao Miller said:
On 1/21/2013 11:31, Ben Bacarisse wrote:

I agree that his post does look that way and that C does allow for
that. There's a subtle bit here, though: If we're discussing a
program, once that program invokes undefined behaviour, anything goes.
However, if I'm not mistaken, with Windows NT, if you have a pointer
value which does not point to an object, the program will continue to
operate as per the C semantics until such a time (if ever) as that
pointer might be used for indirect access.

Of course. The "C semantics" are exactly as you describe: "anything
goes". If such a program caused the machine to halt, that, too, would
be operating as per "C semantics".
So it is imprecise to say that such a value has a trap representation,
because the behaviour is still well-defined. Otherwise, the last
sentence of N1570's 6.5.3.2p4 is redundant:

"If an invalid value has been assigned to the pointer, the behavior
of the unary * operator is undefined.102)"

Presumably you read the footnote so you must be aware of all the invalid
values that this clause covers.

A particular bit-pattern (0xC in this case, I think) can either
represent a valid value (for the pointer type in question), an invalid
value, or it can be a trap representation. These three possibilities
are, at a particular time in the program's execution, mutually
exclusive. The middle category, invalid pointer values, needs 6.5.3.2
p4.
But anyway, a pointer value pointing to no object can still be
"trapped" by Windows during indirection and can still provide useful
information to a Windows debugger. It just does not precisely match
C's trap representation, for behavioural differences.

What's the difference? The behaviour you describe here matches what you
can expect from what C calls a trap representation.
Well then that's my mistake.


I'd say that it is not a trap representation.

But you don't say why. The "it" above presumably refers to what we've
been talking about -- that 0xC bit-pattern. Your only evidence that
it's not a trap representation seems to be that nothing "odd" happens
until you dereference it. That's entirely consistent with it being a
trap representation.
Take Windows NT's 'IRP' structure. It has a sub-member called
'Tail.Overlay.DriverContext', which is an array of 4 'void *'. This is
one of _very_few_ places where a driver can associate information with
an IRP, and is extremely valuable for that reason.

The implementation defines the results of casting an
appropriately-sized integer value to a 'void *', so such an integer
can be "passed" via this mechanism. We _certainly_ would not wish to
believe that this results in undefined behaviour, so we certainly
would not wish to believe that such a result is a trap representation.

Are you saying that something can't be a trap representation because
something other than C defines what happens when it's used? That's
exactly why, in part, C leaves so many things undefined (the behaviour
of trap representations being one such thing) so that implementations
are free to do useful things in such situations.

<snip>
 

Keith Thompson

Shao Miller said:
I agree that his post does look that way and that C does allow for
that. There's a subtle bit here, though: If we're discussing a
program, once that program invokes undefined behaviour, anything goes.
However, if I'm not mistaken, with Windows NT, if you have a pointer
value which does not point to an object, the program will continue to
operate as per the C semantics until such a time (if ever) as that
pointer might be used for indirect access.

I (mostly) agree with that.
So it is imprecise to say that such a value has a trap representation,
because the behaviour is still well-defined. Otherwise, the last
sentence of N1570's 6.5.3.2p4 is redundant:

"If an invalid value has been assigned to the pointer, the behavior
of the unary * operator is undefined.102)"

No. Accessing an object with a trap representation has undefined
behavior. What that means is that the behavior is not defined *by the C
standard*; see N1570 3.4.3, the definition of the phrase "undefined
behavior". Another entity can certainly define behavior for such
accesses.

For example, (void*)0x0000000C is very likely a trap representation
under Windows NT. The behavior of applying unary "*" to that value is
undefined, in the sense that the C standard does not define its
behavior. Windows NT can define the behavior if it likes; that doesn't
cause it to cease being a trap representation.
But anyway, a pointer value pointing to no object can still be
"trapped" by Windows during indirection and can still provide useful
information to a Windows debugger. It just does not precisely match
C's trap representation, for behavioural differences.

Yes, it does.

[...]
Take Windows NT's 'IRP' structure. It has a sub-member called
'Tail.Overlay.DriverContext', which is an array of 4 'void *'. This is
one of _very_few_ places where a driver can associate information with
an IRP, and is extremely valuable for that reason.

The implementation defines the results of casting an
appropriately-sized integer value to a 'void *', so such an integer
can be "passed" via this mechanism. We _certainly_ would not wish to
believe that this results in undefined behaviour, so we certainly
would not wish to believe that such a result is a trap representation.

I'm afraid that what you wish is irrelevant. It has undefined behavior.
A program that depends on the behavior of some construct whose behavior
is not defined by the C standard is not portable. There's nothing wrong
with that; sometimes you *need* to write non-portable code, code that
depends on guarantees made by a given environment but not by the
language standard.

Don't forget that "undefined behavior" does *not* mean "this will crash".
It means exactly what the C standard says it means in 3.4.3: "behavior,
upon use of a nonportable or erroneous program construct or of erroneous
data, for which this International Standard imposes no requirements".

A given construct causing Microsoft's debugger to consistently display a
"NULL_CLASS_PTR_DEREFERENCE" message is perfectly consistent with that
construct having undefined behavior as defined by the C standard. If
you don't understand that, you don't understand undefined behavior.
B: "Here are Microsoft's documented [non-C] trap values..."

If this was Geoff, he correctly called them "trap representation"
(rather than values) and he referred to "they" meaning the C committee.
He was talking about C trap representations not "[non-C] trap values".

This was in regards to the lower part of his post where he
specifically says "Microsoft documents their compiler trap values".

Here's Geoff's article:

https://groups.google.com/group/com...6919e10?dmode=source&output=gplain&noredirect

The "compiler trap values" that Microsoft documents are not the same
thing as C "trap representations" (that's probably why they have
a *different name*), and Geoff merely said that they're *analogous*
to C trap representations. For example, "Clean Memory" is filled
with 0xCD bytes, which can easily represent a valid value for some
type (certainly for unsigned char).
Me: "Nice summary! Say, isn't there another one, 0x0000000C, as a
[non-C] trap for [non-C] null pointers? I don't remember..."

C: "Oh no, that wouldn't be a [C] null pointer."

Me: "Yes, well I was talking about B's [non-C] trap values. I see a
Microsoft debugger catch these things and call them [non-C] null
pointers."

I think you missed Geoff's point.

I doubt it.

If you think Geoff was using the phrase "null pointer" or "trap
representation" to refer to anything other than a null pointer
or trap representation as defined by the C standard, then yes,
you missed Geoff's point.

"I see a Microsoft debugger catch these things and call them [non-C]
null pointers." -- I don't believe you have seen that.
 

Shao Miller

Comparing a pointer to NULL, if you don't know in advance that it's
either a null pointer or a valid pointer to some object, does not meet
"the goal of indirect access". The phrase "relative to the goal of
indirect access" is meaningless in this context, or just plain wrong.
You might as well talk about "comparing an integer to zero relative to
the goal of determining whether it's even".

status_t some_func(input_t * input, output_t ** output) {
    output_t * new_item;
    status_t status;

    /* Do stuff. Populate 'new_item', 'status' */

    if (output) {
        /* Caller wants to refer to the result */
        *output = new_item;
    }
    return status;
}

Here, if 'output' does not compare equal to a null pointer, we'll
attempt indirection on it. If the caller passed an argument for
'output' which they believed to be one of {null pointer, pointer to an
object}, then our test here yields a false positive for the latter if
what they actually passed was the third possibility: A non-null pointer
to no object.
What makes you think that 32-bit pointers on MS Windows have no
trap representations? `(void*)0x0000000C` almost certainly is a
trap representation for the implementation in question. (I'm using
the phrase "trap representation" as the C standard uses it; I lack
interest in any other meanings *unless* some documentation uses
that exact phrase with a different meaning.)

Well I didn't quite say that. I was referring to a particular subset of
32-bit pointers. I would certainly consider the representation
0x00000001 to be a trap representation for an 'int *' in 32-bit Windows.

For VA32 (which would correspond to 'void *', 'char *', etc.), the
reason I would think that is that I've had over a decade of use, I guess. I
could be wrong! But here's a starting-point, perhaps:

http://msdn.microsoft.com/en-us/library/k26sa92e.aspx
Such a pointer value is either a null pointer or a trap representation.

Wait, are you saying that any pointer that does not match one of {null
pointer, points to an object} must necessarily be a trap representation?
I no longer do. Whatever relevant claims you were making, you've now
quietly backed away from them.

I haven't intentionally backed away from any claims. I wish I knew what
claims you might be referring to.

If you're asking about "null pointer", I thought Mr. Philip Lantz
already answered that when he said: "This traps in the debugger, and the
debugger reports a "null pointer dereference" at address 0x0000000c."
We could have avoided wasting a great
deal of time if you'd done so sooner.

I really don't know what you're talking about. Here's my recollection
(quotes aren't actual quotes, but interpretations):

Me: "I thought null pointers having 0x0C as the least-significant byte
was \"a thing,\" too, but now I can't remember having seen that
documented anywhere."

You: "I'm fairly sure Microsoft uses all-bits-zero for null pointers."

Me: "Yes I didn't mean via the strict C definition, but via a practiced
usage of the term, with Microsoft."

You: "Null pointers and trap representations aren't the same thing."

Me: "I'm talking about a pointer value that behaves like a null pointer."

You: "Does it compare equal to 'NULL'?"

Me: "Good point; no, it doesn't. But it still doesn't point to any
object and only invokes undefined behaviour once used for indirection.
That makes it similar to a null pointer, in my view. What do you think?"

You: "It seems more like a trap representation."

Me: "(Well, it isn't, strictly speaking...)"

My last one shouldn't've followed so closely behind the question before
it, because the question was confusing. I don't know if there's a
problem other than confusion. I hope not.
 

Shao Miller

Of course. The "C semantics" are exactly as you describe: "anything
goes". If such a program caused the machine to halt, that, too, would
be operating as per "C semantics".

No, it wouldn't. The C semantics do not include undefined behaviour.

"[...what a constraint violation is...] Undefined behavior is
otherwise indicated in this International Standard by the words
‘‘undefined behavior’’ or by the omission of any explicit definition of
behavior. There is no difference in emphasis among these three; they all
describe ‘‘behavior that is undefined’’."

I used the word "subtle." There is _no_chance_ that _anything_other_
than what _is_ described by the C Standard will happen. No undefined
behaviour. No trap representation.
Presumably you read the footnote so you must be aware of all the invalid
values that this clause covers.

A particular bit-pattern (0xC in this case, I think) can either
represent a valid value (for the pointer type in question), an invalid
value, or it can be a trap representation. These three possibilities
are, at a particular time in the program's execution, mutually
exclusive. The middle category, invalid pointer values, needs 6.5.3.2
p4.

You appear to be agreeing with me. Did you think I meant something else?
What's the difference? The behaviour you describe here matches what you
can expect from what C calls a trap representation.

The difference is that the behaviour is _defined_, instead of
_undefined_. Yes, the behaviour for both can match. No, the
expectation is different between them; you don't know what to expect
from undefined behaviour.
But you don't say why. The "it" above presumably refers to what we've
been talking about -- that 0xC bit-pattern. Your only evidence that
it's not a trap representation seems to be that nothing "odd" happens
until you dereference it. That's entirely consistent with it being a
trap representation.

I say why right below. I didn't realize that you couldn't easily accept
this and that you required evidence.
Are you saying that something can't be a trap representation because
something other than C defines what happens when it's used? That's
exactly why, in part, C leaves so many things undefined (the behaviour
of trap representations being one such thing) so that implementations
are free to do useful things in such situations.

I'm not suggesting that at all. This is a case of Standard behaviour
plus implementation-defined behaviour. Since Keith asked for it, I dug
it up from Microsoft:

"an integral type can be converted to a pointer type according to the
following rules:

- If the integral type is the same size as the pointer type, the
conversion simply causes the integral value to be treated as a pointer
(an unsigned integer)."

Does that help in any way? Can the subject of the parentheses have a
trap representation if all 32 bits are value bits?
 

Shao Miller

No. Accessing an object with a trap representation has undefined
behavior.

I don't understand this as an answer for "it is imprecise to say that
such a value has a trap representation, because the behaviour is still
well-defined." I don't understand it as an answer for "Otherwise, the
last sentence of N1570's 6.5.3.2p4 is redundant." Did you think that I
didn't think that accessing a trap representation was undefined behaviour?
What that means is that the behavior is not defined *by the C
standard*; see [N1570] 3.4.3, the definition of the phrase "undefined
behavior". Another entity can certainly define behavior for such
accesses.

That is true and is also not applicable to the case under discussion.
For example, (void*)0x0000000C is very likely a trap representation
under Windows NT.

I don't know why you find that to be likely. Maybe I've just done too
much x86?
The behavior of applying unary "*" to that value is
undefined, in the sense that the C standard does not define its
behavior. Windows NT can define the behavior if it likes; that doesn't
cause it to cease being a trap representation.

I am thankful for your patience with the discussion. I think I
understand that what was missing in the discussion so far was
"implementation-defined", which _also_ allows Windows NT to define the
behaviour.
Yes, it does.

No, it doesn't. :) The behaviour that is being discussed _is_ defined,
_until_ the indirection. It is not undefined before that point.
I'm afraid that what you wish is irrelevant.

Give me a break. How unfortunate it would be for Microsoft to market
their compiler as "taking care of all your undefined behaviour needs!"
Better confidence is possible.
It has undefined behavior.

Maybe you're right (I've been wrong!), but I don't see why you believe
that strongly enough to assert it.
[...more about undefined behaviour...]
This was in regards to the lower part of his post where he
specifically says "Microsoft documents their compiler trap values".

Here's Geoff's article:

https://groups.google.com/group/com...6919e10?dmode=source&output=gplain&noredirect

The "compiler trap values" that Microsoft documents are not the same
thing as C "trap representations" (that's probably why they have
a *different name*), and Geoff merely said that they're *analogous*
to C trap representations. For example, "Clean Memory" is filled
with 0xCD bytes, which can easily represent a valid value for some
type (certainly for unsigned char).

Did you think I meant something else? That is _precisely_ why I said it
was a "Wonderful, wonderful summary!" That is why I was talking about
this "analogous" thing, and not trap representations. I tried to
clarify that a few times, but only managed to confuse. Sorry about that.
Me: "Nice summary! Say, isn't there another one, 0x0000000C, as a
[non-C] trap for [non-C] null pointers? I don't remember..."

C: "Oh no, that wouldn't be a [C] null pointer."

Me: "Yes, well I was talking about B's [non-C] trap values. I see a
Microsoft debugger catch these things and call them [non-C] null
pointers."

I think you missed Geoff's point.

I doubt it.

If you think Geoff was using the phrase "null pointer" or "trap
representation" to refer to anything other than a null pointer
or trap representation as defined by the C standard, then yes,
you missed Geoff's point.

No, I don't think that. I have no idea why you might think that I think
that.
"I see a Microsoft debugger catch these things and call them [non-C]
null pointers." -- I don't believe you have seen that.

"These things" == the subject that Geoff had most recently discussed:
"[non-C] trap values".
 
K

Keith Thompson

Shao Miller said:
status_t some_func(input_t * input, output_t ** output) {
    output_t * new_item;
    status_t status;

    /* Do stuff. Populate 'new_item', 'status' */

    if (output) {
        /* Caller wants to refer to the result */
        *output = new_item;
    }
    return status;
}

Here, if 'output' does not compare equal to a null pointer, we'll
attempt indirection on it. If the caller passed an argument for
'output' which they believed to be one of {null pointer, pointer to an
object}, then our test here yields a false positive for the latter if
what they actually passed was the third possibility: A non-null pointer
to no object.

Right, because, as I said, checking whether a pointer is null does not
reliably check whether it may be dereferenced. It's a false negative
for the question "May I safely dereference this pointer?". It's not a
false negative for the question "Is this a null pointer?"

The test yields a false negative because it's not a valid test for what
it's trying to test. (No such valid test is possible in portable C.)
Well I didn't quite say that. I was referring to a particular subset of
32-bit pointers. I would certainly consider the representation
0x00000001 to be a trap representation for an 'int *' in 32-bit Windows.

Yes, you did quite say that. You said the pointer type "has no trap
representations".
For VA32 (which would correspond to 'void *', 'char *', etc.), the
reason I would think that is that I've had over a decade of use, I guess. I
could be wrong! But here's a starting-point, perhaps:

http://msdn.microsoft.com/en-us/library/k26sa92e.aspx

That looks a lot like the C Standard's requirements for pointer
conversions, with some extra information about how Microsoft's compiler
performs such conversions. Unlike the standard, it doesn't mention trap
representations.
Wait, are you saying that any pointer that does not match one of {null
pointer, points to an object} must necessarily be a trap representation?

I believe so, yes.

Given a pointer value that is neither a null pointer nor a pointer to an
object, what criteria would cause you to claim that it's *not* a trap
representation?
I haven't intentionally backed away from any claims. I wish I knew what
claims you might be referring to.

If you're asking about "null pointer", I thought Mr. Philip Lantz
already answered that when he said: "This traps in the debugger, and the
debugger reports a "null pointer dereference" at address 0x0000000c."

I don't remember that particular statement. It's been a long thread.

/* Let struct foo have a member m at offset 12 (0xc) */
struct foo *ptr = NULL;
do_something_with(ptr->m);

The evaluation of `ptr->m` has undefined behavior because the value of
`ptr` is a null pointer. It's likely that the generated code will take
the value stored in `ptr` (0x00000000), add the offset 12 to it
(yielding 0x0000000c), and then attempt to dereference the resulting
pointer value. If the program is being executed under the debugger,
this is likely to cause a trap. The debugger sees an attempt to
dereference address 0x0000000c and, quite reasonably, infers that it was
probably the result of accessing a member of a structure or class, at an
offset of 12 bytes, via a null pointer. The debugger may well have
other information available that makes that inference stronger.

(Conceivably a compiler could generate code to test the value of ptr
against 0x00000000 before attempting to add the offset to it; that could
catch the error slightly sooner and more directly, but at a considerable
performance cost.)

I believe you've been implying that this means that 0x0000000c is
a null pointer. It isn't. Seeing Philip's remark, quoted above,
it's a little clearer why you might have thought so.

[snip]

A pointer with the value 0x0000000c is not a null pointer, nor does it
point to any object. It is, I believe, a trap representation. In
certain contexts, the existence of such a pointer may imply that there's
been an attempt to dereference a null pointer; the null pointer itself
(in this particular implementation) has the representation 0x00000000.

Finally, a correction to something I may or may not have implied
earlier. Creating such a pointer, by evaluating `(void*)0x0000000c`,
does not itself have undefined behavior; 6.3.2.3p5 says that the
conversion may yield a trap representation, not that the conversion
itself has undefined behavior. Any attempt to *use* such a pointer
value does have undefined behavior.

0x0000000c;
/* obviously ok, no UB */

(void*)0x0000000c;
/* Creates a trap representation, then discards it; no UB */

p = (void*)0x0000000c;
/* Attempts to access a trap representation, UB */

if ((void*)0x0000000c == NULL) ...
/* UB */

It's very likely that the third statement will quietly store the
expected value in `p`, and that the last will evaluate the condition as
false; these are entirely consistent with the behavior being undefined.
 
K

Keith Thompson

Shao Miller said:
I'm not suggesting that at all. This is a case of Standard behaviour
plus implementation-defined behaviour. Since Keith asked for it, I dug
it up from Microsoft:

"an integral type can be converted to a pointer type according to the
following rules:

- If the integral type is the same size as the pointer type, the
conversion simply causes the integral value to be treated as a pointer
(an unsigned integer)."

Microsoft's wording here implies that a pointer is an unsigned integer.

Microsoft is wrong. (It may be just sloppy wording.)

A pointer may well have the same *representation* as an unsigned
integer, and many of the same behaviors (which is probably what they
meant), but they are two distinct things.
Does that help in any way? Can the subject of the parentheses have a
trap representation if all 32 bits are value bits?

The standard's discussion of "value bits" applies only to integer types.
It doesn't say enough about how pointers are represented for the concept
of "value bits" to meaningfully apply to pointers.

Yes, the result of converting an integer to a pointer can be a trap
representation.
 
G

Geoff

I thought it was obvious that the mechanism was via page fault and that
the bits of the attempted address were examined in order to determine
some useful information about the nature of a recent problem. All I was
wondering was if 0x0000000C had a particular meaning for debugging, like
your values above.

Mr. Philip Lantz suggested that the origin of such a thing is from a
pointer resulting from a computation involving a null pointer. (See
below.) Having read his post, this seems pretty obvious to be the
likely case, to me.
struct {
    int a, b, c, d;
} *p = NULL;

p->d = 0;

Generates: mov dword ptr ds:[0Ch],0

The answer is simple. It is not a special representation or a trap value within
the context of C or even of the OS, it's a consequence of the construction of
the compiled code.

In the case above, 0x0000000c is the result of the addition of the base pointer
into the structure with the offset within the structure, in this case decimal 12
bytes. If you change p->d = 0; to p->a = 0; the result is an unhandled exception
yielding 0x00000000.

There is no significance to the value of that address, other than it is the
actual null (base) pointer plus the offset into the member (if any).

The access is actually trapped by the memory protection hardware within the x86
since the memory address being accessed is outside the process address space.
 
T

Tim Rentsch

Keith Thompson said:
Finally, a correction to something I may or may not have implied
earlier. Creating such a pointer, by evaluating `(void*)0x0000000c`,
does not itself have undefined behavior; 6.3.2.3p5 says that the
conversion may yield a trap representation, not that the conversion
itself has undefined behavior. Any attempt to *use* such a pointer
value does have undefined behavior. [snip examples]

Strictly speaking the result of a conversion cannot be a trap
representation, because the result of a conversion is a value
and a trap representation is not a value but an object
representation. The Standard is being careless in how it
uses the term here.

Despite that, the intention seems clear, namely, that converting
any value where the result cannot be used as a pointer (for
example, for comparison to NULL) is undefined behavior,
regardless of whether the resulting value is used or discarded
(ie, converted to (void)). There isn't any point in being
allowed to produce a value but then not be able to do anything
with it, or even just store it. Furthermore this is how we
expect actual hardware would work -- it is trying to construct
a bogus address value that is likely to cause a trap, not doing
a store operation to put the bogus result in memory.

A quote from the C99 Rationale document might be illuminating
here:

Since pointers and integers are now considered incommensurate,
the only integer value that can be safely converted to a
pointer is a constant expression with the value 0.

Considering the context in which it was made, this statement
seems exactly on point.
 
G

glen herrmannsfeldt

(snip)
In the case above, 0x0000000c is the result of the addition of the base pointer
into the structure with the offset within the structure, in this case decimal 12
bytes. If you change p->d = 0; to p->a = 0; the result is an unhandled exception
yielding 0x00000000.

Reminds me of my first program using strchr().
(Or maybe the BSD index().)

After being used to the PL/I INDEX, and since I wanted the
position, not the pointer, I subtracted the first argument:

j=index(str,'x')-str;

Then when I had to test if it wasn't found,

if(j+str==NULL)

-- glen
 
S

Shao Miller

I thought it was obvious that the mechanism was via page fault and that
the bits of the attempted address were examined in order to determine
some useful information about the nature of a recent problem. All I was
wondering was if 0x0000000C had a particular meaning for debugging, like
your values above.

Mr. Philip Lantz suggested that the origin of such a thing is from a
pointer resulting from a computation involving a null pointer. (See
below.) Having read his post, this seems pretty obvious to be the
likely case, to me.
struct {
    int a, b, c, d;
} *p = NULL;

p->d = 0;

Generates: mov dword ptr ds:[0Ch],0

The answer is simple. It is not a special representation or a trap value within
the context of C or even of the OS, it's a consequence of the construction of
the compiled code.

In the case above, 0x0000000c is the result of the addition of the base pointer
into the structure with the offset within the structure, in this case decimal 12
bytes. If you change p->d = 0; to p->a = 0; the result is an unhandled exception
yielding 0x00000000.

There is no significance to the value of that address, other than it is the
actual null (base) pointer plus the offset into the member (if any).

Mr. Philip Lantz' post makes that clear. Your post here makes it
abundantly clear. :)
The access is actually trapped by the memory protection hardware within the x86
since the memory address being accessed is outside the process address space.

(Outside of an accessible memory "pool", using Windows terminology.)
And in fact, even addresses that do reference into a memory pool can
cause traps ("paging," as I'm sure you know). And we wouldn't call
those trap representations either, I hope.
 
T

Tim Rentsch

Ben Bacarisse said:
A particular bit-pattern (0xC in this case, I think) can either
represent a valid value (for the pointer type in question), an invalid
value, or it can be a trap representation. These three possibilities
are, at a particular time in the program's execution, mutually
exclusive. The middle category, invalid pointer values, needs 6.5.3.2
p4.

Actually there are four kinds of pointer values (with the
understanding that "value" here includes some that cannot
be used definedly):

1. unusable (any use is undefined behavior)
2. null pointers (can be compared for equality/inequality)
3. equality, pointer arithmetic, relational (eg <) comparison
4. like 3 but also can be dereferenced

Type 3 values are, eg, pointers one past the end of an array, or
non-null values returned from doing a malloc(0). Type 4 values
are regular pointers to objects.

The footnote makes it clear that the phrase 'invalid value' used
in 6.5.3.2 p4 means categories 1-3 above. However this meaning
of the phrase is meant to apply only to this section.

If a pointer value is stored, the resulting object representation
can be a trap representation only for values of type 1. (Of
course any attempt to store a value of type 1 could store anything
at all, including some representation of any of the above
categories.)

If an object is read as a pointer type, and the resulting value
is of type 1 (and assuming there wasn't anything else causing
undefined behavior), then the object representation of that
object must be (or have been) a trap representation (when
considered as the type used to do the access). This follows from
the definition of trap representation.
 
