[union] Pointers to inherited structs are valid ?

  • Thread starter Maciej Labanowicz

Tim Rentsch

Shao Miller said:
Look harder. Think more. Write less.

Please don't resort to this sort of personally-directed nonsense
as you've done before. If you don't have an answer, please simply
say so. If you really think I've missed something, it'd certainly
be more helpful to point it out instead of implying laziness or
stupidity.

If you think I write too much, well, I think you write too little
Standard, and too much "Mr. T. Rentsch knows best." Unfortunately,
that doesn't work for me, as your knowledge isn't directly accessible
to me. I'm sorry if that makes our discussions difficult! If you
choose to help me to understand your valuable perspective, I'll be
appreciative.

Just in case you're nit-picking an error in the code that hardly
seems relevant to the meat of the question, please allow me to offer
the corrected code:

    void reinterpret(void) {
        union {
            void * vp;
            char * cp;
        } u;
        u.vp = &u;
        u.cp = u.cp + 1;
        /* Hmm ^^^^ */
    }

    int main(void) {
        reinterpret();
        return 0;
    }

Otherwise, would anyone else please point out what I might've missed
about whether or not the above example results in undefined behaviour?
The "shall"[6.5p7] is outside of a constraint, so that'd seem to be
undefined behaviour if the lvalue under consideration is 'u.cp'. If
the lvalue is 'u', then its union type _is_ permitted by 6.5p7 (as
acknowledged in a previous post, above), but it'd be good to know
_which_ is the lvalue under consideration.
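
One way to sidestep the 6.5p7 question entirely, if the only goal is to step one byte past the address, is to convert explicitly rather than read a different union member. This is only a sketch: step_one_byte is a made-up name, not something from the posts above, and it assumes the argument points to an actual object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): advance a generic pointer by
   one byte using an explicit conversion instead of a union read. */
char *step_one_byte(void *vp)
{
    assert(vp != NULL);   /* arithmetic on a null pointer is undefined */
    char *cp = vp;        /* void * converts to char * implicitly in C */
    return cp + 1;        /* defined as long as vp points to an object */
}
```

Called as step_one_byte(&u), this yields the same address the union version is after, without depending on which lvalue 6.5p7 is taken to apply to.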

Let me offer a longer comment explaining what I was trying to say
and why. I preface this with a disclaimer that none of what
follows is meant as a statement of fact but merely my perceptions
and opinions.

I think you have a genuine interest in learning and understanding
C and what the Standard says about the language, and a sincere
desire to participate in discussion in both main newsgroups for
that.

Unfortunately, how you express yourself gets in the way of doing
that. Based on your writing, you seem like someone who is a
careless reader, a lazy writer, and who tends to think with his
mouth more than with his brain. Upon choosing to write, you write
the first thing that pops into your head, wandering like a
meandering river until you arrive at some destination, perhaps
related to what prompted you to start writing in the first place,
and perhaps not. More than any other poster in clc/csc that I am
aware of, you post followups to your own comments, giving second
thoughts, third thoughts, afterthoughts, tangential thoughts, and
(of course) corrections. There isn't anything wrong with doing
any of these things, but doing so as often as you do gives the
impression that you don't think through what you want to say when
you first say -- that is, write -- it.

Just as important is the matter of _how_ you say what you want to
communicate. Many times in reading your writing I don't know what
point you're trying to make or what question you want answered.
Even worse, sometimes I'm not sure _you_ know. This discourages
me from trying to read what you are writing, because it takes so
much effort to try to read it. It seems like either you don't
understand how to express yourself clearly, or you aren't willing
to make the effort to do so.

Besides that, a lot of times you ask questions that it seems like
you could answer yourself if you just took the time to do so. An
example came up recently in comp.std.c where you asked about a
change in wording in a paragraph describing pointer conversions.
This question was easily answerable in only a few minutes either
by doing a text search or by looking in the index. And I don't
think this is an isolated example. It's the relative frequency
that matters -- everyone has a blind spot occasionally, but it
seems to occur more rarely for most people than it does for you.
This further reduces my motivation to try to read your comments
or put effort into crafting a reply.

The suggestions I gave earlier weren't meant as criticism or as a
complaint about your writing. It's true they were born largely
out of exasperation, but my intention was to offer helpful advice.
If you choose to disregard that advice, well that's up to you.
However, I don't feel any obligation to try to help someone who
not only ignores my attempts to be helpful but also asks in a way
that's easy for him but makes things harder for the people he is
asking. You want to reduce your confusion about this example? I
made suggestions that I thought would help you do that. You want
my help in addressing future confusions? Following, or even
clearly making an earnest effort of trying to follow, those same
suggestions is also the best way to do that. You want help but
don't want to change what you do in asking for it? In that case
you shouldn't expect me to try to help or to respond in some
particular way just because it happens to suit what you want.
 

Shao Miller

The type unsigned char does not have trap representations. There
are no exceptions. Types that don't have trap representations
never have a trap representation.

In C11, accessing a variable like 'c' above before it has been
initialized is undefined behavior. But that is because C11
added (relative to, eg, N1256) a specific statement regarding
such cases, stating explicitly that the behavior is undefined;
it has nothing to do with provenance or trap representations.
Indeed, seeing that this proviso was added in C11 makes it
obvious that DR 260 doesn't apply to cases like the example
above, because otherwise there would be no reason to add it.

Now I think I understand what you are talking about, here. I think you
are discussing a trap representation as if it is associated with a type,
whereas I was discussing it as associated with an indeterminate value.

" 3.19.2
1 indeterminate value
either an unspecified value or a trap representation"

We could probably agree that the value of 'c' is indeterminate.

So you would say that in C11, 'c' has an unspecified value, which, once
read, leads to undefined behaviour, and that it does not have a trap
representation, which, once read, would not lead to undefined behaviour
because of the exemption for character type lvalues.

I would say that the value of 'c' happens to have a bit pattern that is
identical to a bit pattern representing an unspecified value, but that
reading it still leads to undefined behaviour (see committee discussion
above), just as a trap representation would for non-character types of
lvalues.

So if I'd typed "indeterminate value," it mightn't've been clear that I
was emphasizing the undefined behaviour of a read. If I'd typed
"unspecified value," same problem. So I chose to type "trap
representation." But to have avoided disagreement, I suppose I ought to
have typed something longer than that.

I don't know why you might think that DR #260 and DR #338 are so
different. Both cases involve the provenance of an object's value, and
how the mapping of object representation to value isn't the whole story.
DR #260 happened to come 6 years earlier and happens to explain it
pretty nicely, in my opinion. My point way above was that since pointer
representations are opaque (unlike the only other type of scalar), then
it is convenient that "more" of "the whole story" _can_ be encoded directly.

But if you say that my use of "trap representation" in the two previous
posts above is a misuse, I will not offer an argument against that. :)
Thank you for clarifying and for [albeit indirectly] referring to the
change brought about by DR #338. (I hope.)
 

Shao Miller

[some thoughtful and genuine criticisms]

Just in case my private e-mail goes into your spam folder: I sincerely
thank you for your criticisms; certainly much food for thought. I have
no complaint about such criticisms except that they're off-topic, _here_. :)
 

Tim Rentsch

Shao Miller said:
Now I think I understand what you are talking about, here. I think
you are discussing a trap representation as if it is associated with a
type, whereas I was discussing it as associated with an indeterminate
value.

" 3.19.2
1 indeterminate value
either an unspecified value or a trap representation"

Whether a given object representation is a trap representation
is defined only in the context of a particular type.

The term "indeterminate value" is defined in terms of (among
other things) trap representations, not the other way around.
We could probably agree that the value of 'c' is indeterminate.

That may be true, but it has no bearing on what I said about
trap representations.
So you would say that in C11, 'c' has an unspecified value, which,
once read, leads to undefined behaviour, and that it does not have a
trap representation, which, once read, would not lead to undefined
behaviour because of the exemption for character type lvalues.

No. The reading takes place only when the behavior is, or might be,
defined. In cases like this one the undefined behavior occurs
before there is any attempt at reading (alternatively, instead
of an attempt at reading). Whether c holds a legal value or
not, or any value at all, is completely immaterial.
I would say that the value of 'c' happens to have a bit pattern that
is identical to a bit pattern representing an unspecified value, but
that reading it still leads to undefined behaviour (see committee
discussion above), just as a trap representation would for
non-character types of lvalues. [snip elaboration]

Please read 6.3.2.1 p2 again carefully. No reading takes place
(obviously not counting the possibility that anything could have
taken place because the behavior is undefined). What value c
has, or whether it has a value, or whether storage has even been
allocated for c, has no bearing either on what happens or how
the Standard describes what happens.
 

Tim Rentsch

Shao Miller said:
[some thoughtful and genuine criticisms]

Just in case my private e-mail goes into your spam folder: I
sincerely thank you for your criticisms; certainly much food
for thought.

You're welcome, although I didn't mean to criticize, only give my
own impressions. But in any case I hope you find them helpful.
I have no complaint about such criticisms except that they're
off-topic, _here_. :)

I usually don't think about whether something is "on topic" or
not. If I think it will on the average provide a benefit to
those people who are likely to read it, typically it gets
posted. Of course I'm not always right in my judgments in
that respect, but, oh well, nobody's perfect.
 

Shao Miller

Whether a given object representation is a trap representation
is defined only in the context of a particular type.

Yes I think I grok your model (but could be mistaken)... If I
understand you correctly, type T has _exactly_ 2 to the power of (sizeof
(T) * CHAR_BIT) possible object representations. At some time S during
execution, those object representations can be partitioned thusly: Valid
values, trap representations. Furthermore, different object
representations can represent the same value.

Have I described your model correctly? Does S make any difference in
your model? At time S, is the partitioning the same for all objects
with type T, or can it be different for different objects?
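
To make the counting in that model concrete, here is a rough sketch. The helper names are hypothetical, and the second function only illustrates the unsigned char case, where there are no padding bits and every bit pattern is a value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bits in the object representation of a
   type whose sizeof is `size`. */
size_t repr_bits(size_t size)
{
    assert(size <= (size_t)-1 / CHAR_BIT);   /* guard against overflow */
    return size * CHAR_BIT;
}

/* Hypothetical helper: copy every possible bit pattern into an unsigned
   char and count how many read back as the expected value. */
unsigned long count_uchar_values(void)
{
    unsigned long count = 0;
    for (unsigned long v = 0; v <= UCHAR_MAX; ++v) {
        unsigned char src = (unsigned char)v;
        unsigned char dst;
        memcpy(&dst, &src, 1);   /* copy the object representation */
        if (dst == v)
            ++count;
    }
    return count;
}
```

For unsigned char the count equals 2 to the power of CHAR_BIT, so the "trap representations" half of the partition is empty for that type. For other types the split is up to the implementation.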
The term "indeterminate value" is defined in terms of (among
other things) trap representations, not the other way around.

Agreed.


That may be true, but it has no bearing on what I said about
trap representations.

I assume that you are referring to the equivalent of "An object with
type 'unsigned char' cannot possess an object representation that is a
trap representation for that type, as there are no trap representations
for 'unsigned char'"
So you would say that in C11, 'c' has an unspecified value, which,
once read, leads to undefined behaviour, and that it does not have a
trap representation, which, once read, would not lead to undefined
behaviour because of the exemption for character type lvalues.

No. The reading takes place only when the behavior is, or might be,
defined. In cases like this one the undefined behavior occurs
before there is any attempt at reading (alternatively, instead
of an attempt at reading). Whether c holds a legal value or
not, or any value at all, is completely immaterial.
I would say that the value of 'c' happens to have a bit pattern that
is identical to a bit pattern representing an unspecified value, but
that reading it still leads to undefined behaviour (see committee
discussion above), just as a trap representation would for
non-character types of lvalues. [snip elaboration]

Please read 6.3.2.1 p2 again carefully. No reading takes place
(obviously not counting the possibility that anything could have
taken place because the behavior is undefined). What value c
has, or whether it has a value, or whether storage has even been
allocated for c, has no bearing either on what happens or how
the Standard describes what happens.

Ok, I've re-read it. I don't understand what you've just said in the
last sentence and in the sentence further above it "Whether c...". I'd
better return to "We could probably agree that the value of 'c' is
indeterminate". Is this true?

I'll assume that you're uninterested in my explanation of why I used the
term "trap representation", since you know it to be one thing and I
meant something else.

I'll assume that you're uninterested in discussing potential
similarities between DR #260 and DR #338, since you've already stated
that you do not believe DR #260 is relevant, and explained why.

Once I've understood what you mean by "trap representation," perhaps I
can adjust my "Surely if..." accordingly.

Thank you.
 

Tim Rentsch

Shao Miller said:
Shao Miller said:
On 1/12/2013 17:52, Tim Rentsch wrote:

On 1/6/2013 23:00, Tim Rentsch wrote:
Surely if, in

    void somefunc(void) {
        unsigned char c;
        /* ... */
    }

[snip]

I would say that the value of 'c' happens to have a bit pattern that
is identical to a bit pattern representing an unspecified value, but
that reading it still leads to undefined behaviour (see committee
discussion above), just as a trap representation would for
non-character types of lvalues. [snip elaboration]

Please read 6.3.2.1 p2 again carefully. No reading takes place
(obviously not counting the possibility that anything could have
taken place because the behavior is undefined). What value c
has, or whether it has a value, or whether storage has even been
allocated for c, has no bearing either on what happens or how
the Standard describes what happens.

Ok, I've re-read it. I don't understand ... [snip]

Think, man, think! This isn't that hard of a problem.
 

Shao Miller

Shao Miller said:
On 1/12/2013 17:52, Tim Rentsch wrote:

On 1/6/2013 23:00, Tim Rentsch wrote:
Surely if, in

    void somefunc(void) {
        unsigned char c;
        /* ... */
    }

[snip]

I would say that the value of 'c' happens to have a bit pattern that
is identical to a bit pattern representing an unspecified value, but
that reading it still leads to undefined behaviour (see committee
discussion above), just as a trap representation would for
non-character types of lvalues. [snip elaboration]

Please read 6.3.2.1 p2 again carefully. No reading takes place
(obviously not counting the possibility that anything could have
taken place because the behavior is undefined). What value c
has, or whether it has a value, or whether storage has even been
allocated for c, has no bearing either on what happens or how
the Standard describes what happens.

Ok, I've re-read it. I don't understand ... [snip]

Think, man, think! This isn't that hard of a problem.

Too long; didn't read. :)

But seriously, I'd tried to break this part of the discussion into tiny
little pieces. My previous post had 4 questions. Three of them could
be answered with "yes" or "no" and the other also had two possible
answers. Is that so much to ask?

Instead, you type a one-liner that answers none of the questions,
doesn't demonstrate that agreement or understanding are even possible,
presents me to be stupid, causes me to respond with a disproportionate
amount of text in an attempt to pull some teeth, while at other times
you complain about the amount I type. That hardly seems fair.

I'll tell you what I think: I think you've misunderstood what I don't
understand about what you said. I didn't say that I don't understand
your point about the UB coming before an act of reading.

I don't know if you are trying to hint that this undefined behaviour is
at translation-time, or something. It can't always be.

    unsigned char scary(int x) {
        unsigned char c;

        if (x == 42)
            c = '\0';
        return c;
    }

What I didn't understand is:

- Why you'd type "Whether c holds a legal value or not, or any value
at all, is completely immaterial". 6.3.2.1p2 seems to indicate that
it's _not_ immaterial. If it's initialized or assigned-to, it
_certainly_ holds a valid value (short of any prior UB).

- Why you'd type "What value c has, or whether it has a value, or
whether storage has even been allocated for c, has no bearing either on
what happens or how the Standard describes what happens". 6.3.2.1p2
seems to indicate that it _does_ have a bearing on what happens. If
it's initialized or assigned-to, it _certainly_ holds a valid value and
has storage (short of any prior UB).

That is to say, if we can prove[2] it has a valid value, then we can
deduce that it was initialized or assigned-to, because that proof[2]
must involve such operations. Please remember that I was originally
talking about the possibility that 'c' could have a trap
representation[1]. If that[1] is false, then the proof[2] of this other
business _needn't_ involve such operations; there're only unspecified
values allowed. But you can't establish that trap representation[1] is
false by explaining how some other form of undefined behaviour is true,
so I didn't understand the relevance of your two statements.

It looked like you were making general statements, but perhaps you were
addressing _only_ the code example at the top, which _could_ be detected
at translation-time?

What about the rest of my previous post? Instead of giving a response
to each of the bits that might lead to some common ground, you've given
a cryptic response to one bit. It's like taking a sound-bite of a
politician saying "Uhhh" and promoting their stupidity in a campaign
against them, because stupid people frequently say "uhhh". If you'd
snipped after four more words, it'd be slightly different. But perhaps
this is where you snip in your mind, as well. "He doesn't understand
something, so he needs to think more!" :)
 

Geoff

    unsigned char scary(int x) {
        unsigned char c;

        if (x == 42)
            c = '\0';
        return c;
    }

Eh? What is the value of c when x is not 42?
What is contained in object c prior to the 'if' statement?
What code is executed when x is not 42?
 

Tim Rentsch

Shao Miller said:
Shao Miller said:
On 1/14/2013 15:04, Tim Rentsch wrote:

On 1/12/2013 17:52, Tim Rentsch wrote:

On 1/6/2013 23:00, Tim Rentsch wrote:
Surely if, in

    void somefunc(void) {
        unsigned char c;
        /* ... */
    }

[snip]

I would say that the value of 'c' happens to have a bit pattern that
is identical to a bit pattern representing an unspecified value, but
that reading it still leads to undefined behaviour (see committee
discussion above), just as a trap representation would for
non-character types of lvalues. [snip elaboration]

Please read 6.3.2.1 p2 again carefully. No reading takes place
(obviously not counting the possibility that anything could have
taken place because the behavior is undefined). What value c
has, or whether it has a value, or whether storage has even been
allocated for c, has no bearing either on what happens or how
the Standard describes what happens.

Ok, I've re-read it. I don't understand ... [snip]

Think, man, think! This isn't that hard of a problem.

Too long; didn't read. :)

But seriously, I'd tried to break this part of the discussion into
tiny little pieces. My previous post had 4 questions. Three of them
could be answered with "yes" or "no" and the other also had two
possible answers. Is that so much to ask?

You're confusing what you want with which aspects
I believe are worth addressing.
 

Shao Miller

Eh? What is the value of c when x is not 42?
What is contained in object c prior to the 'if' statement?
What code is executed when x is not 42?

In order, excluding the first question: "Indeterminate", "an
indeterminate value", "'return c;'".

But don't be fooled, this indeterminate value is not just any ordinary
indeterminate value, it is one that we can never know, since using the
lvalue in a context in which it'd normally result in a read causes
undefined behaviour. Depending on who you ask, it does or doesn't look
like a trap representation, but quacks just like one, but a little
earlier than a TR would, in C11.
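
For contrast, here is a sketch of a variant of scary() in which the question never arises, because 'c' has a stored value on every path before 'return c;' is reached. The names not_scary and check_not_scary and the default value 1 are purely illustrative and not from anyone's post.

```c
#include <assert.h>

/* Hypothetical rewrite of scary(): c is given a value on every path, so
   the lvalue conversion in `return c;` always has a stored value to read. */
unsigned char not_scary(int x)
{
    unsigned char c = 1;   /* arbitrary default, chosen for illustration */

    if (x == 42)
        c = '\0';
    return c;
}

/* Small self-check of both paths. */
void check_not_scary(void)
{
    assert(not_scary(42) == 0);
    assert(not_scary(7) == 1);
}
```

With the initializer in place, the C11 sentence in 6.3.2.1p2 about an uninitialized automatic object that could have been declared register no longer applies, whatever one chooses to call the value in the original.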

The C11 additional sentence to 6.3.2.1p2 was prompted by Defect Report
#338. The submitter suggested that "indeterminate value" be amended:

"either an unspecified value or a trap representation; or in the case
of an object of automatic storage duration whose address is never taken,
a value that behaves as if it were a trap representation, even for types
that have no trap representations in memory (including type unsigned char)"

In SC22WG14.11380, Mr. Douglas Gwyn suggests, "Trap rep. was an
unfortunate choice of name, having no necessary connection with
trapping; it was only meant to describe any bit configuration that would
not be a valid representation for the type."

In N1300.pdf, Mr. Clark Nelson suggested that besides a "variable"
having a valid value or a trap representation, another possible state
was "uninitialized", and that either of the other two states could still
apply.
 

Geoff

In order, excluding the first question: "Indeterminate", "an
indeterminate value", "'return c;'".

But don't be fooled, this indeterminate value is not just any ordinary
indeterminate value, it is one that we can never know, since using the
lvalue in a context in which it'd normally result in a read causes
undefined behaviour. Depending on who you ask, it does or doesn't look
like a trap representation, but quacks just like one, but a little
earlier than a TR would, in C11.

The C11 additional sentence to 6.3.2.1p2 was prompted by Defect Report
#338. The submitter suggested that "indeterminate value" be amended:

"either an unspecified value or a trap representation; or in the case
of an object of automatic storage duration whose address is never taken,
a value that behaves as if it were a trap representation, even for types
that have no trap representations in memory (including type unsigned char)"

In SC22WG14.11380, Mr. Douglas Gwyn suggests, "Trap rep. was an
unfortunate choice of name, having no necessary connection with
trapping; it was only meant to describe any bit configuration that would
not be a valid representation for the type."

In N1300.pdf, Mr. Clark Nelson suggested that besides a "variable"
having a valid value or a trap representation, another possible state
was "uninitialized", and that either of the other two states could still
apply.

I think what they might have been driving at with "trap representation" was the
possibility that the environment (e.g., a debugger or a debugging runtime) could
initialize the contents to a determinate value. This would be analogous to
Microsoft's debugger putting 0xCC, 0xDD, 0xCF values into various portions of
the process memory space, allowing coders to "trap" uninitialized variables. I
think the proper word would be "detect".

I think the language of the specification would have to allow this facility but
not define it.

Microsoft documents their compiler trap values:

Value         Name                Description
------------  ------------------  -------------------------------------------
0xCD          Clean Memory        Allocated memory via malloc or new but
                                  never written by the application.

0xDD          Dead Memory         Memory that has been released with delete
                                  or free. Used to detect writing through
                                  dangling pointers.

0xED or 0xBD  Aligned Fence       'No man's land' for aligned allocations.
                                  Using a different value here than 0xFD
                                  allows the runtime to detect not only
                                  writing outside the allocation, but to
                                  also detect mixing alignment-specific
                                  allocation/deallocation routines with the
                                  regular ones.

0xFD          Fence Memory        Also known as "no mans land." This is used
                                  to wrap the allocated memory (surrounding
                                  it with a fence) and is used to detect
                                  indexing arrays out of bounds or other
                                  accesses (especially writes) past the end
                                  (or start) of an allocated block.

0xFD or 0xFE  Buffer slack        Used to fill slack space in some memory
                                  buffers (unused parts of `std::string` or
                                  the user buffer passed to `fread()`). 0xFD
                                  is used in VS 2005 (maybe some prior
                                  versions, too), 0xFE is used in VS 2008
                                  and later.

0xCC                              When the code is compiled with the /GZ
                                  option, uninitialized variables are
                                  automatically assigned to this value (at
                                  byte level).

// the following magic values are done by the OS, not the C runtime:

0xAB          (Allocated Block?)  Memory allocated by LocalAlloc().

0xBAADF00D    Bad Food            Memory allocated by LocalAlloc() with
                                  LMEM_FIXED, but not yet written to.

0xFEEEFEEE                        OS fill heap memory, which was marked for
                                  usage, but wasn't allocated by HeapAlloc()
                                  or LocalAlloc(). Or that memory just has
                                  been freed by HeapFree().
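
To show the general idea behind such fill values, here is a minimal sketch. It borrows the 0xCD byte from the list above purely as an example; the helper names are hypothetical and this is not how the Microsoft runtime is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative fill byte, borrowed from the 0xCD "Clean Memory" entry.
   The helpers below are hypothetical, not part of any Microsoft API. */
#define FILL_CLEAN 0xCD

/* Mark a buffer as "allocated but never written". */
void mark_clean(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, FILL_CLEAN, len);
}

/* Count bytes that still hold the fill pattern. A byte the program
   legitimately set to 0xCD is indistinguishable, so this is a heuristic. */
size_t count_untouched(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; ++i)
        if (buf[i] == FILL_CLEAN)
            ++n;
    return n;
}
```

Note that every byte here holds a perfectly ordinary unsigned char value, which is why "detect" fits better than "trap" for this technique.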
 

Shao Miller

I think what they might have been driving at with "trap representation" was the
possibility that the environment (e.g., a debugger or a debugging runtime) could
initialize the contents to a determinate value. This would be analogous to
Microsoft's debugger putting 0xCC, 0xDD, 0xCF values into various portions of
the process memory space, allowing coders to "trap" uninitialized variables. I
think the proper word would be "detect".

I think the language of the specification would have to allow this facility but
not define it.

While that is certainly consistent with my experiences (having worked
with Microsoft environments for over a decade) where there isn't always
the luxury of having padding bits, and so seems an intuitive notion of
"trap representation" in _practice_, it doesn't seem 100% consistent
with Mr. Douglas Gwyn's note nor with perhaps the strictest reading of
the Standard, as Mr. T. Rentsch points out upthread. Oh well, we can
call it something else in discussion, such as "trappable-unspecified
value", so the term isn't confused with a Standard-defined term.
Microsoft documents their compiler trap values:
[snip table]

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.
 

Keith Thompson

Shao Miller said:
On 1/16/2013 20:32, Geoff wrote: [...]
Microsoft documents their compiler trap values:
Value Name Description
------ -------- -------------------------
0xCD Clean Memory Allocated memory via malloc or new but never
written by the application.
[snip]
0xFEEEFEEE OS fill heap memory, which was marked for usage,
but wasn't allocated by HeapAlloc() or LocalAlloc().
Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.
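
For anyone who wants to check a particular implementation, a small sketch along these lines would do. The function name is made up for illustration, and it only inspects the bytes of a null pointer; it never forms a pointer from raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this implementation stores a null
   void * as all-bits-zero, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    void *p = NULL;
    unsigned char bytes[sizeof p];

    assert(p == 0);   /* a null pointer compares equal to 0 regardless
                         of how it is represented */
    memcpy(bytes, &p, sizeof p);   /* copy the object representation */
    for (size_t i = 0; i < sizeof bytes; ++i)
        if (bytes[i] != 0)
            return 0;
    return 1;
}
```

A result of 1 says something about that implementation only; it says nothing about what the standard requires, and it tests only the one representation the compiler produces for NULL.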
 

Shao Miller

Shao Miller said:
On 1/16/2013 20:32, Geoff wrote: [...]
Microsoft documents their compiler trap values:
Value Name Description
------ -------- -------------------------
0xCD Clean Memory Allocated memory via malloc or new but never
written by the application.
[snip]
0xFEEEFEEE OS fill heap memory, which was marked for usage,
but wasn't allocated by HeapAlloc() or LocalAlloc().
Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.

I think all-bits-zero is one null pointer value representation, but I
was talking about "trap representations" in practice (as opposed to a
discussion of those that depend on padding bits). In Windows NT
kernel-land, more often than not I see that when a null pointer is
trapped, it's actually _not_ all-bits-zero; differing in the LSB. The
debugger still calls it a null pointer.
 

Keith Thompson

Shao Miller said:
Shao Miller said:
On 1/16/2013 20:32, Geoff wrote: [...]
Microsoft documents their compiler trap values:
Value Name Description
------ -------- -------------------------
0xCD Clean Memory Allocated memory via malloc or new but never
written by the application.
[snip]
0xFEEEFEEE OS fill heap memory, which was marked for usage,
but wasn't allocated by HeapAlloc() or LocalAlloc().
Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.

I think all-bits-zero is one null pointer value representation, but I
was talking about "trap representations" in practice (as opposed to a
discussion of those that depend on padding bits). In Windows NT
kernel-land, more often than not I see that when a null pointer is
trapped, it's actually _not_ all-bits-zero; differing in the LSB. The
debugger still calls it a null pointer.

A null pointer is not a trap representation; it's a perfectly valid
pointer value (that can't legally be dereferenced).

What exactly do you mean when you say "The debugger still calls it a
null pointer"? Does the debugger actually use the phrase "null pointer"
to refer to a pointer value that compares unequal to NULL?
 

Shao Miller

Shao Miller said:
On 1/16/2013 20:32, Geoff wrote:
[...]
Microsoft documents their compiler trap values:
Value       Name          Description
----------  ------------  -------------------------------------------------
0xCD        Clean Memory  Allocated memory via malloc or new but never
                          written by the application.
[snip]
0xFEEEFEEE                OS fill heap memory, which was marked for usage,
                          but wasn't allocated by HeapAlloc() or LocalAlloc().
                          Or that memory just has been freed by HeapFree().

Wonderful, wonderful summary! I thought null pointers having 0x0C as
the least-significant byte was "a thing," too, but now I can't remember
having seen that documented anywhere.

I'm fairly sure Microsoft uses all-bits-zero for null pointers. Most
implementations do the same thing, though of course the standard doesn't
require it.

I think all-bits-zero is one null pointer value representation, but I
was talking about "trap representations" in practice (as opposed to a
discussion of those that depend on padding bits). In Windows NT
kernel-land, more often than not I see that when a null pointer is
trapped, it's actually _not_ all-bits-zero; differing in the LSB. The
debugger still calls it a null pointer.

A null pointer is not a trap representation; it's a perfectly valid
pointer value (that can't legally be dereferenced).

Agreed. And that's kind of what I was getting at... That this
representation is:

1. A valid null pointer representation (behaves as a null pointer in C
code; no undefined behaviour from using this pointer value unless
dereferencing)

2. Very consistently _not_ all-bits-zero, suggesting a practical use for
a debugger to trap (as opposed to a same-sized integer/object with
all-zeroes)

What exactly do you mean when you say "The debugger still calls it a
null pointer"? Does the debugger actually use the phrase "null pointer"
to refer to a pointer value that compares unequal to NULL?

It absolutely does. (I see it more often than I'd like, some days. ;) )
I suspect that it's an instance of trap representations in practice,
as opposed to "trap representations" in C theory.
 

Keith Thompson

Shao Miller said:
Agreed. And that's kind of what I was getting at... That this
representation is:

1. A valid null pointer representation (behaves as a null pointer in C
code; no undefined behaviour from using this pointer value unless
dereferencing)

2. Very consistently _not_ all-bits-zero, suggesting a practical use for
a debugger to trap (as opposed to a same-sized integer/object with
all-zeroes)

When you say it "behaves as a null pointer in C code", what exactly do
you mean by that?

Does it compare equal to NULL? If not, then it *doesn't* behave as a
null pointer in C code.
It absolutely does. (I see it more often than I'd like, some days. ;) )
I suspect that it's an instance of trap representations in practice,
as opposed to "trap representations" in C theory.

A lot of technical terms have specific meanings in C that may not apply
in other contexts. If a debugger refers to something other than a C
null pointer as a "null pointer", then (a) it's a pity that a tool that
deals with C source code uses a term in a manner that's inconsistent
with the C meaning of the term, but (b) it doesn't have much direct
bearing on C (i.e., on anything relevant to this newsgroup).

Can you point to some documentation that discusses this use of the term
"null pointer"?
 

glen herrmannsfeldt

Keith Thompson said:
When you say it "behaves as a null pointer in C code", what exactly do
you mean by that?
Does it compare equal to NULL? If not, then it *doesn't* behave as a
null pointer in C code.

I can imagine a system that masks off some bits before comparing
it to NULL. Not that I know any that actually do it.

(snip)
A lot of technical terms have specific meanings in C that may not apply
in other contexts. If a debugger refers to something other than a C
null pointer as a "null pointer", then (a) it's a pity that a tool that
deals with C source code uses a term in a manner that's inconsistent
with the C meaning of the term, but (b) it doesn't have much direct
bearing on C (i.e., on anything relevant to this newsgroup).
Can you point to some documentation that discusses this use of the term
"null pointer"?

-- glen
 

Shao Miller

When you say it "behaves as a null pointer in C code", what exactly do
you mean by that?

Does it compare equal to NULL? If not, then it *doesn't* behave as a
null pointer in C code.

Yeah, come to think of it, you're right: I guess it wouldn't've behaved
_exactly_ the same.

A lot of technical terms have specific meanings in C that may not apply
in other contexts. If a debugger refers to something other than a C
null pointer as a "null pointer", then (a) it's a pity that a tool that
deals with C source code uses a term in a manner that's inconsistent
with the C meaning of the term, but (b) it doesn't have much direct
bearing on C (i.e., on anything relevant to this newsgroup).

Well I have to offer partial disagreement with (a). I think that the
practical usage of some terms finds Standard C in a good position to
consider adjusting its definitions. :) WinDbg can handle C++ and
assembly too, so the terms it uses seem more likely to be "as practiced"
rather than "as preached".

Can you point to some documentation that discusses this use of the term
"null pointer"?

I thought null pointers having 0x0C as the least-significant byte was "a thing," too, but now I can't remember having seen that documented anywhere.

Sorry for the confusion. I was talking about a recollection, as shown
in the sentence, above. What I'm recalling is associated with
"NULL_CLASS_PTR_DEREFERENCE", but please see the sentence above,
regarding documentation.

I thought I remembered that the LSB 0x0C pattern was used as a "trap
representation" for "null pointers", though with neither term meaning
quite what it does in C. If I ever come across such documentation,
I'll share it. Maybe Geoff has a clue?
 
