Dietmar Schindler
Tim said: Let's explore that. Characters are things like a, b, c, .... A
multibyte character is a sequence of bytes, ie a sequence of
memory locations. The set of characters is of some fixed size;
there are only so many of them. The set of (sequences of) memory
locations is of potentially unbounded size; certainly there can
be more memory locations than there are characters. If the set
of characters is smaller than the set of memory locations, how
can the set of memory locations be a subset of the set of
characters? If the set of memory locations is not a subset of
the set of characters, how can a memory location be a character?
A sequence of memory locations can represent characters, but they
are not themselves characters.
When you wrote "A multibyte character is a sequence of bytes", you
omitted an important qualification. 3.7.2 defines "multibyte character"
as "sequence of one or more bytes representing a member of the extended
character set ...". Sequences of bytes not representing a member of the
character set are not multibyte characters. Therefore, I can't follow
your reasoning at this point.
Similarly, the bit patterns in an eight-bit byte can represent
the numbers from zero to 255. We don't mean that a memory
location IS a number; a memory location -- or lots of different
memory locations -- can REPRESENT a number by having a certain
bit pattern stored in it (or them). There are only 256 numbers
between 0 and 255, but there are a lot more than 256 memory
locations -- there aren't enough numbers to go around so that
every memory location is a number. Or do you mean to say two
distinct memory locations can be the same, because they are
both (for example) the number 5?
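The distinction drawn above can be sketched in C. This is only an illustration; the names first, second, same_number and same_location are hypothetical and do not come from the thread or from the Standard. Two distinct objects hold the same number without being the same memory location:

```c
#include <stdbool.h>

/* Two distinct one-byte objects (hypothetical names), both set to 5. */
static unsigned char first = 5;
static unsigned char second = 5;

/* True when both objects currently represent the same number. */
bool same_number(void)
{
    return first == second;
}

/* True when the two objects occupy the same memory location.
   Pointers to distinct objects never compare equal. */
bool same_location(void)
{
    return &first == &second;
}
```

Calling same_number() reports a match while same_location() does not, which is the sense in which a location represents a number rather than being one.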
I think you're using "is" in the sense of "can stand in for".
And in that sense, I think I would agree with you -- a sequence
of memory locations (and the bit patterns stored therein) can
stand in for a character. But in technical writing the word "is"
is normally used in a different sense: if an X is a Y, then
the set of X's is a subset of the set of Y's. That sense of "is"
doesn't fit with what your statement about multibyte characters
implies.
Actually, I think it fits quite well; if I expand the statement "a
multibyte character is a character", I get "the set of sequences of one
or more bytes representing a member of the extended character set is a
subset of the set of members of a set of elements used for the
organization, control, or representation of data". I admit this could
probably be expressed in a less complicated way, but I consider it to be correct.
Yes, I suspect the thread has lasted as long as it has because
your understanding of language and logic differs from that of
some other readers (mine in particular, but also some others).
Here's an example. We store the bit pattern 10000000, with a
value of 0 (on a signed magnitude implementation), into an
eight-bit byte. Subsequent reads of the object alternately
return 00000000, 10000000, 00000000, ..., all of which still have
the value 0. Is this behavior allowed by the statement that "an
object retains its last-stored value"?
I guess it's allowed when the contents of the byte are interpreted as
having a signed type and it's disallowed otherwise. But, I'm sorry, I
don't see the relevance of this example.
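For readers without a signed magnitude machine at hand, the first example can be modelled in software. This is a minimal sketch under the assumption that bit 7 is the sign and bits 0 to 6 are the magnitude; the function name sm8_value is hypothetical, and the code does not describe how the host stores its own integers.

```c
#include <stdint.h>

/* Decode an 8-bit sign-and-magnitude pattern (assumed layout:
   bit 7 is the sign, bits 0..6 are the magnitude). */
int sm8_value(uint8_t pattern)
{
    int magnitude = pattern & 0x7F;
    return (pattern & 0x80) ? -magnitude : magnitude;
}
```

Under this model sm8_value(0x00) and sm8_value(0x80) both yield 0, so the two bit patterns differ while the value in the sense of 3.17 is the same.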
Second example:
    int n;
    unsigned char *p = (void*) &n;
    p[0] = 0, p[1] = 1, p[2] = 2, p[3] = 3;
What is the last-stored value (in the sense of 3.17) of the
object n? Suppose the value (in the sense of 3.17) of n is
0x00010203; how could this value have been stored, since
no expression in the running program yields that value?
The answer to your first question here is implementation-defined or
undefined, of course. I hope you don't think I dismiss the second one
too thoughtlessly (I really thought about it for quite some time), but the
answer is provided by the code snippet above, and again, I don't see why
this example should force anyone to believe that 3.17 could not apply to
the meaning of value in "last-stored value".
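The second example can be turned into something that compiles, as a rough sketch. The function name value_after_byte_stores is hypothetical, and the code assumes an int of at least four bytes with no trap representations, which the Standard does not guarantee. The result depends on the byte order of the implementation.

```c
#include <assert.h>

/* Store 0, 1, 2, 3 into the first four bytes of an int through an
   unsigned char pointer, then read the int back as an int. */
int value_after_byte_stores(void)
{
    int n = 0;
    unsigned char *p = (unsigned char *)&n;

    assert(sizeof n >= 4);   /* assumption, not a guarantee */
    p[0] = 0, p[1] = 1, p[2] = 2, p[3] = 3;

    /* No expression above ever yields this int value directly;
       it arises from the four byte stores taken together. */
    return n;
}
```

With a four-byte int, a big-endian implementation would typically return 0x00010203 and a little-endian one 0x03020100.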
The statements you give support my assertion that the term
"value" is used with different senses in different places in
the Standard. What do you think "value" means in the
statements you quote, since it can't mean the same thing as the
definition in 3.17 p1?
In my opinion, in these statements [snipped to one representative]
"An object exists, has a constant address,25) and retains its
last-stored value throughout its lifetime."
the meaning of "value" is not at all different from its definition in
3.17. When I insert the definition into the statements, I get
[snipped to one representative]
"An object exists, has a constant address,25) and retains its
last-stored precise meaning of the contents of the object when
interpreted as having its specific type throughout its lifetime."
This may sound a little clumsy, but it is not illogical, is it?
Part of my problem with your reformulation is that I don't know
what the revised wording is meant to say. What is it that's
being stored? Is it a meaning, or a bit pattern? Let's look at
a slightly different wording:
"An object exists, has a constant address,25) and retains the
precise meaning of its last-stored contents of the object when
interpreted as having its specific type throughout its lifetime."
Does your statement mean the same thing as the rewording, or does
it mean something different? The rewording makes it clear that
what is being stored is "contents" rather than "meaning". A
"value" under 3.17 is a "meaning", so a 3.17-value is not what's
being stored (if the rewording is accurate). So which is it? Is
the statement about retaining the last-stored value inconsistent
with the 3.17 notion of value, or do you mean something different
by your statement than my proposed rewording? If the latter, then
what rewording would you propose?
It appears to me that although your rewording changes what is being
stored from "meaning" to "contents", what is retained ("meaning")
remains the same, and so the observable behaviour of a program would
remain the same, making the two versions of the statement functionally
equivalent.
And remember, you answered my question
that way:
Can't in C. Even functions like memcpy() act on "arrays
of character type and other objects treated as arrays of
character type" (7.21.1 p1).
So, we can't store anything without "meaning", i.e. type.
The statement about retaining the last-stored value is, in my opinion,
as well as all other statements in the standard, not inconsistent with
the 3.17 notion of value.
I wouldn't say I find your revised definitions illogical,
but they do seem (at least to me) nonsensical.
... I've tried to divine what
it is you've been trying to say, and some of that is expressed
above (how I think you mean "is", for example). So let's see if
I can articulate what I think the Standard is trying to say.
Disclaimer: very off-the-cuff remarks follow.
The word "value" is used in several senses in the Standard.
The definition given in 3.17 uses "value" in the sense of
how we are to interpret the contents of an object, when
interpreted in a particular context (namely, as what type).
Many places in the Standard use "value" not in the sense of 3.17,
but the sense of "raw bit pattern". Thus, when we say that an
object retains its last-stored "value", this wording is meant
to say that an object retains the last bit pattern stored into
that object. Similarly, we say a write access modifies an
object even if the raw bit pattern being stored matches the
bit pattern already in the object.
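The two senses can be told apart on common hardware with a small sketch. It assumes IEEE 754 floating point, where positive and negative zero compare equal but are stored differently; the names same_value and same_bits are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* "Value" in the sense of 3.17: the contents interpreted as float. */
bool same_value(float a, float b)
{
    return a == b;
}

/* "Value" in the sense of raw bit pattern: compare the bytes. */
bool same_bits(float a, float b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Under that assumption, same_value(0.0f, -0.0f) holds while same_bits(0.0f, -0.0f) does not, so a statement about a "retained value" says different things depending on which sense is meant.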
There is some confusion between the two senses, and that
confusion has prompted some esoteric discussion in the comp.*.c
newsgroups. For example, suppose a particular address is stored
in a pointer variable, and then the memory at that address is
deallocated (by calling free(), for example). Is the bit pattern
that was stored in the pointer allowed to change? I remember
there being discussion at some point on this question, although I
don't remember the outcome. However, part of what prompted the
discussion in the first place -- and what made it difficult -- is
the confusion over which sense of "value" is meant in statements
like "an object retains its last-stored value". If "value" means
"raw bit pattern" you reach one conclusion, and if "value" means
value-ala'-3.17 you (can) reach another. Some of the debate
centered around this confusion, although I don't remember if it
was ever explicitly identified.
Note that I'm not saying that one sense is the true sense and the
other sense is wrong; there is some ambiguity in the Standard
itself, and that ambiguity deserves (IMO) a DR (which hopefully
would result in revised and improved wording, but let's not get
into that right now). But -- and here is the important thing --
many places in the Standard where "value" is used are best read
as though "value" meant "raw bit pattern". And that sense of
value is definitely different than "value" as defined in 3.17.
Does that all make sense?
I would be exaggerating here if I claimed that all made sense to me, but
it certainly did to some extent, and I understand your position better
now.
Unfortunately (or fortunately, as it may also be perceived) I'll not be
able to continue this interesting (at least for me, maybe not so for
others) discussion for the next two weeks, because I'll be on vacation.
So long!