sizeof array via pointer

Mark S.

Alf said:
This is also where the (int) cast came in handy - PR((int)cf[4])
actually displays the 0 instead of nothing (which could mean anything).

Well you didn't do that in the code presented so far, so it appears to
be some after-the-fact rationalization. But instead of the evil cast
consider just adding 0. For example. Only when your code is almost
chemically free of casts can you have any confidence that it is correct.
Casts are just very, very evil.
Believe it or not I did play around with the char-array when I started
the exercise (since I didn't really have any experience with them) and
used it to see the values for various chars. ;)

But I will try to keep away from casts in the future, I promise!

Because the string literal is read-only. The compiler might place it in
read-only memory. And not only might but is likely to.
Hmm, I hear what you are saying but I really wonder why this was
designed this way. I thought all I was doing was defining an array (of
chars) with the perk that '\0' was added at the end. If they are so
limited, why use them at all (unless you really just need a read-only
variable)? After all, manipulation of text is a pretty basic task, isn't
it? I must admit I'm quite confused about this.
It's a good question.

The positive point about this exercise is that in addition to providing
some familiarity with pointers and indexing, it's about what has to be
done at the lowest level in order to get you a modifiable copy of a
zero-terminated string when all you have is a pointer to that string.

But generally, in C++ you can do that much more easily using
std::string. :)
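
To make the contrast concrete, here is a minimal sketch of both routes to a modifiable copy. The function names copy_c_string and with_first_char are hypothetical, chosen only for illustration; they are not from the exercise in TICPP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Low-level route: allocate a buffer and copy the characters, including '\0'.
// The caller owns the result and must release it with delete[].
char* copy_c_string(const char* source)
{
    assert(source != nullptr);
    std::size_t const length = std::strlen(source);
    char* copy = new char[length + 1];       // +1 for the terminating '\0'
    std::memcpy(copy, source, length + 1);
    return copy;
}

// std::string route: the copy, its storage and its cleanup are handled
// inside the class, so there is nothing to free by hand.
std::string with_first_char(const char* source, char replacement)
{
    assert(source != nullptr);
    std::string text(source);                // modifiable copy of the characters
    if (!text.empty()) {
        text[0] = replacement;
    }
    return text;
}
```

Both functions leave the original string literal untouched; only the copy is modified.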
I guess I will learn more about this later then (Vol. I of TICPP only
has a very short paragraph concerning strings).
One doesn't.

One ensures that it is freed, as appropriate.

Mainly, in C++ the way to do that is to use smart pointers and standard
library containers. Some programmers (e.g. James Kanze) also use garbage
collection. But no matter how you do it, in good C++ code there is
seldom any 'delete' to be seen anywhere (done via containers or smart
pointers), and 'new', if at all present, is wrapped in some function.
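
A rough sketch of that style, assuming a compiler with C++11 support (std::unique_ptr postdates this thread; at the time one would have used something like boost::shared_ptr). Widget, make_widget and total_name_length are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical class, only here to have something to allocate.
struct Widget
{
    std::string name;
    explicit Widget(const std::string& n) : name(n) {}
};

// The only 'new' in the program, wrapped in a function that hands
// ownership straight to a smart pointer.
std::unique_ptr<Widget> make_widget(const std::string& name)
{
    return std::unique_ptr<Widget>(new Widget(name));
}

std::size_t total_name_length(const std::vector<std::string>& names)
{
    std::vector<std::unique_ptr<Widget>> widgets;   // container owns the widgets
    for (const std::string& n : names) {
        widgets.push_back(make_widget(n));
    }
    std::size_t total = 0;
    for (const auto& w : widgets) {
        assert(w);                                  // every slot holds a widget
        total += w->name.size();
    }
    return total;   // all widgets are destroyed here; no 'delete' was written
}
```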
Okay, I guess I'll learn about smart pointers etc. later as well and
keep your words in mind for now.

Thanks again for all your time!
 

James Kanze

Alf P. Steinbach wrote:

[...]
Hmm, I hear what you are saying but I really wonder why this
was designed this way.

Because it's a literal. (Historically, it wasn't this way in
early C. But constants that aren't constant quickly render a program
unreadable, so it was changed. When you see "abcd" in the code,
it means "abcd", and not "xyz".)
I thought all I was doing was defining an array (of chars)
with the perk that '\0' was added at the end. If they are so
limited, why use them at all (unless you really just need a
read-only variable)? After all, manipulation of text is a
pretty basic task, isn't it? I must admit I'm quite confused
about this.

You use them as constants. Just as you use 42 or 3.14159 as
constants. You wouldn't like it if something like f( 42 )
actually passed 36 to the function, and it doesn't make any more
sense for f( "abcd" ) to actually pass "xyz" to the function.
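
For the original question, the distinction that matters is between a pointer to the literal and an array initialised from it. A small sketch, with hypothetical function names:

```cpp
#include <cassert>
#include <cstring>

// Returns a pointer to the literal itself. The const documents that the
// characters are read-only, just like the constant 42.
const char* greeting_literal()
{
    return "abcd";
}

// Fills a caller-supplied array with a modified copy of the literal.
void make_modified_copy(char (&buffer)[5])
{
    char local[] = "abcd";    // a real array of 5 chars, copied from the literal
    assert(std::strlen(local) == 4);
    local[0] = 'x';           // fine: changes the array, not the literal
    std::memcpy(buffer, local, sizeof local);   // sizeof local is 5 here
}
```

Writing through a pointer to the literal would be undefined behavior, whereas the array is an ordinary variable that happens to start out with the literal's contents.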

[...]
The main reason there is seldom a delete seen anywhere is that
there is very little dynamic allocation in C++. If it's for
variable sized structures, like strings, the allocation and
delete are hidden deep down in the class internals. And
otherwise, it's a question of entity objects, which, often at
least, manage their own lifetimes. But based on your questions,
I don't think you're at the point yet where you need to be
concerned with them; you probably shouldn't be worrying about
dynamically allocated memory at all for the moment.
 

Alf P. Steinbach

* blargg:
You're saying that formally, the following isn't clearly well-behaved?

char* p = new char;
delete p;
p = new char;

Yes, it isn't /clearly/ well-behaved ;-).

As I wrote, formally it's a bit problematic.

For the C++ standard, unfortunately, only defines the term "used", not what the
thing is "used" for... And so the ODR, §3.2/2, explains that in the last
statement above p is "potentially evaluated", which it goes on to explain means
that it is "used". And §5.3.5/4, about delete, explains that the pointer at this
point has been rendered "invalid", so, we have a "use" of an "invalid" value.

And we're then in one of four or five situations:

* Someone is able to find the place in the C++ standard, or even the C
standard, that prohibits some kinds of using (such as copying) invalid
pointer values, and allows others (such as overwriting). In which case one
could perhaps reason more clearly about things. But until then, unknown.

* One assumes that since such wording is not to be found in either
standard, /any/ use whatsoever is Undefined Behavior, in which case
the above is formally UB -- ouch!

* One assumes that since such wording is not to be found in either
standard, /any/ use is well-defined except when rendered UB by rules
for UB of usage results (e.g. this would apply to dereferencing). But
this would mean that passing invalid pointer values around would be
OK. And we "know" that it's not, or at least, that's an opinion that
is often bandied about, with arguments based on checking behavior of
some computer architectures that presumably C++ should support.

* One assumes that since such wording is not to be found in either
standard, any use that requires /inspection/ of the value is UB,
while other uses are only UB if they're rendered UB by rules for
results. In other words, one essentially assumes that overwriting is OK,
but copying is not. As I understand it this is the usual assumption.

* One assumes something else, perhaps some ultra-refined view of things.

As I understand it there is an effort to delineate the cases of UB for the C
standard.

But as far as I know, there's no such effort for C++.


Or that this isn't clearly ill-behaved (uses indeterminate value)?

char* p = new char;
delete p;
assert( p );

Ah, formally, for C++, sorry, no! :) There is wording that *implies* to some
degree that dereferencing an invalid pointer value is UB, and at least it's my
impression, vaguely remembering things, also wording that implies that copying
an invalid pointer value is UB. But then there is also e.g. the explicit
allowed-to-dereference-nullpointer for typeid rule, and there's not even any
clear definition of "invalid" or "valid" re pointer values. It's a mess. IMHO.


[snip]
I'm not trying to upset you with my posts, by the way.

OK. :)


Cheers, & hth.,

- Alf
 

James Kanze

* blargg:

That's false, of course. There's no problem using the pointer,
as long as there's no lvalue to rvalue conversion.
Yes, it isn't /clearly/ well-behaved ;-).
As I wrote, formally it's a bit problematic.

Not at all.
For the C++ standard, unfortunately, only defines the term
"used", not what the thing is "used" for... And so the ODR,
§3.2/2, explains that in the last statement above p is
"potentially evaluated", which it goes on to explain means
that it is "used". And §5.3.5/4, about delete, explains that
the pointer at this point has been rendered "invalid", so, we
have a "use" of an "invalid" value.

And how is that related to the issue at hand? There's no rule
that you can't use a pointer once it's been deleted. At least,
I've never seen any.

There is a rule that:
If the argument given to a deallocation function in the
standard library is a pointer that is not the null
pointer value (4.10), the deallocation function shall
deallocate the storage referenced by the pointer,
rendering invalid all pointers referring to any part of
the deallocated storage. The effect of using an invalid
pointer value (including passing it to a deallocation
function) is undefined.

But it's not using p which is undefined, it's using its value.
And if I read §4.1 correctly, the "value" of an object is the
rvalue which results from the lvalue to rvalue conversion.
(First sentence of the second paragraph.) Now, I agree that
the standard explains this in a sort of backhanded way, and
could be a lot clearer. But I don't see any other
interpretation which is really possible.
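
Under that reading, the distinction looks roughly like this. The sketch only performs the operations that are fine under the interpretation above and describes the problematic ones in comments; reuse_after_delete is a hypothetical name.

```cpp
#include <cassert>

char reuse_after_delete()
{
    char* p = new char('a');
    delete p;             // p now holds an invalid pointer value

    // Reading p at this point (copying it, comparing it, passing it to a
    // function, or assert( p )) would require an lvalue-to-rvalue conversion
    // of the invalid value, which is the "use" the quoted rule makes undefined.

    p = new char('b');    // assignment stores into p without reading the old value
    char const result = *p;

    delete p;
    p = nullptr;          // defensive habit: p has a valid (null) value again
    assert(p == nullptr); // reading p is fine now
    return result;
}
```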

[...]
Ah, formally, for C++, sorry, no! :)

That's clearly undefined behavior. (And I'm aware of machines
where it would cause a core dump.)
There is wording that *implies* to some degree that
dereferencing an invalid pointer value is UB,

He's not dereferencing the pointer. However, if he were:

The wording (§5.3.1) isn't perfect:
The unary * operator performs indirection: the
expression to which it is applied shall be a pointer to
an object type, or a pointer to a function type and the
result is an lvalue referring to the object or function
to which the expression points.
The "type"s in the first sentence certainly suggests that
all that we're concerned with is the type of the pointer,
and not whether it actually points to something. However,
the wording here doesn't have to be perfect, because the
unary * operator takes an rvalue, and the lvalue to rvalue
conversion results in undefined behavior here, either
because of §3.7.3.2 (in this case, using the pointer value
of a pointer to a deleted object) or §4.1 (if he were
dereferencing an uninitialized pointer). Dereferencing a
null pointer (which is a valid pointer value) is taken care
of in §4.10/1.
and at least it's my impression, vaguely remembering
things, also wording that implies that copying an invalid
pointer value is UB.

You don't have to "vaguely remember". Anything which uses
the "pointer value" (in other words, requires an lvalue to
rvalue conversion) is undefined behavior.
But then there is also e.g. the explicit
allowed-to-dereference-nullpointer for typeid rule, and
there's not even any clear definition of "invalid" or
"valid" re pointer values. It's a mess. IMHO.

It's not nearly as bad as you make out, although you do have
to look in several different places. The key, however, is
always the *value*, which in the case of an lvalue
expression (e.g. the name of a variable) is the result of an
lvalue to rvalue conversion.

As for the rule concerning typeid, there's no special rule
for invalid values either. It does have a special rule
concerning null pointers---I don't really know why. (It
makes less sense than the rule in C that &a[i] is legal even
when i is one past the end of the array.) But C++ has a lot
of special rules.
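
Both special cases can be shown in a few lines. This is a sketch only: Base is a hypothetical polymorphic class, and the one-past-the-end pointer is formed as a + 3, which is valid in both C and C++.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic class; the typeid rule only applies to these.
struct Base
{
    virtual ~Base() {}
};

// Returns true if typeid( *p ) throws std::bad_typeid, which the standard
// requires when p is a null pointer to a polymorphic type.
bool typeid_throws_for(Base* p)
{
    try {
        return !(typeid(*p) == typeid(Base));
    } catch (const std::bad_typeid&) {
        return true;
    }
}

// A one-past-the-end pointer may be formed and compared, but not dereferenced.
int sum_with_end_pointer()
{
    int a[3] = {1, 2, 3};
    int* const end = a + 3;
    assert(end - a == 3);
    int sum = 0;
    for (int* it = a; it != end; ++it) {
        sum += *it;
    }
    return sum;
}
```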
 

Alf P. Steinbach

* James Kanze:
That's false, of course.

If it was then I'd be happy.

Because it has to do with the quality of the standard, that one should be able
to be confident that an answer to any such basic issue should be there somewhere
without adding a programmer's intuition and sense of what's reasonable.

But alas, at least not as far as I can see.

However, let's note for other readers that we're really discussing something
akin to the question of maximum number of mites on a dog's snout without the dog
sneezing.

A middle ages teologician (speling?) would, presumably, feel right at home in
such a discussion... But, it can be fun. One can even learn something.

There's no problem using the pointer,
as long as there's no lvalue to rvalue conversion.

Depends on whether you mean formally or in practice.

In practice we know what's reasonable and intended, so it's not really an
issue, except for that QA aspect noted above.

But formally, AFAICS the standard is just silent.

Not at all.


And how is that related to the issue at hand? There's no rule
that you can't use a pointer once it's been deleted. At least,
I've never seen any.

You're quoting one directly below. :)

There is a rule that:
If the argument given to a deallocation function in the
standard library is a pointer that is not the null
pointer value (4.10), the deallocation function shall
deallocate the storage referenced by the pointer,
rendering invalid all pointers referring to any part of
the deallocated storage. The effect of using an invalid
pointer value (including passing it to a deallocation
function) is undefined.

But it's not using p which is undefined, it's using its value.

I'd agree except the C++ standard doesn't seem to guarantee that the value isn't
"used" in e.g. an assignment to the pointer object.

On the contrary, §4.1/1 just says that lvalue-to-rvalue conversion /can/ occur
(in unspecified contexts), and §4/5 (non-normative text) notes that some
conversions are /suppressed/, that is, cannot occur, then gives as example
suppression of lvalue-to-rvalue conversion of the operand to unary &, and goes
on to tell us that "Specific exceptions are given in the descriptions of those
operators and contexts", which exceptions, however, seem nowhere to be found.

The §4/5 note indicates that even where a programmer's intuition and sense of
reasonableness, such as for unary &, makes it absolutely clear that there is no
lvalue-to-rvalue conversion, the standard will still point it out especially, or
at least, was intended to point it out especially.

And that means that at least the standard's authors shared the view that for the
formal picture it's not enough to understand what's reasonable: that for the
formal any suppression of lvalue-to-rvalue must be stated directly, even for
operators like unary & or (my example) = assignment, in the way §4/5 indicates.

Now, §5/8 tells us that for an operator that "expects" an rvalue operand,
lvalue-to-rvalue is performed. So we would have had a set of cases where the
conversion is guaranteed (which is the opposite of the desired formal guarantee,
but at least something!), except that checking out e.g. + there's no wording
that it expects rvalue operands, so formally this set is rather empty...

Of course for the in-practice we know from programmer's intuition and notions of
what's reasonable that + not only accepts rvalue operands but does expect them,
that this is an example of what §5/8 is going on about, but AFAICS it's not
stated. And so for the in-practice we can deduce that +=, defined by equivalence
to an expression involving = and +, involves an operation that internally
expects an rvalue operand, and so that, still speaking about the in-practice,
for += the /value/ is "used". But at the formal level += is described with
exactly the same language as =, namely that (one reasonably understands that in
spite of the absolute-sounding wording this must surely be meant to apply only
to the built-in operators, intent, intent!) it requires an lvalue left hand side
operand -- and so if that "requires lvalue" is the language to use to decide
whether §5/8's "expects rvalue" applies, then formally §5/8 doesn't apply to +=,
bummer!

At the formal level this is similar to how dereferencing a nullpointer is in
practice known to be UB in general (the single exception for C++98 being
typeid), and where the standard even refers to itself as saying, somewhere
unspecified, that it's UB, without actually saying it anywhere.

And if I read §4.1 correctly, the "value" of an object is the
rvalue which results from the lvalue to rvalue conversion.
(First sentence of the second paragraph.) Now, I agree that
the standard explains this in a sort of backhanded way, and
could be a lot clearer. But I don't see any other
interpretation which is really possible.

I agree if you replace "possible" with "reasonable".

But we already know that the in-practice is what we expect it to be, there's no
in-practice problem whatsoever.

It's the formal that's problematic, which doesn't and can't depend on notions of
reasonableness.

[...]
Ah, formally, for C++, sorry, no! :)

That's clearly undefined behavior. (And I'm aware of machines
where it would cause a core dump.)

Uhm, you're actually right.

And I was wrong about this one.

Sorry.

He's not dereferencing the pointer. However, if he were:

The wording (§5.3.1) isn't perfect:
The unary * operator performs indirection: the
expression to which it is applied shall be a pointer to
an object type, or a pointer to a function type and the
result is an lvalue referring to the object or function
to which the expression points.
The "type"s in the first sentence certainly suggests that
all that we're concerned with is the type of the pointer,
and not whether it actually points to something. However,
the wording here doesn't have to be perfect, because the
unary * operator takes an rvalue,

Not that I disagree for the in-practice, but formally, where do you get that the
'*' operator takes an rvalue (the standard's "expects" an rvalue)?

and the lvalue to rvalue
conversion results in undefined behavior here, either
because of §3.7.3.2 (in this case, using the pointer value
of a pointer to a deleted object) or §4.1 (if he were
dereferencing an uninitialized pointer). Dereferencing a
null pointer (which is a valid pointer value) is taken care
of in §4.10/1.

No, the UB of dereferencing a nullpointer is AFAIK nowhere taken care of in the
current standard, except for allowing it for typeid.

It may be that it's taken care of in the latest C++0x draft.

However, as late as N2800 it doesn't seem to have been taken care of.

You don't have to "vaguely remember". Anything which uses
the "pointer value" (in other words, requires an lvalue to
rvalue conversion) is undefined behavior.

Right, sorry, as noted earlier.

Except, of course, if as you hinted at earlier we really get pedantically into
whether an lvalue-to-rvalue conversion formally uses "the value".

Then at the formal level we're again on the slippery slope, since the text that
defines the conversion neglects to state both that this is a usage and that it
involves the value... :)

It's not nearly as bad as you make out, although you do have
to look in several different places. The key, however, is
always the *value*, which in the case of an lvalue
expression (e.g. the name of a variable) is the result of an
lvalue to rvalue conversion.

One hopes. He he. :)

As for the rule concerning typeid, there's no special rule
for invalid values either. It does have a special rule
concerning null pointers---I don't really know why. (It
makes less sense than the rule in C that &a[i] is legal even
when i is one past the end of the array.) But C++ has a lot
of special rules.


Oh yes.


Cheers & hth.,

- Alf (retroactively building an argument, he he :), it really is The Holy
Standard! )
 
