Objects and expressions - nitpicky questions

William L. Bahn

===========
(N869, p8) An object is a region of data storage in the execution
environment, the contents of which can represent values.

When explaining this to students, is it reasonable to say
something like:

From a practical standpoint, objects are things that have
addresses - addresses that can be computed. This doesn't
necessarily mean that we can both read and write to that object -
we may be able to do one, the other, or both (if we can't do
either, it's rather pointless).

So a string literal is an object. It is sitting in memory when
your program is running. We can determine the address of it. We
can read from it. We shouldn't write to it (doing so invokes
undefined behavior).

But a numerical constant, such as the 3 in

y = x + 3;

is not an object. It is not (at least not guaranteed to be) at a
particular region of data storage. It is very possibly a value
embedded in an instruction's op code. Trying to compute the
address of 3 will invoke undefined behavior.


That last clause, about "the contents of which can represent
values": is that basically rhetorical, or is it important to the
definition of an object? Asked another way, what would be
incorrect about just saying that an object is a region of data
storage in the execution environment? What things (worthy of
making the distinction) would that include that would not be
included as objects under the actual definition?
===========

(N869, p68 - subclause 6.5) An expression is a sequence of
operators and operands that specifies computation of a value, or
that designates an object or a function, or that generates side
effects, or that performs a combination thereof.

So what about just a constant?

The line:

32;

Does nothing, but it's a legal expression, right? Yet it is not
an operand (since there is no operator), it is not an operator,
it denotes neither an object nor a function, and it generates no
side effects. So, technically, this isn't an expression, correct?

But the next subclause, 6.5.1, specifically says that an
identifier (provided it denotes an object or a function), a
constant, a string literal (which denotes an object), and a
parenthesized expression are all primary expressions. This is in
agreement with the basic definition of an expression except for
the case of a constant.

Is this a subtle oversight in the basic definition of an
expression? Or am I just not reading something right?
 
osmium

William said:
(N869, p8) An object is a region of data storage in the execution
environment, the contents of which can represent values.

When explaining this to students, is it reasonable to say
something like:

I think an instructor in C should simply avoid using the word "object".
Since the word has different meanings in different contexts only a language
lawyer should need to use this word in the context of C. Language lawyers
develop themselves, they aren't students in the generally accepted meaning
of "student". There are enough other words so this one can be avoided,
example: l value, r value; I can't imagine a situation where an ordinary
student must be told what a C object is.

If a student specifically asks, only then should the instructor answer that
it is one of those nasty, annoying words like "stack" or "moot", and then
elaborate.
 
CBFalconer

William L. Bahn said:
(N869, p8) An object is a region of data storage in the execution
environment, the contents of which can represent values.

This happens to be section 3.15. The section number is immune to
actual pagination, which can vary, so it is better to use them in
citing references.
 
Keith Thompson

William L. Bahn said:
But a numerical constant, such as the 3 in

y = x + 3;

is not an object. It is not (at least not guaranteed to be) at a
particular region of data storage. It is very possibly a value
embedded in an instruction's op code. Trying to compute the
address of 3 will invoke undefined behavior.

Trying to compute the address of 3 (&3) doesn't invoke undefined
behavior; it's a constraint violation requiring a diagnostic, because
3 isn't an lvalue.

(There's a discussion now in comp.std.c about the C99 standard's
definition of "lvalue". By a literal reading of the definition, 3 is
both an lvalue and a modifiable lvalue. But it's universally agreed
that that's at most a flaw in the wording of the standard. 3 really
isn't an lvalue; the debate is over whether the standard correctly
expresses that fact.)
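
A minimal sketch of the practical difference, assuming a C99 compiler; the function name `literal_has_address` is made up for illustration. A string literal yields an array object whose address can be taken, whereas `&3` is rejected at translation time, so that line is left commented out.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the address of a string literal
   can be taken and the storage behind it can be read. */
int literal_has_address(void)
{
    /* "hello" has type char[6], so &"hello" has type char (*)[6]. */
    char (*p)[6] = &"hello";

    /* int *q = &3; */   /* does not compile: 3 does not designate an object */

    return p != NULL && (*p)[0] == 'h' && (*p)[5] == '\0';
}
```

Reading through `p` is fine; writing through it would modify the literal's array, which is undefined behavior.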

[...]
(N869, p68 - subclause 6.5) An expression is a sequence of
operators and operands that specifies computation of a value, or
that designates an object or a function, or that generates side
effects, or that performs a combination thereof.

So what about just a constant?

The line:

32;

Does nothing, but it's a legal expression, right? Yet is it not
an operand (since there is not operator), it is not an operator,
it denotes neither an object nor a function, and it generates no
side effects. So, technically, this isn't an expression, correct?

But the next subclause, 6.5.1, specifically says that an
identifier (provided it denotes an object or a function), a
constant, a string literal (which denotes an object), and a
parenthesized expression are all primary expressions. This is in
agreement with the basic definition of an expression except for
the case of a constant.

Is this a subtle oversight in the basic definition of an
expression? Or am I just not reading something right?

In my opinion, it's a subtle oversight in the basic definition of an
expression. In my opinion (and this is part of the discussion over in
comp.std.c), the definitions in section 3 of the standard are
reasonably good ones, but some of the definitions scattered through
the text of the standard, with the defined term in italic type, are
incomplete. Some of them seem to me to be statements about the term
being defined, but not actually definitions of the term. (See
"expression" and "lvalue".)
 
Eric Sosman

William said:
===========
(N869, p8) An object is a region of data storage in the execution
environment, the contents of which can represent values.

There's a long and heated thread on this topic currently
raging in both comp.lang.c and comp.std.c. You may want to
peruse at least part of it.

When explaining this to students, is it reasonable to say
something like:

From a practical standpoint, objects are things that have
addresses - addresses that can be computed. This doesn't
necessarily mean that we can both read and write to that object -
we may be able to do one, the other, or both (if we can't do
either, it's rather pointless).

So a string literal is an object. It is sitting in memory when
your program is running. We can determine the address of it. We
can read from it. We shouldn't write to it (doing so invokes
undefined behavior).

But a numerical constant, such as the 3 in

y = x + 3;

is not an object. It is not (at least not guaranteed to be) at a
particular region of data storage. It is very possibly a value
embedded in an instruction's op code. Trying to compute the
address of 3 will invoke undefined behavior.

The difference disappears if you're careful to distinguish
source-code constructs from their run-time effects. A string
literal is a source-code notation, a communication from you to
the compiler. It is defined as causing the compiler to create
an anonymous char[] array with the specified contents (or in
one specialized use, to provide the initializer for a named
char[] array). The char[] array is an object, but the literal
itself is just notation.

In the second case, the `3' is also notation, a notation
whose effect is defined as creating a value. Some machines
may implement this by creating an anonymous int initialized
to three, but this object (if it exists) is an implementation
detail and not mandated by the language.

The `+', of course, is also a notation, and it is defined
as causing an addition operation to occur on a pair of operands.
Nobody thinks of the addition as an object, or tries to take
its address -- and yet, it's conceivable that it might actually
have one! Imagine a machine that lacks an ADD instruction but
has an adder wired up to a memory location in the manner of
some I/O devices. The compiler might implement addition by
storing the operands to the "summand" locations and fetching
a result from the "sum" location; one could claim that the
(volatile) memory cells of the adder are an "object" at a
certain address. But even if an implementation backs `+'
with a hidden object, that object is as invisible to the C
language as is the hypothetical object that backs the `3'.
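
To make the string-literal half of this concrete, here is a small sketch (the function names are invented for illustration) showing the same notation, "abc", in the two roles described above: denoting an anonymous array, and initializing a named one.

```c
#include <string.h>

/* The literal denotes an unnamed array of 4 chars: 'a', 'b', 'c', '\0'. */
size_t anonymous_array_size(void)
{
    return sizeof "abc";
}

/* Here the literal only supplies the initial contents of buf.
   buf is an ordinary modifiable array, so writing to it is fine. */
int named_array_is_writable(void)
{
    char buf[] = "abc";
    buf[0] = 'x';
    return strcmp(buf, "xbc") == 0;
}
```

In the second function no anonymous array needs to exist at run time; the only object the language guarantees is `buf` itself.
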
That last clause, about "the contents of which can represent
values". Is that basically rhetorical or it is important to the
definition of an object. Asked another way, what would be
incorrect about just saying that an object is a region of data
storage in the execution environment? What things (worthy of
making the distinction) would that include that would not be
included as objects under the actual definition?

I'm as puzzled as you are: if something can't represent
values, it doesn't seem that it could be called "data storage."
Rhetoric, I'd guess, or extra redundant explanatory clarification.
===========

(N869, p68 - subclause 6.5) An expression is a sequence of
operators and operands that specifies computation of a value, or
that designates an object or a function, or that generates side
effects, or that performs a combination thereof.

This is Section 6.5, paragraph 1. The same language appears
in the final Standard.

So what about just a constant?

The line:

32;

Does nothing, but it's a legal expression, right? Yet is it not
an operand (since there is not operator), it is not an operator,
it denotes neither an object nor a function, and it generates no
side effects. So, technically, this isn't an expression, correct?

But the next subclause, 6.5.1, specifically says that an
identifier (provided it denotes an object or a function), a
constant, a string literal (which denotes an object), and a
parenthesized expression are all primary expressions. This is in
agreement with the basic definition of an expression except for
the case of a constant.

Is this a subtle oversight in the basic definition of an
expression? Or am I just not reading something right?

I can't find any way to construe `32' as an object -- my
whole argument above claims that it isn't one! So I think
you've spotted a contradiction between the English of 6.5/1
and the formal grammar of 6.5.1/1: Pass "GO," collect $200.
 
Dan Pop

Keith Thompson said:
Trying to compute the address of 3 (&3) doesn't invoke undefined
behavior; it's a constraint violation requiring a diagnostic, because
3 isn't an lvalue.

It's actually *both* a constraint violation (6.5.3.2p1) and undefined
behaviour (6.3.2.1p1).

According to 6.3.2.1p1, 3 *is* an lvalue, the wording is crystal clear:

1 An lvalue is an expression with an object type or an incomplete
type other than void...

As long as 3 is an expression of type int, it is also an lvalue, according
to C99's definition of lvalue.

(There's a discussion now in comp.std.c about the C99 standard's
definition of "lvalue". By a literal reading of the definition, 3 is
both an lvalue and a modifiable lvalue. But it's universally agreed
that that's at most a flaw in the wording of the standard. 3 really
isn't an lvalue; the debate is over whether the standard correctly
expresses that fact.)

C99 clearly expresses the fact that 3 *is* an lvalue. One that invokes
undefined behaviour when used as such, but an lvalue, nevertheless.

The C89 definition of lvalue made sense, C99 "fixed" something that wasn't
broken.

Dan
 
Malcolm

William L. Bahn said:
That last clause, about "the contents of which can represent
values". Is that basically rhetorical or it is important to the
definition of an object. Asked another way, what would be
incorrect about just saying that an object is a region of data
storage in the execution environment? What things (worthy of
making the distinction) would that include that would not be
included as objects under the actual definition?

I think what it is trying to say is that an "object" must be something
humanly meaningful. A Martian reverse engineer looking at a typical computer
program would just see bits going in and out of the registers and being read
and written from memory. However if he knows a bit of human psychology then
he can begin to distinguish things like "strings" and "floating point
numbers", maybe even structures. If he understands human social life, then a
name, national insurance number, payroll number and salary represent an
"employee".
 
Keith Thompson

C99 clearly expresses the fact that 3 *is* an lvalue. One that invokes
undefined behaviour when used as such, but an lvalue, nevertheless.

It's worse than that; according to the literal wording, it invokes
undefined behavior when evaluated as an expression, whether it's in a
context requiring an lvalue or not. The wording is:

An lvalue is an expression with an object type or an incomplete
type other than void; if an lvalue does not designate an object
when it is evaluated, the behavior is undefined.

This is the definition of "lvalue". The best interpretation I've been
able to come up with is that it's an incomplete definition (or, if you
prefer, that it isn't really a definition at all). As a statement
about lvalues, there's nothing wrong with it; as a complete
definition, it misses the critical idea that an lvalue refers to an
object.

The C89 definition of lvalue made sense, C99 "fixed" something that wasn't
broken.

Actually, there was a problem with the C89/C90 definition.

The C90 standard says:

An lvalue is an expression (with an object type or an incomplete
type other than void) that designates an object.

The problem is that *ptr is an lvalue if ptr points to an object, but
is not an lvalue if ptr is a null or invalid pointer. A strict reading
of C90 implies that
*ptr = 42;
is a constraint violation (requiring a compile-time diagnostic) if ptr
is a null pointer (which occurs at run time).

The change in C99 was an attempt to make lvalue-ness something that
can be determined during compilation; *ptr is an lvalue even if ptr is
a null pointer, but attempting to use *ptr as an lvalue invokes
undefined behavior. But in dropping the requirement that an lvalue
must designate an object, C99 inadvertently dropped the idea that (at
least potentially) designating an object is what lvalues are all
about.
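
A short sketch of that point, under the C99 reading described above; `store_if_valid` is a hypothetical name. The compiler accepts `*ptr = 42;` without knowing what `ptr` will hold, and whether `*ptr` actually designates an object is only settled at run time.

```c
#include <stddef.h>

/* Hypothetical helper: stores 42 through ptr when ptr is usable.
   *ptr is an lvalue as far as translation is concerned, whatever
   value ptr turns out to have when the program runs. */
int store_if_valid(int *ptr)
{
    if (ptr == NULL)
        return 0;   /* evaluating *ptr here would be undefined behavior */
    *ptr = 42;      /* accepted at translation time regardless of ptr's value */
    return 1;
}
```

The null check is what the programmer has to supply; nothing in the language diagnoses the bad case.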

The debate is about the wording of the standard, not about the actual
intent. No implementer is going to follow the literal wording of the
standard and allow
1=3;
to compile without a diagnostic. Which is why it's really a topic for
comp.std.c, not for comp.lang.c.

I encourage anyone who wants to discuss this further to see the
"sequence points" thread over in comp.std.c. (That thread has split
into two separate discussions; we probably should have started a new
thread for the lvalue discussion.)
 
Old Wolf

Keith Thompson said:
Trying to compute the address of 3 (&3) doesn't invoke undefined
behavior; it's a constraint violation requiring a diagnostic, because
3 isn't an lvalue.

Constraint violations cause undefined behaviour (the compiler is still
permitted to generate an executable, after emitting the diagnostic).

Anyway, in C99 it is very clearly not a constraint violation, and
requires no diagnostic (whether this was intentional, is another
matter).
 
Old Wolf

[Editing by me]
(N869, p68 - subclause 6.5) An expression is a sequence of
[zero or more] operators and [one or more] operands that
specifies computation of a value, or that designates an object
or a function, or that generates side effects, or that performs
a combination thereof.

32;

Does nothing, but it's a legal expression, right? Yet is it not
an operand (since there is not operator), it is not an operator,
it denotes neither an object nor a function, and it generates no
side effects. So, technically, this isn't an expression, correct?

It squeaks into the "one operand, no operators" category. It
'computes' the value 32.
 
Keith Thompson

William L. Bahn said:
(N869, p68 - subclause 6.5) An expression is a sequence of
[zero or more] operators and [one or more] operands that
specifies computation of a value, or that designates an object
or a function, or that generates side effects, or that performs
a combination thereof.

32;

Does nothing, but it's a legal expression, right? Yet is it not
an operand (since there is not operator), it is not an operator,
it denotes neither an object nor a function, and it generates no
side effects. So, technically, this isn't an expression, correct?

It squeaks into the "one operand, no operators" category. It
'computes' the value 32.

C99 6.4.6p2 says:

An _operand_ is an entity on which an operator acts.

There is no operator in "32;", so 32 is not an operand.

(Of course 32 is an expression, as the grammar makes clear; the
definition of "expression" is flawed.)
 
Dan Pop

Old Wolf said:
Constraint violations cause undefined behaviour (the compiler is still
permitted to generate an executable, after emitting the diagnostic).

Anyway, in C99 it is very clearly not a constraint violation, and
requires no diagnostic (whether this was intentional, is another
matter).

Wrong!

Constraints

1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
that designates an object that is not a bit-field and is not
declared with the register storage-class specifier.

Even if 3 is an lvalue (by the C99 definition), it is not one that
designates an object, so &3 is a constraint violation.

Dan
 
Dan Pop

Keith Thompson said:
Dan Pop writes:
[...]
C99 clearly expresses the fact that 3 *is* an lvalue. One that invokes
undefined behaviour when used as such, but an lvalue, nevertheless.

It's worse than that; according to the literal wording, it invokes
undefined behavior when evaluated as an expression, whether it's in a
context requiring an lvalue or not.

This is an abusive interpretation. If the context doesn't require an
lvalue, no lvalue is evaluated so there is no undefined behaviour.

Compare to the definition of null pointer constant, which is also
context independent, yet not every instance of 0 is treated as a null
pointer constant.

The wording is:

An lvalue is an expression with an object type or an incomplete
type other than void; if an lvalue does not designate an object
when it is evaluated, the behavior is undefined.

This is the definition of "lvalue". The best interpretation I've been
able to come up with is that it's an incomplete definition (or, if you
prefer, that it isn't really a definition at all). As a statement
about lvalues, there's nothing wrong with it; as a complete
definition, it misses the critical idea than an lvalue refers to an
object.

Wrong. The possibility that an lvalue does not refer to an object is
*explicitly* addressed by the last part of the definition.

Actually, there was a problem with the C89/C90 definition.

The C90 standard says:

An lvalue is an expression (with an object type or an incomplete
type other than void) that designates an object.

The problem is that *ptr is an lvalue if ptr points to an object, but
is not an lvalue if ptr is a null or invalid pointer. A strict reading
of C90 implies that
*ptr = 42;
is a constraint violation (requiring a compile-time diagnostic) if ptr
is a null pointer (which occurs at run time).

Therefore, no diagnostic is required at translation time.

Dan
 
Keith Thompson

Keith Thompson said:
Dan Pop writes:
[...]
C99 clearly expresses the fact that 3 *is* an lvalue. One that invokes
undefined behaviour when used as such, but an lvalue, nevertheless.

It's worse than that; according to the literal wording, it invokes
undefined behavior when evaluated as an expression, whether it's in a
context requiring an lvalue or not.

This is an abusive interpretation. If the context doesn't require an
lvalue, no lvalue is evaluated so there is no undefined behaviour.

An lvalue can be used in a context that doesn't require one.

In the following:
int x;
x = 3;
the context of the RHS of the assignment does not require a constant
expression; nevertheless, 3 is still a constant expression.

Compare to the definition of null pointer constant, which is also
context independent, yet not every instance of 0 is treated as a null
pointer constant.

Not every instance of 0 is *treated as* a null pointer constant, but I
see nothing in the standard that says 0 is a null pointer constant
only if it's used in a context that requires one. If you assume that
the phrase "null pointer constant" must have something to do with
pointers, you might assume that the 0 in "x = 0;" obviously isn't a
null pointer constant, but we have to go by the definition of the term
in the standard. (It doesn't *matter* that the 0 in "x = 0;" is a
null pointer constant, since it's not converted to a pointer type.)

A definition of a term "foo" should tell you what is a foo and what is
not a foo. For foo == "null pointer constant", the definition tells
you that 0 is a null pointer constant; it doesn't say that 0 in a
non-pointer context is not a null pointer constant. Since this is a
bit peculiar but doesn't create any logical inconsistencies (unlike
the problem with "lvalue"), I'm willing to accept that. (It's also
peculiar that 1 is a decimal-constant and 0 isn't, but again that
doesn't cause any problems.)
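
A tiny sketch of the point about 0, with an invented function name. The same token appears in both initializers; only the second is converted to a pointer type, which is the only place its being a null pointer constant has any effect.

```c
#include <stddef.h>

/* Hypothetical demo: the constant 0 in an integer context and in a
   pointer context. */
int zero_in_two_contexts(void)
{
    int x = 0;    /* no conversion to a pointer type takes place */
    int *p = 0;   /* the same constant, converted to a null pointer */
    return x == 0 && p == NULL;
}
```
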
Wrong. The possibility that an lvalue does not refer to an object is
*explicitly* addressed by the last part of the definition.

Ok, I was a little sloppy there. The critical idea is that an lvalue
*potentially* designates an object. I think an lvalue is (roughly) an
expression that would designate an object if its subexpressions had
certain valid values, but I'm sure there are holes to be shot in that
definition. *ptr is an lvalue because it can designate an object if
ptr is a valid non-null object pointer. ptr[10] is an lvalue because
it can designate an object if ptr points to (or is) an array of at
least 11 elements. And so forth. 3 shouldn't be considered an lvalue
because it can't designate an object, even if the value of 3 changes.

Therefore, no diagnostic is required at translation time.

C90 is internally inconsistent in this area. If ptr is a null
pointer, *ptr does not designate an object. C90 says an lvalue must
designate an object, so *ptr is not an lvalue, so "*ptr = 42;"
violates the constraint that the LHS of an assignment must be an
lvalue. A translation time diagnostic is therefore required for a
condition that cannot be determined at translation time. (I suppose
an implementation could meet the requirements by always issuing a
diagnostic, but that's not the intent.) This is due to a flaw in the
C90 definition of "lvalue". The C99 definition corrects this flaw but
introduces a new one.
 
Keith Thompson

Keith Thompson said:
[...]
But a numerical constant, such as the 3 in

y = x + 3;

is not an object. It is not (at least not guaranteed to be) at a
particular region of data storage. It is very possibly a value
embedded in an instruction's op code. Trying to compute the
address of 3 will invoke undefined behavior.

Trying to compute the address of 3 (&3) doesn't invoke undefined
behavior; it's a constraint violation requiring a diagnostic, because
3 isn't an lvalue.

Constraint violations cause undefined behaviour (the compiler is still
permitted to generate an executable, after emitting the diagnostic).

Anyway, in C99 it is very clearly not a constraint violation, and
requires no diagnostic (whether this was intentional, is another
matter).

Wrong!

Constraints

1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
that designates an object that is not a bit-field and is not
declared with the register storage-class specifier.

Even if 3 is an lvalue (by the C99 definition), it is not one that
designates an object, so &3 is a constraint violation.

I think this is another (fairly minor) flaw in the C99 standard.

Assuming the C99 definition of "lvalue" is corrected so it expresses
the actual intent, consider
&(*ptr)
If ptr is a null pointer, (*ptr) does not designate an object. The
intent is that this should invoke undefined behavior, not that it's a
constraint violation.

(Certainly &3 should be a constraint violation, but because 3
shouldn't be an lvalue.)
 
Old Wolf

Constraint violations cause undefined behaviour (the compiler is still
permitted to generate an executable, after emitting the diagnostic).

Anyway, in C99 it is very clearly not a constraint violation, and
requires no diagnostic (whether this was intentional, is another
matter).

Wrong!

Constraints

1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
that designates an object that is not a bit-field and is not
declared with the register storage-class specifier.

Ack. I was confusing &3 with 3++
 
Dan Pop

Keith Thompson said:
Dan Pop said:
Keith Thompson said:
Dan Pop writes:
[...]
C99 clearly expresses the fact that 3 *is* an lvalue. One that invokes
undefined behaviour when used as such, but an lvalue, nevertheless.

It's worse than that; according to the literal wording, it invokes
undefined behavior when evaluated as an expression, whether it's in a
context requiring an lvalue or not.

This is an abusive interpretation. If the context doesn't require an
lvalue, no lvalue is evaluated so there is no undefined behaviour.

An lvalue can be used in a context that doesn't require one.

In which case, it is NOT evaluated as an lvalue.

In the following:
int x;
x = 3;
the context of the RHS of the assignment does not require a constant
expression; nevertheless, 3 is still a constant expression.

But it is not evaluated as a constant expression, which is precisely my
point. The *intent* of the C99 definition of lvalue is perfectly clear.
It divides lvalues in two categories: "good" lvalues, that designate an
object when evaluated *as lvalues* and "bad" lvalues, that don't.
Evaluating a bad lvalue *as an lvalue* invokes undefined behaviour.

Dan
 
Dan Pop

Keith Thompson said:
Dan Pop said:
Constraints

1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
that designates an object that is not a bit-field and is not
declared with the register storage-class specifier.

Even if 3 is an lvalue (by the C99 definition), it is not one that
designates an object, so &3 is a constraint violation.

I think this is another (fairly minor) flaw in the C99 standard.

Assuming the C99 definition of "lvalue" is corrected so it expresses
the actual intent, consider
&(*ptr)
If ptr is a null pointer, (*ptr) does not designate an object. The

It doesn't have to. See below.

intent is that this should invoke undefined behavior, not that it's a
constraint violation.

Nope, the unambiguously expressed intent is that this expression
evaluates to a null pointer:

6.5.3.2 Address and indirection operators

Constraints

1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
that designates an object that is not a bit-field and is not
declared with the register storage-class specifier.

2 The operand of the unary * operator shall have pointer type.

Semantics

3 The unary & operator returns the address of its operand. If the
operand has type ``type'', the result has type ``pointer to
type''. If the operand is the result of a unary * operator,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
neither that operator nor the & operator is evaluated and the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
result is as if both were omitted, except that the constraints
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
on the operators still apply and the result is not an lvalue.

Dan
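
For illustration, a minimal sketch of the 6.5.3.2p3 rule quoted above, assuming a C99 or later compiler; `address_of_deref` is a made-up name. Because the operand of & is the result of a unary *, neither operator is evaluated, so the function simply hands back its argument, null or not.

```c
#include <stddef.h>

/* Hypothetical helper: applies & to the result of unary *.
   Under the C99 wording neither operator is evaluated, so nothing is
   dereferenced even when p is a null pointer. */
int *address_of_deref(int *p)
{
    return &*p;
}
```

This relies on the C99 text; C90 has no such wording, and other languages with the same syntax may treat the expression differently.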
 
