offsetof() Macro Attempt


Tim Rentsch

Harald van Dijk said:
It would be plural both with your interpretation and with mine.

Actually singular is more usual if only the one category is
meant. Try it with 'function definition' - 'other forms of
function definition', not 'other forms of function definitions'.

It doesn't say that in the standard,

There's no disagreement about what words appear, but the question
is what elements are included under those words. For example, a
statement in the Standard might use the term 'declarations' but
would also apply to 'definitions', as (many) definitions are also
declarations. Inferring that a statement about 'declarations'
doesn't also apply to those 'definitions' just because the word
'definitions' isn't mentioned locally is obviously wrong-headed.
though I was already agreeing
that that seems to be what is intended. This would clearly not be the
intended interpretation for something like "signed integer types",
though: permission to extend the definition of "signed integer type"
should not implicitly allow implementations to extend the definition
of "standard signed integer type".[*]

This analogy seems too hypothetical and not closely analogous
enough to be relevant to the current discussion.
Without explicit permission to
redefine "integer constant expression", an implementer does not have
that luxury either.

You say this like it's an article of faith, but generally the
reverse is true: any statement about a broad category of items
also applies to more specialized subcategories unless a specific
exception is called out, like the 'declaration' and 'definition'
example above. For example, the Standard includes this text:
"Thus, * can be used only in function declarations that are not
definitions (see 6.7.5.3)." Unless you can cite some other
example passages from the Standard that support the "Without
explicit permission" theory, I don't see any reason to assume
it applies here.
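
(For concreteness, a minimal sketch of what that quoted sentence about
'*' permits -- the function name here is hypothetical:)

    void copy_row(int n, double row[*]);    /* declaration that is not a
                                               definition: [*] is allowed  */
    void copy_row(int n, double row[n])     /* the definition must supply
                                               the actual size expression  */
    {
        (void) n; (void) row;
    }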

I know, I've actually had the opportunity to modify one compiler to
accept extended forms of integer constant expressions: it was so
strict that it was impossible to define offsetof as an i.c.e., even
though it was more than capable of simplifying the common forms of
offsetof at translation time. FWIW, I opted for a "literal" keyword to
mark the following expression as an extended constant expression, so
that you could do:

typedef char __char;
typedef size_t __size_t;
#pragma keyword __offsetof_magic for keyword literal
#define offsetof(s, m) \
    (__offsetof_magic (__size_t) ((__char *) &((s *) 0)->m - (__char *) 0))

-- please excuse possible typos -- which resulted in errors at a later
stage if the expression was too complex to simplify.

One point of 6.6p10 is that such goofiness is not necessary.

(I hasten to add that the 'goofiness' I'm talking about
here is not your design choice but the circumlocutions
needed if other forms of ICE's were not available under
6.6p10. The definitions above could be a perfectly
reasonable design choice even if the circumlocutions
were not necessary.)

Also, if you look at the description of 'constant expression' (as
opposed to the subcategories) all the restrictions on it are
either syntactic or constraints. Violating either of those would
necessitate a diagnostic message, in which case 6.6p10 would
serve no purpose, since the same result could be achieved by just
defining an extension.

6.6p10 would allow

char p[2];
int i = (int) p;

as an extension, because it would violate none of the relevant
"shall"s in 6.6: (int) p would be a constant expression of type int,
without being an integer constant expression. 6.7.8p4 requires a
constant expression, so without 6.6p10, it would be a constraint
violation.

No, 6.6p10 is not necessary for this. The expression '(int) p'
(provided it can be evaluated during translation) already qualifies
as a constant expression. The discussion in 3p1 and 6.1p1 makes it
clear that the term "constant expression" is defined syntactically,
and '(int) p' already meets this definition. The "shall"s in 6.6p3,
6.6p4, and 6.6p7 (and following) are not part of the definition of
"constant expression". Failure to meet the "shall"s in 6.6p3 or
6.6p4 would require a diagnostic if a constant expression were
needed, but not meeting the "shall"s in 6.6p7 (and following) would
only be undefined behavior, not a constraint violation, and the
expression in question would still be a constant expression (provided
6.6p1 and 6.6p2 are satisfied) - no diagnostic required.
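
To sketch the distinction being drawn (hypothetical objects; this
reflects only the reading just given):

    char y[1];
    enum { a = (0, 1) };          /* comma operator: violates the constraint
                                     in 6.6p3, so a diagnostic is required    */
    static long b = (long) y;     /* fits the syntax of constant-expression
                                     but none of the forms listed in 6.6p7;
                                     on the reading above, undefined behavior
                                     rather than a constraint violation       */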

One could argue that 6.6p7 is defining a new category, "constant
expressions in initializers", but then under your theory 6.6p10
wouldn't apply to those either since it doesn't name them explicitly.

It is only on the "and does allow" that we disagree.

If you think 6.6p10 is _meant_ to allow other forms of ICE, but
_doesn't_ allow other forms of ICE, does this mean you think
how you read the Standard differs from how the committee
reads the Standard?

Yes. Do you know of any constant expression of integer type that is
accepted in file scope initialisers, but not in enumeration constants?
Because I don't.

Yes. One version of gcc I use accepts

char x[1];
static long xlong = (long) x;

but doesn't accept '(long) x' as an integer constant expression.
(The expression '(long) x' has type 'long' rather than type 'int',
which may perforce disqualify it for enumeration constants, but it
also doesn't qualify as "just" an integer constant expression, where
which particular value it has doesn't matter as long as the type is
integral.) [The pun in the last sentence was not intentional!]
Older versions of gcc accepted many extended forms of
constant expressions in both contexts, current versions accept few if
any in either context.
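
To spell out the test (a hypothetical test case; which lines are
accepted of course varies with the gcc version and options):

    char x[1];
    static long xlong = (long) x;    /* accepted as an extended constant
                                        expression in the initializer        */
    enum { e = (int)(long) x };      /* rejected: '(long) x' is not accepted
                                        as an integer constant expression,
                                        even with the type issue papered
                                        over by the extra cast               */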

I don't draw any conclusions from this other than that recently a
different decision was made about which forms to include (by
default?) in the conforming modes. Even what you're calling the
"older versions" of gcc have been around long enough so questions
about conformance should have been previously addressed; IIANM
the wording of 6.6p10 is unchanged since C90 (except of course
the section numbering was different).

[*] The "signed integer types" section is worded much better, and
clearly only allows implementation-defined "extended signed integer
types". I'm wondering how we would interpret the standard if it said

"There are five /standard signed integer types/, designated as signed
char, short int, int, long int, and long long int. (These and other
types may be designated in several additional ways, as described in
6.7.2.) There may also be other implementation-defined signed integer
types."

Seems like sort of a strawman. I expect this wording would be
judged defective because it doesn't define the term 'signed
integer types', not to mention the term 'extended signed integer
types' which is used several times in other places in the
Standard.
 

Harald van Dijk

Actually singular is more usual if only the one category is
meant.  Try it with 'function definition' - 'other forms of
function definition', not 'other forms of function definitions'.

Now you made me check: in a Google news search for "other forms of",
the vast majority of articles have it followed by a mass noun or
something else that cannot be pluralised at all in the context. Of the
rest, both singular and plural are used, and I see no difference in
meaning.

But in the standard, the phrase "other forms of" occurs a few more
times, of which the most similar is 6.4.6p2: "other forms of operator
also exist in some contexts". So you may be right that the standard
would have used the singular form in my interpretation.
There's no disagreement about what words appear, but the question
is what elements are included under those words.  For example, a
statement in the Standard might use the term 'declarations' but
would also apply to 'definitions', as (many) definitions are also
declarations.  Inferring that a statement about 'declarations'
doesn't also apply to those 'definitions' just because the word
'definitions' isn't mentioned locally is obviously wrong-headed.

It's closer to saying that "extern int f(void);" may be a definition
of f, because it is a declaration, and a declaration can be a
definition, even though the term "definition" is defined in such a way
that "extern int f(void);" does not apply.
6.6p10 would allow
char p[2];
int i = (int) p;
as an extension, because it would violate none of the relevant
"shall"s in 6.6: (int) p would be a constant expression of type int,
without being an integer constant expression. 6.7.8p4 requires a
constant expression, so without 6.6p10, it would be a constraint
violation.

No, 6.6p10 is not necessary for this.  The expression '(int) p'
(provided it can be evaluated during translation) already qualifies
as a constant expression.
[... Not] meeting the "shall"s in 6.6p7 (and following) would
only be undefined behavior, not a constraint violation, and the
expression in question would still be a constant expression (provided
6.6p1 and 6.6p2 are satisfied) - no diagnostic required.

Agreed. In that case, I don't think 6.6p10 adds anything if the
definition of "integer constant expression" cannot be extended.
If you think 6.6p10 is _meant_ to allow other forms of ICE, but
_doesn't_ allow other forms of ICE, does this mean you think
how you read the Standard differs from how the committee
reads the Standard?

Has the committee been asked about this and decided that the standard
is clear enough as it is currently written?
Yes. Do you know of any constant expression of integer type that is
accepted in file scope initialisers, but not in enumeration constants?
Because I don't.

Yes.  One version of gcc I use accepts

    char x[1];
    static long xlong = (long) x;

but doesn't accept '(long) x' as an integer constant expression.

Thanks. I wonder if you consider this an example of undefined
behaviour. GCC does not document this as an extension, so it is
unclear if it relies on 6.6p10 (no documentation required), or simply
doesn't diagnose the violation of 6.6p7.
 

Harald van Dijk

    int helper[2];
    enum foo {
        bar = helper + 1 - helper
      };
[...]
Address constants are not permitted in integer constant expressions.

Do you mean because of the restriction on "operands," above?  If so,
which operands in the code are address constants?  If your answer is
'helper', is it:
- An operand to the binary addition operator (and the binary subtraction
operator)
- An operand of the integer constant expression?
- Both?

I would think 6.6p6 is meant as:

An integer constant expression is one of:
- An integer constant[*]
- An enumeration constant
- A character constant
- A sizeof expression whose result is an integer constant[*]
- A floating constant, cast to an integer type
- An integer constant expression, cast to an integer type
- An operator applied only to integer constant expressions

[*] "An integer constant" means integer-constant in the first case,
but merely combines the words "integer" and "constant" in the second.
If your answer is "both," does that mean that "operandness" applies to
any and all subexpressions?

Almost, but not completely. In enum { foo = (1 + 2) + 3 }; the left
operand of the right + operator is (1 + 2), but that does not matter.
In enum { foo = sizeof(char) }; the operand of the sizeof operator is
char, but that does not matter.
If so, does that mean that:

   int x;

   enum foo {
       bar = 1 ? 3 : x - x
     };

is not permitted (you said, "...are not permitted...", above) because 'x
- x' is not one of the items of 6.6p6?  Or is 'x - x' one of the items
in 6.6p6?  Or is that construction permitted due to some other reason?

This is not a standard integer constant expression, because x is not
any of the items of 6.6p6, so neither is x - x.
Is the compiler free to reject it?  My impression is that you are
suggesting that it is.

Yes, the compiler is free to reject it.
 

Tim Rentsch

Harald van Dijk said:
Harald van Dijk said:
6.6p10 allows 'other forms of constant expressions' plural, [snip]
which
also includes the different subcategories of constant expression.
It doesn't say that in the standard,

There's no disagreement about what words appear, but the question
is what elements are included under those words. For example, a
statement in the Standard might use the term 'declarations' but
would also apply to 'definitions', as (many) definitions are also
declarations. Inferring that a statement about 'declarations'
doesn't also apply to those 'definitions' just because the word
'definitions' isn't mentioned locally is obviously wrong-headed.

It's closer to saying that "extern int f(void);" may be a definition
of f, because it is a declaration, and a declaration can be a
definition, even though the term "definition" is defined in such a way
that "extern int f(void);" does not apply.

I don't get this. The sentence in 6.6p10 and the remark that
something "may be a definition of f" don't seem analogous at
all. Can you explain in more detail?
[snip]
If you think 6.6p10 is _meant_ to allow other forms of ICE, but
_doesn't_ allow other forms of ICE, does this mean you think
how you read the Standard differs from how the committee
reads the Standard?

Has the committee been asked about this and decided that the standard
is clear enough as it is currently written?

I don't know about that, but it doesn't affect my question.
If you think the committee intended 6.6p10 to allow other
forms of ICE, do you also think how you read the Standard
differs from how the committee reads the Standard (meaning in
the present, however they might read it in the future)?

I don't think any implementation behaves this way, so either I'm
missing something, or the standard is.
Have you tried gcc -ansi -pedantic ?
Yes. Do you know of any constant expression of integer type that is
accepted in file scope initialisers, but not in enumeration constants?
Because I don't.

Yes. One version of gcc I use accepts

char x[1];
static long xlong = (long) x;

but doesn't accept '(long) x' as an integer constant expression.

Thanks. I wonder if you consider this an example of undefined
behaviour. GCC does not document this as an extension, so it is
unclear if it relies on 6.6p10 (no documentation required), or simply
doesn't diagnose the violation of 6.6p7.

Putting aside any questions about QOI, I don't see that the
answer makes any difference. Something being 'undefined
behavior' is not a statement about how programs will behave
but a permission granted to implementations to do whatever
they see fit. Here we have something that (we believe) is
permitted under 6.6p10, and also is permitted under undefined
behavior. In either case it's permitted, and the best that
can be done is hope the implementation does something sensible
with such undocumented behaviors.

(Well, that, and ask the implementation team to add a compiler
option to diagnose such cases, and/or document them. Certainly
my preference is that cases like this one be documented, and also
that there be compiler options to warn about them or disable
them. But all that is outside the scope of your question.)
 

Harald van Dijk

I don't get this.  The sentence in 6.6p10 and the remark that
something "may be a definition of f" don't seem analogous at
all.  Can you explain in more detail?

1) An integer constant expression is a type of constant expression.
(arr - arr) is allowed to be a constant expression. Therefore, (arr -
arr) is allowed to be an integer constant expression, even though the
standard defines the term "integer constant expression" in such a way
that (arr - arr) cannot ever be one.

2) A definition is a type of a declaration. extern int f(void); is
allowed to be (in fact, just "is") a declaration. Therefore, extern
int f(void); is allowed to be a definition, even though the standard
defines the term "definition" in such a way that extern int f(void);
cannot ever be one.

2) is a bogus claim: the fact that it is a declaration does not imply
it can ever be a definition.

So... I feel that 1) is a bogus claim: the fact that it can be a
constant expression does not imply it can ever be an integer constant
expression. You probably don't think 1 accurately represents your
interpretation, but I hope you can see how and why I am reading it the
way I am.
I don't know about that, but it doesn't affect my question.
If you think the committee intended 6.6p10 to allow other
forms of ICE, do you also think how you read the Standard
differs from how the committee reads the Standard (meaning in
the present, however they might read it in the future)?

I do not yet have a reason to believe my literal interpretation of the
standard disagrees with any committee member's. I think the wording of
6.6p10 was a simple mistake, and mistakes happen, even in standards.
Besides, it isn't all that important to fix the wording if there is
agreement on what the intended meaning is.
Putting aside any questions about QOI, I don't see that the
answer makes any difference.

If gcc relies on undefined behaviour, then it is permitted to make (a
- a) or ((long) a) behave unintuitively in constant expressions. If
gcc relies on 6.6p10, different results in constant and in non-constant
expressions for (a - a) or ((long) a) may affect conformance.
It does not matter in your example, since ((long) a) just works, but
since gcc relies on help from the linker, it would not surprise me if
similar expressions, except more complicated and probably very stupid,
go undiagnosed by gcc but are not handled properly by the linker.
(Well, that, and ask the implementation team to add a compiler
option to diagnose such cases, and/or document them.  Certainly
my preference is that cases like this one be documented, and also
that there be compiler options to warn about them or disable
them.  But all that is outside the scope of your question.)

It seems that gcc is hardly alone in accepting it. I may ask for this
to get an optional warning in multiple compilers.
 

Harald van Dijk

So what about (thanks, pete!):

   char ** helper;

   #define DUMMY_OF(type) ((type *)*helper)
   #define ADDRESS_AT(address) ((char *)(address))
   #define DUMMY_MEMBER_ADDRESS(type, member) (&DUMMY_OF(type)->member)
   #define PROTECT(expression) \
     (sizeof *(1 ? 0 : (char(*)[1][(expression)])*helper))
   #define OFFSETOF(type, member) (PROTECT(             \
       ADDRESS_AT(DUMMY_MEMBER_ADDRESS(type, member)) - \
       ADDRESS_AT(DUMMY_OF(type))                      \
     ))

   struct s {
       int x;
       short y;
       int z;
     };

   enum foo {
       bar = OFFSETOF(struct s, z)
     };

   int main(void) {
       return bar;
     }

?  Suppose that the cast-expression '(char(*)[1][(expression)])*helper'
in the 'PROTECT' macro is not an integer constant expression.  Is the
result of the 'sizeof' operator to its left an integer constant
expression regardless of what 'expression' is?  I should think not.

Correct. When (expression) is not an integer constant expression,
(char(*)[1][(expression)]) is a variably modified type, even if the
compiler can prove that (expression) will never change. The result of
applying sizeof to a variable-length array is not a constant.
But given that in the 'OFFSETOF' macro, the result of the binary '-'
subtraction operator can be known to be a 'constant-expression' with
type 'ptrdiff_t' (regardless of the value of '*helper'), would that not
mean that the result of the 'sizeof' operator in the 'PROTECT' macro
ought to be an integer constant worthy of classification as an integer
constant expression by your bullet list above?

No, that doesn't work.
Since 'type-name' ('char', in your example) can include expressions, do
you believe that those expressions are restricted by 6.6p6?

Not directly.

int foo(int);
enum {
ok = sizeof(char(*)[foo(30)]),
};

is just fine, because when sizeof is not applied to a variable length
array, its result is an integer constant.

int arr[1];
enum {
not_ok = sizeof(char[arr+1 - arr])
};

is not fine, because even though arr+1 - arr can be known at compile
time, it is not an integer constant expression, which means char[arr+1
- arr] is a variable length array, and sizeof's result is not an
integer constant when applied to a variable length array.
Ok.  I could agree to that since I'm concerned with portability.  So
what about in:

   int x;

   enum foo {
       bar = sizeof (char[1 ? 3 : x - x])
     };

?  If '1 ? 3 : x - x' is not an integer constant expression, that would
make 'char[...]' a variable length array type[6.7.5.2p4].  That would
suggest to me that "the operand is evaluated" by 'sizeof'[6.5.3.4p2].
Does that mean that the result of the 'sizeof' cannot be an integer
constant?
Exactly.

 Or might it be an integer constant because "the result is an
integer" and the conditional-expression is known to always be '3',
regardless of 'x'?

6.5.3.4p2 states that the result of the sizeof operator is an integer
constant if the type of its operand is not a variable length array. No
part of the standard says that its result is an integer constant in
any other case, so you cannot portably rely on it.
What about:

   int x;

   enum foo {
       bar = sizeof (char[1][1 ? 3 : x - x])
     };

?  Where the element type for 'char[1][1 ? 3 : x - x]' is 'char[1 ? 3 :
x - x]' (a VLA, if 'x - x' is not an integer constant expression).  And
though that element type might be a VLA, it might have a "known constant
size"[6.7.5.2p4].  Could the outer-most array type thus _not_ be a VLA?

That's clever. Unfortunately, "known constant size" is defined in
6.2.5p23 as: "A type has /known constant size/ if the type is not
incomplete and is not a variable length array type."
All right, thanks.  I wouldn't want that if I want portability.

The only portable way to define a custom offsetof is in terms of
<stddef.h>'s offsetof, sorry.
 

Shao Miller

int helper[2];
enum foo {
bar = helper + 1 - helper
};
[...]
Address constants are not permitted in integer constant expressions.

Do you mean because of the restriction on "operands," above? If so,
which operands in the code are address constants? If your answer is
'helper', is it:
- An operand to the binary addition operator (and the binary subtraction
operator)
- An operand of the integer constant expression?
- Both?

I would think 6.6p6 is meant as:

An integer constant expression is one of:
- An integer constant[*]
- An enumeration constant
- A character constant
- A sizeof expression whose result is an integer constant[*]
- A floating constant, cast to an integer type
- An integer constant expression, cast to an integer type
- An operator applied only to integer constant expressions

[*] "An integer constant" means integer-constant in the first case,
but merely combines the words "integer" and "constant" in the second.

So what about (thanks, pete!):

char ** helper;

#define DUMMY_OF(type) ((type *)*helper)
#define ADDRESS_AT(address) ((char *)(address))
#define DUMMY_MEMBER_ADDRESS(type, member) (&DUMMY_OF(type)->member)
#define PROTECT(expression) \
(sizeof *(1 ? 0 : (char(*)[1][(expression)])*helper))
#define OFFSETOF(type, member) (PROTECT( \
ADDRESS_AT(DUMMY_MEMBER_ADDRESS(type, member)) - \
ADDRESS_AT(DUMMY_OF(type)) \
))

struct s {
int x;
short y;
int z;
};

enum foo {
bar = OFFSETOF(struct s, z)
};

int main(void) {
return bar;
}

? Suppose that the cast-expression '(char(*)[1][(expression)])*helper'
in the 'PROTECT' macro is not an integer constant expression. Is the
result of the 'sizeof' operator to its left an integer constant
expression regardless of what 'expression' is? I should think not.

But given that in the 'OFFSETOF' macro, the result of the binary '-'
subtraction operator can be known to be a 'constant-expression' with
type 'ptrdiff_t' (regardless of the value of '*helper'), would that not
mean that the result of the 'sizeof' operator in the 'PROTECT' macro
ought to be an integer constant worthy of classification as an integer
constant expression by your bullet list above?
Almost, but not completely. In enum { foo = (1 + 2) + 3 }; the left
operand of the right + operator is (1 + 2), but that does not matter.
In enum { foo = sizeof(char) }; the operand of the sizeof operator is
char, but that does not matter.

Since 'type-name' ('char', in your example) can include expressions, do
you believe that those expressions are restricted by 6.6p6?
This is not a standard integer constant expression, because x is not
any of the items of 6.6p6, so neither is x - x.

Ok. I could agree to that since I'm concerned with portability. So
what about in:

int x;

enum foo {
bar = sizeof (char[1 ? 3 : x - x])
};

? If '1 ? 3 : x - x' is not an integer constant expression, that would
make 'char[...]' a variable length array type[6.7.5.2p4]. That would
suggest to me that "the operand is evaluated" by 'sizeof'[6.5.3.4p2].
Does that mean that the result of the 'sizeof' cannot be an integer
constant? Or might it be an integer constant because "the result is an
integer" and the conditional-expression is known to always be '3',
regardless of 'x'?

What about:

int x;

enum foo {
bar = sizeof (char[1][1 ? 3 : x - x])
};

? Where the element type for 'char[1][1 ? 3 : x - x]' is 'char[1 ? 3 :
x - x]' (a VLA, if 'x - x' is not an integer constant expression). And
though that element type might be a VLA, it might have a "known constant
size"[6.7.5.2p4]. Could the outer-most array type thus _not_ be a VLA?

After all, if 'x' isn't volatile, I believe that its stored value (even
if a trap representation due to lack of initialization above, assuming
function scope and not file scope) must be consistent with the
last-stored value. And beyond that, we "know" that 'x - x' is not
evaluated, so shouldn't the 'char[1 ? 3 : x - x]' [element] type have a
"known constant size" regardless of possible VLA status?
Yes, the compiler is free to reject it.

All right, thanks. I wouldn't want that if I want portability.
 

Shao Miller

So what about (thanks, pete!):

char ** helper;

#define DUMMY_OF(type) ((type *)*helper)
#define ADDRESS_AT(address) ((char *)(address))
#define DUMMY_MEMBER_ADDRESS(type, member) (&DUMMY_OF(type)->member)
#define PROTECT(expression) \
(sizeof *(1 ? 0 : (char(*)[1][(expression)])*helper))
#define OFFSETOF(type, member) (PROTECT( \
ADDRESS_AT(DUMMY_MEMBER_ADDRESS(type, member)) - \
ADDRESS_AT(DUMMY_OF(type)) \
))

struct s {
int x;
short y;
int z;
};

enum foo {
bar = OFFSETOF(struct s, z)
};

int main(void) {
return bar;
}

? Suppose that the cast-expression '(char(*)[1][(expression)])*helper'
in the 'PROTECT' macro is not an integer constant expression. Is the
result of the 'sizeof' operator to its left an integer constant
expression regardless of what 'expression' is? I should think not.

Correct. When (expression) is not an integer constant expression,
(char(*)[1][(expression)]) is a variably modified type, even if the
compiler can prove that (expression) will never change. The result of
applying sizeof to a variable-length array is not a constant.

Well, perhaps not defined to be a constant.
But given that in the 'OFFSETOF' macro, the result of the binary '-'
subtraction operator can be known to be a 'constant-expression' with
type 'ptrdiff_t' (regardless of the value of '*helper'), would that not
mean that the result of the 'sizeof' operator in the 'PROTECT' macro
ought to be an integer constant worthy of classification as an integer
constant expression by your bullet list above?

No, that doesn't work.

Ok.
Since 'type-name' ('char', in your example) can include expressions, do
you believe that those expressions are restricted by 6.6p6?

Not directly.

int foo(int);
enum {
ok = sizeof(char(*)[foo(30)]),
};

is just fine, because when sizeof is not applied to a variable length
array, its result is an integer constant.

int arr[1];
enum {
not_ok = sizeof(char[arr+1 - arr])
};

is not fine, because even though arr+1 - arr can be known at compile
time, it is not an integer constant expression, which means char[arr+1
- arr] is a variable length array, and sizeof's result is not an
integer constant when applied to a variable length array.

(Or defined to be.)
Ok. I could agree to that since I'm concerned with portability. So
what about in:

int x;

enum foo {
bar = sizeof (char[1 ? 3 : x - x])
};

? If '1 ? 3 : x - x' is not an integer constant expression, that would
make 'char[...]' a variable length array type[6.7.5.2p4]. That would
suggest to me that "the operand is evaluated" by 'sizeof'[6.5.3.4p2].
Does that mean that the result of the 'sizeof' cannot be an integer
constant?

Exactly.

All right.
6.5.3.4p2 states that the result of the sizeof operator is an integer
constant if the type of its operand is not a variable length array. No
part of the standard says that its result is an integer constant in
any other case, so you cannot portably rely on it.

Aha. That really seems like a key point.
What about:

int x;

enum foo {
bar = sizeof (char[1][1 ? 3 : x - x])
};

? Where the element type for 'char[1][1 ? 3 : x - x]' is 'char[1 ? 3 :
x - x]' (a VLA, if 'x - x' is not an integer constant expression). And
though that element type might be a VLA, it might have a "known constant
size"[6.7.5.2p4]. Could the outer-most array type thus _not_ be a VLA?

That's clever. Unfortunately, "known constant size" is defined in
6.2.5p23 as: "A type has /known constant size/ if the type is not
incomplete and is not a variable length array type."

I'd quite forgotten about the definition for "known constant size" and
sincerely appreciate you pointing that out.
The only portable way to define a custom offsetof is in terms of
<stddef.h>'s offsetof, sorry.

Well, it was worth a shot and definitely yielded some interesting
discussion. :) Fortunately, 'offsetof' is in C89, so I guess that's
that. Oh well.
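
(To sum up the practical takeaway -- the macro names below are mine,
and the first form is exactly the kind of trick that is not an integer
constant expression per 6.6p6, even where implementations happen to
accept it, perhaps via 6.6p10:)

    #include <stddef.h>

    /* Non-portable: not an integer constant expression, though widely
       accepted in practice. */
    #define MY_OFFSETOF_TRICK(type, member) \
        ((size_t) &((type *) 0)->member)

    /* Portable: defer to the implementation's own offsetof. */
    #define MY_OFFSETOF(type, member) offsetof(type, member)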
 
T

Tim Rentsch

Harald van Dijk said:
I don't get this. The sentence in 6.6p10 and the remark that
something "may be a definition of f" don't seem analogous at
all. Can you explain in more detail?

1) An integer constant expression is a type of constant expression.
(arr - arr) is allowed to be a constant expression. Therefore, (arr -
arr) is allowed to be an integer constant expression, even though the
standard defines the term "integer constant expression" in such a way
that (arr - arr) cannot ever be one.

[snip parallel example with declaration and definition]

So... I feel that 1) is a bogus claim: the fact that it can be a
constant expression does not imply it can ever be an integer constant
expression. You probably don't think 1 accurately represents your
interpretation, but I hope you can see how and why I am reading it the
way I am.

Yes, the inference rule used (or implied) in 1 is not what I
meant. Let me see if I can clarify. The earlier context:

To restate: any statement _the Standard makes_ about a broad
category of items also applies to more specialized subcategories
unless a specific exception is called out _in the Standard_ (or
DR presumably). If the Standard said (the Standard doesn't say
this, I'm just giving an example as if it did) "Declarations made
at file scope may be made at block scope," that would be taken to
apply to definitions too (not counting macros), since definitions
are a subset of declarations.

Note also the distinction between talking about an element of a
category versus talking about the category itself, which (often)
parallels the distinction between singular and plural. A
statement about /a/ declaration doesn't say anything about
definition-ness, but a statement about declaration/s/ as a
category also says something about definition/s/ as a category.
Does that make more sense?

If gcc relies on undefined behaviour, then it is permitted to make (a
- a) or ((long) a) behave unintuitively in constant expressions. If
gcc relies on 6.6p10, different results in constant and in non-constant
expressions for (a - a) or ((long) a) may affect conformance.

As far as the Standard is concerned there is no difference
between these two; gcc doesn't have to declare which one
it's doing, and so is always allowed to do whatever it wants.
Whether it does what we want/expect is a QOI issue, not a
conformance issue.
It does not matter in your example, since ((long) a) just works, but
since gcc relies on help from the linker, it would not surprise me if
similar expressions, except more complicated and probably very stupid,
go undiagnosed by gcc but are not handled properly by the linker.

I for one would be shocked if gcc didn't diagnose an expression
that it couldn't rely on the linker to evaluate accurately.
I tried '17 * (long) a', and indeed got the expected diagnostic.
Again though, just QOI.
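
(The test, roughly, in case anyone wants to reproduce it -- 'a' is a
hypothetical object and the exact diagnostics vary by version:)

    char a[1];
    static long ok  = (long) a;        /* something the linker can relocate */
    static long bad = 17 * (long) a;   /* drew the expected diagnostic      */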

It seems that gcc is hardly alone in accepting it. I may ask for this
to get an optional warning in multiple compilers.

I would gladly second any such request.
 
H

Harald van Dijk

1) An integer constant expression is a type of constant expression.
(arr - arr) is allowed to be a constant expression. Therefore, (arr -
arr) is allowed to be an integer constant expression, even though the
standard defines the term "integer constant expression" in such a way
that (arr - arr) cannot ever be one.
[snip parallel example with declaration and definition]
So... I feel that 1) is a bogus claim: the fact that it can be a
constant expression does not imply it can ever be an integer constant
expression. You probably don't think 1 accurately represents your
interpretation, but I hope you can see how and why I am reading it the
way I am.
[snip]
To restate: any statement _the Standard makes_ about a broad
category of items also applies to more specialized subcategories
unless a specific exception is called out _in the Standard_ (or
DR presumably). If the Standard said (the Standard doesn't say
this, I'm just giving an example as if it did) "Declarations made
at file scope may be made at block scope," that would be taken to
apply to definitions too (not counting macros), since definitions
are a subset of declarations.

For that sentence, I would share your interpretation.
Note also the distinction between talking about an element of a
category versus talking about the category itself, which (often)
parallels the distinction between singular and plural. A
statement about /a/ declaration doesn't say anything about
definition-ness, but a statement about declaration/s/ as a
category also says something about definition/s/ as a category.
Does that make more sense?

Somewhat. But does that mean that 6.6p6, p8 and p9 do not apply at all
to the other forms of constant expressions permitted by p10?

Perhaps also interesting, which I came across while looking for
examples of the singular/plural distinction:

6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."

Would you agree that "is a constant expression" refers to all forms of
constant expressions? If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression. Even more strangely, while
if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.
As far as the Standard is concerned there is no difference
between these two; gcc doesn't have to declare which one
it's doing, and so is always allowed to do whatever it wants.

Only as long as it doesn't declare which one it's doing. If the
documentation is to state it relies on 6.6p10 in a future version
(which I'm hoping it will, if asked), the standard requires the
obvious behaviour, assuming 6.6p11 refers to all forms of constant
expressions (despite using the singular form).
I for one would be shocked if gcc didn't diagnose an expression
that it couldn't rely on the linker to evaluate accurately.
I tried '17 * (long) a', and indeed got the expected diagnostic.
Again though, just QOI.

I was thinking of cases like (unsigned long) a - LONG_MAX, which would
not surprise me if they trigger linker errors with some linkers or
loaders for unexpected integer "overflow". Trying, I could not get
things to misbehave in this area, though I did manage to find (and
report) a different bug.
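
(For reference, the kind of initializer I had in mind -- a
hypothetical test case:)

    #include <limits.h>

    char a[1];
    static unsigned long v = (unsigned long) a - LONG_MAX;
    /* The subtraction has to be carried out as relocation arithmetic by
       the linker or loader, which is where I would expect surprises. */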
 

Harald van Dijk

6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."

Would you agree that "is a constant expression" refers to all forms of
constant expressions? If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression. Even more strangely, while
  if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
  if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.

That last part is wrong. (int) (0 / 0.) is not a constant expression,
for the same reason that 0 / 0 isn't. 0 / 0. is one, but it cannot be
converted to int in a constant expression.
 

Tim Rentsch

Harald van Dijk said:
Harald van Dijk said:
Harald van Dijk <[email protected]> writes:
[snip]
Note also the distinction between talking about an element of a
category versus talking about the category itself, which (often)
parallels the distinction between singular and plural. A
statement about /a/ declaration doesn't say anything about
definition-ness, but a statement about declaration/s/ as a
category also says something about definition/s/ as a category.
Does that make more sense?

Somewhat. But does that mean that 6.6p6, p8 and p9 do not apply at all
to the other forms of constant expressions permitted by p10?

Kind of a multi-layer question. These paragraphs give definitions,
and definitions always relate to the category rather than to
individual element(s) in the category. However, the 'shall's in
these paragraphs do not limit what other forms may be accepted under
p10 (justification to be given below).

Perhaps also interesting, which I came across while looking for
examples of the singular/plural distinction:

6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."

Would you agree that "is a constant expression" refers to all forms of
constant expressions?

In this context, I believe 'constant expression' means something
that satisfies the syntax of constant-expression, and that can
be evaluated at translation (or possibly link) time, and that
satisfies 6.6p3 *, and that satisfies 6.6p4 **. (The * and **
indicate there may be exceptions to these conditions, regarding
which I will discuss more presently.)

If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression.

I agree that it's a constraint violation. I would say it's
a VLA if and only if the implementation does not accept
'(int) (1. - 2.)' as an integer constant expression, since
if it is an ICE then 6.7.5.2p4 says the array is not a VLA.
I think it's possible in principle that this dependency can
produce some semantic nastiness (eg, behavior expected to be
defined may be undefined), but in practice I expect such
circumstances never occur in any real-world program.

Even more strangely, while
if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.

I see that you have amended this assessment in a followup;
I'll respond to that.

Only as long as it doesn't declare which one it's doing. If the
documentation is to state it relies on 6.6p10 in a future version
(which I'm hoping it will, if asked), the standard requires the
obvious behaviour,

[the "this" under discussion was snipped]
char x[1];
static long xlong = (long) x;

I think I should revise my answer here. Despite there being no
italics in 6.6p7, I believe this paragraph is meant to be
definitional, giving requirements that need to be met for an
expression (that is a constant-expression syntactically) to be
considered "a constant expression in an initializer". Under that
reading, if the 'shall' in 6.6p7 is not met, the expression in
question fails to be a constant expression (in that context),
which would be a constraint violation for cases like the example
above. So I think the question about "undefined behavior" is
really just a red herring here. (Yes I do think this point is in
need of clarification, and would support a suggestion that the
Standard clarify it. Until there is more information, however,
I think this interpretation makes the most sense.)

assuming 6.6p11 refers to all forms of constant expressions
(despite using the singular form).

It seems clear that the statement in 6.6p11 is meant to apply to
all types of constant expression discussed in 6.6. Certainly
all the different kinds of constant expressions must match the
syntactical form of constant-expression.

Having said that, I should add that the distinction between
singular and plural is not absolute. What I said earlier was
something like "singular is more usual" for just one category,
and that's how it was meant, as more likely but not necessarily
a hard-and-fast rule.

I was thinking of cases like (unsigned long) a - LONG_MAX, which would
not surprise me if they trigger linker errors with some linkers or
loaders for unexpected integer "overflow". Trying, I could not get
things to misbehave in this area, though I did manage to find (and
report) a different bug.

In light of the revised comment above, I would now say that any
such mistake in evaluation is pure and simple a bug in the
implementation; if an implementation accepts an expression as
a constant expression (of any kind), its evaluation must be
defined (perhaps implementation-dependently) and yield the same
result that would be produced by a run-time evaluation (of the
same expression).
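
To put that requirement as a sketch ('a' is hypothetical, and this
assumes the implementation accepts the static initializer at all):

    char a[1];
    static long s = (long) a;      /* evaluated at translation/link time */

    int main(void) {
        long r = (long) a;         /* evaluated at run time */
        return s == r ? 0 : 1;     /* on the view above, this must return 0 */
    }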


To return to the original question --

Dredging around in the DR's, I found this:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_044.html

The answer to question one, in combination with the remarks in
the Rationale document, makes it pretty clear that 6.6p10 is
meant to cover all the various forms of constant expressions.

Examples in the Rationale provide the justification alluded to
above: that the 'shall's in 6.6p6, etc, need not apply to forms
of constant expressions accepted under 6.6p10 -- since they do
not apply to the example definitions of 'offsetof()' in the
Rationale, which must produce an integer constant expression.
The offsetof() examples, taken together with the comments in the
DR, illustrate the inference fairly pointedly.
 

Tim Rentsch

Harald said:
6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."

Would you agree that "is a constant expression" refers to all forms of
constant expressions? If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression. Even more strangely, while
if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.

(Side remark: that makes sense, in a peculiar sort of way.)

That last part is wrong. (int) (0 / 0.) is not a constant expression,
for the same reason that 0 / 0 isn't. 0 / 0. is one, but it cannot be
converted to int in a constant expression.

Actually I think either or both of these should, or could, be
allowed as constant expressions, depending on the implementation.
First off I think they _could_ be allowed as constant expressions
(and even as integer constant expressions) under 6.6p10. Second,
I suggest they _should_ be allowed as constant expressions on
those implementations that define the semantics of the operations
in question to yield valid values for the types in question. (In
the second case only '0/0' would be an ICE. Also, in both cases,
if they are constant expressions, and used as shown here, to
declare array sizes, then the value must satisfy the constraint
of 6.7.5.2p1.) Of course the key point is 6.6p4: if there is
undefined behavior, but the behavior in question is defined by
the implementation to yield a legal value, is the constraint of
6.6p4 met? I think it makes the most sense if 6.6p4 _is_ met
under such circumstances, but the matter is (at best) somewhat
murky.

To summarize briefly:

1. The Rationale document shows examples where an expression
may be accepted as a constant expression even though there
is (in the strictly conforming sense) undefined behavior.

2. In the absence of any mitigating implementation definitions,
any undefined behavior results either in the expression not
being a constant expression, or in a constraint violation due
to 6.6p4 not being met. Which one of these applies depends on
the context of the expression in question.

3. Related statements in various ancillary documents are either not
completely clear, not completely thought through, or not mutually
consistent. I expect most people would agree some clarification
would be helpful; anything beyond that I'm not up to debating at
the moment.

Here are links for several ancillary documents I looked at and that
appear to be relevant:

http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_031.html
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_032.html
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_044.html
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_064.html
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_145.html
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_261.htm

For the last one, it may be helpful to get the html (using wget or
similar) and edit it a bit, so all the example code lines may be
seen easily.
 

Harald van Dijk

Kind of a multi-layer question.  These paragraphs give definitions,
and definitions always relate to the category rather than to
individual element(s) in the category.

Did you mean that the other way around? The definition of "integer
constant expression" should not be applied to arbitrary constant
expressions. Perhaps I did not understand your message properly, but
at any rate...
 However, the 'shall's in
these paragraphs do not limit what other forms may be accepted under
p10 (justification to be given below).

...this is an answer to the question I meant to ask.

Because of that, I think 6.6p6-9 should be reworded: since 6.6p6's
"and shall only have operands that are integer constants [...]" does
not apply to all integer constant expressions, it should not be part
of the definition of integer constant expression. Similarly for p7-9.
6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."
Would you agree that "is a constant expression" refers to all forms of
constant expressions?

In this context, I believe 'constant expression' means something
that satisfies the synatx of constant-expression, and that can
be evaluated at translation (or possibly link) time, and that
satisfies 6.6p3 *, and that satisfies 6.6p4 **.  (The * and **
indicate there may be exceptions to these conditions, regarding
which I will discuss more presently.)

I do not believe there are exceptions to 6.6p3 and 6.6p4, even though
it might make sense. See DR #32/#31 and below.
If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression.

I agree that it's a constraint violation.  I would say it's
a VLA if and only if the implementation does not accept
'(int) (1. - 2.)' as an integer constant expression, since
if it is an ICE then 6.7.5.2p4 says the array is not a VLA.
I think it's possible in principle that this dependency can
produce some semantic nastiness (eg, behavior expected to be
defined may be undefined), but in practice I expect such
circumstances never occur in any real-world program.

Agreed. In real-world programs, a non-constant array length which
unconditionally evaluates to a negative value is not useful.
Even more strangely, while
  if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
  if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.

I see that you have amended this assessment in a followup;
I'll respond to that.

I'll include my reply to that here:

Without an official response, it could be unclear whether 6.6p4
applies to 0 / 0 if an implementation defines its result, but DR #31
flat out states that "case INT_MAX + 2:" is a constraint violation and
requires a diagnostic. 0 / 0 and (int) (0 / 0.) are in the same
category.
Only as long as it doesn't declare which one it's doing. If the
documentation is to state it relies on 6.6p10 in a future version
(which I'm hoping it will, if asked), the standard requires the
obvious behaviour,

[the "this" under discussion was snipped]
    char x[1];
    static long xlong = (long) x;

I think I should revise my answer here.  Despite there being no
italics in 6.6p7, I believe this paragraph is meant to be
definitional, giving requirements that need to be met for an
expression (that is a constant-expression syntactically) to be
considered "a constant expression in an initializer".  Under that
reading, if the 'shall' in 6.6p7 is not met, the expression in
question fails to be a constant expression (in that context),
which would be a constraint violation for cases like the example
above.

[unless an implementation extends 6.6p7 to include address constants
cast to integral types.]
So I think the question about "undefined behavior" is
really just a red herring here.  (Yes I do think this point is in
need of clarification, and would support a suggestion that the
Standard clarify it.  Until there is more information, however,
I think this interpretation makes the most sense.)

I will agree that your interpretation could be what was intended by
6.6p7.

Does it make sense without interpreting it as a definition? Are there
useful extended "constant expressions in initializers" that could not
be an extended "arithmetic constant expression" or an extended
"address constant expression"? Yes, I think there could be.

static struct S s = (const struct S) {0};

In fact, this is accepted by ICC in conforming mode.
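
(Spelled out as a self-contained test case -- struct S here is just a
placeholder type:)

    struct S { int i; };

    /* A compound literal used as a static initializer: none of the forms
       listed in 6.6p7, so presumably accepted as an extended "constant
       expression in an initializer" under 6.6p10. */
    static struct S s = (const struct S) {0};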
It seems clear that the statement in 6.6p11 is meant to apply to
all types of constant expression discussed in 6.6.  Certainly
all the different kinds of constant expressions must match the
syntactical form of constant-expression.

Having said that, I should add that the distinction between
singular and plural is not absolute.  What I said earlier was
something like "singular is more usual" for just one category,
and that's how it was meant, as more likely but not necessarily
a hard-and-fast rule.

Yes, I understand that, I was merely being explicit.
In light of the revised comment above, I would now say that any
such mistake in evaluation is pure and simple a bug in the
implementation;  if an implementation accepts an expression as
a constant expression (of any kind), its evaluation must be
defined (perhaps implementation-dependently) and yield the same
result that would be produced by a run-time evaluation (of the
same expression).
Agreed.

To return to the original question --

Dredging around in the DR's, I found this:

   http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_044.html

The answer to question one, in combination with the remarks in
the Rationale document, makes it pretty clear that 6.6p10 is
meant to cover all the various forms of constant expressions.

I had since found DR 261 already, which states as part of its
response:

"Otherwise, if the expression meets the requirements of 6.6
(including any form accepted in accordance with 6.6#10), it is a
constant expression."

which is equally clear.
 

Tim Rentsch

Harald van Dijk said:
Did you mean that the other way around? The definition of "integer
constant expression" should not be applied to arbitrary constant
expressions. Perhaps I did not understand your message properly, but
at any rate...

Sorry, what I said was confusing. I agree with your statement
about ICE's not applying to arbitrary CE's. Probably it's better
if I just drop the rest of that and move on...

However, the 'shall's in
these paragraphs do not limit what other forms may be accepted under
p10 (justification to be given below).

...this is an answer to the question I meant to ask.

Because of that, I think 6.6p6-9 should be reworded: since 6.6p6's
"and shall only have operands that are integer constants [...]" does
not apply to all integer constant expressions, it should not be part
of the definition of integer constant expression. Similarly for p7-9.

Alternatively, 6.6p10 might say explicitly what it could
override. The main thing is we agree on both the intended
meaning and the need for clarification.

6.7.5.2p1:
"In addition to optional type qualifiers and the keyword static, the
[ and ] may delimit an expression or *. If they delimit an expression
(which specifies the size of an array), the expression shall have an
integer type. If the expression is a constant expression, it shall
have a value greater than zero."
Would you agree that "is a constant expression" refers to all forms of
constant expressions?

In this context, I believe 'constant expression' means something
that satisfies the syntax of constant-expression, and that can
be evaluated at translation (or possibly link) time, and that
satisfies 6.6p3 *, and that satisfies 6.6p4 **. (The * and **
indicate there may be exceptions to these conditions, regarding
which I will discuss more presently.)

I do not believe there are exceptions to 6.6p3 and 6.6p4, even though
it might make sense. See DR #32/#31 and below.

6.6p3 is pretty clear cut. 6.6p4 is more problematic,
despite the DR's (which I already looked at carefully).
Continuing below...

If so, int a[(int) (1. - 2.)]; is a variable
length array but still a constraint violation, even on those
implementations that do not accept (int) (1. - 2.) as an integer
constant expression.

I agree that it's a constraint violation. I would say it's
a VLA if and only if the implementation does not accept
'(int) (1. - 2.)' as an integer constant expression, since
if it is an ICE then 6.7.5.2p4 says the array is not a VLA.
I think it's possible in principle that this dependency can
produce some semantic nastiness (eg, behavior expected to be
defined may be undefined), but in practice I expect such
circumstances never occur in any real-world program.

Agreed. In real-world programs, a non-constant array length which
unconditionally evaluates to a negative value is not useful.

Actually I was talking about a different dependency, namely
whether or not something is a VLA, regardless of whether
the size is valid. But like I said I think there is no
real practical difference.

Even more strangely, while
if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.

I see that you have amended this assessment in a followup;
I'll respond to that.

I'll include my reply to that here:

Without an official response, it could be unclear whether 6.6p4
applies to 0 / 0 if an implementation defines its result, but DR #31
flat out states that "case INT_MAX + 2:" is a constraint violation and
requires a diagnostic. 0 / 0 and (int) (0 / 0.) are in the same
category.

I did of course read DR 31. The trouble is it is awfully
ambiguous about what it applies to or when; IMO it's not at all
obvious whether these cases "are in the same category".
Consider:

1. Examples in the Rationale show that some expressions with
undefined behavior still can be valid ICE's and not run afoul
of 6.6p4;

2. DR 31 doesn't mention 6.6p10 (and the Rationale examples
demonstrate that this might be relevant);

3. DR 31 doesn't mention implementation-dependent definitions
for evaluation in certain cases of UB;

4. The distinction between having documented definitions and
having something be "still just UB" is important and should not
just be glossed over;

5. Looking at the specifics, 'INT_MAX + 2' has a well-understood
mathematical value, and one outside the range of int, but '0/0'
does not -- it would not be unreasonable to define '0/0' as 1,
and any other integer over 0 as an exceptional condition;

6. The case of '(int) (0 / 0.)' is yet more unusual, because
'0/0.' has a defined value (in some implementations), and what
then happens on the '(int)' conversion may definedly lead to
a valid value (which I believe accounts for why gcc gives '0/0'
a warning but does not for the '(int) (0/0.)' case).

7. There is no reason an implementation couldn't define
values for these expressions as constant expressions
but leave them undefined for run-time evaluation -- in
fact that might make a certain amount of sense for
expressions like '0/0'.
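
As a purely hypothetical sketch of what 7 might look like from the
user's side (no existing implementation is claimed to behave this
way; the names and values are made up):

    /* Suppose the implementation documents that 0/0, in a constant
       expression, evaluates to 1.  Then the following is accepted
       as an extended integer constant expression under 6.6p10:    */
    enum { Q = 0 / 0 };        /* Q == 1 under that documentation   */

    int f(int x, int y)
    {
        return x / y;          /* run-time division by zero remains
                                  undefined                          */
    }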

A trap that I think is easy to fall into reading the Standard,
or other Standard-related documents, is to focus on one aspect as
being absolutely right, and then draw conclusions based on that,
without considering other passages that could undermine the
reasoning. Unfortunately there are cases -- and I believe this
is one of them -- where separate pieces individually make sense
but taken together result in an inconsistency.

As far as the Standard is concerned there is no difference
between these two; gcc doesn't have to declare which one
it's doing, and so is always allowed to do whatever it wants.
Only as long as it doesn't declare which one it's doing. If the
documentation is to state that it relies on 6.6p10 in a future
version (which I'm hoping it will, if asked), the standard requires
the obvious behaviour,

[the "this" under discussion was snipped]
char x[1];
static long xlong = (long) x;

I think I should revise my answer here. Despite there being no
italics in 6.6p7, I believe this paragraph is meant to be
definitional, giving requirements that need to be met for an
expression (that is a constant-expression syntactically) to be
considered "a constant expression in an initializer". Under that
reading, if the 'shall' in 6.6p7 is not met, the expression in
question fails to be a constant expression (in that context),
which would be a constraint violation for cases like the example
above.

[unless an implementation extends 6.6p7 to include address constants
cast to integral types.]

No that's my point -- the expression '(long) x' above can be
taken as a constant expression if, and *ONLY* if, there is an
additional form of constant expression accepted under 6.6p10; it
cannot be accepted as a constant expression just as "undefined
behavior" of not meeting the 'shall's.

I will agree that your interpretation could be what was intended by
6.6p7.

Does it make sense without interpreting it as a definition? Are there
useful extended "constant expressions in initializers" that could not
be an extended "arithmetic constant expression" or an extended
"address constant expression"? Yes, I think there could be.

static struct S s = (const struct S) {0};

In fact, this is accepted by ICC in conforming mode.

Interesting. I concur with your reasoning.
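
For reference, a self-contained version of the example (the struct
definition is assumed, just for illustration):

    struct S { int a; int b; };

    /* A compound literal used as the initializer of an object with
       static storage duration.  This is not one of the forms required
       by 6.6p7-9, so it relies on the implementation accepting an
       additional form of constant expression in initializers under
       6.6p10 -- reportedly what ICC does, even in conforming mode. */
    static struct S s = (const struct S) {0};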

[..snip some parts where we just agree..]
To return to the original question --

Dredging around in the DR's, I found this:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_044.html

The answer to question one, in combination with the remarks in
the Rationale document, makes it pretty clear that 6.6p10 is
meant to cover all the various forms of constant expressions.

I had since found DR 261 already, which states as part of its
response:

"Otherwise, if the expression meets the requirements of 6.6
(including any form accepted in accordance with 6.6#10), it is a
constant expression."

which is equally clear.

This kind of surprises me; I thought DR 044 was more
obviously on point for that question. However, as
long as we're both okay with the conclusion I suppose
it doesn't matter what convinced us. :)
 
H

Harald van Dijk

Harald van Dijk said:
I do not believe there are exceptions to 6.6p3 and 6.6p4, even though
it might make sense. See DR #32/#31 and below.

6.6p3 is pretty clear cut.  6.6p4 is more problematic,
despite the DR's (which I already looked at carefully).
Even more strangely, while
  if (0) { int a[0 / 0]; }
is permitted in a strictly conforming program because 0 / 0 is not a
constant expression,
  if (0) { int a[(int) (0 / 0.)]; }
violates the constraint. gcc generates a warning for the first, but
nothing for the second.
I see that you have amended this assessment in a followup;
I'll respond to that.
I'll include my reply to that here:
Without an official response, it could be unclear whether 6.6p4
applies to 0 / 0 if an implementation defines its result, but DR #31
flat out states that "case INT_MAX + 2:" is a constraint violation and
requires a diagnostic. 0 / 0 and (int) (0 / 0.) are in the same
category.

I did of course read DR 31.  The trouble is it is awfully
ambiguous about what it applies to or when;  IMO it's not at all
obvious whether these cases "are in the same category".

Looking closer, I agree that it is not clear. I was basing it on
6.5p5, which is a single rule rendering both undefined in non-constant
expressions (not considering extensions). I had not realised that the
rules for constant expressions are split up differently.
Consider:

  1. Examples in the Rationale show that some expressions with
  undefined behavior still can be valid ICE's and not run afoul
  of 6.6p4;

Yes, but those examples are undefined in non-constant expressions for
reasons other than 6.5p5.
  2. DR 31 doesn't mention 6.6p10 (and the Rationale examples
  demonstrate that this might be relevant);

3. DR 31 doesn't mention implementation-dependent definitions
for evaluation in certain cases of UB;

DR 32 clarifies that 6.6p10 does not grant implementations permission
to accept anything violating a constraint as an extended constant
expression. It states that an implementation cannot ignore the
constraint in 6.6p3 by relying on 6.6p10, so I am fairly certain an
implementation cannot ignore the constraint in 6.6p4 by relying on
6.6p10 either.
  4. The distinction between having documented definitions and
  having something be "still just UB" is important and should not
  just be glossed over;
Agreed.

  5. Looking at the specifics, 'INT_MAX + 2' has a well-understood
  mathematical value, and one outside the range of int,

On the other hand, evaluating INT_MAX + 2 to INT_MIN + 1 (which is
within the range of int) is such a common extension that if 6.6p10 is
relevant, it really ought to have been mentioned in DR #31.
but '0/0'
  does not -- it would not be unreasonable to define '0/0' as 1,
  and any other integer over 0 as an exceptional condition;

All of INT_MAX + 2, 0 / 0, and 1 / 0 are an "exceptional condition" in
the sense of 6.5p5, which renders the behaviour undefined. But yes,
a legitimate point can be raised that while "exceptional condition"
doesn't distinguish between "the result is not mathematically defined"
and "the result is not in the range of representable values for its
type", 6.6p4 does, so only some exceptional conditions require a
diagnostic in constant expressions. I do not know whether this
distinction is intentional.
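
For concreteness, the wrap-around result mentioned above can be shown
without relying on signed overflow (assuming a 32-bit two's-complement
int, the usual platform for that extension):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* INT_MAX + 2 == 2147483649, which reduced modulo 2**32 into
           the range of int gives -2147483647, i.e. INT_MIN + 1.     */
        long long wrapped = (long long) INT_MAX + 2 - ((long long) 1 << 32);
        printf("%d\n", wrapped == (long long) INT_MIN + 1);  /* prints 1 */
        return 0;
    }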
  6. The case of '(int) (0 / 0.)' is yet more unusual, because
  '0/0.' has a defined value (in some implementations), and what
  then happens on the '(int)' conversion may definedly lead to
  a valid value (which I believe accounts for why gcc gives '0/0'
  a warning but does not for the '(int) (0/0.)' case).
Agreed.

  7. There is no reason an implementation couldn't define
  values for these expressions as constant expressions
  but leave them undefined for run-time evaluation -- in
  fact that might make a certain amount of sense for
  expressions like '0/0'.

That would be in violation of 6.6p11.
A trap that I think is easy to fall into reading the Standard,
or other Standard-related documents, is to focus on one aspect as
being absolutely right, and then draw conclusions based on that,
without considering other passages that could undermine the
reasoning.  Unfortunately there are cases -- and I believe this
is one of them -- where separate pieces individually make sense
but taken together result in an inconsistency.

I do not believe there is any inconsistency, either in your
interpretation or in mine.
    char x[1];
    static long xlong = (long) x;
I think I should revise my answer here.  Despite there being no
italics in 6.6p7, I believe this paragraph is meant to be
definitional, giving requirements that need to be met for an
expression (that is a constant-expression syntactically) to be
considered "a constant expression in an initializer".  Under that
reading, if the 'shall' in 6.6p7 is not met, the expression in
question fails to be a constant expression (in that context),
which would be a constraint violation for cases like the example
above.
[unless an implementation extends 6.6p7 to include address constants
cast to integral types.]

No that's my point -- the expression '(long) x' above can be
taken as a constant expression if, and *ONLY* if, there is an
additional form of constant expression accepted under 6.6p10;  it
cannot be accepted as a constant expression just as "undefined
behavior" of not meeting the 'shall's.

Yes, I understood. I was pretty sure that by "if the 'shall' in 6.6p7
is not met, the expression in question fails to be a constant
expression (in that context)" you meant "unless permitted as an
extended constant expression" (under 6.6p10, extending 6.6p7), and
that was all I meant with my comment. The words just came out
confusing.
 
M

Michael Press

Tim Rentsch said:
5. Looking at the specifics, 'INT_MAX + 2' has a well-understood
mathematical value, and one outside the range of int, but '0/0'
does not -- it would not be unreasonable to define '0/0' as 1,
and any other integer over 0 as an exceptional condition;

I think it is unreasonable to try to define 0/0. A
situation when 0/0 has meaning never arises. 0/0 should
always raise the divide by zero signal, because 0/0
only ever arises when a programmer goofed. Two rational
fractions a/b, c/d are equal when ad = bc but if b = d = 0
then 0/0 = a/b for all a, b so the definition of equality
does not work.

Contrast with 0^0.

0^0 = 1 is typically a useful definition.
For instance

1) lim_{x->0} x^x = 1.

2) The number of mappings from the empty set to the
empty set is 0^0. It _has_ to be 1 because the only
mapping from the empty set to the empty set is the
empty set, so the set of mappings contains only the
empty set, and thus has cardinality 1.

<http://www.cs.uwaterloo.ca/~alopez-o/math-faq/mathtext/node14.html>
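
A one-line justification of the limit in (1), for completeness:
x^x = e^(x ln x), and x ln x -> 0 as x -> 0+, so x^x -> e^0 = 1.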
 
T

Tim Rentsch

Harald said:
Harald van Dijk said:

Consider:

1. Examples in the Rationale show that some expressions with
undefined behavior still can be valid ICE's and not run afoul
of 6.6p4;

Yes, but those examples are undefined in non-constant expressions for
reasons other than 6.5p5.

That doesn't matter. Undefined behavior is undefined behavior;
there aren't different kinds of UB (per 4p2).

DR 32 clarifies that 6.6p10 does not grant implementations permission
to accept anything violating a constraint as an extended constant
expression. It states that an implementation cannot ignore the
constraint in 6.6p3 by relying on 6.6p10, so I am fairly certain an
implementation cannot ignore the constraint in 6.6p4 by relying on
6.6p10 either.

6.6p4 cannot be ignored if it is violated. However, if the behavior
is defined by the implementation to produce a value that is in-range
for the type in question, then there is no violation.

On the other hand, evaluating INT_MAX + 2 to INT_MIN + 1 (which is
within the range of int) is such a common extension that if 6.6p10 is
relevant, it really ought to have been mentioned in DR #31.

That result sometimes happens accidentally, but that's not the same
thing as it being defined that way. Even in implementations that
reliably generate code that gives this result in all cases, I wouldn't
necessarily expect that behavior to be guaranteed or documented. That
is a key difference.

All of INT_MAX + 2, 0 / 0, and 1 / 0 are an "exceptional condition" in
the sense of 6.5p5, which renders the behaviour undefined. But yes,
a legitimate point can be raised that while "exceptional condition"
doesn't distinguish between "the result is not mathematically defined"
and "the result is not in the range of representable values for its
type", 6.6p4 does, so only some exceptional conditions require a
diagnostic in constant expressions. I do not know whether this
distinction is intentional.

I read 6.6p4 as talking about the value under C evaluation
rules, not some notion of mathematical value; it's hard
to see how it could be otherwise, given things like unsigned
types and pointer arithmetic. Surely 6.6p5 is meant to
describe a situation relative to some mathematical notion
of value. So I don't think they are talking about exactly
the same thing. Yes?

That would be in violation of 6.6p11.

I don't think so. Because the Standard deems this behavior undefined,
the implementation is free to define it however it wants. So, for
example, it could define it to be the result of a function call whose
value depended on whether 'main()' had been entered (so neither case
is "undefined" but the potential effect is just as wide-ranging).
Different results, but the same definition.

I do not believe there is any inconsistency, either in your
interpretation or in mine.

I'm not sure I understand what you're saying. The inconsistency
I'm talking about is one that arises from the Standard plus
several DR's. (Technically there is no inconsistency since
the DR's don't always make general statements but only give
examples, but assuming the most natural generalizations of the
comments in the DR there is.) The inconsistency results from

1. Undefined behavior always means an out-of-range value
(for any type);

2. An out-of-range value is always a violation of 6.6p4;

3. CE's under 6.6p10 still may not violate 6.6p3;

4. 6.6p3 and 6.6p4 both have the same status as regards
violation, and in particular have the same status as
regards CE's under 6.6p10;

5. Hence no form accepted under 6.6p10 that has undefined
behavior can be accepted without a diagnostic (ie,
without being a constraint violation); and yet

6. The examples for 'offsetof()' in the Rationale have
undefined behavior, and presumably would not be a
constraint violation, because 'offsetof()' must be usable
in strictly conforming programs.

Something has to give somewhere. These can't all be true unless
we are willing to accept an inconsistent logic (and I believe no
one who is at all sensible seriously advocates doing that).

char x[1];
static long xlong = (long) x;
I think I should revise my answer here. Despite there being no
italics in 6.6p7, I believe this paragraph is meant to be
definitional, giving requirements that need to be met for an
expression (that is a constant-expression syntactically) to be
considered "a constant expression in an initializer". Under that
reading, if the 'shall' in 6.6p7 is not met, the expression in
question fails to be a constant expression (in that context),
which would be a constraint violation for cases like the example
above.
[unless an implementation extends 6.6p7 to include address constants
cast to integral types.]

No that's my point -- the expression '(long) x' above can be
taken as a constant expression if, and *ONLY* if, there is an
additional form of constant expression accepted under 6.6p10; it
cannot be accepted as a constant expression just as "undefined
behavior" of not meeting the 'shall's.

Yes, I understood. I was pretty sure that by "if the 'shall' in 6.6p7
is not met, the expression in question fails to be a constant
expression (in that context)" you meant "unless permitted as an
extended constant expression" (under 6.6p10, extending 6.6p7), and
that was all I meant with my comment. The words just came out
confusing.

I see -- your []'ed comment was meant as a clarification of my
statement (which I believe was intended to be made in a context
where 6.6p10 was known not to be in play, otherwise I would have
pointed out that possibility myself). Anyway that's all good
now.
 
H

Harald van Dijk

Harald said:
[snip]
Consider:
1. Examples in the Rationale show that some expressions with
undefined behavior still can be valid ICE's and not run afoul
of 6.6p4;
Yes, but those examples are undefined in non-constant expressions for
reasons other than 6.5p5.

That doesn't matter. Undefined behavior is undefined behavior;
there aren't different kinds of UB (per 4p2).

4p2 doesn't say that undefined behaviour with constraint violation is
equivalent to undefined behaviour without constraint violation. Some
forms of undefined behaviour in non-constant expressions violate a
constraint in constant expressions, others don't.
That result sometimes happens accidentally, but that's not the same
thing as it being defined that way.

I understand. There are multiple implementations that do define it as
an extension.
I read 6.6p4 as talking about the value under C evaluation
rules, not some notion of mathematical value;

If the same wording in 6.5p5 can only logically be taken to refer to
the mathematical value, I assume it also refers to the mathematical
value in 6.6p4.
it's hard
to see how it could be otherwise, given things like unsigned
types

Unsigned types are reduced modulo 2**N. Taking UINT_MAX+1, the
mathematical result of (UINT_MAX+1) modulo (UINT_MAX+1) is zero, which
is within the range of unsigned int.
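
A minimal sketch of that point:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* Unsigned arithmetic is defined to be reduced modulo
           UINT_MAX + 1, so the result is always in range and 6.6p4
           never comes into play for an expression like this one. */
        unsigned int u = UINT_MAX + 1u;
        printf("%u\n", u);   /* prints 0 on every conforming
                                implementation */
        return 0;
    }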
and pointer arithmetic.

I do not believe either 6.5p5 or 6.6p4 applies or is intended to apply
to pointer arithmetic. Do you know of an implementation that treats it
that way -- that diagnoses

a.c:
int array[1];

b.c:
extern int array[];
int *pointer = array + 20;

as a constraint violation? I believe this is simply undefined
behaviour because of 6.5.6p8, both in constant and in non-constant
expressions.
Surely 6.6p5 is meant to

[ 6.5p5? ]
describe a situation relative to some mathematical notion
of value. So I don't think they are talking about exactly
the same thing. Yes?

The only difference in the wording is that 6.5p5 allows for cases
where the result is not mathematically defined. The wording "in the
range of representable values for its type" is identical between 6.5p5
and 6.6p4, so how could it mean two different things? Besides, 6.6p4
makes more sense to me when reading it as referring to the
mathematical value than otherwise.
I don't think so. Because the Standard deems this behavior undefined,
the implementation is free to define it however it wants. So, for
example, it could define it to be the result of a function call whose
value depended on whether 'main()' had been entered (so neither case
is "undefined" but the potential effect is just as wide-ranging).
Different results, but the same definition.

Unless 6.6p4 is intended to apply to 0/0 -- of which I am now unsure,
as already stated:

*If* the implementation defines 0/0 as 1/!__has_entered_main(), *and*
documents it as such, it may technically be permitted. It is a silly
implementation that only attempts to stretch the rules, and I will not
take such an implementation any more seriously than I would one that
diagnoses every program with only "hello".
I do not believe there is any inconsistency, either in your
interpretation or in mine.

I'm not sure I understand what you're saying.
[...]
1. Undefined behavior always means an out-of-range value
(for any type);

This is not something I have claimed, nor something you have claimed.
If you do believe this, or believe that I do, then I can see how you
come to an inconsistency, and why you are attempting to resolve it by
interpreting 6.6p4 differently from how I do.
 
