Origin of size_t? Curious.

S.Tobias

pete said:
S.Tobias said:
N869
6.5.3.4 The sizeof operator
[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type
If the size cannot be taken, then how can sizeof do
what it's supposed to do?

So what should a compiler do? Should it reject the code which
contains a declaration of a type that is too large, or is it UB
to apply `sizeof' to such a type?
My compiler accepts the oversized declaration,
I just can't use it anywhere.
When I uncomment array_5, I get:
new.c(11) : error C2089: 'structure' : 'struct' too large
/* BEGIN new.c */
#include <stdio.h>

struct structure {
    char array_0[(size_t)-1 / 5];
    char array_1[(size_t)-1 / 5];
    char array_2[(size_t)-1 / 5];
    char array_3[(size_t)-1 / 5];
    char array_4[(size_t)-1 / 5];
/*  char array_5[(size_t)-1 / 5];*/
};

int main(void)
{
    printf("sizeof(array) is %lu\n",
           (long unsigned)sizeof(struct structure));
    return 0;
}
/* END new.c */

AFAICT this program is strictly conforming (I believe there are no limits
on the size of types), and there is no reason to reject it, at least not
at the point of the struct declaration. I think it's a defect in the Standard.
There's an interesting answer from Robert Gamble, pointing to DR266.
I think the Committee's response misses the point, but at least
it indicates there's no constraint violation there (a diagnostic
is not required).
 
Lawrence Kirby

S.Tobias said:
pete said:
S.Tobias wrote:

I don't understand. Where exactly is it written that `size_t' must
be able to cover the sizes of all types? All I can find is that `size_t'
is the type returned by the `sizeof' operator. Does it mean that
implementations must forbid types whose size cannot be taken?
N869
6.5.3.4 The sizeof operator
[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type
If the size cannot be taken, then how can sizeof do
what it's supposed to do?

It can't, which means that the compiler should not successfully translate
the program, which is fine.

It takes the 3rd option: "I accept your code but I can't translate it".
It doesn't say anywhere that it is undefined behavior.

Well undefined behaviour can occur as a lack of definition of behaviour,
but I don't think that's a valid argument here.
6.5.3.4 clearly says what the compiler has to do. If the compiler
cannot do what it has to do, then it had better tell the user.

If it cannot generate code that conforms to the standard's specification
then it must not generate code.
(The correct solution is for the compiler not to allow the definition of
any type that is so large that sizeof cannot yield the correct result as
required).

An important principle here is that a conforming implementation is only
required to be able to successfully translate and execute ONE program as
specified in 5.2.4.1. That makes some sense if you consider that a
strictly conforming program can be arbitrarily large and use an
arbitrarily large amount of static and automatic storage, even if
individual objects are all small. So if you required a conforming
implementation to successfully translate and execute all strictly
conforming programs, there could be no conforming implementations in the
real world.

Lawrence
 
pete

Lawrence said:
S.Tobias said:
S.Tobias wrote:

I don't understand.
Where exactly is it written that `size_t' must
be able to cover the sizes of all types?
All I can find is that `size_t'
is the type returned by the `sizeof' operator. Does it mean that
implementations must forbid types whose size cannot be taken?

N869
6.5.3.4 The sizeof operator
[#2] The sizeof operator yields
the size (in bytes) of its
operand, which may be an expression or the
parenthesized
name of a type

If the size cannot be taken, then how can sizeof do
what it's supposed to do?

It can't, which means that the compiler
should not successfully translate
the program, which is fine.

It takes the 3rd option:
"I accept your code but I can't translate it".

What does "accept" mean?

I don't know that "accept" means
"be able to successfully translate and execute",
but I can't think of what else it could mean.
Well undefined behaviour can occur as
a lack of definition of behaviour,
but I don't think that's a valid argument here.


If it cannot generate code that conforms to the standard's
specification then it must not generate code.

Beyond generating code,
I can't apply meaning to what sizeof is supposed to do.
When the operand is a type with (1 + (size_t)-1) bytes,
what should sizeof return?
I can't answer that question in English.
It's not just a matter of generating code.
An important principle here is that a conforming implementation
is only required to be able to successfully translate
and execute ONE program as specified in 5.2.4.1.
That makes some sense if you consider that a
strictly conforming program can be arbitrarily large and use an
arbitrarily large amount of static and automatic storage, even if
individual objects are all small.

If a program defines two objects
or an object with more than 65535 bytes,
then it is beyond what 5.2.4.1 says
an implementation shall be able to translate and execute.
So if you required a conforming
implementation to successfully translate and execute all strictly
conforming programs,
there could be no conforming implementations in the real world.

I think the "one" reference is saying that
if something can translate and execute any strictly conforming program
then it's a conforming implementation,
and that if it can't, it isn't.
 
Ben Pfaff

Stan R. said:
Wasn't it so there would be a common size among different compilers, or
something of the sort?

No. It is so each compiler can choose its own type to represent
a size.
 
lawrence.jones

pete said:
The question is
"Should it reject the code which contains a declaration
of a type that is too large, or is it UB to apply `sizeof' to such a type?"

There was a great deal of discussion about that when the committee
considered DR 266. Whilst many people thought that the compiler was
obliged to reject such code, others thought that it was just undefined
behavior. The final decision (as reflected in the DR response) is that
such code is not strictly conforming, the compiler is allowed to reject
it, the compiler is also allowed to accept it (no diagnostic is
required) but applying sizeof results in undefined behavior (with a
compile-time diagnostic desirable but not required).

-Larry Jones

I've got more brains than I know what to do with. -- Calvin
 
pete

There was a great deal of discussion about that when the committee
considered DR 266. Whilst many people thought that the compiler was
obliged to reject such code,
others thought that it was just undefined behavior.
The final decision (as reflected in the DR response) is that
such code is not strictly conforming,
the compiler is allowed to reject it,
the compiler is also allowed to accept it (no diagnostic is
required) but applying sizeof results in undefined behavior (with a
compile-time diagnostic desirable but not required).

Thank you.
 
Lawrence Kirby

Lawrence said:
S.Tobias wrote:

I don't understand.
Where exactly is it written that `size_t' must
be able to cover the sizes of all types?
All I can find is that `size_t'
is the type returned by the `sizeof' operator. Does it mean that
implementations must forbid types whose size cannot be taken?

N869
6.5.3.4 The sizeof operator
[#2] The sizeof operator yields
the size (in bytes) of its
operand, which may be an expression or the
parenthesized
name of a type

If the size cannot be taken, then how can sizeof do
what it's supposed to do?

It can't, which means that the compiler
should not successfully translate
the program, which is fine.

So what should a compiler do? Should it reject the code which
contains a declaration of a type that is too large, or is it UB
to apply `sizeof' to such a type?

It takes the 3rd option:
"I accept your code but I can't translate it".

What does "accept" mean?

I don't know that "accept" means
"be able to successfully translate and execute",
but I can't think of what else it could mean.
Well undefined behaviour can occur as
a lack of definition of behaviour,
but I don't think that's a valid argument here.


If it cannot generate code that conforms to the standard's
specification then it must not generate code.

Beyond generating code,
I can't apply meaning to what sizeof is supposed to do.

sizeof returns the size of the type of its operand. If it cannot
successfully evaluate to a value that meets the requirements of the
standard then it must not successfully evaluate.

Actually my statement that the compiler must not generate code is too
strong; the compiler can generate code but the code must not complete
such a sizeof operation successfully.
When the operand is a type with (1 + (size_t)-1) bytes,
what should sizeof return?

That's the point, it cannot produce a result without violating the
standard so it must not produce a result.
I can't answer that question in English. It's not just a matter of
generating code.

It is a matter of implementing the semantics specified by the standard.
It is clear that in this case those semantics cannot be implemented, well
before generating code is an issue.

There is a defect in the standard inasmuch as it should state that the
size of a declared object shall not exceed SIZE_MAX. That would allow
undefined behaviour. However as things stand there is no opportunity for
undefined behaviour because the behaviour of sizeof is clearly defined. It
just happens to be unimplementable in this case, which is a clear defect.
If a program defines two objects
or an object with more than 65535 bytes, then it is beyond what 5.2.4.1
says an implementation shall be able to translate and execute.

It is certainly more than is required for the one program that the
implementation must be able to translate and execute successfully. However
that doesn't stop it being a strictly conforming program.
I think the "one" reference is saying that if something can translate
and execute any strictly conforming program then it's a conforming
implementation, and that if it can't, it isn't.

The text is pretty clear:

"The implementation shall be able to translate and execute at least one
program that contains at least one instance of every one of the following
limits."

It is a conformance requirement on the implementation. There must be at
least one program that the implementation can translate and execute and
that program must contain at least one instance of each of the specified
limits. There is nothing in the standard that requires that an
implementation be able to successfully translate and execute any other
program.

It is easy to construct strictly conforming programs for a compiler that
it can't translate and execute successfully, so the standard cannot
require an implementation to translate and execute every strictly
conforming program, if it is to be useful in the real world.

Lawrence
 
pete

Lawrence said:
The text is pretty clear:

"The implementation shall be able to
translate and execute at least one
program that contains at least one
instance of every one of the following limits."

It is a conformance requirement on the implementation.
There must be at
least one program that the implementation
can translate and execute and
that program must contain at least one instance
of each of the specified
limits. There is nothing in the standard that requires that an
implementation be able to successfully translate and execute any other
program.

It is easy to construct strictly conforming programs
for a compiler that
it can't translate and execute successfully, so the standard cannot
require an implementation to translate and execute every strictly
conforming program, if it is to be useful in the real world.

I think your interpretation beats mine for uselessness.

An implementation that can only translate
and execute one specific strictly conforming program,
and no others:
    int main(void) {return 0;}
for example, is pretty useless.
 
pete

pete said:
I think your interpretation beats mine for uselessness.

An implementation that can only translate
and execute one specific strictly conforming program,
and no others:
    int main(void) {return 0;}
for example, is pretty useless.

Well, that example is wrong because it doesn't contain one instance
of each of those specified limits.
But, a requirement that a conforming implementation
only need to be able to translate and execute
one specific program, is pretty useless.
 
CBFalconer

Lawrence said:
.... snip ...

sizeof returns the size of the type of its operand. If it cannot
successfully evaluate to a value that meets the requirements of
the standard then it must not successfully evaluate.

Actually my statement that the compiler must not generate code is
too strong; the compiler can generate code but the code must not
complete such a sizeof operation successfully.

sizeof cannot possibly fail. It doesn't really exist as an
operator. The compiler has to keep track of the storage needs of
components in some manner or other in order to assign that
storage. sizeof is simply a means of transmitting those numbers
for use in a program. What is unknown a priori is the range of
those values, thus a size_t type is needed.
 
Chris Torek

[My, i.e., Pete's earlier] example is wrong because it doesn't
contain one instance of each of those specified limits.
But, a requirement that a conforming implementation
only need to be able to translate and execute
one specific program, is pretty useless.

Indeed. However, the all-or-nothing nature of the Standard would
otherwise require a conforming implementation to be able to translate
and execute *every* program that contains one instance of each of
those limits, and/or every program that does not exceed any or all
of those limits.

I think this would actually be useful in a theoretical sense, but
you can probably imagine that it would be quite untestable in a
practical sense. :)
 
pete

Chris said:
[My, i.e., Pete's earlier] example is wrong because it doesn't
contain one instance of each of those specified limits.
But, a requirement that a conforming implementation
only need to be able to translate and execute
one specific program, is pretty useless.

Indeed. However, the all-or-nothing nature of the Standard would
otherwise require a conforming implementation to be able to translate
and execute *every* program that contains one instance of each of
those limits, and/or every program that does not exceed any or all
of those limits.

So then, what do you think "accept" means, as in:
"A conforming hosted implementation
shall accept any strictly conforming program."
 
Chris Torek

So then, what do you think "accept" means, as in:
"A conforming hosted implementation
shall accept any strictly conforming program."

I think this means "accept" in the computer-grammar and parsing
sense, i.e., agree that the syntax is correct.
 
pete

Chris said:
I think this means "accept" in the computer-grammar and parsing
sense, i.e., agree that the syntax is correct.

Does that mean that it would have to
be able to compile but not necessarily link?
I can't think of any other way to verify that
the implementation likes the syntax, except by compiling.
 
Lawrence Kirby

sizeof cannot possibly fail.

It cannot succeed when the value it is required to produce (the size of
the object/type) is outside the range of the values it is allowed to
generate (0 to SIZE_MAX).
It doesn't really exist as an
operator.

The C standard defines it as an operator. That is what it is.
The compiler has to keep track of the storage needs of
components in some manner or other in order to assign that
storage. sizeof is simply a means of transmitting those numbers
for use in a program. What is unknown a priori is the range of
those values, thus a size_t type is needed.

Yes, the question is what happens when size_t cannot represent the
required value.

Lawrence
 
Lawrence Kirby

I think this means "accept" in the computer-grammar and parsing
sense, i.e., agree that the syntax is correct.

Even that is too strong to be practical; strictly conforming programs can
be arbitrarily large. I doubt if, for example, any real-world preprocessor
can deal with an arbitrarily large sequence of #defines, or at least large
enough to cover every valid user namespace identifier.

"Accept" here IMO means "not reject", so an "I accept your program as
possibly correct but I can't translate it" is a valid accept response
EXCEPT for the one specified program. As a diagnostic it also covers the
case where the program has a syntax error or constraint violation.

Lawrence
 
CBFalconer

Lawrence said:
. snip ...

The C standard defines it as an operator. That is what it is.


Yes, the question is what happens when size_t cannot represent the
required value.

That's the point. The object can't exist when the compiler can't
keep track of its storage needs. Therefore the question is moot.
 
Lawrence Kirby

That's the point. The object can't exist when the compiler can't
keep track of its storage needs. Therefore the question is moot.

There's nothing to stop the compiler keeping track of its storage needs
using something other than size_t internally. The problem occurs when
the program explicitly asks about its size. The compiler is then
constrained to produce a value representable as a size_t.

I agree that this is a wrinkle in the language and the standard should
explicitly disallow declared objects of more than SIZE_MAX bytes, either
as undefined behaviour or a constraint. There is room for debate over
which is appropriate.

Lawrence
 
Lawrence Kirby

I think your interpretation beats mine for uselessness.

I have to disagree on that. :)

Your interpretation means that there can be no real-world conforming
implementations, which is about as useless as you can get. It means that
nothing can claim conformance and if your compiler doesn't claim
conformance you have no comeback on standards grounds when the compiler
generates code that produces wrong results.

My interpretation means that you can create trivial conforming
implementations. For example you can write a program that simply compares
the source code against its designated "one" program and writes a
prebuilt executable for that code. For any other code it just says "sorry
I couldn't compile that". Now that may sound pretty useless but it turns
out to be not too bad because:

1. Conforming implementations are possible

2. An implementation that claims conformance is bound by the rules of the
standard. If it produces wrong output you have a sound basis for
complaint.

3. In practice a "trivial" implementation wouldn't last very long.
Compiler writers have a vested interest in making implementations
that are useful. It is a quality of implementation issue rather than
a correctness (conformance) issue. The fact is that real-world
compilers DO attempt to implement the features of the language and
when they do translate code the rules mean that they have to do it
right to conform.

So it turns out that this isn't a bad approach after all.
An implementation that can only translate
and execute one specific strictly conforming program,
and no others:
    int main(void) {return 0;}
for example, is pretty useless.

So don't buy it. At least it won't corrupt your valuable database by
generating WRONG code from your application source.

Note that this isn't valid as an example of a "one" designated program
because it doesn't contain at least one instance of all of the
implementation limits. However such programs can be equally useless.

Lawrence
 
