Keith Thompson said:
My personal opinion is that getting the diagnostic 1000 times out of
1000 is better than getting it 999 times out of a thousand.
Yes, what you want is different from what I want. I get that.
If the Standard is left as is, so declarations resulting in sizes
larger than SIZE_MAX are not constraint violations, the chances are
_extremely_ high that you can and will still get all the diagnostics
you want; if, on the other hand, such cases are made constraint
violations, it's _guaranteed_ that conforming implementations will
give a (not always wanted) diagnostic. I know that you always want
it, but that's not true of everyone in all cases.
Currently, if I'm writing a function that deals with an array of
unsigned char of arbitrary size, I can be only 99.9% sure that I can
safely use size_t to index it. I don't see that that 0.1% does
anybody any real good.
This comment implicitly equates the 999/1000 pseudo-statistic (in effect
agreed to for the sake of discussion) with the percent chance that
size_t will work. That equality doesn't hold -- even if we grant the
pseudo-statistic, the actual chance that size_t will work is much
higher. You know that, right?
More importantly, unless size_t is required to be at least as large as
the address space of the machine in question (which seems like a bad
idea), there isn't any way to guarantee that using size_t will work in
all cases, _because code can obtain objects through extra-linguistic
means that don't have to obey implementation-enforced limits_.
If it's important to you to write code using indexing that will work
with any run-time object, including objects obtained through means not
under the implementation's control, there's nothing that says you have
to use size_t. There are other integer types with stronger guarantees
about how large a range they will cover -- use one of those instead.
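A minimal sketch of the kind of thing I mean (the name sum_bytes and
its interface are made up purely for illustration): carry the count in
unsigned long long, whose range is guaranteed to be at least 2**64 - 1,
and walk the buffer by pointer increment instead of indexing with
size_t.

    /* Illustrative only: sum the bytes of a buffer whose length is
     * not assumed to fit in size_t.
     */
    unsigned long long sum_bytes(const unsigned char *p, unsigned long long n)
    {
        unsigned long long sum = 0;
        while (n-- > 0)
            sum += *p++;
        return sum;
    }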
If I wanted to write code that creates objects bigger than SIZE_MAX
bytes, I could only use that code with an implementation that supports
such objects.
This is sort of a silly statement. No one sets out to create
objects specifically larger than SIZE_MAX bytes; it happens
incidentally, through other considerations. If someone wrote
code that had an array declaration like 'int a[10000][10000];',
that code could work _either_ with implementations having a
suitably large SIZE_MAX, _or_ with implementations that allow
objects with size larger than SIZE_MAX.
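To make the arithmetic concrete: with a typical 4-byte int that array
occupies 400,000,000 bytes, which overflows a 24-bit SIZE_MAX
(16,777,215) but fits easily under a 32-bit one (4,294,967,295). A
translation unit that cares can check its own assumption explicitly;
the ROWS/COLS macros and the C11 static_assert spelling below are only
for illustration.

    #include <assert.h>
    #include <stdint.h>

    #define ROWS 10000
    #define COLS 10000

    /* Do the size computation in unsigned long long so the check
     * itself cannot wrap around, then compare against SIZE_MAX.
     */
    static_assert((unsigned long long) ROWS * COLS * sizeof(int) <= SIZE_MAX,
                  "declared array would exceed SIZE_MAX");

    static int a[ROWS][COLS];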
I don't believe there currently are any such
implementations, so I can't write such code anyway. I propose
making that official.
The reasoning here seems bogus, based on the previous statement. Of
course you can write code that declares large arrays like those in the
above paragraph. What you want is to disallow implementations with
SIZE_MAX that is, say, 24 bits, from accepting them (with the obvious
meaning of "accepting"). If the implementation can run the program
and still be conforming in every other way, it makes more sense to
allow it than to disallow it.
And again, if it makes any sense to create objects bigger than
SIZE_MAX bytes, the problem is that the implementation chose too
small a value for SIZE_MAX.
I understand that that's your view. Other people have different
views.
Does it appeal to you? Are there any concrete non-hypothetical cases
where you really want to create objects bigger than SIZE_MAX bytes,
where having a bigger SIZE_MAX isn't the best solution?
I don't do much programming on embedded processors these days, but I
can easily imagine an embedded processor with a 16-bit or 24-bit word
size where in some cases it would make sense to have an object
larger than what 16 or 24 bits of size can represent.
Realistically, I might want to create objects bigger than 2**32
bytes on a system with 32-bit size_t -- but I can't, even if the C
implementation changed SIZE_MAX, because the underlying system just
doesn't support them. If I want to create such objects, I can either
use files rather than objects, or I can switch to a 64-bit system.
It sounds like you're assuming that an implementation with a 32-bit
size_t won't ever be running on a machine with a 64-bit address.
Even if that's true today, I don't see any reason to assume it
must be true tomorrow.
The above is limited to common current systems. Perhaps there
are (or could be) other systems with "exotic" memory models where
it makes more sense? Still, for all the cases I can think of,
the answer is to make SIZE_MAX bigger.
Again, I understand that that's your view.
[...]
I don't share the goal, but that's unrelated to the point I was
trying to make. The thing you want to do (and please excuse me
if I misrepresent your position, I don't mean to and am doing the
best I can not to) means putting stronger limits on what an
implementation can do (with regard to object sizes). Making _more_
things undefined behavior reduces restrictions -- i.e., it makes
the limits placed on implementations weaker, not stronger. Indeed,
if everything were undefined behavior then there would be no
limits at all -- anything could be a C compiler. So saying pointer
arithmetic of more than SIZE_MAX bytes is undefined behavior
gives implementations more freedom, not less; and, consequently,
makes programs less defined rather than more defined. It seems
like that direction is contrary to the direction you want to go.
Yes, in principle adding more cases of undefined behavior gives more
freedom to the implementation.
I started with the assumption that objects should not be bigger than
SIZE_MAX bytes. (Why? Because that's what SIZE_MAX *means*, though
that's not currently its literal meaning.) Given this assumption,
I assert that implementations shouldn't support the creation or
manipulation of such objects, user code shouldn't attempt to create
such objects, and user code doesn't need to worry that other code
might have created such objects. (The last is perhaps the most
important goal.)
My assumptions are different, but more importantly my conclusions are
different. I believe implementations should have the freedom to
choose whether or not they accommodate objects larger than SIZE_MAX
bytes. If you want this choice to be implementation-defined, and thus
documented, I'm okay with that. If you want there to be additional
preprocessor symbols specified, so code can be adjusted for such
implementations using conditional compilation, I'm also okay with
that. And if you want to add a requirement that implementations must
provide an option that produces diagnostics for declarations larger
than SIZE_MAX bytes, I'm okay with that too. What I am not okay with
is saying that _no_ conforming implementation may _ever_ be allowed to
tolerate objects larger than SIZE_MAX under any circumstances. As far
as I can tell, that last thing is what you're proposing.
Given these assumptions, I conclude that the best we can do is to
make compile-time-detectable cases constraint violations, make certain
calloc() calls fail by returning NULL, and make other attempts
undefined behavior.
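To illustrate the calloc() part, the overflow check involved looks
roughly like this (checked_calloc is a made-up wrapper name, not a
proposal for the library function itself):

    #include <stdint.h>
    #include <stdlib.h>

    /* Return NULL when nmemb * size cannot be represented in a size_t,
     * rather than letting the multiplication wrap around.
     */
    void *checked_calloc(size_t nmemb, size_t size)
    {
        if (size != 0 && nmemb > SIZE_MAX / size)
            return NULL;
        return calloc(nmemb, size);
    }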
Other people have different assumptions. What thought have
you given to exploring or discovering alternatives that
might let different viewpoints co-exist?
There are different kinds of undefined behavior. Some cases are
things that are clearly errors that an implementation isn't required
to detect. In other cases, an implementation can reasonably define
and document a meaning, providing an extension. (The standard
doesn't make this distinction.) My intent is that the new cases
of UB are of the former kind, errors that needn't be diagnosed.
They're UB rather than diagnosable errors simply because it's not
practical to diagnose them. (If C had exceptions ...)
I think I understand the distinction you're trying to make.
Assuming for the moment that I agree with your categorization
(and I'm not saying I don't, only that I'm not sure), I don't
see how it makes any difference to (what I think is) the
basic issue.
[...]
Sorry, my comment was a little bit unclear here. What I mean
is uintmax_t has 64 _value_ bits (and I want it to have 64 value
bits for other reasons), but intmax_t/uintmax_t have 128 bits
of representation. Such an implementation is now conforming
(because of the undefined behavior loophole for signed integer
arithmetic). Do you think the Standard should be changed so
that such implementations _not_ be conforming? I can't think of
a good reason why the Standard should be changed to exclude
them.
No, I don't think the Standard should be changed to exclude them.
In practice, though, it's almost certain either that 64-bit integers
require only 64 bits of storage, or that 128-bit integers can use all
128 bits of storage. In the rare counterexamples, presumably there
are good reasons for having 64 padding bits, and you can't use them
as value bits anyway.
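(As an aside, the distinction is easy to probe on any given
implementation. This small sketch counts the value bits of uintmax_t
by walking UINTMAX_MAX and compares that with the bits of storage the
type occupies; on the hypothetical implementation above the two would
differ by 64 padding bits.)

    #include <limits.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned value_bits = 0;
        for (uintmax_t m = UINTMAX_MAX; m != 0; m >>= 1)
            value_bits++;

        unsigned storage_bits = (unsigned) (sizeof(uintmax_t) * CHAR_BIT);

        printf("value bits   : %u\n", value_bits);
        printf("storage bits : %u\n", storage_bits);
        printf("padding bits : %u\n", storage_bits - value_bits);
        return 0;
    }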
The point is that now implementations have the freedom to
exercise that choice. It's better to leave that freedom
in place, even if 999 times out of 1000 an implementation
will make a different choice.
Yes, to the extent that that's possible. Implementations are
perfectly free to support arbitrarily large objects; I'm merely
suggesting that they should define SIZE_MAX appropriately to
reflect this.
And I don't have any objection to making that statement as
a suggestion (or a Recommended Practice). I just don't think
it should be an irrevocable requirement.
Then that's where we disagree.
Not quite. The maximum object size can be much smaller than the
usable address space size. For example it's entirely plausible that
an implementation with a 64-bit address space might only support
objects up to 2**32 bytes. I mentioned that on most current systems
SIZE_MAX is big enough to span the entire address space, but that
wasn't meant to be prescriptive.
Once you allow implementations to have address spaces larger than
SIZE_MAX, you better be prepared to deal with objects larger than
SIZE_MAX, because there's no way the implementation can guarantee such
objects won't occur dynamically.
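(One concrete way to see the "dynamically" part, assuming a POSIX-ish
system; MAP_ANONYMOUS is a common extension, not ISO C. Ask the OS for
two separate anonymous mappings and check whether they came back
adjacent: when size_t is narrower than the address space, nothing
prevents the combined extent from exceeding SIZE_MAX even though no
single request did.)

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = (size_t) 1 << 20;   /* 1 MiB each, purely illustrative */
        unsigned char *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        unsigned char *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (a == MAP_FAILED || b == MAP_FAILED)
            return 1;

        /* If the two mappings happen to be contiguous, the program now
         * has one usable region of 2 * len bytes that it never
         * requested as a single object.
         */
        printf("mappings are %sadjacent\n",
               (b == a + len || a == b + len) ? "" : "not ");
        return 0;
    }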
And what I'm suggesting is that SIZE_MAX should reflect (or be an
upper bound on) the maximum possible size of any object, because
that's what SIZE_MAX should mean.
I don't object to making that statement as a suggestion.
I do object to having it be an unconditional requirement.
Incidentally, there are several of your articles that I haven't
responded to, because it's going to require considerable time and
thought to do so. I've left them pending in my newsreader, and I
may or may not get around to responding to them eventually.
Well, at least now I know they're thought-provoking.