CBFalconer said:
Well, just as easily you can get into a situation where 'size_t'
overflows and a dedicated application-specific type works. One
day one of those other coders decides to replace the built-in
array with a custom implementation of some "disjoint" array,
like 'deque', for example. Size of that container is no longer
limited by the range of 'size_t'.
All of which leads to easy generation of buffer overflows. What is
really needed is an index type to match the array, i.e. a
subrange. Unfortunately C doesn't have such a thing. Even an
enumeration won't work. An enhancement of typedef would allow
stronger typing in future, but probably won't happen. I.E. let
typedef create a real type:
typedef 0U ... 10U myindex; /* reusing the ... varargs token */
T arraything[tmax(myindex)];
for (myindex i = 0; i < tmax(myindex); i++) ...
using tmax (and tmin) as operators, similar to sizeof.
crossed to comp.std.c in case it stirs some thoughts.
...
That's true, but again, at application level the issue of choosing the
index type for an array is not really related to arrays in any way,
however strange that might sound. In other words, at application level
the original issue does not exist at all. The very fact that someone is
asking about the choice of type for an array index already indicates
a more general problem with the way that person designs code or
thinks about the issue. Let me explain once again.
In generic (library-level) code, i.e. code that works with generic
arrays, the choice of the index type is obvious - it's 'size_t'
(assuming that we are not considering negative indexing). It looks like
nobody is arguing with that. 'size_t' is also the correct choice of
index type for indexing strings for pretty much the same reasons.
However, in specific (application-level) code the choice of index type
immediately follows from the choice of the corresponding "quantity"
type. Whenever there is a need to store an application-specific
"quantity" in the program (number of cars on the lot, number of
employees in the company, number of lines in the file etc.) there's an
issue of choosing a type to represent that quantity. Assuming that we
are choosing from the set of built-in types, there will always be a
limitation on the maximum quantity we can handle in the program. This
limitation will become a part of the program's specification. It is the
program's responsibility to ensure that this part of the specification
is met. Otherwise, the chosen quantity type will overflow with
disastrous results. Your suggestion concerning "range types" is directly
relevant to this particular issue - the issue of choosing (or creating)
a type for representing an application specific "quantity" (although the
need to watch for overflows will always be there, as long as we are
using a fixed-range type). Note that so far I have not mentioned any
arrays or any other containers. There might be no containers in the
program at all. The issue of choosing the "quantity" type, however,
still exists.
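A minimal sketch of what I mean, with no array in sight. The names
'car_count', 'CAR_COUNT_MAX' and 'car_count_increment' are made up for
this illustration, and the choice of 'unsigned char' is just an
assumption to keep the limit small and visible:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical application-level "quantity" type: cars on the lot. */
typedef unsigned char car_count;

/* Assumed specification limit: the largest quantity the type can hold. */
#define CAR_COUNT_MAX UCHAR_MAX

/* Adds one car. Returns false, leaving *n unchanged, when the
   specification limit has been reached, so the quantity never wraps. */
bool car_count_increment(car_count *n)
{
    if (*n >= CAR_COUNT_MAX)
        return false;
    ++*n;
    return true;
}
```

The check lives with the quantity itself. Whether the cars are later
stored in an array, a list or nowhere at all makes no difference to it.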
Now, let's assume that one needs an array in the program and needs to
choose the index type. The first questions one should ask are what
this array is going to store, how many elements it will hold, and (!)
what type is _currently_ used to represent that quantity.
Note that by the time one needs an array the choice of the "quantity"
type has _already_ been made. _That_ type is the type that should be
used for array indexing, period. Someone here suggested that the array
index type might overflow if it is not 'size_t'. Wrong. As long as the
same type is used to represent the number of elements in the array and
the index (regardless of the concrete type, could be 'unsigned char' for
example) no overflow can ever occur. More precisely, at that stage
there's no issue of index type overflow. What _can_ overflow is the
originally chosen "quantity" type, but this issue has absolutely nothing
to do with array indexing. The overflow will happen long before one even
gets to any array indexing. The only thing that overflow will mean is
that the author of the code made a bad choice of "quantity" type (not
index type) or failed to enforce the program specification.
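To make that concrete, here is a small sketch. Everything in it
('employee_count', 'MAX_EMPLOYEES', 'struct payroll', 'total_salary') is
a hypothetical name invented for the example, and the limit of 200 is an
assumed specification value, not something from a real program:

```c
/* Hypothetical "quantity" type, chosen before any array existed. */
typedef unsigned char employee_count;

/* Assumed specification limit; it fits in employee_count. */
#define MAX_EMPLOYEES 200

struct payroll {
    employee_count n;               /* current number of employees */
    unsigned salary[MAX_EMPLOYEES]; /* one entry per employee      */
};

/* Sums all salaries. The index has the same type as the count, so the
   condition i < p->n keeps i strictly below a value that the type can
   represent, and ++i can never wrap. */
unsigned long total_salary(const struct payroll *p)
{
    unsigned long total = 0;
    for (employee_count i = 0; i < p->n; ++i)
        total += p->salary[i];
    return total;
}
```

If 200 employees ever turns out to be too few, the thing to change is
'employee_count' and the specification. The loop stays as it is.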
The suggestion to _unconditionally_ use 'size_t' for array indexing
because it "can never overflow" is one of those "fake wisdoms" that
float around the net in great numbers. It is as correct in theory as it
is useless and irrelevant in practice. It belongs in the same category
as "Never use division, because there's a danger of dividing by zero".
As a disclaimer: I have said this many times already, but just in case,
for posters of the "I'm not a reader, I'm a writer" type: 'size_t' is a
correct choice of index type for generic non-negative array indexing. Generic
array processing functions should always use this particular type for
representing array sizes and array indices (note, BTW, that in this case
the latter follows from the former too, i.e. the index type is
determined by the corresponding "quantity" type).
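For contrast, a sketch of the generic case. The function name
'index_of_max' is hypothetical; the point is only that a function which
must accept an array of any size takes the size as 'size_t', and the
index type then follows from that:

```c
#include <stddef.h>

/* Hypothetical generic helper: returns the index of the first largest
   element of a[0..n-1], or 0 when n is 0 (no element is read then).
   The size is a size_t because the function knows nothing about the
   application's quantity; the index is a size_t because the size is. */
size_t index_of_max(const int *a, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; ++i)
        if (a[i] > a[best])
            best = i;
    return best;
}
```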