[MUDFLAP] Is sizeof(ARRAY[0]) equivalent to sizeof(*ARRAY) ?

K

Keith Thompson

Shao Miller said:
Shao Miller said:
MUDFLAP can diagnose stuff that's permitted by C90, such as warning you
that an index used for iteration has gone too far. Usually you want to
index one-past an array, at most. Do the 'for' loops have any problems?

As I'm sure you know, it's legal to construct a pointer one element
past the end of an array, but not to dererence it. Constructing a
pointer before the beginning of an array or more than one element
past the end of it, or deferencing a pointer one past the end of it,
has undefined behavior. I'm not aware of any relevant changes in
this area from C90 to C99 to C11.

Are you saying that Mudflap complains about stuff that's legal *and
has defined behavior* in C90? If it complained about something
like this:

int arr[N];
int *ptr;
for (ptr = arr; ptr < arr + N; ptr ++) {
/* ... */
}

that would IMHO be a serious bug in Mudflap.

(I haven't analyzed the posted code, but from the discussion I don't
think that's what's going on.)

Nah, I tried to be careful and chose to type "index" instead of "point".
The code I was criticizing used both an integer index as well as a
pointer. Their pointer would never point more than one past the array,
but their integer would index two past the array. That seems to be
permitted by C90.

It's permitted (in all versions of the language) for a object
of an integer type to have any value in the range of that type.
That has nothing at all to do with arrays; there's no association
(that a compiler can detect) between a given integer value and
indices into some array object.

That is, unless you actually use the integer value as an index into
a particular array. In that case, exactly the same rules apply as
for pointers, because the indexing operation is defined in terms
of pointer arithmetic.

Concretely:

int arr[10] = { 0 };
int i;
for (i = 0; i < 10; i += 3) {
arr = 42;
}

In successive iterations of the loop, i takes on the values 0, 3,
6, and 9. *After* the loop, i==12, which is a perfectly valid value
for an object of type int. Evaluating `arr` after the loop would
have undefined behavior, but the program doesn't do that so there's
no problem. There could be problems if i iterates past INT_MAX,
but it doesn't do that either.
I seem to recall some bit about this situation being easily detectable
and reportable as a bonus diagnostic, perhaps in the hopes of alerting
the programmer that there's a better way to write their loop without
stepping out of bounds, conceptually.

What would such a diagnostic look like?

Warning: `i' exceeds the bounds of `arr', so evaluating `arr'
would have undefined behavior, but that never happens so I'm not
sure what I'm warning you about.

A similar snippet using pointers *would* cause undefined behavior,
and would be worthy of a warning (though I can't quite persuade
gcc to generate one):

int arr[10] = { 0 };
int *ptr;
for (ptr = arr; ptr < arr + 10; ptr += 3) {
*ptr = 42;
}
return 0;

because it actually constructs the invalid pointer value.

Or am I missing something?
 
G

glen herrmannsfeldt

(snip on variables that could be used to index outside an array)
A similar snippet using pointers *would* cause undefined behavior,
and would be worthy of a warning (though I can't quite persuade
gcc to generate one):
int arr[10] = { 0 };
int *ptr;
for (ptr = arr; ptr < arr + 10; ptr += 3) {
*ptr = 42;
}
return 0;
because it actually constructs the invalid pointer value.

Yes, the standard disallows this, but it is not likely to cause
problems on the usual implementations.

An implementation actually doing array bounds checking should catch
an attempt to derefernence that pointer.

I suppose I can imagine someone implementing bounds checking
such that it actually checks such, but I don't see why.

On most processors, the worst that happens is that a pointer overflows,
usually without any problem.

In the past, there were processors that with keyed storage, such
that it actually knew the type stored in any memory location.
Along with in appropriate implementation of pointers, this might
be caught in hardware.

-- glen
 
S

Shao Miller

Shao Miller said:
[...]
MUDFLAP can diagnose stuff that's permitted by C90, such as warning you
that an index used for iteration has gone too far. Usually you want to
index one-past an array, at most. Do the 'for' loops have any problems?

As I'm sure you know, it's legal to construct a pointer one element
past the end of an array, but not to dererence it. Constructing a
pointer before the beginning of an array or more than one element
past the end of it, or deferencing a pointer one past the end of it,
has undefined behavior. I'm not aware of any relevant changes in
this area from C90 to C99 to C11.

Are you saying that Mudflap complains about stuff that's legal *and
has defined behavior* in C90? If it complained about something
like this:

int arr[N];
int *ptr;
for (ptr = arr; ptr < arr + N; ptr ++) {
/* ... */
}

that would IMHO be a serious bug in Mudflap.

(I haven't analyzed the posted code, but from the discussion I don't
think that's what's going on.)

Nah, I tried to be careful and chose to type "index" instead of "point".
The code I was criticizing used both an integer index as well as a
pointer. Their pointer would never point more than one past the array,
but their integer would index two past the array. That seems to be
permitted by C90.

It's permitted (in all versions of the language) for a object
of an integer type to have any value in the range of that type.
That has nothing at all to do with arrays; there's no association
(that a compiler can detect) between a given integer value and
indices into some array object.

But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.
That is, unless you actually use the integer value as an index into
a particular array. In that case, exactly the same rules apply as
for pointers, because the indexing operation is defined in terms
of pointer arithmetic.

Concretely:

int arr[10] = { 0 };
int i;
for (i = 0; i < 10; i += 3) {
arr = 42;
}

In successive iterations of the loop, i takes on the values 0, 3,
6, and 9. *After* the loop, i==12, which is a perfectly valid value
for an object of type int. Evaluating `arr` after the loop would
have undefined behavior, but the program doesn't do that so there's
no problem. There could be problems if i iterates past INT_MAX,
but it doesn't do that either.


This example is a good one, and rings a bell as being along the lines of
the bonus diagnostic I mentioned.

If I recall correctly, the idea is that whether or not one is using an
integer index or a pointer itself for iteration is hardly any different
to the compiler, so noting that an index goes "out of bounds" is just as
easy as noting that a pointer goes out of bounds, so it's pretty
straightforward to provide a bonus diagnostic, even though the program
might not suffer from undefined behaviour.
I seem to recall some bit about this situation being easily detectable
and reportable as a bonus diagnostic, perhaps in the hopes of alerting
the programmer that there's a better way to write their loop without
stepping out of bounds, conceptually.

What would such a diagnostic look like?

Warning: `i' exceeds the bounds of `arr', so evaluating `arr'
would have undefined behavior, but that never happens so I'm not
sure what I'm warning you about.


Let me try to dig it up and post again...
A similar snippet using pointers *would* cause undefined behavior,
and would be worthy of a warning (though I can't quite persuade
gcc to generate one):

int arr[10] = { 0 };
int *ptr;
for (ptr = arr; ptr < arr + 10; ptr += 3) {
*ptr = 42;
}
return 0;

because it actually constructs the invalid pointer value.

Or am I missing something?

Not that I know of. :) It's just helpful, because it's usually right...
The loop is usually clearer if it's reworked to avoid the diagnostic.
 
S

Shao Miller

I seem to recall some bit about this situation being easily detectable
and reportable as a bonus diagnostic, perhaps in the hopes of alerting
the programmer that there's a better way to write their loop without
stepping out of bounds, conceptually.

What would such a diagnostic look like?

Warning: `i' exceeds the bounds of `arr', so evaluating `arr'
would have undefined behavior, but that never happens so I'm not
sure what I'm warning you about.


Let me try to dig it up and post again...


I seem to recall that the introduction of FORTIFY_SOURCE caused some
sudden breakage for Syslinux, when Syslinux treated warnings as errors.
I've tried now for several minutes to find the nice explanation of the
conforming-but-warned-against loop situation that I'd seen before, but
I've not yet found it.

In the meanwhile, what I can do is to quote the patch:

"With -D_FORTIFY_SOURCE=2 some more checking is added, but some
conforming programs might fail." -- Jakub Jelinek

http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02055.html
 
G

glen herrmannsfeldt

Shao Miller said:
On 1/11/2013 17:04, Keith Thompson wrote:
(snip)

As long as it isn't used to index any actual array, I agree.
But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:
for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.

(And I presume referencing arr inside the loop.)

Yes the compiler can plainly see that i indexes the array, and
also that it can never have a value inside the loop that will
exceed the array bounds.

A compiler that does bounds checking might not check references
to arr inside the loop, as it is known to be in bounds.
(Unless i is otherwise changed inside the loop.)
(snip)
This example is a good one, and rings a bell as being along the lines of
the bonus diagnostic I mentioned.
If I recall correctly, the idea is that whether or not one is using an
integer index or a pointer itself for iteration is hardly any different
to the compiler, so noting that an index goes "out of bounds" is just as
easy as noting that a pointer goes out of bounds, so it's pretty
straightforward to provide a bonus diagnostic, even though the program
might not suffer from undefined behaviour.

No, other way around.

The standard prohibits pointers from pointing more than one past the
end of an array, not to aid bounds checking but to allow for hardware
that might have restrictions on pointers. (Well, that it my
understanding of the reason.)

Because of that requirement it would certainly be allowed for an
implementation doing bounds checking to check that a pointer was
in range. Unless it is actually used to reference, fetch or store,
actual data, though, I wouldn't do it.

I used to do a lot of programming in 80286 protected mode with OS/2.

Well, all the modes are still there, but nobody uses them.

A segment descriptor describes a region in memory with an origin and
length, and any access using that descriptor is checked by the hardware.

When a (16 bit) segment selector is loaded into a segment register,
the appropriate descriptor is loaded into the correspending register.

Now, compilers won't load random date into segment registers, as it
will fail if the corresponding descriptor doesn't exist or is invalid.

There is, however, no restriction on the offset. That goes into an
ordinary register and is only checked when actual access is made to
data.

There is a hardware trap on fetch or store past the end of a segment.
I did a lot of debugging using that. OS/2 would put up a nice window
with the register contents at the time of the trap, making it fairly
easy to find.

However, in huge model it might cause problems. Personally, I avoid
huge if possible. Huge allows for arrays more than 64K (in 80286
mode) or more than 4GB (in 32 bit protected mode) by choosing a
segment selector based on the index. Even so, though, I would be
surprised to see the compiler actually do it. It can easily do the
computation in other registers, only loading the segment register
when actually addressing data.

Well, the big problem with doing this is that there is no cache
for segment descriptors. Every time you load a 16 bit selector
the processor has to also load a 64 bit descriptor from memory.
(Well, hopefully in processor cache, but still.)

With an appropriate in-processor descriptor cache it could have
been much faster. Enough to delay the need for 64 bit addressing.

--------------

It might be that some of the Burroughs machines from years past had
pointer registers that checked that they were in bounds.

But there should be no problem with a program indexing an integer
variable that could be used to access an array, as long as it isn't
actually used to access the array outside the bounds.

A compiler on a machine that did have that requirement would have
to be sure not to load such into the appropriate register until
it had been checked.
Let me try to dig it up and post again...
A similar snippet using pointers *would* cause undefined behavior,
and would be worthy of a warning (though I can't quite persuade
gcc to generate one):
int arr[10] = { 0 };
int *ptr;
for (ptr = arr; ptr < arr + 10; ptr += 3) {
*ptr = 42;
}
return 0;
because it actually constructs the invalid pointer value.

-- glen
 
B

Ben Bacarisse

glen herrmannsfeldt said:
(snip on variables that could be used to index outside an array)
A similar snippet using pointers *would* cause undefined behavior,
and would be worthy of a warning (though I can't quite persuade
gcc to generate one):
int arr[10] = { 0 };
int *ptr;
for (ptr = arr; ptr < arr + 10; ptr += 3) {
*ptr = 42;
}
return 0;
because it actually constructs the invalid pointer value.

Yes, the standard disallows this, but it is not likely to cause
problems on the usual implementations.

An implementation actually doing array bounds checking should catch
an attempt to derefernence that pointer.

I suppose I can imagine someone implementing bounds checking
such that it actually checks such, but I don't see why.

On most processors, the worst that happens is that a pointer overflows,
usually without any problem.

If you combine these last to you get a reason to see why you might want
to check the pointer. If an implementation wants to do bounds checking,
yet pointers can wrap round to "safe" values, you need to check the
pointer generation and not just the access.

<snip>
 
S

Shao Miller

Shao Miller said:
On 1/11/2013 17:04, Keith Thompson wrote:
(snip)

As long as it isn't used to index any actual array, I agree.
But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:
for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.

(And I presume referencing arr inside the loop.)


No, that was the point. There is no 'arr' in the loop and yet the
compiler (or run-time support, perhaps) can still see that something
doesn't seem quite right about this. Why is 'i' being incremented by 10
and then compared against the length of 'arr'? Seems fishy.
Yes the compiler can plainly see that i indexes the array, and
also that it can never have a value inside the loop that will
exceed the array bounds.

A compiler that does bounds checking might not check references
to arr inside the loop, as it is known to be in bounds.
(Unless i is otherwise changed inside the loop.)


Well my point here was that the loop body didn't make a difference.
No, other way around.

The standard prohibits pointers from pointing more than one past the
end of an array, not to aid bounds checking but to allow for hardware
that might have restrictions on pointers. (Well, that it my
understanding of the reason.)

Sorry; wasn't talking about the Standard and its rationale, but about my
recollection of the rationale for the diagnostic which could be emitted
as a sanity-check for a conforming program. And it makes sense, to me,
as something that can benefit the programmer (or at least, _other_
programmers reading the code; heh!).
Because of that requirement it would certainly be allowed for an
implementation doing bounds checking to check that a pointer was
in range. Unless it is actually used to reference, fetch or store,
actual data, though, I wouldn't do it.

I used to do a lot of programming in 80286 protected mode with OS/2.

Well, all the modes are still there, but nobody uses them.

A segment descriptor describes a region in memory with an origin and
length, and any access using that descriptor is checked by the hardware.

When a (16 bit) segment selector is loaded into a segment register,
the appropriate descriptor is loaded into the correspending register.

Now, compilers won't load random date into segment registers, as it
will fail if the corresponding descriptor doesn't exist or is invalid.

There is, however, no restriction on the offset. That goes into an
ordinary register and is only checked when actual access is made to
data.

There is a hardware trap on fetch or store past the end of a segment.
I did a lot of debugging using that. OS/2 would put up a nice window
with the register contents at the time of the trap, making it fairly
easy to find.

However, in huge model it might cause problems. Personally, I avoid
huge if possible. Huge allows for arrays more than 64K (in 80286
mode) or more than 4GB (in 32 bit protected mode) by choosing a
segment selector based on the index. Even so, though, I would be
surprised to see the compiler actually do it. It can easily do the
computation in other registers, only loading the segment register
when actually addressing data.

Well, the big problem with doing this is that there is no cache
for segment descriptors. Every time you load a 16 bit selector
the processor has to also load a 64 bit descriptor from memory.
(Well, hopefully in processor cache, but still.)

With an appropriate in-processor descriptor cache it could have
been much faster. Enough to delay the need for 64 bit addressing.

--------------

It might be that some of the Burroughs machines from years past had
pointer registers that checked that they were in bounds.

But there should be no problem with a program indexing an integer
variable that could be used to access an array, as long as it isn't
actually used to access the array outside the bounds.

A compiler on a machine that did have that requirement would have
to be sure not to load such into the appropriate register until
it had been checked.

Right. :) It's not that there's an actual problem with the operation,
it's a diagnostic for the senselessness of the code.
 
K

Keith Thompson

glen herrmannsfeldt said:
The standard prohibits pointers from pointing more than one past the
end of an array, not to aid bounds checking but to allow for hardware
that might have restrictions on pointers. (Well, that it my
understanding of the reason.)
[...]

In fact, the "prohibition" doesn't *aid* bounds checking at all; rather,
since the behavior is undefined, it permits implementations to check
bounds or not as they see fit.
 
K

Keith Thompson

Shao Miller said:
On 1/11/2013 17:04, Keith Thompson wrote: [...]
It's permitted (in all versions of the language) for a object
of an integer type to have any value in the range of that type.
That has nothing at all to do with arrays; there's no association
(that a compiler can detect) between a given integer value and
indices into some array object.

But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.

Hardly. If I want to access every 10th element of an array, starting
from element 0, the above is a perfectly good way to do that. If the
array has 10 or fewer elements, it will access only element 0. If it
has 23 elements, it will access elements 0, 10, and 20, and then leave i
with the value 30.

It *doesn't matter* that the value of i goes past the last valid index
of the array, as long as I don't use that value to index the array. The
upper bound for i is INT_MAX.

(This is a significant difference between accessing arrays with integer
indices vs. using pointers, and one that hadn't occurred to be before.
It probably doesn't show up very often, we don't usually skip through
arrays like this.)
That is, unless you actually use the integer value as an index into
a particular array. In that case, exactly the same rules apply as
for pointers, because the indexing operation is defined in terms
of pointer arithmetic.

Concretely:

int arr[10] = { 0 };
int i;
for (i = 0; i < 10; i += 3) {
arr = 42;
}

In successive iterations of the loop, i takes on the values 0, 3,
6, and 9. *After* the loop, i==12, which is a perfectly valid value
for an object of type int. Evaluating `arr` after the loop would
have undefined behavior, but the program doesn't do that so there's
no problem. There could be problems if i iterates past INT_MAX,
but it doesn't do that either.


This example is a good one, and rings a bell as being along the lines
of the bonus diagnostic I mentioned.


I'm tempted to suggest that you've misspelled "bogus". :cool:}
If I recall correctly, the idea is that whether or not one is using an
integer index or a pointer itself for iteration is hardly any
different to the compiler, so noting that an index goes "out of
bounds" is just as easy as noting that a pointer goes out of bounds,
so it's pretty straightforward to provide a bonus diagnostic, even
though the program might not suffer from undefined behaviour.

I would not want a compiler to produce a diagnostic like this.
[...]
Not that I know of. :) It's just helpful, because it's usually
right... The loop is usually clearer if it's reworked to avoid the
diagnostic.

I could rework that loop to avoid letting i exceed 10, but I believe the
result would be substantially less clear:

#define LEN 10
int arr[LEN] = { 0 };
int i = 0;
while (1) {
arr = 42;
if (i >= LEN-3) {
break;
}
else {
i += 3;
}
}

All this to avoid letting i exceed 10.
 
G

glen herrmannsfeldt

(snip, I wrote)
As long as it isn't used to index any actual array, I agree.
But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:
for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.
(And I presume referencing arr inside the loop.)

No, that was the point. There is no 'arr' in the loop and yet the
compiler (or run-time support, perhaps) can still see that something
doesn't seem quite right about this. Why is 'i' being incremented by 10
and then compared against the length of 'arr'? Seems fishy.


Doesn't seem fishy at all. Reasonably often I want to go through
an array and look at only even elements, or even every 10th element,
but ignore any fraction left at the end.

Certainly there should be something inside the loop, otherwise

i=(sizeof(arr)/sizeof(*arr)+9)/10*10;
Yes the compiler can plainly see that i indexes the array, and
also that it can never have a value inside the loop that will
exceed the array bounds.
A compiler that does bounds checking might not check references
to arr inside the loop, as it is known to be in bounds.
(Unless i is otherwise changed inside the loop.)

Well my point here was that the loop body didn't make a difference.

Might as well put a warning on all for loops, then.
(snip)
Sorry; wasn't talking about the Standard and its rationale, but about my
recollection of the rationale for the diagnostic which could be emitted
as a sanity-check for a conforming program. And it makes sense, to me,
as something that can benefit the programmer (or at least, _other_
programmers reading the code; heh!).
(snip)

Right. :) It's not that there's an actual problem with the operation,
it's a diagnostic for the senselessness of the code.

But it isn't sensless. I suppose there is some small probabiity
that it isn't what was meant, but pretty small.

-- glen
 
S

Shao Miller

Hardly. If I want to access every 10th element of an array, starting
from element 0, the above is a perfectly good way to do that. If the
array has 10 or fewer elements, it will access only element 0. If it
has 23 elements, it will access elements 0, 10, and 20, and then leave i
with the value 30.

It *doesn't matter* that the value of i goes past the last valid index
of the array, as long as I don't use that value to index the array. The
upper bound for i is INT_MAX.

(This is a significant difference between accessing arrays with integer
indices vs. using pointers, and one that hadn't occurred to be before.
It probably doesn't show up very often, we don't usually skip through
arrays like this.)
[...]
I'm tempted to suggest that you've misspelled "bogus". :cool:}

Ha ha ha, yes... From one beard-o to another: I could very well be
misremembering. :)} I won't argue about the usefulness any more since
it's subjective and because I can't seem to track down the exact warning
or the explanatory web page that I recalled.

When Syslinux was bitten by this warning (turned error because of
-Werror), here're some numbers, for what they're worth:

- I'm 100% certain that the diagnostic was issued and turned out to
be a false positive, as there was no undefined behaviour in the code.

- I'm 95% certain that the code involved was clearly explained by the
web page explaining the cause of the false positive.

- I'm 90% certain that the code was indeed a bit of an odd loop
construct, and that reworking it could've avoided the false positive.

- I'm 100% certain that the extra attention to the loop seemed
warranted, but certainly didn't warrant aborting the build process, as
was happening.

- I'm 70% certain that the web page's explanation of the false
positive was that the translation process treated the array index and
pointer arithmetic in the same way, so the problem with a pointer
pointing beyond one past was being misapplied to the index.

- I'm 60% certain that an optimization level was involved.

- I'm 70% certain that instead of changing the code, Syslinux chose
to change the compilation options.

- I'm 65% certain that FORTIFY_SOURCE was involved.

Oh well, I give up. Heh.
 
K

Keith Thompson

Shao Miller said:
But that's not so. A compiler can plainly see an integer object being
compared with the bounds of an array, as does the code I was
criticizing. For example:
for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
If the count of 'arr' is 4, this alone is sufficient cause to tell the
programmer to have their head checked, with no pointers in sight.

(And I presume referencing arr inside the loop.)


No, that was the point. There is no 'arr' in the loop and yet the
compiler (or run-time support, perhaps) can still see that something
doesn't seem quite right about this. Why is 'i' being incremented by
10 and then compared against the length of 'arr'? Seems fishy.


The point of the check is precisely to prevent any invalid access to the
array. Warning about it would make as much sense as this:

int x, y, z;
/* code to set x, y, and z */
if (y == 0) {
z = 0;
}
else {
z = x / y;
}

triggering a warning about a possible division by zero.
 
S

Shao Miller

The point of the check is precisely to prevent any invalid access to the
array. ...

Absolutely agreed. What I meant to suggest was that the particular
diagnostic that I recalled could be emitted regardless of the loop body.
I could be misremembering. Please see elsethread for recollection
details, for what they're worth. :)

Once again, here's a piece from the FORTIFY_SOURCE patch:

"With -D_FORTIFY_SOURCE=2 some more checking is added, but some
conforming programs might fail." -- Jakub Jelinek

A false positive can still be useful (a bonus) in order to draw
attention to weird code for closer scrutiny, but to break translation is
a problem, in my opinion. I respect your difference of opinion, if
indeed your opinion differs.
 
K

Keith Thompson

Shao Miller said:
Absolutely agreed. What I meant to suggest was that the particular
diagnostic that I recalled could be emitted regardless of the loop body.
I could be misremembering. Please see elsethread for recollection
details, for what they're worth. :)

Once again, here's a piece from the FORTIFY_SOURCE patch:

"With -D_FORTIFY_SOURCE=2 some more checking is added, but some
conforming programs might fail." -- Jakub Jelinek

A false positive can still be useful (a bonus) in order to draw
attention to weird code for closer scrutiny, but to break translation is
a problem, in my opinion. I respect your difference of opinion, if
indeed your opinion differs.

I still fail to understand why you think the code in question is weird
or, as you wrote, "is sufficient cause to tell the programmer to have
their head checked".

How would you re-write

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

to make it less weird?
 
S

Shao Miller

Shao Miller said:
Absolutely agreed. What I meant to suggest was that the particular
diagnostic that I recalled could be emitted regardless of the loop body.
I could be misremembering. Please see elsethread for recollection
details, for what they're worth. :)

Once again, here's a piece from the FORTIFY_SOURCE patch:

"With -D_FORTIFY_SOURCE=2 some more checking is added, but some
conforming programs might fail." -- Jakub Jelinek

A false positive can still be useful (a bonus) in order to draw
attention to weird code for closer scrutiny, but to break translation is
a problem, in my opinion. I respect your difference of opinion, if
indeed your opinion differs.

I still fail to understand why you think the code in question is weird
or, as you wrote, "is sufficient cause to tell the programmer to have
their head checked".

How would you re-write

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

to make it less weird?

On its own, it looks like it's about to work with every 10th element of
'arr'. Since all elements should be initialized, I'd suggest dealing
with the "every tenth" case in the same place (loop) as the rest of the
elements are dealt with. Sometimes, I agree, it doesn't make sense to
do so. But if it does, then noticing this code and thinking twice about
it doesn't seem to hurt.

The loop body for this could very well initialize ten elements at a
time, but I'd probably advance 'i' by 1 instead of by 10, and use the '%
10' operator in the loop's body, if need be. From looking at just this
loop, I'd think that some bodies would be rather contrived, especially
because of the 'sizeof' count computation instead of 'n', or some such.
 
K

Keith Thompson

Shao Miller said:
On 1/12/2013 18:13, Keith Thompson wrote: [...]
I still fail to understand why you think the code in question is weird
or, as you wrote, "is sufficient cause to tell the programmer to have
their head checked".

How would you re-write

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

to make it less weird?

On its own, it looks like it's about to work with every 10th element of
'arr'. Since all elements should be initialized, I'd suggest dealing
with the "every tenth" case in the same place (loop) as the rest of the
elements are dealt with. Sometimes, I agree, it doesn't make sense to
do so. But if it does, then noticing this code and thinking twice about
it doesn't seem to hurt.

Let's assume that we really want to deal only with every 10th element
rather than second-guessing the intent.
The loop body for this could very well initialize ten elements at a
time, but I'd probably advance 'i' by 1 instead of by 10, and use the '%
10' operator in the loop's body, if need be. From looking at just this
loop, I'd think that some bodies would be rather contrived, especially
because of the 'sizeof' count computation instead of 'n', or some such.

So you'd rather iterate over every element of the array, and then
test i%10 to decide whether to process the current one?

for (i = 0; i < sizeof arr / sizeof arr[0]; i ++) {
if (i % 10 == 0) {
/* ... */
}
}

I find that more verbose, less clear, and almost certainly less
efficient (unless the compiler is clever enough to transform it
into the first form). All to solve a problem that is not a problem
at all.

If there's some concern about i overflowing the upper limit of its
type (INT_MAX or SIZE_MAX), I'd probably precompute the last index
I want to look at and use that to control the loop. But I see no
point in doing so just to avoid letting i exceed the upper bound
of the array, when the code explicitly avoids accessing anything
outside the array bounds.
 
S

Shao Miller

Shao Miller said:
On 1/12/2013 18:13, Keith Thompson wrote: [...]
I still fail to understand why you think the code in question is weird
or, as you wrote, "is sufficient cause to tell the programmer to have
their head checked".

How would you re-write

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)

to make it less weird?

On its own, it looks like it's about to work with every 10th element of
'arr'. Since all elements should be initialized, I'd suggest dealing
with the "every tenth" case in the same place (loop) as the rest of the
elements are dealt with. Sometimes, I agree, it doesn't make sense to
do so. But if it does, then noticing this code and thinking twice about
it doesn't seem to hurt.

Let's assume that we really want to deal only with every 10th element
rather than second-guessing the intent.

That is just what I've been trying to suggest: The diagnostic can be
considered useful for second-guessing the intention. I cannot argue
that if I begin by agreeing that it's false. :)
The loop body for this could very well initialize ten elements at a
time, but I'd probably advance 'i' by 1 instead of by 10, and use the '%
10' operator in the loop's body, if need be. From looking at just this
loop, I'd think that some bodies would be rather contrived, especially
because of the 'sizeof' count computation instead of 'n', or some such.

So you'd rather iterate over every element of the array, and then
test i%10 to decide whether to process the current one?

for (i = 0; i < sizeof arr / sizeof arr[0]; i ++) {
if (i % 10 == 0) {
/* ... */
}
}

I find that more verbose, less clear, and almost certainly less
efficient (unless the compiler is clever enough to transform it
into the first form). All to solve a problem that is not a problem
at all.

If there's some concern about i overflowing the upper limit of its
type (INT_MAX or SIZE_MAX), I'd probably precompute the last index
I want to look at and use that to control the loop. But I see no
point in doing so just to avoid letting i exceed the upper bound
of the array, when the code explicitly avoids accessing anything
outside the array bounds.

No, I was talking about if the loop was for initialization. In such a
case, I'd initialize every element (advancing by 1), but if something
special was needed for every tenth, I could determine that with your 'if'.

For just _working_ with an already-initialized array, I'd probably use
precisely this loop (though I might possibly be more inclined to put it
in a function and pass the count). BUT, as was snipped, if the count of
'arr' is 4, then the programmer _really_ ought to have their head
checked. :)
 
K

Keith Thompson

Shao Miller said:
On 1/13/2013 18:39, Keith Thompson wrote: [...]
So you'd rather iterate over every element of the array, and then
test i%10 to decide whether to process the current one?

for (i = 0; i < sizeof arr / sizeof arr[0]; i ++) {
if (i % 10 == 0) {
/* ... */
}
}

I find that more verbose, less clear, and almost certainly less
efficient (unless the compiler is clever enough to transform it
into the first form). All to solve a problem that is not a problem
at all.

If there's some concern about i overflowing the upper limit of its
type (INT_MAX or SIZE_MAX), I'd probably precompute the last index
I want to look at and use that to control the loop. But I see no
point in doing so just to avoid letting i exceed the upper bound
of the array, when the code explicitly avoids accessing anything
outside the array bounds.

No, I was talking about if the loop was for initialization. In such a
case, I'd initialize every element (advancing by 1), but if something
special was needed for every tenth, I could determine that with your 'if'.

Who says the loop is for initialization? It might not even be writing
to the array.
For just _working_ with an already-initialized array, I'd probably use
precisely this loop (though I might possibly be more inclined to put it
in a function and pass the count). BUT, as was snipped, if the count of
'arr' is 4, then the programmer _really_ ought to have their head
checked. :)

Absurd.

A function that, for whatever reason, needs to access every 10th
(or every Nth) element, can perfectly sensibly work on an array of
any size. If the array has 10 or fewer elements, the function just
accesses element 0. I see no reason to treat that as a special case.
 
G

glen herrmannsfeldt

(snip)
for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
(snip)
Let's assume that we really want to deal only with every 10th element
rather than second-guessing the intent.
That is just what I've been trying to suggest: The diagnostic can be
considered useful for second-guessing the intention. I cannot argue
that if I begin by agreeing that it's false. :)

Yes, but so can a few million other things.

It doesn't help much to put on each statement:

"Warning: this statement could be wrong"

If you find the statement:

i += 3;

You could have the warning: "Are you sure you meant 3, and not 4?"

and once in about 1e12 times you might be right.

Warnings have a cost, if nothing else than in the time the programmer
has to read them and ignore them. But even more, too many warnings will
make it more likely to miss the ones that are actually important.
The programmer will just give up and turn them off.

So, really really think about how useful a warning will be before
suggesting that it should be added.

-- glen
 
S

Shao Miller

Shao Miller said:
On 1/13/2013 18:39, Keith Thompson wrote: [...]
So you'd rather iterate over every element of the array, and then
test i%10 to decide whether to process the current one?

for (i = 0; i < sizeof arr / sizeof arr[0]; i ++) {
if (i % 10 == 0) {
/* ... */
}
}

I find that more verbose, less clear, and almost certainly less
efficient (unless the compiler is clever enough to transform it
into the first form). All to solve a problem that is not a problem
at all.

If there's some concern about i overflowing the upper limit of its
type (INT_MAX or SIZE_MAX), I'd probably precompute the last index
I want to look at and use that to control the loop. But I see no
point in doing so just to avoid letting i exceed the upper bound
of the array, when the code explicitly avoids accessing anything
outside the array bounds.

No, I was talking about if the loop was for initialization. In such a
case, I'd initialize every element (advancing by 1), but if something
special was needed for every tenth, I could determine that with your 'if'.

Who says the loop is for initialization? It might not even be writing
to the array.

I did. "The loop body for this could very well initialize ten elements
at a time, but I'd probably advance 'i' by 1 instead of by 10, and use
the '% 10' operator in the loop's body, if need be." (If you'll forgive
the 10 with "operator") I'm saying "if, then", not making a claim about
the loop's purpose.
Absurd.

A function that, for whatever reason, needs to access every 10th
(or every Nth) element, can perfectly sensibly work on an array of
any size. If the array has 10 or fewer elements, the function just
accesses element 0. I see no reason to treat that as a special case.

If you are talking about:

void some_func1(void) {
int arr[4];
int i;

for (i = 0; i < sizeof arr / sizeof arr[0]; i += 10)
/* Anything */
}

then I'd certainly say it's silly! We can see right away that the
advancement of 'i' is silly! Consider what kind of code review you'd
offer someone about this.

If you are talking about:

void some_func1(int * arr, size_t n) {
int i;

for (i = 0; i < n; i += 10)
/* Anything */
}

then I'd certainly say it's not silly at all!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,077
Messages
2,570,567
Members
47,203
Latest member
EmmaSwank1

Latest Threads

Top