array subscript type cannot be `char`?

Pedro Graca

I ran into a warning today that seems strange to me (I was trying to
improve my score on the UVA #10018 Programming Challenge).

$ gcc -W -Wall -std=c89 -pedantic -O2 10018-clc.c -o 10018-clc
10018-clc.c: In function `main':
10018-clc.c:22: warning: array subscript has type `char'

I don't like warnings ... or casts.


#include <stdio.h>

#define SIGNEDNESS
/* #define SIGNEDNESS signed */ /* either of these */
/* #define SIGNEDNESS unsigned */ /* defines "works" */

static int charval['9' + 1];
static unsigned long x;

int main(void) {
    SIGNEDNESS char test[] = "9012";
    SIGNEDNESS char *p = test;

    charval['1'] = 1;
    charval['2'] = 2;
    /* similarly for 3 to 8 */
    charval['9'] = 9;

    x = 0; /* redundant */
    while (*p) {
        x *= 10;
        x += charval[*p]; /* line 22 */

        /* casts to get rid of warning: all of them "work"! */
        /* x += charval[ (int) *p]; */
        /* x += charval[ (size_t) *p]; */
        /* x += charval[ (unsigned) *p]; */
        /* x += charval[ (long) *p]; */
        /* x += charval[ (wchar_t) *p]; */
        /* x += charval[ (signed char) *p]; */
        /* x += charval[ (unsigned char) *p]; */

        ++p;
    }

    printf("%lu\n", x);
    return 0;
}


Is this only a question of portability? (I realize the warning appears
only because of the -Wall option to gcc)

What is the type of an array subscript?
I'd guess size_t, and other types would be promoted automatically.

Should I make an effort to declare all char stuff as either signed or
unsigned? ... before it runs on a DS 9000 :)
 
Robert Gamble

Pedro said:
I run into a strange warning (for me) today (I was trying to improve
the score of the UVA #10018 Programming Challenge).

$ gcc -W -Wall -std=c89 -pedantic -O2 10018-clc.c -o 10018-clc
10018-clc.c: In function `main':
10018-clc.c:22: warning: array subscript has type `char'

[snip example program using char subscript]

There is technically nothing "wrong" with using char as an array
subscript; any integer type is legal as an array subscript.

According to the rationale for this warning in the gcc documentation,
many programmers forget that char can be signed, which can lead to
unexpected problems if the char value is negative. The warning
(-Wchar-subscripts) is enabled by the -Wall option and can be disabled
with -Wno-char-subscripts.

Robert Gamble
 
Old Wolf

Pedro said:
int main(void) {
SIGNEDNESS char test[] = "9012";
SIGNEDNESS char *p = test;

charval['1'] = 1;
charval['2'] = 2;
/* similarly for 3 to 8 */
charval['9'] = 9;

x = 0; /* redundant */
while (*p) {
x *= 10;
x += charval[*p]; /* line 22 */

The warning is because chars can be negative, and a negative
subscript to an array will cause undefined behaviour. If you happen
to include some negative chars in test[], then you have UB.

This is not a required diagnostic; I guess the GCC developers feel
that this error is more likely to occur with a char than with other
signed integral types :)

Pedro said:
Should I make an effort to declare all char stuff as either signed or
unsigned? ... before it runs on a DS 9000 :)

Just make sure your code does not rely on chars being either
signed or unsigned.
If you need to rely on unsignedness (e.g. an array indexed by all
possible char values) then you should explicitly use unsigned chars.
 
Keith Thompson

Pedro Graca said:
I run into a strange warning (for me) today (I was trying to improve
the score of the UVA #10018 Programming Challenge).

$ gcc -W -Wall -std=c89 -pedantic -O2 10018-clc.c -o 10018-clc
10018-clc.c: In function `main':
10018-clc.c:22: warning: array subscript has type `char'

I don't like warnings ... or casts.
[code snipped]

Is this only a question of portability? (I realize the warning appears
only because of the -Wall option to gcc)

I think the point is that char can be either signed or unsigned,
depending on the implementation. Code that works properly where plain
char is unsigned might fail on another platform where plain char is
signed:

int arr[256];
char index = 200;
... arr[index] ...

Presumably if you use "signed char" explicitly, the compiler assumes
you know what you're doing.

Using plain int as an array index doesn't present the same problem,
because plain int is always signed; any problems will show up on any
platform.

Pedro Graca said:
What is the type of an array subscript?
I'd guess size_t, and other types would be promoted automatically.

The index merely has to have some integer type.

Pedro Graca said:
Should I make an effort to declare all char stuff as either signed or
unsigned? ... before it runs on a DS 9000 :)

If the actual values are always going to be in the range 0..127, it
shouldn't matter. If they can exceed 127 (the minimum possible value
of CHAR_MAX), you might consider either declaring your variables as
unsigned char, or casting to unsigned char when indexing:

int arr[256];
char index = 200;
... arr[(unsigned char)index] ...
 
Ben C

The warning is because chars can be negative, and a negative subscript
to an array will cause undefined behaviour.

Are you sure? There's nothing undefined about this:

#include <stdio.h>

int main(void)
{
int x[10];
int *y = x + 5;
y[-1] = 100;

printf("%d\n", y[-1]);

return 0;
}
 
Pedro Graca

Old Wolf said:
Just make sure your code does not rely on chars being either
signed or unsigned.
If you need to rely on unsignedness (eg. an array of all possible
char values) then you should explicitly use unsigned chars.

Thank you for your answers.

Is it guaranteed that all characters available on some implementation
for which there is a standards compliant compiler are positive?

AFAICT, in EBCDIC the character '0' has value 0xF0.
Assuming CHAR_BIT is 8 does it follow that plain char is unsigned
for conforming compilers?
 
CBFalconer

Pedro said:
.... snip ...

Is it guaranteed that all characters available on some implementation
for which there is a standards compliant compiler are positive?

AFAICT, in EBCDIC the character '0' has value 0xF0.
Assuming CHAR_BIT is 8 does it follow that plain char is unsigned
for conforming compilers?

No. However, all chars in the required character set, which includes
'0'..'9', 'a'..'z', 'A'..'Z' and '+-*!#%^&(){}[]:;'"?\/<>.,' must be
non-negative.

 
Kenneth Brody

Pedro said:
I run into a strange warning (for me) today (I was trying to improve
the score of the UVA #10018 Programming Challenge).

$ gcc -W -Wall -std=c89 -pedantic -O2 10018-clc.c -o 10018-clc
10018-clc.c: In function `main':
10018-clc.c:22: warning: array subscript has type `char'

I don't like warnings ... or casts.

#include <stdio.h>

#define SIGNEDNESS
/* #define SIGNEDNESS signed */ /* either of these */
/* #define SIGNEDNESS unsigned */ /* defines "works" */ [...]
SIGNEDNESS char *p = test; [...]
x += charval[*p]; /* line 22 */ [...]
/* casts to get rid of warning: all of them "work"! */ [...]
/* x += charval[ (signed char) *p]; */
/* x += charval[ (unsigned char) *p]; */
[...]

Given that explicitly using "unsigned char" or "signed char" will both
get rid of the warning, my guess is that it's your compiler's way of
pointing out "hey, char can be signed in some environments, and unsigned
in others, so using a plain 'char' as a subscript may not necessarily be
what you want to do here".

 
Robert Gamble

Ben said:
The warning is because chars can be negative, and a negative subscript
to an array will cause undefined behaviour.

Are you sure? There's nothing undefined about this:

#include <stdio.h>

int main(void)
{
int x[10];
int *y = x + 5;
y[-1] = 100;

printf("%d\n", y[-1]);

return 0;
}

y is not an array, it is a pointer.

Robert Gamble
 
Richard G. Riley

Ben said:
The warning is because chars can be negative, and a negative subscript
to an array will cause undefined behaviour.

Are you sure? There's nothing undefined about this:

#include <stdio.h>

int main(void)
{
int x[10];
int *y = x + 5;
y[-1] = 100;

printf("%d\n", y[-1]);

return 0;
}

y is not an array, it is a pointer.

Robert Gamble

Ben's example looks OK to me. Am I being too lax with pointers in
thinking that?

It seems to my (becoming "more standard") eye that y points to x[5],
and so y[-1] is perfectly valid, since arr[-1] is the same as
*(arr-1). And since "arr" (or y) points into a valid data area, the
access is defined. Is this wrong?
 
Richard Tobin

Ben C said:
int x[10];
int *y = x + 5;
y[-1] = 100;

printf("%d\n", y[-1]);

Robert Gamble said:
y is not an array, it is a pointer.

Richard G. Riley said:
Looks kind of ok to me : am I being too lax with pointers in thinking that?

It seems to my (becoming "more standard") eye, that y points to x[5],
and so y[-1] is perfectly valid since arr[-1] is the same as
*(arr-1). And since "arr" (or y) points into a valid data area then it
is defined. Is this wrong?

y[-1] is perfectly OK for the reason you give. Given an arbitrary
pointer y you can't tell whether y[-1] is valid or not. But if y were
declared as an array, you could be sure that it was wrong.

-- Richard
 
Robert Gamble

Richard said:
Ben said:
The warning is because chars can be negative, and a negative subscript
to an array will cause undefined behaviour.

Are you sure? There's nothing undefined about this:

#include <stdio.h>

int main(void)
{
int x[10];
int *y = x + 5;
y[-1] = 100;

printf("%d\n", y[-1]);

return 0;
}

y is not an array, it is a pointer.

Robert Gamble

Looks kind of ok to me : am I being too lax with pointers in thinking that?

It seems to my (becoming "more standard") eye, that y points to x[5],
and so y[-1] is perfectly valid since arr[-1] is the same as
*(arr-1). And since "arr" (or y) points into a valid data area then it
is defined. Is this wrong?

No, it's not wrong at all. Old Wolf stated that a negative subscript
is not valid for an array. Ben C produced his attempt at a
counter-example. My point was that since he was using a negative
subscript with a pointer, not an array, his example doesn't fall
under Old Wolf's assessment.

Robert Gamble
 
Ben C

Ben said:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.
 
Keith Thompson

Ben C said:
Ben said:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

I think that's actually a matter of some dispute. x[1] is a pointer
to an array of 3 ints, and x[1][-1] indexes into that array.
Conceivably an implementation could do bounds-checking on all array
indexing operations; since x[1][-1] is outside the bounds of the
3-element array being indexed, the attempt to evaluate it could cause a
trap (or, more generally, undefined behavior).

The question is whether such an implementation would be conforming.
I offer no opinion on that question.

This is similar to the question of the "struct hack", which indexes
beyond the declared bounds of an array, but into memory that is known
to exist. I don't think the legality of the struct hack was ever
really settled (though it works on every implementation I've heard
of); C99 sidestepped the question by introducing flexible array
members.

(Of course we all know that we've gone far beyond the original
question; a strict bounds-checking implementation of comp.lang.c
would have required a new thread by now.)
 
Robert Gamble

Ben said:
Ben said:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x[1][-1] is undefined behavior, just as x[1][3] would be.

Ben said:
x is an array, not a pointer. I believe there is nothing "undefined"
here.

Well, you believe wrong. In your example you are trying to access the
-1st element of an array (the array x[1]) and there is no such element.
Trying to access an array element using an out-of-bounds index is not
defined.

Robert Gamble
 
Robert Gamble

Keith said:
Ben C said:
Ben C wrote:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

I think that's actually a matter of some dispute.

It might have been in 1992, but I think that DR #17 made it pretty clear
that this is undefined behavior. To quote the response to question #16:

"For an array of arrays, the permitted pointer arithmetic in subclause
6.3.6, page 47, lines 12-40 is to be understood by interpreting the use
of the word ``object'' as denoting the specific object determined
directly by the pointer's type and value, not other objects related to
that one by contiguity. Therefore, if an expression exceeds these
permissions, the behavior is undefined. For example, the following code
has undefined behavior:

    int a[4][5];

    a[1][7] = 0; /* undefined */

Some conforming implementations may choose to diagnose an ``array
bounds violation,'' while others may choose to interpret such attempted
accesses successfully with the ``obvious'' extended semantics."

The result of this question was to add the following to the
(informative) section G.2 which documents examples of undefined
behavior:

"An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.3.6)."

Robert Gamble
 
Keith Thompson

Robert Gamble said:
Keith said:
What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

I think that's actually a matter of some dispute.

It might have been in 1992, I think that DR #17 made it pretty clear
that this is undefined behavior. Quote the response to question #16: [snip]
The result of this question was to add the following to the
(informative) section G.2 which documents examples of undefined
behavior:

"An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.3.6)."

Thanks. I'm sure I've read that, but I didn't remember the details.

The wording added to G.2 is in section J.2 in the C99 standard:

The behavior is undefined in the following circumstances:
[...]
-- An array subscript is out of range, even if an object is
apparently accessible with the given subscript (as in the
lvalue expression a[1][7] given the declaration int a[4][5])
(6.5.6).

I'm not entirely convinced that C99 6.5.6 couldn't be read to imply
that a[1][7] is valid, but J.2 makes the intent clear enough.
 
santosh

Ben said:
Ben said:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

Well, the index value -1 is out of bounds for x[1]. It might well refer
to memory within x, but if you have bounds checking enabled, it can
cause an exception.
 
Ben C

What about this then?
#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

I think that's actually a matter of some dispute.

It might have been in 1992, I think that DR #17 made it pretty clear
that this is undefined behavior. Quote the response to question #16
[...]:

Most interesting. Thank you, I stand corrected!
 
ena8t8si

Keith said:
Ben C said:
Ben C wrote:
[...] a negative subscript to an array will cause undefined
behaviour.
[...] Are you sure?
int x[10];
int *y = x + 5;
y[-1] = 100;
...
y is not an array, it is a pointer.

What about this then?

#include <stdio.h>

int main(void)
{
int x[][3] =
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};

printf("%d\n", x[1][-1]);

return 0;
}

x is an array, not a pointer. I believe there is nothing "undefined"
here.

I think that's actually a matter of some dispute. x[1] is a pointer
to an array of 3 ints,

You mean x[1] is an array of 3 ints. In context x[1] does turn
into a pointer, but it turns into a pointer to int.
 
