C
Chris Torek
Christopher said:

    assert( isalpha(*cp) );
    counts[ toupper(*cp)-'A' ]++;

The arguments to the above two is*/to* calls should be cast to (unsigned char).
Technically I think you can get away with just one: if isalpha() says
it is (and you are in the C locale), *cp itself must be nonnegative
even if plain "char" is signed.
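
For what it is worth, here is a minimal sketch of the counting loop with the conversion to unsigned char done once, so that both calls see a valid argument. The name count_letters and the bounds check are my own assumptions, not part of the original code, and the subtraction still assumes a character set in which 'A' through 'Z' are contiguous:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): tally the letters
   of s into counts[0..25].  Assumes 'A'..'Z' are contiguous. */
void count_letters(const char *s, unsigned counts[26])
{
    const char *cp;

    assert(s != NULL && counts != NULL);
    for (cp = s; *cp != '\0'; cp++) {
        /* Convert once; a negative value is never handed to <ctype.h>. */
        unsigned char uc = (unsigned char)*cp;

        if (isalpha(uc)) {
            int idx = toupper(uc) - 'A';

            /* Outside the "C" locale a letter can fall outside A-Z,
               so check the index before using it. */
            if (idx >= 0 && idx < 26)
                counts[idx]++;
        }
    }
}
```

The caller is expected to zero the counts array first; the function only increments.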
Of course, if one is not in the C locale, there are actual cases
where isalpha() is true but *cp is negative:

    #include <ctype.h>
    #include <locale.h>
    #include <stdio.h>

    int main(void) {
        char c, *cp = &c;

        /* locale names are system-specific */
        setlocale(LC_CTYPE, "ISO8859-1");
        *cp = 0xc4;     /* negative if plain char is signed and 8 bits */
        if (isalpha((unsigned char)*cp))
            printf("%c (%d) is alphabetic\n", *cp, *cp);
        return 0;
    }

Uppercase A-umlaut (code 0xc4) is indeed alphabetic in ISO-Latin-1,
but is negative on many machines; when I run this, I get:

    Ä (-60) is alphabetic

There is also at least one character (German eszett, code 0xdf) that has
no uppercase equivalent in ISO-Latin-1, so that toupper() leaves it lowercase.
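
One way to detect such a character is to ask whether it is lowercase and whether toupper() hands it back unchanged. This is only a sketch; the name lacks_uppercase is my own invention, and the answer depends entirely on the locale in effect:

```c
#include <ctype.h>

/* Hypothetical helper: nonzero if c is a lowercase letter that
   toupper() leaves unchanged in the current locale. */
int lacks_uppercase(char c)
{
    unsigned char uc = (unsigned char)c;    /* safe for <ctype.h> */

    return islower(uc) && toupper(uc) == uc;
}
```

In the "C" locale this returns 0 for every character, since only a-z are lowercase there and each has an uppercase form. In an ISO-Latin-1 locale I would expect it to report the eszett, though I have not verified that on every system.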
Also, the assumption that A-Z are contiguous, while generally correct,
is not portable (I think in EBCDIC they are not).
Indeed, in EBCDIC they are not: there is a gap between 'I' and 'J',
and another between 'R' and 'S'.
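
If portability to such character sets matters, one possible approach (a sketch only, with letter_index being a name I made up) is to look the letter up in a string rather than subtract 'A'. This makes no assumption about the letters being contiguous:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: map a letter to 0..25 without assuming that
   'A'..'Z' are contiguous.  Returns -1 for anything that is not one
   of the 26 basic letters. */
int letter_index(int ch)
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;
    int up = toupper((unsigned char)ch);

    if (up == '\0')
        return -1;      /* strchr() would match the terminator */
    p = strchr(alphabet, up);
    return p != NULL ? (int)(p - alphabet) : -1;
}
```

With this, the increment becomes counts[letter_index(*cp)]++ guarded by a check for -1, which also takes care of letters such as the A-umlaut that are alphabetic in some locales but are not among A-Z.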