Weird warning about data type range

Digital Puer

I am seeing this very weird warning about
data type range. I am using g++ 4.1.2 (but this
applies to gcc as well).

I have this program:

#include <cstdio>
int main()
{
    char c = 27;

    if (c >= 110 &&
        c <= 127)
    {
        printf("hi");
    }
}


When I compile it with g++, I get the warning:
test_range.cpp:7: warning: comparison is always true due to limited
range of data type

Line 7 is the comparison "c <= 127". Ok fine, that will always
be true, but the entire if-expression may not be true since
there is && in the expression.

Can someone tell me how to shut off this warning?
 
Juha Nieminen

Digital said:
Line 7 is the comparison "c <= 127". Ok fine, that will always
be true, but the entire if-expression may not be true since
there is && in the expression.

It's not saying that the whole expression is always true. It's saying
that this one comparison is always true and is therefore a no-op.
 
SG

Note it doesn't say anything about the entire if condition, just about a
particular comparison (specifically the c <= 127 one). It would seem
that char is signed by default on your platform. To get rid of the
warning, just delete the second comparison, i.e. change to using:

if(c >= 110)

That's not really satisfactory if you want your code to be portable
and support CHAR_MAX>127 cases. I'm not really interested in whether
the compiler figured out that it can ignore the c<=127 test because
the standard doesn't guarantee CHAR_MAX to be 127. CHAR_MAX can be
larger.

I do appreciate warnings, though. A warning in cases like

unsigned foo = getfoo();
if (foo>=0) {
// ...
}

is totally fine because foo can never be negative regardless of the
implementation details.
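
For the char case above, one way to keep the upper-bound test in the source without depending on CHAR_MAX is to do the range check on an int parameter. This is only a sketch: in_range and is_in_110_127 are made-up names, and I would expect the "limited range" diagnostic to go away because the compared operand really is an int, but I have not checked that on every compiler version.

```cpp
// Hypothetical helper: the comparison operates on int, so both bounds are
// checked the same way whether CHAR_MAX is 127 or 255.
inline bool in_range(int value, int lo, int hi)
{
    return value >= lo && value <= hi;
}

// Example caller: the char is converted to int at the call site.
inline bool is_in_110_127(char c)
{
    return in_range(c, 110, 127);
}
```

With the value 27 from the original post, is_in_110_127 returns false on any platform.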

Cheers!
SG
 
Jonathan Lee

Line 7 is the comparison "c <= 127". Ok fine, that will always
be true, but the entire if-expression may not be true since
there is && in the expression.

Why not just #include <climits> and wrap the second half in an #if?

if (c >= 110
#if (CHAR_MAX > 127)
&& c <= 127
#endif
) {
...
}

I know it's not pretty, but either the test stays (and the compiler
produces the warning), or the test is somehow removed.
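
Spelled out as a complete function, the idea might look like the sketch below. The function name in_110_to_127 is made up for illustration; the bounds are the ones from the original post.

```cpp
#include <climits>

// Hypothetical wrapper around the test from the original post.
inline bool in_110_to_127(char c)
{
    return c >= 110
#if CHAR_MAX > 127
        // Only compiled where plain char can actually exceed 127.
        && c <= 127
#endif
        ;
}
```

Where CHAR_MAX is 127 the second comparison never reaches the compiler, so there is nothing left for it to warn about.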

--Jonathan
 
Digital Puer

Thanks for everyone's help.

What I am really trying to do is to test if
characters in a std::string are alphanumeric
characters in the Latin-1 encoding. I have
the following:


#define IS_ALPHANUMERIC(x) ( \
((x) >= 48 && (x) <= 57 ) || \
((x) >= 65 && (x) <= 90 ) || \
((x) >= 97 && (x) <= 122) || \
((x) >= 192 && (x) <= 214) || \
((x) >= 216 && (x) <= 246) || \
((x) >= 248 && (x) <= 255) )

string s = getLatin1Text();
int len = s.size();
for (int i = 0; i < len; i++)
{
if (! IS_ALPHANUMERIC(s.at(i)))
{
...
}
}

When I compile that, the last check (x <= 255)
gives the same warning that I showed in my
original post:

warning: comparison is always true due to limited range of data type

I guess I will have to isolate the last check
with an #if directive.
 
Juha Nieminen

Digital said:
#define IS_ALPHANUMERIC(x) ( \
((x) >= 48 && (x) <= 57 ) || \
((x) >= 65 && (x) <= 90 ) || \
((x) >= 97 && (x) <= 122) || \
((x) >= 192 && (x) <= 214) || \
((x) >= 216 && (x) <= 246) || \
((x) >= 248 && (x) <= 255) )

string s = getLatin1Text();
int len = s.size();
for (int i = 0; i < len; i++)
{
if (! IS_ALPHANUMERIC(s.at(i)))
{
...
}
}

I don't think that's doing what you want it to do.

A std::string contains chars, which are signed on most
systems. Values outside the ASCII range will thus have *negative*
values. When you do e.g. "x >= 248", x, which is a signed char, is
first promoted to an int and then compared to 248. Since a signed
8-bit char can never exceed 127, the comparison always yields false.

And btw, this is a great example where using a preprocessor macro
instead of an inline function is a *bad* idea. With the macro you will
be silently calling "s.at(i)" up to 12 times per call, rather than just
once. (Might not be so relevant if it was "s[i]", but with "s.at(i)" up
to 11 of the bounds checks will be completely useless.)

What you need is an inline function which takes a char as parameter,
internally casts it to unsigned char, and then compares that value
against the range bounds.
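
Something along these lines, as a rough sketch. The names is_latin1_alnum and count_non_alnum are made up; the ranges are copied from the macro quoted above. The converted value is stored in an unsigned int, so the "<= 255" test is compared against a type that can hold larger values.

```cpp
#include <string>

// Hypothetical name; the ranges are the ones from the macro quoted above.
inline bool is_latin1_alnum(char ch)
{
    // Convert once, so bytes above 127 are seen as 128..255 even where
    // plain char is signed.
    const unsigned int c = static_cast<unsigned char>(ch);

    return (c >= 48  && c <= 57)     // 0-9
        || (c >= 65  && c <= 90)     // A-Z
        || (c >= 97  && c <= 122)    // a-z
        || (c >= 192 && c <= 214)    // 215 (multiplication sign) is skipped
        || (c >= 216 && c <= 246)    // 247 (division sign) is skipped
        || (c >= 248 && c <= 255);
}

// Counts the bytes in s that are not alphanumeric under the ranges above.
inline std::string::size_type count_non_alnum(const std::string& s)
{
    std::string::size_type n = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!is_latin1_alnum(s[i]))   // the argument is evaluated exactly once
            ++n;
    return n;
}
```

Because the argument is an ordinary function parameter, s.at(i) or s[i] is evaluated once per character no matter how many ranges are tested.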
 
Jonathan Lee

What I am really trying to do is to test if
characters in a std::string are alphanumeric
characters in the Latin-1 encoding. I have
the following:

I think <locale> offers isalnum() to do just that. I haven't used it
before, but the way I understand it you can create the appropriate
locale and then the library does it for you. Ex.,

#include <locale>
#include <string>

// Not sure on the constructor argument here...
std::locale latin1("en_US.ISO8859-1");
std::string s;

...

for (size_t i = 0; i < s.length(); ++i) {
    if (std::isalnum(s[i], latin1)) { ... }
}
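
A slightly fuller sketch of the same idea is below. Locale names are platform-specific, so "en_US.ISO8859-1" remains a guess; std::locale throws std::runtime_error when the name is not available, and the sketch falls back to the classic "C" locale in that case. The helper names latin1_or_classic and count_alnum are made up.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Tries to construct the named locale and falls back to the classic "C"
// locale if the platform does not provide it. The default name is an
// assumption and may need to be changed for a given system.
inline std::locale latin1_or_classic(const char* name = "en_US.ISO8859-1")
{
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Counts the characters of s that std::isalnum accepts under loc.
inline std::string::size_type count_alnum(const std::string& s,
                                          const std::locale& loc)
{
    std::string::size_type n = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (std::isalnum(s[i], loc))
            ++n;
    return n;
}
```

With the classic locale only the ASCII letters and digits count, so the accented Latin-1 letters are recognised only if the named locale actually exists on the machine.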

Anyone know more?

--Jonathan
 
Jerry Coffin

What I am really trying to do is to test if
characters in a std::string are alphanumeric
characters in the Latin-1 encoding.

If I had to do this, I think I'd use something like this:

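#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

struct alphanumeric_table {
    std::vector<bool> table;

#define elements(r) (sizeof(r)/sizeof(r[0]))

    // Slot 0 is reserved for EOF (assumed to be -1); slot v+1 holds the
    // answer for character value v.
    alphanumeric_table() : table(UCHAR_MAX+2, false) {
        static const int ranges[] = {
            48, 57,
            65, 90,
            97, 122,
            192, 214,
            216, 246,
            248, 255
        };

        // Each pair is an inclusive [first, last] range of character values.
        for (std::size_t i=0; i<elements(ranges); i+=2)
            std::fill(table.begin()+ranges[i]+1,
                      table.begin()+ranges[i+1]+2,
                      true);
    }

    // n is EOF, a value in 0..UCHAR_MAX, or a negative plain char.
    // Note that a signed char holding 255 equals -1 and is treated as EOF.
    bool operator[](int n) const {
        if (n == -1) return table[0];
        return table[static_cast<unsigned char>(n) + 1];
    }
} alpha_table;

inline bool is_alphanumeric(int n) {
    return alpha_table[n];
}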

This assumes that EOF is -1. Technically this isn't required (any
negative value is allowed) but it's extremely common -- to the point
that I'm not sure I've ever seen or heard of it actually having any
other value. In any case, the reason for the "+1" in most places is to
get a range from 0 through the maximum, so we can use it directly as
an index into the vector.
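
If EOF never needs to be passed in, a variant without the extra slot is also possible. The sketch below is only an illustration: latin1_alnum_table and all_alnum are made-up names, and the table is indexed directly by the unsigned char value, so no "+1" is needed.

```cpp
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant: no EOF slot, indexed directly by unsigned char.
class latin1_alnum_table {
    std::vector<bool> table_;
public:
    latin1_alnum_table() : table_(UCHAR_MAX + 1, false) {
        // Inclusive [first, last] ranges, same values as in the macro.
        static const unsigned ranges[][2] = {
            {48, 57}, {65, 90}, {97, 122},
            {192, 214}, {216, 246}, {248, 255}
        };
        const std::size_t count = sizeof ranges / sizeof ranges[0];
        for (std::size_t r = 0; r < count; ++r)
            for (unsigned v = ranges[r][0]; v <= ranges[r][1]; ++v)
                table_[v] = true;
    }

    // Works for signed and unsigned plain char alike.
    bool operator()(char ch) const {
        return table_[static_cast<unsigned char>(ch)];
    }
};

// Returns true if every byte of s is alphanumeric according to the table.
inline bool all_alnum(const std::string& s) {
    static const latin1_alnum_table is_alnum;   // built once, on first use
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!is_alnum(s[i]))
            return false;
    return true;
}
```

The trade-off is that this version cannot be handed the int result of getc() and friends directly, which is what the EOF slot in the version above is for.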
 
Digital Puer

Thanks. This is great. A self-contained class like
this is much better than my #define macro. It's also
faster since the lookup is immediate rather than
through a bunch of if-in-range statements.
 
