Is this (tiny) function portable?

Matthias

Hello,

I am missing certain functionality of std::string, so I am currently
writing some helper functions which operate on strings. One of them is as
follows (it's actually two functions):

inline char to_lower ( char c )
{
    if( c>=65 && c<=90 ) // A-Z
        return c += 32;
    return c;
}

inline void str_to_lower ( std::string& source )
{
    std::transform( source.begin(), source.end(), source.begin(),
                    to_lower );
}

My question concerns the function to_lower:
Is this portable? I looked at an ASCII table, and recognized that I can
convert uppercase letters to lowercase by adding 32. However, I have no
idea if this will work for other character tables.
 
Alf P. Steinbach

* Matthias:
Hello,

I am missing certain functionality of std::string, so I am currently
writing some helper functions which operate on strings. One of them is as
follows (it's actually two functions):

inline char to_lower ( char c )
{
    if( c>=65 && c<=90 ) // A-Z
        return c += 32;
    return c;
}

inline void str_to_lower ( std::string& source )
{
    std::transform( source.begin(), source.end(), source.begin(),
                    to_lower );
}

My question concerns the function to_lower:
Is this portable? I looked at an ASCII table, and recognized that I can
convert uppercase letters to lowercase by adding 32. However, I have no
idea if this will work for other character tables.

It's not portable.

Check out the 'tolower' function.

Beware of argument and result types.
 
Matthias

Alf said:
Check out the 'tolower' function.

Ah, could have guessed that there is something like it... :)
I have a question though. I included <cctype> instead of <ctype.h>.
Now, I thought all those forwarding headers wrap the C-functions in the
namespace std? However, when I call std::tolower, I get an error that
this function doesn't exist. When calling without the namespace-prefix
it works just fine. Why is that?

Beware of argument and result types.

What do you mean?
 
Rolf Magnus

Matthias said:
Hello,

I am missing certain functionality of std::string, so I am currently
writing some helper functions which operate on strings. One of them is as
follows (it's actually two functions):

inline char to_lower ( char c )
{
    if( c>=65 && c<=90 ) // A-Z
        return c += 32;
    return c;
}

inline void str_to_lower ( std::string& source )
{
    std::transform( source.begin(), source.end(), source.begin(),
                    to_lower );
}

My question concerns the function to_lower:
Is this portable?

No.

I looked at an ASCII table, and recognized that I can convert uppercase
letters to lowercase by adding 32. However, I have no idea if this will
work for other character tables.

It won't. As someone else mentioned, just use tolower().
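
To make the portability point concrete: the +32 trick depends on the ASCII layout. In EBCDIC, for instance, the letters are not contiguous and the lowercase letters have smaller codes than the uppercase ones. If tolower() were not an option, one encoding-independent approach would be to look the character up in two parallel alphabets. A minimal sketch follows; the name to_lower_basic is made up for illustration, and it only handles the 26 basic Latin letters with no locale awareness.

```cpp
#include <string>

// Hypothetical helper (not from the thread): maps 'A'..'Z' to 'a'..'z' by
// position in two parallel alphabets, so no numeric character codes and no
// contiguous letter ranges are assumed.
inline char to_lower_basic(char c)
{
    static const std::string upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const std::string lower = "abcdefghijklmnopqrstuvwxyz";

    const std::string::size_type pos = upper.find(c);
    return pos == std::string::npos ? c : lower[pos];
}
```

In practice tolower() is still the better choice, since it also respects the current C locale.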
 
Ron Natalie

Matthias said:
Ah, could have guessed that there is something like it... :)
I have a question though. I included <cctype> instead of <ctype.h>.
Now, I thought all those forwarding headers wrap the C-functions in the
namespace std? However, when I call std::tolower, I get an error that
this function doesn't exist. When calling without the namespace-prefix
it works just fine. Why is that?


tolower isn't a function, it's a macro.
Macros don't have any clue of scope (hence they are ignorant of namespaces).
 
Matthias

Ron said:
tolower isn't a function, it's a macro.
Macros don't have any clue of scope (hence they are ignorant of
namespaces).

I have read the header file, and as far as I can tell, <cctype> #undef's
all macros and uses inline functions instead, because it is the cleaner
approach.
 
Alf P. Steinbach

Possibly a non-conforming C++ implementation, possibly that you
somewhere have or #include a macro that redefines 'tolower'.

* Ron Natalie:
tolower isn't a function, it's a macro.
Macros don't have any clue of scope (hence they are ignorant of namespaces).

In the standard it's documented as a function, not as a macro.

So if it's a macro then that's a non-conforming C++ implementation.

Btw., the thing about types is that 'tolower' takes an 'int' argument,
which should be the character value as 'unsigned char'. If the default
'char' type is signed, then certain characters may have negative values
as 'char', and when passed directly to 'tolower' yield values not
representable as 'unsigned char'. So cast to 'unsigned char' first.
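
A sketch of the cast described here, assuming a small wrapper function is acceptable. The name to_lower_safe is hypothetical; the only library call is std::tolower from <cctype>.

```cpp
#include <cctype>

// Hypothetical wrapper: converts through unsigned char so that a negative
// char value is never passed to std::tolower, then converts the int result
// back to char.
inline char to_lower_safe(char c)
{
    return static_cast<char>(
        std::tolower(static_cast<unsigned char>(c)));
}
```

The result cast matters as well: std::tolower returns an int, so assigning it to a char without a cast is an implicit narrowing conversion that some compilers warn about.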
 
Matthias

Alf said:
Possibly a non-conforming C++ implementation, possibly that you
somewhere have or #include a macro that redefines 'tolower'.

It's the GNU implementation of C++.

Btw., the thing about types is that 'tolower' takes an 'int' argument,
which should be the character value as 'unsigned char'.

If the function can only handle unsigned chars, why does it take an int? ^^

If the default
'char' type is signed, then certain characters may have negative values
as 'char', and when passed directly to 'tolower' yield values not
representable as 'unsigned char'. So cast to 'unsigned char' first.

Alright.
 
Attila Feher

Matthias said:
It's the GNU implementation of C++.


If the function can only handle unsigned chars, why does it take an
int? ^^

It is an ancient rule in C (and so it came to C++ in the bag) that char is
always promoted to int. In some cases (at least one case I know) it has a
very important role. Some functions returning (normally) characters should
also be able to return EOF (the end of file constant), which will be a "real
int", not a promoted char.
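
To illustrate the EOF point, here is a small sketch using std::istream::get(), which returns an int so that it can report EOF in addition to every possible character value. The helper name count_chars is invented for this example.

```cpp
#include <cstddef>
#include <cstdio>    // EOF
#include <sstream>
#include <string>

// Hypothetical helper: counts characters by reading until get() reports EOF.
inline std::size_t count_chars(const std::string& text)
{
    std::istringstream in(text);
    std::size_t n = 0;
    int ch;                         // int, not char: it must also hold EOF
    while ((ch = in.get()) != EOF)  // characters arrive as unsigned char values
        ++n;
    return n;
}
```

If ch were declared as a plain char, a character with all bits set could compare equal to EOF on a signed-char platform and end the loop early, which is the kind of problem the int interface avoids.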
 
Matthias

Attila said:
It is an ancient rule in C (and so it came to C++ in the bag) that char is
always promoted to int. In some cases (at least one case I know) it has a
very important role. Some functions returning (normally) characters should
also be able to return EOF (the end of file constant), which will be a "real
int", not a promoted char.

I just recognized that I can't cast because I'm using std::transform, so
I have no influence on the arguments passed. But since they come from an
std::string they are most probably valid characters right? =)
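
One way around this, sketched under the assumption that a C++11 compiler is available: std::transform accepts any callable, so the cast can live inside a small lambda (or, before C++11, a named wrapper function) rather than at the call site. The sketch reuses the str_to_lower name from the original post.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Sketch of str_to_lower built on std::tolower (the lambda requires C++11).
inline void str_to_lower(std::string& source)
{
    std::transform(source.begin(), source.end(), source.begin(),
                   [](unsigned char c) {  // each char converts to unsigned char here
                       return static_cast<char>(std::tolower(c));
                   });
}
```

Because the lambda's parameter is an unsigned char, every element of the string is converted before it reaches std::tolower, whatever the signedness of plain char.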
 
Attila Feher

Matthias wrote:
[SNIP]
I just recognized that I can't cast because I'm using std::transform,
so I have no influence on the arguments passed. But since they come
from an std::string they are most probably valid characters right? =)

As far as I know in C and C++ by definition every character is valid. There
can be no values of char which are invalid, and all bits of char types (also
called bytes) must participate in the representation of its value. But I
guess that this does not answer your question, because your term of "valid
character" means something entirely different to you.
 
Matthias

Attila said:
Matthias wrote:
[SNIP]
I just recognized that I can't cast because I'm using std::transform,
so I have no influence on the arguments passed. But since they come
from an std::string they are most probably valid characters right? =)


As far as I know in C and C++ by definition every character is valid. There
can be no values of char which are invalid, and all bits of char types (also
called bytes) must participate in the representation of its value. But I
guess that this does not answer your question, because your term of "valid
character" means something entirely different to you.

Well, if someone calls my function on a string which only contains
garbage characters, he shouldn't expect to retrieve anything but garbage
as well... :D
 
Ioannis Vranos

Matthias said:
Ah, could have guessed that there is something like it... :)
I have a question though. I included <cctype> instead of <ctype.h>.
Now, I thought all those forwarding headers wrap the C-functions in the
namespace std? However, when I call std::tolower, I get an error that
this function doesn't exist. When calling without the namespace-prefix
it works just fine. Why is that?


Do you mean that this does not compile for you?


#include <iostream>
#include <cctype>
#include <string>
#include <algorithm>


int main()
{
    using namespace std;

    string s="THiS iS a TeST STRiNG";

    transform(s.begin(), s.end(), s.begin(), tolower);
}



However unfortunately this does not compile with MINGW GCC 3.3.1:


C:\c>g++ temp.cpp -o temp.exe
temp.cpp: In function `int main()':
temp.cpp:13: error: no matching function for call to `transform(
__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >,
__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >,
__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >, <unknown type>)'
temp.cpp:16:1: warning: no newline at end of file

C:\c>


But compiles as it should with VC++:

C:\c>cl /EHsc temp.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.41013 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

temp.cpp
Microsoft (R) Incremental Linker Version 8.00.41013
Copyright (C) Microsoft Corporation. All rights reserved.

/out:temp.exe
temp.obj

C:\c>
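
A likely explanation for the GCC error, offered as an assumption rather than a diagnosis: with "using namespace std;" the name tolower may refer both to the int(int) function from <cctype> and to the two-argument template declared in <locale> (which <iostream> can pull in), so the compiler cannot deduce the type of the last argument, hence "<unknown type>". A minimal sketch of selecting the overload explicitly follows; lower_in_place is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: picks the int(int) overload by naming its type.
// Note: this passes plain char values through, so it does not address the
// negative-char issue discussed elsewhere in the thread.
inline void lower_in_place(std::string& s)
{
    int (*lower)(int) = std::tolower;  // the target type selects the <cctype> overload
    std::transform(s.begin(), s.end(), s.begin(), lower);
}
```

Newer C++ standards restrict taking the address of standard library functions, so a small wrapper function or lambda is probably the more future-proof choice.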
 
Ioannis Vranos

Matthias said:
I just recognized that I can't cast because I'm using std::transform, so
I have no influence on the arguments passed. But since they come from an
std::string they are most probably valid characters right? =)


You do not need to cast anything during the tolower() call. You should
provide valid input in the first place, so if you place any checks you
should place them earlier in the string creation process.


Naturally if you input wrong data you will get undefined behaviour.
Casts during the call of tolower() can only hide invalid input which you
could notice afterwards.
 
Old Wolf

Ioannis said:
You do not need to cast anything during the tolower() call. You should
provide valid input in the first place, so if you place any checks you
should place them earlier in the string creation process.

Naturally if you input wrong data you will get undefined behaviour.
Casts during the call of tolower() can only hide invalid input which you
could notice afterwards.

Consider the char whose value is -123. (In the usual character
set, this is an accented 'e' which is common in French). This is
most definitely a valid member of a std::string. But it is UB to
pass it to ctype.h's tolower() macro, because that macro expects
values in the range 0...255 (e.g. it could be implemented as:
#define tolower(c) tolower_table[c]
where tolower_table is a 256-entry array holding the lowercase
equivalent of each character value).

However, the function std::tolower in <cctype> should accept
negative chars as input. Warning -- if you are using namespace
std; and you go: tolower(x), you might get the macro version on
a poorly-implemented library. So you should always explicitly
write std::tolower .
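
The table idea can be sketched as follows, assuming 8-bit-style indexing by unsigned char value. The names LowerTable and table_to_lower are hypothetical and are not taken from any real library. Note also that, as far as I can tell, the C++ standard takes the <cctype> functions over from C with the same precondition on the argument, so the cast to unsigned char is worth keeping even when calling std::tolower.

```cpp
#include <cctype>
#include <climits>

// Hypothetical lookup table with one entry per unsigned char value,
// filled once from std::tolower.
struct LowerTable {
    unsigned char map[UCHAR_MAX + 1];
    LowerTable()
    {
        for (int i = 0; i <= UCHAR_MAX; ++i)
            map[i] = static_cast<unsigned char>(std::tolower(i));
    }
};

inline char table_to_lower(char c)
{
    static const LowerTable table;
    // Index with unsigned char: a negative index would read outside the array.
    return static_cast<char>(table.map[static_cast<unsigned char>(c)]);
}
```

With a table like this, passing the char value -123 directly as the index would be an out-of-bounds read, which is why the unchecked macro form is undefined for negative arguments.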
 
Matthias

Old said:
However, the function std::tolower in <cctype> should accept
negative chars as input. Warning -- if you are using namespace
std; and you go: tolower(x), you might get the macro version on
a poorly-implemented library. So you should always explicitly
write std::tolower .

As I said, that doesn't work -- for whatever reason. I am getting an
error tolower() couldn't be found in the namespace std or something similar.
 
Ioannis Vranos

Matthias said:
As I said, that doesn't work -- for whatever reason. I am getting an
error tolower() couldn't be found in the namespace std or something
similar.


What compiler are you using? Doesn't this compile with your compiler?


#include <cctype>
#include <string>
#include <algorithm>


int main()
{
    using namespace std;

    string s="THiS iS a TeST STRiNG";

    transform(s.begin(), s.end(), s.begin(), tolower);
}
 
