Problem with <algorithm> transform

  • Thread starter Gerald I. Evenden

Gerald I. Evenden

Working on a Kubuntu 64bit system "c++ (GCC) 4.0.3".

The following simple program, extracted from pp. 497 and 499 of N. M. Josuttis'
"The C++ Standard Library ... " (file t.cpp):

1 #include <string>
2 #include <iostream>
3 #include <algorithm>
4 #include <cctype>
5 using namespace std;
6 int main() {
7 string s("This is the zip code of Hodna 1223");
8 cout << "original: " << s <<endl;
9 transform(s.begin(), s.end(),s.begin(), toupper);
10 cout << "upper: " << s << endl;
11 }

results in the following error output when executing 'c++ t.cpp'

t.cpp: In function ‘int main()’:
t.cpp:9: error: no matching function for call
to ‘transform(__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >,
__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >,
__gnu_cxx::__normal_iterator<char*, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >, <unknown type>)’

Something seems seriously wrong but I can't figure it.

Help, suggestions greatly appreciated.
 

Andrew Koenig

transform(s.begin(), s.end(),s.begin(), toupper);
Something seems seriously wrong but I can't figure it.

Alas, toupper is a macro so you can't pass it as an argument.
 

Erik Wikström

Alas, toupper is a macro so you can't pass it as an argument.

Are you sure? In C99 it is a function specified as int toupper(int c)
(section 7.4.2.2, "The toupper function") and in C++98 table 45 it is
also listed as a function.

To the OP: the code you posted compiled and ran fine with MSVC++ so I am
not sure why it did not work for you. However there is also a toupper()
function in C++ declared in the <locale> header which might work (i.e.
replace <cctype> with <locale> and try again).
 

Jerry Coffin

Working on a Kubuntu 64bit system "c++ (GCC) 4.0.3".

The following simple program, extracted from pp. 497 and 499 of N. M. Josuttis'
"The C++ Standard Library ... " (file t.cpp):

[ ... code elided ]
results in the following error output when executing 'c++ t.cpp'

The problem appears to be with your installation of gcc -- the code is
fine.
 

Alf P. Steinbach

* Erik Wikström:
Are you sure? In C99 it is a function specified as int toupper(int c)
(section 7.4.2.2, "The toupper function") and in C++98 table 45 it is
also listed as a function.

The reference you give is correct, and means that Andrew Koenig made a
mistake.

To the OP: the code you posted compiled and ran fine with MSVC++ so I am
not sure why it did not work for you.

Ironically, the error seems to be due to Koenig lookup... :)

However, adding an include of <locale> does not reproduce the error with
MSVC 7.1.

As Newton purportedly said, I frame no hypothesis.

However there is also a toupper()
function in C++ declared in the <locale> header which might work (i.e.
replace <cctype> with <locale> and try again).

No, that one is a bit different: it suffers from the usual standard
library (except STL parts) unusability and complexity, taking a locale
argument as second argument.

A cure for the immediate problem is to write

::toupper

but this may be a compiler-specific cure (I'm not sure).

However, it doesn't matter much, because using the C library's toupper
function directly is a big no-no: it should only be used via a wrapper like

char toUpper( char c )
{
    return static_cast<char>(
        ::toupper( static_cast<unsigned char>( c ) )
    );
}

Otherwise, with negative char value promotion to integer (the formal
argument) will yield a value that is not representable as unsigned char,
which the C library toupper function requires.
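
A minimal sketch of how a wrapper like the one above might be plugged into std::transform. The helper name upperCopy is hypothetical and not from the thread; it only shows that a plain, non-overloaded wrapper gives the compiler a single function to deduce from.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Wrapper in the spirit of the post: convert to unsigned char first so
// the value handed to the C library function is never negative.
char toUpper(char c)
{
    return static_cast<char>(
        std::toupper(static_cast<unsigned char>(c)));
}

// Hypothetical helper: returns an upper-cased copy of s.
std::string upperCopy(std::string s)
{
    // toUpper names exactly one function, so template argument
    // deduction for std::transform has nothing to disambiguate.
    std::transform(s.begin(), s.end(), s.begin(), toUpper);
    return s;
}
```

Called as upperCopy("This is the zip code of Hodna 1223"), this should yield the string the original program was trying to print.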

So to sum up, the Josuttis book example may be correct for the given
data (only ASCII characters), but not for an arbitrary input string,
and g++ apparently does something funny.

Cheers, & hth.,

- Alf
 

Jerry Coffin

Alas, toupper is a macro so you can't pass it as an argument.

Unless my memory is worse than usual today, a function-like macro is
only supposed to be expanded when/if its name is followed by an open
parenthesis, so in this case the fact that there may be a macro by that
name shouldn't make any difference.
 

James Kanze

Alas, toupper is a macro so you can't pass it as an argument.

Come now, Andy, you know better than that. In C, toupper may
be a function style macro, but the function declaration must
also be present (hidden by the macro), and will be used if the
token immediately following the symbol isn't a '('. In C++,
toupper is a set of overloaded functions (including a function
template), and can't be used without some sort of overload
disambiguation. Typically, this will be the arguments in a
function call or, when taking the address, the type of the
destination. The problem here is that where the code takes the
address matches a templated parameter of a template function.
Which means that the compiler can't possibly do type deduction,
and the call fails.
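
The mechanism described above can be reproduced without any standard library names. The following is only a sketch: the namespace demo, the overload set shout, and the helper shoutAll are all hypothetical, shaped to resemble the toupper overloads (one ordinary function plus one function template with an extra parameter). It assumes an ASCII-compatible character set.

```cpp
#include <algorithm>
#include <string>

namespace demo {
    // Ordinary function, standing in for the <cctype> toupper.
    int shout(int c) { return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c; }

    // Function template, standing in for the <locale> toupper.
    template <typename CharT>
    CharT shout(CharT c, int /*unused*/) { return c; }
}

std::string shoutAll(std::string s)
{
    // Passing demo::shout directly to std::transform would not compile:
    // the fourth parameter's type is itself a template parameter, so
    // there is no destination type from which to pick an overload.
    // Naming the destination type selects the int(int) overload.
    int (*fn)(int) = demo::shout;
    std::transform(s.begin(), s.end(), s.begin(), fn);
    return s;
}
```

The assignment to fn is the "type of the destination" case: the pointer type tells the compiler which member of the overload set is meant before std::transform ever sees it.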

Of course, the original code couldn't work anyway; it expects to
call toupper with a single, char argument, and the only overload
of toupper that can be legally called with a char argument
requires a locale as the second argument.
 

James Kanze

On 2008-01-06 21:45, Andrew Koenig wrote:
Are you sure? In C99 it is a function specified as int toupper(int c)
(section 7.4.2.2, "The toupper function") and in C++98 table 45 it is
also listed as a function.
To the OP: the code you posted compiled and ran fine with
MSVC++ so I am not sure why it did not work for you.

It's undefined behavior, so it might compile. I'd be very
surprised if it worked correctly with VC++, however (unless you
compiled with the /J option).
However there is also a toupper() function in C++ declared in
the <locale> header which might work (i.e. replace <cctype>
with <locale> and try again).

There are two separate issues involved here. The first is that
in C++, toupper is overloaded, with one of the overloads (the
one in <locale>) being a function template, and that the
standard allows any standard C++ header to include any other.
Obviously, if the template function is visible, the compiler
can't possibly resolve type deduction for std::transform unless
you somehow tell it which overload you want to use: it's using
the type of the argument (i.e. the name of the function) to
deduce the template type of std::transform, and it needs to know
the target type to resolve the overload on the name. If
<locale> hasn't been included (directly or indirectly), on the
other hand, the only function visible is the one in <cctype>,
type deduction works, and the code compiles.

The second problem is that if char is signed (which it is by
default in VC++), there is no overload of std::toupper which can
be legally called with just a single char argument: the function
template in <locale> requires two arguments, and the function in
<cctype> takes an int in the range [0...UCHAR_MAX] as an
argument---if char is signed, it can (and often will) have a
negative value, which results in undefined behavior if it is
passed to the toupper function in <cctype>. This error is so
common that a number of implementations today actually make the
code work anyway, for all values except EOF (which is almost
always -1). (If you're using ISO 8859-1---one of the more
widespread single byte encodings---then -1 from a char would
correspond to a 'ÿ', a latin small letter y with diaeresis. And
ISO 8859-1 doesn't contain a latin capital letter y with
diaeresis, so returning the same value is correct.)
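
To make the range requirement concrete, here is a small sketch. Both helper names (inCtypeRange and ctypeArg) are hypothetical; the only fact relied on is that the <cctype> functions accept values in [0...UCHAR_MAX] (plus EOF).

```cpp
#include <climits>

// True if v is a value the <cctype> functions accept (EOF aside).
bool inCtypeRange(int v)
{
    return v >= 0 && v <= UCHAR_MAX;
}

// Hypothetical helper: the value to hand to a <cctype> function for c.
// Converting through unsigned char maps every char value into
// [0, UCHAR_MAX], whether plain char is signed or not.
int ctypeArg(char c)
{
    return static_cast<unsigned char>(c);
}
```

Where plain char is signed, a char holding the byte 0xFF converts to a negative int and fails inCtypeRange, while ctypeArg of the same char is always in range.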

Finally, of course, you might even have to ask if any of the
standard toupper functions are applicable. There is not,
generally, a one to one mapping of lower to upper, and some
lower case characters might map to a two character sequence in
upper case (German 'ß' becomes "SS", or in some special
contexts, "SZ"; Swiss German 'ä' becomes either "Ae" or "AE",
etc.). And of course, toupper (in all its forms) is totally
useless if you have a multibyte encoding, like UTF-8 (which is
the default, I believe, in most modern Linux distributions).
Depending on the application, such issues may or may not be
relevant.
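
As an illustration of why a character-by-character transform cannot handle such cases, here is a rough sketch that builds a new string instead. The function name upperLatin1 is hypothetical, and the code assumes ISO 8859-1 input and the default "C" locale, in which std::toupper only changes ASCII letters.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, assuming s is ISO 8859-1 and the "C" locale:
// upper-cases ASCII letters and expands the sharp s (byte 0xDF) to "SS".
std::string upperLatin1(const std::string& s)
{
    std::string out;
    out.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        unsigned char uc = static_cast<unsigned char>(s[i]);
        if (uc == 0xDF) {
            out += "SS";  // one character in, two characters out
        } else {
            out += static_cast<char>(std::toupper(uc));
        }
    }
    return out;
}
```

Because the output may be longer than the input, this cannot be written as an in-place std::transform over the original string.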
 

James Kanze

* Erik Wikström:
The reference you give is correct, and means that Andrew
Koenig made a mistake.
Surprisingly.
Ironically, the error seems to be due to Koenig lookup... :)

Not really, but see my other postings.
However, adding an include of <locale> does not reproduce the
error with MSVC 7.1.

Are you kidding? (I just tried it with VC++ 8, and I get the
same results. Could it be that Dinkumware uses some form of
concept checking to exclude functions here which can't be called
with only one argument? Or?)

I haven't tried it, but he did have a "using namespace std;" in
there, so everything in <locale> should be visible. In theory
at least---and probably in practice, including <ctype.h> (*NOT*
<cctype>), and specifying ::toupper (to block any chance of
As Newton purportedly said, I frame no hypothesis.
No, that one is a bit different: it suffers from the usual
standard library (except STL parts) unusability and
complexity, taking a locale argument as second argument.

Which is not a problem for occasional use.

What you really need is some sort of functional object, however,
which will get the ctype facet once from the locale, and use the
toupper of the facet.
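
One possible shape for such a functional object is sketched below. The class name ToUpper and the helper upperCopy are hypothetical; std::use_facet and std::ctype<char>::toupper are the standard <locale> facilities being relied on.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical functional object: looks up the ctype facet once,
// then reuses it for every character.
class ToUpper
{
public:
    explicit ToUpper(const std::locale& loc = std::locale())
        : loc_(loc),
          facet_(&std::use_facet<std::ctype<char> >(loc_))
    {
    }

    char operator()(char c) const { return facet_->toupper(c); }

private:
    std::locale loc_;                // keeps the facet alive
    const std::ctype<char>* facet_;  // owned by loc_
};

std::string upperCopy(std::string s, const std::locale& loc = std::locale())
{
    std::transform(s.begin(), s.end(), s.begin(), ToUpper(loc));
    return s;
}
```

The facet's toupper takes a char directly, so no cast to unsigned char is needed, and because ToUpper is a class type there is no overload set for std::transform to trip over.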
A cure for the immediate problem is to write

but this may be a compiler-specific cure (I'm not sure).

If he includes <ctype.h>, the resulting code is guaranteed to
compile.
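
A sketch of that variant, for reference. The helper name upperAscii is hypothetical, and the code deliberately assumes the input holds only characters with non-negative values (plain ASCII, say); it does nothing about the signed char issue discussed elsewhere in the thread.

```cpp
#include <ctype.h>   // C header: declares ::toupper in the global namespace
#include <algorithm>
#include <string>

// Hypothetical helper. Assumes s holds only characters with
// non-negative values; otherwise the uncast call into ::toupper is
// undefined where plain char is signed.
std::string upperAscii(std::string s)
{
    // ::toupper can only mean the global int toupper(int); the
    // function template from <locale> lives in std:: and is not
    // considered.
    std::transform(s.begin(), s.end(), s.begin(), ::toupper);
    return s;
}
```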

[...]
So to sum up, the Josuttis book example may be correct for the
given data (only ASCII characters), but not for an arbitrary
input string, and g++ apparently does something funny.

One of the C++ headers he's included probably includes <locale>
in the g++ implementation. The standard clearly says that this
is unspecified (even though it can cause any number of
portability problems).

Anyway, I'd be very interested in hearing from someone familiar
with the VC++ implementation, explaining why the original code
still compiles when you include <locale>. Like you, I can't
explain it.
 

James Kanze

[ ... code elided ]
results in the following error output when executing 'c++ t.cpp'
The problem appears to be with your installation of gcc -- the
code is fine.

No it's not. Whether it will compile or not is unspecified:
any C++ header may include any other, and if <locale> is
included, you should have a serious problem with template type
deduction for the call to std::transform. (From experience, g++
headers tend to include the world; when you develop under g++,
then port to other compilers, you very quickly get used to
having to add includes for headers you'd forgotten.)

Of course, even if it compiles, it has undefined behavior if
char is signed.
 

Gerald I. Evenden

Digging through Prata's "C++ Primer Plus" (5th) I found notes on
p.936-7 relating to C library functions declared as having int returns
and problems thereof, so I added a line to t.cpp and changed line
13's function name:

1 #include <string>
2 #include <iostream>
3 #include <algorithm>
4 #include <cctype>
5 using namespace std;
6 /* I added the following line because cctype
7 * apparently declares 'int toupper(char)'
8 */
9 static char toUpper(char c) { return toupper(c); }
10 int main() {
11 string s("This is the zip code of Hodna 1223");
12 cout << "original: " << s <<endl;
13 transform(s.begin(), s.end(),s.begin(), toUpper);
14 cout << "upper: " << s << endl;
15 }

then:
gie@charon:~/Letters/src$ g++ t.cpp
gie@charon:~/Letters/src$ ./a.out
original: This is the zip code of Hodna 1223
upper: THIS IS THE ZIP CODE OF HODNA 1223
gie@charon:~/Letters/src$
Problem solved.
Thanks for the help. This looks like a pop quiz but I did not
mean it that way.
 

James Kanze

Digging through Prata's "C++ Primer Plus" (5th) I found notes
on p.936-7 relating to C library functions declared as having
int returns and problems thereof, so I added a line to t.cpp
and changed line 13's function name:

1 #include <string>
2 #include <iostream>
3 #include <algorithm>
4 #include <cctype>
5 using namespace std;
6 /* I added the following line because cctype
7 * apparently declares 'int toupper(char)'
8 */

It had better be: "int std::toupper( int )".
9 static char toUpper(char c) { return toupper(c); }

return toupper( static_cast< unsigned char >( c ) ) ;

It's undefined behavior to call the std::toupper( int ) above
with any negative value other than EOF, and char is often
signed.
10 int main() {
11 string s("This is the zip code of Hodna 1223");
12 cout << "original: " << s <<endl;
13 transform(s.begin(), s.end(),s.begin(), toUpper);
14 cout << "upper: " << s << endl;
15 }
then:
gie@charon:~/Letters/src$ g++ t.cpp
gie@charon:~/Letters/src$ ./a.out
original: This is the zip code of Hodna 1223
upper: THIS IS THE ZIP CODE OF HODNA 1223
gie@charon:~/Letters/src$
Problem solved.

If all the code has to handle is this one string, then there are
much, much easier ways to do it. In general, the above code has
undefined behavior.

And of course, the reason why the code now compiles is because
there is only one toUpper function, so overload resolution has
no problem choosing.
 

Gerald I. Evenden

James said:
It had better be: "int std::toupper( int )".

Yes:

static int toUpper(int c) { return toupper(c); }

Works. This seems to negate Prata's argument about type problems.
Also
transform(s.begin(), s.end(),s.begin(), ::toupper);
seems to work (also dropping the toUpper line):

Alas!! Too much Greek. Note "std::toupper" does NOT work.
return toupper( static_cast< unsigned char >( c ) ) ;

It's undefined behavior to call the std::toupper( int ) above
with any negative value other than EOF, and char is often
signed.



If all the code has to handle is this one string, then there are
much, much easier ways to do it. In general, the above code has
undefined behavior.

Outside of resorting to C char types what would be the efficient
alternative? The above looks more nifty than what I would have
to do in straight C.
And of course, the reason why the code now compiles is because
there is only one toUpper function, so overload resolution has
no problem choosing.

First, I am a total C++ dumbkopf and only an old time C user that
likes to take advantage of some of C++'s notational short cuts.
In this case I do not see where the overloading of toupper is coming
from. From what you say it seems to imply that there are multiple
declarations of toupper---where?? I have only seen toupper used
in this basic example and each referred to cctype which I assumed
contained the *only* definition.

Thanks for your comments. They sparked me to look a little further.
 

James Kanze

static int toUpper(int c) { return toupper(c); }
Works. This seems to negate Prata's argument about type problems.

The "type problem" is that the compiler cannot deduce the type
to instantiate std::transform.
Also
transform(s.begin(), s.end(),s.begin(), ::toupper);
seems to work (also dropping the toUpper line):

As it should---*if* you include <ctype.h>, rather than <cctype>.
According to the standard, it shouldn't work if you only include
<cctype>, but in practice, it will with most, if not all,
implementations, and the standard has been (or will be) modified
to allow (but not require) it to work.
Alas!! Too much Greek. Note "std::toupper" does NOT work.

Again, with the includes above, it's unspecified whether it works.
Outside of resorting to C char types what would be the
efficient alternative? The above looks more nifty than what I
would have to do in straight C.

There are two general solutions. One, as others have pointed
out, is to wrap the ctype function, casting the argument to
unsigned char in order to ensure that it will be in range. The
other is to create a functional object which takes a locale as
an argument (probably with a default to locale()), extracts the
ctype facet, and uses this.

Finally, of course, if portability is not a concern, and your
compiler has an option to make char unsigned, you can use that,
and:
transform(s.begin(), s.end(),s.begin(), ::toupper);
The undefined behavior only occurs if plain char is signed.
(IMHO, this is, technically, the best option. It has a serious
management problem, however, in that your code depends heavily
on a compiler option, which someone in the future might change
without checking. Perhaps with something like:
#include <climits>
#if CHAR_MIN != 0
#error requires plain char to be unsigned, must use /J option
#endif
at the top of the file?)
First, I am a total C++ dumbkopf and only an old time C user
that likes to take advantage of some of C++'s notational short
cuts. In this case I do not see where the overloading of
toupper is coming from.

From the fact that the C++ standard (unlike the C standard)
allows a standard header to include any other standard headers.
And the standard header <locale> defines a template function
toupper:
template< typename CharT >
CharT toupper( CharT, locale const& ) ;
(in std::, of course). Given:
transform(s.begin(), s.end(),s.begin(), toupper);
after a "using namespace std;", or
transform(s.begin(), s.end(),s.begin(), std::toupper);
the compiler should take into account that this function
template exists, and be unable to deduce the type of the fourth
parameter. (For some reason, VC++ version 8 seems to ignore
function templates when doing this type deduction. That is not
conforming, but may be intentional rather than an error.)
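
For completeness, the <locale> template can be called directly once both arguments are supplied. The sketch below uses a hypothetical helper name, upperWithLocale; the two-argument std::toupper is the standard function template declared above.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-cases s with the two-argument
// std::toupper template from <locale>.
std::string upperWithLocale(std::string s,
                            const std::locale& loc = std::locale())
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // Both arguments are supplied, so CharT is deduced as char
        // and the one-argument <cctype> function is not viable.
        s[i] = std::toupper(s[i], loc);
    }
    return s;
}
```

The difference from the failing std::transform call is that here the overload is chosen at an ordinary call site with real arguments, rather than from a bare function name.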
 
