conversion of string to all lower case

DJ

Can someone tell me the library call that converts strings to lower case or
returns a new string that is lower case of the original, thanks

im using <string>

David
 
Julie

DJ said:
Or perhaps even better a compare that ignores case.

thanks

The discussion regarding the (international) caveats of lower/upper case and
case-insensitive *word* comparisons comes up monthly. Check the Google Groups
archives for more blather than you want to read, as well as a couple of
(somewhat) portable/internationalized solutions.
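For the simple case only, a compare that ignores case could be sketched as below. This is a minimal sketch, assuming a single-byte encoding where upper and lower case map one-to-one; equals_ignore_case is a hypothetical helper name, not a standard library function, and it does not address the international caveats mentioned above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if a and b are equal when case is ignored.
// Compares char by char using std::tolower in the current C locale,
// so it only makes sense where case mapping is one-to-one.
bool equals_ignore_case(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    return std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          // unsigned char avoids undefined behaviour for
                          // negative char values passed to std::tolower.
                          return std::tolower(x) == std::tolower(y);
                      });
}
```

Because the lengths must match, this can never treat 'Fuß' and 'FUSS' as equal.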
 
Ioannis Vranos

Julie said:
The discussion regarding the (international) caveats of lower/upper case and
case-insensitive *word* comparisons comes up monthly. Check the Google Groups
archives for more blather than you want to read, as well as a couple of
(somewhat) portable/internationalized solutions.


I am confused by your terminology "international" here. What do you mean?
 
Ioannis Vranos

DJ said:
Can someone tell me the library call that converts strings to lower case or
returns a new string that is lower case of the original, thanks

im using <string>

David


Check std::toupper() and std::tolower() functions of <cctype>.
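There is no single std::string member that does this, so those functions are usually applied per character. A minimal sketch, assuming plain char text in the current C locale (to_lower_copy is a hypothetical helper name, not part of the library):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lower-cased copy of s.
// The argument is taken by value, so the caller's string is untouched.
std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // Going through unsigned char avoids undefined
                       // behaviour when char is signed and negative.
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}
```

Calling to_lower_copy("Hello World") gives "hello world"; to change a string in place, the same std::transform call can be applied to it directly.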
 
Rolf Magnus

Ioannis said:
I am confused by your terminology "international" here. What do you mean?

One example is the German character ß that doesn't have a single uppercase
equivalent. 'Fuß' would need to compare equal to 'FUSS'.
 
Ioannis Vranos

Rolf said:
One example is the German character ß that doesn't have a single uppercase
equivalent. 'Fuß' would need to compare equal to 'FUSS'.


This is not the case here, since we are talking about std::string.

About multilingual characters, one should use wchar_t, std::wstring and
the std::towlower(), std::towupper() of <cwctype>, all guaranteed to work.


C++98:

"Type wchar_t is a distinct type whose values can represent distinct
codes for all members of the largest extended character set specified
among the supported locales (22.1.1). Type wchar_t shall have the same
size, signedness, and alignment requirements (3.9) as one of the other
integral types, called its underlying type."
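The wide-character version looks much the same as the char one. A minimal sketch, assuming the program has selected a suitable locale (for example with std::setlocale) if it needs more than the basic character set; to_lower_wide is a hypothetical helper name:

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: returns a lower-cased copy of a wide string,
// converting one wchar_t at a time with std::towlower.
std::wstring to_lower_wide(std::wstring s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](wchar_t c) {
                       // std::towlower takes and returns std::wint_t.
                       return static_cast<wchar_t>(std::towlower(c));
                   });
    return s;
}
```

Which characters outside the basic set are actually converted depends on the LC_CTYPE category of the current C locale.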
 
Rolf Magnus

Ioannis said:
This is not the case here, since we are talking about std::string.

About multilingual characters, one should use wchar_t, std::wstring and
the std::towlower(), std::towupper() of <cwctype>, all guaranteed to work.

How do those handle such a conversion? The main point here is that the
number of characters in the uppercase version and in the lowercase version
are not equal. Character-based toupper and tolower can't handle that.
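To illustrate: a per-character conversion always returns a string of the same length, so 'Fuß' can never become 'FUSS' that way. The sketch below shows this next to a hand-written special case for ß. Both helper names are hypothetical, and the ß rule is only an illustration, not a general case-mapping solution.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: upper-cases one wchar_t at a time.
// The result always has exactly as many characters as the input.
std::wstring to_upper_wide(std::wstring s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](wchar_t c) {
                       return static_cast<wchar_t>(std::towupper(c));
                   });
    return s;
}

// Hypothetical helper: same idea, but expands ß (U+00DF) to "SS" by hand,
// so the result may be longer than the input.
std::wstring to_upper_german(const std::wstring& s)
{
    std::wstring out;
    for (wchar_t c : s) {
        if (c == L'\u00DF')
            out += L"SS";
        else
            out += static_cast<wchar_t>(std::towupper(c));
    }
    return out;
}
```

With the input L"Fu\u00DF", to_upper_wide returns three characters, while to_upper_german returns the four characters of L"FUSS".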
 
Ioannis Vranos

Rolf said:
How do those handle such a conversion? The main point here is that the
number of characters in the uppercase version and in the lowercase version
are not equal. Character-based toupper and tolower can't handle that.




However, they work for Greek and English, and I assume for all languages with
a one-to-one correspondence between lower and upper case, so I guess they are
meant for such languages and it is up to the programmer to make this decision.
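For such one-to-one languages, the mapping that gets applied depends on the locale. A minimal sketch that makes the locale explicit by going through the std::ctype facet of <locale> instead of the global C locale; to_lower_in_locale is a hypothetical helper name, and which named locales exist is an assumption about the platform:

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lower-cases s according to the ctype facet of loc.
// The mapping is still one character to one character.
std::wstring to_lower_in_locale(std::wstring s, const std::locale& loc)
{
    if (!s.empty()) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t> >(loc);
        // Converts the range [&s[0], &s[0] + s.size()) in place.
        facet.tolower(&s[0], &s[0] + s.size());
    }
    return s;
}
```

Passing std::locale::classic() covers the basic character set; passing std::locale("") uses the user's environment, if the platform supports it.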
 
Julie

Ioannis said:
I am confused by your terminology "international" here. What do you mean?

I mean that there are languages that apparently do not have a 1-1
correspondence between upper and lower case words (and characters).

For English, u/l case comparisons are trivial. For German, there are issues.

This is what I mean about 'international' -- if the OP is writing a
locale-independent application (assumed to be the case unless indicated
otherwise), they will have to contend w/ such 'international' issues.
 
Ioannis Vranos

Julie said:
I mean that there are languages that apparently do not have a 1-1
correspondence between upper and lower case words (and characters).

For English, u/l case comparisons are trivial. For German, there are issues.

This is what I mean about 'international' -- if the OP is writing a
locale-independent application (assumed to be the case unless indicated
otherwise), they will have to contend w/ such 'international' issues.


However the OP was talking about std::string and not std::wstring.
 
Julie

Ioannis said:
However the OP was talking about std::string and not std::wstring.

OP:

"im using <string>"

No further information was provided about specific type or locale dependence,
therefore not assumed in my responses.
 
Ioannis Vranos

Julie said:
OP:

"im using <string>"

No further information was provided about specific type or locale dependence,
therefore not assumed in my responses.


From the subject "conversion of string to all lower case" and the question

"Can someone tell me the library call that converts strings to lower
case or returns a new string that is lower case of the original, thanks

im using <string>"


it looks like he is asking about the usual stuff.
 
Julie

Ioannis said:
From the subject "conversion of string to all lower case" and the question

"Can someone tell me the library call that converts strings to lower
case or returns a new string that is lower case of the original, thanks

im using <string>"

it looks like he is asking about the usual stuff.

Right -- and I gave the usual answer.

nfc
 
Catalin Pitis

Ioannis Vranos said:
I don't think so. In simple words, he is talking about chars and you about
wchar_ts.

Is that the header <string> or the class std::string?

Catalin
 
Richard Herring

Ioannis Vranos said:
I don't think so. In simple words, he is talking about chars and you
about wchar_ts.

German ß is part of ISO8859-1, which is commonly stored in char, not
wchar_t.
 
Ioannis Vranos

Richard said:
German ß is part of ISO8859-1, which is commonly stored in char, not
wchar_t.


Nope.


TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".

"A type wchar_t is provided to hold characters of a larger character set
such as Unicode. It is a distinct type. The size of wchar_t is
implementation-defined and large enough to hold the largest character
set supported by the implementation’s locale (see 21.7, C.3.3)."


To give an example, in Windows GUI applications, char is guaranteed to
work only for English characters, for any other language you should use
wchar_t.
 
Rolf Magnus

Ioannis said:
Yup.

TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".

Right. Nothing here forbids non-ASCII characters.

"A type wchar_t is provided to hold characters of a larger character set
such as Unicode. It is a distinct type. The size of wchar_t is
implementation-defined and large enough to hold the largest character
set supported by the implementation’s locale (see 21.7, C.3.3)."

And what does that have to do with ISO-8859-1? It's neither a Unicode
character set nor a multibyte character set. It's an 8-bit character set,
so each character of it will always fit into a byte. So char is perfect for
holding it.

To give an example, in Windows GUI applications, char is guaranteed to
work only for English characters, for any other language you should use
wchar_t.

Is that so?
 
Ioannis Vranos

Rolf said:
Right. Nothing here forbids non-ASCII characters.


This discussion can't reach a reasonable conclusion. Just give the subject
more thought.
 
