Travis
Is there an easy way to convert from UnicodeString to string or char *?
You mean ICU UnicodeString? You'll need to convert it to UTF-8. See
the ICU converters: http://icu-project.org/userguide/codepageConverters.html
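To make the UTF-8 suggestion concrete, here is a minimal sketch of what such a conversion does to the UTF-16 code units that UnicodeString stores internally. It deliberately avoids linking ICU, so utf16_to_utf8 is a hypothetical helper, not an ICU function; in real code ICU's own UnicodeString::toUTF8String or the converter API linked above would normally be the better choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of ICU): encode UTF-16 code units as UTF-8.
// Throws std::invalid_argument on unpaired surrogates.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }

        if (cp < 0x80) {                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result is a plain std::string holding UTF-8 bytes; calling c_str() on it gives the char * asked for, valid for as long as the string is alive.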
Rather than simply UTF-8, you'd be better off converting it to the
character encoding that std::string::value_type (char) has under the
std::codecvt facet of the current global locale (if you expect to do any
string processing or stream it to a std::ostream), or under the
std::codecvt facet of the locale you're going to output the string to if
you're going to use some binary I/O API.
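As a rough sketch of the codecvt route, the helper below pushes a wide string through the std::codecvt<wchar_t, char, std::mbstate_t> facet of a locale that is passed in explicitly. The name to_locale_encoding is made up for illustration, and the code assumes the text is already available as a std::wstring; it also does not call unshift(), so it is not complete for stateful encodings.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode `in` in the narrow encoding of `loc`.
// Throws std::runtime_error if the facet reports that conversion failed,
// for example because a character has no representation in that encoding.
std::string to_locale_encoding(const std::wstring& in, const std::locale& loc) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> Facet;
    if (in.empty()) return std::string();

    const Facet& cvt = std::use_facet<Facet>(loc);
    std::mbstate_t state = std::mbstate_t();

    // Worst case: every wide character needs max_length() bytes.
    std::string out(in.size() * static_cast<std::size_t>(cvt.max_length()), '\0');

    const wchar_t* from = in.data();
    const wchar_t* from_next = from;
    char* to = &out[0];
    char* to_next = to;

    std::codecvt_base::result r =
        cvt.out(state, from, from + in.size(), from_next,
                to, to + out.size(), to_next);

    if (r == std::codecvt_base::noconv) {
        // The facet says no conversion is needed; copy unit by unit.
        std::string same;
        for (std::size_t i = 0; i < in.size(); ++i)
            same += static_cast<char>(in[i]);
        return same;
    }
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("conversion to the locale's encoding failed");

    out.resize(static_cast<std::size_t>(to_next - to));
    return out;
}
```

Passing std::locale() uses a copy of the current global locale, while passing the locale imbued on a particular stream targets that stream's encoding instead. The error result from out() is also the facet's own signal that a character did not fit into the target encoding.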
But how can you be sure that all the characters from the UnicodeString
fit into the char encoding of the current global locale?
Another thing to remember when dealing with most
implementations is that comparison with even a German Unicode
std::collate facet will not find U+00F6 (o-umlaut) and
U+006F (o) followed by U+0308 (combining diaeresis) to be equal.
That just seems wrong to me. I would have thought they'd be equal
in *all* Unicode collations, since Unicode documents them as
canonically equivalent representations.
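One common workaround is to normalize both strings to the same form before comparing them. The sketch below is only a toy: compose_diaeresis and equivalent are hypothetical names, the table covers just six letters with U+0308, and real code would use a full normalizer (ICU provides one) rather than a hand-written table.

```cpp
#include <cstddef>
#include <string>

// Toy table: base letter followed by U+0308 -> precomposed letter.
// Returns 0 when the pair is not in the table.
char32_t precomposed_with_diaeresis(char32_t base) {
    switch (base) {
        case U'a': return 0x00E4;
        case U'o': return 0x00F6;
        case U'u': return 0x00FC;
        case U'A': return 0x00C4;
        case U'O': return 0x00D6;
        case U'U': return 0x00DC;
        default:   return 0;
    }
}

// Replace "base + U+0308" by the precomposed code point where the toy
// table knows it; everything else is copied through unchanged.
std::u32string compose_diaeresis(const std::u32string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (i + 1 < in.size() && in[i + 1] == 0x0308) {
            char32_t composed = precomposed_with_diaeresis(in[i]);
            if (composed != 0) {
                out += composed;
                ++i;  // skip the combining mark
                continue;
            }
        }
        out += in[i];
    }
    return out;
}

// Compare after normalizing both sides the same way.
bool equivalent(const std::u32string& a, const std::u32string& b) {
    return compose_diaeresis(a) == compose_diaeresis(b);
}
```

A plain code-point comparison treats the two spellings as different, while equivalent() treats them as the same text.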
On a related note, if anybody knows how to use
std::lexicographical_compare with a specific locale without having to
change the global locale (which is really not safe in a nontrivial
program), please answer - I'm becoming upset.
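One possible answer, sketched under the assumption that whole strings are being compared: a std::locale object can be used directly, without touching the global locale. Its collate facet has a compare() member, and std::locale itself has an operator() that orders two strings, so it can be passed as a comparator. The names less_in_locale and sort_in_locale are made up for illustration.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// True if `a` collates before `b` according to `loc`.
// The global locale is neither read nor modified.
bool less_in_locale(const std::string& a, const std::string& b,
                    const std::locale& loc) {
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size()) < 0;
}

// std::locale::operator() compares two strings with the locale's collate
// facet, so the locale object itself serves as the sort comparator.
void sort_in_locale(std::vector<std::string>& v, const std::locale& loc) {
    std::sort(v.begin(), v.end(), loc);
}
```

Calling less_in_locale(a, b, std::locale::classic()) gives plain "C" ordering. A named locale such as std::locale("de_DE.UTF-8") can be passed instead if it is installed; that name is platform-dependent and only an example here, and the constructor throws std::runtime_error when the locale is unavailable.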
I'd especially like to be able to specify a locale for each
sequence of encoding atoms so if the implementation has a
mapping between them, it will be able to tell me if they are
equal even when they're in different encodings.