Licheng Fang
I want to store Chinese in Unicode internally in my program, and give
output in UTF-8 or GBK format. After two days of searching and reading,
I still cannot find a simple and straightforward way to do the code
conversions. In particular, I want portability of the code across
platforms (Windows and Linux), and I don't like having to refer the
users of my code to third-party libraries just to compile it.
Some STL references point to the class "codecvt<>" for this task, but
it seems that I must rely on non-standard, third-party specializations
of this class. The STL itself doesn't implement the code conversions.
Another option I've read about is using GNU's "iconv", which is
implemented in C, and Glib provides a C++ wrapper of "iconv". Again,
re-compiling my source code can be troublesome if I rely heavily on
these libraries. Boost also seems to have some tools for code
conversion. Considering the huge size of the Boost libraries, I would
have to pass on that option.
These are the only possible ways I know of so far. I have to say that
my idea of how this task should be done is somewhat influenced by the
Python way, which is simple and elegant:
if 's' is a string in GBK, I decode it with
unicode_s = s.decode('gbk')
and when I need to output in GBK I simply convert it back by
output = unicode_s.encode('gbk')
or, I can let the file object know what the external encoding is:
import codecs
f = codecs.open('somefile', 'r', 'gbk')
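Put together, the round trip I have in mind looks roughly like the sketch below. It assumes Python 3, where the GBK input is a bytes object; the sample text, the temporary directory, and the variable names are only illustrative.

```python
import codecs
import os
import tempfile

# Bytes as they would arrive from a GBK-encoded source (the two characters "中文").
gbk_bytes = b"\xd6\xd0\xce\xc4"

# Decode into the internal Unicode representation.
text = gbk_bytes.decode("gbk")

# Encode back out in either external encoding.
as_gbk = text.encode("gbk")
as_utf8 = text.encode("utf-8")

# Or let the file object handle the external encoding on write and read.
path = os.path.join(tempfile.mkdtemp(), "somefile")
with codecs.open(path, "w", "gbk") as f:
    f.write(text)
with codecs.open(path, "r", "gbk") as f:
    read_back = f.read()
```

This is the level of convenience I am hoping to find in C++: one call to decode, one call to encode, and a file object that knows its own external encoding.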
I know it's not fair to expect the same things from two different
languages. I wonder, however, how such a seemingly trivial task can be
so infuriatingly complicated in C++.