Unicode, C++, Python 2.2

Guest

I am currently writing a Python interface to a C++ library. Some of the
functions in this library take Unicode strings (UTF-8, mostly) as arguments.

However, when getting these data I run into a problem on Python 2.2
(RHEL3): while the data is all nice UCS4 in 2.3, in 2.2 it seems to be
UTF-8 on top of UCS4. That is, UTF-8 encoded in UCS4, meaning that 3 bytes
of each UCS4 char are 0 and the first one contains a byte of the string's
UTF-8 encoding.

Is there a trick to get Python 2.2 to do UCS4 more cleanly?
 
Guest

Trond said:
I am currently writing a Python interface to a C++ library. Some of the
functions in this library take Unicode strings (UTF-8, mostly) as
arguments.

However, when getting these data I run into a problem on Python 2.2
(RHEL3): while the data is all nice UCS4 in 2.3, in 2.2 it seems to be
UTF-8 on top of UCS4. That is, UTF-8 encoded in UCS4, meaning that 3 bytes
of each UCS4 char are 0 and the first one contains a byte of the string's
UTF-8 encoding.

Is there a trick to get Python 2.2 to do UCS4 more cleanly?

It's hard to tell from your message what your problem really is, as we
have no clue what "these data" are. How do you know they are "nice
UCS4" in 2.3? Are you looking at the internal representation at the
C level, or are you looking at something else? Do you use byte strings
or Unicode strings?

You tried to explain what "UTF8 encoded in UCS4" might be, but I'm
not sure I understand the explanation: what precise sequence of
statements did you use to create such a thing, and what precisely
does it look like (what exact byte is first, what is second, and so
on)?

Regards,
Martin
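
For anyone following along, here is a small sketch of one possible reading
of "UTF-8 encoded in UCS4": each byte of the UTF-8 encoding ends up as its
own code point, which is what happens when UTF-8 bytes are widened one by
one (equivalent to decoding them as Latin-1) instead of being decoded as
UTF-8. This is only an assumption about what the original poster saw; the
thread does not confirm it. The example is written for a current Python 3
interpreter, not Python 2.2, and the helper name code_points is made up for
illustration.

```python
# Illustration only: runs on modern Python 3, not Python 2.2.
# Assumption: the "UTF-8 on top of UCS4" effect comes from widening each
# UTF-8 byte into its own code point instead of decoding the bytes.

def code_points(text):
    """Return the code points of a string as 'U+XXXX' labels (hypothetical helper)."""
    return ["U+%04X" % ord(ch) for ch in text]

# UTF-8 bytes for e-acute (U+00E9) followed by the euro sign (U+20AC).
raw = "\u00e9\u20ac".encode("utf-8")

decoded = raw.decode("utf-8")      # proper decoding: one code point per character
widened = raw.decode("latin-1")    # one code point per byte, upper bytes all zero

print(code_points(decoded))
print(code_points(widened))

# If the data really is in the "widened" form, narrowing each code point
# back to a byte and decoding as UTF-8 recovers the intended text.
repaired = widened.encode("latin-1").decode("utf-8")
print(repaired == decoded)
```

Printing the exact code points like this answers the "what exact byte is
first, what is second" question directly, and the round trip through
Latin-1 is a repair only if the widening assumption actually holds.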
 
