UTF16 codec doesn't round-trip?

  • Thread starter John Perks and Sarah Mount
  • Start date
J

John Perks and Sarah Mount

(My Python uses UTF16 natively; can someone with UTF32 Python let me
know if that behaves differently?)
u'\ud800'
codecs.utf_16_be_encode(_)[0]
'\xd8\x00'
codecs.utf_16_be_decode(_)[0]
Traceback (most recent call last):
File "<input>", line 1, in ?
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1:
unexpected end of data

If the ascii can't be recognized as UTF16, then surely the codec
shouldn't have allowed it to be encoded in the first place? I could
understand if it was trying to decode ascii into (native) UTF32.

On a similar note, if you are using UTF32 natively, are you allowed to
have raw surrogate escape sequences (paired or otherwise) in unicode
literals?

Thanks

John
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

John said:
If the ascii can't be recognized as UTF16, then surely the codec
shouldn't have allowed it to be encoded in the first place? I could
understand if it was trying to decode ascii into (native) UTF32.

Please don't call the thing you are trying to decode "ascii". ASCII
is the name of the American Standard Code for Information Interchange;
it is a 7-bit code, and what you are trying to decode certainly isn't
ascii. Call it "bytes" instead.

So you are trying to decode bytes as UTF-16. The bytes you have
definitely are not UTF-16 - the specific sequence of bytes is invalid
in UTF-16. Therefore, the codec is right to reject it when decoding.

It might be considered as a bug that the codec encoded the characters
in the first place.
On a similar note, if you are using UTF32 natively, are you allowed to
have raw surrogate escape sequences (paired or otherwise) in unicode
literals?

Python accepts such literals.

Regards,
Martin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,815
Latest member
treekmostly22

Latest Threads

Top