M
Mister Yu
hi experts,
i m new to python, i m writing crawlers to extract data from some
chinese websites, and i run into a encoding problem.
i have a unicode object, which looks like this u'\xd6\xd0\xce\xc4'
which is encoded in "gb2312", but i have no idea of how to convert it
back to utf-8
to re-create this one is easy:
this will work
============================¤¤¤å -> (same as the original string)
============================
but this doesn't,why
===========================Traceback (most recent call last):
File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position
0-3: ordinal not in range(128)
===========================
thank you
i m new to python, i m writing crawlers to extract data from some
chinese websites, and i run into a encoding problem.
i have a unicode object, which looks like this u'\xd6\xd0\xce\xc4'
which is encoded in "gb2312", but i have no idea of how to convert it
back to utf-8
to re-create this one is easy:
this will work
============================¤¤¤å -> (same as the original string)
============================
but this doesn't,why
===========================Traceback (most recent call last):
File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position
0-3: ordinal not in range(128)
===========================
thank you