slowness.chen
I have two files:
test.py:
--------------------------------------------------
# -*- coding: utf-8 -*-
print 'in this file', repr('中文')
# tt.txt is saved with UTF-8 encoding
f = file('tt.txt')
line1 = f.readline().strip()
print 'another file', repr(line1)
-------------------------------------------------------
tt.txt:
----------------------------------------------------
中文
test
-------------------------------------------------------
When I run test.py, I get the following output:
in this file '\xe4\xb8\xad\xe6\x96\x87'
another file '\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'
And I can't convert line1 to GBK like this:
line1.decode('utf8').encode('gbk')
I get this error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
Why do I get different repr values for the same text?
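
A possible explanation, sketched with the bytes from the output above: the extra leading '\xef\xbb\xbf' is the UTF-8 byte order mark (BOM), which some editors write at the start of a file. It decodes to u'\ufeff', the character GBK cannot encode. The sketch assumes tt.txt was saved with a BOM; `raw` is a hypothetical stand-in for line1, and the 'utf-8-sig' codec needs Python 2.5 or later.

```python
import codecs

# Hypothetical stand-in for line1: the bytes shown in the second repr above.
raw = b'\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'

# The first three bytes match the UTF-8 byte order mark.
has_bom = raw.startswith(codecs.BOM_UTF8)

# Plain 'utf8' keeps the BOM as U+FEFF; 'utf-8-sig' drops it while decoding.
with_bom = raw.decode('utf8')
without_bom = raw.decode('utf-8-sig')

# With the BOM gone, the remaining two characters can be encoded as GBK.
gbk_bytes = without_bom.encode('gbk')
```

If that assumption holds, either decoding with 'utf-8-sig' or saving tt.txt without a BOM should make the two repr values match.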