Greg
Hi, I am having some encoding problems when I parse text from a
non-English website using BeautifulSoup and then write the results to
a txt file.
I have the text both as a normal byte string (text) and as a unicode
string (utext):
print repr(text)
'Branie zak\xc2\xb3adnik\xc3\xb3w'
print repr(utext)
u'Branie zak\xb3adnik\xf3w'
Both print text and print utext show the 'wrong' symbols
(fileSoup.prettify() does too):
Branie zak³adników
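For reference, here is a minimal sketch of what the two repr values above suggest. The byte string is simply the UTF-8 encoding of the unicode string, so both already hold the same wrong characters before anything is written to disk. The repair step is only an assumption: it guesses that the page is really ISO-8859-2 (where byte 0xB3 is the Polish l with stroke) and was decoded as Latin-1. The name fixed is hypothetical.

```python
# The two values from the post, written so the sketch runs on Python 2 and 3.
text = b'Branie zak\xc2\xb3adnik\xc3\xb3w'   # byte string (UTF-8 bytes)
utext = u'Branie zak\xb3adnik\xf3w'          # unicode string

# The byte string is just the UTF-8 encoding of the unicode string,
# so both contain the same (wrong) characters.
assert text.decode('utf-8') == utext

# Assumption: the page was ISO-8859-2 but got decoded as Latin-1.
# Undo the wrong decode, then decode again with the assumed charset.
fixed = utext.encode('latin-1').decode('iso-8859-2')

print(repr(fixed))
```

If the guess about the charset is right, the character at index 10 of fixed is U+0142 instead of U+00B3. If the page uses a different charset, the second codec name would need to change.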
Now I am trying to save this to a file but I never get the encoding
right. Here is what I tried (plus lots of different things with encode,
decode...):
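# Attempt 1: write the byte string as-is
outFile = open(filePath, "w")
outFile.write(text)
outFile.close()
# Attempt 2: write the unicode string through a UTF-8 codec
import codecs
outFile = codecs.open(filePath, "w", "UTF8")
outFile.write(utext)
outFile.close()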
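A minimal sketch of the write step on its own, assuming utext already holds the correct characters (here typed in by hand as escapes). It uses io.open, which encodes unicode text on write in both Python 2 and Python 3. The path file_path and the file name out.txt are hypothetical stand-ins for filePath.

```python
import io
import os
import tempfile

# Hypothetical output path; the real filePath is not shown in the post.
file_path = os.path.join(tempfile.mkdtemp(), 'out.txt')

# Assumed correct text: U+0142 (l with stroke) and U+00F3 (o with acute).
utext = u'Branie zak\u0142adnik\xf3w'

# io.open encodes the unicode string to UTF-8 as it is written.
with io.open(file_path, 'w', encoding='utf-8') as out_file:
    out_file.write(utext)

# Read the raw bytes back to see what actually landed on disk.
with io.open(file_path, 'rb') as in_file:
    raw = in_file.read()

# Decoding the file contents as UTF-8 should give back the same string.
assert raw.decode('utf-8') == utext
```

Checking the raw bytes like this separates two questions: whether the string was right before writing, and whether the write itself changed anything.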
Thanks!!