patrick.waldo
Hi Everyone,

I am using Python 2.4 to convert an Excel spreadsheet to a
pipe-delimited text file, and some of the cells contain non-ASCII
characters. I solved this problem in a very unintuitive way and I
wanted to ask why it works. If I do

    csvfile.write(cell.encode("utf-8"))

I get a UnicodeDecodeError. However, if I do

    c = unicode(cell.encode("utf-8"),"utf-8")
    csvfile.write(c)

it works. Why should I have to encode the cell to utf-8 and then turn
it back into unicode in order to write it to a text file? Is there a
more intuitive way to get around these bothersome unicode errors?

Thanks for any advice,
Patrick

Code:

# -*- coding: utf-8 -*-
import xlrd,codecs,os

xls_file = "/home/pwaldo2/work/docpool_plone/2008-12-4/EU-2008-12-4.xls"
book = xlrd.open_workbook(xls_file)
bibliography_sheet = book.sheet_by_index(0)
csv = os.path.split(xls_file)[0] + '/' + os.path.split(xls_file)[1][:-4] + '.csv'
csvfile = codecs.open(csv,'w',encoding='utf-8')
rowcount = 0
data = []
while rowcount<bibliography_sheet.nrows:
    data.append(bibliography_sheet.row_values(rowcount,
                                              start_colx=0,end_colx=None))
    rowcount+=1
for row in data:
    for cell in row:
        #csvfile.write(cell.encode("utf-8")) This causes the UnicodeDecodeError
        c = unicode(cell.encode("utf-8"),"utf-8")
        csvfile.write(c)
        csvfile.write('|')
    csvfile.write('\r\n')
csvfile.close()
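
One possible reading of the behaviour above, offered as a sketch rather than a definitive answer: a file opened with codecs.open(..., encoding='utf-8') does the encoding itself, so it expects unicode objects. xlrd already returns text cells as unicode, so they can probably be written as they are. Handing the writer an already-encoded byte string makes Python 2 first decode it with the default ASCII codec, which is the likely source of the UnicodeDecodeError. The helper name write_pipe_delimited and the sample rows are made up for illustration, and unlike the loop above this version puts the pipe between cells rather than after every cell.

```python
import codecs

def write_pipe_delimited(rows, path):
    """Write rows of cell values to path as pipe-delimited UTF-8 text.

    Hypothetical helper, not part of xlrd. It assumes text cells are
    already unicode objects (as xlrd returns them) and that other
    values, such as floats, are fine to format with %s.
    """
    # The codecs writer encodes to UTF-8 on write, so it is given
    # unicode text only, never pre-encoded byte strings.
    out = codecs.open(path, 'w', encoding='utf-8')
    try:
        for row in rows:
            # u'%s' leaves unicode cells unchanged and turns numbers into text.
            cells = [u'%s' % (cell,) for cell in row]
            out.write(u'|'.join(cells))
            out.write(u'\r\n')
    finally:
        out.close()
```

With the variables from the script above, the call would look like write_pipe_delimited(data, csv), which would replace both the nested loop and the encode/unicode round trip.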