Peter Wilkinson
Hello list members,
I am using the encode() method to convert Unicode to ASCII. At one point
this code was working just fine; however, it has now broken.
I am reading a text file that is in Unicode (I am unsure of which
flavour or bit depth). As I read in the file one line at a time
(readlines()), I convert each line to ASCII. Simple enough. At the same time
I am compressing to bz2 with the bz2 module, but that part works just fine.
The code and the error reported appear below. I am unsure what to do.
I assume that, because it reports that the ordinal is not in range, this
has something to do with the character width of what I am reading?
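The 0xff in position 0 makes me wonder whether the file starts with a byte order mark (a UTF-16 little-endian BOM is the bytes FF FE). Here is a rough sketch of how that could be checked, written for current Python rather than the 2.3 shown in my traceback. guess_encoding_from_bom is just a name made up for illustration, and a file without a BOM would still need its encoding worked out some other way:

```python
import codecs

# Checked longest-first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so UTF-32 has to be tested before UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(raw_bytes):
    """Return a codec name if raw_bytes starts with a known BOM, else None.

    Hypothetical helper: it only recognises files that carry a BOM.
    """
    for bom, name in _BOMS:
        if raw_bytes.startswith(bom):
            return name
    return None
```

Reading the first four bytes of the file in binary mode and passing them to this helper would be enough to tell whether a BOM is present.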
Peter W.
def encode_file(file_path, encode_type, compress='N'):
    """
    Changes encoding of file
    """
    new_encode = encode_type
    old_file_path = file_path + '.old'
    new_file_path = file_path
    os.rename(file_path,old_file_path)
    file_in = file(old_file_path,'r')
    if compress == 'Y' or compress == 'y':
        bz_file_path = file_path + '.bz2'
        bz_file_out = bz2.BZ2File(bz_file_path, 'w')
        for line in file_in.readlines():
            bz_file_out.write(line.encode(new_encode))
        bz_file_out.close()
    else:
        file_out = file(file_path,'w')
        for line in file_in.readlines():
            file_out.write(line.encode(new_encode))
        file_out.close()
    file_in.close()
    os.remove(old_file_path)
ERROR Reported:
Parsing
X:\GenomeQuebec_repository\microarray\HIS\M15K\Step_1_repository\HISH0224.txt
Traceback (most recent call last):
  File "C:\Program Files\ActiveState Komodo 2.5\callkomodo\kdb.py", line 433, in _do_start
    self.kdb.run(code_ob, locals, locals)
  File "C:\Python23\lib\bdb.py", line 350, in run
    exec cmd in globals, locals
  File "C:\Python23\Lib\site-packages\xBio\Scripts\unicodeToAscii.py", line 158, in ?
    main()
  File "C:\Python23\Lib\site-packages\xBio\Scripts\unicodeToAscii.py", line 75, in main
    encode_file(fileToProcess, options.encode, 'Y')
  File "C:\Python23\Lib\site-packages\xBio\Scripts\unicodeToAscii.py", line 144, in encode_file
    bz_file_out.write(line.encode(new_encode))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
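My understanding (which may be incomplete) is that calling encode() on a plain byte string makes Python 2 first decode it as ASCII, which would explain a UnicodeDecodeError coming out of an encode call. Below is a minimal sketch of decoding explicitly before encoding, assuming the input really is UTF-16 and written for current Python rather than 2.3. transcode_to_ascii_bz2 and its parameters are hypothetical names, and replacing unencodable characters with '?' is a choice made for the sketch, not something the code above does:

```python
import bz2
import io


def transcode_to_ascii_bz2(src_path, dst_path, source_encoding="utf-16"):
    """Decode src_path as source_encoding and write ASCII bytes to a bz2 file.

    Hypothetical sketch: source_encoding is an assumption about the input.
    """
    # Opening in text mode with an explicit encoding yields decoded text,
    # so the later encode() call never has to guess how to decode bytes.
    with io.open(src_path, "r", encoding=source_encoding) as file_in:
        bz_out = bz2.BZ2File(dst_path, "w")
        try:
            for line in file_in:
                # Characters with no ASCII equivalent become '?'.
                bz_out.write(line.encode("ascii", "replace"))
        finally:
            bz_out.close()
```

Unlike encode_file above, this sketch leaves the source file in place rather than renaming and deleting it, so a failed conversion does not lose the input.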