Baptiste Lepilleur
I activated httplib debug output, and when the traces are printed, a
UnicodeError exception is thrown. I have already set sys.stdout to use utf-8
encoding (this removed the exception when *I* was printing unicode), but from
the stack trace below, the encoding seems to have magically switched to
'ascii' when httplib does the printing...
import codecs
import sys
import urllib2
sys.stdout = codecs.getwriter("utf-8")(sys.__stdout__)
....
def fetchURL( url ):
    request = urllib2.Request( url )
    # debuglevel=1 makes httplib print the request and response headers
    opener = urllib2.build_opener( urllib2.HTTPHandler(debuglevel=1) )
    feeddata = opener.open(request)
    data = feeddata.read()
    return data.decode( 'utf-8', 'replace' )
....
Here is the traceback excerpt:
  File "updater.py", line 120, in getStoryChapter
    content = fetchURL( url )
  File "updater.py", line 43, in fetchURL
    feeddata = opener.open(request)
  File "D:\python24\lib\urllib2.py", line 358, in open
    response = self._open(req, data)
  File "D:\python24\lib\urllib2.py", line 376, in _open
    '_open', req)
  File "D:\python24\lib\urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "D:\python24\lib\urllib2.py", line 1021, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "D:\python24\lib\urllib2.py", line 994, in do_open
    r = h.getresponse()
  File "D:\python24\lib\httplib.py", line 863, in getresponse
    response.begin()
  File "D:\python24\lib\httplib.py", line 365, in begin
    print "header:", hdr,
  File "D:\python24\lib\codecs.py", line 178, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x89 in position 840: ordinal not in range(128)
Why is the print statement using the 'ascii' codec instead of utf-8? Is
there a way to ensure that print always works (I'm just using it for
debugging purposes)?
I'm using Python 2.4.2.
Baptiste.
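A likely explanation, offered as a sketch rather than a confirmed diagnosis: the error is a UnicodeDecodeError, not an encode error. In Python 2, the writer returned by codecs.getwriter("utf-8") encodes whatever it is handed. When httplib prints a byte string (str) rather than unicode, the UTF-8 encoder first has to convert it to unicode, and that implicit conversion uses the default 'ascii' codec, which fails on byte 0x89. One possible workaround is a small wrapper that decodes byte strings explicitly before encoding. The class name TolerantWriter and the in-memory stream are hypothetical, chosen only for illustration:

```python
import io

# Text type on both Python 2 (unicode) and Python 3 (str).
TEXT_TYPE = type(u"")


class TolerantWriter(object):
    """Hypothetical stdout wrapper that accepts byte strings and text alike."""

    def __init__(self, stream, encoding="utf-8"):
        self.stream = stream      # underlying *byte* stream
        self.encoding = encoding

    def write(self, data):
        if not isinstance(data, TEXT_TYPE):
            # Byte string: decode explicitly so no implicit 'ascii' decode
            # happens; undecodable bytes such as 0x89 become U+FFFD.
            data = data.decode(self.encoding, "replace")
        self.stream.write(data.encode(self.encoding, "replace"))

    def __getattr__(self, name):
        # Delegate flush(), close(), etc. to the wrapped stream.
        return getattr(self.stream, name)


# Demonstration against an in-memory byte stream (an assumption, so the
# sketch can be run without touching the real stdout).
buffer = io.BytesIO()
writer = TolerantWriter(buffer)
writer.write(b"header: \x89PNG")   # raw bytes, like httplib's debug output
writer.write(u"caf\u00e9")         # text, like the script's own prints
```

On Python 2, installing it would presumably look like sys.stdout = TolerantWriter(sys.__stdout__) in place of the codecs.getwriter line. The demonstration uses the io module and b"" literals, which need Python 2.6 or later; the class itself relies on neither.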