philip20060308
Hi all,
Has anyone ever seen Python 2.4.1's httplib choke when reading chunked
content? I'm using it via urllib2, and I ran into a particular server
that returns something that httplib doesn't expect. Specifically, in
the code below where the error occurs, line == ''.
Python 2.4.1 (#2, Oct 12 2005, 01:36:32)
[GCC 3.4.4 [FreeBSD] 20050518] on freebsd6
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/socket.py", line 285, in read
    data = self._sock.recv(recv_size)
  File "/usr/local/lib/python2.4/httplib.py", line 456, in read
    return self._read_chunked(amt)
  File "/usr/local/lib/python2.4/httplib.py", line 495, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int():
I'm running Python 2.4.1 under FreeBSD 6.0. Interestingly, I can't
recreate the problem using Python 2.3 under OS X.
I've done a little digging for clues. First, the response headers
include:
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
I reckon that if that popular server were sending out broken chunked
content, it'd be a well-known problem, but that doesn't seem to be the
case. So I assume (big assumption) that it is sending correct
responses. Another clue is that the content fits all in one chunk.
Under my 2.3 installation (where I can fetch the content successfully),
len(content) == 0x303. The first chunk size reported by the server is
0x311, so I guess that adds up when one adds a fudge factor for \r\n
and so forth.
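For reference, the two hex figures work out as follows. This is only the arithmetic, written for a current Python 3; the variable names are made up for illustration, and the snippet says nothing about where the extra bytes come from.

```python
# Figures quoted in the post; the names are illustrative only.
content_len = 0x303   # len(content) as fetched under Python 2.3
chunk_size = 0x311    # first chunk size reported by the server

gap = chunk_size - content_len
print(content_len, chunk_size, gap)  # 771 785 14
```

So the gap is 14 bytes, which is more than one \r\n (2 bytes) on its own would account for.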
My guess is that httplib is somehow reading the blank line that
signifies the end of chunked content as part of the content. I don't
know enough about debugging HTTP conversations to go any further. Can
anyone at least confirm the problem elsewhere?
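To make that guess concrete, here is a minimal sketch of chunked decoding. It is written for a current Python 3 and is not httplib's actual code; the function name read_chunked and the sample byte strings are invented for illustration. It only shows that a blank line arriving where a chunk size is expected produces the same kind of ValueError as in the traceback.

```python
import io


def read_chunked(fp):
    """Decode an HTTP chunked body from a binary file-like object.

    Hypothetical helper for illustration; not httplib's implementation.
    """
    parts = []
    while True:
        line = fp.readline()
        # Drop any chunk extension after ';' and the trailing CRLF.
        size_field = line.split(b";", 1)[0].strip()
        # An empty size field makes int() raise ValueError.
        chunk_left = int(size_field, 16)
        if chunk_left == 0:
            break  # a zero-size chunk marks the end of the body
        parts.append(fp.read(chunk_left))
        fp.read(2)  # discard the CRLF that terminates the chunk data
    return b"".join(parts)


# A well-formed body: one 5-byte chunk, then the terminating 0 chunk.
good = b"5\r\nhello\r\n0\r\n\r\n"
# Same body with a stray blank line where the next chunk size belongs.
bad = b"5\r\nhello\r\n\r\n0\r\n\r\n"

print(read_chunked(io.BytesIO(good)))
try:
    read_chunked(io.BytesIO(bad))
except ValueError as exc:
    print("ValueError:", exc)
```

In this sketch the failure appears whenever the reader and the sender disagree about where a chunk ends, which is consistent with the guess above but does not show which side is at fault.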
Thanks
Philip
PS - The email address with which this was posted is live; you can also
email Philip Semanchuk: my first name @ my last name .com