Trouble with seek(0) command

Randy Gamage · Nov 7, 2003

I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?

The error message says: AttributeError: addinfourl instance has no attribute
'seek'

But in PythonWin IDE, when I type response and then a ".", the popup options
include both read and seek. What's going on?
Here's the code:

#!/usr/bin/python
import urllib2, string

def Title(response):
    # Returns the title of a web page
    page = response.read()
    page = page[string.find(page,'<title>'):string.find(page,'</title>')]
    page = page[string.find(page,'>')+1:]
    response.seek(0) # This causes an error - WHY?
    return page

strurl = 'http://www.gamatronix.com'
resp = urllib2.urlopen(strurl)
print Title(resp)
# Without the seek command, this will return nothing,
# because the pointer is at the end
print resp.read()

Please copy me on responses.

Thanks,
Randy

Peter Hansen · Nov 7, 2003

Randy said:
I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?

The error message says: AttributeError: addinfourl instance has no attribute
'seek'

But in PythonWin IDE, when I type response and then a ".", the popup options
include both read and seek. What's going on?

PythonWin is probably lying to you.
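
One way to check what an object really offers at run time, rather than trusting the IDE popup, is to ask the object directly. A small sketch follows; ReadOnlyStream and supports_seek are hypothetical names made up for illustration, and ReadOnlyStream only stands in for the urllib2 response so the example runs without a network:

```python
import io


class ReadOnlyStream(object):
    """Hypothetical stand-in for an object that offers read() but no seek()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # Hand back whatever is left, then remember that it is gone.
        data, self._data = self._data, b""
        return data


def supports_seek(obj):
    """Return True if obj has a callable 'seek' attribute right now."""
    return callable(getattr(obj, "seek", None))


no_seek = supports_seek(ReadOnlyStream(b"<title>x</title>"))
has_seek = supports_seek(io.BytesIO(b"<title>x</title>"))
```

Running the same check on the real response object at the interactive prompt should show whether seek is actually there.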

John J. Lee · Nov 8, 2003

Randy Gamage said:
I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?

[...]

urllib2's response objects just don't have a seek. They *can't* have
a seek without caching data, since the data in question is getting read
from a socket: it just ain't there any more after you've .read() it!

You can just make sure you always keep data read from a response
object, so you can reuse it later -- but that is an annoyance.
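
A minimal sketch of the keep-the-data approach: read once, hold on to the bytes, and wrap them in an in-memory file if something downstream needs a seekable object. FakeResponse and title_of are hypothetical names for illustration; FakeResponse stands in for the urllib2 response so the example runs without a network.

```python
import io


class FakeResponse(object):
    """Hypothetical stand-in for a urllib2 response: one-shot read(), no seek()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # Like a socket, the data is gone once it has been read.
        data, self._data = self._data, b""
        return data


def title_of(page):
    """Return the bytes between <title> and </title>, or b"" if absent."""
    start = page.find(b"<title>")
    end = page.find(b"</title>")
    if start == -1 or end == -1:
        return b""
    return page[start + len(b"<title>"):end]


response = FakeResponse(b"<html><title>Example</title></html>")
page = response.read()         # read from the response exactly once
title = title_of(page)         # parse the saved copy, not the response
buffered = io.BytesIO(page)    # seekable in-memory copy of the same bytes
first = buffered.read()
buffered.seek(0)               # fine here: BytesIO keeps the data around
second = buffered.read()
```

With this shape, a function like Title can take the saved page data instead of the response object, so nothing ever needs to rewind the socket.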

If you want a response object that *does* cache, and allow seeking,
you could pinch seek_wrapper from ClientCookie
(http://wwwsearch.sf.net/):

response = seek_wrapper(urllib2.urlopen(url))
page = response.read()
....
response.seek(0)
....

I think Andrew Dalke has also posted a similar thing to seek_wrapper,
that only allows .seek(0). Called ReseekFile, or something similar.
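
To illustrate the idea of a caching wrapper that only supports seek(0), here is a rough sketch. It is not the code of seek_wrapper or ReseekFile; RewindableResponse and OneShot are hypothetical names, and OneShot stands in for the urllib2 response so the example runs without a network.

```python
class OneShot(object):
    """Hypothetical stand-in for a response whose data is gone after read()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        data, self._data = self._data, b""
        return data


class RewindableResponse(object):
    """Hypothetical wrapper: caches the whole body on first read(), allows seek(0)."""

    def __init__(self, response):
        self._response = response
        self._cache = None   # filled in on the first read()
        self._pos = 0        # current position within the cache

    def read(self):
        if self._cache is None:
            self._cache = self._response.read()
        data = self._cache[self._pos:]
        self._pos = len(self._cache)
        return data

    def seek(self, offset):
        # Only rewinding to the start is supported in this sketch.
        if offset != 0:
            raise ValueError("only seek(0) is supported")
        self._pos = 0


wrapped = RewindableResponse(OneShot(b"<title>Example</title>"))
first = wrapped.read()
at_end = wrapped.read()   # nothing left until we rewind
wrapped.seek(0)
again = wrapped.read()
```

The cost is that the whole body stays in memory for as long as the wrapper lives, which is the caching trade-off mentioned above.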

Or just use ClientCookie itself:

from ClientCookie import build_opener, SeekableProcessor

o = build_opener(SeekableProcessor)

response = o.open(url)
page = response.read()
....
response.seek(0)
....

(or, if you prefer, you can ClientCookie.install_opener(o) so you can
do ClientCookie.urlopen(url) instead of o.open(url))

John
