Trouble with seek(0) command

Randy Gamage

I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?

The error message says: AttributeError: addinfourl instance has no attribute
'seek'

But in PythonWin IDE, when I type response and then a ".", the popup options
include both read and seek. What's going on?
Here's the code:

#!/usr/bin/python
import urllib2, string

def Title(response):
    # Returns the title of a web page
    page = response.read()
    page = page[string.find(page,'<title>'):string.find(page,'</title>')]
    page = page[string.find(page,'>')+1:]
    response.seek(0) # This causes an error - WHY?
    return page

strurl = 'http://www.gamatronix.com'
resp = urllib2.urlopen(strurl)
print Title(resp)
# Without the seek command, this will return nothing,
# because the pointer is at the end
print resp.read()

Please copy me on responses.

Thanks,
Randy
 
Peter Hansen

Randy said:
I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?

The error message says: AttributeError: addinfourl instance has no attribute
'seek'

But in PythonWin IDE, when I type response and then a ".", the popup options
include both read and seek. What's going on?

PythonWin is probably lying to you.
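
Whatever the IDE popup suggests, you can ask the object itself at runtime. The snippet below is only a sketch: ReadOnlyStream and supports_seek are made-up names for illustration, and ReadOnlyStream merely stands in for a response object that offers read() but no seek().

```python
import io


class ReadOnlyStream(object):
    """Hypothetical stand-in for a response that has read() but no seek()."""

    def __init__(self, data):
        self._fp = io.BytesIO(data)

    def read(self, size=-1):
        return self._fp.read(size)


def supports_seek(obj):
    """Return True only if obj really exposes a callable seek attribute."""
    return callable(getattr(obj, "seek", None))


stream = ReadOnlyStream(b"<title>Example</title>")
has_seek_stream = supports_seek(stream)               # no seek on the stand-in
has_seek_bytesio = supports_seek(io.BytesIO(b"abc"))  # BytesIO does have one
```

Running the same check on the real response object should tell you whether seek is actually there, regardless of what the editor lists.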
 
John J. Lee

Randy Gamage said:
I can't figure out why this script gets an error. This is a script that gets
a web page, then parses the title out of the web page. When it's done
parsing, I would like to reset the pointer to the beginning of the response
file object, but the seek(0) command does not work. Anybody know why?
[...]

urllib2's response objects just don't have a seek. They *can't* have
a seek without caching data, since the data in question is getting read
from a socket: it just ain't there any more after you've .read() it!

You can just make sure you always keep data read from a response
object, so you can reuse it later -- but that is an annoyance.
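
A rough sketch of that "keep the data" approach: read the body once, hold on to the bytes, and hand out an in-memory copy that can be rewound. The names extract_title, read_once and fake_response are hypothetical, and fake_response is just a stand-in for whatever urlopen() returns so the example runs without a network.

```python
import io


def extract_title(page):
    """Return the bytes between <title> and </title>, or b'' if not found."""
    start = page.find(b"<title>")
    end = page.find(b"</title>")
    if start == -1 or end == -1 or end < start:
        return b""
    return page[start + len(b"<title>"):end]


def read_once(response):
    """Read the whole body a single time; return (data, seekable copy)."""
    data = response.read()
    return data, io.BytesIO(data)


# Stand-in for a real response object (assumption: it only needs read()).
fake_response = io.BytesIO(
    b"<html><head><title>Gamatronix</title></head></html>"
)

data, buf = read_once(fake_response)
title = extract_title(data)
```

After this, buf.read() and buf.seek(0) behave like an ordinary file, and data can be reused directly without touching the response again.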

If you want a response object that *does* cache, and allow seeking,
you could pinch seek_wrapper from ClientCookie
(http://wwwsearch.sf.net/):

response = seek_wrapper(urllib2.urlopen(url))
page = response.read()
....
response.seek(0)
....


I think Andrew Dalke has also posted a similar thing to seek_wrapper,
that only allows .seek(0). Called ReseekFile, or something similar.
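
To give the flavour of such a wrapper, here is a minimal sketch that caches whatever has been read and supports only seek(0). It is not the actual seek_wrapper or ReseekFile code; RewindableReader and OneShot are hypothetical names, and OneShot just simulates a stream with no seek.

```python
import io


class OneShot(object):
    """Hypothetical read-only stream used to simulate a socket-backed response."""

    def __init__(self, data):
        self._fp = io.BytesIO(data)

    def read(self, size=-1):
        return self._fp.read(size)


class RewindableReader(object):
    """Caches every byte read from fp so that seek(0) can replay it."""

    def __init__(self, fp):
        self._fp = fp
        self._cache = io.BytesIO()  # bytes already pulled from fp

    def read(self, size=-1):
        # Serve previously cached bytes first, then fall through to fp.
        cached = self._cache.read(size)
        if size is None or size < 0:
            fresh = self._fp.read()
        else:
            remaining = size - len(cached)
            fresh = self._fp.read(remaining) if remaining > 0 else b""
        if fresh:
            # Append new bytes to the cache; position ends up at the end.
            self._cache.seek(0, io.SEEK_END)
            self._cache.write(fresh)
        return cached + fresh

    def seek(self, offset, whence=0):
        if offset != 0 or whence != 0:
            raise ValueError("only seek(0) is supported in this sketch")
        self._cache.seek(0)


resp = RewindableReader(OneShot(b"<title>Hi</title>body"))
first = resp.read()
resp.seek(0)
second = resp.read()  # same bytes again, served from the cache
```

The cost is that the whole body stays in memory, which is the trade-off any caching response has to make.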


Or just use ClientCookie itself:

from ClientCookie import build_opener, SeekableProcessor

o = build_opener(SeekableProcessor)

response = o.open(url)
page = response.read()
....
response.seek(0)
....

(or, if you prefer, you can ClientCookie.install_opener(o) so you can
do ClientCookie.urlopen(url) instead of o.open(url))


John
 
