Brian D
I'm actually using mechanize, but that's too complicated for testing
purposes. Instead, the urllib2 sample below simulates an attempt to
test for a valid URL request.
I'm attempting to write a loop that traps failed attempts to request a
URL (for cases where the connection intermittently fails), and retries
the request a few times, giving up after the Nth attempt.
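The shape I have in mind is roughly the bounded-retry sketch below.
(fetch and max_attempts are placeholder names I've made up for this
sketch; they're not in my actual code.)

import urllib2

def fetch(url, headers, max_attempts=5):
    # Sketch only: try the request up to max_attempts times and return
    # the response on the first success, or None if every attempt fails.
    for attempt in range(1, max_attempts + 1):
        try:
            request = urllib2.Request(url, None, headers)
            return urllib2.urlopen(request)
        except urllib2.URLError, e:
            print 'attempt %d failed: %s' % (attempt, e)
    return None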
Specifically, in the example below, a bad URL is requested on the
first and second iterations. On the third iteration, a valid URL is
requested, and it continues to be requested until the 5th iteration,
when a break statement stops the loop. The 5th iteration also restores
the values to their original state so the snippet is easy to re-run.
What I don't understand is how to test for a valid URL request and
then jump out of the "while True" loop to proceed to the code below
the loop. There's probably faulty logic in this approach. I imagine I
should wrap the URL request in a function, and perhaps store the
response as a global variable.
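For instance, is the right pattern something like the sketch below,
where a successful request breaks out and the code after the loop
picks up the stored response? (The url and headers values here are
stand-ins for this sketch.)

import urllib2

url = 'http://www.google.com'            # stand-in URL for this sketch
headers = {'User-Agent': 'Mozilla/5.0'}  # minimal header for this sketch

response = None
count = 0
while True:
    count += 1
    try:
        request = urllib2.Request(url, None, headers)
        response = urllib2.urlopen(request)
        break                    # success: leave the loop immediately
    except urllib2.URLError:
        print 'fail ' + str(count)
        if count == 5:           # give up after the 5th attempt
            break
# execution continues here; response is None if every attempt failed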
This is really more of a basic Python logic question than it is a
urllib2 question.
Any suggestions?
Thanks,
Brian
import urllib2

user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16) ' \
             'Gecko/2009120208 Firefox/3.0.16 (.NET CLR 3.5.30729)'
headers = {'User-Agent': user_agent}

url = 'http://this is a bad url'  # deliberately invalid to force failures

count = 0
while True:
    count += 1
    try:
        print 'attempt ' + str(count)
        request = urllib2.Request(url, None, headers)
        response = urllib2.urlopen(request)
        if response:
            print 'True response.'
        if count == 5:
            # Restore the starting values for ease of repeat execution
            count = 0
            url = 'http://this is a bad url'
            print 'How do I get out of this thing?'
            break
    except:  # bare except: traps any failure, including the bad URL above
        print 'fail ' + str(count)
        if count == 3:
            url = 'http://www.google.com'  # switch to a valid URL