Volker M. said:
I want to open a list of URLs automatically with Python's urllib and the
function open(URL). It is important that the program opens ONLY
normal http sites and no https sites with a user/password request.
So is there a way I could cancel all site requests that pop up
user/password dialogues?
Assuming you mean you don't want to handle Basic HTTP Authentication
(and you don't care whether the site is http or https), you can use
urllib2.urlopen() instead of urllib.urlopen(). You will then get a
urllib2.HTTPError with a .code of 401 when a site wants Basic
Authentication.
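In other words, roughly this (just a sketch; url is assumed to be one of
your URL strings):

import urllib2

try:
    response = urllib2.urlopen(url)
except urllib2.HTTPError, e:
    if e.code == 401:
        print "%s wants Basic Authentication -- skipping" % url
    else:
        raise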
If you do mean https, though, again with urllib2:
import urllib2

class NullHTTPSHandler(urllib2.HTTPSHandler):
    # Returning None means this handler declines to open https URLs,
    # so no https request is ever sent.
    def https_open(self, request):
        return None

o = urllib2.build_opener(NullHTTPSHandler())
response = o.open(url)
In general, urllib2 splits up the job of opening URLs into handlers,
so it's more 'turn-off-and-on-able' than urllib.
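Putting the two together, a loop over your URL list might look roughly
like this (urls is just an assumed list of URL strings; with
NullHTTPSHandler installed, an https URL should fail with a
urllib2.URLError instead of being fetched):

import urllib2

o = urllib2.build_opener(NullHTTPSHandler())

for url in urls:
    try:
        response = o.open(url)
    except urllib2.HTTPError, e:
        if e.code == 401:
            continue  # site wants a user/password -- skip it
        raise
    except urllib2.URLError:
        continue  # https (or otherwise unopenable) URL -- skip it
    data = response.read()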
Since you're writing a robot, one other thing: the alpha version of my
ClientCookie package (a urllib2 replacement with add-ons) contains code
for obeying robots.txt files (albeit not yet well tested, IIRC):
import ClientCookie
o = ClientCookie.build_opener(ClientCookie.HTTPRobotRulesProcessor())
response = o.open(url)
Some time soon I'll have to make a distribution of this stuff that
works properly with Python 2.4 (whose urllib2 already includes changes
taken from ClientCookie)...
John