urllib2 through basic auth'ed proxy

  • Thread starter Alejandro Dubrovsky

Alejandro Dubrovsky

I see from googling around that this is a popular topic, but I haven't seen
anyone saying "ah, yes, that works", so here it goes.

How does one connect through a proxy which requires basic authorisation?
The following code, stolen from somewhere, fails with a 407:

import urllib2

proxy_handler = urllib2.ProxyHandler({"http": "http://the.proxy.address:3128"})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
# Arguments: realm, host, username, password
proxy_auth_handler.add_password(
    "The name of the realm sniffed from telnetting to the proxy and doing a get",
    'the.proxy.address', 'theusername', 'thepassword')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
urllib2.install_opener(opener)
f = urllib2.urlopen('http://www.google.com/')


I still get a 407 if I set the realm to None, or if I change the host to the
'http://the.proxy.address/' form or even the 'http://the.proxy.address:3128'
form.

The proxy is squid. Python version is 2.3.4 (I read that this version has a
problem in that it introduces an extra return after the authorisation, but
it isn't even getting to that bit). And yes, going through firefox,
everything works fine.

Can anyone explain to me why this fails or, more importantly, post code that
would work?

Thanks,
alejandro
 
John J. Lee

Alejandro Dubrovsky said:
How does one connect through a proxy which requires basic authorisation?
The following code, stolen from somewhere, fails with a 407:
[...code involving urllib2.ProxyBasicAuthHandler()...]
Can anyone explain to me why this fails or, more importantly, post code that
would work?

OK, I finally installed squid and had a look at the urllib2 proxy
basic auth support (which I've steered clear of for years despite
doing quite a bit with urllib2). Seems quite broken. Appears to have
been broken back in December 2004, with revision 38092 (note there's a
little revision number oddness in the Python SVN repo, BTW:
http://mail.python.org/pipermail/python-dev/2005-November/058269.html):

--- urllib2.py (revision 38091)
+++ urllib2.py (revision 38092)
@@ -720,7 +720,10 @@
                     return self.retry_http_basic_auth(host, req, realm)
 
     def retry_http_basic_auth(self, host, req, realm):
-        user,pw = self.passwd.find_user_password(realm, host)
+        # TODO(jhylton): Remove the host argument? It depends on whether
+        # retry_http_basic_auth() is consider part of the public API.
+        # It probably is.
+        user, pw = self.passwd.find_user_password(realm, req.get_full_url())
         if pw is not None:
             raw = "%s:%s" % (user, pw)
....


That can't be right, can it? With a proxy, you're always
authenticating yourself for the whole proxy, and you want to look up
the password by the proxy's host, not by the URL being fetched
(RFC 2617 section 3.2.1). The ProxyBasicAuthHandler subclass
dutifully passes in the right thing for the host argument, but
AbstractBasicAuthHandler ignores it, which means that it never finds
the password -- e.g. if you're trying to connect to python.org through
myproxy.com, it'll be looking for a username/password for python.org
instead of the needed myproxy.com.
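
A minimal sketch of the lookup mismatch described above, using the stock HTTPPasswordMgr on its own with no proxy or network involved. The realm name and the proxy port here are assumptions made up for illustration, and the snippet only imitates the two lookup keys (proxy host versus requested URL); it does not run the handler code itself.

```python
# Import that works on both Python 2 (urllib2) and Python 3 (urllib.request).
try:
    from urllib2 import HTTPPasswordMgr
except ImportError:
    from urllib.request import HTTPPasswordMgr

REALM = "proxy realm"            # hypothetical realm name
PROXY_HOST = "myproxy.com:3128"  # hypothetical proxy host:port

mgr = HTTPPasswordMgr()
mgr.add_password(REALM, PROXY_HOST, "john", "blah")

# Lookup keyed by the proxy host: the stored credentials are found.
found_by_proxy = mgr.find_user_password(REALM, PROXY_HOST)

# Lookup keyed by the URL being fetched, as in the changed line of the
# diff (req.get_full_url()): no entry matches, so nothing is found.
found_by_url = mgr.find_user_password(REALM, "http://python.org/")

print(found_by_proxy)
print(found_by_url)
```

The second lookup coming back empty is why the handler never gets as far as retrying the request with credentials, so the 407 is what the caller sees.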

Obviously nobody else uses authenticating proxies either, or at least
nobody who can be bothered to fix urllib2 :-(

A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install):

import urllib2

class DumbProxyPasswordMgr:
    def __init__(self):
        self.user = self.passwd = None
    def add_password(self, realm, uri, user, passwd):
        # Remember a single set of credentials; realm and uri are ignored.
        self.user = user
        self.passwd = passwd
    def find_user_password(self, realm, authuri):
        # Always return the stored credentials, whatever is asked for.
        return self.user, self.passwd

proxy_auth_handler = urllib2.ProxyBasicAuthHandler(DumbProxyPasswordMgr())
proxy_handler = urllib2.ProxyHandler({"http": "http://localhost:3128"})
proxy_auth_handler.add_password(None, None, 'john', 'blah')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
f = opener.open('http://python.org/')
print f.read()


Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

I'll try to get some fixes in tomorrow so that 2.5 isn't broken (or at
least flag the issues to let somebody else fix them), but no promises
as usual...


John
 
John J. Lee

Alejandro Dubrovsky said:
The proxy is squid. Python version is 2.3.4 (I read that this version has a
problem in that it introduces an extra return after the authorisation, but
it isn't even getting to that bit). And yes, going through firefox,
everything works fine.
[...]

FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.

I think the Examples section of the docs on this is wrong too, though
that's a bit of a moot point when the code is as broken as it seems...


John
 
Alejandro Dubrovsky

John said:
FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.

"ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
about that code, but I'm mostly accustomed to my own)
Thanks,
alejandro
 
John J. Lee

Alejandro Dubrovsky said:
"ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
about that code, but I'm mostly accustomed to my own)

Supplying a password surely shouldn't be that complicated...


John
 
John J. Lee

Alejandro Dubrovsky writes:
[...Alejandro complains about non-working HTTP proxy auth in urllib2...]

[...John notes urllib2 bug...]
A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install): [...snip ugly code]
Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!
[...]

In fact the following also works with Python 2.3.4:

import urllib2
proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
print urllib2.build_opener(proxy_handler).open('http://python.org/').read()


....but only just barely skirts around the bugs!-) :-(

(at least, the current bugs: I've no reason to work out what things
were like back in 2.3.4, but the above certainly works with that
version)
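
For what it's worth, here is a rough sketch of what credentials embedded in the proxy URL amount to on the wire under Basic auth: the user:password pair, base64-encoded, sent in a Proxy-Authorization request header. The helper name proxy_auth_header is made up for illustration, the username/password attributes on the parsed URL need Python 2.5 or later, and none of this is a claim about how ProxyHandler is implemented in any particular Python version.

```python
import base64

try:
    from urlparse import urlsplit       # Python 2
except ImportError:
    from urllib.parse import urlsplit   # Python 3


def proxy_auth_header(proxy_url):
    """Return the Proxy-Authorization value for credentials in proxy_url.

    Hypothetical helper, not part of urllib2.
    """
    parts = urlsplit(proxy_url)
    raw = "%s:%s" % (parts.username, parts.password)
    # b64encode adds no trailing newline to the token.
    token = base64.b64encode(raw.encode("ascii")).decode("ascii")
    return "Basic " + token


header = proxy_auth_header("http://john:blah@localhost:3128")
print("Proxy-Authorization: " + header)
```

Sending the header up front like this means the password-manager lookup is never consulted, which would explain how this form skirts around the lookup bug.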


John
 
Alejandro Dubrovsky

John said:
Alejandro Dubrovsky writes:
[...Alejandro complains about non-working HTTP proxy auth in urllib2...]

[...John notes urllib2 bug...]
A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install): [...snip ugly code]
Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!
[...]

In fact the following also works with Python 2.3.4:

import urllib2
proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
print urllib2.build_opener(proxy_handler).open('http://python.org/').read()

It does too. Thanks again. (I think this version is uglier, but easier to
insert into third party code)
 
