SSL (HTTPS) with 2.4

Bloke

Hi all.

Some time ago (years) I had a script on Python 2.2 that would retrieve an
HTTPS web site. I used python22-win32-ssl.zip to handle the SSL aspect
and it worked wonderfully. I am revisiting the project and need to
update it to Python 2.4.1. python22-win32-ssl.zip isn't compatible
(duh) and I can't find a newer version. I have had a search and can't
find anything to point me in the right direction.

Can someone please help?
 
Martin v. Löwis

Bloke said:
Some time ago (years) I had a script on Python 2.2 that would retrieve an
HTTPS web site. I used python22-win32-ssl.zip to handle the SSL aspect
and it worked wonderfully. I am revisiting the project and need to
update it to Python 2.4.1. python22-win32-ssl.zip isn't compatible
(duh) and I can't find a newer version. I have had a search and can't
find anything to point me in the right direction.

Can someone please help?

In Python 2.4, you don't need any additional libraries to do SSL on
the HTTP client side - Python 2.4 comes with SSL included (IIRC, you
didn't need these libraries in Python 2.2, either).

Just use httplib.HTTPS or httplib.HTTPSConnection instead.
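
A minimal sketch of that suggestion, assuming a current Python 3, where
httplib has been renamed http.client. The helper name fetch_headers and the
host passed to it are placeholders, not anything from the original script,
and no proxy is involved here.

```python
import http.client


def fetch_headers(host, path="/", timeout=10):
    """Fetch the status and headers of an HTTPS page, with no proxy.

    `host` and `path` are placeholders supplied by the caller.
    """
    # Constructing the connection object does not open a socket yet.
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)  # connects, then sends the request
        response = conn.getresponse()
        return response.status, response.getheaders()
    finally:
        conn.close()
```

Calling fetch_headers("www.example.com") would return the status code and a
list of (name, value) header pairs, provided the machine has a direct route
to the site.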

Regards,
Martin
 
Bloke

Thanks Martin. That means my code should work.

I am trying to go through a proxy, which works fine for HTTP sites.
However, when I try an HTTPS site, the program doesn't respond for quite
a while, and returns the error:

File "C:\Python24\lib\urllib2.py", line 996, in do_open
raise URLError(err)
URLError: <urlopen error (8, 'EOF occurred in violation of protocol')>

Here is my testing code:

import urllib2

proxy_info = {
    'user': 'me',
    'pass': 'mypassword',
    'host': 'proxy.mycompany.com',
    'port': 8008}  # settings taken from web browser


# build a new opener that uses a proxy requiring authorization
proxy_support = urllib2.ProxyHandler(
    {"https": "https://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)

# install it
urllib2.install_opener(opener)

# use it
f = urllib2.urlopen('https://www.directshares.com.au')
print f.headers
f.close()

Any ideas what is wrong?

Bloke
 
Bloke

I just removed my installation of Python 2.4.1, which was the one on
the python.org web site. I installed the Activepython 2.4.1 and now I
get the following error with the same code above:

File "C:\Python24\lib\urllib2.py", line 1053, in unknown_open
raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: https>

I'm getting a bit frustrated. Do I need to import another library?
Any advice is appreciated.

Bloke.
 
Lucas Raab

Bloke said:
I just removed my installation of Python 2.4.1, which was the one on
the python.org web site. I installed the Activepython 2.4.1 and now I
get the following error with the same code above:

File "C:\Python24\lib\urllib2.py", line 1053, in unknown_open
raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: https>

I'm getting a bit frustrated. Do I need to import another library?
Any advice is appreciated.

Bloke.

I think it's saying it doesn't understand the HTTPS protocol, based on:

raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: https>

--
--------------------------
Lucas Raab
lvraab"@"earthlink.net
dotpyFE"@"gmail.com
AIM: Phoenix11890
MSN: dotpyfe "@" gmail.com
IRC: lvraab
ICQ: 324767918
Yahoo: Phoenix11890
 
Bloke

Yes, on looking into it, socket.ssl is not installed with
ActivePython, so it doesn't recognise https. So I have removed it and
reinstalled v2.4.1, which I downloaded from www.python.org. This
leaves me with the problem where the script 'hangs' for a long
time, then returns:

Traceback (most recent call last):
  File "C:\Documents and Settings\rhall\Desktop\software\python\auth_example_5.py", line 18, in ?
    f = urllib2.urlopen('https://www.directshares.com.au/')
  File "C:\Python24\Lib\urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "C:\Python24\Lib\urllib2.py", line 358, in open
    response = self._open(req, data)
  File "C:\Python24\Lib\urllib2.py", line 376, in _open
    '_open', req)
  File "C:\Python24\Lib\urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "C:\Python24\Lib\urllib2.py", line 1029, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python24\Lib\urllib2.py", line 996, in do_open
    raise URLError(err)
URLError: <urlopen error (8, 'EOF occurred in violation of protocol')>

I'm stuck...
 
Bloke

I just tried the https connection through a friend's internet connection,
which uses a transparent proxy, as follows:

import urllib2
f = urllib2.urlopen('https://www.directshares.com.au/')
print f.headers
print f.read()
f.close()

This works fine. So it must be a problem with either the ProxyHandler
(in Python) or our proxy server. Given that I can connect to the site
through the proxy with a web browser, it must be something to do with
the way I have set up the proxy handler in Python. I have read about
multiple ways of setting up the connection, and this is the only way I
can get it to run with http:

import urllib2

proxy_info = {
    'user': 'me',
    'pass': 'password',
    'host': 'companyproxy.com.au',
    'port': 8008}

# build a new opener that uses a proxy requiring authorization
proxy_support = urllib2.ProxyHandler(
    {"http": "http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)

# install it
urllib2.install_opener(opener)


Is there a better way for me to do this?
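
One possible tidy-up, sketched for a current Python 3 (where urllib2 became
urllib.request) rather than for 2.4. The helper names make_proxy_url and
build_proxy_opener are made up for illustration, and the sketch assumes the
proxy itself speaks plain HTTP on its port, so the same http:// proxy URL is
registered for both schemes. Whether this helps with the 2.4 problem above is
not established here.

```python
import urllib.request
from urllib.parse import quote


def make_proxy_url(user, password, host, port):
    """Build an authenticated proxy URL, escaping special characters."""
    # Percent-encode the credentials so that '@' or ':' inside a password
    # cannot be mistaken for URL delimiters.
    return "http://%s:%s@%s:%d" % (
        quote(user, safe=""), quote(password, safe=""), host, port)


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests via the proxy."""
    # Assumption: the proxy is addressed with the same http:// URL for
    # both the "http" and the "https" entries.
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

Using the returned opener directly, as in opener.open(url), avoids
install_opener and so leaves the process-wide default opener untouched.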
 
Bloke

Following my above comment, if my script works with http, then what is
the problem with https, even when I change the ProxyHandler to specify
https?
 
Martin v. Löwis

Bloke said:
Following my above comment, if my script works with http, then what is
the problem with https, even when I change the ProxyHandler to specify
https?

I believe there is a bug in the https implementations of certain Web
services, in particular the Microsoft-ish ones. They are supposed to
send a message to close the connection, but fail to do so. Instead,
they eventually shut down the connection (without sending a
CloseConnection message first).

So part of the problem is that your web server, in violation of the
protocol, just drops the connection.

The other question is why there is a period of inactivity. This, again,
may be the result of a misunderstanding in the protocol implementations,
of which the first problem is only a side effect (i.e. the server might
close the connection because of the inactivity).

This typically means that either side is expecting the other one to send
a message, but both sides fail to do so. Which one specifically is buggy
can only be determined by studying the messages sent back and forth in
more detail.

As a wild guess: try not to use HTTP/1.1, use HTTP/1.0 instead. To do
so, try not using urllib, use httplib directly.
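
A sketch of that HTTP/1.0 experiment, assuming a current Python 3 (httplib
is now http.client). The class name Http10Connection is hypothetical, and
_http_vsn and _http_vsn_str are private attributes of the standard library
class rather than a documented API, so treat this as an experiment only.

```python
import http.client


class Http10Connection(http.client.HTTPSConnection):
    """HTTPS connection that announces HTTP/1.0 in its request line."""

    # Assumption: these private class attributes control the protocol
    # version the connection sends; they are not a documented interface.
    _http_vsn = 10
    _http_vsn_str = "HTTP/1.0"


def fetch_with_http10(host, path="/"):
    """Send one GET using HTTP/1.0; `host` and `path` are placeholders."""
    conn = Http10Connection(host, timeout=30)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```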

Regards,
Martin
 
Trent Mick

[Bloke wrote]
I just removed my installation of Python 2.4.1, which was the one on
the python.org web site. I installed the Activepython 2.4.1 and now I
get the following error with the same code above:

File "C:\Python24\lib\urllib2.py", line 1053, in unknown_open
raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: https>

I'm getting a bit frustrated. Do I need to import another library?
Any advice is appreciated.

Unfortunately ActivePython cannot include the SSL library by default
because of crypto export regulations. For SSL support in an ActivePython
installation you'd have to add the _ssl.pyd extension separately (either
building it yourself, or finding an available binary).
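
A quick way to check whether an interpreter has SSL support at all, sketched
for a current Python 3. The function name ssl_available is made up for
illustration. On Python 2.4 itself the equivalent check would be
hasattr(socket, "ssl"), which is what decides whether https URLs are
recognised there.

```python
def ssl_available():
    """Return True if this interpreter can import the ssl module."""
    try:
        # The import fails when the underlying _ssl extension is missing.
        import ssl  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("SSL support present:", ssl_available())
```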

Sincerely,
Trent
 
Paul Rubin

Trent Mick said:
Unfortunately ActivePython cannot include the SSL library by default
because of crypto export regulations.

That hasn't been true for several years. In principle you're supposed
to notify the commerce department but in fact they seem to just ignore
the notices:

http://www.bxa.doc.gov/Encryption

Mozilla, MSIE, Windows, etc. all come with crypto by default.

If you want SSL in pure Python, try <http://trevp.net/tlslite>. It's
a really nice piece of code, though (so far) not what I'd call full
featured.
 
Bloke

Thanks Martin.

The problem seems to lie with our company proxy (which requires
authentication). I have tried retrieving the page on another network
with a transparent proxy, and it all works fine. Unfortunately, any
https page I try to retrieve on the company network fails in this way
after a long period of inactivity. However, I can retrieve the
page using a standard browser through the same company network. I
think there must be something weird going on with our proxy server.
 
Andrew Bushnell

I am interested in any further progress you may have made with this. I
am facing a similar issue. In short, I need to connect to an https site
(in this case a WSDL) that I have to access through an HTTP proxy. I
have tried various things to no avail. I did find a couple of recipes on
the ASPN Python Cookbook that talk about tunneling via a "CONNECT"
request, which is what I believe I need to do. The recipes are:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/301740

This particular example establishes the proxy connection, issues the
CONNECT request to reach the https site, then tries to establish an SSL
socket connection to the proxy and talk through it to get the data from
the https site. I get the same EOF error when I run this sample with our
proxy.

The other recipe I looked at was:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/213238

This is a tunneling mechanism. When I try this, I basically get
"nowhere". In other words, the GET request never gets picked up by the
various sockets/servers set up to do the tunneling, and it just loops
around seemingly forever.

I am running on Windows XP if that matters.

Of course, Mozilla and I.E. work fine accessing the https site in
question via our proxy. In short, I simply need to connect to an HTTP
proxy and through it get at a URL which is https. I do not need any
user name or password etc. to get to the site. The standard Python
libraries fail, as the "GET" request through the proxy is dismissed as a
bad request.

Any help is greatly appreciated.

- Andrew Bushnell

--
************************************
Andrew Bushnell
Lead Development Engineer
Fluent Inc.
10 Cavendish Court
Centerra Resource Park
Lebanon, NH 03766
(e-mail address removed)
Phone: 603-643-2600, ext. 757
Fax: 603-643-1721
www.fluent.com
************************************
 
Bloke

Andrew,

It seems I'm not the only one going nuts here. I have just spent the
last 4 hrs stepping through the code in the debugger. It seems to get
stuck somewhere in the socket module (when it calls ssl) but I haven't
yet figured out exactly where.

I am _very_ interested to find that you have the same problem with a
non-authenticating proxy. I had considered I was doing something wrong
with the authentication, but from what you say, and from what I have
deduced from the code, it is not the authentication that is at fault.

Like you, I find a standard browser works fine, so I'm inclined to think
there is something buggy with the way the socket module talks to the proxy.
There has been some suggestion that it may be a 'Microsoftish' proxy
which is at fault, but I believe it is a Squid proxy our company uses.

There is an interesting note here (
http://www.squid-cache.org/Doc/FAQ/FAQ-11.html section 11.34 )
regarding malformed https requests sent through Squid by buggy
clients. It may be worth looking into.

Anyway, if you have any luck, _please_ let me know - I'm getting
desperate.
 
Andrew Bushnell

Thanks for the update. I will keep you posted. I know for a fact we
use a Squid proxy, which sounds like what you are using. I am going to
check out the FAQ you sent and see what it comes up with. I have also
been perusing the net a bit and looking at other client packages, such
as cURL, to see if they work.

Thanks,

Andrew
 
andreas

Hi!

HTTPS over a proxy (CONNECT) hasn't worked for a long time in Python
(actually it has never worked).

A quick glance at the 2.4 Changelog doesn't suggest that this has been
fixed.

So basically you've got the following options:
a) write your own http/https support.
b) look around on the net for some patches to httplib (Google is your
   friend); be aware that these are quite old patches.
c) use some external solution, like pycURL.

Andreas
 
Andrew Bushnell

Thanks for the feedback, Andreas. I am looking into how to work my own
connection logic into the code. Google has quickly become my friend and
I am actually poking at cURL (pycURL) to see what benefit it will be.

Thanks again.


 
pyguy2

If you need some help, send me an email and if we figure this out we
can post a resolution. I have used both approaches (having authored
them). Or at least let me know what site you are going to and I will
try them on a Windows box and see if I can debug what the !@#@ is
going on.


john
 
Bloke

Ideally, we should aim at a 'fix' that can be included in the
distribution. I am going to look at what communication goes on between
the proxy server and a working browser by monitoring the traffic. From
what I understand, the proxy needs to be told first to set up a secure
connection with the web site, and only then do you pass the URL to the
proxy.


Bloke
 
Andrew Bushnell

That would be nice if something could be added to the distribution.

In general, what needs to be done is as follows:

#1: Connect to proxy host:port
#2: Send a "CONNECT" request with the host:443 of the secure URL you want
to "tunnel" to. Additional headers can be added depending on the
authorization needed for the connection.
#3: Once the connection is established, set up an SSL handshake/connection
via the proxy, then start getting/sending data.

I have steps #1 and #2 down and working with no problems. I keep getting
hung up on step #3, and that is where the SSL errors are occurring for me
with Squid etc.
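
The three steps above can be sketched roughly as follows, assuming a current
Python 3 and its ssl module rather than Python 2.4. All function names here
(build_connect_request, parse_status, open_tunnel) are hypothetical, Basic
proxy authentication is an assumption, and this is not the code from either
cookbook recipe.

```python
import base64
import socket
import ssl


def build_connect_request(host, port=443, user=None, password=None):
    """Return the bytes of a CONNECT request for host:port (step 2)."""
    lines = [
        "CONNECT %s:%d HTTP/1.0" % (host, port),
        "Host: %s:%d" % (host, port),
    ]
    if user is not None and password is not None:
        # Assumption: the proxy accepts Basic authentication.
        token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
        lines.append("Proxy-Authorization: Basic " + token.decode("ascii"))
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status(response_head):
    """Extract the numeric status code from the proxy's reply."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError("malformed status line: %r" % status_line)
    return int(parts[1])


def open_tunnel(proxy_host, proxy_port, host, port=443, **auth):
    """Connect to the proxy, send CONNECT, then start TLS over the tunnel."""
    sock = socket.create_connection((proxy_host, proxy_port), timeout=30)  # 1
    sock.sendall(build_connect_request(host, port, **auth))                # 2
    head = b""
    while b"\r\n\r\n" not in head:  # read the proxy's reply headers
        chunk = sock.recv(4096)
        if not chunk:
            break
        head += chunk
    if parse_status(head) != 200:
        sock.close()
        raise OSError("proxy refused CONNECT: %r" % head[:80])
    # Step 3: the handshake is with the target host through the tunnel,
    # so certificate checks use the target's name, not the proxy's.
    context = ssl.create_default_context()
    return context.wrap_socket(sock, server_hostname=host)
```

The sketch reads the proxy's reply in chunks and does not hand any bytes
received after the blank line to the TLS layer; that is normally harmless
because the client speaks first in the handshake, but it is a simplification.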



 
