urllib2 - closing sockets

Anand Pillai

I recently noted that urllib2.urlopen(...) for http:// urls
does not make an explicit call to close the underlying
HTTPConnection socket once the data from the socket is read.

This might not be required since the garbage collector will
close & collect open sockets that are not closed, but it might
cause the system to run out of socket memory if there are
multiple threads, each opening a socket and the gc not running
in between.

This specifically happens in my HarvestMan program which uses
multiple threads to achieve fast offline web downloads.

A patch to fix this in urllib2.py would be nice.
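Until such a patch exists, a caller-side workaround is to close the response object explicitly instead of leaving cleanup to the garbage collector. The following is only a minimal sketch: `fetch` and its `opener` parameter are hypothetical names, not part of urllib2, and whether `close()` releases the underlying socket immediately may depend on the urllib2 version in use.

```python
try:
    from urllib2 import urlopen          # Python 2, as discussed in this thread
except ImportError:
    from urllib.request import urlopen   # same function under its Python 3 name


def fetch(url, opener=urlopen):
    """Return the response body for `url`, closing the response explicitly.

    `fetch` and `opener` are illustrative names for this sketch only.
    `opener` is any callable that takes a URL and returns an object
    with read() and close() methods; it defaults to urlopen.
    """
    response = opener(url)
    try:
        return response.read()
    finally:
        # Runs whether read() succeeds or raises, so the response is
        # closed without waiting for the garbage collector.
        response.close()
```

Each worker thread would call `fetch(url)` rather than `urlopen(url).read()`, so that no thread holds a response open longer than it needs to.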

Thanks

-Anand
 
Steve Holden

Anand said:
I recently noted that urllib2.urlopen(...) for http:// urls
does not make an explicit call to close the underlying
HTTPConnection socket once the data from the socket is read.

This might not be required since the garbage collector will
close & collect open sockets that are not closed, but it might
cause the system to run out of socket memory if there are
multiple threads, each opening a socket and the gc not running
in between.

This specifically happens in my HarvestMan program which uses
multiple threads to achieve fast offline web downloads.

A patch to fix this in urllib2.py would be nice.

Thanks

-Anand

In which case you'd be well advised to add this as a bug report on
SourceForge, as that is the only way to guarantee it will come to (and
stay in) the developers' attention.

It isn't that hard to do.

regards
Steve
 
John J. Lee

Anand said:
I recently noted that urllib2.urlopen(...) for http:// urls
does not make an explicit call to close the underlying
HTTPConnection socket once the data from the socket is read.

This might not be required since the garbage collector will
close & collect open sockets that are not closed, but it might
cause the system to run out of socket memory if there are
multiple threads, each opening a socket and the gc not running
in between.

This specifically happens in my HarvestMan program which uses
multiple threads to achieve fast offline web downloads.

Does this cause trouble in your app?

I tried using urllib2 with threads a while back, and failed miserably.
Do you reckon the problem you've found could cause deadlock? Seems
unlikely, but I was at the point of having to read the threading
module's code to get any further with my deadlock bug, so I'm
clutching at straws...

Anand said:
A patch to fix this in urllib2.py would be nice.

if-wishes-were-horses-then-beggars-would-ride-<wink>-ly y'rs


John
 
John J. Lee

Steve Holden said:
In which case you'd be well advised to add this as a bug report on
SourceForge, as that is the only way to guarantee it will come to (and
stay in) the developers' attention.

It isn't that hard to do.

Steve's right that it's not hard to do.

Unfortunately, since people who actually fix bugs (i.e. Martin von
Loewis <0.8 wink>) are in short supply, that doesn't guarantee that it
will come to anybody's attention. Of course, you have a *vastly*
higher chance of getting a bug fixed if you provide a patch with
appropriate docs and test code to go with it.


John
 
