Image scraping from behind a proxy

Abhishek Ghose · Jun 4, 2008

Hi,

I was looking at this forum post about downloading image files from
the web:
http://www.ruby-forum.com/topic/133833

But it doesn't work for me, apparently because I am behind a proxy. With
the code from that post I get errors like the following:

c:/ruby/lib/ruby/1.8/net/http.rb:564:in `initialize': No connection could be made because the target machine actively refused it. - connect(2) (Errno::ECONNREFUSED)
from c:/ruby/lib/ruby/1.8/net/http.rb:564:in `open'
from c:/ruby/lib/ruby/1.8/net/http.rb:564:in `connect'
from c:/ruby/lib/ruby/1.8/timeout.rb:48:in `timeout'
from c:/ruby/lib/ruby/1.8/timeout.rb:76:in `timeout'
from c:/ruby/lib/ruby/1.8/net/http.rb:564:in `connect'
from c:/ruby/lib/ruby/1.8/net/http.rb:557:in `do_start'
from c:/ruby/lib/ruby/1.8/net/http.rb:546:in `start'
from c:/ruby/lib/ruby/1.8/open-uri.rb:243:in `open_http'
... 7 levels...
from test.rb:48:in `write_images'
from test.rb:45:in `each'
from test.rb:45:in `write_images'
from test.rb:76

I ran into a similar problem earlier when trying to obtain an HTTP
response. Back then I started doing this (which works perfectly for me):

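require 'net/http'
require 'uri'

$proxy_addr = 'proxyservername'
$proxy_port = 8080
# Net::HTTP::Proxy returns a Net::HTTP-like class that routes requests through the proxy
$proxy = Net::HTTP::Proxy($proxy_addr, $proxy_port)

http_query = "http://www.yahoo.com"
url = URI.parse(http_query)
http_response = $proxy.get_response(url)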

Is there something similar I can do to obtain image files? I did
tweak the above code to put an image URL in http_query and store
http_response.body in a normal file. Although that didn't give me any
errors, my JPEG is unreadable.
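
One possible cause of the unreadable JPEG, offered as an assumption because the file-writing code is not shown: on Windows, a file opened in text mode ("w") can have newline bytes translated, which corrupts binary data. Opening the file in binary mode ("wb") writes the bytes unchanged. A minimal sketch, where save_binary, the sample bytes, and the file name are all hypothetical:

```ruby
require 'tmpdir'

# Hypothetical helper (not from the thread): write raw bytes in binary mode
# so that no newline translation is applied on any platform.
def save_binary(path, bytes)
  File.open(path, "wb") { |file| file.write(bytes) }
end

# Made-up payload: a JPEG-like header followed by bytes that text mode
# may alter on Windows (CR, LF, and 0x1A).
sample = "\xFF\xD8\xFF\xE0\r\n\n\x1A".b
path = File.join(Dir.tmpdir, "sample.bin")

save_binary(path, sample)

# Read the file back in binary mode and compare byte for byte.
puts File.binread(path) == sample
```

If the comparison holds, the bytes on disk are exactly the bytes that were written, which is what an image file needs.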

Abhishek Ghose · Jun 4, 2008

While I was writing my query I figured out what I am supposed to do.

Sorry for the thread. I hope it helps other visitors to the forum.

Here's how it works now:

require 'net/http'

$proxy_addr = 'proxyservername'
$proxy_port = 8080

Net::HTTP::Proxy($proxy_addr, $proxy_port).start("static.flickr.com") { |http|
  resp = http.get("/92/218926700_ecedc5fef7_o.jpg")
  # "wb" opens the file in binary mode so the image bytes are written unchanged
  open("fun.jpg", "wb") { |file|
    file.write(resp.body)
  }
}

The above is a tweaked version of the example available here:
http://www.rubynoob.com/articles/2006/8/21/how-to-download-files-with-a-ruby-script

It just uses Net::HTTP::Proxy instead of Net::HTTP.
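
For reuse with other image URLs, the same approach could be wrapped in a small method. This is only a sketch: download_via_proxy, its parameters, and the status check are hypothetical additions, and 'proxyservername' and 8080 are the placeholder values from the posts above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not from the thread): fetch a URL through an HTTP
# proxy and save the response body to dest in binary mode.
def download_via_proxy(url, dest, proxy_addr, proxy_port)
  uri = URI.parse(url)
  # Net::HTTP::Proxy returns a Net::HTTP subclass bound to the given proxy.
  proxy_class = Net::HTTP::Proxy(proxy_addr, proxy_port)
  proxy_class.start(uri.host, uri.port) do |http|
    resp = http.get(uri.request_uri)
    # Fail loudly instead of saving an error page as an image.
    unless resp.is_a?(Net::HTTPSuccess)
      raise "Download failed: #{resp.code} #{resp.message}"
    end
    File.open(dest, "wb") { |file| file.write(resp.body) }
  end
  dest
end

# Example call (commented out because it needs a reachable proxy):
# download_via_proxy("http://static.flickr.com/92/218926700_ecedc5fef7_o.jpg",
#                    "fun.jpg", "proxyservername", 8080)
```

Checking the response class before writing is a design choice, not something the original code does. It avoids saving a proxy error page as a file that looks like a JPEG.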
