Venkat Bagam
Hi all,
I am very new to scraping and to mechanize. It was all a smooth
run until I hit a problem handling a timeout while fetching a web
page.
Wrapping the request in Timeout::timeout() does not let me rescue this
kind of error. Here is my code:
require 'timeout'
require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
begin
  Timeout::timeout(10) do
    agent.get('http://www.r-knowsys.com') # this url doesn't exist
  end
rescue Timeout::Error
  puts "timeout the page doesnt exist"
end
When I run it, the stack trace is as follows:
/usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: Name or service not known (SocketError)
    from /usr/lib/ruby/1.8/net/http.rb:560:in `open'
    from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
    from /usr/lib/ruby/1.8/timeout.rb:48:in `timeout'
    from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
    from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
    from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
    from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:352:in `fetch_page'
    from /usr/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:143:in `get'
    from test.rb:8
    from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
    from test.rb:7
How do I handle such a case? Any help appreciated.
regards,
venkat
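
Note on the trace above: the exception that escapes is a SocketError raised by getaddrinfo (the host name does not resolve), not a Timeout::Error, so a rescue clause that names only Timeout::Error never matches it. A minimal sketch of one way to cover both cases follows. The helper name fetch_with_guard and its symbol return values are hypothetical, chosen only for illustration, and the block stands in for the agent.get call so that the sketch runs without mechanize or a network connection.

```ruby
require 'timeout'
require 'socket' # defines SocketError

# Hypothetical helper (not part of mechanize): runs the given block under
# a time limit and turns the two failure modes into symbols.
def fetch_with_guard(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  # The block took longer than the allowed number of seconds.
  :timed_out
rescue SocketError => e
  # Name resolution or another socket-level failure, as in the trace above.
  warn "socket error: #{e.message}"
  :unreachable
end

# Assumed usage with the agent from the post (left commented out because
# it needs mechanize and network access):
#   result = fetch_with_guard(10) { agent.get('http://www.r-knowsys.com') }
```

On success the helper returns whatever the block returns, so the fetched page is still available to the caller. Other network failures (for example a refused connection) raise different exception classes and would need their own rescue clauses.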