Arun Kumar
Hi all,
I'm using net/http to scrape the HTML content of various websites. As a
preliminary requirement, I also want to scrape the HTML of any URL that
the original URL redirects to. For that I used the following code:
require 'net/http'
require 'uri'

uri = URI.parse(url)
response = Net::HTTP.get_response(uri)
case response
# if the URL redirects, fetch the contents of the redirected URL
when Net::HTTPRedirection
  uri = URI.parse(response['Location'])
  response = Net::HTTP.get_response(uri)
# in case of a bad request error
when Net::HTTPBadRequest
  http = Net::HTTP.start(uri.host, uri.port)
  # get the HTML by requesting the path '/' with a browser User-Agent
  response = http.get("/", "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)")
end
data = response.body
Now I am not sure whether this snippet fully handles URL redirection.
Can somebody please tell me whether the above code is suitable, or
whether I need to add something to it? If so, how?
Thanks
Sunny
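
For reference, two cases the snippet above does not seem to cover are a chain of several redirects and a relative Location header. Below is a minimal sketch of one way to handle both, not a definitive solution. The helper names resolve_location and fetch_following_redirects and the default limit of 5 are assumptions made for illustration; only Net::HTTP.get_response, URI.parse and URI.join come from the standard library.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: resolve a Location header against the URL that
# produced it, since servers sometimes send a relative Location.
def resolve_location(base_uri, location)
  URI.join(base_uri.to_s, location)
end

# Hypothetical helper: keep following redirects until a non-redirect
# response arrives, making at most `limit` requests so that a redirect
# loop cannot run forever.
def fetch_following_redirects(url, limit = 5)
  uri = URI.parse(url)
  limit.times do
    response = Net::HTTP.get_response(uri)
    return response unless response.is_a?(Net::HTTPRedirection)

    location = response['Location']
    raise "redirect without Location header from #{uri}" if location.nil?

    uri = resolve_location(uri, location)
  end
  raise "gave up after #{limit} requests for #{url}"
end
```

With this in place, the last line of the original snippet would become something like data = fetch_following_redirects(url).body. The request limit is a design choice: without it, two pages that redirect to each other would keep the loop running indefinitely.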