Grabbing HTML from pages that require login

Adam Akhtar

Hi, I've searched the forums already but couldn't come up with any concrete
information.

I've written a script that accesses certain pages on eBay and scrapes
information. It uses open-uri to grab the HTML and just standard regexps
to grab the data I want. When building the script I used eBay web pages I
saved to my hard disk. All worked fine.

However, when the script attempts to access the actual online web pages I
get a "file or directory not found ..." error.
What I think is happening is that eBay is automatically redirecting the
script to the user login. Even if I'm already logged in via Firefox, I
guess the fact that the request is coming from something outside of
Firefox makes eBay suspect it's from another location and thus, for
security reasons, asks the user to re-enter their details. I'm assuming
this is to do with cookies?

So how do I get round this issue? Is there a way to avoid this login
requirement, or to somehow provide my script with the info it needs to
log in and continue from there?
 
Rob Biedenharn

You might be able to use httpclient for this. It manages a session
much like a browser might and can track cookies that the site might
send. I was able to use this for doing some pretty ugly manipulation
of some pop-up windows that were clearly intended to be used only by a
human with a browser.

gem install httpclient
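
For what it's worth, here is a minimal sketch of how a logged-in fetch with httpclient might look. The URLs, the form field names (userid, pass) and the login_page? check are all assumptions made up for illustration; eBay's real login form will differ, so inspect it in your browser first.

```ruby
# Sketch only: the URLs and form field names below are hypothetical.
LOGIN_URL = 'https://example.com/login'
PAGE_URL  = 'https://example.com/my/items'

# Heuristic (an assumption, not an eBay rule): a page containing a
# password input is probably the login form, not the page we wanted.
def login_page?(html)
  html.match?(/<input[^>]*type=["']?password/i)
end

def fetch_logged_in(user, password)
  # Loaded here so the file can be read before the gem is installed.
  require 'httpclient'

  client = HTTPClient.new  # one client object keeps its cookies between requests
  client.post(LOGIN_URL, { 'userid' => user, 'pass' => password })

  html = client.get_content(PAGE_URL)  # follows redirects, returns the body
  raise 'still looks like the login page' if login_page?(html)
  html
end
```

Calling fetch_logged_in('me', 'secret') should then hand back the HTML, which can go through the same regexps as the saved pages did.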

-Rob

Rob Biedenharn http://agileconsultingllc.com
 
brabuhr

You may also wish to look at: mechanize, scrubyt, firewatir, etc.
 
reuben doetsch

Use mechanize; it makes logins very easy.
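
A rough sketch of what that could look like with mechanize. The URLs and field names are placeholders I made up, and it assumes the login form is the first form on the page; check the real page's HTML before relying on any of it.

```ruby
# Sketch only: login_url, page_url and the field names are hypothetical.
def login_and_fetch(login_url, page_url, fields)
  # Loaded here so the file can be read before the gem is installed.
  require 'mechanize' # gem install mechanize

  agent = Mechanize.new # the agent carries cookies across requests
  form  = agent.get(login_url).forms.first
  raise 'no form found on the login page' if form.nil?

  fields.each { |name, value| form[name] = value } # fill in the login fields
  agent.submit(form) # log in; session cookies stay in the agent

  agent.get(page_url).body # HTML of the protected page
end

# Example call (placeholders throughout):
# html = login_and_fetch('https://example.com/login',
#                        'https://example.com/my/items',
#                        { 'userid' => 'me', 'pass' => 'secret' })
```

The returned string is plain HTML, so the existing regexps should work on it unchanged.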
 
