Following a Javascript Link Using Mechanize

Matt White

Hello,

I am on a page that lists 78 items and can only show 25 per page, so
there are links for each page of results and a "Next >" link. I'd like to
get Mechanize to follow the Next link, but the link looks like this:

<a class='searchlinks' href='javascript:lnkclick(2);'>Next ></a>

If I try to "click" the link, Mechanize raises an "unsupported scheme"
exception. At this point I am using all sorts of fun regular
expressions to parse the Javascript and send the appropriate values to
the page with a WWW::Mechanize.post call. Is there an easier way?
Thanks.
 
Ryan Leavengood

If I try to "click" the link, Mechanize raises an "unsupported scheme"
exception. At this point I am using all sorts of fun regular
expressions to parse the Javascript and send the appropriate values to
the page with a WWW::Mechanize.post call. Is there an easier way?
Thanks.

This would require Mechanize to have a Javascript interpreter. In
theory this is possible (there are several open source Javascript
interpreters), but for most cases this is a bit of overkill. Of course
with all the AJAX sites these days it might not be a terrible idea.

In addition, the HTML parser might need to be better, and it would need to
provide a DOM interface for the Javascript to use.

Either way it is a big, complicated project, and I doubt it is something the
Mechanize maintainer would want to implement. If you REALLY want to
look into this, I would recommend taking a look at the WebKit
(www.webkit.org) Javascript interpreter; it seems to be pretty
self-contained and quite fast.

Ryan
 
Gregory Brown

Either way it is a big complicated project, and I doubt something the
Mechanize maintainer would want to implement. If you REALLY want to
look into this I would recommend taking a look at the WebKit
(www.webkit.org) Javascript interpreter, it seems to be pretty
self-contained and quite fast.

Actually, I think Aaron is working on a project called RKelly, which converts
Javascript to Ruby. I can only guess that it would give him at least
limited support for Javascript in Mechanize.
 
Dan Zwell

As others have said, you would need a Javascript interpreter. I think I
know about one of those. It's called a web browser =). This is probably
not the ideal solution, but it will work without many headaches. You can
use one of two libraries (FireWatir and Watir) to control a web browser
(Firefox or Internet Explorer) from Ruby. You can do just about
everything besides download files (though you can get at the source of a
page, and probably a .css file, as well). It might be worth a try.

Dan
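
For anyone who stays with the regular-expression workaround described in the original question, here is a minimal sketch of pulling the call out of a javascript: href. The helper name parse_js_href and the pattern are illustrative assumptions, not part of Mechanize, and the sketch only handles a single simple call such as lnkclick(2).

```ruby
# Hypothetical helper (not part of Mechanize): extract the function name
# and arguments from an href like "javascript:lnkclick(2);".
JS_CALL = /\Ajavascript:\s*(\w+)\((.*?)\)\s*;?\s*\z/

def parse_js_href(href)
  match = JS_CALL.match(href.to_s.strip)
  return nil unless match # not a simple javascript: call

  # Split the argument list on commas and drop surrounding quotes.
  args = match[2].split(',').map { |arg| arg.strip.gsub(/\A['"]|['"]\z/, '') }
  { function: match[1], args: args }
end

call = parse_js_href("javascript:lnkclick(2);")
# call[:function] holds the function name, call[:args] the argument strings
```

The extracted argument (the page number here) could then be sent with the agent's post method. Which form field lnkclick actually submits is not shown in this thread, so that has to be read from the page's own script.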
 
