Ruby screen scraping

Peter Szinek

Chris said:
Turns out I actually ended up abandoning HTree and the rest. I used
net/http to fetch the page, then took the table I was interested in
examining and converted it using rexml. I have now been able to grab
the values that I wanted using XPath :)

If you are keen on XPath, why not:

table = XPath.first(doc, "//table[@class='index' and @width='100%']")

then use 'table' instead of 'converted_data'...

or even

module_name = XPath.first(doc, "//table[@class='index' and @width='100%']//td[@class='data']/a")

etc.

(Untested since I don't have your doc, but it should more or less work)
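
For anyone who wants to try this without the original page, here is a minimal sketch against a made-up document. The markup in SAMPLE_HTML, the link targets, and the variable names first_link and module_names are all assumptions for illustration, since the real page is not shown in this thread. Note that XPath combines predicates with "and" rather than "&&", and that the calls are written as REXML::XPath because the sketch does not do "include REXML".

```ruby
require 'rexml/document'

# Hypothetical stand-in for the fetched page; the real markup is unknown.
SAMPLE_HTML = <<~XML
  <html>
    <body>
      <table class="index" width="100%">
        <tr><td class="data"><a href="/mod/foo">Foo</a></td></tr>
        <tr><td class="data"><a href="/mod/bar">Bar</a></td></tr>
      </table>
      <table class="other" width="100%">
        <tr><td class="data"><a href="/x">Ignored</a></td></tr>
      </table>
    </body>
  </html>
XML

doc = REXML::Document.new(SAMPLE_HTML)

# Select the table by two attributes at once; XPath uses "and", not "&&".
table = REXML::XPath.first(doc, "//table[@class='index' and @width='100%']")

# Search relative to that table (note the leading ".") for the first link.
first_link = REXML::XPath.first(table, ".//td[@class='data']/a")

# Or collect every matching link in one expression from the document root.
module_names = REXML::XPath.match(
  doc,
  "//table[@class='index' and @width='100%']//td[@class='data']/a"
).map(&:text)

puts first_link.text
puts module_names.inspect
```

XPath.first returns only the first matching node, so XPath.match (or XPath.each) is the one to reach for when the table has several rows of interest.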

Cheers,
Peter

__
http://www.rubyrailways.com
 
