From a URL to XPath 2.0

Evan Senter · Feb 20, 2008

Hi,

I am trying to write a small script that lets me scrape HTML using
XPath 2.0. As much as I enjoyed using Hpricot, its lack of support for
indexed paths has forced me to look for a different tool (I've heard
REXML has the best XPath support). To use REXML, however, I first need
to convert the HTML to XML, and I have yet to find a good gem or plugin
to do that.

As I mentioned, however, my main interest is index support for XPath
queries against an arbitrary HTML page pulled from a generated URL.
Does anyone know of a good approach to handle this?

Thank you,

Ruby.new(user)

Guillaume Carbonneau · Feb 21, 2008

Hi, you might want to try HTML Tidy, which can clean up HTML and
output it as XML.

Project: http://tidy.sourceforge.net/
Try it online (output XML): http://infohound.net/tidy/
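
For anyone landing on this thread, here is a rough sketch of how the pieces could fit together in Ruby: fetch the page, pipe it through Tidy, then run an indexed XPath query with REXML. It assumes the `tidy` command-line tool is installed and on the PATH. The helper names (fetch_html, tidy_to_xhtml, xpath_texts) and the SAMPLE document are made up for illustration and do not come from any library. Note that REXML implements XPath 1.0 rather than 2.0, but positional predicates such as li[2] are already part of 1.0.

```ruby
require 'net/http'
require 'open3'
require 'rexml/document'
require 'uri'

# Fetch the raw HTML for a URL (no redirect or error handling in this sketch).
def fetch_html(url)
  Net::HTTP.get(URI(url))
end

# Pipe HTML through the `tidy` command-line tool (assumed to be on the PATH)
# and return its XHTML output. Warnings written to stderr are discarded.
def tidy_to_xhtml(html)
  xhtml, _warnings, _status = Open3.capture3(
    'tidy', '-quiet', '-asxml', '-numeric', '-utf8', stdin_data: html
  )
  xhtml
end

# Parse well-formed XML and return the text of every node matching the XPath.
def xpath_texts(xml, xpath)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, xpath).map { |node| node.text }
end

# Hypothetical stand-in for Tidy output, so the query can be tried offline.
SAMPLE = <<~XML
  <html><body><ul><li>first</li><li>second</li><li>third</li></ul></body></html>
XML

# Indexed path: the second <li> inside the <ul>.
puts xpath_texts(SAMPLE, '//ul/li[2]').inspect
```

For a live page the call would look something like xpath_texts(tidy_to_xhtml(fetch_html(url)), '//ul/li[2]'). Real Tidy output may include a DOCTYPE and an XHTML namespace declaration, so queries might need adjusting for that.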
