_why
Please enjoy a succulent, new Hpricot. A bit faster, some Ruby 1.9
support, and assorted fixes.
gem install hpricot --source http://code.whytheluckystiff.net
It should show up at Rubyforge in a bit.
I'm sure you're wondering what's the reason for Hpricot updates, in
the face of heated competition from the Nokogiri and LibXML
libraries. Remember that Hpricot has no dependencies and is smaller
than either of those libs. Hpricot uses its own Ragel-based
parser, so you have the freedom to hack the parser itself; the code
is dwarven by comparison.
Best of all, Hpricot has run on JRuby in the past. And I am in the
process of merging some IronRuby code[1] and porting 0.7 to
JRuby. This means your code will run on a variety of Ruby platforms
without alteration. That alone makes it worthwhile, wouldn't you
agree?
Clearly, the benchmarks you see on Ruby Inside are skewed to favor
Nokogiri. They parse XML through Hpricot without using Hpricot.XML(),
which is not only wrong, but puts XML through needless HTML cleanup
operations. I am sure that Hpricot 0.7 still fares slower on large
documents. However, try testing a large number of small documents
(a much more common scenario) with this latest version.
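
For anyone who wants to try that, here is a rough sketch of such a test, not a rigorous benchmark. It times a batch of small documents through Hpricot.XML() and through plain Hpricot(), which applies the HTML cleanup. The time_parser helper, the sample document, and the count of 1,000 are made up for illustration, and no results are claimed here; run it on your own platform.

```ruby
require 'benchmark'

# Hypothetical helper (not part of Hpricot): returns the wall-clock
# seconds spent handing every document in +docs+ to the given block.
def time_parser(docs, &parser)
  Benchmark.realtime { docs.each { |doc| parser.call(doc) } }
end

# Assumed workload: many copies of one tiny XML document.
SMALL_XML = '<feed><entry id="1"><title>Hello</title></entry></feed>'
docs = Array.new(1_000) { SMALL_XML }

begin
  require 'hpricot'

  # XML mode: skips the HTML fixups, which is the right call for XML input.
  xml_mode = time_parser(docs) { |doc| Hpricot.XML(doc) }

  # HTML mode: the same input, pushed through the HTML cleanup.
  html_mode = time_parser(docs) { |doc| Hpricot(doc) }

  puts format('Hpricot.XML(): %.4fs', xml_mode)
  puts format('Hpricot():     %.4fs', html_mode)
rescue LoadError
  warn 'hpricot is not installed; skipping the comparison'
end
```

Swap in your own documents and counts; the block form makes it easy to time another parser on the same input.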
You have to question a benchmark that is entirely based on two XML
documents. What about HTML fixups? What about various platforms
and CPUs? Why not treat Hpricot fairly and use it properly in the
benchmarks? It reeks of something.
_why
[1] http://github.com/nrk/ironruby-hpricot/tree/master