Hi all,
Sorry for the silence from Nokogiri HQ on this one. Let me first say that
I'm sad none of this discussion occurred on the nokogiri-talk mailing list,
where it would be easier for someone to find in the future. I'd like
to encourage people with Nokogiri questions, particularly questions about
installation, to ask on nokogiri-talk first.
I'd also like to preface my remarks by saying that I *think* that
http://nokogiri.org/tutorials/installing_nokogiri.html might have helped
anyone/everyone with this problem. If it needs to be edited, I'd love to
hear constructive feedback.
That said, comments inline below ...
http://github.com/mxcl/homebrew/commits/master/Library/Formula/libxml2.rb
This is the important part. The other machine I mentioned is one that I've
been using to develop a pretty substantial web application. We have been
having a peculiar problem with ruby-libxml for likely 18 months now. We've
eliminated the library from our code base in favour of Nokogiri. The problem
is a memory corruption error (we think caused by traversals of the DOM)
that occurs some time *after* the fact as a GC error. Over time, through a
process of elimination that took a month or two, we became convinced that it
was the ruby-libxml library causing the problem.
Well, there are two possible root causes for this. One is that, if you're
using a buggy version of libxml2 (in particular, v2.6.16, which comes by
default with Leopard), then you will crash. It's just a buggy version of the
library. You'll note that on
http://nokogiri.org/tutorials/installing_nokogiri.html Nokogiri HQ states in
no uncertain terms that 2.6.16 should not be used.
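For what it's worth, here is a minimal sketch of checking a version string against the known-bad release. The helper name `known_bad_libxml?` is hypothetical, and the version string is passed in by hand; in a real session you would feed it whatever your Nokogiri install reports as its loaded libxml2 version.

```ruby
require "rubygems" # Gem::Version handles version-string comparison

# The release this thread identifies as buggy (the Leopard default).
KNOWN_BAD_LIBXML = Gem::Version.new("2.6.16")

# Hypothetical helper: true when the given libxml2 version string is the
# known-bad release.
def known_bad_libxml?(version_string)
  Gem::Version.new(version_string) == KNOWN_BAD_LIBXML
end

puts known_bad_libxml?("2.6.16") # prints true
puts known_bad_libxml?("2.7.7")  # prints false
```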
The other possible root cause is libxml-ruby, which has known issues with
how it interacts with libxml's (rather hairy) memory management, and does
not appear at this point in time to be actively maintained.
On Friday I switched to homebrew, rvm, using Nokogiri. The memory
corruption problem reappeared. After a lot of sweating and a little googling,
I found this blog post:
http://bennyfreshness.posterous.com/installing-nokogiri-with-homebrew-install-of
Please read that if you are using OS X. If you do what he says, for me most
importantly:
    gem install nokogiri -- \
      --with-xml2-include=/usr/local/Cellar/libxml2/2.7.7/include/libxml2 \
      --with-xml2-lib=/usr/local/Cellar/libxml2/2.7.7/lib
The key point here is that, if you have two versions of libxml2 installed
on your machine, it's necessary to be very careful that Nokogiri's C
extension is compiled against the same version of libxml2 that is
dynamically linked (at runtime). Nokogiri will warn you if you "cross
versions", e.g. compile against 2.7.3 but dynamically link in 2.6.16. This
is another possible source of crashing / memory corruption.
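To make the "cross versions" idea concrete, here is a rough sketch of the comparison involved. The helper `libxml_cross_versioned?` is a hypothetical name, not part of Nokogiri, and the two version strings are typed in by hand from the example above; in practice you would take them from what Nokogiri reports about itself (for instance the output of `nokogiri -v`).

```ruby
# Hypothetical helper: true when the libxml2 version the C extension was
# compiled against differs from the one loaded at runtime.
def libxml_cross_versioned?(compiled, loaded)
  compiled.to_s.strip != loaded.to_s.strip
end

# Hand-entered values matching the example in the text; these are
# assumptions for illustration, not read from a real installation.
compiled = "2.7.3"
loaded   = "2.6.16"

if libxml_cross_versioned?(compiled, loaded)
  warn "compiled against libxml2 #{compiled} but loaded #{loaded}"
end
```

If the two strings differ, the safest fix is to rebuild the gem against the library that will actually be loaded.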
This form of installation command is referenced at
http://nokogiri.org/tutorials/installing_nokogiri.html under "nonstandard
libxml/libxslt installations", which I think we can all agree Homebrew is
(i.e., nonstandard). Is there some way that page can be clarified to help
future Nokogiri users?
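As an illustration of that form of command, here is a small sketch that assembles the arguments for a libxml2 installed under some prefix. The helper `nokogiri_install_args` is hypothetical, and it assumes the usual `include/libxml2` and `lib` layout under the prefix, as in the Homebrew paths quoted above.

```ruby
# Hypothetical helper: builds the gem install arguments for a libxml2 that
# lives under the given prefix. Assumes headers are in include/libxml2 and
# libraries are in lib beneath that prefix.
def nokogiri_install_args(prefix)
  [
    "gem", "install", "nokogiri", "--",
    "--with-xml2-include=#{File.join(prefix, 'include', 'libxml2')}",
    "--with-xml2-lib=#{File.join(prefix, 'lib')}"
  ]
end

# Prefix taken from the command quoted in this thread.
puts nokogiri_install_args("/usr/local/Cellar/libxml2/2.7.7").join(" ")
```

This only prints the command; pass the array to `system(*args)` if you actually want to run it.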
I'm in complete earnest when I say I'm interested in your opinion. I've even
open-sourced the Nokogiri.org tutorials to try to encourage active
participation:
http://github.com/flavorjones/nokogiri.org-tutorials because,
as I think you all know, after developing a project for a while, it's not
always obvious what documentation might be missing or unclear.
the problem goes away.
I am now suspicious that our problems with ruby-libxml were for the same or
similar reasons. libxml2 comes with OS X, maybe there's some kind of
confusion there? Don't know.
As I said, it may have been libxml-ruby, or the version of libxml2 you were
using, or a combination of the two.