Daniel N
Hi,
I hope this is the right place for this.
I'm writing a markup mangler built on libxml, primarily for speed
reasons.
I've got some basic functionality going. It's not finished, but when
I run a benchmark I get some good figures.
I'm using a recursive function to process my data. The general gist
of it is as follows:
http://pastie.caboo.se/63584
But...
When I run the benchmark I get weird things happening.
All benchmarks show steadily increasing memory usage until the end
of the benchmark.
10,000 iterations were fine, but much above that I started to get
malloc errors or a segmentation fault.
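To put numbers on the growth, something like the following could be
wrapped around the benchmark loop. It is only a rough sketch: the
names rss_kb and sample_memory are made up for illustration, the
block body is a stand-in for the real libxml work, and it assumes a
`ps` that accepts `-o rss= -p PID` (as on Mac OS X and Linux). If the
Ruby live-slot count stays flat while the resident size keeps
climbing, that would hint that the memory is being held outside the
Ruby heap, but it cannot show where.

```ruby
# Resident set size of this process in kilobytes, or nil if `ps` is
# unavailable or prints nothing. (Hypothetical helper, not a libxml API.)
def rss_kb
  out = `ps -o rss= -p #{Process.pid}`.strip
  out.empty? ? nil : out.to_i
rescue SystemCallError
  nil
end

# Run the block `iterations` times and record a memory sample after
# every `sample_every` runs. A full GC is forced before each sample so
# that collectable Ruby garbage does not inflate the figures.
def sample_memory(iterations, sample_every)
  samples = []
  iterations.times do |i|
    yield i
    next unless ((i + 1) % sample_every).zero?

    GC.start
    samples << {
      iteration: i + 1,
      live_slots: GC.stat[:heap_live_slots], # Ruby objects still alive
      rss_kb: rss_kb                         # whole-process memory
    }
  end
  samples
end

# Stand-in workload; the real benchmark body would go in the block.
samples = sample_memory(10_000, 2_500) { |i| "<p>snippet #{i}</p>".upcase }
samples.each { |s| puts s.inspect }
```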
I changed
doc.root.to_s.gsub( /\<\/?#{artificial_root_tag}\>/, "" )
to
doc.root.to_a.join.gsub( /\<\/?#{artificial_root_tag}\>/, "" )
taking a hit on performance, but it allowed me to get through 100,000
iterations.
The memory usage, though, was excessive at 235 MB real and 233 MB virtual.
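For reference, the wrapper-stripping step on its own can be sketched
in plain Ruby without libxml. The method name strip_artificial_root
and the sample markup are made up for illustration. Regexp.escape is
added in case the tag name ever contains regex metacharacters. Like
the lines above, it only matches wrapper tags without attributes.

```ruby
# Hypothetical helper: remove the opening and closing tags of an
# artificial wrapper element from already-serialized markup.
def strip_artificial_root(markup, artificial_root_tag)
  tag = Regexp.escape(artificial_root_tag)
  markup.gsub(%r{</?#{tag}>}, "")
end

serialized = "<wrapper><p>Hello <b>world</b></p></wrapper>"
puts strip_artificial_root(serialized, "wrapper")
# prints: <p>Hello <b>world</b></p>
```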
The HTML I am parsing is a fairly small snippet.
I'm running on Mac OS X.
Any ideas what is happening, and why?
Thank you,
Daniel