Excessive Memory Usage with libxml

D

Daniel N

Hi,

I hope this is the right place for this.

I'm writing a markup mangler that I'm using libxml for. Primarily for
speed reasons.

I've got some basic functionality going, it's not finished, and when
I'm doing a benchmark I get some good figures.

I'm using a recursive function to process my data. The general gist
of it is as follows

http://pastie.caboo.se/63584


But...

When I run the benchmark I get weird things happening.

All benchmarks exhibit steadily increasing memory usage until the end
of the benchmark
10000 iterations were fine
very much above this and I started to get

malloc errors
or
segmentation fault


I changed
doc.root.to_s.gsub( /\<\/?#{artificial_root_tag}\>/, "" )

to
doc.root.to_a.join.gsub( /\<\/?#{artificial_root_tag}\>/, "" )

taking a hit on performance, but it allowed me to get through 100,000
iterations.
The memory usage though was stupid at 235Mb Real and 233Mb Virtual

The HTML I am parsing is a fairly small snippet

I'm running on Mac OsX.

Any ideas what and why?
thankyou

Daniel
 
D

Daniel N

Hi,

I hope this is the right place for this.

I'm writing a markup mangler that I'm using libxml for. Primarily for
speed reasons.

I've got some basic functionality going, it's not finished, and when
I'm doing a benchmark I get some good figures.

I'm using a recursive function to process my data. The general gist
of it is as follows

http://pastie.caboo.se/63584


But...

When I run the benchmark I get weird things happening.

All benchmarks exhibit steadily increasing memory usage until the end
of the benchmark
10000 iterations were fine
very much above this and I started to get

malloc errors
or
segmentation fault


I changed
doc.root.to_s.gsub( /\<\/?#{artificial_root_tag}\>/, "" )

to
doc.root.to_a.join.gsub( /\<\/?#{artificial_root_tag}\>/, "" )

taking a hit on performance, but it allowed me to get through 100,000
iterations.
The memory usage though was stupid at 235Mb Real and 233Mb Virtual

The HTML I am parsing is a fairly small snippet

I'm running on Mac OsX.

Any ideas what and why?
thankyou

Daniel

I've managed to reproduce the error by changing to Benchmark.bmbm. It is

ruby(3775) malloc: *** Deallocation of a pointer not malloced:
0x3903a60; This could be a double free(), or free() called with the
middle of an allocated block; Try setting environment variable
MallocHelp to see tools to help debug
ruby(3775) malloc: *** Deallocation of a pointer not malloced:
0x75b0750; This could be a double free(), or free() called with the
middle of an allocated block; Try setting environment variable
MallocHelp to see tools to help debug
ruby(3775) malloc: *** Deallocation of a pointer not malloced:
0x75c4950; This could be a double free(), or free() called with the
middle of an allocated block; Try setting environment variable
MallocHelp to see tools to help debug
/markup_mangler_benchmark.rb:79: [BUG] Segmentation fault
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,968
Messages
2,570,153
Members
46,699
Latest member
AnneRosen

Latest Threads

Top