memory considerations when parsing XML file

Mr_Tibs

Hi,

I need to quickly parse a large XML file. I didn't want to use DOM
parsing, since that way you end up with the whole file in memory. I
looked at writing my own SAX-style parser and came across this post:
http://www.janvereecken.com/2007/4/11/event-driven-xml-parser-in-ruby.

My question is about reading the file from the hard drive and passing
it to the parse_stream method:
REXML::Document.parse_stream(File.open(filename).read, MyListener.new)
Won't File.open(...).read load the whole file into memory? If so, then
nothing is gained, and I might just as well read and parse the document
using DOM, since that is easier.

Thanks,
Tiberiu
 
Mr_Tibs

I'm sorry for rushing with the post. I just read that parse_stream
also takes an IO object, so I don't have to call read on the file.

Tiberiu
 
Robert Klemme

> I'm sorry for rushing with the post. I just read that parse_stream
> also takes an IO object, so I don't have to call read on the file.

:)

And the idiom should rather read

require 'rexml/document'

File.open(filename, 'rb') do |io|
  # Pass the IO itself so REXML reads from it as it parses
  REXML::Document.parse_stream(io, MyListener.new)
end

i.e. use the block form of File.open, which closes the file
automatically when the block exits, even if parsing raises.

robert
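
For anyone landing on this thread later, here is a minimal sketch of the
whole pattern, listener included. CountingListener and count_elements
are made-up names for illustration and are not from the linked post.
StringIO stands in for a real file so the snippet runs on its own. It
assumes the rexml library is available (it ships as a bundled gem with
recent Ruby versions).

```ruby
require 'rexml/document'
require 'rexml/streamlistener'
require 'stringio'

# Hypothetical listener: counts elements by name instead of building a tree.
class CountingListener
  include REXML::StreamListener

  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Called once for every opening tag the parser encounters.
  def tag_start(name, _attrs)
    @counts[name] += 1
  end
end

# Hypothetical helper: stream-parses any IO-like object and returns the counts.
def count_elements(io)
  listener = CountingListener.new
  REXML::Document.parse_stream(io, listener)
  listener.counts
end

# StringIO stands in for a file here; with a real file you would write
# File.open(filename, 'rb') { |io| count_elements(io) }
sample = StringIO.new('<items><item/><item/><note>hi</note></items>')
counts = count_elements(sample)
```

Because the listener only keeps a small hash of counters, memory use
should depend on what you choose to store in the callbacks rather than
on the size of the document.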
 
