REXML document creation speed

tedmilkey

Hi,

Please excuse my ignorance, but I'm new to this.

I've written a script that downloads historical stock quotes as .csv,
parses them, then writes out an XML doc that I can then use elsewhere.

It works as designed, but it's *dog slow* and I don't see why. It can
take over 5 minutes to run, and nearly all of that time is spent writing
the XML docs (I've tested this by running the script with the XML document
creation lines commented out, and it takes only seconds).

The script is below:

require 'rubygems'
require 'net/http'
require 'FasterCSV'
require 'rexml/document'
include REXML

puts "Start #{Time.now()}"

# Read the ticker symbols from symbols.xml (REXML element indexes are 1-based).
symbols = Array.new
xml_symbols_doc = Document.new(File.new("symbols.xml"))
for i in 1..xml_symbols_doc.root.elements.size
  symbols[i] = xml_symbols_doc.root.elements[i].get_text.value
end

# symbols[0] is never assigned, so start at index 1.
for i in 1..(symbols.length - 1)
  # Download and parse the CSV quotes for this symbol.
  quote_source_url = "http://ichart.finance.yahoo.com/table.csv?s=#{URI.encode(symbols[i])}"
  quote_response = Net::HTTP.get_response(URI.parse(quote_source_url))
  csv_quotes = FasterCSV.parse(quote_response.body, {:headers => true, :header_converters => :symbol})

  # Build one <quote> element per CSV row, with one child element per column.
  xml_quotes = Document.new
  xml_quotes << XMLDecl.new
  xml_quotes.add_element("quotes", {"symbol" => "#{symbols[i]}"})
  for j in 0..(csv_quotes.length - 1)
    quote = Element.new("quote")
    for k in 0..(csv_quotes.headers.length - 1)
      quote.add_element("#{csv_quotes.headers()[k]}").text = "#{csv_quotes[j][k]}"
    end
    xml_quotes.root << quote
  end

  # Write the document to <symbol>.xml with an indent of 3.
  xml_quotes_output_file = File.new("#{symbols[i]}.xml", "w+")
  xml_quotes.write(xml_quotes_output_file, 3)
  puts "#{symbols[i]} OK. File here: #{xml_quotes_output_file.path}"
end
puts "End #{Time.now()}"

Any suggestions as to how I can make this script run (much) faster are
greatly appreciated!

Thanks for your help!
Ted
 
Robert Klemme

> It works as designed, but it's *dog slow* and I don't see why. It can
> take over 5 minutes to run, and nearly all of that time is writing the
> XML docs (I've tested it running the script with the XML document
> creation lines commented out and it takes only seconds).

REXML is not particularly fast, but what makes you sure the time is
spent in REXML and not in the way you prepare the data? Did you test
with "-r profile"? Did you notice that you have three nested levels of
loops? That may well be the source of the slowness.
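
One way to check where the time goes is to time building the document and serializing it separately. The following is only a sketch under assumptions: the header names and the 500 generated rows are made-up stand-ins for one parsed CSV file, and it uses Benchmark from the standard library rather than "-r profile".

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical stand-in for one parsed CSV file: header names plus data rows.
headers = %w[date open high low close volume]
rows = Array.new(500) do |n|
  ["2008-01-01", 10.0 + n, 11.0 + n, 9.0 + n, 10.5 + n, 1000 * n]
end

# Phase 1: build the REXML tree in memory.
doc = nil
build_time = Benchmark.realtime do
  doc = REXML::Document.new
  doc << REXML::XMLDecl.new
  root = doc.add_element("quotes", "symbol" => "TEST")
  rows.each do |row|
    quote = root.add_element("quote")
    headers.zip(row) { |name, value| quote.add_element(name).text = value.to_s }
  end
end

# Phase 2: serialize the tree with the same indent (3) as the original script.
output = String.new
write_time = Benchmark.realtime { doc.write(output, 3) }

puts format("build: %.3fs  write: %.3fs", build_time, write_time)
```

If one of the two numbers dominates, that is the phase worth looking at first.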

A few stylistic remarks: you should use the block form of File.open in
order to ensure proper and timely cleanup.
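
A minimal sketch of the block form, assuming a throwaway document and a hypothetical output path in the system temp directory:

```ruby
require 'rexml/document'
require 'tmpdir'

doc = REXML::Document.new
doc << REXML::XMLDecl.new
doc.add_element("quotes", "symbol" => "TEST")

# Hypothetical output path, used only for this example.
path = File.join(Dir.tmpdir, "quotes_example.xml")

# The file is closed when the block exits, even if doc.write raises.
File.open(path, "w") do |file|
  doc.write(file, 3)
end
```

With File.new, as in the original script, the file stays open until it is closed explicitly or garbage collected.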

You can make your life easier by using Ruby's iterating idioms instead
of for loops with array indexes.
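
As an illustration, here is the shape of the two inner loops written both ways. The header names and rows are made-up sample data, and the strings collected here merely stand in for the elements the real script builds.

```ruby
# Made-up sample data standing in for the parsed CSV.
headers = [:date, :open, :close]
rows = [
  ["2008-01-02", "10.0", "10.5"],
  ["2008-01-03", "10.5", "10.2"]
]

# Index-based, in the style of the original script.
indexed = []
for j in 0..(rows.length - 1)
  for k in 0..(headers.length - 1)
    indexed << "#{headers[k]}=#{rows[j][k]}"
  end
end

# Iterator-based: each walks the rows, zip pairs every header with its value.
iterated = []
rows.each do |row|
  headers.zip(row) do |name, value|
    iterated << "#{name}=#{value}"
  end
end
```

Both versions collect the same pairs; the second has no index arithmetic to get wrong.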

Note also that there are XPath expressions that you can use for
iterating over an XML document.
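
For example, the symbols could be read like this. The layout of symbols.xml was not posted, so the structure below (a symbols root with symbol children) is an assumption made only for this sketch.

```ruby
require 'rexml/document'

# Assumed layout of symbols.xml; the real file was not shown in the thread.
xml = <<~XML
  <symbols>
    <symbol>AAPL</symbol>
    <symbol>MSFT</symbol>
  </symbols>
XML

doc = REXML::Document.new(xml)

# Variant 1: XPath.match returns the matching elements as an array.
symbols = REXML::XPath.match(doc, "/symbols/symbol").map { |element| element.text }

# Variant 2: Elements#each takes an XPath and yields each matching element.
symbols_via_each = []
doc.elements.each("symbols/symbol") { |element| symbols_via_each << element.text }
```

Either variant avoids the 1-based index bookkeeping of the original loop.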

Kind regards

robert
 
Dejan Dimic

In my personal experience, Hpricot was much faster than REXML.
As Robert already mentioned, iterate through collections rather than
indexing into them.
The first thing you should do is add some metrics to find the slowest
part of your program. Without measurement you cannot determine whether
you are on the right track. Don't just time the run from start to
finish; add some checkpoints in between.
Since you have a list of symbols to download, you could also take a
multithreaded approach to the downloads.
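
A rough sketch of the threaded download, under some assumptions: fetch_all and its fetcher argument are hypothetical names introduced here so the download step can be swapped out, and the URL is the one from the original script (this sketch does not check whether it still responds). URI.encode_www_form_component is used because URI.encode was removed in Ruby 3.0.

```ruby
require 'net/http'
require 'uri'

# Fetch every symbol in its own thread and return { symbol => body }.
# `fetcher` is any callable that takes a symbol and returns the CSV text.
def fetch_all(symbols, fetcher)
  threads = symbols.map do |symbol|
    Thread.new { [symbol, fetcher.call(symbol)] }
  end
  # Thread#value waits for the thread to finish and returns its block's result.
  threads.map(&:value).to_h
end

# A fetcher that performs the real HTTP request (defined here, not called).
http_fetcher = lambda do |symbol|
  query = URI.encode_www_form_component(symbol)
  uri = URI.parse("http://ichart.finance.yahoo.com/table.csv?s=#{query}")
  Net::HTTP.get_response(uri).body
end

# Usage with a stub, so nothing touches the network:
stub_fetcher = ->(symbol) { "csv for #{symbol}" }
bodies = fetch_all(%w[AAA BBB], stub_fetcher)
```

One thread per symbol is the simplest form; for a long symbol list, a fixed number of worker threads would probably be kinder to the server.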

You guessed right: there is a lot of room to improve this application.
But measure the performance first, then act, and then measure the
improvement.
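
The checkpoints could look something like the sketch below. StageTimer is a hypothetical helper written for this example, and the three stages use stand-in work instead of the real download, CSV parsing, and XML building.

```ruby
# Accumulates wall-clock time per named stage across many iterations.
class StageTimer
  attr_reader :totals

  def initialize
    @totals = Hash.new(0.0)
  end

  # Runs the block, adds its duration to the stage's total,
  # and returns the block's own result.
  def time(label)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    @totals[label] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

timer = StageTimer.new

%w[AAA BBB].each do |symbol|
  # Stand-ins for the real download, parse, and XML-building steps.
  body = timer.time(:download) { "date,close\n2008-01-02,10.5\n" }
  rows = timer.time(:parse) { body.lines.map { |line| line.chomp.split(",") } }
  timer.time(:build_xml) { rows.map { |row| "<quote>#{row.join(' ')}</quote>" } }
end

timer.totals.each { |label, seconds| puts format("%-10s %.4fs", label, seconds) }
```

Running this before and after each change shows which stage actually got faster.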
 
