A
Alex Ciarlillo
I have an application which is supposed to create a set of excel
documents to display sales information for dining areas. The core of it
is a class called SaleReport which is initialized with a sqlite database
connection. The main function takes an excel worksheet object, a row
number and some info about the location and queries the database to
populate the excel sheet. Since none of the locations rely on each
others data, I thought it would be pretty useful to thread that part, so
that each location report is run in its own thread. The problem is, even
though this seems to work, it has not improved performance at all and I
am not sure where the bottleneck is. Here are my theories and the
example code is at the bottom:
1) My first theory was that using a single excel application instance
was blocking the threads so that only one could have access at a time,
but now I rewrote it to use multiple excel instances and still no dice.
2) The connection to the database is limiting access to a single thread
at a time. This shouldnt be the case since each instance of the
SaleReport class gets it own connection, and SQLite is threadsafe.
3) I'm flat out using the threads incorrectly.
The example code:
#################################################################################
require 'win32ole'
require 'SaleReport.rb'
startTime = Time.now
excel_1 = WIN32OLE::new('excel.Application')
WIN32OLE.const_load(excel_1)
excel_2 = WIN32OLE::new('excel.Application')
book1 = excel_1.Workbooks.add
book2 = excel_2.Workbooks.add
books = {'book1' => book1, 'book2' => book2}
locations = [ {'name' => 'location1', 'num' => 841},
{'name' => 'location2', 'num' => 842},
{'name' => 'location3', 'num' => 843} ]
books.each do |name, book|
puts "Working on #{name}."
threads = []
locations.each do |loc|
threads << Thread.new(loc, name, book) { |l, n, b|
row = 1
worksheet = b.Worksheets(1)
sales = SaleReport.new
puts "Starting thread - #{l} || #{n} || #{b}"
sales.outputLocation(ws, row, l['num'], l['name'], 2007)
}
end
puts "Joining..."
threads.each { |t| t.join }
book.SaveAs "#{name}.xls"
book.Close(1)
end
excel_1.quit
excel_2.quit
excel_1 = excel_2 = nil
GC.start #garbage collect
endTime = Time.now
puts "Time to run: #{endTime-startTime} seconds."
###############################################################################
Running this with 2 locations without any threading gave a running time
of ~240 seconds. When I added the threading it was almost exactly the
same. The running time increases linearly with each location I add even
with multiple threads.
Thanks,
-Alex
documents to display sales information for dining areas. The core of it
is a class called SaleReport which is initialized with a sqlite database
connection. The main function takes an excel worksheet object, a row
number and some info about the location and queries the database to
populate the excel sheet. Since none of the locations rely on each
others data, I thought it would be pretty useful to thread that part, so
that each location report is run in its own thread. The problem is, even
though this seems to work, it has not improved performance at all and I
am not sure where the bottleneck is. Here are my theories and the
example code is at the bottom:
1) My first theory was that using a single excel application instance
was blocking the threads so that only one could have access at a time,
but now I rewrote it to use multiple excel instances and still no dice.
2) The connection to the database is limiting access to a single thread
at a time. This shouldnt be the case since each instance of the
SaleReport class gets it own connection, and SQLite is threadsafe.
3) I'm flat out using the threads incorrectly.
The example code:
#################################################################################
require 'win32ole'
require 'SaleReport.rb'
startTime = Time.now
excel_1 = WIN32OLE::new('excel.Application')
WIN32OLE.const_load(excel_1)
excel_2 = WIN32OLE::new('excel.Application')
book1 = excel_1.Workbooks.add
book2 = excel_2.Workbooks.add
books = {'book1' => book1, 'book2' => book2}
locations = [ {'name' => 'location1', 'num' => 841},
{'name' => 'location2', 'num' => 842},
{'name' => 'location3', 'num' => 843} ]
books.each do |name, book|
puts "Working on #{name}."
threads = []
locations.each do |loc|
threads << Thread.new(loc, name, book) { |l, n, b|
row = 1
worksheet = b.Worksheets(1)
sales = SaleReport.new
puts "Starting thread - #{l} || #{n} || #{b}"
sales.outputLocation(ws, row, l['num'], l['name'], 2007)
}
end
puts "Joining..."
threads.each { |t| t.join }
book.SaveAs "#{name}.xls"
book.Close(1)
end
excel_1.quit
excel_2.quit
excel_1 = excel_2 = nil
GC.start #garbage collect
endTime = Time.now
puts "Time to run: #{endTime-startTime} seconds."
###############################################################################
Running this with 2 locations without any threading gave a running time
of ~240 seconds. When I added the threading it was almost exactly the
same. The running time increases linearly with each location I add even
with multiple threads.
Thanks,
-Alex