Hi all,
I'm currently working on a project where we have a need to expose an
Excel spreadsheet on the web as a webservice that needs to be reliable
and available 24x7x365. I've implemented a prototype of this in python
using the win32com library to interface with the Excel Automation API
via COM. I've also used web.py as a way to expose this automation
functionality to the network. This works just fine as a prototype that
proves that it is possible to do something like this.
I'm now revisiting the code with the goal of making it more reliable
and available, and able to support multiple users (although only
allowing one at a time to use Excel). This means that Excel is going to
potentially be used from multiple threads -- something I know has
historically caused problems for a lot of people, based on my searches
of the newsgroup. What I'd like to know is: what is the proper strategy
to take here? I see several options:
1) Ignore threads entirely. Let every incoming web thread interface
directly with excel (one web thread at a time -- I'll have an
interpreter scoped lock preventing multiple threads from accessing).
2) Dedicate a single worker thread to do all interaction with Excel.
Pass data off to this thread from the incoming web thread via shared
memory and let the worker thread get Excel to process it. This will
ensure that exactly 1 thread ever interacts with Excel.
3) Spawn a new interpreter (literally spawn another python process)
for each incoming web thread. This will ensure that when a process is
finished with Excel there is no chance of any lingering state or
something improperly cleaned up.
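For reference, option 2 could be sketched roughly as below. This is a minimal illustration and not the prototype's code: the class name `SingleThreadWorker` and the `setup`/`teardown` hooks are hypothetical, and the Excel-specific calls are only described in comments so the sketch runs without Excel installed. It uses a queue rather than raw shared memory to hand work to the worker.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadWorker:
    """Runs every submitted callable on one dedicated thread (hypothetical helper)."""

    def __init__(self, setup=None, teardown=None):
        self._jobs = queue.Queue()
        self._setup = setup
        self._teardown = teardown
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Assumption: with Excel, setup would call pythoncom.CoInitialize()
        # and win32com.client.Dispatch("Excel.Application") here, so the COM
        # object is created and used on this thread only.
        state = self._setup() if self._setup else None
        try:
            while True:
                job = self._jobs.get()
                if job is None:  # sentinel pushed by close()
                    break
                func, args, future = job
                try:
                    future.set_result(func(state, *args))
                except Exception as exc:
                    # Report the failure to the caller, keep the worker alive.
                    future.set_exception(exc)
        finally:
            if self._teardown:
                self._teardown(state)

    def submit(self, func, *args):
        """Called from web threads; returns a Future for the result."""
        future = Future()
        self._jobs.put((func, args, future))
        return future

    def close(self):
        self._jobs.put(None)
        self._thread.join()


# Usage with a stand-in for the Excel state object.
worker = SingleThreadWorker(setup=lambda: {"calls": 0})


def add(state, a, b):
    state["calls"] += 1
    return a + b


result = worker.submit(add, 2, 3).result(timeout=5)
worker.close()
```

Each web thread blocks on `Future.result()` while the worker handles one job at a time, which also gives the one-user-at-a-time behaviour without a separate lock.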
I've actually attempted to implement 1) and 2) and have had problems
with sporadic failures that cause Excel to no longer be accessible
from the interpreter. I've tried very hard to ensure that
pythoncom._GetInterfaceCount() always returns 0 when I believe I'm
finished with excel. I don't think I'm perfect at always getting a 0,
but I'm pretty close. I believe that some of my problems are caused by
not being perfect about this but haven't been able to track down the
leaking references. I'm definitely not an expert in COM or its
threading models.
What techniques does the group suggest? Is spawning a new interpreter
for every request a bit too extreme? (I don't mind the performance hit.)
Has anyone ever done anything like this? How did you get around these
problems?
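To make the process-per-request idea (option 3) concrete, here is a minimal sketch under stated assumptions. The function name `run_in_fresh_interpreter`, the JSON-over-stdin protocol, and the `CHILD_SOURCE` stand-in are all hypothetical; a real child would initialize COM, drive Excel, and quit it before exiting, since Excel runs as its own process and does not go away just because the Python child does.

```python
import json
import subprocess
import sys

# Stand-in for the child script. Assumption: the real one would call
# pythoncom.CoInitialize(), open the workbook, compute, and quit Excel.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"total": sum(request["values"])}, sys.stdout)
"""


def run_in_fresh_interpreter(request, timeout=60):
    """Handle one request in a brand-new Python process and return its reply."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(request),  # request goes to the child's stdin
        capture_output=True,
        text=True,
        timeout=timeout,            # raises TimeoutExpired if the child hangs
        check=True,                 # raises CalledProcessError on non-zero exit
    )
    return json.loads(proc.stdout)


reply = run_in_fresh_interpreter({"values": [1, 2, 3]})
```

The timeout and the exit-code check give the web thread a clear failure signal when a child hangs or crashes, at the cost of one interpreter start-up per request.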
Thanks,
Brandon