Exposing Excel as a Webservice

Brandon

Hi all,

I'm currently working on a project where we have a need to expose an
Excel spreadsheet on the web as a webservice that needs to be reliable
and available 24x7x365. I've implemented a prototype of this in Python
using the win32com library to interface with the Excel Automation API
via COM. I've also used web.py as a way to expose this automation
functionality to the network. This works just fine as a prototype that
proves that it is possible to do something like this.

I'm now revisiting the code with the goal of making it more reliable
and available, and able to support multiple users (although only one
at a time will be allowed to use Excel). This means that Excel is
potentially going to be used from multiple threads -- something I know
has historically caused problems for a lot of people, based on my
searches of the newsgroup. What I'd like to know is: what is the proper
strategy to take here? I see several options:

1) Ignore threads entirely. Let every incoming web thread interface
directly with Excel (one web thread at a time -- I'll have an
interpreter-scoped lock preventing multiple threads from accessing it
at once).

2) Dedicate a single worker thread to do all interaction with Excel.
Pass data off to this thread from the incoming web thread via shared
memory and let the worker thread get Excel to process it. This will
ensure that exactly 1 thread ever interacts with Excel.

3) Spawn a new interpreter (literally spawn another Python process)
for each incoming web thread. This will ensure that when a process is
finished with Excel there is no chance of any lingering state or
something improperly cleaned up.
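
For reference, option 2) can be sketched roughly as below. This is only a minimal illustration, not the code from the prototype: the class name SingleThreadExecutor and the setup/teardown hooks are made up for the example, and it hands work over with a queue.Queue rather than raw shared memory. Every submitted job runs on the one worker thread, and the caller waits on a Future for the result.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadExecutor:
    """Runs every submitted job on one dedicated worker thread.

    Hypothetical helper for illustration only. `setup` is called once on
    the worker thread and its return value is passed to every job;
    `teardown` is called on the same thread at shutdown.
    """

    def __init__(self, setup=None, teardown=None):
        self._jobs = queue.Queue()
        self._setup = setup
        self._teardown = teardown
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Create the shared resource on the worker thread itself, so it
        # is never touched from any other thread.
        state = self._setup() if self._setup else None
        try:
            while True:
                item = self._jobs.get()
                if item is None:  # sentinel: stop the worker
                    break
                func, args, future = item
                try:
                    future.set_result(func(state, *args))
                except Exception as exc:
                    # Report the failure to the caller; keep the worker alive.
                    future.set_exception(exc)
        finally:
            if self._teardown:
                self._teardown(state)

    def submit(self, func, *args):
        """Queue func(state, *args) and return a Future for its result."""
        future = Future()
        self._jobs.put((func, args, future))
        return future

    def shutdown(self):
        """Ask the worker to finish queued jobs, then wait for it to exit."""
        self._jobs.put(None)
        self._thread.join()
```

With pywin32, setup would presumably call pythoncom.CoInitialize() and create the Excel object (for example with win32com.client.DispatchEx("Excel.Application")), and teardown would quit Excel and call pythoncom.CoUninitialize(). A web thread would then do something like executor.submit(job, args).result(timeout=...). That wiring is an assumption on my part and is not shown above.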

I've actually attempted to implement 1) and 2) and have had problems
with sporadic failures that leave Excel no longer accessible from the
interpreter. I've tried very hard to ensure that
pythoncom._GetInterfaceCount() always returns 0 when I believe I'm
finished with Excel. I don't always get a 0, but I'm pretty close. I
believe that some of my problems are caused by not being perfect about
this, but I haven't been able to track down the leaking references.
I'm definitely not an expert in COM or its threading models.

What techniques does the group suggest? Is spawning a new interpreter
for every request a bit too extreme? (I don't mind the performance
hit.) Has anyone ever done anything like this? How did you get around
these problems?

Thanks,
Brandon
 
utabintarbo

Disclaimer: I am not an expert in python, or even programming, for that
matter....

In any case, option #2 sounds like the most theoretically sound. It
sounds like you are using Excel as a database, and your worker thread
as a transaction queue.

Something to consider: do you really need to modify a spreadsheet
directly? Can you not use another format (e.g. XML or CSV) to feed a
template spreadsheet, and then modify that file as a text file without
the overhead of Excel?

Just a thought....
 
Brandon

Thanks for the reply.

Unfortunately I cannot use a different format for the data since I'm
really using Excel as a calculation engine. I don't own the authoring
of these spreadsheets or even the data inside of them so I cannot
change the format. The spreadsheets also are complicated enough and
change frequently enough that it's not feasible for me to try to
extract the logic and data inside of them into a different calculation
engine (unless there's some existing tool which will do this for me?).
My program is just one user of the spreadsheets among hundreds of human
users. Unfortunately what's easy for the humans is always going to win
out over what's easy for my program.

My concern with my option 2) is that I appear to be experiencing some
level of contamination between calls that I can't resolve. It's not
truly transactional, in the sense that a call into the system is not
guaranteed to leave everything how it found it. That's sort of why I
thought up option 3): let the OS help me somewhat with the cleanup. I
strongly suspect that my option 2) is to some extent at the mercy of
the Python garbage collector. I'm not positive of that though.
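
To make option 3) concrete, here is a rough sketch of what I have in mind, using only the standard library. The function name run_isolated, the JSON request format and the CHILD_SOURCE script are all made up for the example; the child here just sums some numbers as a placeholder for the real Excel work.

```python
import json
import subprocess
import sys

# Stand-in for the real child script. In practice it would open the
# workbook through win32com, calculate, and quit Excel before exiting.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
result = {"total": sum(request["values"])}  # placeholder for the Excel call
json.dump(result, sys.stdout)
"""


def run_isolated(request, timeout=60):
    """Handle one request in a fresh interpreter and return its JSON reply.

    Raises subprocess.TimeoutExpired if the child takes too long (the
    child is killed) and subprocess.CalledProcessError if it exits with
    a non-zero status.
    """
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(request),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(completed.stdout)
```

Calling run_isolated({"values": [1, 2, 3]}) returns {"total": 6}. One caveat: Excel runs as its own EXCEL.EXE process, so killing the Python child on a timeout may still leave an Excel instance behind that has to be cleaned up separately.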


Thanks,
Brandon
 
