Exposing Excel as a Webservice

Brandon · Sep 14, 2006

Hi all,

I'm currently working on a project where we have a need to expose an
Excel spreadsheet on the web as a webservice that needs to be reliable
and available 24x7x365. I've implemented a prototype of this in Python
using the win32com library to interface with the Excel Automation API
via COM. I've also used web.py to expose this automation functionality
to the network. This works fine as a proof of concept.

I'm now revisiting the code with the goal of making it more reliable
and available, and able to support multiple users (although only one
at a time will be allowed to use Excel). This means that Excel is
potentially going to be used from multiple threads -- something that,
based on my searches of the newsgroup, has historically caused problems
for a lot of people. What strategy should I be taking here? I see
several options:

1) Ignore threads entirely. Let every incoming web thread interface
directly with Excel (one web thread at a time -- I'll have an
interpreter-scoped lock preventing concurrent access).

2) Dedicate a single worker thread to do all interaction with Excel.
Pass data off to this thread from the incoming web thread via shared
memory and let the worker thread get Excel to process it. This will
ensure that exactly 1 thread ever interacts with Excel.

3) Spawn a new interpreter (literally spawn another python process)
for each incoming web thread. This will ensure that when a process is
finished with Excel there is no chance of any lingering state or
something improperly cleaned up.
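
A minimal sketch of option 2, for illustration only. It uses a
queue.Queue to hand work to the worker thread instead of raw shared
memory. ExcelWorker, open_engine, fake_engine and calculate are
hypothetical names; fake_engine stands in for Excel so the sketch runs
anywhere. With pywin32, the factory would presumably call
pythoncom.CoInitialize() and win32com.client.DispatchEx("Excel.Application")
on the worker thread itself.

```python
import queue
import threading
from concurrent.futures import Future


class ExcelWorker:
    """Run every job on one dedicated thread (option 2)."""

    def __init__(self, open_engine):
        # open_engine is a hypothetical factory. With pywin32 it would
        # initialise COM and create the Excel object, so that the object
        # is created on the same thread that later uses it.
        self._open_engine = open_engine
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        engine = self._open_engine()  # created on the worker thread
        while True:
            item = self._jobs.get()
            if item is None:  # sentinel: shut down
                break
            func, args, future = item
            try:
                future.set_result(func(engine, *args))
            except Exception as exc:  # hand the failure back to the caller
                future.set_exception(exc)

    def submit(self, func, *args):
        """Queue func(engine, *args) and return a Future for its result."""
        future = Future()
        self._jobs.put((func, args, future))
        return future

    def close(self):
        """Ask the worker to stop and wait for it to finish."""
        self._jobs.put(None)
        self._thread.join()


def fake_engine():
    # Stand-in for Excel: records which thread owns the "engine".
    return {"owner": threading.current_thread().name}


def calculate(engine, x, y):
    # Stand-in for "write inputs, recalculate, read outputs".
    return x + y, engine["owner"]
```

A web thread would call worker.submit(calculate, 2, 3).result() and
block until the worker has finished; an exception raised in the job is
re-raised in the calling thread by result().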

I've actually attempted to implement 1) and 2) and have had problems
with sporadic failures that cause Excel to no longer be accessible
from the interpreter. I've tried very hard to ensure that
pythoncom._GetInterfaceCount() always returns 0 when I believe I'm
finished with Excel. I don't think I'm perfect at always getting a 0,
but I'm pretty close. I believe that some of my problems are caused by
not being perfect about this but haven't been able to track down the
leaking references. I'm definitely not an expert in COM or its
threading models.

What techniques does the group suggest? Is spawning a new interpreter
for every request a bit too extreme (I don't mind the performance hit)?
Has anyone ever done anything like this? How did you get around these
problems?

Thanks,
Brandon

utabintarbo · Sep 15, 2006

Disclaimer: I am not an expert in python, or even programming, for that
matter....

In any case, option #2 sounds like the most theoretically sound. It
sounds like you are using Excel as a database, and your worker thread
as a transaction queue.

Something to consider: do you really need to modify a spreadsheet
directly? Can you not use another format (e.g. XML or CSV) to feed a
template spreadsheet, and then modify that file as a text file without
the overhead of Excel?

Just a thought....

Brandon · Sep 15, 2006

Thanks for the reply.

Unfortunately I cannot use a different format for the data since I'm
really using Excel as a calculation engine. I don't own the authoring
of these spreadsheets or even the data inside of them so I cannot
change the format. The spreadsheets also are complicated enough and
change frequently enough that it's not feasible for me to try to
extract the logic and data inside of them into a different calculation
engine (unless there's some existing tool which will do this for me?).
My program is just one user of the spreadsheets among hundreds of human
users. Unfortunately what's easy for the humans is always going to win
out over what's easy for my program.

My concern with my option 2) is that I appear to be experiencing some
level of contamination between calls that I can't resolve. It's not
truly transactional, in the sense that a call into the system is not
guaranteed to leave everything how it found it. That's why I thought
up option 3): let the OS help me clean up. I strongly suspect that my
option 2) is to some extent at the mercy of the Python garbage
collector. I'm not positive of that though.
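
A minimal sketch of option 3, for illustration only: each request runs
in a fresh interpreter that reads JSON on stdin and writes JSON on
stdout. CHILD_SOURCE and calculate_in_child are hypothetical names, and
the child here only sums its inputs so the sketch runs without Excel; a
real child would drive Excel through win32com instead.

```python
import json
import subprocess
import sys

# Hypothetical child program. A real one would start Excel, load the
# workbook, write the inputs and read back the results.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"total": sum(request["inputs"])}, sys.stdout)
"""


def calculate_in_child(inputs, timeout=60):
    """Run one request in a fresh interpreter and return its JSON reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps({"inputs": inputs}),
        capture_output=True,
        text=True,
        timeout=timeout,  # a hung child raises TimeoutExpired
        check=True,       # a non-zero exit raises CalledProcessError
    )
    return json.loads(completed.stdout)
```

When the child exits, everything held by that interpreter goes with it,
so nothing depends on the garbage collector. Excel runs as its own
process, though, so the child would presumably still need to call
Quit() on the application object before exiting; that part is not
shown.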

Thanks,
Brandon
