zodb troubles - seeking advice for app design

  • Thread starter Diez B. Roggisch

Diez B. Roggisch

Hi,

the docs on zodb are sparse, so I'll try my luck in this newsgroup.

I'm currently developing an application server that exposes its functionality
using CORBA/omniORB. The basic purpose of this app is to take a chunk of
text, classify it using the crm114 text classifier and return the
probabilities of the text belonging to a certain class for all defined
classes.

Now as the classification is rather expensive (around 400ms on my current
machine) I want to store the text chunk in a zodb and serve the
probabilities from there the next time. Another reason for this is that for
training the crm114, I need a base of items to rely upon. And if an item
gets classified wrong, a manual override must be possible.
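
To make the caching idea concrete, here is a minimal sketch in plain Python. It is not tied to zodb or crm114: `ClassificationCache`, `fake_classify` and the class names "spam"/"ham" are all hypothetical, and the plain dict stands in for whatever persistent mapping would actually hold the items.

```python
import hashlib


class ClassificationCache:
    """Caches class probabilities per text chunk and allows manual overrides."""

    def __init__(self, classify):
        # classify: callable(text) -> {class_name: probability}
        # (hypothetical stand-in for the expensive crm114 call)
        self._classify = classify
        # key -> {"text", "probs", "override"}; a dict here, a persistent
        # mapping in a real application (assumption)
        self._store = {}

    @staticmethod
    def _key(text):
        # Hash the text so that long chunks give short, fixed-size keys.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def probabilities(self, text):
        key = self._key(text)
        entry = self._store.get(key)
        if entry is None:
            # First time we see this text: pay for the classification once.
            entry = {"text": text, "probs": self._classify(text), "override": None}
            self._store[key] = entry
        # A manual override, if present, wins over the classifier result.
        if entry["override"] is not None:
            return entry["override"]
        return entry["probs"]

    def override(self, text, probs):
        self.probabilities(text)  # make sure the entry exists
        self._store[self._key(text)]["override"] = dict(probs)


calls = []  # records how often the "expensive" classifier really runs


def fake_classify(text):
    # Hypothetical classifier used only for this sketch.
    calls.append(text)
    return {"spam": 0.9, "ham": 0.1}
```

The stored text doubles as the training base, and the override slot covers the case where an item was classified wrongly.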


Everything works fine - in a single threaded app. But under load, the
omniorb will dispatch the incoming calls on several worker threads. zodb
requires that for each thread a separate connection has to be used. I'm not
sure how bad sharing the connection would be, but can imagine that this
isn't the best idea....

Now my problem is that the data object instances are separate for each
connection. So changes made in one thread aren't reflected in the instances
of other threads.

I've created some metaclass-magic to create proxy-objects that will access a
data-item created local to each thread - that works in a test, but means
opening and closing the db for each incoming call. No idea how that affects
performance.

And in the app server itself I end up having broken persistent instances - a
problem google also doesn't tell me much about.

Now I'm somewhat lost on how to actually implement my server - I could share
the connection and data objects amongst my threads and commit changes on a
regular basis in a background worker thread that locks the connection for
the needed time. That makes me lose all transactional benefits. Or I follow
my already started path, and try to access all data transparently through
proxies - but the connection-handling-stuff gets tough. Another way might
be to share one connection for every user over several calls and thus
threads, but the base problem remains: how to synchronize the object graph.

Any advice? I also thought about ditching zodb for postgres, but it's my
current belief that the actual problems would remain.

No idea if you people out there can help me - but I have to admit that I'm
quite lost, so maybe someone can prod me in the right direction or show me
that my whole idea is flawed and I can go for something totally
different....
 

Duncan Grisby

Diez B. Roggisch said:
Everything works fine - in a single threaded app. But under load, the
omniorb will dispatch the incoming calls on several worker threads. zodb
requires that for each thread a separate connection has to be used. I'm not
sure how bad sharing the connection would be, but can imagine that this
isn't the best idea....

Now my problem is that the data object instances are separate for each
connection. So changes made in one thread aren't reflected in the instances
of other threads.

If you want to stick with zodb, one option is to use a POA with the
main thread model threading policy. That will dispatch all calls with
the main thread, so your access to zodb will be single threaded. Of
course, that will kill the performance gain you would have had from
the threads.
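
The effect of that policy can be imitated in plain Python, which may help to judge the trade-off. The sketch below is an analogue only and does not use omniORB's API: every database access is handed to one dedicated thread, so the database only ever sees a single thread. `run_on_db_thread` and `lookup` are hypothetical names, and the dict stands in for the real database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A pool with exactly one worker: this thread is the only one that ever
# touches the database, whichever thread the request arrived on.
_db_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="db")


def run_on_db_thread(func, *args):
    """Run func(*args) on the single database thread and wait for the result."""
    return _db_executor.submit(func, *args).result()


def lookup(store, key):
    # Hypothetical database access. It also reports which thread ran it,
    # so the single-threaded dispatch can be observed.
    return store.get(key), threading.current_thread().name
```

Callers block until the database thread has served them, which is exactly the performance cost mentioned above.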

If you're not set on zodb, another option is to use Berkeley DB. That
works very nicely in a multi-threaded environment with omniORB.

Cheers,

Duncan.
 

Dieter Maurer

Diez B. Roggisch said:
the docs on zodb are sparse, so I'll try my luck in this newsgroup.

You did not look well enough!

There is the "zodb3.pdf" document and a "ZODB guide" (maybe "ZODB
tutorial"). At least they cover your essential question...
...
Everything works fine - in a single threaded app. But under load, the
omniorb will dispatch the incoming calls on several worker threads. zodb
requires that for each thread a separate connection has to be used. I'm not
sure how bad sharing the connection would be, but can imagine that this
isn't the best idea....

It is very bad!

The ZODB does not provide a locking facility.
It would need one if several threads could concurrently change
the same data. This would drastically complicate ZODB usage.

Instead, the ZODB gives each connection its own (partial) copy
of the ZODB content. Threads work on this copy only.
The ZODB expects that no two threads access the same connection
at the same time. If this condition is met, then different threads never
modify the same data.

Of course, different threads can modify different copies of
the same ZODB object. However, the ZODB will recognize such
a fact during the (second) commit and raise a ConflictError
in this case.
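
The copy-per-connection behaviour can be illustrated with a toy model. This is not the ZODB's implementation and uses none of its API: `ToyDatabase`, `ToyConnection` and the serial numbers are assumptions made up for the sketch, and the `ConflictError` class defined here only borrows the name.

```python
import threading


class ConflictError(Exception):
    """Toy stand-in: raised when a commit is based on an outdated copy."""


class ToyDatabase:
    def __init__(self):
        self._data = {}                  # key -> (serial, value)
        self._lock = threading.Lock()    # makes each commit atomic

    def open(self):
        return ToyConnection(self)


class ToyConnection:
    def __init__(self, db):
        self._db = db
        self._copies = {}   # key -> (serial seen when loaded, local value)
        self._dirty = set()

    def get(self, key, default=None):
        if key not in self._copies:
            # Load a private copy and remember the serial it is based on.
            self._copies[key] = self._db._data.get(key, (0, default))
        return self._copies[key][1]

    def set(self, key, value):
        self.get(key)                    # make sure the base serial is recorded
        serial, _ = self._copies[key]
        self._copies[key] = (serial, value)
        self._dirty.add(key)

    def commit(self):
        with self._db._lock:
            # Check every changed key first: did someone else commit meanwhile?
            for key in self._dirty:
                seen, _ = self._copies[key]
                current, _ = self._db._data.get(key, (0, None))
                if current != seen:
                    self._reset()
                    raise ConflictError(key)
            # No conflict: publish the changes under a new serial.
            for key in self._dirty:
                seen, value = self._copies[key]
                self._db._data[key] = (seen + 1, value)
        self._reset()

    def _reset(self):
        # Drop the private copies so the next access reloads fresh state.
        self._copies.clear()
        self._dirty.clear()


db = ToyDatabase()
a, b = db.open(), db.open()
a.set("n", a.get("n", 0) + 1)    # both connections increment their own copy
b.set("n", b.get("n", 0) + 1)
a.commit()                       # the first commit succeeds
try:
    b.commit()                   # the second commit is based on stale data
    second_commit_failed = False
except ConflictError:
    second_commit_failed = True
    b.set("n", b.get("n", 0) + 1)    # retry on a freshly loaded copy
    b.commit()
```

The usual reaction to a conflict is the one shown at the end: throw the stale copy away, redo the work on fresh data and commit again.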

As you see, different threads *MUST NOT* use the same connection.
Open a new connection for each thread (and close it before
the thread dies).
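
One way to organise this is a thread-local slot that holds the connection, so a worker thread opens its connection once and reuses it across calls instead of reopening the db per call. The sketch is plain Python: `get_connection`, `close_connection`, `worker` and `FakeConnection` are hypothetical names, and `FakeConnection` stands in for whatever the real database hands out.

```python
import threading

_local = threading.local()   # each thread sees its own attributes on this object


def get_connection(open_connection):
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = open_connection()
        _local.conn = conn
    return conn


def close_connection():
    """Close and forget this thread's connection, if it has one."""
    conn = getattr(_local, "conn", None)
    if conn is not None:
        conn.close()
        del _local.conn


class FakeConnection:
    # Hypothetical stand-in for a real database connection.
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def worker(results):
    try:
        first = get_connection(FakeConnection)
        second = get_connection(FakeConnection)   # reuses the same connection
        results.append((first, second))
    finally:
        close_connection()   # close before the thread ends
```

When the ORB owns the worker threads, the hard part is finding the place to call `close_connection`; the sketch assumes the application controls the thread's lifetime.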


Dieter
 
