Concurrent data access management

C

carmelo

Hi everybody,
I developed a multiuser app which manages data stored on a MySql db. I
need to manage properly the following scenario of concurrent data
access:

1) user_1 opens a form to display a record data read from db
2) later user_2 opens the same record and updates it
3) user_1 updates the record


On this scenario are obviously created dirty data if I don't manage
properly it...
What do you suggest to do?


Thank you very much in advance for your help
Carmelo
 
M

Martin Gregorie

Hi everybody,
I developed a multiuser app which manages data stored on a MySql db. I
need to manage properly the following scenario of concurrent data
access:

1) user_1 opens a form to display a record data read from db 2) later
user_2 opens the same record and updates it 3) user_1 updates the record
Any database with decent transaction management should handle that
transparently, but there are two design issues:

1) be careful with auto-commit. You may be able to use it successfully,
but you'll have better control if you turn it off and use explicit
commits

2) If the user can read a record, look at it, scratch his head, get a
coffee and then make a change within the same transaction you *will*
get locking issues if the database is small. As the access key
population grows significantly bigger than the user population,
locking collisions become much less frequent, but their possibility
must not be ignored.

The best design will use two transactions:
- user reads data for display and immediately commits to release
read locks

- user wants to apply changes: when he hits 'enter' start an updating
transaction, re-read all rows, apply updates and commit without any
user interaction during this time.

This will work OK if the update merely makes additive changes to existing
rows, e.g., reduces a stock holding and inserts an associated invoice
line.

However, if the update /replaces/ values in an existing row you need
something better. Check whether the record is in use during the read
transaction and warn the user if it is. If not, mark the row(s) as 'in
use' and commit. Clear the 'in use marker' as part of the update
transaction and:
(a) make sure that you also unset it if the user decides not to
update anything
(b) design and implement an overnight batch process that clears
any left-over 'in use' markers as part of the daily backup.

Note that such 'in use' markers are application data. They are never
automatically managed by the DBMS and their management is non-trivial, so
if possible, design the database to avoid having to use them.
 
M

Mark Space

Martin said:
- user wants to apply changes: when he hits 'enter' start an updating
transaction, re-read all rows, apply updates and commit without any
user interaction during this time.


In a more general case, wouldn't you want to stop the transaction if you
detected any changed datum? Even if a record is not "in use," the fact
that someone else mucked with it might be reason enough to report that
fact to the user, and ask them to re-evaluate their own changes. This
might be the safest as a change elsewhere might prompt the user to make
a different change, or no change.

In a high use situation, I can see also allowing a user to make an
explicit lock on a database entry, or set of entries. This might be the
only way for them actually make a change with out getting stopped by the
above algorithm.

I'm sure folks who actually work with databases more than I do may have
other solutions too. E.g. some sort of queue that holds requests to
lock rows sets so that all users have a chance to update datum that is
very highly contested.
 
W

Wojtek

carmelo wrote :
Hi everybody,
I developed a multiuser app which manages data stored on a MySql db. I
need to manage properly the following scenario of concurrent data
access:

1) user_1 opens a form to display a record data read from db
2) later user_2 opens the same record and updates it
3) user_1 updates the record


On this scenario are obviously created dirty data if I don't manage
properly it...
What do you suggest to do?

Using transactions to lock the row is OK if you are sure that user1
will do the update in a timely fashion.

In the Web world you cannot make that assumption.

I keep a column named 'updated' which holds a timestamp. It is written
to everytime a change is made to the row.

So user1 reads the row (including the updated column). User2 reads the
row. User1 makes some chages and commits the row. Note that this
updates the database 'updated' value.

When user2 tries to update the row it fails because the user2 'updated'
value is different than the database value. EVERY update statement
contains:

where row_id=5 and updated='2009.01.01 13:23:21'

So if the update fails, I tell the user that someone else modified the
row, and I give them a chance to re-read the new information.
 
M

Martin Gregorie

Ugh. I thought the modern RDBMS take care of colliding transactions.
It's probably best to rely on the RDBMS telling you whether an update
was successful or not. In that case, the user/application gets to retry.
Yes, of course they can, but RULE #1 in an interactive environment is
"don't hold locks through a user interaction". This means that a simple
retrieve - inspect - make changes dialogue will always involve at least
two transactions, so it is up to the application to somehow mark the
affected record(s) to show they might get updated shortly. Wojtek's
solution works for the simple two transaction case and, indeed, is better
than mine.

However, his solution is probably insufficient to deal with a more
complex situation, e.g. making a purchase using a multi-item shopping
basket. You don't want to annoy the punter by suddenly telling them at
checkout or after the credit checks have being done that some of the
items they put in the basket are now out of stock, so you'll need to
reserve stock as they select stuff and to mark the items to show that
there are specific reserved allocations against them. IOW you're applying
updates that may need to be backed out throughout a multi-minute
transaction. There are actually more decision points along the way that
cause the purchase to be backed out and abandoned than there are that let
it go to completion (credit card declined, a dozy US vendor only telling
you he won't ship outside the States when you enter your address are two
examples). You've no possibility of holding row or table locks through
the entire process and a DMBS has no other way of knowing that a bunch of
DB changes belong to Joe's purchase than by holding locks. Hence, like it
or lump it, it *is* the application's job to take care of task
separation.

In this sort of situation I know that the marker technique will work but
only careful analysis on a case by case basis would tell whether
timestamping would also handle it successfully.
 
R

Roedy Green

Hi everybody,
I developed a multiuser app which manages data stored on a MySql db. I
need to manage properly the following scenario of concurrent data
access:

1) user_1 opens a form to display a record data read from db
2) later user_2 opens the same record and updates it
3) user_1 updates the record


On this scenario are obviously created dirty data if I don't manage
properly it...
atomic updates are mainly SQL's problem.

see http://mindprod.com/jgloss/sql.html#ATOMICUPDATES

read up on JDBC transactions.

see http://mindprod.com/jgloss/jdbc.html
http://mindprod.com/jgloss/transaction.html

google or scan your doc indexes for words like transaction, rollback,
commit.
--
Roedy Green Canadian Mind Products
http://mindprod.com

"Nationalism is an infantile disease. It is the measles of mankind."
~ Albert Einstein
 
C

carmelo

carmelo wrote :




Using transactions to lock the row is OK if you are sure that user1
will do the update in a timely fashion.

In the Web world you cannot make that assumption.

I keep a column named 'updated' which holds a timestamp. It is written
to everytime a change is made to the row.

So user1 reads the row (including the updated column). User2 reads the
row. User1 makes some chages and commits the row. Note that this
updates the database 'updated' value.

When user2 tries to update the row it fails because the user2 'updated'
value is different than the database value. EVERY update statement
contains:

where row_id=5 and updated='2009.01.01 13:23:21'

So if the update fails, I tell the user that someone else modified the
row, and I give them a chance to re-read the new information.


Thank you Wojtek for your answer, I think this is a good solution.
I'll implement it, but before of doing that manually I'd like to know
if this concurrent access management is embedded into my framework...
I'm using Netbeans 6.5.1 with JPA and TopLink. Do you think is there
any way to let the framework manage it?


Thank you again guys for your help
Carmelo


-
http://carmelosaffioti.blogspot.com
-
 
U

Uwe Plonus

bugbear said:
I assume you have some kind of "purger" in case user1 quits his browser
and walks over to the coffee machine, and NEVER completes the update.

BugBear

Why do you need a purger? If user1 only reads the daaset and then leaves
the browser the updated field is _not updated_. The update itself is
only done in a short time so this is a really perfect way to handle such
situations.

Uwe
 
C

carmelo

bugbear schrieb:






Why do you need a purger? If user1 only reads the daaset and then leaves
the browser the updated field is _not updated_. The update itself is
only done in a short time so this is a really perfect way to handle such
situations.

Uwe

I'm using JPA with TopLink, and I discovered that I can let the
framework manage the optimistic locking... It requires an additional
integer field on the database, lets call it VERSION, which is
incremented at each update. The update fails if the stored VERSION is
different respect the one read before. If the update fails is thrown
an exception.

I only had to add this code on the Entity bean:
@Version
protected int version;


Cheers
Carmelo

-
http://carmelosaffioti.blogspot.com
-
 
W

Wojtek

Uwe Plonus wrote :
Why do you need a purger? If user1 only reads the daaset and then leaves the
browser the updated field is _not updated_. The update itself is only done in
a short time so this is a really perfect way to handle such situations.

Actually, since the update to the 'updated' field is within the same
SQL as the 'where', the new value set is done in (effectively) zero
time so there is NO chance of two users updating at the same time. One
of them will lose the race.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,816
Latest member
SapanaCarpetStudio

Latest Threads

Top