Searching the web and books for information on this, I can't seem to
find a definitive yes or no to my question:
"Is there a portable way to do file locking?"
Some of the problems that are mentioned are that NFS systems make
almost everything non-atomic, locking methods often depend on fcntl
which is not available on all systems, and the dreaded race
condition when test and set are non-atomic.
Ruby can be used on the Mac, PC, and Unix, so I'm really after
something that portable. I can't use a Mutex because I need this to
be exclusive across process boundaries (several invocations of the
program).
My searching suggests this is a common problem, but the answer to it
is rare!
Thank you
Hugh
i've been doing a lot of experiments with locking myself, mainly on nfs systems
for some designs for a distributed work queue i'm working on, and have come to
largely the same conclusions. however, you definitely want fcntl based locking
for NFS systems. as far as i know any posix compliant system has fcntl but i'm
a windows dummy (windows people insert correction)...
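
for reference, a minimal sketch of cross-process exclusion using ruby's built-in
File#flock. the helper name with_exclusive_lock and the temp file path are made
up for illustration. note this is not guaranteed to be fcntl based, so treat it
as a baseline for local filesystems, not as something known to hold up on NFS:

```ruby
require "tmpdir"

# Run the block while holding an exclusive advisory lock on lock_path.
# with_exclusive_lock is a hypothetical helper name, not part of Ruby.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |f|
    f.flock(File::LOCK_EX) # blocks until no other holder has the lock
    begin
      yield f
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

# Example: bump a counter stored in the locked file itself.
counter_path = File.join(Dir.tmpdir, "flock_demo_#{Process.pid}.txt")
result = with_exclusive_lock(counter_path) do |f|
  n = f.read.to_i + 1 # empty file reads as 0
  f.rewind
  f.write(n.to_s)
  f.truncate(f.pos)
  n
end
File.delete(counter_path)
```

the lock is advisory: it only excludes other processes that take the same lock
on the same file.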
you might want to check out a few things i've done - most of them were done
__very__ quickly and further testing is in order but:
* c ext to replace File.flock with fcntl based impl
http://www.codeforpeople.com/lib/ruby/posixlock/
* a simpler, but less portable?, pure ruby solution provided by matz
http://www.codeforpeople.com/lib/ruby/nfslock/
* interface to liblockfile (man 1 lockfile)
http://www.codeforpeople.com/lib/ruby/lockfile/
the tests i've been running (a day at a time) consist of multiple processes on
multiple hosts competing to update a queue in an ordered fashion... if the
queue is ever out of order, or a marshal error is thrown, the test 'fails'.
i also mark the times each node acquires the lock and gather stats on the
min/max/avg time required to obtain the lock. i've run using all three
methods above, plus system calls to lockfile, for my locking mechanism and
have the following observations:
* they all work on nfs - i get a core dump every now and again in the
liblockfile impl which is almost certainly a bug in my own code
* lockd sucks at giving any sort of 'even' distribution to the processes,
what i generally see is one node hogging the lock for a while, then
eventually lockd seems to realize this and gives it to another node for a
while. for my uses this is not a big deal since the competition in
production would not actually be that fierce... it DOES work though with
a sufficiently new lockd impl or a rather expensive netapp...
* the max time between locks for 6 or so processes competing for an fcntl based
lock on our systems is around 30 seconds
* lockfile seems to work really well - giving max/min/avg of about 1 sec for
all nodes. this really surprised me.
* the big drawback to lockfiles is potential hangs and inability to grant
read-locks. there is a serious locking package on CPAN which claims to do
this (read/write nfs safe lockfiles) at
http://search.cpan.org/~bbb/File-NFSLock-1.20/lib/File/NFSLock.pm
the idea of this seems quite sketchy. i have not tested it.
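
the min/max/avg lock timings mentioned above could be gathered with something
like the following. this is only a sketch, not the actual test code: it times
File#flock from a single process, and the names time_lock_acquire and summarize
are invented here. a real test would run it from several processes on several
hosts against a shared file:

```ruby
require "tmpdir"

# Time how long one exclusive flock acquisition takes, in seconds.
# time_lock_acquire is a hypothetical helper name.
def time_lock_acquire(path)
  File.open(path, File::RDWR | File::CREAT, 0644) do |f|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    f.flock(File::LOCK_EX)
    waited = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    f.flock(File::LOCK_UN)
    waited
  end
end

# Reduce a list of wait times to min/max/avg.
def summarize(waits)
  { min: waits.min, max: waits.max, avg: waits.sum / waits.size }
end

path = File.join(Dir.tmpdir, "lock_timing_#{Process.pid}.lock")
waits = Array.new(20) { time_lock_acquire(path) }
stats = summarize(waits)
File.delete(path)

puts format("min=%.6f max=%.6f avg=%.6f", stats[:min], stats[:max], stats[:avg])
```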
if you are interested in my test code drop me a line - it's one script that
you run on all the nodes, and a monitoring script that goes with it.... nice
and terrible like my testing code tends to be...
in any case - i would think implementing the algorithm used by liblockfile in
ruby might be a good solution. the hard work at making things portable has
been done for you by matz and co. i made a stab at that (it's in the lockfile
package) but it is NOT finished... i should probably take it out of there...
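
a rough pure ruby sketch of the general idea, using the classic link-and-check
lockfile technique: create a uniquely named temp file, hard link it to the lock
name, then trust the temp file's link count and not the return value of link.
this has not been checked line for line against what liblockfile does, it
assumes a filesystem with POSIX-style hard links, and the names try_lockfile,
acquire_lockfile and release_lockfile are made up for illustration:

```ruby
require "socket"
require "tmpdir"

# Try once to take the lock; returns true on success, false otherwise.
def try_lockfile(lock_path)
  # Unique per host and process so competitors never share a temp file.
  tmp = "#{lock_path}.#{Socket.gethostname}.#{Process.pid}"
  File.open(tmp, File::WRONLY | File::CREAT | File::EXCL, 0644) do |f|
    f.puts Process.pid
  end
  begin
    File.link(tmp, lock_path) # fails if the lock name already exists
  rescue SystemCallError
    # Ignored on purpose: the link count check below decides the outcome.
  end
  File.stat(tmp).nlink == 2 # 2 links means our link really was created
ensure
  File.delete(tmp) if tmp && File.exist?(tmp)
end

# Retry a bounded number of times so a stale lock cannot hang us forever.
def acquire_lockfile(lock_path, retries: 10, sleep_for: 0.5)
  retries.times do
    return true if try_lockfile(lock_path)
    sleep sleep_for
  end
  false
end

def release_lockfile(lock_path)
  File.delete(lock_path)
end

lock = File.join(Dir.tmpdir, "demo_#{Process.pid}.lock")
first  = try_lockfile(lock) # lock is free, so this succeeds
second = try_lockfile(lock) # lock is held, so this fails
release_lockfile(lock)
third = try_lockfile(lock)  # free again after release
release_lockfile(lock)
```

this only gives exclusive locks, and a holder that dies without releasing leaves
the lockfile behind, which is the potential hang noted earlier.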
i'm very interested in any findings you have along these lines. please keep
us informed.
-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
| URL :: http://www.ngdc.noaa.gov/stp/
| TRY :: for l in ruby perl;do $l -e "print \"\x3a\x2d\x29\x0a\"";done
===============================================================================