Monitoring directory for new files on Solaris

Mikhail Teterin

Hello!

An application we need to write has to monitor a directory and process any
files appearing in there -- as quickly as possible.

The usual approach to this task is to rescan the directory every once in a
while, which is horrible -- it loads the system keeping the program in RAM
and still has latency measured in seconds...

How can I ask the kernel to notify my program, when the directory is updated
in any way -- via something like poll(), or select(), or

http://developers.sun.com/solaris/articles/event_completion.html

?

Is there a Java way of doing this already (without periodic polling), or
will I need to make my own JNI class?

Thanks for any pointers!

-mi
 
Wolfgang

Mikhail said:
Hello!

An application we need to write has to monitor a directory and process any
files appearing in there -- as quickly as possible.

The usual approach to this task is to rescan the directory every once in a
while, which is horrible -- it loads the system keeping the program in RAM
and still has latency measured in seconds...

What do you mean by a rescan? Scanning a mostly empty directory should not
be slow. Don't you empty the directory after processing? Which language does
your app use, pure Java?

First idea (not Java): install GNU find and scan every 5 seconds for files
newer than 5 seconds, but take care (read the last paragraph).

Second idea, in filesystem terms: just look for changes in the
directory itself, like (again just system commands): "cat /dir > /tmp/dirread1
; sleep 5 ; cat /dir > /tmp/dirread2 ; diff /tmp/dirread1 /tmp/dirread2"
(sorry, at the moment I cannot verify what cat does on directories, maybe dd
does the trick).
Are there any routines in C or Java for reading just the directory itself,
not the inodes in it?

In my opinion, the more difficult part of your software is the file
locking while processing the files, and making sure not to process files
that other processes are still accessing, e.g. files that are not yet
complete or not yet closed by the writing process.

W.
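
The "file is still being written" problem raised above can be approximated without locks by a stability heuristic: treat a file as ready only once its size and modification time are identical on two consecutive scans. This is a different technique from the file locking mentioned in the post, and it is only a heuristic, since a writer that stalls for longer than the scan interval will fool it. A minimal Java sketch follows; the class name StableFileTracker and its methods are made up for illustration.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not from the thread: remembers the last observed
// size and mtime of each file and reports when they stop changing.
class StableFileTracker {
    // Last observed {size, mtime} per file name.
    private final Map<String, long[]> last = new HashMap<>();

    // Records this observation and returns true only if size and mtime
    // match the previous observation of the same name.
    boolean observe(String name, long size, long mtime) {
        long[] prev = last.put(name, new long[] { size, mtime });
        return prev != null && prev[0] == size && prev[1] == mtime;
    }

    // Convenience overload that reads size and mtime from the file system.
    boolean observe(File f) {
        return observe(f.getName(), f.length(), f.lastModified());
    }

    // Call after a file has been processed and removed.
    void forget(String name) {
        last.remove(name);
    }
}
```

Where the producer can be changed, having it write under a temporary name and rename the file into place when finished is usually more robust, since a rename within one file system is atomic on POSIX systems.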
 
Mark Thornton

Mikhail said:
Hello!

An application we need to write has to monitor a directory and process any
files appearing in there -- as quickly as possible.

The usual approach to this task is to rescan the directory every once in a
while, which is horrible -- it loads the system keeping the program in RAM
and still has latency measured in seconds...

How can I ask the kernel to notify my program, when the directory is updated
in any way -- via something like poll(), or select(), or

http://developers.sun.com/solaris/articles/event_completion.html

?

Is there a Java way of doing this already (without periodic polling), or
will I need to make my own JNI class?

Thanks for any pointers!

-mi

I seem to remember that on Unix the modification time of the directory
normally changes when there is any change to its content. This may be a
bit quicker than listing the content. While repeatedly listing the
directory seems ugly, it isn't all that slow or inefficient (even from
Java) as everything is cached. This is especially true if processed
files are removed, thus leaving the directory empty.
Warning, the directory modification time does not change on Windows, so
don't try this approach there.

Mark Thornton
 
Mark Clements

Hello!

An application we need to write has to monitor a directory and process any
files appearing in there -- as quickly as possible.

The usual approach to this task is to rescan the directory every once in a
while, which is horrible -- it loads the system keeping the program in RAM
and still has latency measured in seconds...

How can I ask the kernel to notify my program, when the directory is updated
in any way -- via something like poll(), or select(), or

http://developers.sun.com/solaris/articles/event_completion.html

?

Is there a Java way of doing this already (without periodic polling), or
will I need to make my own JNI class?

From the docs for JNotify:

"JNotify works on both Windows (Windows 2000, XP, Vista) and Linux with
INotify support (Kernel 2.6.14 and above)."

This probably won't help you much if you're on Solaris.

java-fam may give you some ideas.

Mark
 
Victor Sudakov

In comp.unix.solaris Mikhail Teterin said:
An application we need to write has to monitor a directory and process any
files appearing in there -- as quickly as possible.
The usual approach to this task is to rescan the directory every once in a
while, which is horrible -- it loads the system keeping the program in RAM
and still has latency measured in seconds...
How can I ask the kernel to notify my program, when the directory is updated
in any way -- via something like poll(), or select(), or

Is there something like kqueue() in Solaris?
In FreeBSD, many people find the wait_on utility very useful.
See http://freebsd.unixfreunde.de/sources/wait_on-1.1.tar.gz
 
Wolfgang

Wolfgang said:
second idea: in filesystem terms: just look for changes in the
directory,

I forgot (or as we say in German: "den Wald vor lauter Baeumen nicht sehen",
i.e. not seeing the forest for the trees):
As mentioned in another post, Unix keeps atime/mtime/ctime for directories
just as for every other inode. I think monitoring just the mtime of the
directory gives you the trigger to scan for new files, if you don't have
subdirectories.

W.
 
victorfeng1973

What do you mean by a rescan? Scanning a mostly empty directory should not
be slow. Don't you empty the directory after processing? Which language does
your app use, pure Java?

First idea (not Java): install GNU find and scan every 5 seconds for files
newer than 5 seconds, but take care (read the last paragraph).

GNU find has the -cmin option, but I could not find an option for seconds.

Victor
 
Gordon Beaton

An application we need to write has to monitor a directory and
process any files appearing in there -- as quickly as possible.

The usual approach to this task is to rescan the directory every
once in a while, which is horrible -- it loads the system keeping
the program in RAM and still has latency measured in seconds...

Three suggestions:

- Poll the last modified time of the directory itself at a suitable
interval. It will be updated when changes are made to the directory
(e.g. files added, removed, renamed). When you see that something
has changed, rescan the directory to see what it was.

- I believe Solaris has a mechanism for subscribing to file system
events. Write a JNI wrapper for that.

- Modify the application producing these files, so that it can notify
your application.

/gordon
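
The first suggestion above can be sketched in Java roughly as follows, assuming plain java.io.File is acceptable. The class DirectoryPoller, the helper added() and the one-second interval are illustrative choices, not anything from the thread. Two caveats: the first call to poll() reports every file already present, and because mtime has limited resolution, a file added within the same timestamp tick as the previous scan can go unnoticed until the next change, so an occasional unconditional rescan may be worth adding.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: checks the directory's mtime and rescans only
// when it differs from the value seen on the previous poll.
class DirectoryPoller {
    private final File dir;
    private long lastSeenMtime = -1L;            // -1 forces a scan on the first poll
    private Set<String> known = new HashSet<>();

    DirectoryPoller(File dir) {
        this.dir = dir;
    }

    // Pure helper: names present in 'after' but not in 'before'.
    static Set<String> added(Set<String> before, Set<String> after) {
        Set<String> result = new HashSet<>(after);
        result.removeAll(before);
        return result;
    }

    // Returns names that appeared since the previous scan;
    // returns an empty set if the directory's mtime is unchanged.
    Set<String> poll() {
        long mtime = dir.lastModified();         // 0 if the directory does not exist
        if (mtime == lastSeenMtime) {
            return new HashSet<>();
        }
        lastSeenMtime = mtime;
        String[] names = dir.list();             // null if missing or unreadable
        Set<String> current = new HashSet<>();
        if (names != null) {
            for (String n : names) {
                current.add(n);
            }
        }
        Set<String> fresh = added(known, current);
        known = current;
        return fresh;
    }

    public static void main(String[] args) throws InterruptedException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        DirectoryPoller poller = new DirectoryPoller(dir);
        while (true) {
            for (String name : poller.poll()) {
                System.out.println("new file: " + name);
            }
            Thread.sleep(1000);                  // arbitrary interval for the sketch
        }
    }
}
```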

 
Gordon Beaton

GNU find has the -cmin option, but I could not find an option for seconds.

Remember what the most recent file was that you processed, and use
"-cnewer" instead. If you have to remove the files as you process
them, you can create a marker file of your own solely for this
purpose.

/gordon
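
The marker-file idea translates to Java along these lines. Note one difference from find: "-cnewer" compares status-change time (ctime), while java.io.File only exposes the modification time, so this sketch is closer to "-newer". The class MarkerScan and its methods are hypothetical names. Files whose mtime equals the marker's are skipped, so timestamp resolution matters here as well.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds regular files modified after a marker file.
class MarkerScan {
    // Lists regular files in dir whose mtime is strictly later than the marker's.
    // A missing marker has mtime 0, so every file counts as newer.
    static List<File> newerThan(File dir, File marker) {
        long cutoff = marker.lastModified();
        List<File> result = new ArrayList<>();
        File[] entries = dir.listFiles();        // null if missing or unreadable
        if (entries == null) {
            return result;
        }
        for (File f : entries) {
            if (!f.equals(marker) && f.isFile() && f.lastModified() > cutoff) {
                result.add(f);
            }
        }
        return result;
    }

    // Creates the file if needed and sets its mtime, like touch(1) with a timestamp.
    static void touch(File f, long mtimeMillis) {
        try {
            f.createNewFile();                   // no-op if the file already exists
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        f.setLastModified(mtimeMillis);
    }
}
```

A caller would run newerThan(), process the results, and then touch() the marker with the mtime of the newest file it handled, so that nothing arriving during processing is skipped.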

 
Gordon Beaton

Remember what the most recent file was that you processed, and use
"-cnewer" instead. If you have to remove the files as you process
them, you can create a marker file of your own solely for this
purpose.

On the other hand, there is no need to use find when your existing
application already does exactly this kind of scanning. Find doesn't
offer any shortcuts here.

/gordon

 
Casper H.S. Dik

This is not official, I presume. Not supported by Sun, is it?


What do you mean by "not official"? It's not in Solaris 10 or
earlier if that is what you mean.


Casper
 
Victor Sudakov

In comp.unix.solaris Casper H.S. Dik said:
What do you mean by "not official"? It's not in Solaris 10 or
earlier if that is what you mean.

I mean it's not from Sun but from some volunteer contributor, is it?
 
Mikhail Teterin

Wolfgang said:
What do you mean by a rescan? Scanning a mostly empty directory should not
be slow. Don't you empty the directory after processing? Which language does
your app use, pure Java?
First idea (not Java): install GNU find and scan every 5 seconds for files
newer than 5 seconds, but take care (read the last paragraph).

The actual rescan is not difficult to implement -- as others (and yourself)
have pointed out, one just needs to check the directory's mtime.

However, the problem common to all /periodic/ algorithms is that they are
inherently inefficient. You want to be event driven: getting woken up, when
stuff happens, instead of -- as bored children do to their parents when
riding in a car -- continuously asking the kernel: "Are we there yet? No?
How about now? Are we there yet?"

For example, you suggested 5 seconds, so let's go with that... Polling
means, your program must always reside in memory -- the OS can not really
page it out, because the program needs to run every 5 seconds. On the other
hand, your average latency will still be 2.5 seconds (and up to 5), which
may well be too long for your application...

Moving the 5 seconds figure into any direction eases one of these two
negatives while exacerbating the other...
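
To put numbers on that trade-off: if a file arrives at a uniformly random moment within a polling interval of T seconds, the expected wait until the next poll is T/2, while the process wakes up 3600/T times per hour. The small Java sketch below tabulates both for a few intervals; the class name and the chosen intervals are arbitrary, and processing time after the poll is ignored.

```java
// Illustrative arithmetic only: average detection latency versus wakeup rate.
class PollingTradeoff {
    // Expected wait for a file arriving at a uniformly random point in the interval.
    static double averageLatencySeconds(double intervalSeconds) {
        return intervalSeconds / 2.0;
    }

    // How often the process must be scheduled per hour.
    static double wakeupsPerHour(double intervalSeconds) {
        return 3600.0 / intervalSeconds;
    }

    public static void main(String[] args) {
        double[] intervals = { 0.5, 1.0, 5.0, 30.0 };
        for (double t : intervals) {
            System.out.printf("interval %5.1f s: avg latency %5.2f s, %6.0f wakeups/hour%n",
                    t, averageLatencySeconds(t), wakeupsPerHour(t));
        }
    }
}
```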

Better OSes have a way for programs to request notification from the OS --
the kernel knows, when it changes things, so why can't it tell the process,
if asked nicely? Unfortunately, Solaris has not joined this club yet,
according to the link kindly posted by Ian Collins in this thread:

http://blogs.sun.com/praks/entry/file_events_notification

For now, however, no good solution appears to exist and I will have to use
polling...

Yours,

-mi
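
For readers on a later Java version: the java.nio.file.WatchService API, part of the standard library since Java 7 and so newer than this thread, gives the blocking, event-style interface asked for in the original question without a JNI class. Whether it is backed by a native kernel facility or by internal polling depends on the platform and JDK, so latency is not guaranteed. A minimal sketch, with the class name WatchSketch and the helper awaitCreated() invented for illustration:

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class WatchSketch {
    // Blocks for up to timeoutSeconds and returns the names (relative to the
    // watched directory) from the next batch of creation events, if any.
    static List<Path> awaitCreated(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<Path> created = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return created;                      // timed out, nothing happened
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            // OVERFLOW events (lost notifications) are ignored here; a real
            // program should rescan the directory when it sees one.
            if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                created.add((Path) event.context());
            }
        }
        key.reset();                             // re-arm the key for further events
        return created;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            while (true) {
                for (Path name : awaitCreated(watcher, 60)) {
                    System.out.println("created: " + dir.resolve(name));
                }
            }
        }
    }
}
```

A creation event only says that the directory entry exists, not that the writer has finished, so the earlier concern about incomplete files still applies.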
 
Darren Dunham

In comp.unix.solaris Mikhail Teterin said:
http://blogs.sun.com/praks/entry/file_events_notification

For now, however, no good solution appears to exist and I will have to use
polling...

It's not the best solution, but you could use 'dtrace' on Solaris 10.
It would not require polling; instead it receives probe firings as events.
That said, I don't think you can guarantee its correctness. You'd want to
fall back on polling (at a longer interval) as a safety check.
 
