Need a method for detecting when files are written to a directory

Elliott

Hi all:

Perhaps someone can help a relatively new Java programmer. I am
writing an application that sits in a network and processes files when
they are written to a designated directory. I am wondering if there
is some elegant way to do this. I realize I can create a loop in
which the process sleeps for say, five seconds and then checks to see
if a directory has a file in it and then performs an action
appropriately. However, I wonder if there is a better way to do
this. If there is some way that a file being written to a designated
directory triggers some kind of event that wakes my program up to
begin processing, that would be better, but how does one do this?

Any help would be greatly appreciated.

Thanks,

Elliott
 
Michael Borgwardt

Elliott said:
Perhaps someone can help a relatively new Java programmer. I am
writing an application that sits in a network and processes files when
they are written to a designated directory. I am wondering if there
is some elegant way to do this. I realize I can create a loop in
which the process sleeps for say, five seconds and then checks to see
if a directory has a file in it and then performs an action
appropriately. However, I wonder if there is a better way to do
this. If there is some way that a file being written to a designated
directory triggers some kind of event that wakes my program up to
begin processing, that would be better, but how does one do this?

This is possible on some operating systems, but only through very
OS-specific APIs, so you'd have to use JNI.
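That was the situation before Java 7. Since then the standard library has included java.nio.file.WatchService, which uses the native notification facility where one is available and falls back to polling where it is not, so no JNI is needed. A minimal sketch, assuming Java 7 or later; the class name, the awaitCreated helper and the "incoming" directory are made up for illustration:

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of any library.
class DirectoryWatchSketch {

    /**
     * Waits up to timeoutMillis for the next batch of events on the watcher and
     * returns the paths of entries created in dir. Returns an empty list on timeout.
     */
    static List<Path> awaitCreated(WatchService watcher, Path dir, long timeoutMillis)
            throws InterruptedException {
        List<Path> created = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return created; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                // For ENTRY_CREATE the context is the file name relative to dir.
                Path name = (Path) event.context();
                created.add(dir.resolve(name));
            }
        }
        key.reset(); // re-arm the key so later events are queued again
        return created;
    }

    public static void main(String[] args) throws Exception {
        // "incoming" is an assumed default; pass the real directory as an argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "incoming");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            while (true) {
                for (Path p : awaitCreated(watcher, dir, 60_000)) {
                    System.out.println("created: " + p);
                }
            }
        }
    }
}
```

Note that ENTRY_CREATE fires when the directory entry appears, which can be before the writer has finished, so the program still has to decide for itself when a file is complete.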
 
Will Hartung

Elliott said:
Hi all:

Perhaps someone can help a relatively new Java programmer. I am
writing an application that sits in a network and processes files when
they are written to a designated directory. I am wondering if there
is some elegant way to do this. I realize I can create a loop in
which the process sleeps for say, five seconds and then checks to see
if a directory has a file in it and then performs an action
appropriately. However, I wonder if there is a better way to do
this. If there is some way that a file being written to a designated
directory triggers some kind of event that wakes my program up to
begin processing, that would be better, but how does one do this?

Nothing portable. Your best bet is to regularly take a directory snapshot
and check last modification times for the files.

Compare those times to your last snapshot. Any files that haven't changed
for "some threshold" can be considered closed and safe to be moved,
otherwise they may still be filling up, for the file will appear in the
directory before it's actually done being written (which could be Bad).

We do just this using Unix commands, checking an input directory every
minute, and moving those that haven't changed in that minute to a staging
directory (since same filesystem moves are atomic), and then we have another
process that acts on all the files in the staging directory knowing that
they're safe files to work with.
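In Java, the same sweep might look roughly like the sketch below. It is a simplification, not our actual scripts: instead of keeping the previous snapshot it compares each file's last-modified time to the clock, and the class name, the sweep method and the "inbox"/"staging" directory names are invented for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class StagingSweepSketch {

    /**
     * Moves every regular file in inbox that has not been modified for at least
     * quietMillis (measured against nowMillis) into staging. Returns the new paths.
     */
    static List<Path> sweep(Path inbox, Path staging, long quietMillis, long nowMillis)
            throws IOException {
        // First pass: collect the files that look finished.
        List<Path> quiet = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inbox)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                long age = nowMillis - Files.getLastModifiedTime(file).toMillis();
                if (age >= quietMillis) {
                    quiet.add(file); // untouched for the whole threshold
                }
                // Otherwise it may still be filling up; leave it for the next sweep.
            }
        }
        // Second pass: move them. ATOMIC_MOVE throws AtomicMoveNotSupportedException
        // if the move cannot be atomic, e.g. when staging is on another file system.
        List<Path> moved = new ArrayList<>();
        for (Path file : quiet) {
            Path target = staging.resolve(file.getFileName());
            Files.move(file, target, StandardCopyOption.ATOMIC_MOVE);
            moved.add(target);
        }
        return moved;
    }

    public static void main(String[] args) throws Exception {
        Path inbox = Paths.get("inbox");     // assumed directory names
        Path staging = Paths.get("staging");
        while (true) {
            for (Path p : sweep(inbox, staging, 60_000, System.currentTimeMillis())) {
                System.out.println("staged: " + p);
            }
            Thread.sleep(60_000); // check once a minute
        }
    }
}
```

Passing the current time in as a parameter keeps the method easy to test. If the writer runs on another machine, clock differences between the two hosts can skew the age calculation, in which case comparing against the previous snapshot is the safer variant.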

Regards,

Will Hartung
 
Sudsy

Elliott said:
Hi all:

Perhaps someone can help a relatively new Java programmer. I am
writing an application that sits in a network and processes files when
they are written to a designated directory. I am wondering if there
is some elegant way to do this. I realize I can create a loop in
which the process sleeps for say, five seconds and then checks to see
if a directory has a file in it and then performs an action
appropriately. However, I wonder if there is a better way to do
this. If there is some way that a file being written to a designated
directory triggers some kind of event that wakes my program up to
begin processing, that would be better, but how does one do this?

Ya know, this is the kind of application for which JMS is ideally
suited. Your client creates the file then sends a notification on
a queue which drives invocation of onMessage on a regular client
or Message Driven Bean which then processes the file.
Just an idea, after all it does address "asynchronous" processing
with loose-coupling.
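To show just the shape of the idea without a JMS provider, here is a sketch that uses an in-process java.util.concurrent.BlockingQueue as a stand-in for the queue. It is not JMS: in a real setup the consumer side would live in a MessageListener's onMessage. The class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper class; the BlockingQueue stands in for a JMS queue.
class NotifyAfterWriteSketch {

    /** Producer side: finish writing the file first, then announce it. */
    static void writeAndNotify(Path file, String contents, BlockingQueue<Path> queue)
            throws Exception {
        Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
        queue.put(file); // the "message" is sent only once the file is complete
    }

    /** Consumer side: blocks until a notification arrives, then reads the file. */
    static String takeAndProcess(BlockingQueue<Path> queue) throws Exception {
        Path file = queue.take(); // no polling: the thread sleeps until notified
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<Path> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("processing: " + takeAndProcess(queue));
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        consumer.start();
        writeAndNotify(Files.createTempFile("job", ".txt"), "hello", queue);
        consumer.join();
    }
}
```

The point is the ordering: because the notification is sent after the write completes, the consumer never sees a half-written file and never has to poll.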
 
David Hilsee

Sudsy said:
Ya know, this is the kind of application for which JMS is ideally
suited. Your client creates the file then sends a notification on
a queue which drives invocation of onMessage on a regular client
or Message Driven Bean which then processes the file.
Just an idea, after all it does address "asynchronous" processing
with loose-coupling.

Bonus points if the file can be eliminated altogether by placing its
contents in the message.
 
Sudsy

David said:
Bonus points if the file can be eliminated altogether by placing its
contents in the message.

David,
I agree with one proviso: large files shouldn't be queued, IMHO.
If you're talking in the range of a few to a few tens of kilobytes
then I'm in complete agreement. I actually almost suggested it
myself! ;-) But when you start talking about MBs...
Glad to see that we're both on the same track. It's an elegant
solution to a recurring problem. Not universal perhaps, but worthy
of consideration. That's the only reason I suggested it in the first
place.
As I always say, "why reinvent the wheel?".
 
Albretch

You could also, quite easily, build file-system-like functionality
yourself in a portable, Java-based way, by designing a FAT-like
table in a database and coordinating the transfer of files
to the directory through a method that checks this table . . .

Most DBMSs (many of them open source/free) have triggers, too . . .
 
Chris Smith

Sudsy said:
Ya know, this is the kind of application for which JMS is ideally
suited. Your client creates the file then sends a notification on
a queue which drives invocation of onMessage on a regular client
or Message Driven Bean which then processes the file.
Just an idea, after all it does address "asynchronous" processing
with loose-coupling.

Maybe... if there are a dozen other things that work out in favor of JMS
as well. For example, asking clients to send a message to a JMS bean is
far more complex than asking them to drop a file into a directory. One
can be done by hand by a human being, while the other requires software.
One can be done by zillions of document-oriented programs including such
things as Microsoft Word, while the other requires using your own
proprietary software. Not to mention that you've just introduced an
application server to the architecture.

A migration from a simple architecture like the OP proposed to something
with JMS and message-driven beans ought to be guided by something
stronger than a distaste for polling.

In any case, I'm getting more and more tired of seeing people who were
hired to find the best way to do something default to an EJB solution
that is an order of magnitude more complex than needed, just because
that's what they are familiar with. I'm afraid that this reply, taken
at face value, encourages this.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Sudsy

Chris Smith wrote:
In any case, I'm getting more and more tired of seeing people who were
hired to find the best way to do something default to an EJB solution
that is an order of magnitude more complex than needed, just because
that's what they are familiar with. I'm afraid that this reply, taken
at face value, encourages this.

You are completely wrong. I prefer to suggest the simpler solutions when
applicable. I recommend an Apache/Tomcat combination over a J2EE server
in most cases, for example.
But I also recognize "appropriate uses of technology". The problem of
how to detect when a file transferred via FTP is ready for processing is
one which arises frequently. Various parties have suggested differing
approaches over the years. A brief perusal of the archives will prove
this out.
Messaging systems which de-couple clients and servers have been around
a lot longer than J2EE; take a look at IBM's MQSeries. They obviously
address a need. Hence my suggestion to at least consider it as an
option.
I didn't say that it was the right or only approach, merely something
which might fit into a complex, heterogeneous architecture.
It's the same with Web Services or XML; they could be considered
overkill in many scenarios but they DO have their place.
YMMV.
 
David Hilsee

Sudsy said:
David,
I agree with one proviso: large files shouldn't be queued, IMHO.
If you're talking in the range of a few to a few tens of kilobytes
then I'm in complete agreement. I actually almost suggested it
myself! ;-) But when you start talking about MBs...
Glad to see that we're both on the same track. It's an elegant
solution to a recurring problem. Not universal perhaps, but worthy
of consideration. That's the only reason I suggested it in the first
place.
As I always say, "why reinvent the wheel?".

Large files are a tough call. When I used the word "if", I was thinking of
possible performance issues that might arise if the message were very large.
Some implementations might handle messages that are on the order of MBs just
fine. If the JMS provider can handle large messages, and your application
has no problem with dealing with them, then I think it's best to go with it.
I'd probably change the solution only if the implementation could not handle
it.
 
Jacob

Elliott said:
Any help would be greatly appreciated.

Polling is the only portable way (unless you control the
file writing process in some way), and is sufficient in
most cases.

Look at the FileMonitor class at:

http://geosoft.no/software/index.html#filemonitor

It accepts any File (including a directory) object. Monitoring
a directory will cause an event if a file is created/deleted/
modified within that directory.
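If you would rather see the idea than pull in a class, polling for created/deleted/modified boils down to diffing two directory snapshots. The sketch below is my own illustration of that technique and does not reproduce the FileMonitor API; the class name, the method names and the "incoming" directory are assumptions.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not the geosoft FileMonitor.
class SnapshotDiffSketch {

    /** Maps each entry in dir to its last-modified time in milliseconds. */
    static Map<Path, Long> snapshot(Path dir) throws IOException {
        Map<Path, Long> snap = new HashMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                try {
                    snap.put(entry, Files.getLastModifiedTime(entry).toMillis());
                } catch (NoSuchFileException gone) {
                    // Deleted between listing and stat; treat it as absent.
                }
            }
        }
        return snap;
    }

    /** Compares two snapshots and labels each changed path. */
    static Map<Path, String> diff(Map<Path, Long> before, Map<Path, Long> after) {
        Map<Path, String> changes = new TreeMap<>();
        for (Map.Entry<Path, Long> e : after.entrySet()) {
            Long old = before.get(e.getKey());
            if (old == null) {
                changes.put(e.getKey(), "created");
            } else if (!old.equals(e.getValue())) {
                changes.put(e.getKey(), "modified");
            }
        }
        for (Path p : before.keySet()) {
            if (!after.containsKey(p)) {
                changes.put(p, "deleted");
            }
        }
        return changes;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get(args.length > 0 ? args[0] : "incoming"); // assumed default
        Map<Path, Long> previous = snapshot(dir);
        while (true) {
            Thread.sleep(5_000); // polling interval
            Map<Path, Long> current = snapshot(dir);
            diff(previous, current).forEach((p, what) -> System.out.println(what + ": " + p));
            previous = current;
        }
    }
}
```

A change is only noticed if the modification time differs between two polls, so the interval sets the latency.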
 
David Hilsee

Chris Smith said:
Maybe... if there are a dozen other things that work out in favor of JMS
as well. For example, asking clients to send a message to a JMS bean is
far more complex than asking them to drop a file into a directory. One
can be done by hand by a human being, while the other requires software.
One can be done by zillions of document-oriented programs including such
things as Microsoft Word, while the other requires using your own
proprietary software. Not to mention that you've just introduced an
application server to the architecture.

A migration from a simple architecture like the OP proposed to something
with JMS and message-driven beans ought to be guided by something
stronger than a distaste for polling.

In any case, I'm getting more and more tired of seeing people who were
hired to find the best way to do something default to an EJB solution
that is an order of magnitude more complex than needed, just because
that's what they are familiar with. I'm afraid that this reply, taken
at face value, encourages this.

Your complaints may have merit. However, there are a lot of things to
consider. For example, is the application that produces this file
JMS-aware? Can it understand MOM? Is it even written in Java? Can it be
modified? If it can only copy a file into a directory, then I agree that
JMS may not be appropriate. If the creator of the file is just an
application that Elliott wrote where he really wanted to use asynchronous IPC
and chose to do it by copying a file into a shared directory, then I bet
there are better solutions to his problem than monitoring the filesystem.
 
Jacob

Chris said:
In any case, I'm getting more and more tired of seeing people who were
hired to find the best way to do something default to an EJB solution
that is an order of magnitude more complex than needed, just because
that's what they are familiar with. I'm afraid that this reply, taken
at face value, encourages this.

Yes.

Throwing as much technology as possible at problems seems
to be the norm these days. I have been in "enterprise" projects
where young professionals are so focused on technical issues
that they tend to forget the domain problem they are meant to solve.
Add to that the vast technical overhead of typical modern
development environments. All this has a huge impact on project
cost, quality, maintainability and longevity (and clutters
programmer CVs with _tools_ rather than _skills_).

I favor the approach of solving problems with as little
technology as possible.
 
