program to copy files - problems - unix ksh to java

kaeli

Hi all,

I've got a shell script (ksh) that does some file copying and deleting.
I'm running into some problems with it that I'm wondering if I can
solve. Since I plan on re-writing it with java (1.4), I figure I might
as well do it right this time.

Here's the drill:
Cron runs this code every 5 minutes.
Program looks on one machine, uses ssh to copy a file to another
machine, changes the filename, the owner and permissions (chmod), and
then deletes the file from the source machine.
Sounds simple enough...

Problems:
Large files (we're talking gigabytes) take more than 5 minutes to copy.
Something is causing the file to be deleted before it has finished
copying. We lose the whole file, as it doesn't show up on either
machine.
Program gets called again while an instance is running, so it tries to
copy files that are currently being copied.

I was going to solve this with the usual .running type fix, but we
really need the program to actually run every 5 minutes (more than one
instance will be needed). If an instance is already copying the file,
the file should just be ignored. The file should not be deleted if the
copy hasn't finished.

Does anyone know of any system stuff I should be looking at for java?
Specifically, Unix Solaris interface so I can tell if a file is in use?
Also, how can I make sure that the copy was finished before deleting the
source? I expected the script to wait for the copy to finish before
deleting, but it appears that it is not doing that. Should I use threads
for this?

Thanks for any ideas, input, etc...

hiwa

kaeli said:
Large files (we're talking gigabytes) take more than 5 minutes to copy.
Something is causing the file to be deleted before it has finished
copying. We lose the whole file, as it doesn't show up on either
machine.

Are you syncing or flushing with proper synchronization?
 
Harald Kirsch

kaeli said:
Cron runs this code every 5 minutes.
Program looks on one machine, uses ssh to copy a file to another
machine, changes the filename, the owner and permissions (chmod), and
then deletes the file from the source machine.
Sounds simple enough...

Problems:
Large files (we're talking gigabytes) take more than 5 minutes to copy.
Something is causing the file to be deleted before it has finished
copying. We lose the whole file, as it doesn't show up on either
machine.

Hard to believe on a *NIX machine, as the system allows files held
open by other processes to be deleted. It may be a problem
if the file is on NFS. In any case, it
sounds like the first process, when finished, deletes the file,
while the second copy process, started while the first was still
running, then gets into trouble and messes things up.

kaeli said:
Program gets called again while an instance is running, so it tries to
copy files that are currently being copied.

A solution might be to rename the file locally *before*
copying. This way the process starting 5 minutes later will
not pick up the same file again. If this is not an option,
create an empty file with a different extension from the
big file's as a marker that the file is being worked on.

kaeli said:
Also, how can I make sure that the copy was finished before deleting the
source? I expected the script to wait for the copy to finish before
deleting, but it appears that it is not doing that.

Assuming that you do *not* start the copy process in
the background (&), the script does wait. You have to
look for a different reason why the file is deleted
too early, maybe as I suggested above.
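
A minimal sketch of the point about background jobs, assuming the copy really was started with `&`: in that case the script only waits if it calls `wait` on the job, which also hands back the copy's exit status. `COPY_CMD` and `copy_in_background` are hypothetical names, and `cp` stands in for the real ssh/scp transfer so the example runs locally.

```sh
#!/bin/sh
# COPY_CMD is a placeholder for the real transfer command (for example scp).
# It defaults to cp so the sketch can be tried on one machine.
COPY_CMD=${COPY_CMD:-cp}

# copy_in_background SRC DEST
# Starts the copy as a background job, then blocks until it has ended.
# The function's return status is the exit status of the copy itself.
copy_in_background() {
    # $COPY_CMD is left unquoted on purpose so it may carry options.
    $COPY_CMD "$1" "$2" &
    copy_pid=$!
    # Other work could be done here while the copy runs.
    wait "$copy_pid"
}
```

Without the `wait`, a following `rm` would run while the copy is still in progress.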

There is no need to solve this task in Java.

Harald.
 
kaeli

Harald Kirsch said:
Hard to believe on a *NIX machine, as the system allows files held
open by other processes to be deleted. It may be a problem
if the file is on NFS. In any case, it
sounds like the first process, when finished, deletes the file,
while the second copy process, started while the first was still
running, then gets into trouble and messes things up.

That's probably it.

Harald Kirsch said:
A solution might be to rename the file locally *before*
copying. This way the process starting 5 minutes later will
not pick up the same file again. If this is not an option,
create an empty file with a different extension from the
big file's as a marker that the file is being worked on.

That won't help.
The code copies any files in a directory on one machine to a directory
on another. So it will still grab the file. The code would have to move
the file, which is already the problem.

Harald Kirsch said:
There is no need to solve this task in Java.

I need threading (I think) because right now, the solution is to not run
two instances of the code at the same time. That is not a good solution.
We need code that runs almost continuously, looking in directories and
copying and deleting the files.
(it's a DMZ, in case that helps you see why this needs to be done - it
takes files people uploaded and moves them to a machine inside our
firewall)

So, as far as I see, I need C or Java, and I've not coded C in over a
year. :)

We want a process that runs pretty much all the time. I'm thinking a
program that looks in directories over and over. When it finds a file,
it starts a thread that copies it then deletes it. As part of the
thread, it can put the name of the file in a vector. Any new threads
check that vector before bothering a file...

I dunno, am I way off on that?

Thomas Weidenfeller

kaeli said:
I've got a shell script (ksh) that does some file copying and deleting.
I'm running into some problems with it that I'm wondering if I can
solve. Since I plan on re-writing it with java (1.4), I figure I might
as well do it right this time.

There are several ways to fix this (to a "good enough" level), without
using Java. One example:

kaeli said:
Here's the drill:
Cron runs this code every 5 minutes.
Program looks on one machine,

Check for the particular file. If found, rename the file to something
temporary - all in one operation. E.g. (Bourne-Shell syntax):

    if mv "$file" "$file.$$" ; then
        # Found file, renamed it.
        # Can start copying.
        # A second invocation will not find
        # this file any more, and leave it alone.
        :
    fi

kaeli said:
uses ssh to copy a file to another
machine, changes the filename, the owner and permissions (chmod), and
then deletes the file from the source machine.

Delete the renamed file instead.

kaeli said:
Sounds simple enough...

It is. You might want to add a sanity check which e.g. runs once a day
and checks if there are old renamed files lying around and either
collect them, or delete them.
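
A sketch of what such a daily sanity check might look like. It assumes, purely for illustration, that claimed files are renamed with a recognisable `.claimed.<pid>` suffix rather than the bare `.$$` used above, so that `find` can match them; `list_stale_claims` is a hypothetical name.

```sh
#!/bin/sh
# list_stale_claims DIR
# Prints every regular file under DIR whose name matches the assumed
# "*.claimed.*" pattern and which was last modified at least 24 hours ago.
# It only lists the files; deciding to re-queue or delete them is left
# to the caller.
list_stale_claims() {
    # -mtime +0: age, counted in whole 24-hour periods, is greater than 0.
    find "$1" -type f -name '*.claimed.*' -mtime +0 -print
}
```

Note that `mv` keeps the modification time, so "old" here means time since the file was last written, not time since it was renamed.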

Other solutions include creating empty files as markers to indicate that a
file is already being copied. But this can result in a race condition if
several instances of the script start at the same time:

if [ ! -f "$file.mark" ] ; then
# race condition can happen here

# place a mark
touch "$file.mark"
# now copy

# after copy, delete
rm "$file" "$file.mark"
fi

Instead of setting the marker on the remote machine, you could also set
the marker on the local machine, but you would have to add the remote
host name in order to distinguish the markers.
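
One way the race in the marker check might be avoided is to use a marker whose creation is itself the test. This is a different technique from the `touch` marker above: `mkdir` either creates the directory or fails because it already exists, in a single step. The sketch below assumes a local filesystem (behaviour over NFS may differ); `try_claim` and `release_claim` are hypothetical names.

```sh
#!/bin/sh
# try_claim FILE
# Returns 0 if this process created the marker directory FILE.lock,
# non-zero if it already exists (another instance holds the claim).
try_claim() {
    mkdir "$1.lock" 2>/dev/null
}

# release_claim FILE
# Removes the marker directory once the work on FILE is done.
release_claim() {
    rmdir "$1.lock"
}
```

A script would wrap the copy and delete in `if try_claim "$file"; then ...; release_claim "$file"; fi` and simply skip the file otherwise.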

Another solution would be to separate the script into two scripts. One
doing the copying, and another one checking if there is already a
copying script running for a particular remote machine. Have fun with ps
or pgrep.

kaeli said:
Problems:
Large files (we're talking gigabytes) take more than 5 minutes to copy.
Something is causing the file to be deleted before it has finished
copying.

There is something else wrong. Try to find this "something" first. Most
likely it is the application writing the file, or there is something
wrong in your script. Unix is robust when it comes to the deletion of
files which are currently in use. A deletion during a copy should not
affect the copy.

kaeli said:
I was going to solve this with the usual .running type fix, but we
really need the program to actually run every 5 minutes (more than one
instance will be needed). If an instance is already copying the file,
the file should just be ignored. The file should not be deleted if the
copy hasn't finished.

You do check the exit code of the copy command, don't you?

kaeli said:
Does anyone know of any system stuff I should be looking at for java?

There is absolutely no need for Java. In fact, you will find that you
gain nothing by using Java, but that you will e.g. get problems in
setting the file owner and mode. You would have to invoke the Unix
commands from Java via exec(), or the system calls via JNI.

kaeli said:
Specifically, Unix Solaris interface so I can tell if a file is in use?

Java has no public platform interface, not even on Sun. You would have
to use exec() or JNI.

kaeli said:
Also, how can I make sure that the copy was finished before deleting the
source?

Check the return code of the copy command.
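
As a rough illustration, the delete can be made conditional on the copy's exit status. `COPY_CMD` and `copy_then_delete` are hypothetical names, and `cp` stands in for the real ssh/scp transfer so the sketch runs on one machine.

```sh
#!/bin/sh
# COPY_CMD is a placeholder for the real transfer command (for example scp).
COPY_CMD=${COPY_CMD:-cp}

# copy_then_delete SRC DEST
# Deletes SRC only if the copy command exited with status 0.
# On failure the source is kept and the copy's exit status is returned.
copy_then_delete() {
    src=$1
    dest=$2
    # $COPY_CMD is left unquoted on purpose so it may carry options.
    if $COPY_CMD "$src" "$dest"; then
        rm "$src"
    else
        status=$?
        echo "copy of $src failed (exit $status); source kept" >&2
        return "$status"
    fi
}
```

A zero exit status only says the copy command reported success; comparing sizes or checksums on both sides would be a stricter check.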

May I suggest a good book for learning Unix scripting and a lot of other
Unix command-line tricks? "Unix Power Tools" by Peek, O'Reilly, and
Loukides.

kaeli said:
I expected the script to wait for the copy to finish before
deleting, but it appears that it is not doing that. Should I use threads
for this?

You already have concurrency problems, and you want to use threads to
move your concurrency problems to another level? I would not do this.

/Thomas
 
Harald Kirsch

kaeli said:
That won't help.
The code copies any files in a directory on one machine to a directory
on another. So it will still grab the file. The code would have to move
the file, which is already the problem.

I am still not convinced that renaming would not work. Isn't there
a directory on the source machine that does not have to be copied?
You 'mv' (rename) the files to be copied to this directory and
then copy them to their destination from there in the background.

kaeli said:
We want a process that runs pretty much all the time. I'm thinking a
program that looks in directories over and over. When it finds a file,
it starts a thread that copies it then deletes it. As part of the
thread, it can put the name of the file in a vector.

Don't forget to delete the file name from the vector, once it is
done. And a Set would actually be more appropriate than a Vector.
And if you go for Java 1.5, you'll find BlockingQueue which is
what you really want.

Harald.
 
kaeli

Harald Kirsch said:
I am still not convinced that renaming would not work. Isn't there
a directory on the source machine that does not have to be copied?
You 'mv' (rename) the files to be copied to this directory and
then copy them to their destination from there in the background.

Same issue. What if, in the middle of the move to the other directory,
the cron calls the code again? It still sees the file in DIR_A, even
though it's currently being copied to DIR_B. It starts to move it, but
in the middle of that move, the first invocation finishes its move,
deleting the file from DIR_A. The first invocation may then copy to the
other machine, I suppose, but what happens when the second invocation
tries to move a file that no longer exists? Or even worse, overwrites
the destination on the new machine with an empty or half-empty file?

Currently, this problem is being handled with lockfiles. We don't like
that way if we can find another.

Harald Kirsch

kaeli said:
Same issue. What if in the middle of the move to the other directory,
the cron calls the code again. It still sees the file in DIR_A, even

If the two directories are on the same file system, moving always
takes the same time, independent of file size. It may take a few
milliseconds and I cannot imagine a scenario where it takes
5 minutes, except if the whole machine (OS/hardware) is in
deep trouble anyway.
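
A sketch of how the staging-directory move could serve as the claim, assuming both directories are on the same filesystem so that `mv` is a plain rename. If a second invocation tries to move a file that has already been moved, its `mv` fails and it can skip the file. `claim_file` is a hypothetical name.

```sh
#!/bin/sh
# claim_file SRC STAGING_DIR
# Tries to move SRC into STAGING_DIR. On success, prints the new path
# and returns 0. If SRC is already gone (for example because another
# invocation moved it first), prints nothing and returns 1.
claim_file() {
    src=$1
    staging=$2
    target=$staging/$(basename "$src")
    if mv "$src" "$target" 2>/dev/null; then
        printf '%s\n' "$target"
    else
        return 1
    fi
}
```

In a loop over the incoming directory this might be used as `staged=$(claim_file "$f" "$STAGING") || continue`, with the copy and delete then working on `$staged` only.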

Harald.
 
