Ian Esling
I have some code that polls directories; when it finds one or more
files in there, it imports the contents into a database. Very
occasionally (roughly once in every 5000 files) it will pick up a file
and archive it without actually putting the contents into the
database. No error is thrown anywhere, and all the logging looks
as if the file has been processed without a hitch. Putting the file back
into the import directory gets it imported and read correctly into the
database.
The logging consists of a couple of statements written to a logfile
saying what files it's found in what directory, then the name of each
individual file as it processes them. There's also a table in the
database where we record the filenames, time of processing, how many
records were read in, how many contained errors, etc. When this error
occurs, the log files look exactly how you'd expect if the file had
been imported correctly, and the table in the database shows it
processed fine. However, the record counts shown are zero (which is
actually correct, it did only process zero!) when it should have read
in at least one record.
The import process consists of moving each file to be processed from
the import directory into a working directory whilst it's worked on,
then moving it again into an archive directory once it's done with.
I'm a bit baffled how this error could have occurred without any
exceptions being thrown or logged; any suggestions welcome. I've got
a theory that it might be due to the file handling in the code, which
looks like this:

    for (File file : filesToImport())
    {
        importFileHandlingExceptions(file);
    }

    public void importFileHandlingExceptions(File file)
    {
        log.debug("Importing file " + file);
        try
        {
            importFile(file);
        }
        catch (Exception e)
        {
            handleImportException(file, e);
        }
    }

    public void importFile(File file) throws IOException
    {
        file = workingDirectoryCreator.moveFileIntoDir(file);
        importer.importFile(file, summaries);
        file = archiveDirectoryCreator.moveFileIntoDir(file);
        summaries.setArchiveFilename(file.getName());
    }

    public File moveFileIntoDir(File file) throws IOException
    {
        return moveToFile(file,
            unusedFileFinder.findFile(file.getName()));
    }

    public static File moveToFile(File moveMe, File destinationFile)
        throws IOException
    {
        boolean success = moveMe.renameTo(destinationFile);
        if (!success)
        {
            throw new IOException("Unable to move " + inspectFile(moveMe)
                + " to " + inspectFile(destinationFile));
        }
        return destinationFile;
    }

What I'm wondering is whether it's due to picking up the file at the
beginning (in the for (File file... bit), with all the subsequent
processing done on that variable as it's passed around. Occasionally
we might pick up a file before the process that FTPs it into the
import directory has actually finished writing it, so at that moment
the file variable is actually holding an empty file. The subsequent
move of the actual file works OK because they're small files, and by
the time that code executes the FTP process has finished with it and
released it. But when we do our subsequent processing, are we still
working on the original file variable?
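
One part of this theory can be checked in isolation. A java.io.File
is only an abstract pathname; it does not hold or cache the file's
contents. The sketch below is not the real import code (the class and
method names are made up for illustration). It creates an empty file,
takes a File reference to it, writes two records afterwards, and then
reads back through the same reference:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical demo class, not part of the import code shown above.
class FileIsOnlyAPath
{
    // Returns how many lines are read through a File object that was
    // obtained while the file on disk was still empty.
    static int linesSeenAfterLateWrite()
    {
        try
        {
            // Stand-in for a file the FTP process has created but not yet written.
            File file = File.createTempFile("import-demo", ".txt");
            file.deleteOnExit();
            if (file.length() != 0)
            {
                throw new IllegalStateException("expected an empty file at pick-up");
            }

            // The writer finishes only after we already hold the File reference.
            Files.write(file.toPath(),
                Arrays.asList("record 1", "record 2"), StandardCharsets.UTF_8);

            // Reading via the same File object goes back to the disk.
            return Files.readAllLines(file.toPath(), StandardCharsets.UTF_8).size();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("Lines seen: " + linesSeenAfterLateWrite());
    }
}
```

This reports two lines, so the variable itself cannot go stale. If the
importer saw zero records, the more likely explanation is that the
file on disk was still empty or incomplete at the moment it was read.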
I'm busy working on some test code to try and replicate this, but I
realise I could well be barking up the wrong tree, and I'm not even sure
that what I've just suggested could happen. Hopefully someone out there has
encountered something similar and could share their experience?
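
A replication test along these lines might look like the sketch
below. It is only a simulation under stated assumptions: a
FileOutputStream stands in for the FTP process, the directory and
file names are invented, and it assumes the import and working
directories are on the same filesystem. It moves the file with
renameTo() while the writer still has it open and empty, counts the
records at that moment, and counts them again once the writer has
written.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical replication sketch; none of these names come from the real code.
class EarlyPickupSimulation
{
    private static int countLines(File f) throws IOException
    {
        return Files.readAllLines(f.toPath(), StandardCharsets.UTF_8).size();
    }

    // Returns {records seen at import time, records on disk after the writer
    // has written}, or null if the platform refused to move a file that is
    // still open for writing.
    static int[] simulate()
    {
        try
        {
            File importDir = Files.createTempDirectory("import").toFile();
            File workingDir = Files.createTempDirectory("working").toFile();
            File incoming = new File(importDir, "data.txt");
            File working = new File(workingDir, "data.txt");
            // Registered parents first: deleteOnExit deletes in reverse order.
            importDir.deleteOnExit();
            workingDir.deleteOnExit();
            incoming.deleteOnExit();
            working.deleteOnExit();

            // Stand-in for the FTP process: the file exists but is still empty.
            try (OutputStream writer = new FileOutputStream(incoming))
            {
                // The poller picks the file up and moves it, as moveToFile() does.
                if (!incoming.renameTo(working))
                {
                    return null; // some platforms lock files that are open
                }
                int seenByImporter = countLines(working);

                // Only now does the writer put its record into the file.
                writer.write("record 1\n".getBytes(StandardCharsets.UTF_8));
                writer.flush();

                return new int[] { seenByImporter, countLines(working) };
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        int[] result = simulate();
        System.out.println(result == null
            ? "Move refused while the file was open"
            : "Seen at import: " + result[0] + ", on disk afterwards: " + result[1]);
    }
}
```

On a POSIX-style filesystem the move should succeed and the first
count should be zero, which would match the symptom: a clean log, a
zero record count, and a complete file sitting in the archive. On
platforms that lock open files the move may simply fail, so the
sketch returns null in that case rather than guessing.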