jeffotn
I have a scenario I would like to discuss.
I have a set of servers that receive a constant stream of files via FTP, and a subset of those files needs to be transferred to a central file storage system. The files are also processed internally on those servers into their own local file storage for review. A Java batch job runs to determine which files meet the criteria to be sent to central storage.
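For context, the selection step in that batch job is conceptually along these lines (the criteria check and directory here are placeholders, not my actual rules):

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.List;

    public class SelectForCentral {

        // Placeholder criteria -- the real rules live in the batch job.
        static boolean meetsCriteria(Path file) {
            return file.getFileName().toString().endsWith(".xml");
        }

        // Scan an incoming directory and collect the files to send centrally.
        static List<Path> selectFiles(Path incomingDir) throws IOException {
            List<Path> selected = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(incomingDir)) {
                for (Path file : stream) {
                    if (Files.isRegularFile(file) && meetsCriteria(file)) {
                        selected.add(file);
                    }
                }
            }
            return selected;
        }
    }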
What is the best way to transfer all of these files over the network to the central file storage server, control directory size, and support multiple disk partitions? I currently use a network share to move the files over to the central share rather than FTP, and have clocked over 530 files per minute transferred to date.
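The transfer itself is just a copy onto the mounted share; a simplified sketch (the mount point and target directory name are illustrative):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class ShareTransfer {

        // Illustrative mount point for the central share.
        static final Path CENTRAL_SHARE = Paths.get("/mnt/central");

        // Copy one selected file into the target directory on the share;
        // REPLACE_EXISTING guards against retried transfers.
        static void transfer(Path localFile, String targetDirName) throws IOException {
            Path targetDir = CENTRAL_SHARE.resolve(targetDirName);
            Files.createDirectories(targetDir);
            Files.copy(localFile, targetDir.resolve(localFile.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }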
There is no problem when the central server processes them and makes a file visible online; however, as the base directory the regional servers transfer files into on the central file server grows, I expect the processing engine to slow down.
Also, how should multiple disk partitions be handled? Simply via a NAS implementation, or should this be accounted for in the code?
Expected daily volume is around 48,000 files, with about 30-50% of those being sent to the central file storage server. Currently a directory is created for each day's processing, and processing occurs every hour, so daily you are looking at around 24 new directories.
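Roughly, the layout works out to something like this (the exact naming pattern here is only illustrative):

    import java.time.LocalDateTime;
    import java.time.format.DateTimeFormatter;

    public class RunDirectories {

        // Illustrative naming: a parent directory per day with a subdirectory
        // per hourly run, e.g. 20240115/13 -- roughly 24 new directories a day.
        static String directoryFor(LocalDateTime runTime) {
            return runTime.format(DateTimeFormatter.ofPattern("yyyyMMdd/HH"));
        }
    }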
I don't want to reinvent the wheel if something exists that I could purchase or leverage via open source, etc. I did some digging and understand that network latency will be an issue, that it is recommended to keep to about 10K files per directory, and that you should use numerous subdirectories to keep the listing at the parent directory level small.
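One thing I am considering based on that advice is bucketing files into a fixed set of subdirectories so no single directory gets anywhere near 10K entries; a rough sketch, nothing I have committed to:

    import java.nio.file.Path;

    public class FanOut {

        // Spread files across a fixed number of bucket subdirectories so that,
        // at ~48,000 files a day, no single directory approaches the 10K guideline.
        static final int BUCKETS = 16;

        static Path bucketedPath(Path baseDir, String fileName) {
            int bucket = Math.floorMod(fileName.hashCode(), BUCKETS);
            return baseDir.resolve(String.format("%02d", bucket)).resolve(fileName);
        }
    }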