sasuke
Hello to all Java programmers out there.
I was just wondering what would be the most time / space efficient way
of concatenating the contents of different files into a single file. Sample
usage would be:

java Concat targetFile.txt sourceFileOne.txt sourceFileTwo.txt ...

Using threads to open a stream to each source file is out of the question,
since the data needs to be written in the order in which it exists in the
source files, i.e. no ad hoc writing. Reading the entire contents of a file
into memory (using a StringBuffer / StringBuilder) also isn't a good choice,
considering that we can come across really large text files (~10 MB, typical
for db dumps). Reading the source file line by line doesn't seem attractive
either, given that it would increase the number of I/O calls and, for really
large files, might turn out to be an I/O bottleneck. One solution which comes
to mind is to read each file in chunks, i.e. read the data into a char array
of 8 KB or a string array of size 100.
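
Something along these lines is what I have in mind at the moment: just a
rough, untested sketch using plain java.io streams and an 8 KB byte buffer
(the buffer size is an arbitrary pick on my part).

import java.io.*;

public class Concat {
    public static void main(String[] args) throws IOException {
        // args[0] is the target file, the rest are the source files.
        OutputStream out = new FileOutputStream(args[0]);
        try {
            byte[] buffer = new byte[8 * 1024]; // 8 KB chunk, arbitrary size
            for (int i = 1; i < args.length; i++) {
                InputStream in = new FileInputStream(args[i]);
                try {
                    int read;
                    // copy the current source in order, chunk by chunk
                    while ((read = in.read(buffer)) != -1) {
                        out.write(buffer, 0, read);
                    }
                } finally {
                    in.close();
                }
            }
        } finally {
            out.close();
        }
    }
}

I'm not sure whether wrapping the streams in BufferedInputStream /
BufferedOutputStream would buy anything extra here, since the reads and
writes already happen in 8 KB chunks.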
My question here is: is there any ideal solution to this problem, or does
the solution really depend on the domain in consideration and the kind of
sacrifices we are ready to make (e.g. losing the ordering of the data, the
memory trade-off when reading the entire file into a buffer, the I/O hit)?
Pardon me for asking such a trivial / silly question, but it was just a
thought.
Regards,
/~sasuke