reading multiple files

L

leorulez

Is there any way to read multiple files (more than 1000 files) and then
write into one single output file using C? Right now in my program, I
have a loop which asks for the filename and writes into the output file
but this is tedious. Imagine typing 1000 filenames... Is there an
efficient way to do this?

Thanks
 
R

Roberto Waltman

Is there any way to read multiple files (more than 1000 files) and then
write into one single output file using C? Right now in my program, I
have a loop which asks for the filename and writes into the output file
but this is tedious. Imagine typing 1000 filenames... Is there an
efficient way to do this?

Thanks

<OFF-TOPIC>
You are asking, "how?"
The first question that comes to my mind is "why?"
This is trivial to do with the built-in facilities of most command
shells. What problem are you trying to solve? Why do you want to do
it "using C"?
</OFF-TOPIC>
 
L

leorulez

I was thinking of using C because I can do better in C compared to
other languages. Could you please give a brief overview of the built-in
facilities of the command shells for this task?
 
S

SM Ryan

(e-mail address removed) wrote:
# Is there any way to read multiple files (more than 1000 files) and then
# write into one single output file using C? Right now in my program, I
# have a loop which asks for the filename and writes into the output file
# but this is tedious. Imagine typing 1000 filenames...is there a
# efficient way to do this??

The file name in fopen is a (char *) expression. It can be a
string constant or anything else that is a (char *). For example,
to open files with names like fwxyzDDD:

int i;
for (i = 0; i < 1000; i++) {
    static char F[] = "fwxyz%03d";
    char f[sizeof F + 3];
    sprintf(f, F, i);
    FILE *fn = fopen(f, "r");
    if (!fn) { perror(f); continue; }
    ...
    fclose(fn);
}

Or, if you have a file listing the file names (for example, the
output of the Unix find command), you can fgets each name and
open it.
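For reference, the fragment above fleshed out into a self-contained sketch. The fwxyzDDD naming pattern is from the post; the function name concat_numbered and the single output stream are illustrative assumptions:

```c
#include <stdio.h>

/* Append the contents of files named fmt % 0, fmt % 1, ...
   (e.g. "fwxyz%03d") to the already-open stream `out`.
   Returns the number of files successfully copied;
   files that cannot be opened are reported and skipped. */
static int concat_numbered(const char *fmt, int count, FILE *out)
{
    int copied = 0;
    for (int i = 0; i < count; i++) {
        char name[FILENAME_MAX];
        snprintf(name, sizeof name, fmt, i);

        FILE *in = fopen(name, "rb");
        if (!in) { perror(name); continue; }  /* skip missing files */

        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            fwrite(buf, 1, n, out);
        fclose(in);
        copied++;
    }
    return copied;
}
```

The output stream is opened once by the caller and shared across all iterations, so each input file's bytes are simply appended in numeric order.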
 
R

Roberto Waltman

Could you please give a brief overview of the built-in facilities
of the command shells for this task?

This is trivial to do with the built-in facilities of most command
shells.

<OFF-TOPIC>
Let's assume you want to concatenate all "*.log" files in a directory
into a single file:
In a Windows 2000 command window:
copy *.log all_together.now
In Linux/Unix/???BSD under a bash shell:
cat *.log >all_together.now
It is very easy to extend this to copy only files whose names match a
pattern, or that were modified between specific dates, or that live in
several directories, etc.
But these issues have nothing to do with the C language, please post
again in a newsgroup dedicated to the computer platform you are using.
</OFF-TOPIC>
 
L

leorulez

SM said:
(e-mail address removed) wrote:
# Is there any way to read multiple files (more than 1000 files) and then
# write into one single output file using C? Right now in my program, I
# have a loop which asks for the filename and writes into the output file
# but this is tedious. Imagine typing 1000 filenames...is there a
# efficient way to do this??

The file name in fopen is a (char *) expression. It can be a
string constant or anything else that is a (char *). For example,
to open files with names like fwxyzDDD:

int i;
for (i = 0; i < 1000; i++) {
    static char F[] = "fwxyz%03d";
    char f[sizeof F + 3];
    sprintf(f, F, i);
    FILE *fn = fopen(f, "r");
    if (!fn) { perror(f); continue; }
    ...
    fclose(fn);
}

Or, if you have a file listing the file names (for example, the
output of the Unix find command), you can fgets each name and
open it.



Let's say I have a list of all the file names in a file called
list.txt. I wrote the following code:

fp2 = fopen("list.txt", "r");
while (fgets(line, sizeof(line), fp2) != NULL)
{
    printf("%s", line);
    strcpy(name, line);
    printf("%s", name);
    printf("%d\n", strlen(name));

    fp = fopen(name, "r");
    fp1 = fopen("out.txt", "a");
    if (fp == NULL)
    {
        printf("error in opening\n");
        exit(1);
    }
    .....

The file pointer returned by fopen is NULL, resulting in "error in
opening". When I checked the string length of the filename, it is always
2 characters more than the original length. Could you please tell me how
I can use the variable "name" (or "line") in fopen?
 
R

Richard Heathfield

Before I tackle your question, could I just say that I'm rather concerned
that the reaction to your perfectly legitimate question was to say: (a) you
should use your shell to do this, and (b) shells are off-topic, clear off.
If that isn't hostility, I don't know what is.

Folks, we have no evidence that this guy even /has/ a shell that can
concatenate many files into one large one. The C Standard imposes no such
requirement on implementations. Your assumption that he has a shell is
completely off-topic, and those who made it should be ashamed of
themselves.

It's the worst kind of "topic-cop" behaviour - a bit like sneaking a bag of
class A drugs into a guy's pocket and then busting him for possession.

The question is reasonable and, in my opinion, topical.

Okay - the priority is to get the filenames into some kind of
computer-readable list, and (e-mail address removed) appears to have achieved
that. So now he just has a tiny problem getting in his way. The fix,
(e-mail address removed), is easy, you'll be glad to hear. See below.

(e-mail address removed) said:

Let's say I have a list of all the file names in a file called
list.txt. I wrote the following code:

Excuse me a minute:

char *end = NULL;

Thanks. (All will become clear shortly!)
fp2=fopen("list.txt","r");

The fopen call might fail, in which case fp2 will be NULL. You should check
for this.
while ( fgets(line, sizeof(line), fp2) != NULL)

Make sure line is at least FILENAME_MAX + 1 bytes in size.
{
printf("%s", line);
strcpy(name,line);

Yes, you've got the filename. Unfortunately, you've also got a newline
character. One easy way to get rid of it is:

end = strchr(name, '\n'); /* find the newline */
if(end != NULL)
{
*end = '\0'; /* chop it off! */
}
else
{
you might want to wonder about why it didn't find a newline; basically, it
means that the filename was so long it couldn't all fit in the buffer, in
which case the rest of the filename will be handed to you on the next
fgets call. Messy. Better to make sure your buffer's big enough to start
with (see above).
}

I hope that helps.
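Putting the pieces from this reply together, here is a minimal sketch of the complete loop. The list.txt and out.txt names come from the thread; the function name concat_from_list and the deliberately simple error handling are assumptions for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Copy every file named in `list` (one name per line) to `out`.
   Returns the number of files successfully copied. */
static int concat_from_list(FILE *list, FILE *out)
{
    char line[FILENAME_MAX + 1];
    int copied = 0;

    while (fgets(line, sizeof line, list) != NULL) {
        char *end = strchr(line, '\n');
        if (end != NULL)
            *end = '\0';           /* chop off the trailing newline */

        FILE *in = fopen(line, "rb");
        if (in == NULL) {
            perror(line);          /* report this file and move on */
            continue;
        }

        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            fwrite(buf, 1, n, out);
        fclose(in);
        copied++;
    }
    return copied;
}
```

Note that the output file is opened once, outside the loop, rather than re-opened in append mode on every iteration as in the original posting.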
 
E

Eric Sosman

Let's say I have a list of all the file names in a file called
list.txt. I wrote the following code:

fp2 = fopen("list.txt", "r");
while (fgets(line, sizeof(line), fp2) != NULL)
{
    printf("%s", line);
    strcpy(name, line);
    printf("%s", name);
    printf("%d\n", strlen(name));

    fp = fopen(name, "r");
    fp1 = fopen("out.txt", "a");
    if (fp == NULL)
    {
        printf("error in opening\n");
        exit(1);
    }
    ....

The file pointer returned by fopen is NULL, resulting in "error in
opening". When I checked the string length of the filename, it is always
2 characters more than the original length. Could you please tell me how
I can use the variable "name" (or "line") in fopen?

When fgets() reads a line, it stores the entire line
including the '\n' at the end. Since the name of the
file you are trying to open is probably "foo.txt" and not
"foo.txt\n", you need to search for and remove that '\n'
character. Something like this will do it:

char *p;
...
p = strchr(line, '\n');
if (p != NULL)
*p = '\0';

This much would explain one extra character per line,
but you've reported seeing two. It may be that you've
miscounted, and that simply removing the '\n' will solve your
problem. A more troublesome possibility is that "list.txt"
may have originated on a system that uses a different line-
ending convention from yours, and that it wasn't properly
translated to your system's conventions when it was moved
there. For example, Windows systems terminate lines with
the two-character pair '\r','\n', while POSIX systems use
'\n' by itself. If the '\r','\n' pairs weren't translated
to single '\n' characters, then the lines you read from the
file will look like "foo.txt\r\n"; removing the '\n' will
leave you with "foo.txt\r", and fopen() will probably still
fail.

If this is the case, the best cure is to clean up the
procedure that brought you "list.txt" in the first place, so
the line endings will be translated properly. If that's not
possible, you can add a little more code to remove '\r' (if
it's present), just as the code above removes '\n'. This
isn't perfectly bullet-proof, though, because some systems
use even stranger line-ending conventions than '\r','\n' --
for example, a system that used '\n','\r' would produce a
file that if read on a POSIX system would look like

first line\n
\rsecond line\n
\rthird line\n
...

It's for reasons like this that I recommend revisiting your
file-transfer process instead of trying to outguess the
remote system's line-ending conventions.
 
S

SM Ryan

(e-mail address removed) wrote:
#
# SM Ryan wrote:
# > (e-mail address removed) wrote:
# > # Is there any way to read multiple files (more than 1000 files) and then
# > # write into one single output file using C? Right now in my program, I
# > # have a loop which asks for the filename and writes into the output file
# > # but this is tedious. Imagine typing 1000 filenames...is there a
# > # efficient way to do this??
# >
# > The file name is fopen is an (char*) expression. It can be a
# > string constant or anything else that is (char*). For example
# > to open the file with names like fwxyzDDD,
# > int i; for (i=0; i<1000; i++) {
# > static char F[] = "fwxyz%03d";
# > char f[sizeof F+3];
# > sprintf(f,F,i);
# > FILE *fn = fopen(f,"r"); if (!fn) {perror(fn); continue;}
# > ...
# > fclose(fn);
# > }
# > Or if you have list of file names file (for example the output of
# > the unix find command), you can fgets the file names, and open
# > the file name you fgets.
# >
# > --
# > SM Ryan http://www.rawbw.com/~wyrmwif/
# > Death is the worry of the living. The dead, like myself,
# > only worry about decay and necrophiliacs.
#
#
#
# Lets consider I have a list of all the file names in a file called
# list.txt. I wrote the following code
#
# fp2=fopen("list.txt","r");
# while ( fgets(line, sizeof(line), fp2) != NULL)
# {
# printf("%s", line);
# strcpy(name,line);
# printf("%s", name);
# printf("%d\n",strlen(name));

Print it out in the most irritatingly verbose manner possible, like

{
    char *q = name;
    for (; *q; q++)
        printf("<%02X>%c", (unsigned char)*q, ' ' <= *q && *q < 127 ? *q : '.');
}
printf("\n");

and then make sure it's not sneaking in any extra non-printing characters,
like the \n fgets leaves at the end of the buffer.

Voluminous output no longer kills trees. If you can't understand what's
happening, print everything so you know what's going on rather than guessing.

# fp = fopen(name, "r" );
# fp1 = fopen("out.txt","a");
# if (fp==NULL)
# {
# printf("error in opening\n");
# exit(1);
# }

Most implementations set errno on fopen failure, so you can do
perror(name)
and get both the name you think you're using and the exact error.
 
L

leorulez

Eric said:
When fgets() reads a line, it stores the entire line
including the '\n' at the end. Since the name of the
file you are trying to open is probably "foo.txt" and not
"foo.txt\n", you need to search for and remove that '\n'
character. Something like this will do it:

char *p;
...
p = strchr(line, '\n');
if (p != NULL)
*p = '\0';

This much would explain one extra character per line,
but you've reported seeing two. It may be that you've
miscounted, and that simply removing the '\n' will solve your
problem. A more troublesome possibility is that "list.txt"
may have originated on a system that uses a different line-
ending convention from yours, and that it wasn't properly
translated to your system's conventions when it was moved
there. For example, Windows systems terminate lines with
the two-character pair '\r','\n', while POSIX systems use
'\n' by itself. If the '\r','\n' pairs weren't translated
to single '\n' characters, then the lines you read from the
file will look like "foo.txt\r\n"; removing the '\n' will
leave you with "foo.txt\r", and fopen() will probably still
fail.

If this is the case, the best cure is to clean up the
procedure that brought you "list.txt" in the first place, so
the line endings will be translated properly. If that's not
possible, you can add a little more code to remove '\r' (if
it's present), just as the code above removes '\n'. This
isn't perfectly bullet-proof, though, because some systems
use even stranger line-ending conventions than '\r','\n' --
for example, a system that used '\n','\r' would produce a
file that if read on a POSIX system would look like

first line\n
\rsecond line\n
\rthird line\n
...

It's for reasons like this that I recommend revisiting your
file-transfer process instead of trying to outguess the
remote system's line-ending conventions.



Thanks a lot for the help. It worked. I did

p = strchr(line, '\n');
if (p != NULL)
*p = '\0';
q = strchr(line, '\r');
if (q != NULL)
*q = '\0';

This eliminated both "\n" and "\r". Thanks again.
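As a minor tidy-up (not required), both characters can be stripped in one step with strcspn, which returns the length of the initial span of the string containing none of the given characters. The helper name chomp is illustrative:

```c
#include <string.h>

/* Truncate `line` at the first '\r' or '\n', whichever comes first.
   strcspn(line, "\r\n") is the length of the prefix containing neither,
   so writing '\0' there removes "\n", "\r\n", and "\n\r" endings alike. */
static void chomp(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}
```

This also behaves sensibly on a line with no terminator at all: strcspn then returns the full length, and the write just replaces the existing '\0'.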
 
K

Keith Thompson

Richard Heathfield said:
Before I tackle your question, could I just say that I'm rather concerned
that the reaction to your perfectly legitimate question was to say: (a) you
should use your shell to do this, and (b) shells are off-topic, clear off.
If that isn't hostility, I don't know what is.

Folks, we have no evidence that this guy even /has/ a shell that can
concatenate many files into one large one. The C Standard imposes no such
requirement on implementations. Your assumption that he has a shell is
completely off-topic, and those who made it should be ashamed of
themselves.

It's the worst kind of "topic-cop" behaviour - a bit like sneaking a bag of
class A drugs into a guy's pocket and then busting him for possession.

The question is reasonable and, in my opinion, topical.
[...]

I agree.

On the other hand, it's quite possible that a solution other than
writing a C program would meet the OP's needs better.

If someone were to ask me how to copy a number of files (names given
as a list in a text file) to a single output file, writing a C program
would not be the first thing I'd think of. On a Unix-like system, a
one-line shell command would do it; on other systems, similar
solutions undoubtedly exist. (Perl springs to mind; a Perl program to
do this would be much shorter than a corresponding C program.)

It seems sensible to me to point this out *and* to help with a C
solution if that's what the OP really wants.
 
R

Roberto Waltman

It wasn't, please read below.
Folks, we have no evidence that this guy even /has/ a shell that can
concatenate many files into one large one. The C Standard imposes no such
requirement on implementations. Your assumption that he has a shell is
completely off-topic, and those who made it should be ashamed of
themselves.

It's the worst kind of "topic-cop" behaviour - a bit like sneaking a bag of
class A drugs into a guy's pocket and then busting him for possession.

The question is reasonable and, in my opinion, topical.
[...]

I agree.

On the other hand, it's quite possible that a solution other than
writing a C program would meet the OP's needs better.

This was my thought. I apologize to the OP (and Richard) if my
response came across as rude or hostile.
If someone were to ask me how to copy a number of files (names given
as a list in a text file) to a single output file, writing a C program
would not be the first thing I'd think of. On a Unix-like system, a
one-line shell command would do it; on other systems, similar
solutions undoubtedly exist. (Perl springs to mind; a Perl program to
do this would be much shorter than a corresponding C program.)

Precisely. I have lost count of the number of times somebody asked me
for help and this dialog repeated itself:

He: "How can I do ZZ?"
Me: "Why do you want to do ZZ?"
He: "Because I need to do YY and that requires ZZ"
Me: "Why do you want to do YY?"
He: "Because I need to do XX and that requires YY"
Me: "Why do you want to do XX?"
He: "Because I need to do WW and that requires XX"
Me: "Have you considered doing AA instead of WW, this would solve
your problem directly."
He: "No, I didn't. Hmm, let me see, yes, that will do it."

I always try to understand the original problem that needs to be
solved, ignoring, at least temporarily, the new problems that pop up
once a particular solution has been chosen. (Always annoying everybody
by asking "What is the question behind your question?")

Once I was asked to help finish a C program that had to do some simple
data reduction on a few large text files. Each line on the files
contained a variable number of alphanumeric fields. The program had to
do some calculations on the values of the last two fields, depending
on the value of previous fields. The person doing this work had
written a very complex parser in C that failed to do the job correctly.
A quick rendition of "How do I do this in C?" / "Why do you want to do
this in C" ensued, and the (correct) solution to the problem was an
8-line AWK script.
I (mis?)read the original post as a similar case, where a C solution,
while doable, would not have been an optimal one.
It seems sensible to me to point this out *and* to help with a C
solution if that's what the OP really wants.

I agree.
 
R

Richard Heathfield

Keith Thompson said:
Richard Heathfield said:
The question is reasonable and, in my opinion, topical.
[...]

I agree.

On the other hand, it's quite possible that a solution other than
writing a C program would meet the OP's needs better.

If someone were to ask me how to copy a number of files (names given
as a list in a text file) to a single output file, writing a C program
would not be the first thing I'd think of.

It wouldn't? It would sure as heck be the first thing I'd think of.
On a Unix-like system, a
one-line shell command would do it;

I might use that solution, but only if I were certain that the files would
be copied in the order I wanted. Finding that out might well take longer
than banging out a few lines of C.
on other systems, similar solutions undoubtedly exist.

Not *undoubtedly*, no. The C Standard does /not/ impose that requirement.
And in any case, why have a "similar" solution for each platform when you
can have an /identical/ solution for each platform?
(Perl springs to mind; a Perl program to
do this would be much shorter than a corresponding C program.)

But by the time you've learned Perl and written a Perl interpreter for, or
ported one to, your target system, it might just have been quicker to bang
out a few lines of C.
It seems sensible to me to point this out *and* to help with a C
solution if that's what the OP really wants.

Shame to go only half-way, then, really.
 
K

Keith Thompson

Richard Heathfield said:
Keith Thompson said:
Richard Heathfield said:
The question is reasonable and, in my opinion, topical.
[...]

I agree.

On the other hand, it's quite possible that a solution other than
writing a C program would meet the OP's needs better.

If someone were to ask me how to copy a number of files (names given
as a list in a text file) to a single output file, writing a C program
would not be the first thing I'd think of.

It wouldn't? It would sure as heck be the first thing I'd think of.
On a Unix-like system, a
one-line shell command would do it;

I might use that solution, but only if I were certain that the files would
be copied in the order I wanted. Finding that out might well take longer
than banging out a few lines of C.

The problem statement says that the file names are listed in a text
file, presumably in the desired order. If I use the list file, the
files are going to be copied in the right order.
Not *undoubtedly*, no. The C Standard does /not/ impose that requirement.
And in any case, why have a "similar" solution for each platform when you
can have an /identical/ solution for each platform?

It's entirely possible that this was a one-time task. But of course
if you need a portable solution, and the only tool you can assume is a
C compiler on each target system, then a C program could be the best
(or only) solution.

I'm not arguing that a C program *isn't* the best solution, only that
it *might* not be.

For me, I guarantee you that if I wanted to accomplish this task on a
Unix-like system, I could do it much faster and more reliably using
something other than C (but my solution would probably use tools that
happen to be implemented in C).
But by the time you've learned Perl and written a Perl interpreter for, or
ported one to, your target system, it might just have been quicker to bang
out a few lines of C.

Maybe yes, maybe no. By the time *I've* learned Perl and made sure
there's a Perl interpreter on my target system -- oh, wait, I already
have. But that's just an example, and I don't claim that it's
applicable to anyone other than me.
Shame to go only half-way, then, really.

Fortunately, we didn't (this newsgroup being a collaborative effort),
and the OP probably got at least as much good advice as he was looking
for.
 
R

Robert Latest

On Thu, 25 May 2006 07:51:14 +0000,
Richard Heathfield said:
Keith Thompson said:

It wouldn't? It would sure as heck be the first thing I'd think of.

At least it *should* be when the question is asked on clc.

Otherwise it's

xargs cat < list.txt >> resulting_big_file

(which should work on any reasonably set-up Windows system, too).

And by the way, even for a one-off on a system without GNU tools I'd do
it in C. I've become sick of Perl.

robert
 
