Reading a line from a file

  • Thread starter Joona I Palaste

Dan Pop

John Bode said:
Colin JN Breame said:
Hi,

Fairly new to C. What is the best way to read a line (\n terminated) from
a file? I've looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a
file and store the whole line to a single buffer, use the %[
conversion specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we
see EOF, a newline character ('\n'), or until we've read 100
characters, and assign them to buff." The conversion specifier
"%*[^\n]" means "read characters until we see EOF or a newline and
throw them away." This conversion specifier is there in case the
input line is longer than our expected maximum line length, and gives
us a way to remove those extra characters from the input buffer. The
"%*c" conversion specifier means "read the next character (which
should be the newline) and throw it away."

What happens if the user types 100 or fewer characters followed by a
newline?

What happens if the user simply types the newline character?

What happens if the user presses the eof key instead?

Your example is mishandling all these cases (well, the last one can still
be detected with an feof() call, but it's rather ugly).
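
To make these cases concrete, here is a small self-check. It is a sketch, not something taken from the posts: stream_from() and demo_single_call() are made-up helper names, and a temporary file stands in for typed input. It shows the newline being left behind after a short line, the next call failing on that newline without touching buff, and an empty stream returning EOF.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: put text into a temporary stream and rewind it. */
static FILE *stream_from(const char *text)
{
    FILE *fp = tmpfile();
    if (fp != NULL) {
        fputs(text, fp);
        rewind(fp);
    }
    return fp;
}

/* Returns 0 if the behaviour described in the comments was observed,
   -1 if no temporary file could be created. */
static int demo_single_call(void)
{
    char buff[101] = "";
    FILE *fp = stream_from("abc\ndef\n");
    int rc;

    if (fp == NULL)
        return -1;

    /* Short line: %100[^\n] stores "abc", then %*[^\n] finds nothing to
       match, so scanning stops before %*c and the newline stays unread. */
    rc = fscanf(fp, "%100[^\n]%*[^\n]%*c", buff);
    assert(rc == 1 && strcmp(buff, "abc") == 0);

    /* The leftover newline now makes the very first directive fail. */
    rc = fscanf(fp, "%100[^\n]%*[^\n]%*c", buff);
    assert(rc == 0 && strcmp(buff, "abc") == 0);   /* buff untouched */

    /* Nothing was consumed, so the stream still sits in front of '\n'. */
    assert(getc(fp) == '\n');
    fclose(fp);

    /* End of input before anything is read: the call returns EOF. */
    fp = stream_from("");
    if (fp == NULL)
        return -1;
    rc = fscanf(fp, "%100[^\n]%*[^\n]%*c", buff);
    assert(rc == EOF);
    fclose(fp);
    return 0;
}
```

Calling demo_single_call() from main() runs the checks; a failed assert points at the case that behaved differently on your implementation.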

Dan
 
Colin JN Breame

John Bode said:
Colin JN Breame said:
Hi,

Fairly new to C. What is the best way to read a line (\n terminated)
from a file? I've looked at fscanf but was not sure which format
specifier to use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a file
and store the whole line to a single buffer, use the %[ conversion
specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we see
EOF, a newline character ('\n'), or until we've read 100 characters, and
assign them to buff." The conversion specifier "%*[^\n]" means "read
characters until we see EOF or a newline and throw them away." This
conversion specifier is there in case the input line is longer than our
expected maximum line length, and gives us a way to remove those extra
characters from the input buffer. The "%*c" conversion specifier means
"read the next character (which should be the newline) and throw it away."
This removes the newline character from the input buffer. You never want
to use the "%[" conversion specifier without specifying a maximum field
width; fscanf() has no way to tell how big your target buffer is unless
you explicitly tell it, so if your input buffer is sized for 100
characters and the input line is 132 characters and you haven't specified
a maximum field width, fscanf() will attempt to write those extra 32
characters to memory outside your buffer, which will cause a crash (if
you're lucky) or otherwise weird behavior (if you're not).
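
One way to keep the field width and the buffer size from drifting apart is to build the format string from a single macro. This is only a sketch: MAX_LINE_CHARS, STR, STR2 and demo_width() are names invented for the example, and the temporary file merely stands in for a real input file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_CHARS 100            /* hypothetical name */
#define STR2(x) #x
#define STR(x)  STR2(x)               /* expands MAX_LINE_CHARS to "100" */

/* Returns 0 on success, -1 if no temporary file could be created. */
static int demo_width(void)
{
    char buff[MAX_LINE_CHARS + 1];    /* +1 for the terminating '\0' */
    FILE *infile = tmpfile();
    int i, rc;

    if (infile == NULL)
        return -1;

    for (i = 0; i < 132; i++)         /* a 132-character input line */
        putc('x', infile);
    putc('\n', infile);
    rewind(infile);

    /* The format string becomes "%100[^\n]". */
    rc = fscanf(infile, "%" STR(MAX_LINE_CHARS) "[^\n]", buff);

    assert(rc == 1);
    assert(strlen(buff) == MAX_LINE_CHARS);  /* the other 32 stay unread */
    assert(getc(infile) == 'x');

    fclose(infile);
    return 0;
}
```

Changing MAX_LINE_CHARS then changes both the array size and the width in the format string, so the two cannot disagree.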

Alternatively, you can use fgets() to read an input line into a buffer. Like
the %[ conversion specifier above, you specify a maximum buffer length:

char buff[101];
FILE *infile;
...
fgets (buff, sizeof buff, infile);

Like the %[ conversion specifier above, fgets() will read until it sees
either an EOF, a newline, or until we've read 100 characters, and stores
them to buff. Unlike the conversion specifier used above, the newline
character is stored as part of the buffer. Also, unlike fscanf(), there's
no provision to automatically consume and discard any characters beyond
the expected input line length; you'll have to call fgets() (or other
input routine) repeatedly to clear out the input buffer. Note that
fflush() should *not* be used to clear the input buffer; you must use an
actual input routine.
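
As an illustration of the fgets() approach, the sketch below strips the stored newline and, when a line did not fit, reads and discards the rest of it with getc(). read_line() and demo_read_line() are hypothetical names, not part of any library, and the tiny 6-byte buffer is chosen only to make truncation easy to see.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Reads one line into buf (size bytes), drops the
   trailing newline and discards any part of the line that did not fit.
   Returns 1 if a line was read, 0 on end of file or read error. */
static int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* complete line: strip the newline */
    } else {
        /* Line was truncated, or the file ended without a newline:
           consume the rest up to and including the newline. */
        while ((c = getc(fp)) != EOF && c != '\n')
            ;
    }
    return 1;
}

/* Returns 0 on success, -1 if no temporary file could be created. */
static int demo_read_line(void)
{
    char buf[6];                      /* room for 5 characters */
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;
    fputs("abc\nabcdefghij\n\nlast", fp);
    rewind(fp);

    assert(read_line(fp, buf, sizeof buf) && strcmp(buf, "abc") == 0);
    assert(read_line(fp, buf, sizeof buf) && strcmp(buf, "abcde") == 0);
    assert(read_line(fp, buf, sizeof buf) && strcmp(buf, "") == 0);
    assert(read_line(fp, buf, sizeof buf) && strcmp(buf, "last") == 0);
    assert(!read_line(fp, buf, sizeof buf));   /* end of file */

    fclose(fp);
    return 0;
}
```

Note that this version silently drops the excess characters; a caller that needs to know about truncation would have to report it separately.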

Looks interesting, thanks all for the suggestions. Unfortunately, I'm
now being forced to use C++.

Thanks again!
 
John Bode

Dan Pop said:
What happens if the user types 100 or fewer characters followed by a
newline?

What happens if the user simply types the newline character?

What happens if the user presses the eof key instead?

Your example is mishandling all these cases (well, the last one can still
be detected with an feof() call, but it's rather ugly).

Dan

See, this is why I prefer fgets() for interactive input; getting the
conversion specifiers for fscanf() just right is apparently beyond my
abilities.

:p~~~

For the benefit of the OP, could you show how to handle those cases
properly, since I borked it?
 
nrk

Dan Pop said:
What happens if the user types 100 or fewer characters followed by a
newline?

While the line is correctly read into buff, the trailing newline remains in
the input stream. This is because scanning fails at the %*[^\n] specifier
and stops at that point. The only solution I can think of is breaking the
call into two separate ones, one to get the first 100 characters of the line
and any possible trailing non-newline junk, and the next to get rid of
newlines.

Dan Pop said:
What happens if the user simply types the newline character?

This is more insidious. buff is uninitialized in this case, as there is a
failure for the very first conversion specifier (%100[^\n]). The way to
handle this is to look at the return code of fscanf. If it's 0, that means
none of the items were assigned to, so handle it appropriately.

Dan Pop said:
What happens if the user presses the eof key instead?

Again, check the return code of fscanf. If it's EOF, then it signifies end
of input or error in input. Handle appropriately.

John said:
See, this is why I prefer fgets() for interactive input; getting the
conversion specifiers for fscanf() just right is apparently beyond my
abilities.

:p~~~

For the benefit of the OP, could you show how to handle those cases
properly, since I borked it?

My attempt:

int rc;

rc = fscanf(input, "%100[^\n]%*[^\n]", buff);

if ( rc == EOF ) {
    /* end of input/error in input handling */
}
else if ( rc < 1 ) {
    /* buff was not assigned to... */
    /* note that this is simply a newline by itself
       in all likelihood, you can just scan the newline
       out and continue as normal */
}

/* scan out newline + following empty lines */
rc = fscanf(input, "%*[\n]");
if ( rc == EOF ) {
    /* end of input/error in input handling */
}

-nrk.
 
someone else

%s will scan up to the first whitespace character

try fscanf(your_file_pointer, "%[^\n]%*c", your_target);
this will scan everything up to \n into your_target, and then scan and
discard the \n character itself
 
Dan Pop

Dan Pop said:
What happens if the user types 100 or fewer characters followed by a
newline?

nrk said:
While the line is correctly read into buff, the trailing newline remains in
the input stream. This is because scanning fails at the %*[^\n] specifier
and stops at that point. The only solution I can think of is breaking the
call into two separate ones, one to get the first 100 characters of the line
and any possible trailing non-newline junk, and the next to get rid of
newlines.

Right.

Dan Pop said:
What happens if the user simply types the newline character?

nrk said:
This is more insidious. buff is uninitialized in this case, as there is a
failure for the very first conversion specifier (%100[^\n]). The way to
handle this is to look at the return code of fscanf. If it's 0, that means
none of the items were assigned to, so handle it appropriately.

Right, but he was discarding the return value of fscanf, which is
extremely important.

Dan Pop said:
What happens if the user presses the eof key instead?

nrk said:
Again, check the return code of fscanf. If it's EOF, then it signifies end
of input or error in input. Handle appropriately.

Same remark as above.

nrk said:
My attempt:

int rc;

rc = fscanf(input, "%100[^\n]%*[^\n]", buff);

if ( rc == EOF ) {
    /* end of input/error in input handling */
}
else if ( rc < 1 ) {
    /* buff was not assigned to... */
    /* note that this is simply a newline by itself
       in all likelihood, you can just scan the newline
       out and continue as normal */
}

/* scan out newline + following empty lines */

It's usually better to leave the following empty lines alone. They may
be significant as such.

nrk said:
rc = fscanf(input, "%*[\n]");
if ( rc == EOF ) {
    /* end of input/error in input handling */
}

Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.
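
For anyone who wants to try this, the two statements can be wrapped in a function and exercised against a temporary file. This is a sketch under a few assumptions: get_line() and demo_get_line() are made-up names, and the buffer is reset inside the function on every call (the snippet above initializes it once at its definition instead).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper. buff must have room for 100 characters plus the
   terminator. Returns the fscanf() result: EOF, 0 (empty line) or 1. */
static int get_line(FILE *input, char buff[101])
{
    int rc;

    buff[0] = '\0';                   /* so an empty line yields "" */
    rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
    if (!feof(input))
        getc(input);                  /* consume the newline itself */
    return rc;
}

/* Returns 0 on success, -1 if no temporary file could be created. */
static int demo_get_line(void)
{
    char buff[101];
    FILE *input = tmpfile();

    if (input == NULL)
        return -1;
    fputs("abc\n\nxyz\n", input);
    rewind(input);

    assert(get_line(input, buff) == 1 && strcmp(buff, "abc") == 0);
    assert(get_line(input, buff) == 0 && strcmp(buff, "") == 0);
    assert(get_line(input, buff) == 1 && strcmp(buff, "xyz") == 0);
    assert(get_line(input, buff) == EOF);

    fclose(input);
    return 0;
}
```

The empty second line comes back as rc == 0 with an empty string, which is exactly the case the single-call version could not get past.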

Dan
 
nrk

Dan Pop wrote:

It's usually better to leave the following empty lines alone. They may
be significant as such.

Agreed. However, if all that needs to be done is maintain a line number
count, I think the fscanf can be modified to add a %n specifier to achieve
the same result.
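
One possible reading of the %n idea, as a sketch only: skip_newlines() and demo_skip_newlines() are hypothetical names, and the count is simply the number of newline characters consumed by that one call. Because %n is only reached when %*[\n] matched something, the counter has to start at 0.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: skips a run of newlines and returns how many
   were consumed, so a caller can add that to its line counter. */
static int skip_newlines(FILE *input)
{
    int n = 0;    /* stays 0 if no newline matches: %n is never reached */

    if (fscanf(input, "%*[\n]%n", &n) == EOF)
        return 0; /* end of file or read error before any newline */
    return n;
}

/* Returns 0 on success, -1 if no temporary file could be created. */
static int demo_skip_newlines(void)
{
    FILE *input = tmpfile();

    if (input == NULL)
        return -1;
    fputs("a\n\n\nb", input);
    rewind(input);

    assert(getc(input) == 'a');
    assert(skip_newlines(input) == 3);   /* three newlines skipped */
    assert(skip_newlines(input) == 0);   /* next character is 'b' */
    assert(getc(input) == 'b');

    fclose(input);
    return 0;
}
```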

Dan Pop wrote:
Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.

Is there any specific reason you've used feof instead of directly testing rc
against EOF above? Wouldn't rc != EOF give you the same effect?

-nrk.
 
Dan Pop

nrk said:
Agreed. However, if all that needs to be done is maintain a line number
count, I think the fscanf can be modified to add a %n specifier to achieve
the same result.

When reading input from the terminal, an empty line often means the user's
acceptance of a default value. Your approach merely complicates the
things in such a case.

Dan Pop wrote:
Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.

nrk said:
Is there any specific reason you've used feof instead of directly testing rc
against EOF above? Wouldn't rc != EOF give you the same effect?

What happens if the eof condition occurs after fscanf has already read 20
characters? rc will be 1, but there is no point in calling getc(),
especially on certain implementations with non-sticky eof where getc()
will effectively attempt to get more input.
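
A small check of that situation, offered as a sketch: demo_partial_last_line() is a made-up name, and a temporary file holding 20 characters with no trailing newline stands in for input that hits end of file mid-line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success, -1 if no temporary file could be created. */
static int demo_partial_last_line(void)
{
    char buff[100 + 1] = "";
    FILE *input = tmpfile();
    int rc;

    if (input == NULL)
        return -1;
    fputs("twenty characters...", input);   /* no newline at the end */
    rewind(input);

    rc = fscanf(input, "%100[^\n]%*[^\n]", buff);

    assert(rc == 1);                  /* buff was assigned... */
    assert(strcmp(buff, "twenty characters...") == 0);
    assert(feof(input));              /* ...yet end of file was reached */

    /* Testing rc alone would have led to a pointless getc() here;
       testing feof() skips it. Only the next call reports EOF. */
    rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
    assert(rc == EOF);

    fclose(input);
    return 0;
}
```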

Dan
 
nrk

Thank you. I had missed considering both interactive input, and the case of
end of file being reached with rc == 1.

-nrk.
 
