Best way to input from stdin?

Rui Maciel

I'm writing a program that supports input from stdin. To be able to do that I tend to rely on a simple loop
that tests the return of fgets(), such as the following example:

if (fgets (buffer, BUFFER_SIZE, stdin) != NULL)
{
// read the buffer
}


I've been doing this for a while but I'm not sure that this is the best way to handle it. A quick google on
the subject failed to return any meaningful result and searching this group's history through google groups
ended up being a very disappointing experience (a search for nothing more than "stdin" returns only 14
results that only go as far as June 8th). So, what's the best way to handle input from stdin?


Thanks in advance,
Rui Maciel
 
bartc

Rui Maciel said:
I'm writing a program that supports input from stdin. To be able to do
that I tend to rely on a simple loop
that tests the return of fgets(), such as the following example:

if (fgets (buffer, BUFFER_SIZE, stdin) != NULL)
{
// read the buffer
}


I've been doing this for a while but I'm not sure that this is the best
way to handle it.

What's the problem with doing things this way?

If you're worried about dealing with input lines of any conceivable length,
then everyone apparently writes their own 'getline' routines. You might try
searching for that, and fgetline, instead.

And in the code fragment you've given, I'm not sure how useful comparing the
result to NULL will be (assuming live input). When the user enters an empty
line, the buffer should contain "\n"; the function will succeed.
 
Eric Sosman

Rui said:
I'm writing a program that supports input from stdin. To be able to do that I tend to rely on a simple loop
that tests the return of fgets(), such as the following example:

if (fgets (buffer, BUFFER_SIZE, stdin) != NULL)
{
// read the buffer
}


I've been doing this for a while but I'm not sure that this is the best way to handle it. A quick google on
the subject failed to return any meaningful result and searching this group's history through google groups
ended up being a very disappointing experience (a search for nothing more than "stdin" returns only 14
results that only go as far as June 8th). So, what's the best way to handle input from stdin?

You can't judge "best" without an objective of some kind.
The snippet you've shown looks perfectly plausible for many
situations, although there may be difficulty with input lines
that are too long (more than BUFFER_SIZE-2 "payload" characters,
plus the line-ending '\n', plus the string-ending '\0').

"Bless you, it all depends!" -- Pitti-Sing
 
Rui Maciel

Eric said:
You can't judge "best" without an objective of some kind.
The snippet you've shown looks perfectly plausible for many
situations, although there may be difficulty with input lines
that are too long (more than BUFFER_SIZE-2 "payload" characters,
plus the line-ending '\n', plus the string-ending '\0').

I intend to use the code in that snippet to simply fill a buffer from a file that is then used by a parser to
build a document tree. The parser then "waits" for more input once it reaches the buffer's end. Is it
possible that, in this case and judging by the example code, some problem may come up if the input lines are
too long?


Rui Maciel
 
James Kuyper

Rui said:
I'm writing a program that supports input from stdin. To be able to do that I tend to rely on a simple loop
that tests the return of fgets(), such as the following example:

if (fgets (buffer, BUFFER_SIZE, stdin) != NULL)
{
// read the buffer
}


I've been doing this for a while but I'm not sure that this is the best way to handle it. A quick google on
the subject failed to return any meaningful result and searching this group's history through google groups
ended up being a very disappointing experience (a search for nothing more than "stdin" returns only 14
results that only go as far as June 8th). So, what's the best way to handle input from stdin?

That depends entirely upon what the input looks like and what you need
to do with it. That's a key reason why there's several different ways to
do it. The approach you're currently using is fine for line-oriented
text files, completely inappropriate for binary files, and not
necessarily a good way to go if you're reading text files that are not
line-oriented.
 
Rui Maciel

bartc said:
What's the problem with doing things this way?

I believe there isn't a problem and until now everything worked well. Nonetheless, as "getting something to
work" isn't exactly the same thing as "doing it right" then it is better to make sure that it is as good as
it appears to be.

If you're worried about dealing with input lines of any conceivable
length, then everyone apparently writes their own 'getline' routines. You
might try searching for that, and fgetline, instead.

I've browsed some articles on getline() and it appears that it works like the code snippet I've posted here
with the added feature that the function also returns when stumbling on a specific delimiter character.

As that feature forces the getline() function to compare all characters with a given delimiter character
then it appears to be needlessly slower, as the input always needs to be parsed after being fed to a
buffer.

What is getline() used for?

And in the code fragment you've given, I'm not sure how useful comparing
the result to NULL will be (assuming live input). When the user enters an
empty line, the buffer should contain "\n"; the function will succeed.

The NULL test is performed in order to check for EOF.
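
One caveat: fgets() returns NULL on a read error as well as at end-of-file, so the NULL test alone does not tell the two apart. A minimal sketch of a read loop that checks ferror() once the loop ends; the function name `count_chunks` and the `BUFFER_SIZE` value of 256 are assumptions made up for illustration:

```c
#include <stdio.h>

#define BUFFER_SIZE 256   /* assumed size, chosen only for this sketch */

/* Reads the stream with fgets() until it returns NULL.
   Returns the number of successful fgets() calls, or -1 if the
   loop stopped because of a read error rather than end-of-file. */
long count_chunks(FILE *in)
{
    char buffer[BUFFER_SIZE];
    long chunks = 0;

    while (fgets(buffer, BUFFER_SIZE, in) != NULL)
        chunks++;             /* a real program would parse buffer here */

    if (ferror(in))
        return -1;            /* NULL was caused by an error, not EOF */
    return chunks;
}
```

Calling `count_chunks(stdin)` would process standard input; any other open stream works the same way.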


Rui Maciel
 
Rui Maciel

James said:
That depends entirely upon what the input looks like and what you need
to do with it. That's a key reason why there's several different ways to
do it. The approach you're currently using is fine for line-oriented
text files,

That's good to hear.

completely inappropriate for binary files

What's the best way to read data from a binary file?

and not
necessarily a good way to go if you're reading text files that are not
line-oriented.

How come?


Rui Maciel
 
Eric Sosman

Rui said:
I intend to use the code in that snippet to simply fill a buffer from a file that is then used by a parser to
build a document tree. The parser then "waits" for more input once it reaches the buffer's end. Is it
possible that, in this case and judging by the example code, some problem may come up if the input lines are
too long?

(The "example code" was a single fgets() call.)

If a line is too long for the buffer, fgets() will read as
much as it can, filling all but the last buffer position with
data from the input and placing a '\0' in the last position.
Whether that's a "problem" or not depends on what the program
then does with the partial line. Note that the un-read tail of
the line is still pending on the input, ready to be read as if
it were the "next" line -- another likely source of confusion
if the data in the lines is structured.

You can try strchr(buffer, '\n') to see whether a complete
line has been read, and this *almost* works. If it returns
non-NULL, fgets() has in fact read an entire line. But if it
returns NULL, there are two possibilities: fgets() may have
read a partial line, or the very last line of input may lack
a '\n' and fgets() has read all the way to end-of-file. (On
some systems this situation cannot arise, but on others it can.)
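
The check described above can be sketched as follows. The `line_status` codes and the `read_chunk` name are hypothetical, invented for this example; feof() is used to tell a partial line from an unterminated last line:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not part of the C library. */
enum line_status {
    LINE_NONE,               /* fgets() returned NULL: EOF or read error  */
    LINE_COMPLETE,           /* buffer holds a whole line, '\n' included  */
    LINE_PARTIAL,            /* buffer filled up; more of the line remains */
    LINE_LAST_UNTERMINATED   /* input ended before any '\n' was seen      */
};

/* Read one chunk with fgets() and classify what was read. */
enum line_status read_chunk(char *buffer, int size, FILE *in)
{
    if (fgets(buffer, size, in) == NULL)
        return LINE_NONE;

    if (strchr(buffer, '\n') != NULL)
        return LINE_COMPLETE;

    /* No newline: either the buffer ran out of room, or the input
       ended first. fgets() sets the EOF indicator in the second case. */
    if (feof(in))
        return LINE_LAST_UNTERMINATED;
    return LINE_PARTIAL;
}
```

One corner case remains: if an unterminated last line exactly fills the buffer, fgets() stops before touching end-of-file, so this sketch reports LINE_PARTIAL and the following call reports LINE_NONE.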
 
bartc

Rui said:
I've browsed some articles on getline() and it appears that it works
like the code snippet I've posted here with the added feature that
the function also returns when stumbling on a specific delimiter
character.

As that feature forces the getline() function to compare all
characters with a given delimiter character then it appears to be
needlessly slower, as the input always needs to be parsed after being
fed to a buffer.

That character is likely to be '\n', which if present is going to be at the
end of a line, so removing it is a quick operation, at least after strlen()
is used. Since this is file I/O, I don't think strlen() is a significant
overhead.

What is getline() used for?

The problem with fgets() is that you have to specify an upper line length,
the string returned *might* have a '\n' at the end, and if it doesn't, then
fgets() will have left unread characters in the line, which will appear the
next time you call fgets() as extra phantom lines, probably screwing up your
program.

I believe getline() routines get around some of these problems.

If you're not worried about potential gigabyte-length lines, but only those
of a reasonable length, then you just need to cleanly strip the '\n' (if it
will affect your code), safely dump any text that will exceed your buffer,
and possibly signal the fact that you have an overflow line which could mean
a data error.
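
A minimal sketch of such a wrapper, assuming a caller-supplied fixed buffer. The name `read_line` and its result codes are hypothetical, not an existing library routine:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum { LINE_EOF = 0, LINE_OK = 1, LINE_TRUNCATED = 2 };

/* Reads one line into buffer without its '\n'. Text that does not fit
   is discarded and reported through the LINE_TRUNCATED result. */
int read_line(char *buffer, int size, FILE *in)
{
    size_t len;
    long discarded = 0;
    int c;

    if (fgets(buffer, size, in) == NULL)
        return LINE_EOF;                /* end-of-file or read error */

    len = strlen(buffer);
    if (len > 0 && buffer[len - 1] == '\n') {
        buffer[len - 1] = '\0';         /* strip the newline */
        return LINE_OK;
    }

    /* No newline stored: consume the rest of the line so that it cannot
       show up as a phantom line on the next call. */
    while ((c = getc(in)) != EOF && c != '\n')
        discarded++;

    return discarded > 0 ? LINE_TRUNCATED : LINE_OK;
}
```

Counting the discarded characters means a line that exactly fills the buffer, or a final line with no '\n', is still reported as LINE_OK because no data was lost.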

So use an existing getline() or write another variation.

The NULL test is performed in order to check for EOF.

Ok, then your code isn't necessarily just for stdin, but for any file.
 
Eric Sosman

Rui said:
What's the best way to read data from a binary file?

What's the best way to breathe? (Give one answer only,
please, and make sure it's appropriate for all conceivable
and inconceivable circumstances: Swimming the English Channel,
hiding behind a door to escape detection by the Daleks on
the other side, making your Metropolitan Opera debut, doing
Lamaze exercises, being pepper-sprayed, ...)

In other words -- and for at least the third time in
this thread, hint, hint -- There is no "best" in isolation,
but only in relation to some goal or set of goals. Describe
your purposes, describe the sort of data you're reading, and
maybe someone will have suggestions.
 
James Kuyper

Rui said:
What's the best way to read data from a binary file?

fread(), unless you have access to a software package that knows the
format of the binary file, in which case you're better off using that
package rather than wasting time writing your own code to do the same
thing. Most of the files my programs read are in HDF format, for
instance, so I use the HDF C library to read them (for this purpose, all
you need to understand about HDF is that it's a specialized file format).
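
For the plain fread() case, a minimal sketch might look like the following. The helper name `read_bytes` is an assumption for illustration, and on systems that distinguish text from binary streams the file would need to be opened in binary mode:

```c
#include <stdio.h>

/* Reads up to cap bytes from the stream into dst and returns the number
   of bytes stored. A count below cap means end-of-file or a read error;
   the caller can use ferror() to tell which. */
size_t read_bytes(unsigned char *dst, size_t cap, FILE *in)
{
    size_t total = 0;

    while (total < cap) {
        size_t n = fread(dst + total, 1, cap - total, in);
        if (n == 0)
            break;            /* nothing more to read, or an error */
        total += n;
    }
    return total;
}
```

Unlike fgets(), this treats '\n' and '\0' as ordinary data bytes, which is why it suits binary input.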

How come?

Getting a line at a time is a very good idea when lines have a special
significance. For a file in which a newline character is just another
whitespace character, no more or less significant than a blank or a tab
character, then depending upon what you're doing with the files, it
might be more appropriate to use either getchar() or scanf().
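
As an illustration of the character-at-a-time approach, here is a small sketch that counts whitespace-separated tokens, treating '\n' like any other whitespace. It uses getc() on an arbitrary stream (getchar() is the same thing for stdin); the name `count_words` is made up for this example:

```c
#include <ctype.h>
#include <stdio.h>

/* Counts runs of non-whitespace characters in the stream.
   Line boundaries play no special role. */
long count_words(FILE *in)
{
    long words = 0;
    int in_word = 0;
    int c;                    /* int, not char, so EOF can be detected */

    while ((c = getc(in)) != EOF) {
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;      /* first character of a new token */
            words++;
        }
    }
    return words;
}
```

No line buffer is involved, so there is no maximum line length to worry about.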
 
Nick Keighley

There's no decent way of writing a completely reusable getline() function.
The reason is that malicious users can potentially pass lines long enough to
exhaust memory, and there is no strategy for dealing with this that will be
right for all programs.
However, as a general rule, fgets() with a big buffer (e.g. 4096 bytes) and
some wrapper code to check for the newline is OK. Most people do not know how
to use fgets() properly. The newline is a sentinel that tells you a full line
has been read. If it is not present, you need to take action, not treat the
half-read line as a full one, which could cause a nasty bug.

and, if used carefully like this, fgets() is often perfectly ok for
a wide variety of programs.
 
