Which kind of makes you wonder exactly what value the standard C library is
giving you. If this whole "ignore '\0'" feature of fgets() were somehow a
necessary or useful feature which one could easily map to a more general
solution, there would be no issue. But it's not.
Maybe you think you've already explained what you mean elsethread,
but would you take the time again, please, to explain (1) what you
mean by "ignore '\0'" in the context of fgets(); and (2) why you
consider this (whatever it is) a problem in the context of C
programming? (I certainly don't understand what your reply to
CBFalconer means -- what do you mean by "map to a more general solution,"
particularly?)
In the interest of shortening this subthread, I will explain what
*I* think you mean by (1), and why (1) is not a problem in C.
fgets() reads characters input from a file until it hits either
'\n' or EOF, and stores these characters into a buffer. When fgets()
is done reading (having hit either '\n' or EOF), it terminates the
buffer with '\0', thus making a string that it returns to the user.
(It also stops reading input when so-and-so many characters have
been read, but that's irrelevant to what I think you're discussing.)
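
For concreteness, here is a rough sketch of that behavior written in terms of getc(). It is not any real library's source; the name sketch_fgets and the choice to return NULL when n <= 0 are my own assumptions. The point is only that no byte value except '\n' gets special handling:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for fgets(), for illustration only.
   Reads at most n-1 bytes from fp into buf, stopping after '\n' or at EOF. */
char *sketch_fgets(char *buf, int n, FILE *fp)
{
    int i = 0;
    int c = 0;                    /* int, not char, so EOF stays distinct */

    assert(buf != NULL && fp != NULL);
    if (n <= 0)                   /* assumption: treat this as a failure */
        return NULL;

    while (i < n - 1) {
        c = getc(fp);
        if (c == EOF)
            break;
        buf[i++] = (char)c;       /* every byte is stored, '\0' included */
        if (c == '\n')
            break;
    }
    if (c == EOF && (i == 0 || ferror(fp)))
        return NULL;              /* nothing read, or a read error */

    buf[i] = '\0';                /* the terminator the caller relies on */
    return buf;
}
```

Nothing in the loop distinguishes a '\0' read from the stream from any other non-newline byte.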
This behavior is perfectly good for dealing with text streams,
but can fail in the following way when used on binary streams:
Suppose the input stream contains a '\0' byte. fgets() reads that
byte, sees that it is not '\n' (and that no EOF condition has been
raised), and stores it in the buffer. Then it proceeds, reading
and storing characters until '\n' or EOF.
So, if the user sees that his buffer contains a string that's
not terminated with '\n', he cannot tell whether the first '\0' in
the buffer is the terminator fgets() wrote on reaching EOF, or a
'\0' byte actually read from the stream. He can check feof() to find
out whether the stream *is* at EOF, but there still remains the
possibility that the stream hit EOF after reading one or more '\0's.
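
A small sketch of the ambiguity, assuming a hosted implementation where tmpfile() succeeds. The helper name visible_length_after_fgets is hypothetical. It writes some raw bytes to a scratch file, reads them back with a single fgets() call, and reports what strlen() makes of the result:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write len raw bytes to a scratch file, read them
   back with one fgets() call into buf (capacity cap), and return
   strlen(buf). Returns (size_t)-1 if the scratch file cannot be set up
   or fgets() fails. */
size_t visible_length_after_fgets(const char *bytes, size_t len,
                                  char *buf, size_t cap)
{
    FILE *fp;
    size_t seen = (size_t)-1;

    assert(bytes != NULL && buf != NULL && cap > 0);

    fp = tmpfile();               /* scratch file, binary update mode */
    if (fp == NULL)
        return seen;

    if (fwrite(bytes, 1, len, fp) == len) {
        rewind(fp);
        if (fgets(buf, (int)cap, fp) != NULL)
            seen = strlen(buf);   /* stops at the first '\0', read or written */
    }
    fclose(fp);
    return seen;
}
```

Given the six bytes 'a', 'b', '\0', 'c', 'd', '\n' and a 16-byte buffer, the helper returns 2: the caller sees the string "ab" with no trailing '\n', which looks just like a final line cut short by EOF, even though fgets() read a complete line.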
So the problem is not that fgets() "ignores" null characters in
the input stream; it's rather that fgets() does *not* perform any
special operations when confronted with one. Which is perfectly
reasonable, since text streams do not contain '\0', and binary
streams assign no interesting semantics to '\n' characters (or
sequences). That is, it is /a priori/ a silly idea to use fgets()
in any situation in which the above-mentioned ambiguity could
arise.
Even if a (possibly malicious) text stream *did* contain a
'\0' byte, how could fgets() possibly deal reasonably with it?
It couldn't treat it as a normal character, because that would
be ambiguous. It couldn't treat it as '\n', because that would
break the existing semantics of fgets(), which guarantee that
each complete line read ends with the terminating '\n' (although
this might be the best of the available "solutions"). And it couldn't
use any kind of escape sequence to represent '\0', because all
other bytes already have unambiguous semantics in the output
of fgets().
The solution, in a C-programming context, is to use fgets()
only on text streams, and on binary streams to use fread(), getc(),
or other functions that treat all bytes symmetrically (i.e., with
no special semantics involving '\n', '\0', or whitespace).
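
As a sketch of what I mean by symmetric treatment, here is a getc() loop that counts every byte in a stream, and the '\0' bytes among them. The helper name count_bytes is hypothetical, and the stream is assumed to be opened in binary mode:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read the rest of fp one byte at a time with getc(),
   returning the total byte count and storing the number of '\0' bytes in
   *nul_count. No byte value ends the loop; only EOF does. */
size_t count_bytes(FILE *fp, size_t *nul_count)
{
    size_t total = 0;
    int c;                        /* int, not char, so EOF stays distinct */

    assert(fp != NULL && nul_count != NULL);

    *nul_count = 0;
    while ((c = getc(fp)) != EOF) {
        if (c == '\0')
            ++*nul_count;         /* counted like any other byte */
        ++total;
    }
    return total;
}
```

Because the length comes from the loop counter (or, with fread(), from its return value) rather than from a terminating '\0', there is nothing left to be ambiguous. After the loop, ferror(fp) tells the caller whether it ended on a read error instead of end-of-file.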
You have implied elsethread that you think this solution a
"non-solution" (or perhaps I'm again imagining sarcasm where
none was intended). If that's the case, would you mind explaining
what you think is wrong with getc() and fread() on binary streams?
-Arthur