CBFalconer said:
However the user should be aware that everything breaks down if the
input system tries to handle a file as text when that file doesn't
adhere to the conventions for text on the system. Thus a
windows/dos file bodily transferred to a linux system will have
those extraneous '\r's. A linux file transferred to windows may
appear to be one long line with no '\n's.
Some programs deliberately treat all files as binary, and try to
make their own decisions about the format. I believe gcc is one of
these.
I'm fairly new to C and especially to ANSI/ISO C, but it seems somewhat
strange to me that the Standard (AFAIK) doesn't attempt to regulate
this aspect of the language as far as the standard streams go.
For example, to me it would seem logical to *always* open stdin in text
mode, so that redirected input would work correctly regardless of the
platform, since I think it's fair to say that most redirected input to
real-world apps and command-line utilities is in the form of text files
and not binary data files. Some (most?) compilers will link startup code
into compiled programs that picks the "best" mode for stdin based on how
the program receives its input. Cygwin gcc is one example: it puts stdin in
text mode if no redirection occurs; however, if you are redirecting
input from a file (even if it's a text file), it will switch stdin into
binary mode, unless you a) explicitly force stdin into text mode in the
source code or b) override this behavior with an external environment
variable.... Wouldn't it make more sense to clearly define this
behavior instead of leaving it to the whim of the specific compiler you
happen to be using at the moment? For example, why not have something in
the Standard to the effect of: "Upon entering main(), the standard
streams stdin, stdout, and stderr shall be in text mode"? Then the
programmer need not worry about compiler quirks like the one I
mentioned above when parsing text files from redirected input, since
newline translation would be guaranteed to occur unless or until the
programmer explicitly switches a stream from text mode into binary mode
at the source-code level.
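
Until something like that is guaranteed, a program can defend itself by
stripping a stray '\r' on its own, whichever mode stdin ends up in. Here is
a minimal sketch, assuming the input is read one line at a time into a
buffer; chomp_line is a made-up name for illustration, not a standard
library function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the Standard or any library):
   remove one trailing line ending from a NUL-terminated string,
   whether it is "\n", "\r\n", or a lone "\r".
   Returns the length of the string after trimming. */
size_t chomp_line(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);

    /* Drop the newline that fgets() leaves at the end, if any. */
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    /* Drop the carriage return that survives when a DOS/Windows
       file is read without newline translation. */
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';

    return len;
}
```

A caller would typically apply it to each buffer filled by
fgets(buf, sizeof buf, stdin) before parsing. It only trims the end of the
string, so a '\r' in the middle of a line is left alone.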
Being a C newcomer and a complete novice in all things Standard, I
wouldn't be at all surprised if my argument here is overly simplistic
or even unfounded. What are your (everyone's) thoughts on this idea? Has
it already been discussed and discarded, or is it beyond the scope of what
the Standard is meant to define? I'm interested to see your thoughts
and comments on this.
Mike S