Mabden said:
Depend on it how? The program is working fine (better now). If you mean
that it may change in a future compiler, I have comments in the code
that I excluded to explain what is going on.

The Windows text file format uses a CR-LF pair to mark the end of a
line. If this is based on some written standard, I don't know what
that standard says about a CR or LF character that's not part of a
CR-LF pair. Apparently the stdio implementation you're using treats a
lone LF as an end-of-line marker, and you're depending on this
behavior to "magically" read Unix-format text files that you've
presumably copied without conversion to a Windows system.
If there's a guarantee somewhere that Windows treats a lone LF
character as an end-of-line marker, that's ok. <OT>The fact that
Notepad doesn't do this leads me to suspect that there is no such
guarantee.</OT>

Mabden said:
You mean in a binary format? I'm not sure what you are saying here, as I
do read and write the data. My original program read in a lf and output
a crlf. This resulted in a crcrlf. That's when I realized it was
automagically adding the cr "for me".

Sorry, I left out some words and failed to proofread. I meant "to
read and write the data in binary mode". I believe it's safer to do
this and control the interpretation of the input data yourself, than
to depend on behavior that isn't guaranteed and could change
unpredictably.
Here's what I think is happening in your program. You read a
character at a time from a file opened in text mode. If the next two
bytes are a CR-LF pair (the Windows standard end-of-line marker),
getc() gives you '\n' (C's internal single-character representation of
an end-of-line marker). If the next byte is a lone LF character (one
not preceded by CR), getc() passes it through unchanged, and LF happens
to have the same value as C's '\n' character. The result: your program
happens to accept either CR-LF or
LF as an end-of-line marker. Given that getc() happens to work this
way, I *think* you can conclude that the rest of stdio will behave
consistently, for example that fgets() will treat a lone LF character
as an end-of-line marker, but I wouldn't bet large sums of money on
this conclusion.
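
To avoid relying on that behavior, the same interpretation can be written out explicitly over a stream opened in binary mode. The following is only a rough sketch: the name getc_any_eol is made up for illustration, the byte values assume an ASCII-based system, and passing a lone CR through unchanged is an assumption, since nothing above says how it should be treated.

```c
#include <stdio.h>

enum { CR = 0x0D, LF = 0x0A };   /* byte values on ASCII-based systems */

/* Read one character from `in`, which must be opened in binary mode
 * ("rb") so the C library does no end-of-line translation of its own.
 * Returns '\n' for a CR-LF pair or for a lone LF.
 * A lone CR is returned unchanged (an assumption, see the lead-in).
 * Returns EOF at end of file or on a read error. */
int getc_any_eol(FILE *in)
{
    int c = getc(in);

    if (c == CR) {
        int next = getc(in);
        if (next == LF)
            return '\n';          /* CR-LF pair: one end-of-line marker */
        if (next != EOF)
            ungetc(next, in);     /* not part of a pair: push it back */
        return CR;
    }
    if (c == LF)
        return '\n';              /* lone LF: also an end-of-line marker */
    return c;
}
```

Called in place of getc() on a file opened with fopen(name, "rb"), this accepts either convention by design rather than by accident of the implementation.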
This happens to work because of the relationship between the Windows
and Unix conventions for marking end-of-line and the value chosen by C
implementations for '\n' (it happens to be the same on both; it
needn't have been). Note that this doesn't work in the opposite
direction. If you copy a Windows text file to a Unix system without
conversion, then read it as a text file, you'll get an extra CR ('\r')
character at the end of each line.
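
For that opposite direction, one possible workaround is to strip the stray CR after reading each line. This is a sketch only; chomp_eol is a hypothetical helper name, and it assumes the line came from something like fgets(), so any end-of-line marker sits at the very end of the string.

```c
#include <string.h>

/* Remove a trailing end-of-line marker from `line` in place.
 * Accepts "\r\n" (Windows), "\n" (Unix), or no marker at all.
 * Note that a trailing '\r' with no '\n' after it is also removed.
 * Returns the length of the line after trimming. */
size_t chomp_eol(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';
    return len;
}
```

After fgets(buf, sizeof buf, fp), calling chomp_eol(buf) leaves the same text whether the file used CR-LF or LF.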
Your program works. With enough research, you might even be able to
prove that it's guaranteed to work. The problem, in my view, is that
the chain of reasoning is too long for comfort.
The best way to convert Unix-format text files to Windows-format text
files is to treat them as binary files (or at least to treat the
"foreign" format as binary) and to do the conversion based on
knowledge of the actual format.