Jordan Abel wrote on 02/22/06 14:37:
Keep in mind that we're speaking of text streams, where
the number of characters written to a stream need not be the
same as the number of bytes written to the file. A familiar
example is putc('\n', stream) on Windows, where one character
generates two bytes. There are also systems where writing a
newline produces no bytes in the file, systems where a file
contains both data bytes and metadata bytes, and systems that
use state-dependent encodings for extended character sets.
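
To make the character/byte distinction concrete, here is a minimal sketch that writes through a text stream and then counts what actually landed in the file by re-reading it through a binary stream. The helper names write_text_lines and count_bytes are made up for illustration, and the file name is whatever the caller passes in.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `lines` copies of "x\n" to `path` through
   a text stream. Returns the number of characters handed to putc, or -1
   on error. */
long write_text_lines(const char *path, int lines)
{
    FILE *fp;
    long chars = 0;
    int i;

    assert(path != NULL && lines >= 0);

    fp = fopen(path, "w");              /* "w" opens a text stream */
    if (fp == NULL)
        return -1;
    for (i = 0; i < lines; i++) {
        if (putc('x', fp) == EOF || putc('\n', fp) == EOF) {
            fclose(fp);
            return -1;
        }
        chars += 2;
    }
    if (fclose(fp) != 0)
        return -1;
    return chars;
}

/* Hypothetical helper: counts the bytes in `path` by reading it through
   a binary stream. Returns -1 on error. */
long count_bytes(const char *path)
{
    FILE *fp;
    long bytes = 0;

    assert(path != NULL);

    fp = fopen(path, "rb");             /* "rb" opens a binary stream */
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        bytes++;
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;
    return bytes;
}
```

On a typical Unix-like system the two numbers should agree; on Windows count_bytes would normally come out larger, since each '\n' becomes two bytes. Other systems may differ in either direction, which is the point.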
If you're dealing with something that might be a state-dependent
encoding, you should probably be using fgetpos and fsetpos
exclusively.
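
A rough sketch of the fgetpos/fsetpos pattern, in case it helps: remember a position, read a line, go back, and read the same line again. The function name read_line_twice is invented for this example. The fpos_t object is treated as opaque, and every call is checked, since either function may fail.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the next line of `fp` into `first`, returns
   to where it started, and reads the same line into `second`. Each
   buffer must hold at least `n` chars. Returns 0 on success, -1 on any
   failure. */
int read_line_twice(FILE *fp, char *first, char *second, int n)
{
    fpos_t mark;    /* opaque: may carry shift state as well as offset */

    assert(fp != NULL && first != NULL && second != NULL && n > 0);

    if (fgetpos(fp, &mark) != 0)
        return -1;
    if (fgets(first, n, fp) == NULL)
        return -1;
    if (fsetpos(fp, &mark) != 0)
        return -1;
    if (fgets(second, n, fp) == NULL)
        return -1;
    return 0;
}
```

Nothing here does arithmetic on the saved position; the only thing done with mark is to hand it back to fsetpos on the same stream.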
It's not so much a problem of undefined behavior as of failure that
gives no reliable indication. Seek to a position that
happens to be in the middle of a multi-byte character or in the
middle of a stretch of metadata, and the problem may be difficult
to detect: a byte in a file does not always stand alone, but may
require prior context (at an arbitrary separation) for proper
interpretation. Here's the stuff of a nightmare or two: Imagine
opening a stream for update, seeking to the middle of a stretch of
metadata, successfully writing "Hello, world!" there, and only
later discovering that the successful write has corrupted the file
structure and made the entire tail end unreadable ...
An implementation may silently force a file opened in update mode to
be a binary stream. An implementation that has such issues probably
should do so. (It would be nice if some way were provided for the
program to detect this, but unfortunately there does not seem to be one.)
It would be nice if one could do meaningful arithmetic on file
position indicators in text streams, but given the rich variety of
file encodings that exist in the world it is not always possible
to do so.
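
For what it's worth, the one use of fseek on a text stream that the standard does bless is a seek with SEEK_SET to a value previously returned by ftell on the same stream (or an offset of zero). A small sketch, with helper names mark_next_line and return_to_mark made up for the purpose:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: skips the rest of the current line and returns
   the ftell value for the start of the next one, or -1L on failure. */
long mark_next_line(FILE *fp)
{
    int c;

    assert(fp != NULL);

    while ((c = getc(fp)) != EOF && c != '\n')
        ;                               /* discard up to the newline */
    return ftell(fp);
}

/* Hypothetical helper: returns to a position obtained earlier from
   ftell on the same stream. Returns 0 on success, -1 on failure. */
int return_to_mark(FILE *fp, long mark)
{
    assert(fp != NULL);

    if (mark == -1L)
        return -1;
    /* Fine on a text stream: mark came from ftell, whence is SEEK_SET.
       Not portable on a text stream: fseek(fp, mark + 3, SEEK_SET) or
       fseek(fp, -3L, SEEK_CUR), because the long need not be a count
       of characters. */
    return fseek(fp, mark, SEEK_SET) == 0 ? 0 : -1;
}
```

The value in mark is treated as a token to be handed back unchanged, not as a number to compute with.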
There is a difference between "not meaningful" and "undefined" - I
am entirely opposed to the dilution of the term "undefined behavior"
in this newsgroup.
I think that the implementation should detect all those issues and
treat them as "a request that cannot be satisfied", and return a
value indicating failure. I think there is a reading of the standard
which supports this view.