EOF (novice)

Hallvard B Furuseth · Dec 2, 2003

James said:
Of course not. "We don't know nor should we care" clearly spells that
out. You claimed that EOF was *not* stored on the physical medium.

No, I claimed that _C's_ EOF is not. At least that was what I meant to
say. Anyway, maybe we have just been saying the same thing in different
ways.

Arthur J. O'Dwyer · Dec 2, 2003

I still don't get it. Each and every byte combination is still valid
in a binary file, therefore it *cannot* be used as eof marker.

A trivial example would be an MS-DOS-like hybrid system on which the
byte 0xA1 would indicate the end of each file (text or binary). [Not
a typo; I specifically changed it from 0x1A so that EOF could be
#defined to be 0xA1A1 on this hypothetical 16-bit system.]
"But then how does a program represent the literal byte 0xA1 on
the disk?" you ask. Simple -- escape codes. For example, the EOF
code could be 0xA1A1, and the escape code for the literal byte 0xA1
could be 0xA100 (big-endian). This would satisfy all the requirements
of the C standard on file systems (i.e., precious few), while being
technically possible.
Heck, you could even Huffman-encode every single file on the system
to save space, and use some rare codon to indicate EOF. That's getting
closer to what I think James means by "a compressed filesystem."
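
For readers who want to see the escape-code idea spelled out, here is a minimal sketch in C. It assumes exactly the hypothetical encoding described above (0xA1 0x00 for a literal 0xA1 byte, 0xA1 0xA1 for end of file); the names MARK, escape_encode and escape_decode are made up for illustration and do not come from any real system.

```c
#include <assert.h>
#include <stddef.h>

#define MARK 0xA1u  /* hypothetical marker byte from the post above */

/* Encode n logical bytes into out and append the 0xA1 0xA1 end-of-file
   code. Returns the physical length; out must hold 2*n + 2 bytes. */
size_t escape_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t k = 0;
    assert(in != NULL || n == 0);
    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[k++] = in[i];
        if (in[i] == MARK)
            out[k++] = 0x00;    /* 0xA1 0x00 stands for a literal 0xA1 */
    }
    out[k++] = MARK;            /* 0xA1 0xA1 marks the end of the file */
    out[k++] = MARK;
    return k;
}

/* Decode physical bytes until the end-of-file code is found. Returns the
   logical length, or (size_t)-1 if the data is malformed or the code is
   missing. out must hold at least n bytes. */
size_t escape_decode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t k = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i] != MARK) {
            out[k++] = in[i];
            continue;
        }
        if (i + 1 >= n)
            return (size_t)-1;  /* marker byte with nothing after it */
        if (in[i + 1] == MARK)
            return k;           /* end-of-file code reached */
        if (in[i + 1] != 0x00)
            return (size_t)-1;  /* unknown escape */
        out[k++] = MARK;        /* escaped literal 0xA1 */
        i++;                    /* skip the 0x00 that followed the marker */
    }
    return (size_t)-1;          /* ran out of data without an EOF code */
}
```

Encoding the three bytes 0x01 0xA1 0x02 with this sketch yields six physical bytes, 0x01 0xA1 0x00 0x02 0xA1 0xA1, and decoding them gives the original three back. Every byte value remains representable in the logical file, which is the whole point of the escape.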

That's true and irrelevant in the case of text files. My point is that
your scheme simply does not work for binary files.

[In case Dan hasn't already thought of this: fseek() is not required
to run in constant time. Binary files don't have to be random-access
in their "natural state"; it just happens that all existing systems
do it that way.]

Furthermore, EOF
is a C macro having no connection with whatever mechanism the
implementation uses to detect the end of a file. All we know about it
is that it expands to a negative integer value.

Correct, of course. But I just gave a possible implementation
on which the system's EOF marker, 0xA1A1, is exactly the same value
as the C compiler's EOF value. So James' scenario is not impossible,
merely implausible. Heck, for all I know it might be *common* on
some highly esoteric platforms! ;-)
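
As an aside for the novice in the thread title, the quoted point about the EOF macro can be illustrated with the usual read loop. This is only a sketch; count_bytes is a made-up helper name, and the loop relies on nothing beyond what the standard says about getc and EOF.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining on a stream.
   c must be an int, not a char: getc returns either an unsigned char
   value converted to int, or the negative value EOF. Whatever the
   system stores on disk to mark the end of a file never shows up here. */
long count_bytes(FILE *fp)
{
    int c;
    long n = 0;
    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

On a binary stream holding the bytes 0xFF, 0x1A and 'x', this loop counts three bytes on a typical implementation where int is wider than char: none of them compares equal to EOF, because each arrives as a non-negative int.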

-Arthur

Dan Pop · Dec 3, 2003

I still don't get it. Each and every byte combination is still valid
in a binary file, therefore it *cannot* be used as eof marker.

A trivial example would be an MS-DOS-like hybrid system on which the
byte 0xA1 would indicate the end of each file (text or binary). [Not
a typo; I specifically changed it from 0x1A so that EOF could be
#defined to be 0xA1A1 on this hypothetical 16-bit system.]
"But then how does a program represent the literal byte 0xA1 on
the disk?" you ask. Simple -- escape codes. For example, the EOF
code could be 0xA1A1, and the escape code for the literal byte 0xA1
could be 0xA100 (big-endian). This would satisfy all the requirements
of the C standard on file systems (i.e., precious few), while being
technically possible.

The semantics of fseek and ftell on binary streams render this scheme
painful to implement: the byte offsets used by the program or reported
to the program are not the real byte offsets inside the file. But this
is only the tip of the iceberg. Imagine that I want to overwrite a
sequence of ordinary bytes with a sequence of 0xA1 bytes. Not only would
the whole remainder of the file have to be rewritten on the disk, but
the physical size of the file would increase, creating problems if there
is no more room on the disk (from the user's POV the file has the same
size, but it suddenly no longer fits on the disk). I'm afraid no one
would want to use your implementation ;-)
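
A rough sketch of the bookkeeping this objection implies, assuming the hypothetical scheme quoted above in which every literal 0xA1 occupies two bytes on disk and the file ends with a two-byte code. The function names physical_offset and physical_size are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Physical offset at which logical byte `pos` starts. Each literal 0xA1
   before it costs two bytes on disk, so finding the offset means
   scanning all earlier data: O(pos), not constant time. */
size_t physical_offset(const unsigned char *logical, size_t pos)
{
    size_t phys = 0;
    assert(logical != NULL || pos == 0);
    for (size_t i = 0; i < pos; i++)
        phys += (logical[i] == 0xA1) ? 2 : 1;
    return phys;
}

/* Physical size of a file holding n logical bytes, including the
   two-byte end-of-file code. */
size_t physical_size(const unsigned char *logical, size_t n)
{
    return physical_offset(logical, n) + 2;
}
```

Take a four-byte file 0x01 0x02 0x03 0x04: it occupies six bytes on disk. Overwrite the two middle bytes with 0xA1 and the program still sees four bytes, but the disk now needs eight, and the last byte has moved from physical offset 3 to physical offset 5.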

Dan

glen herrmannsfeldt · Dec 5, 2003

Arthur said:
[In case Dan hasn't already thought of this: fseek() is not required
to run in constant time. Binary files don't have to be random-access
in their "natural state"; it just happens that all existing systems
do it that way.]

The file system used on some IBM mainframes does not make fseek() easy.

Files with fixed length records are normally stored on disk in
fixed length blocks, except for the last block. If an existing file is
appended to, it can have a short block that is not at the end, making
random access difficult. Though if the library routines keep track of
the block sizes the first time through, it would be easy to fseek() to
any previously seen position.

For files with variable length records (V or VB), the only way to seek would be
to keep track of the block lengths in the file.
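
One way a library might keep track of block lengths is an in-memory index built as blocks are first read. The sketch below is purely illustrative: struct block_index, MAX_BLOCKS and the index_* functions are invented names, not part of any IBM or C library interface, and real record formats carry more detail than a bare length.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64   /* arbitrary limit for this sketch */

struct block_index {
    size_t start[MAX_BLOCKS + 1];  /* start[i] = logical offset of block i */
    size_t count;                  /* number of blocks seen so far */
};

void index_init(struct block_index *ix)
{
    assert(ix != NULL);
    ix->count = 0;
    ix->start[0] = 0;
}

/* Record the length of the next block the first time it is read.
   Returns 0 on success, -1 if the index is full. */
int index_add(struct block_index *ix, size_t len)
{
    assert(ix != NULL);
    if (ix->count >= MAX_BLOCKS)
        return -1;
    ix->start[ix->count + 1] = ix->start[ix->count] + len;
    ix->count++;
    return 0;
}

/* Map a logical offset to (block number, offset within that block) by
   binary search over the recorded start offsets. Returns 0 on success,
   -1 if the offset lies beyond the blocks seen so far. */
int index_locate(const struct block_index *ix, size_t off,
                 size_t *block, size_t *within)
{
    size_t lo = 0, hi;
    assert(ix != NULL && block != NULL && within != NULL);
    hi = ix->count;
    if (off >= ix->start[ix->count])
        return -1;
    /* Invariant: start[lo] <= off < start[hi]. */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (ix->start[mid] <= off)
            lo = mid;
        else
            hi = mid;
    }
    *block = lo;
    *within = off - ix->start[lo];
    return 0;
}
```

With blocks of 80, 30 and 80 bytes recorded (a short block in the middle, as after an append), logical offset 100 maps to block 1 at offset 20, and offset 110 maps to the start of block 2. An offset of 190 or more is reported as not yet seen, so a seek there would have to fall back to reading forward.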

-- glen
