Value of EOF


Peter Nilsson

CBFalconer said:
EOF is a macro defined in stdio.h (and other places).

It's only defined in <stdio.h>. All that is known about it is that
it is negative, fits into an int,

True.

and is outside the range of char.

No. If char is signed (as it is on many implementations), then -1
(the most common value of EOF) is well within its range. It's
(necessarily) outside the range of unsigned char.
 

Mark McIntyre

No. If char is signed (as it is on many implementations), then -1
(the most common value of EOF) is well within its range.

However ISTR that EOF is guaranteed not to be any char that the
platform uses (else how would the std library functions differentiate
eof from a valid char?), so for some definitions of 'out of range' it's
out.
 

Keith Thompson

Mark McIntyre said:
However ISTR that EOF is guaranteed not to be any char that the
platform uses (else how would the std library functions differentiate
eof from a valid char?), so for some definitions of 'out of range' it's
out.

CBFalconer simply made a slight error (which he later acknowledged).
The value of EOF is required to be a negative number which, because
it's negative, is necessarily outside the range of values of unsigned
char.

There's no implication that EOF needs to be a (signed) char value that
the platform doesn't use. For example, on a system with signed 8-bit
char using ISO 8859-1, the char value -1 corresponds to a printable
character (a 'y' with a diaeresis). This isn't a problem for
the standard library functions. For example, fgetc() returns either
EOF (on end-of-file or error) or the input character interpreted as an
*unsigned* char converted to int.
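
To make that concrete, here is a minimal sketch (assuming CHAR_BIT is
8, plain char is signed, the character set is ISO 8859-1, and EOF has
the common value -1; none of these specifics are mandated):

#include <stdio.h>

int main(void)
{
    char c = '\xFF';   /* 'y' with diaeresis; as a signed char, -1 */
    int ch;

    if (c == EOF)      /* the naive comparison collides: -1 == -1 */
        puts("plain char 0xFF is indistinguishable from EOF");

    /* fgetc() sidesteps the collision by returning the byte as an
       unsigned char converted to int, i.e. 255 here, never -1. */
    ch = (unsigned char)c;
    if (ch != EOF)
        puts("the value fgetc() would hand back (255) is not EOF");

    return 0;
}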

The standard does require (C99 6.2.5p3) that all members of the basic
character set have positive values when expressed as plain
char, but that's not necessarily related to EOF. (I'm not even sure
why that requirement exists.) ASCII-based character sets meet this
requirement automatically; EBCDIC-based character sets require plain
char to be unsigned.
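
For instance, 'a' in EBCDIC is code 0x81 (decimal 129), which cannot
be a positive value in a signed 8-bit char; hence the observation
above. A sketch of an old-style compile-time check of the 6.2.5p3
guarantee:

/* 6.2.5p3 guarantees that members of the basic character set are
   positive as plain char, so this (pre-C11) assertion must compile
   on any conforming implementation, EBCDIC or ASCII: */
typedef char basic_set_is_positive[('a' > 0 && '0' > 0) ? 1 : -1];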

(If I were designing the language from scratch today, I'd require
plain char to be unsigned, avoiding all the stuff about having to
interpret characters as unsigned char in various contexts. I'd also
make a stronger distinction between character types and integer types,
and between characters and bytes. Too late.)
 

pete

Mark McIntyre wrote:
However ISTR that EOF is guaranteed not to be any char that the
platform uses (else how would the std library functions differentiate
eof from a valid char?), so for some definitions of 'out of range' it's
out.

Standard Library functions need not be written in portable C,
so there's any number of ways.

But, you can use ferror to differentiate,
for output functions, for example.

int fputs(const char *s, FILE *stream)
{
    while (*s != '\0') {
        if (putc(*s, stream) == EOF && ferror(stream) != 0) {
            return EOF;
        }
        ++s;
    }
    return 0;
}

feof can be used additionally
to get the whole story of what's going on with input functions.
 

Michael Wojcik

If I were designing the language from scratch today, I'd require
plain char to be unsigned, avoiding all the stuff about having to
interpret characters as unsigned char in various contexts.

I've wondered whether there are architectures where forcing char to
be unsigned would adversely affect performance - if for example there
was a fast sign-propagating widening operation but not a fast non-
propagating one.

My assumption is that C89 left the signedness of plain char up to the
implementation because there were existing pre-standard implementations
on both sides, but I'd be interested to hear if performance was also
possibly a consideration.

--
Michael Wojcik (e-mail address removed)

Maybe, but it can't compete with _SNA Formats_ for intricate plot
twists. "This format is used only when byte 5, bit 1 is set to 1
(i.e., when generalized PIU trace data is included)" - brilliant!
 

Chris Torek

The fgetc() function returns a value "as if" the (plain) char had
been unsigned to start with, so that on ordinary signed-char 8-bit
systems, fgetc() returns either EOF, or a value in [0..255]. As
long as UCHAR_MAX <= INT_MAX, EOF can be defined as any negative
"int" value.

The fputc() function should logically be called with equivalent
values, but the Standard says that it just converts its argument
to unsigned char -- so fputc(EOF, stream) just does the same thing
as fputc((unsigned char)EOF, stream).
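
A sketch of what that conversion means in practice (assuming EOF is
-1 and UCHAR_MAX is 255; the file name is made up):

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("demo.bin", "wb");   /* hypothetical file */
    int r;

    if (f != NULL) {
        /* (unsigned char)EOF is 0xFF under these assumptions, so
           this writes the byte 0xFF and, on success, returns 255
           rather than reporting an error. */
        r = fputc(EOF, f);
        printf("fputc(EOF, f) returned %d\n", r);
        fclose(f);
    }
    return 0;
}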

Hosted implementations on machines in which "char" and "int" have
the same range (e.g., 32-bit char and 32-bit int) have a problem.
(The only implementations I know of in which char and int have the
same range are not "hosted", so they do not have to make stdio
work.)

Standard Library functions need not be written in portable C,
so there's any number of ways.

Indeed. On the other hand, the example below is not particularly
good, I think:
But, you can use ferror to differentiate,
for output functions, for example.

int fputs(const char *s, FILE *stream)
{
    while (*s != '\0') {
        if (putc(*s, stream) == EOF && ferror(stream) != 0) {
            return EOF;
        }
        ++s;
    }
    return 0;
}

The first problem is that ferror(stream) could be nonzero even
before entering this fputs(). (This is not actually harmful in
this case, as I will explain in a moment, but it suggests a
perhaps-incorrect model. Just because output failed earlier
does not necessarily mean that output will continue to fail.
Consider a floppy disk with a single bad sector, in which writes
to the bad sector fail, but writes to the rest of the disk work.)

The second problem is that the test is redundant, except on those
UCHAR_MAX > INT_MAX implementations that have problems implementing
fgetc(). The reason is that fputc() returns the character put,
i.e., (unsigned char)*s, on success. If UCHAR_MAX <= INT_MAX,
fputc() (and thus putc()) can only return EOF on failure, in the
same way that fgetc() can only return EOF on failure-or-EOF.
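
In other words, on such implementations the ferror() test can simply
be dropped. A minimal sketch (renamed so as not to clash with the
library's fputs):

#include <stdio.h>

/* Valid whenever UCHAR_MAX <= INT_MAX: on success putc() returns
   (unsigned char)*s, which is never negative, so a return value of
   EOF by itself signals failure. */
int my_fputs(const char *s, FILE *stream)
{
    while (*s != '\0') {
        if (putc(*s, stream) == EOF)
            return EOF;
        ++s;
    }
    return 0;
}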
 

pete

Chris said:
Indeed. On the other hand, the example below is not particularly
good, I think:


The first problem is that ferror(stream) could be nonzero even
before entering this fputs(). (This is not actually harmful in
this case, as I will explain in a moment, but it suggests a
perhaps-incorrect model.

I don't see any problem with putc(*s, stream)
if ferror(stream) is nonzero.

Just because output failed earlier
does not necessarily mean that output will continue to fail.
Consider a floppy disk with a single bad sector, in which writes
to the bad sector fail, but writes to the rest of the disk work.)

I would say that whether or not it continues to write
after a failed bad sector is up to the implementor,
and in this case that's me.

The second problem is that the test is redundant, except on those
UCHAR_MAX > INT_MAX implementations that have problems implementing
fgetc().

UCHAR_MAX > INT_MAX counts.
This code is for reading and discussing.
I'm not posting an example of code to use.
 

pete

Indeed. On the other hand, the example below is not particularly
good, I think:


The first problem is that ferror(stream) could be nonzero even
before entering this fputs(). (This is not actually harmful in
this case, as I will explain in a moment, but it suggests a
perhaps-incorrect model. Just because output failed earlier
does not necessarily mean that output will continue to fail.
Consider a floppy disk with a single bad sector, in which writes
to the bad sector fail, but writes to the rest of the disk work.)

Do you think it would be better if fputs started
with a clearerr function call?
 

pete

pete said:
Do you think it would be better if fputs started
with a clearerr function call?

Or to put it another way:
What are output functions supposed to do
if the error indicator is set
prior to the output function being called?
 

Chris Torek

Chris said:
... ferror(stream) could be nonzero even before entering this
[implementation's output-producing code]. [But] Just because output
failed earlier does not necessarily mean that output will continue
to fail.

Or to put it another way:
What are output functions supposed to do
if the error indicator is set
prior to the output function being called?

As far as I can tell, the Standards are not very specific.

For the first question I think it is "obvious" that no output
function (fputs, fwrite, fputc, etc.) should clear the error
indicator at entry. It is supposed to be cumulative, so that:

do_some_output();
do_more_output();
do_yet_more_output();
if (ferror(outstream)) ... handle failure of any earlier output ...

It is even more obvious that clearerr() is wrong because clearerr()
clears both the error and EOF indicators. :) (Of course, the
EOF indicator should probably be clear in the first place, as it
is set only on failure-in-fgetc()-or-equivalent and cleared by any
fseek or rewind operation, which one would normally find between
input and output attempts. But I believe Standard C allows an
output operation immediately after an input operation that returns
EOF.)

Now, once the EOF flag is set, C99 requires that further read
attempts continue to return EOF:

/* this first line assumes UCHAR_MAX <= INT_MAX */
if (getchar() == EOF && getchar() != EOF)
    puts("this is not a valid C99 implementation");

if (feof(stream) && fgetc(stream) != EOF)
    puts("this is not a valid C99 implementation");

(C89 allows the puts() calls to occur.) One might then argue that if
error is set in an earlier output operation, further attempts to
output should also fail immediately. But I think neither C89 nor
C99 require this.
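
A sketch of the cumulative style this supports (the function and its
contents are made up): perform the writes, then consult ferror() once.

#include <stdio.h>

/* Relies on the sticky error indicator instead of checking every
   call. Returns 0 on success, EOF if any write failed. */
int write_report(FILE *out)
{
    fputs("header\n", out);
    fprintf(out, "value = %d\n", 42);
    fputs("trailer\n", out);
    return ferror(out) ? EOF : 0;
}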
 

Lawrence Kirby

Or to put it another way:
What are output functions supposed to do
if the error indicator is set
prior to the output function being called?

Output functions should never clear the error indicator. It is a common
and reasonable approach to perform several output operations and then use
ferror() to check if any of them failed. Committee members have stated
that the error indicator is supposed to be sticky.

If fputs() is called with the error indicator already set then it could
either return a failure or attempt to perform the operation as normal.
Either way the error indicator would still be set on return.
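
A sketch of the first of those two options (fail fast, leaving the
indicator untouched; the name is made up):

#include <stdio.h>

/* Reports failure immediately if the stream is already in error,
   clearing nothing; otherwise writes the string as usual. */
int fputs_failfast(const char *s, FILE *stream)
{
    if (ferror(stream))
        return EOF;
    while (*s != '\0') {
        if (putc(*s, stream) == EOF)
            return EOF;
        ++s;
    }
    return 0;
}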

Lawrence
 

pete

I accidentally emailed this to Chris Torek first
with my munged return address. Sorry about that!

Chris said:
Chris Torek wrote:
... ferror(stream) could be nonzero even before entering this
[implementation's output-producing code].
[But] Just because output
failed earlier does not necessarily mean that output will continue
to fail.

Or to put it another way:
What are output functions supposed to do
if the error indicator is set
prior to the output function being called?

As far as I can tell, the Standards are not very specific.

For the first question I think it is "obvious" that no output
function (fputs, fwrite, fputc, etc.) should clear the error
indicator at entry. It is supposed to be cumulative, so that:

do_some_output();
do_more_output();
do_yet_more_output();
if (ferror(outstream)) ... handle failure of any earlier output ...

It is even more obvious that clearerr() is wrong because clearerr()
clears both the error and EOF indicators. :)

I realised that after some moments of thought.

(Of course, the
EOF indicator should probably be clear in the first place, as it
is set only on failure-in-fgetc()-or-equivalent and cleared by any
fseek or rewind operation, which one would normally find between
input and output attempts. But I believe Standard C allows an
output operation immediately after an input operation that returns
EOF.)

Now, once the EOF flag is set, C99 requires that further read
attempts continue to return EOF:

/* this first line assumes UCHAR_MAX <= INT_MAX */
if (getchar() == EOF && getchar() != EOF)
    puts("this is not a valid C99 implementation");

if (feof(stream) && fgetc(stream) != EOF)
    puts("this is not a valid C99 implementation");

(C89 allows the puts() calls to occur.) One might then argue that if
error is set in an earlier output operation, further attempts to
output should also fail immediately. But I think neither C89 nor
C99 require this.

Thank you.
But then, is the definition of fputs that I posted wrong,
or is it just not so great?

My intention was for it to be minimalistically adequate for the DS9k.
 

Dave Thompson

I've wondered whether there are architectures where forcing char to
be unsigned would adversely affect performance - if for example there
was a fast sign-propagating widening operation but not a fast non-
propagating one.

There was one originally very important one: the PDP-11, whose MOVB
sign-extends when moving to a register (the common case for
computation, though not assignment and arg passing). Although the
impact on programs overall would vary. And if you (mean to) only force
plain char unsigned but still allow explicitly signed char -- as C did
for all other integer types until <spit> _Bool </> -- the programmer
would have the choice.

My assumption is that C89 left the signedness of plain char up to the
implementation because there were existing pre-standard implementations
on both sides, but I'd be interested to hear if performance was also
possibly a consideration.

The -11 was mostly gone by '89, and IIRC even VAX was beginning to
falter. Though development of the standard had started years earlier.

Maybe, but it can't compete with _SNA Formats_ for intricate plot
twists. "This format is used only when byte 5, bit 1 is set to 1
(i.e., when generalized PIU trace data is included)" - brilliant!

I'll see your trace data and raise you a Conditional End Bracket and
(I think my favorite) an Isolated Pacing Response. But, at least in
the long-gone days I looked at it, it was Formats *and Protocols*,
which was vital for breaking the 1000-page mark.

- David.Thompson1 at worldnet.att.net
 

Michael Wojcik

There was one originally very important one: the PDP-11, whose MOVB
sign-extends when moving to a register (the common case for
computation, though not assignment and arg passing).

Ah. Thanks for the example. I never did anything much with the -11,
so I don't know much about its architecture (though I have read posts
about it on a.f.c).

Although the impact on programs overall would vary. And
if you (mean to) only force plain char unsigned but still allow
explicitly signed char -- as C did for all other integer types until
<spit> _Bool </> -- the programmer would have the choice.

Yes, that's what I meant - platforms where performance might suffer
if C required that plain char be unsigned, rather than making it
implementation-defined.

I'll see your trace data and raise you a Conditional End Bracket and
(I think my favorite) an Isolated Pacing Response. But, at least in
the long-gone days I looked at it, it was Formats *and Protocols*,
which was vital for breaking the 1000-page mark.

At some point IBM split the various protocols off into separate
books. LU6.2, for example, was split off first into the _LU6.2
Format and Protocol Reference Manual_, which later became _LU6.2
Reference: Peer Protocols_.

That leaves my edition of _SNA Formats_ (ver 16, from 1996) at a
trim 700 pages or so.

--
Michael Wojcik (e-mail address removed)

[After the lynching of George "Big Nose" Parrot, Dr. John] Osborne
had the skin tanned and made into a pair of shoes and a medical bag.
Osborne, who became governor, frequently wore the shoes.
-- _Lincoln [Nebraska] Journal Star_
 

Villy Kruse

Yes, that's what I meant - platforms where performance might suffer
if C required that plain char be unsigned, rather than making it
implementation-defined.

Besides, if the character set being used was pure ASCII or one of
the equivalent ISO 646 variants, all characters encountered would have
positive values in an 8-bit signed integer type. The eighth bit was
usually a parity bit, which was stripped before the characters reached
the user-level program. On MS-DOS, which has always used some form of
8-bit character set, the char type was usually unsigned, so all 256
character values could be positive.


Villy
 

Kenny McCormack

EOF is a macro defined in stdio.h (and other places).

It's only defined in <stdio.h>.

I think the program below shows that it is defined in other places, too:

#include <stdio.h>

#ifdef EOF
char *s1 = "'ifdef' says EOF is defined here - in "__FILE__;
#else
#error EOF is not defined (1)!
#endif

#if defined(EOF)
char *s2 = "'if defined' says EOF is defined here - in "__FILE__;
#else
#error EOF is not defined (2)!
#endif

int main(int argc, char **argv)
{
    printf("s1 = '%s'\n", s1);
    printf("s2 = '%s'\n", s2);
    return 0;
}
 

Kenneth Brody

Kenny said:
It's only defined in <stdio.h>.

I think the program below shows that it is defined in other places, too:

#include <stdio.h>

#ifdef EOF
char *s1 = "'ifdef' says EOF is defined here - in "__FILE__;
#else
#error EOF is not defined (1)!
#endif
[...]

No, it doesn't. It merely shows that something, somewhere, has defined
it by the time you hit the #ifdef. It does absolutely nothing to tell
you where it was defined.

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <mailto:[email protected]>
 

Kenny McCormack

Kenny said:
I think the program below shows that it is defined in other places, too:

#include <stdio.h>

#ifdef EOF
char *s1 = "'ifdef' says EOF is defined here - in "__FILE__;
#else
#error EOF is not defined (1)!
#endif
[...]

No, it doesn't. It merely shows that something, somewhere, has defined
it by the time you hit the #ifdef. It does absolutely nothing to tell
you where it was defined.

They just said that it wasn't defined anywhere other than in stdio.h, and
I disproved that. Obviously, it is defined in my program.
 

Lawrence Kirby

Kenny said:
EOF is a macro defined in stdio.h (and other places).

It's only defined in <stdio.h>.

I think the program below shows that it is defined in other places, too:

#include <stdio.h>

#ifdef EOF
char *s1 = "'ifdef' says EOF is defined here - in "__FILE__;
#else
#error EOF is not defined (1)!
#endif
[...]

No, it doesn't. It merely shows that something, somewhere, has defined
it by the time you hit the #ifdef. It does absolutely nothing to tell
you where it was defined.

They just said that it wasn't defined anywhere other than in stdio.h, and
I disproved that. Obviously, it is defined in my program.

Because <stdio.h> defines it. It is the act of including <stdio.h> which
causes it to be defined. There is nothing else in your program that
defines it.

What the output of your program is saying is that at the point of the
preprocessor tests a definition of EOF is visible. It is not saying that
the definition itself is in your source file.
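
A minimal sketch of the distinction: in a translation unit that never
includes <stdio.h> (directly or indirectly), EOF is simply not
defined, which is the sense in which the definition lives in
<stdio.h>:

/* Deliberately no #include <stdio.h> here. */

#ifdef EOF
#error EOF visible without stdio.h - some other header leaked it
#endif

int main(void)
{
    return 0;
}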

Lawrence
 
