Qry : Behaviour of fgets -- ?


Casper H.S. Dik

jacob navia said:
Nothing is said about error handling. It is just
"UNDEFINED".

Yes. And when a programmer invokes "undefined" behaviour
*ANYTHING* can happen.

It's not quite the same as "no errors should happen"; it's
"thou shalt not invoke undefined behaviour".

The reason why this behaviour is undefined and why implementors
are free to ignore this is fairly straightforward:

- it's prohibitively expensive to test for any possible
form of undefined behaviour
Why stop at (FILE *)NULL; why not test for
(FILE *)0x1 or any non-mapped or non-aligned pointer or
pointer not returned by f*open?

- undefined behaviour is only invoked by programs with bugs
in them, so why should other programs pay for this?

Dumping core is a good thing; you have a bug, it was caught. Now go
and fix your code.

Casper
 

Richard Heathfield

Casper H.S. Dik said:

The standard should not make it easier for buggy programs to
blunder on.

Agreed.

The implementation is doing you a favour by dumping core for this
particular bit of undefined behaviour.

Agreed.

You have a bug in your program

Agreed.

(you invoke undefined behaviour, always a bug).

Here, however, I must beg to differ. There are times when one must
invoke undefined behaviour if one is to get something done.

It is not a bug to point directly at video memory, for example. Yes, to
do this will necessarily make your code non-portable. Yes, you're
leaving behind all the guarantees that the Standard offers. But no, it
is not a bug, if one is deliberately setting out to write that code to
achieve one's goal because the Standard doesn't offer any satisfactory
way to achieve it.

<snip>
 

Casper H.S. Dik

Richard Heathfield said:
Here, however, I must beg to differ. There are times when one must
invoke undefined behaviour if one is to get something done.

Such as? You are not confusing implementation-defined behaviour
with undefined behaviour?
It is not a bug to point directly at video memory, for example. Yes, to
do this will necessarily make your code non-portable. Yes, you're
leaving behind all the guarantees that the Standard offers. But no, it
is not a bug, if one is deliberately setting out to write that code to
achieve one's goal because the Standard doesn't offer any satisfactory
way to achieve it.

Ah, yes, but I would say that such things are covered by
"extensions to the standard" rather than down-right undefined
behaviour.

Casper
 

jacob navia

Casper said:
Yes. And when a programmer invokes "undefined" behaviour
*ANYTHING* can happen.

It's not quite the same as "no errors should happen"; it's
"thou shalt not invoke undefined behaviour".

The reason why this behaviour is undefined and why implementors
are free to ignore this is fairly straightforward:

- it's prohibitively expensive to test for any possible
form of undefined behaviour
Why stop at (FILE *)NULL; why not test for
(FILE *)0x1 or any non-mapped or non-aligned pointer or
pointer not returned by f*open?

- undefined behaviour is only invoked by programs with bugs
in them, so why should other programs pay for this?

Dumping core is a good thing; you have a bug, it was caught. Now go
and fix your code.

Casper

Why is error analysis necessary?

Error analysis means trying to have a defined and, in all cases,
identical reaction to program errors.

This means that for each function we write, we try to
return a specified error report value if things go wrong.

In the case of fgets this implies:
o Testing for NULL.
o Testing for a bad value of n (<= 0)
o Testing for a NULL value in the receiving buffer.

This means 3 integer comparisons in this case. Compared to the
i/o that fgets is going to do anyway, those 3 integer
comparisons amount to NOTHING.
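
As a minimal sketch (the wrapper name and the use of EINVAL are
illustrative assumptions, not anything the standard requires), those
checks amount to:

#include <errno.h>
#include <stdio.h>

/* Hypothetical wrapper doing the three cheap argument checks
   described above, reporting failure the same way fgets does. */
char *checked_fgets(char *s, int n, FILE *stream)
{
    if (s == NULL || stream == NULL || n <= 0) {
        errno = EINVAL;   /* EINVAL is a POSIX convention, not ISO C */
        return NULL;
    }
    return fgets(s, n, stream);
}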
 

Richard Heathfield

Casper H.S. Dik said:
Such as? You are not confusing implementation-defined behaviour
with undefined behaviour?

No. Implementation-defined behaviour is behaviour that the Standard
requires the implementation to document. The Standard does not require
any implementation to document the behaviour of:

unsigned char *p = (unsigned char *)0xb8000000UL;
*p++ = 'A';
*p++ = 0x84;
Ah, yes, but I would say that such things are covered by
"extensions to the standard" rather than down-right undefined
behaviour.

The behaviour of extensions, as far as I'm aware, is not defined by the
Standard, and therefore is covered by the definition of undefined
behaviour, which (according to my C89 draft) is:

"Undefined behavior --- behavior, upon use of a nonportable or erroneous
program construct, of erroneous data, or of indeterminately-valued
objects, for which the Standard imposes no requirements. Permissible
undefined behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message)."
 

Chris Dollin

jacob said:
Why is error analysis necessary?

Error analysis means trying to have a defined and, in all cases,
identical reaction to program errors.

This means that for each function we write, we try to
return a specified error report value if things go wrong.

In the case of fgets this implies:
o Testing for NULL.
o Testing for a bad value of n (<= 0)
o Testing for a NULL value in the receiving buffer.

o Testing for a bad value of n (> maximum possible line length)

o Testing that the non-null buffer points to a legal C object
containing at least the `n` characters required

o Testing that the non-null stream points to a legal C object
This means 3 integer comparisons in this case. Compared to the
i/o that fgets is going to do anyway, those 3 integer
comparisons amount to NOTHING.

If you're /serious/ about checking for errors, "3 integer comparisons"
is a substantial underestimate.
 

jacob navia

Chris said:
o Testing for a bad value of n (> maximum possible line length)

That would be another integer comparison, but I do not
see immediately where you get this value: "maximum
possible line length"...
o Testing that the non-null buffer points to a legal C object
containing at least the `n` characters required

o Testing that the non-null stream points to a legal C object

In general, this is impossible and in the best case it would
repeat in software the tests done by the hardware.

Yes, maybe in some systems this is possible and very cheap.


But you are just making a caricature of what I said, a
well-known way of discussing without being serious.

If you're /serious/ about checking for errors, "3 integer comparisons"
is a substantial underestimate.
No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can be
tested and not more but not less.
 

Richard

Keith Thompson said:
Yes, such a change could be made, but passing a null pointer to fgets
(or to most standard library functions) is only one of a nearly
infinite number of possible errors. fgets() could detect a null
pointer, but it couldn't reasonably detect all possible invalid
arguments. The burden is on the programmer to avoid passing invalid
arguments; fgets() just has to work properly if the arguments are
valid.

And if the standard committee decided to require fgets() to behave
reasonably with a null pointer argument, it would be literally decades
before all implementations would conform.

In addition, sometimes two bugs make a feature. If the behaviour changed
overnight after a new compile, there might be new bugs which could go
days without being discovered in huge legacy code bases. Never
underestimate the knock-on effects of changing code libraries, even if
it is a "one-liner".
 

Army1987

On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
[about checking whether fgets() is called with either s or stream
being NULL or n being nonpositive]
No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can be
tested and not more but not less.
Avoiding calling fgets() with null pointers is no harder than
avoiding calling it with pointers to invalid memory. Since,
in any case, the programmer has to ensure that stream points to a
valid FILE, he must also ensure that it is not NULL. So testing
whether stream is NULL, without being able to check whether it
points to a valid FILE, is not very useful. Of course, you are
completely free to do that in your fgets() implementation.
 

jacob navia

Army1987 said:
On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
[about checking whether fgets() is called with either s or stream
being NULL or n being nonpositive]
No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can be
tested and not more but not less.
Avoiding calling fgets() with null pointers is no harder than
avoiding calling it with pointers to invalid memory.

The standard doesn't supply a
bool isvalid_file(FILE *);

This routine would be very easy and cheap to implement
for the folks who wrote the standard library, but
it wasn't added to C99.

Then, error checking is very difficult. Still, testing for obvious
bad values such as NULL is feasible and would catch a lot of errors
without just crashing the program.
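
Purely for illustration, a hypothetical isvalid_file() of that kind,
assuming (as many small C libraries do) that the implementation keeps
all its FILE objects in one internal table, might look like:

#include <stdio.h>
#include <stdbool.h>

/* Illustrative assumption: the library stores its FILE objects in a
   single internal array (the name __iob is made up here). */
extern FILE __iob[FOPEN_MAX];

/* Hypothetical check: is fp one of the library's own FILE objects? */
bool isvalid_file(FILE *fp)
{
    for (int i = 0; i < FOPEN_MAX; i++) {
        if (fp == &__iob[i])
            return true;  /* a real library could also check an
                             "is open" flag inside the FILE */
    }
    return false;
}

Of course, only the implementation itself can write such a check,
since only it knows where its FILE objects live -- which is exactly
why it would have to come from the standard library.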
Since,
in any case, the programmer has to ensure that stream points to a
valid FILE, he must also ensure that it is not NULL.

Which programmer? I repeat: this could be a library routine
that receives the file handle from unknown sources.
So testing
whether stream is NULL without being able to check whether it
points to a valid FILE is not very useful.

Since we can't really check for all possible errors, better
not to check anything.
 

jacob navia

Richard said:
In addition, sometimes two bugs make a feature. If the behaviour changed
overnight after a new compile, there might be new bugs which could go
days without being discovered in huge legacy code bases. Never
underestimate the knock-on effects of changing code libraries, even if
it is a "one-liner".

I can't parse that
> If the behaviour changed
> overnight after a new compile, there might be new bugs which could go
> days without being discovered in huge legacy code bases.

You mean programs that relied on fgets causing a segmentation fault
would now work, and that would produce new bugs?

???

Please explain
 

Army1987

Army1987 said:
On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
So testing
whether stream is NULL without being able to check whether it
points to a valid FILE is not very useful.

Since we can't really check for all possible errors, better
not to check anything.
Do you think that avoiding calling fgets() with a NULL stream is
any harder than avoiding calling it with an invalid pointer? Do
you think that it is ever possible to do the latter without doing
the former? If fgets() doesn't check whether stream points to a
valid FILE, one must avoid calling it with an invalid pointer.
And if one avoids calling it with an invalid pointer, he/she/it
avoids calling it with a null pointer, almost by definition. And
if one avoids calling it with a null pointer, checking that within
fgets() is useless.
 

Joachim Schmitz

jacob navia said:
Army1987 said:
On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
[about checking whether fgets() is called with either s or stream
being NULL or n being nonpositive]
This means 3 integer comparisons in this case. Compared to the
i/o that fgets is going to do anyway, those 3 integer
comparisons amount to NOTHING.
If you're /serious/ about checking for errors, "3 integer comparisons"
is a substantial underestimate.

No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can be
tested and not more but not less.
Avoiding to call fgets() with null pointers is not harder than
avoiding to call it with pointers to invalid memory.

The standard doesn't supply an
bool isvalid_file(FILE *);

This routine would be very easy and cheap to implement
for the folks that wrote the standard library but
it wasn't added to C99.

Then , error checking is very difficult. Still, testing for obvious
bad values such as NULL is feasible and would catch a lot of errors
without just crashing the program.
Wouldn't someone who calls fgets with a FILE * of NULL be very likely not
to check fgets' return value either?
Then the only sensible thing fgets could do about this is an
assert(stream != NULL); causing the program to abort right on the spot,
spitting out a more or less useful diagnostic.
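
A minimal sketch of that idea as a debug-build wrapper (the name
dbg_fgets here is purely illustrative):

#include <assert.h>
#include <stdio.h>

/* Hypothetical debug wrapper: trap bad arguments on the spot with
   assert() instead of letting fgets() run into undefined behaviour.
   Compiling with NDEBUG removes the checks again. */
char *dbg_fgets(char *s, int n, FILE *stream)
{
    assert(s != NULL);
    assert(n > 0);
    assert(stream != NULL);
    return fgets(s, n, stream);
}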

Bye, Jojo
 

jacob navia

Joachim said:
jacob navia said:
Army1987 said:
On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
[about checking whether fgets() is called with either s or stream
being NULL or n being nonpositive]
This means 3 integer comparisons in this case. Compared to the
i/o that fgets is going to do anyway, those 3 integer
comparisons amount to NOTHING.
If you're /serious/ about checking for errors, "3 integer comparisons"
is a substantial underestimate.

No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can be
tested and not more but not less.
Avoiding to call fgets() with null pointers is not harder than
avoiding to call it with pointers to invalid memory.
The standard doesn't supply an
bool isvalid_file(FILE *);

This routine would be very easy and cheap to implement
for the folks that wrote the standard library but
it wasn't added to C99.

Then , error checking is very difficult. Still, testing for obvious
bad values such as NULL is feasible and would catch a lot of errors
without just crashing the program.
Wouldn't someone that calls fgets with a FILE * of NULL be very likely not
to check fgets' return value too?

"Someone that calls fgets with a FILE * of NULL"

Yes, that "someone" looks very stupid, but all bugs are stupid.
To avoid having software that breaks at the slightest error
with catastrophic failures, we should at least try to
establish fail-safe procedures at the base of it...
Then the only sensible thing fgets could do about this is an
assert(stream!=NULL); causing the program to abort right on the spot,
spitting out a more or less useful diagnostic.

Microsoft proposed in their Technical Report a general exception
mechanism. We could use that...
 

Joachim Schmitz

jacob navia said:
Joachim said:
jacob navia said:
Army1987 wrote:
On Fri, 07 Sep 2007 12:40:38 +0200, jacob navia wrote:

[snip]
[about checking whether fgets() is called with either s or stream
being NULL or n being nonpositive]
This means 3 integer comparisons in this case. Compared to the
i/o that fgets is going to do anyway, those 3 integer
comparisons amount to NOTHING.
If you're /serious/ about checking for errors, "3 integer
comparisons"
is a substantial underestimate.

No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.
Consequence:
We do not test anything at all.

Engineering is a long list of compromises. You have to test what can
be
tested and not more but not less.
Avoiding to call fgets() with null pointers is not harder than
avoiding to call it with pointers to invalid memory.
The standard doesn't supply an
bool isvalid_file(FILE *);

This routine would be very easy and cheap to implement
for the folks that wrote the standard library but
it wasn't added to C99.

Then , error checking is very difficult. Still, testing for obvious
bad values such as NULL is feasible and would catch a lot of errors
without just crashing the program.
Wouldn't someone that calls fgets with a FILE * of NULL be very likely
not to check fgets' return value too?

"Someone that calls fgets with a FILE * of NULL"

Yes, that "someone" looks very stupid but all bugs are stupid.
To avoid having software that breaks at the slightest error
with catastrophic failures we should at least try to
establish fail safe procedures at the base of it...
Then the only sensible thing fgets could do about this is an
assert(stream!=NULL); causing the program to abort right on the spot,
spitting out a more or less useful diagnostic.

Microsoft proposed in their Technical Report a general exception
mechanism. We could use that...
I'd just prefer the program to abort right there. Much better and easier to
debug than to continue with bogus data and corrupted memory, which will
eventually lead the program to die at a completely unrelated spot, making it
next to impossible to find the root cause and possibly corrupting data
integrity.
 

jacob navia

Joachim said:
I'd just prefer the program to abort right there. Much better and easier to
debug than to continue with bogus data and corrupted memory, which will
eventually lead the program to die at a completely unrelated spot, making it
next to impossible to find the root cause and possibly corrupting data
integrity.

Well, by default that mechanism does exactly that.
Abort.

The only difference is that you can "subclass" it
by writing your own function and having that function
called when an error occurs.
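
That is essentially the runtime-constraint handler machinery of
TR 24731-1 (later C11 Annex K). A minimal sketch, assuming an
implementation that actually provides these optional interfaces
(__STDC_LIB_EXT1__):

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Custom handler: report the violation instead of letting the
   default handler abort the program. */
static void report_handler(const char *msg, void *ptr, errno_t error)
{
    (void)ptr;
    fprintf(stderr, "runtime constraint violated: %s (error %d)\n",
            msg, (int)error);
}

int main(void)
{
    char buf[16];

    set_constraint_handler_s(report_handler);   /* "subclass" the default */

    /* A null source pointer is a constraint violation: report_handler
       runs, strcpy_s returns nonzero, and execution continues. */
    if (strcpy_s(buf, sizeof buf, NULL) != 0)
        fprintf(stderr, "strcpy_s rejected its arguments\n");

    return 0;
}

With the default handler (typically abort_handler_s) installed, the
same violation would print a diagnostic and abort, which is the
default behaviour described above.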
 

Sheth Raxit

No. The standard should not make it easier for buggy programs to
blunder on.

Is asserting for NULL better than undefined behaviour?
Good for the buggy program; good for the platform?
The implementation is doing you a favour by dumping core for this
particular bit of undefined behaviour. You have a bug in your program
(you invoke undefined behaviour, always a bug). If you had continued
onwards, your code would not have made a distinction between
"file not found" and "file is empty". Perhaps not relevant for you,
but in other cases this might hide a serious bug in the program.

Casper

-Raxit
http://www.barcamp.org/BarCampMumbaiOct2007 <----BarCampMumbaiOct2007
 

Chris Dollin

jacob said:
That would be another integer comparison, but I do not
see immediately where you get this value: "maximum
possible line length"...

From wherever it is.

Actually it's silly of me: even if an implementation has a limit
on the length of line it can deliver, it's perfectly reasonable
to hand it a bigger buffer, i.e. a larger n. Colour me egg-faced.
In general, this is impossible
Exactly.

and in the best case it would
repeat in software the tests done by the hardware.

More likely the tests done by other software.
Yes, maybe in some systems this is possible and very cheap.

Unlikely, and that's my point.
But you are just making a caricature of what I said,

No: I'm taking your argument about checking seriously.
a well known way of discussion without being serious.

I'm serious.
No, not in this case. Your argument seems to be:

If we are going to test exhaustively all errors, that would be
impossible.

It's not /that/ impossible, and I didn't claim it was impossible.
Consequence:
We do not test anything at all.

That's not my argument.

My argument is that if we're going to talk about the costs of doing
argument checking, let's consider taking argument checking seriously.
(Sticking with C.) We see that the range of options -- and costs -- is
wider than you described.
Engineering is a long list of compromises.

Yes (but not /only/ that).
You have to test what can be tested

No, you don't.

You test what it's /cost-effective/ to have tests for.
and not more but not less.

The question (for fgets) is: is the cost of checking for null/negative
arguments worth the benefit?

You obviously feel that it is. I'm not convinced. I'd be more convinced
in a teaching/debugging implementation of C. If I were worried about
negative sizes or null pointers in my use of fgets, I can always check
them myself, for an overhead comparable to that which fgets itself
would apply -- less, since I can e.g. move the tests out of the loop.
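
For instance, a caller-side sketch with the checks hoisted out of the
read loop (names purely illustrative):

#include <stdio.h>

/* Illustrative caller-side checking: validate the arguments once,
   before the loop, rather than paying for the same checks inside
   fgets() on every iteration. */
static void read_all_lines(FILE *fp, char *buf, int n)
{
    if (fp == NULL || buf == NULL || n <= 0)
        return;                      /* checked once, up front */

    while (fgets(buf, n, fp) != NULL) {
        /* process one line of buf here */
    }
}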

Maybe it's true that the cost of fgets is utterly dominated by the
I/O it does. But maybe it's true that I'm already running at the limits
of my processor, and even those paltry three instructions matter.
 

Ben Bacarisse

jacob navia said:
Joachim Schmitz wrote:

"Someone that calls fgets with a FILE * of NULL"

Yes, that "someone" looks very stupid but all bugs are stupid.
To avoid having software that breaks at the slightest error
with catastrophic failures we should at least try to
establish fail safe procedures at the base of it...

The idea that fail-safe procedures make software more robust is a huge
can of worms, and one that is off-topic here anyway.

What is on-topic is the idea that the standard has somehow failed to
be safe by allowing fgets to be undefined when called with NULL as the
stream pointer. That is simply not how C works.

The philosophy of Java and C# may be to push everything you might ever
want into a huge library from which you simply pick the functions you
need, but the philosophy of C is to provide just enough for you to write
what you need. The standard library is not, of course, actually
minimal from a formal point of view, but it is not far off. If you
need

char *null_safe_fgets(char *s, int n, FILE *fp)
{
    return fp ? fgets(s, n, fp) : NULL;
}

then it is not hard to add it to your toolbox. The C idea -- that you
don't pay for what you don't need -- means that programs that already
test for NULL on open (and most will want to do that) can avoid paying
for a test for a NULL stream pointer in the IO operations.
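
That is just the usual check-once-at-open pattern, something like
(file name illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char line[256];

    FILE *fp = fopen("input.txt", "r");   /* test for NULL once, here */
    if (fp == NULL) {
        perror("input.txt");
        return EXIT_FAILURE;
    }

    /* Every later call can assume a valid, non-null stream, so no
       per-call check inside fgets is needed. */
    while (fgets(line, sizeof line, fp) != NULL)
        fputs(line, stdout);

    fclose(fp);
    return 0;
}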

Every non-trivial C project I have been involved with includes a small
library (sometimes only a module) that wraps the IO layer in a set of
primitive operations that suit the task in hand. Often, this can be
borrowed from previous work. It is the C way.

It is possible that this is slightly less common now that we have a
good standard, since one reason for doing it in the past was to
localise all the places where one might have to fiddle about when
porting to a new C library. However, the idea that you write what you
need on top of a small library is deeply embedded in C.
 

Kenneth Brody

Richard said:
Casper H.S. Dik said:


No. Implementation-defined behaviour is behaviour that the Standard
requires the implementation to document. The Standard does not require
any implementation to document the behaviour of:

unsigned char *p = (unsigned char *)0xb8000000UL;
*p++ = 'A';
*p++ = 0x84;
[...]

But, what if your particular implementation does document the
behavior? Yes, as far as the Standard is concerned, it's UB,
but have you really invoked UB on that particular platform?

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
