Qry : Behaviour of fgets -- ?

Douglas A. Gwyn

Richard said:
On strcpy(), yes, but on fgets()? A char * and a FILE * can only overlap
if a. you're invoking UB anyway, by scribbling wildly into the FILE
object through a mispointed char *, or b. you're invoking UB anyway, by
scribbling neatly into the FILE object using undefined and unportable
assumptions about the layout of the FILE.

Um, no, consider fgets((char *)(void *)fp, (int)sizeof(FILE), fp);
while it may be a stupid thing to do, if no further use is made
of the associated stream it would have been allowed, and the
implementation would have to make it work, which means making
a copy of the FILE structure before starting the transfer.
By prohibiting overlap of the pointed-to objects, we assure
the implementor that he doesn't have to worry about that.
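
For reference, here is a minimal sketch of an ordinary fgets call under the C99 prototype, where the buffer and the stream are plainly separate objects. The helper name read_first_line is made up for illustration and is not from the thread.

```c
#include <stdio.h>

/* C99 prototype, for reference:
 *   char *fgets(char * restrict s, int n, FILE * restrict stream);
 */

/* Hypothetical helper: writes `text` to a temporary file, rewinds it, and
 * reads the first line back into `buf` (capacity `size`) with fgets.
 * Returns buf on success, NULL on any failure. */
char *read_first_line(const char *text, char *buf, int size)
{
    FILE *fp = tmpfile();
    char *result;

    if (fp == NULL)
        return NULL;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);
    /* buf and *fp are distinct objects, so the restrict contract holds. */
    result = fgets(buf, size, fp);
    fclose(fp);
    return result;
}
```

The call in the post above, which passes the same pointer as both buffer and stream, is exactly what the qualifiers rule out.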
 
Charlie Gordon

Peter J. Holzer said:
Keith Thompson said:
jacob navia wrote:
[...]
In the case of fgets this implies: [...]
o Testing that the non-null stream points to a legal C object
Only the char * pointer would be hard to verify, the stream pointer should
be easy to check against open streams handled by the library, as any
conforming library needs to account for at least those open for write and
update:

7.19.5.2p3 If stream is a null pointer, the fflush function performs this
flushing action on all streams for which the behavior is defined above.

C libraries used to allocate FILE structures from a static array: checking
for valid FILE pointers was pretty easy then. I would be surprised if
current implementations could no longer do this efficiently.

On many modern systems there is no fixed limit on the number of open
files, so a static array isn't feasible. And once you start allocating
them individually, you either have to check all of them (which is not
efficient) or use a more complicated structure (like a binary tree) only
for this check.

Instead of allocating them individually, you can allocate them in chunks of
increasing sizes, keeping the complexity of the test in logN, which is still
quite efficient.
 
Kenneth Brody

Richard said:
No they don't. They either leave it as is or increment/decrement it.

Are you 100% certain? There is no possibility that it will do
something other than (1) leave it as is or (2) increment/decrement
it?

While I don't have a DS9000 (I can't afford the upgrade from my
older DS6000), I understand that CBFalconer's description is
pretty accurate for the DS9000. And, as he says, it is standard
conforming.

FYI - the DS6000 gets stuck in a bus conflict exception loop on the
execution of "i = i++;". (Assuming i is an integer type.)

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody        | www.hvcomputer.com | #include              |
| kenbrody/at\spamcop.net | www.fptech.com     |    <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
Kenneth Brody

Charlie said:
Instead of allocating them individually, you can allocate them in chunks of
increasing sizes, keeping the complexity of the test in logN, which is still
quite efficient.

Well, you can't realloc() the array, as that would invalidate all
existing FILE* values. (Unless it returned the same address, which
is certainly not guaranteed, or even very likely in most scenarios.)

You can, however, allocate chunks of fixed-sized arrays which are then
put on a linked list. Not as efficient as a single array, but a lot
more efficient than individual allocations.
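
A rough sketch of how the two suggestions could combine: chunks that double in size (so N records need only about log N chunks), none of which is ever moved, so existing pointers stay valid. Every name here (stream_rec, pool_grow, pool_contains) is hypothetical, and the range test converts pointers to uintptr_t, which assumes a flat address space; an implementation's own library is free to rely on that, portable code is not.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's private stream record. */
struct stream_rec { int in_use; };

#define FIRST_CHUNK 8   /* chunk k holds FIRST_CHUNK << k records */
#define MAX_CHUNKS 16

static struct stream_rec *chunks[MAX_CHUNKS];
static size_t chunk_count;

static size_t chunk_len(size_t k)
{
    return (size_t)FIRST_CHUNK << k;
}

/* Adds one more chunk, twice as large as the previous one. Earlier chunks
 * are never reallocated. Returns 0 on success, -1 on failure. */
int pool_grow(void)
{
    struct stream_rec *c;

    if (chunk_count == MAX_CHUNKS)
        return -1;
    c = calloc(chunk_len(chunk_count), sizeof *c);
    if (c == NULL)
        return -1;
    chunks[chunk_count++] = c;
    return 0;
}

/* Returns 1 if p points exactly at a record inside some chunk, else 0.
 * Assumption: uintptr_t values order the same way as addresses. */
int pool_contains(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    size_t k;

    for (k = 0; k < chunk_count; k++) {
        uintptr_t lo = (uintptr_t)chunks[k];
        uintptr_t hi = lo + chunk_len(k) * sizeof(struct stream_rec);
        if (a >= lo && a < hi)
            return (a - lo) % sizeof(struct stream_rec) == 0;
    }
    return 0;
}
```

Replacing the doubling rule with a fixed chunk length and a next pointer gives the linked-list variant described above, at the cost of a check that is linear in the number of chunks.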

 
Kenneth Brody

Wojtek said:
Charlie Gordon said:
Note that the same effect can be achieved without an actual FILE structure,
using a pointer instead of the address as the "special value":

static char dummy_file_object[1] = "";
FILE *dummy_file_pointer = (FILE *)&dummy_file_object;

dummy_file_pointer is guaranteed to be unequal to any FILE * returned by
fopen.

That doesn't work if FILE has non-trivial alignment requirements.

Does alignment matter if the pointer is never dereferenced? From
context, I assume its sole purpose is as a value to be stored and
compared against. You know that the address is valid, so it can't
be a trap representation.

 
Kenneth Brody

Casper H.S. Dik said:
Keith Thompson said:
No, FILE cannot be an incomplete type. C99 7.19.1p2:

which is an object type capable of recording all the information
needed to control a stream [...]
I don't think this requirement is necessary (i.e., the standard
could just as easily have allowed, or even required, FILE to be an
incomplete type without breaking any reasonable code), but there it
is.

Yep, when the 64 bit ABI for Solaris was defined our hopes to make
FILE completely opaque were quickly dashed by some standards
tests. (Now it's an opaque blob of memory of fixed size)

The hoops we have to jump through in 32 bit space to add additional
data are not pretty.

Is this a valid FILE definition?

typedef struct
{
    struct _OPAQUE_FILE *private;
}
FILE;
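
To make the shape of that idea concrete, here is a sketch of the same pattern with made-up names (my_file, opaque_stream, my_open, my_fileno, my_close), kept apart from <stdio.h> so it can be compiled on its own. It only illustrates the pattern and says nothing about whether such a FILE would satisfy the standard. The member is called priv rather than private so the header would also be usable from C++.

```c
#include <stdlib.h>

struct opaque_stream;                 /* incomplete in the public header */

/* A complete object type whose only member points to an incomplete
 * struct: callers can declare it but cannot see the real state. */
typedef struct {
    struct opaque_stream *priv;
} my_file;

/* Private definition; in a real library this would sit in a .c file. */
struct opaque_stream {
    int fd;
};

/* Allocates a handle for descriptor fd; returns NULL on failure. */
my_file *my_open(int fd)
{
    my_file *f = malloc(sizeof *f);

    if (f == NULL)
        return NULL;
    f->priv = malloc(sizeof *f->priv);
    if (f->priv == NULL) {
        free(f);
        return NULL;
    }
    f->priv->fd = fd;
    return f;
}

/* Returns the descriptor stored in the hidden state. */
int my_fileno(const my_file *f)
{
    return f->priv->fd;
}

/* Releases the hidden state and the handle; accepts NULL. */
void my_close(my_file *f)
{
    if (f != NULL) {
        free(f->priv);
        free(f);
    }
}
```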

 
Kenneth Brody

CBFalconer wrote:
[...]
However to be able to supply getc and putc as macros, with a FILE*
argument, it is necessary that FILE be a complete type. However it
is the height of foolishness to use this description, which would
be available in stdio.h.

But the standard doesn't say that getc/putc _are_ macros, just that
they _can_ be.

7.19.7.5p2 (emphasis mine)

The getc function is equivalent to fgetc, except that
_if_it_is_implemented_as_a_macro_, it may evaluate stream more than once,
so the argument should never be an expression with side effects.

7.19.7.8p2 says the same for putc.
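
A small sketch of what that wording means for callers: evaluate the stream expression once, into a local variable, and hand the variable to getc. The function name read_one_from_each is hypothetical.

```c
#include <stdio.h>

/* Reads one character from each of the n streams in `streams`, storing
 * the results (or EOF) in `out`. Returns how many streams yielded a
 * character. */
size_t read_one_from_each(FILE **streams, size_t n, int *out)
{
    size_t i, got = 0;

    for (i = 0; i < n; i++) {
        FILE *fp = streams[i];  /* evaluated exactly once */

        /* Risky alternative: getc(*streams++) could advance the pointer
         * more than once if getc is a macro. */
        out[i] = getc(fp);
        if (out[i] != EOF)
            got++;
    }
    return got;
}
```

Where an argument with side effects cannot be avoided, fgetc is the safer choice, since it evaluates its argument exactly once.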

Not that I disagree that peeking in a FILE* is a bad idea to begin
with. (Though, to be honest, I have done so [in some system-specific
code, of course] in order to see if there was any buffered input
available.)

 
Harald van Dĳk

I think you missed the fact that <stdio.h> is *not* #included in the
context in which the "fgets" layering macro is defined and used.

This started with:

Army1987 said:
You're not allowed to do that.

where you did include <stdio.h> before defining fgets. When /not/
including <stdio.h> you noted that FILE must be defined, and that the way
it must be defined depends on the implementation, so exactly what is it
that "will in fact work on any actual platform"?

(My apologies if you end up reading this more than once. I'm having
problems with my newsreader.)
 
Charlie Gordon

Douglas A. Gwyn said:
Um, no, consider fgets((char *)(void *)fp, (int)sizeof(FILE), fp);
while it may be a stupid thing to do, if no further use is made
of the associated stream it would have been allowed, and the
implementation would have to make it work, which means making
a copy of the FILE structure before starting the transfer.
By prohibiting overlap of the pointed-to objects, we assure
the implementor that he doesn't have to worry about that.

nonsense!

and of course fread(fp, sizeof(FILE), 1, fp) invokes UB as well, no need to
sprinkle prototypes with useless keywords to spell the obvious.

If this is the kind of issue on which the committee focuses between releases
of the Standard, the situation is hopeless.
 
Wojtek Lerch

Douglas A. Gwyn said:
Um, no, consider fgets((char *)(void *)fp, (int)sizeof(FILE), fp);
while it may be a stupid thing to do, if no further use is made
of the associated stream it would have been allowed, and the
implementation would have to make it work, which means making
a copy of the FILE structure before starting the transfer.

Are you implying that

memset(stdin,6,sizeof(FILE));

is allowed, as long as no further use is made of stdin by the program? Or
would I also need to make sure that the program terminates by calling
abort() and that abort() doesn't attempt to flush or close streams on my
implementation?

Does this also mean that it's wrong for an implementation to keep track of
all the open streams by using a linked list of all the FILE objects, because
my memset() call may break the list and cause trouble when an unrelated
stream is opened or closed?
 
Wojtek Lerch

Kenneth Brody said:
Wojtek said:
Charlie Gordon said:
static char dummy_file_object[1] = "";
FILE *dummy_file_pointer = (FILE *)&dummy_file_object;

That doesn't work if FILE has non-trivial alignment requirements.

Does alignment matter if the pointer is never dereferenced?

Yes.

6.3.2.3#7 "A pointer to an object or incomplete type may be converted to a
pointer to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."
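
One way around the alignment objection, sketched here under the assumption that a C11 compiler is available (the _Alignas specifier postdates this thread), is to give the dummy storage the alignment of FILE itself. The function names are hypothetical; the pointer is only ever compared, never passed to a stdio function.

```c
#include <stdio.h>

/* One byte of storage, aligned as a FILE would be, so converting its
 * address to FILE * does not run into 6.3.2.3p7. Requires C11. */
static _Alignas(FILE) unsigned char dummy_file_storage[1];

/* Hypothetical sentinel value: distinct from any stream fopen returns,
 * because it points into an object that is not a stream. */
FILE *dummy_file_pointer(void)
{
    return (FILE *)(void *)dummy_file_storage;
}

/* Returns 1 if fp is the sentinel rather than a real stream. */
int is_dummy_file(FILE *fp)
{
    return fp == dummy_file_pointer();
}
```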
 
Army1987

Does alignment matter if the pointer is never dereferenced? From
context, I assume its sole purpose is as a value to be stored and
compared against. You know that the address is valid, so it can't
be a trap representation.

Simply casting a pointer to a pointer to a different type causes
UB if it points to memory not aligned for the latter type.
 
Keith Thompson

Not necessary, but sufficient.


True, but in that case you might as well make FILE a complete type.

Making it incomplete would make it more difficult for programs to
abuse it by declaring FILE objects. We can't prevent perverse
programmers from examining the expansion of the getc() macro and
messing around with the internals of _FILE, but we can make it less
convenient. Or we could, if FILE were permitted to be incomplete.

This isn't a huge deal (i.e., I don't consider it to be a bug in the
standard), but IMHO it would make sense for a future standard to allow
FILE to be an incomplete type. It wouldn't break any existing code
that doesn't deserve to be broken.
 
André Gillibert

Keith said:
Making it incomplete would make it more difficult for programs to
abuse it by declaring FILE objects. We can't prevent perverse
programmers from examining the expansion of the getc() macro and
messing around with the internals of _FILE, but we can make it less
convenient.

typedef struct FILE {char dummy;} FILE;

makes messing with the internals as inconvenient as the incomplete type:

typedef struct FILE FILE;

An incomplete type would be less "artificial", I agree, but we can keep it
as foolproof as an incomplete type.
 
Keith Thompson

Charlie Gordon said:
nonsense!

and of course fread(fp, sizeof(FILE), 1, fp) invokes UB as well, no need to
sprinkle prototypes with useless keywords to spell the obvious.

It's obvious to you and to me, but it's not obvious to the compiler.

If this is the kind of issue on which the committee focuses between releases
of the Standard, the situation is hopeless.

Everybody here knows that the FILE object and the buffer will never
actually overlap in real code, and that if they do, it's (almost?)
certainly the result of something that invokes undefined behavior
anyway. The point of the 'restrict' qualifiers is to convey that
"obvious" knowledge to the compiler. This might enable the compiler
to generate slightly better code; at worst, it should be harmless.

The point, I think, is that the 'restrict' qualifiers allow the
compiler to assume that the char* and FILE* arguments point to
non-overlapping memory. This might permit the compiler to generate
slightly better code in some cases. At worst, it should be harmless.

As a programmer, either writing a call to fgets() or implementing
fgets() itself, you can safely ignore the 'restrict' qualifiers.

It could be argued that pointer arguments should be restrict-qualified
unless there's a specific reason not to. Another incompatible C-like
language might even make 'restrict' the default, and require an
explicit qualifier when the arguments *might* point to overlapping
memory. (Doesn't Fortran do something like that?) Doing that for C
would have broken too much code.
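
As a small illustration of the kind of declaration being discussed, here is a sketch of a user-written function with restrict-qualified parameters. The name add_into is made up for this example.

```c
#include <stddef.h>

/* Adds src[i] to dst[i] for each i in [0, n). The restrict qualifiers
 * promise that dst and src do not overlap, so the compiler may keep
 * values in registers or vectorize the loop. Calling this with
 * overlapping arrays is undefined behavior. */
void add_into(size_t n, double *restrict dst, const double *restrict src)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Without the qualifiers the function would behave the same for non-overlapping arguments; the compiler would just have to allow for a store to dst[i] changing a later src element.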
 
Douglas A. Gwyn

Kenneth said:
Does alignment matter if the pointer is never dereferenced?

It might; it depends on the architecture.
Also, the C implementation might rely on the alignment
assumption when generating code for pointer conversion.
The following will not work properly on many word-addressed
platforms:
union t { int i; char c[2]; } u, *x;
char *p = &u.c[1], *q;
x = (union t *)p; // might be allowed
q = (char *)x; // byte selector is lost
assert(p == q); // likely to fail!
 
Douglas A. Gwyn

Wojtek said:
Are you implying that
memset(stdin,6,sizeof(FILE));
is allowed, as long as no further use is made of stdin by the program?

It's not supposed to be allowed, but so far as I know the standard
neglected to specify that.

Anyway, to further develop the previous idea, I suppose
one could save the FILE object somewhere, clobber it without
performing any stdio operations, then restore the object
before proceeding. While not something to be encouraged,
it could have been assumed that it is allowed (in a strictly conforming
program). Since this never came up for discussion that I
recall, a poll would have to be conducted to see what the
intent may have been, if any.
 
André Gillibert

Keith Thompson wrote:

It could be argued that pointer arguments should be restrict-qualified
unless there's a specific reason not to. Another incompatible C-like
language might even make 'restrict' the default, and require an
explicit qualifier when the arguments *might* point to overlapping
memory. (Doesn't Fortran do something like that?)

Yes, Fortran does that. It generated endless flamewars about why Fortran
beats C on performance because "Fortran has no pointers", while in reality,
Fortran has disguised pointers, but they mustn't be aliased as function
arguments.

The questions about nested structures and pointers didn't exist in Fortran
77 as there were no structures or pointers to pointers (or even explicit
pointers/references).
Actually, avoiding aliasing issues was relatively straightforward.
Rule of thumb: Don't pass the *same* array (as recognized by the variable
name) twice in a function call.

Doing that for C
would have broken too much code.

Yes, of course.
However, one of the (pre-1999) C implementations I use has a non-conforming
option "Assume no pointer aliasing". It specifically does that. I didn't
understand this option (and never enabled it) until I read about the
restrict qualifier in C99.
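
A minimal sketch of why such an option is non-conforming: in the function below (a hypothetical name, not from any compiler's documentation) a conforming compiler must allow for a and b pointing at the same int, so it has to reload *a after the store through b.

```c
/* Stores through both pointers, then reads back through the first.
 * Returns 1 if a and b point to distinct ints, 2 if they alias. */
int store_then_read(int *a, int *b)
{
    *a = 1;
    *b = 2;
    return *a;
}
```

A compiler told to assume no aliasing could fold the return value to the constant 1, which is wrong for the perfectly legal call store_then_read(&x, &x).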
 
