How to modify ((c=getchar())!=NULL) to use fscanf or scanf instead?

BartC · Apr 1, 2014

Eric Sosman said:
[...]
Wrong or not, the Pascal design is far simpler to use. For example, I can

Simple redeems wrong?

write an outline file-processing loop right now:

while (!eof(f)) {

So we are not at EOF. Well, how *far* are we from EOF? Can we read, say,
512 bytes, without having to bother checking the result?

As soon as we have to check something about the read result, the !eof(f)
check was superfluous.
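
As a rough illustration of that point, here is a minimal sketch in which the loop is driven entirely by what fread() reports, with no separate end-of-file test. The helper name count_bytes_in_chunks and its out-parameter are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads fp in chunks of up to 512 bytes and
   returns the total number of bytes seen, or -1 on a read error.
   *last_chunk receives the size of the final non-empty chunk. */
static long count_bytes_in_chunks(FILE *fp, size_t *last_chunk)
{
    unsigned char buf[512];
    long total = 0;
    size_t n;

    assert(fp != NULL && last_chunk != NULL);
    *last_chunk = 0;

    /* fread returns how many bytes it actually delivered; that count,
       not a prior EOF test, says how much of buf is valid. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        total += (long)n;
        *last_chunk = n;   /* may be shorter than 512 near the end */
    }

    /* n == 0: either end of file or an error; ferror() tells which. */
    return ferror(fp) ? -1L : total;
}
```

For a 1300-byte file this makes two full 512-byte reads and one 276-byte read, which a "not at EOF yet" answer alone could not have predicted.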

Aye. There's also the matter of the EOF-test-before-reading
on an interactive device like a keyboard. How would that work,
exactly? "Do you intend to press any more keys? If so, press Y.
If not, press -- oh, wait, if you're not going to press any more
keys you can't press N, can you? Okay: press Y if you're going
to press more keys, otherwise don't press any keys -- and that's
how I'll know whether we're at EOF."

It might work like this:

int c;

while (!eof(stdin)) {
c=fgetc(stdin);
printf("%d %c\n",c,c);
}

eof() is never true, unless perhaps the user presses some special key (eg.
Ctrl D). Which is pretty much how many Unix utilities seem to work.
(Although in my example, the input will also be buffered by the line.)

However ... for known keyboard interaction, you would probably take a
different approach, which might depend on whether you want a
character-at-time interface or a line-oriented one.

But for the million situations where you just have a small, static
line-oriented text file which has a well-defined beginning and end, then my
eof()/readline() loop works perfectly well.

(I can just imagine the hairy compilers the people on here might write which
will be parsing source input even as the programmer is still adding lines at
the end! Sometimes you just have to state the kind of file model that you
want to support, and insist that the files conform to that model.)

BartC · Apr 1, 2014

glen herrmannsfeldt said:
But you don't know that it is EOF until you know what you want
to read.

If you consider Java Scanner, which is commonly used for stream
input, you tell it what you want to read, and it tells you if you
can read that.

Well, I don't use stream input for files. EOF is perfectly well defined for
many kinds of input: if the current position within the file is at the end,
then ... you're at the end of the file!

If you're talking about 'files' that don't have a proper beginning or end,
maybe a stream of characters coming over a wire, then fine, you take a
different approach. But then, a lot of the stuff you might want to do will
not be meaningful (how do you report an error on line 217 for example, if
line 217 has come and gone since then?)

And if you have files which can change size and contents even as you read
them, then you've got bigger problems than how to detect the end of them.

Keith Thompson · Apr 1, 2014

BartC said:
It might work like this:

int c;

while (!eof(stdin)) {
c=fgetc(stdin);
printf("%d %c\n",c,c);
}

eof() is never true, unless perhaps the user presses some special key (eg.
Ctrl D). Which is pretty much how many Unix utilities seem to work.
(Although in my example, the input will also be buffered by the line.)

However ... for known keyboard interaction, you would probably take a
different approach, which might depend on whether you want a
character-at-time interface or a line-oriented one.

But for the million situations where you just have a small, static
line-oriented text file which has a well-defined beginning and end, then my
eof()/readline() loop works perfectly well.

(I can just imagine the hairy compilers the people on here might write which
will be parsing source input even as the programmer is still adding lines at
the end! Sometimes you just have to state the kind of file model that you
want to support, and insist that the files conform to that model.)

C already has a mechanism that works reasonably well both for
interactive input and for reading from files and other devices. What
exactly is the advantage of inventing something new and less flexible?

BartC · Apr 1, 2014

C already has a mechanism that works reasonably well both for
interactive input and for reading from files and other devices. What
exactly is the advantage of inventing something new and less flexible?

For you, probably none (and it's not that new). If I haven't managed to put
my view across well enough in five previous posts on the subject, then I'm
not very hopeful for this one!

But, to establish how 'out-of-left-field' I might be on this, I looked at
rosettacode.org, at the task 'read a file line-by-line'. And plenty of
languages do split the tasks of reading a line, from checking the status.
And many of those use an explicit end-of-file() check at the start of a
while loop.

Including Ada. So if I'm wrong, so is Ada. The curly-braced languages tend
to follow C's style, which is not surprising.

And I don't think you can argue that because someone is coding in C, then
any other way of doing this stuff is prohibited. I've implemented several
languages myself based on C's standard file functions, and they all have an
eof() function.

Because with higher level features, C-style read functions, full of
pointers, buffers and error returns, are inappropriate. But that
higher-level style can be carried down to actual C code too.

(But I've just remembered that this is the group where regulars will argue
to the death that 'for (i=A; i<=B; ++i)' is always superior and more
'flexible' than 'for i=A to B', so perhaps I shouldn't be surprised.)

Keith Thompson · Apr 1, 2014

BartC said:
For you, probably none (and it's not that new). If I haven't managed to put
my view across well enough in five previous posts on the subject, then I'm
not very hopeful for this one!

But, to establish how 'out-of-left-field' I might be on this, I looked at
rosettacode.org, at the task 'read a file line-by-line'. And plenty of
languages do split the tasks of reading a line, from checking the status.
And many of those use an explicit end-of-file() check at the start of a
while loop.

Including Ada. So if I'm wrong, so is Ada. The curly-braced languages tend
to follow C's style, which is not surprising.

I don't recall using the word "wrong".

The traditional C way of reading input, which does differ from the way
some other languages do it, works well both for interactive devices and
for fixed-size files. The mechanisms used by other languages can also
work well both for interactive devices and for fixed-size files.

The mechanism you propose is usable only for fixed-size files. I see no
advantage that mitigates this loss of functionality.

And I don't think you can argue that because someone is coding in C, then
any other way of doing this stuff is prohibited. I've implemented several
languages myself based on C's standard file functions, and they all have an
eof() function.

Nobody is prohibiting anything. I merely question whether your approach
is a good idea.

Because with higher level features, C-style read functions, full of
pointers, buffers and error returns, are inappropriate. But that
higher-level style can be carried down to actual C code too.

If you can come up with a mechanism that works at a higher level (not
"full of pointers, buffers and error returns") while retaining the
existing flexibility, that might be useful.

(But I've just remembered that this is the group where regulars will argue
to the death that 'for (i=A; i<=B; ++i)' is always superior and more
'flexible' than 'for i=A to B', so perhaps I shouldn't be surprised.)

I don't remember any such argument; can you provide a reference?

Certainly the C-style for loop is more general than a range-based
for loop. But iterating over a range is a very common case, and I
wouldn't object to adding such a feature to the language (though I
don't recall you advocating such a change). But until that happens,
C-style for loops have one very large advantage: they exist.

(You could hack something together with macros, but I personally don't
think that's a good idea; the existing idiom is reasonably clear, and a
macro solution would require anyone reading the code to figure out how
the macros work.)

BartC · Apr 1, 2014

I don't recall using the word "wrong".

The traditional C way of reading input, which does differ from the way
some other languages do it, works well both for interactive devices and
for fixed-size files. The mechanisms used by other languages can also
work well both for interactive devices and for fixed-size files.

The mechanism you propose is usable only for fixed-size files. I see no
advantage that mitigates this loss of functionality.

I'd dispute that. The following two loops both seem to work the same,
reading keyboard input, on both Windows and Ubuntu, and terminating on Ctrl
C or Ctrl Z:

while (!eof(stdin)) {
fgets(buffer,n,stdin);
printf("%d %s\n",++count,buffer);
}

while (fgets(buffer,n,stdin)) {
printf("%d %s\n",++count,buffer);
}

(It depends how eof() is implemented. The version I use now does peek ahead,
and returns a flag if that peek failed. Versions that work with a
predetermined file size, or using seeks, might not work.)
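
A peek-ahead test of the kind described might be sketched as follows. This is only a guess at the approach, not the actual implementation from the post; the name at_eof is made up (and chosen to avoid clashing with the eof() some platforms declare in <io.h>). Note that on a keyboard the peek has to wait for input before it can answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical peek-ahead test, not a standard function: tries to read
   one character and pushes it back if the read succeeded. Returns
   nonzero when no further character could be read. */
static int at_eof(FILE *fp)
{
    int c;

    assert(fp != NULL);
    c = getc(fp);
    if (c == EOF)
        return 1;        /* end of file or read error */
    ungetc(c, fp);       /* one character of pushback is guaranteed */
    return 0;
}

/* Counts successful fgets() calls in the test-then-read style; this
   equals the number of lines when every line fits in the buffer. */
static int count_lines(FILE *fp)
{
    char buffer[256];
    int count = 0;

    while (!at_eof(fp)) {
        if (fgets(buffer, sizeof buffer, fp) == NULL)
            break;       /* the read itself can still fail */
        ++count;
    }
    return count;
}
```

Even in this style the fgets() result is still worth checking, which is the objection raised earlier in the thread.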

But in any case I think it is a better idea to treat different kinds of
files separately, although I know that Unix users are excited about creating
pipes and things between processes, and pretending files redirected to stdin
have actually been typed by somebody.

The actuality is that pretty much 100% of the files I want to read (using
such a loop) are fixed-size.

I don't remember any such argument; can you provide a reference?

(I've brought it up a few times over the years...

(You could hack something together with macros,

... and that argument is generally used. And in fact I've had to resort to
such a macro because 100% of my for-loops just iterate from A to B and
the regular for-statement is too long-winded and error-prone.

but I personally don't
think that's a good idea; the existing idiom is reasonably clear, and a
macro solution would require anyone reading the code to figure out how
the macros work.)

The macro looks like FOR(i,A,B) in use, which I don't think is that
challenging. But then no-one else is going to see it.)
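
The macro itself is not shown in the post. One plausible definition, assuming an inclusive range [A,B] and a step of one, might look like the sketch below; sum_range is a made-up function that only exercises it.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical definition, assumed to cover the inclusive range [a, b]
   in steps of one. Note that (b) is re-evaluated on every iteration. */
#define FOR(i, a, b) for ((i) = (a); (i) <= (b); ++(i))

/* Sums the integers a..b inclusive; yields 0 when a > b. */
static long sum_range(int a, int b)
{
    long total = 0;
    int i;

    assert(b < INT_MAX);   /* ++i would overflow if b were INT_MAX */
    FOR(i, a, b)
        total += i;
    return total;
}
```

With this definition sum_range(1, 10) gives 55. The inclusive upper bound is the design choice that comes up for debate later in the thread.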

Ben Bacarisse · Apr 1, 2014

<snip>

But for the million situations where you just have a small, static
line-oriented text file which has a well-defined beginning and end,
then my eof()/readline() loop works perfectly well.

It often works, but it's almost always wrong from the point of view of
expressing the program's intent. I can't remember the last time I cared
if I had hit the end of a file (or whatever). What I care about is if
the read worked, or if the data was of the right sort, or if it meets
some other condition. Writing

while (read_and_check_data(...) == SUCCESS) {
...
}

is almost always better than

while (!one_possible_reason_for_failure(...)) {
...
}

What the OP wants is

while (fscanf(fp, "%d", &input) == 1) {
/* do something with 'input' */
}

or, if there are further restrictions on the numbers, something like
this:

while (fscanf(fp, "%d", &input) == 1 && input >= 0) {
/* do something with 'input' */
}

and that's almost certainly better than any loop that checks for EOF
instead of input success.
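
A slightly fuller sketch of the same idea: loop on the success of fscanf(), and only afterwards ask why it stopped. The helper name sum_ints and the numeric reason codes are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: sums whitespace-separated integers from fp.
   Stores the reason the loop stopped in *why:
   0 = end of input, 1 = read error, 2 = non-numeric text. */
static long sum_ints(FILE *fp, int *why)
{
    long total = 0;
    int input;

    assert(fp != NULL && why != NULL);

    /* The loop condition is "the read worked", not "not at EOF". */
    while (fscanf(fp, "%d", &input) == 1)
        total += input;

    /* Only now do feof()/ferror() become useful, as a diagnosis. */
    if (ferror(fp))
        *why = 1;
    else if (feof(fp))
        *why = 0;
    else
        *why = 2;   /* matching failure: next text is not a number */
    return total;
}
```

Given the input "1 2 x 4" the function returns 3 with *why set to 2, a case that a loop testing only for EOF would handle badly, since the 'x' is never consumed.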

<snip>

glen herrmannsfeldt · Apr 2, 2014

BartC said:
(I can just imagine the hairy compilers the people on here
might write which will be parsing source input even as the
programmer is still adding lines at the end! Sometimes you
just have to state the kind of file model that you want to
support, and insist that the files conform to that model.)

I don't know that any Unix compilers will accept /dev/tty as
an input file. I did once with TOPS-10 and Fortran-10 on a
PDP-10 compile direct from terminal input. If you make any
mistakes, you have to start over, so it helps to have a small
program.

-- glen

Keith Thompson · Apr 2, 2014

glen herrmannsfeldt said:
I don't know that any Unix compilers will accept /dev/tty as
an input file. I did once with TOPS-10 and Fortran-10 on a
PDP-10 compile direct from terminal input. If you make any
mistakes, you have to start over, so it helps to have a small
program.

I don't know of any that don't:

% gcc -x c -o hello /dev/tty
#include <stdio.h>
int main(void) { puts("hello"); }
% ./hello
hello

James Kuyper · Apr 2, 2014

On 04/02/2014 02:38 PM, glen herrmannsfeldt wrote:
....

I don't know that any Unix compilers will accept /dev/tty as
an input file.

~(49) gcc -xc /dev/tty -o from_terminal
#include <stdio.h>
int main(void) {
printf("Hello World!\n");
return 0;
}
~(50) from_terminal
Hello World!

glen herrmannsfeldt · Apr 3, 2014

(snip, I wrote)

I don't know of any that don't:

Last time I tried, the compiler required a .c extension.
Seems that gcc doesn't require that any more.

How about compilers other than gcc?

% gcc -x c -o hello /dev/tty
#include <stdio.h>
int main(void) { puts("hello"); }
% ./hello
hello

-- glen

Kaz Kylheku · Apr 3, 2014

(snip, I wrote)

Last time I tried, the compiler required a .c extension.
Seems that gcc doesn't require that any more.

How about compilers other than gcc?

Compilers that need a filesystem object with a .c extension
can be fooled by a Unix fifo.

mkfifo fake.c
cc -c fake.c & # blocks on open("fake.c", ...) syscall
some_process > fake.c

Now the compiler is possibly compiling pieces of code from the pipe before
some_process finishes writing it all.

James Kuyper · Apr 3, 2014

(snip, I wrote)

Last time I tried, the compiler required a .c extension.
Seems that gcc doesn't require that any more.

I'm not sure that it ever did; I am sure that it hasn't required it for
a long time. I'm also sure that most of the other compilers I've ever
used had options for accepting non-default file naming conventions.

Keith Thompson · Apr 3, 2014

glen herrmannsfeldt said:
Last time I tried, the compiler required a .c extension.
Seems that gcc doesn't require that any more.

How about compilers other than gcc?

It requires *some* way to know that the file contains C source (since
the "gcc" command is a driver capable of invoking compilers for several
different languages). Usually that's done by giving it a file whose
name ends in ".c", but in this case, the "-x c" option serves the same
purpose.

glen herrmannsfeldt · Apr 3, 2014

Keith Thompson said:
(snip)
It requires *some* way to know that the file contains C source (since
the "gcc" command is a driver capable of invoking compilers for several
different languages). Usually that's done by giving it a file whose
name ends in ".c", but in this case, the "-x c" option serves the same
purpose.

Before gcc compiled for different languages, it still allowed linking
in .o files, and sometimes assembling .s files. I was writing C
back to SunOS 3.x and other computers before that. It could be
that I missed the option.

VMS programs in general, and not just compilers, mostly have
a preferred extension (if none is specified) or will accept
any extension you supply. (For example TYPE defaults to .LIS
if none is specified.)

TOPS-10 is slightly different, but mostly has preferred extension,
but will accept any when supplied.

-- glen

Öö Tiib · Apr 4, 2014

(I've brought it up a few times over the years...

... and that argument is generally used. And in fact I've had to resort to
such a macro because 100% of my for-loops just iterate from A to B and
the regular for-statement is too long-winded and error-prone.

You seem to assume [A,B] on close to all cases; someone else assumes [A,B)
on close to all cases. Both assumptions make sense.

Traditional 'for' cycle makes it explicit what it is; the macros hide it.
So the macros are less error prone only for author of macro and that
is exactly the state-of-art situation.

BartC · Apr 4, 2014

Öö Tiib said:
(I've brought it up a few times over the years...

... and that argument is generally used. And in fact I've had to resort to
such a macro because 100% of my for-loops just iterate from A to B and
the regular for-statement is too long-winded and error-prone.

You seem to assume [A,B] on close to all cases; someone else assumes [A,B)
on close to all cases. Both assumptions make sense.

If you mean an inclusive A,B range, then this was typical of for-loops in
most (admittedly old) languages I've used in the past: Fortran, Algol,
Pascal etc. The syntax usually suggests that too: for i:= A to B etc.

I understand that C is 0-based so many loops will end up looking like
FOR(i,A,B-1). I don't think that's a problem. I suppose macros could be
renamed to make it clear, and different ones used for different kinds of
ranges, but that complicates matters and for my purposes it wasn't
necessary.

But just about anything is easier to write than an actual C for-loop, where
you have to tell the compiler (which would otherwise have no idea of how to
code a for-loop) these extra details:

- Loop-index - needs to be repeated twice (with no checks if you get it
wrong, see example below)

- Limit comparator: you need to tell it whether it's ==, <, <= etc, again
with no checks

- Increment: you have to say if it's ++ or --, but it could be anything and
the compiler will say nothing.

I think really there is no argument for not having a proper for-statement,
as well as what C has now. (In my own language designs, I have the proper
for statement, as well as the C version. And guess what: for all its
supposed power and flexibility, I've never used the C version!)

Traditional 'for' cycle makes it explicit what it is; the macros hide it.
So the macros are less error prone only for author of macro and that
is exactly the state-of-art situation.

How many people do you think have ever had bugs like the following:

for (i=0; i<N; ++i)
for (j=0; j<N; ++i )
.....

This kind of bug is *impossible* when you write loops like this:
for i=1,N do
for j=1,N do
.....

Seungbeom Kim · Apr 8, 2014

... and that argument is generally used. And in fact I've had to resort to
such a macro because 100% of my for-loops just iterate from A to B and
the regular for-statement is too long-winded and error-prone.

You seem to assume [A,B] on close to all cases; someone else assumes [A,B)
on close to all cases. Both assumptions make sense.

Traditional 'for' cycle makes it explicit what it is; the macros hide it.
So the macros are less error prone only for author of macro and that
is exactly the state-of-art situation.

Then let the macro be used like FOR(i, A, <=B) or FOR(i, A, <B).

Then why not support decrements or arbitrary step sizes as well?
FOR(i, A, >=0, --), FOR(i, A, <B, +=2), FOR(i, A, >0, >>=1), etc.
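
A macro accepting those call forms could be sketched as below. The definition is an assumption made for illustration (none is given in the post), and node, sum_stepping_down and list_length are made-up names used to exercise it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical four-argument form: 'cond' and 'step' are pasted after
   the loop variable, so FOR(i, 10, >= 0, -= 2) expands to
   for ((i) = (10); (i) >= 0; (i) -= 2). */
#define FOR(i, init, cond, step) for ((i) = (init); (i) cond; (i) step)

struct node {
    int value;
    const struct node *next;
};

/* Adds start, start-2, start-4, ... for as long as the value is >= 0. */
static int sum_stepping_down(int start)
{
    int i, total = 0;

    FOR(i, start, >= 0, -= 2)
        total += i;
    return total;
}

/* Walks a linked list with the same macro: the step is "= p->next". */
static int list_length(const struct node *head)
{
    const struct node *p;
    int n = 0;

    FOR(p, head, != NULL, = p->next)
        ++n;
    return n;
}
```

At this point the macro is little more than the ordinary for statement with its punctuation rearranged, which may be the point being made.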

Seungbeom Kim · Apr 8, 2014

What I care about is if the read worked, or if the data was
of the right sort, or if it meets some other condition. Writing

while (read_and_check_data(...) == SUCCESS) {
...
}

is almost always better than

while (!one_possible_reason_for_failure(...)) {
...
}

This is analogous to

if (mkdir(name) fails) {
print "mkdir failed";
return false;
}
return true;

being better than

if (exists(name)) {
print "name already exists";
return false;
}
mkdir(name); // assume success
return true;

though for a somewhat different reason.
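
In C on a POSIX system the first form might look like the sketch below. The helper name make_directory is made up, and mkdir() here is the POSIX function, not part of standard C. One commonly cited reason for preferring this form (an assumption about what the post has in mind) is that a separate existence check leaves a window in which another process can create the directory before mkdir() runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>   /* POSIX mkdir(); not part of standard C */

/* Hypothetical helper: attempts the operation and reports the actual
   failure, instead of testing for existence first. */
static bool make_directory(const char *name)
{
    assert(name != NULL);

    if (mkdir(name, 0777) != 0) {
        /* errno covers every cause: already exists, no permission,
           missing parent directory, and so on. */
        fprintf(stderr, "mkdir %s failed: %s\n", name, strerror(errno));
        return false;
    }
    return true;
}
```

Calling it twice with the same name succeeds the first time and fails the second, with the message naming the real cause.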

Öö Tiib · Apr 8, 2014

... and that argument is generally used. And in fact I've had to resort to
such a macro because 100% of my for-loops just iterate from A to B and
the regular for-statement is too long-winded and error-prone.

You seem to assume [A,B] on close to all cases; someone else assumes [A,B)
on close to all cases. Both assumptions make sense.

Traditional 'for' cycle makes it explicit what it is; the macros hide it.
So the macros are less error prone only for author of macro and that
is exactly the state-of-art situation.

Then let the macro be used like FOR(i, A, <=B) or FOR(i, A, <B).

Then why not support decrements or arbitrary step sizes as well?
FOR(i, A, >=0, --), FOR(i, A, <B, +=2), FOR(i, A, >0, >>=1), etc.

Not sure if that fits into "proper for statement" of Bart, but
it would indeed cover close to 100% variants I see in code ...
like FOR(p, A, !=NULL, =p->next) ... FOR(i, i, <B, ++).

Some with additional constraints like

for (i = a; weStillCare() && i < b; i++)

remain, but FOR(i,a,<b,++) with 'if (!weStillCare()) break;' at the start
of the loop body works about the same.
