Do input functions use fgetc inside them or read()

Kobu

Do the "larger" input functions like scanf, gets, fgets use fgetc to
take input or an operating system call function like read() (I know it
could be any "way", but I'm trying to find out how it's best to "think
of it")?

I've seen some explain it as if they call fgetc internally, then I've
seen some people explain it as if they call some lower level OS system
call like read(). I've even seen some older posts talk about input
functions interacting directly with drivers (tty comes up a lot).

I've been searching through the deja/google archives and there is a
mixture of "ways of thinking how larger input functions operate." This
is important to figure out little nook and cranny issues like the
eof/error post recently that made me more confused than I've ever been
before about input. This is a sticking point in my journey to
understand these topics on my own, so please try not to give me RTFM
answers :). I also understand many things I'm talking about are not
part of the C standard, but they are interfaces to the C standard and
many C gurus have brought these topics up in previous posts because I
think it helps understand these standard input functions.



Side note: So far this is how I picture the things in my head.

scanf/gets
    |
    |
   \ /
fgetc
    |
    |
   \ /
read() (system call)
    |
    |
   \ /
driver (like tty)
 
Eric Sosman

Kobu said:
Do the "larger" input functions like scanf, gets, fgets use fgetc to
take input or an operating system call function like read() (I know it
could be any "way", but I'm trying to find out how it's best to "think
of it")?

They obtain their input "as if" by calling fgetc(). This
does not imply that there's an actual fgetc() or getc() call
in the implementation of fread(), say, but the net observable
effect must be the same.

Kobu said:
I've seen some explain it as if they call fgetc internally, then I've
seen some people explain it as if they call some lower level OS system
call like read(). I've even seen some older posts talk about input
functions interacting directly with drivers (tty comes up a lot).

Different implementations ... well, they differ: that's
why we call them "different!" Some use read(), some use mmap(),
some use SYS$GET(), some use Black Magic. The C Standard says
what the implementation must do, not how it is to be done.

Kobu said:
Side note: So far this is how I picture the things in my head.

scanf/gets
    |
    |
   \ /
fgetc
    |
    |
   \ /
read() (system call)
    |
    |
   \ /
driver (like tty)

This is a plausible schematic, but by no means universal.
The crucial question is this: Why do you need to know what
happens "under the hood?" Thirst for knowledge? Fine: just
realize that the knowledge you gain about one implementation
may not be transferable to the next. Performance? Fine:
just realize that the trick that makes machine A read input
ten times faster causes machine B to imitate molasses flowing
uphill in winter. A desire to mix-and-match the "lower-level"
and "upper-level" calls? Fine again: just realize that the
technique that does wonders on machine A causes machine B to
blue-screen ...

When you have the luxury (and it *is* a luxury) of
treating your libraries as "black boxes" and simply letting
them "do their own thing," by all means do so: You save an
enormous amount of pain. Do not attempt to outwit those
libraries unless the amount of pain you already suffer from
exceeds the amount you will incur.
 
Stephen Sprunk

Kobu said:
Do the "larger" input functions like scanf, gets, fgets use fgetc to
take input or an operating system call function like read() (I know it
could be any "way", but I'm trying to find out how it's best to "think
of it")?

I've seen some explain it as if they call fgetc internally, then I've
seen some people explain it as if they call some lower level OS system
call like read().

Explaining it as the more complex functions calling the simpler ones makes
sense if the goal is to explain how they all work together without the
details of where the character stream(s) come from (or go to, in the case of
output). It's possible a really naive implementation might do that, but it's
not the most efficient method.

The way I'd explain it comes in two layers; there's a "lower" layer that
fetches a bunch of characters from the OS and stores them in a hidden array
(probably in the FILE object) plus an "upper" layer that studies the array,
removes however many characters are needed, and delivers them to your
program. If the array doesn't have enough content to meet your needs, the
upper layer calls into the lower layer to go get more. The upper layer, of
course, consists of the user-visible functions like fgetc() and scanf().
The lower level would be some implementation-specific wrappers around OS
functions/syscalls, e.g. read(). Reverse the process for output streams.
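The two layers described above can be sketched in a few lines of C. This is only a toy model under stated assumptions: every name in it (toy_stream, toy_fill, toy_getc, toy_getword) is made up for illustration, the "OS" is simulated by an in-memory string so the sketch stays portable, and no real C library is claimed to look like this. What it shows is that the character reader and the word reader both consume the same hidden buffer, and neither calls the other.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; not the layout of any real FILE object. */
#define TOY_EOF    (-1)
#define TOY_BUFSIZ 4              /* tiny on purpose, so refills are frequent */

typedef struct {
    const char   *src;            /* stands in for the OS-level data source */
    size_t        src_len;
    size_t        src_pos;
    unsigned char buf[TOY_BUFSIZ];/* the "hidden array" */
    size_t        buf_len;        /* number of valid bytes in buf */
    size_t        buf_pos;        /* index of the next unread byte */
    int           refills;        /* how many times the lower layer ran */
} toy_stream;

static void toy_open(toy_stream *s, const char *data)
{
    s->src = data;
    s->src_len = strlen(data);
    s->src_pos = 0;
    s->buf_len = s->buf_pos = 0;
    s->refills = 0;
}

/* Lower layer: fetch the next chunk. A real library would ask the OS
   here, for example with read() on a POSIX system. Returns bytes fetched. */
static size_t toy_fill(toy_stream *s)
{
    size_t n = s->src_len - s->src_pos;
    if (n > TOY_BUFSIZ)
        n = TOY_BUFSIZ;
    memcpy(s->buf, s->src + s->src_pos, n);
    s->src_pos += n;
    s->buf_len = n;
    s->buf_pos = 0;
    s->refills++;
    return n;
}

/* Upper layer, one character at a time (the fgetc-like entry point). */
static int toy_getc(toy_stream *s)
{
    if (s->buf_pos == s->buf_len && toy_fill(s) == 0)
        return TOY_EOF;
    return s->buf[s->buf_pos++];
}

/* Upper layer, one whitespace-delimited word (loosely like scanf("%s")).
   It walks the buffer directly instead of calling toy_getc().
   Returns 1 if a word was stored, 0 otherwise. */
static int toy_getword(toy_stream *s, char *out, size_t size)
{
    size_t n = 0;
    if (size == 0)
        return 0;
    for (;;) {
        if (s->buf_pos == s->buf_len && toy_fill(s) == 0)
            break;                      /* source exhausted */
        unsigned char c = s->buf[s->buf_pos];
        if (c == ' ' || c == '\n') {
            if (n > 0)
                break;                  /* delimiter ends the word, stays unread */
            s->buf_pos++;               /* skip leading whitespace */
            continue;
        }
        if (n + 1 == size)
            break;                      /* output full, rest stays buffered */
        out[n++] = (char)c;
        s->buf_pos++;
    }
    out[n] = '\0';
    return n > 0;
}
```

Opening the toy stream on "hello world", reading one character with toy_getc() and then a word with toy_getword() yields 'h' followed by "ello": the two calls share one buffer position, which is the point of the model.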

That's a long way of saying that scanf() doesn't call fgetc() but rather
both of them use a common facility which is invisible to you. Of course,
any particular implementation might be written differently; my concept might
be horribly inefficient compared to the magic that systems implementors cook
up.

Kobu said:
I've even seen some older posts talk about input
functions interacting directly with drivers (tty comes up a lot).

This may be the case for DOS and embedded systems, but it'd never fly in a
multitasking environment like UNIX or Windows. It's probably not even true
on DOS since it'd prevent shell redirection and other handy tricks.

If my explanation made sense, I'd suggest you grab the source for glibc and
look at how the GNU folks do it...

S
 
Stephen Sprunk

Eric Sosman said:
The crucial question is this: Why do you need to know what
happens "under the hood?" Thirst for knowledge? Fine: just
realize that the knowledge you gain about one implementation
may not be transferable to the next.

And it may not need to be; some people understand abstraction layers more
easily by looking at how things work under the hood in one implementation
even if they never actually take advantage of those details in their own
code.

Eric Sosman said:
When you have the luxury (and it *is* a luxury) of
treating your libraries as "black boxes" and simply letting
them "do their own thing," by all means do so: You save an
enormous amount of pain. Do not attempt to outwit those
libraries unless the amount of pain you already suffer from
exceeds the amount you will incur.

Of course. OTOH, learning how the Standard Library interfaces with your
implementation is a great tool in learning how to interface with other parts
of your implementation the Standard doesn't cover. Hopefully in the process
one will learn to create a layer of abstraction that will allow the
unportable parts of one's code to be segregated for easier porting in the
future.

S
 
Keith Thompson

Eric Sosman said:
Performance? Fine:
just realize that the trick that makes machine A read input
ten times faster causes machine B to imitate molasses flowing
uphill in winter.
[...]

<OT>
Not everything is as it seems. Google "Boston molasses disaster" for
a counterexample to your implicit assumption.
</OT>

Followups redirected appropriately.
 
Dan Pop

Kobu said:
Do the "larger" input functions like scanf, gets, fgets use fgetc to
take input or an operating system call function like read() (I know it
could be any "way", but I'm trying to find out how it's best to "think
of it")?

<stdio.h> streams are typically buffered. So, all input functions look
for input in the stream buffer. If this buffer is empty, the
implementation has to refill it, by asking the OS to read more data
from the (physical or logical) device to which the stream is connected
and to store it into the stream buffer. It is obvious that this is
achieved in a highly OS-specific way.

Fortunately, all you need to know is that "larger" input functions
behave *as if* they obtained the data through repeated fgetc() calls.
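To make the "as if" model concrete, here is a minimal sketch of an fgets-like function written purely in terms of fgetc(). The name sketch_fgets is hypothetical, and this is not how any particular library implements fgets; it only illustrates that the observable behaviour can be described entirely through repeated fgetc() calls.

```c
#include <stdio.h>

/* Hypothetical illustration: fgets-like behaviour expressed via fgetc().
   Reads at most size-1 characters, stops after a newline, and returns
   NULL if end-of-file arrives before any character or on a read error. */
static char *sketch_fgets(char *buf, int size, FILE *fp)
{
    int n = 0;
    int c = 0;

    if (size <= 0)
        return NULL;

    while (n < size - 1 && (c = fgetc(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;                  /* a newline ends the line and is kept */
    }

    if (c == EOF && (n == 0 || ferror(fp)))
        return NULL;                /* nothing read, or the read failed */

    buf[n] = '\0';
    return buf;
}
```

A caller would use it exactly like fgets, for example `while (sketch_fgets(line, sizeof line, stdin) != NULL) ...`, which is why the model is enough for reasoning about portable code.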

Dan
 
Kobu

Thank you all for your comments. The reason I wanted to know a little
bit about what's happening below the input functions is because it
gives me enlightenment into how the upper parts work, and also shows me
where the boundary is (so I can be more careful about segregating the
two). I've run into problems between different implementations, and
this is a sample of the problem.

The following code was tested on a *n*x/bsd type system and then a
winconsole/dos type system. The behaviour is totally different... for
the carefully picked input that I chose which should force the *under
the hood* fgetc to act the same on both systems (both systems also have
NON STICKY EOF when dealing with an interactive terminal).

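#include <stdio.h>

int main(void)
{
    /* Start with empty strings: if scanf hits end-of-file before it
       converts anything, it leaves the array untouched, and reading an
       uninitialized array would be undefined behaviour. */
    char string1[30] = "";
    char string2[30] = "";

    /* %29s limits each read so it cannot overflow the 30-byte arrays. */
    scanf("%29s", string1);
    scanf("%29s", string2);

    /* !*s is true when the first character is '\0', that is, when the
       string is empty because scanf stored nothing. */
    if (!*string1)
        printf("string1 IS nul terminated\n");
    else
        printf("string1 not nul terminated\n");

    if (!*string2)
        printf("string2 IS nul terminated\n");
    else
        printf("string2 not nul terminated\n");

    return 0;
}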


On the *n*x system, I gave it two EOF signals (Ctrl+D) and the program
ended.

On the win system, I had to give it four EOF signals (Ctrl+Z, Enter),
and only then did the program end (2 signals per scanf).

Based on the responses to my original post, I am safe to assume that
scanf gets input "as if" by fgetc. If I assume that, fgetc should flag
scanf with a value equal to the EOF macro, and scanf should get the
hint and return (not wait for another fgetc-EOF like my win system is
doing).

Based on the fgetc theory, how can one explain the win system? This is
precisely why I asked if it was safe for me to assume fgetc under the
hood of scanf and larger functions or whether I have to go a layer up,
or down to find the boundary between the STANDARD and NON STANDARD.

Some help please.

[BTW, according to both implementations, fgetc should return EOF in
each case - Ctrl+D on beginning of line for *n*x OR Ctrl+Z & Enter at
beginning of line for Win system. Thus there is no difference at the
fgetc-EOF layer as far as my experiment goes, so where *can* the
difference be? ]
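One way to narrow down where the two systems diverge is to record what each scanf call actually reported, rather than inferring it from how many keystrokes were needed. The sketch below is only an assumption about how one might instrument the experiment: try_word and the word_status codes are hypothetical names, and the sketch does not by itself explain the console behaviour. It relies only on standard facts: fscanf returns the number of conversions or EOF, and feof() and ferror() report the stream's indicators afterwards.

```c
#include <stdio.h>

/* Hypothetical status codes for one read attempt. */
enum word_status { WORD_OK, WORD_EOF, WORD_ERROR, WORD_NOMATCH };

/* Try to read one whitespace-delimited word of at most 29 characters.
   out must have room for 30 bytes; it is set to "" if nothing is stored. */
static enum word_status try_word(FILE *fp, char out[30])
{
    int rc;

    out[0] = '\0';                      /* defined contents even on failure */
    rc = fscanf(fp, "%29s", out);

    if (rc == 1)
        return WORD_OK;                 /* one conversion succeeded */
    if (ferror(fp))
        return WORD_ERROR;              /* the stream's error indicator is set */
    if (feof(fp))
        return WORD_EOF;                /* end-of-file before any conversion */
    return WORD_NOMATCH;                /* input present but not converted */
}
```

In the test program above, calling try_word(stdin, string1) and try_word(stdin, string2) and printing the two results on each system would show whether each call really ended on end-of-file, which is more informative than counting Ctrl+D or Ctrl+Z presses.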
 
