Historical question: why fwrite and not a binary specifier for fprintf?

David Mathog

In the beginning (Kernighan & Ritchie 1978) there was fprintf, and unix
write, but no fwrite. That is, no portable C method for writing binary
data, only system calls which were OS specific. At C89 fwrite/fread
were added to the C standard to allow portable binary IO to files. I
wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?

Consider a bit of code like this (error checking and other details omitted):

int ival;
double dval;
char string[10]="not full\0\0";
FILE *fp;

fp = fopen("file.name","w");
(void) fprintf(fp,"%i%f%s",ival,dval,string);

It always seemed to me that the natural extension, if the data needed to
be written in binary, would have been either this (which would have
allowed type checking):

(void) fprintf(fp,"%bi%bf%bs",ival,dval,string);

or perhaps just this (which would not have allowed type checking):

(void) fprintf(fp,"%b%b%b",ival,dval,string);

(Clearly there are some issues in deciding whether to write just the
string "not full" or the entire buffer, which could have been handled
in the %bs form using a field width, for instance.)

Anyway, in the real world fwrite was chosen. For those of you who were
around for this decision, was extending fprintf considered instead of,
or in addition to fwrite? What was the deciding factor for fwrite?
I'm guessing that it was that everybody had been using write() for years
and it was thought that fwrite was a more natural extension, but that is
just a guess.

Thanks,

David Mathog
 
santosh

David said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and
unix write, but no fwrite. That is, no portable C method for writing
binary data, only system calls which were OS specific. At C89
fwrite/fread were added to the C standard to allow portable binary IO
to files. I wonder though why the choice was made to extend the unix
function write() into a standard C function rather than to extend the
existing standard C function fprintf to allow binary operations?

Perhaps because the UNIX functions were well known? Perhaps for reasons
of efficiency?
Consider a bit of code like this (error checking and other details
omitted):

int ival;
double dval;
char string[10]="not full\0\0";
FILE *fp;

fp = fopen("file.name","w");
(void) fprintf(fp,"%i%f%s",ival,dval,string);

It always seemed to me that the natural extension, if the data needed
to be written in binary, would have been either this (which would have
allowed type checking):

(void) fprintf(fp,"%bi%bf%bs",ival,dval,string);

or perhaps just this (which would not have allowed type checking):

(void) fprintf(fp,"%b%b%b",ival,dval,string);

(Clearly there are some issues in deciding whether to write just the
string "not full" or the entire buffer, which could have been handled
in the %bs form using a field width, for instance.)

Anyway, in the real world fwrite was chosen. For those of you who
were around for this decision, was extending fprintf considered
instead of, or in addition to fwrite? What was the deciding factor
for fwrite? I'm guessing that it was that everybody had been using
write() for years and it was thought that fwrite was a more natural
extension, but that is just a guess.

Personally I'm glad that direct I/O has separate functions for it. The
*printf()/*scanf() interface is already quite a complicated, bloated
one.

It seems to me that their primary use is when conversion is necessary.
Otherwise a more direct interface is preferable, at least for
efficiency, if for nothing else.
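
For instance, dumping a double in its in-memory form is a single direct
call. A minimal sketch (the file name is made up and error checks are
abbreviated):

#include <stdio.h>

int main(void)
{
    double d = 3.14159;
    FILE *fp = fopen("data.bin", "wb");  /* "wb": binary mode matters on non-unix systems */

    if (fp == NULL)
        return 1;
    /* One call writes the raw bytes of d; no format string is parsed. */
    if (fwrite(&d, sizeof d, 1, fp) != 1)
        return 1;
    fclose(fp);
    return 0;
}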
 
Keith Thompson

David Mathog said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and
unix write, but no fwrite. That is, no portable C method for writing
binary data, only system calls which were OS specific. At C89
fwrite/fread
were added to the C standard to allow portable binary IO to files. I
wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?

Consider a bit of code like this (error checking and other details omitted):

int ival;
double dval;
char string[10]="not full\0\0";
FILE *fp;

fp = fopen("file.name","w");
(void) fprintf(fp,"%i%f%s",ival,dval,string);

It always seemed to me that the natural extension, if the data needed
to be written in binary, would have been either this (which would have
allowed type checking):

(void) fprintf(fp,"%bi%bf%bs",ival,dval,string);

or perhaps just this (which would not have allowed type checking):

(void) fprintf(fp,"%b%b%b",ival,dval,string);
[...]

Neither form really allows type checking, unless the compiler chooses
(as gcc does, for example) to check the arguments against the format
string and issue warnings for mismatches. Such checking is not
possible if the format string is not a string literal.

Your ``string'' argument is passed as a pointer to the first character
of the string (&string[0]). fprintf would have no way to know how
many characters to print -- unless it stops at the first '\0', but
that's likely to be inappropriate for a binary file.

The whole purpose of fprintf is to format data into text (the final
'f' stands for format). Binary output specifically doesn't do any
formatting; it just dumps the raw bytes. Having to invoke fprintf,
with all its internal machinery to parse the format string, when you
merely want to dump raw bytes doesn't seem like a good thing.

fwrite() does just what it needs to do, without all that conceptual
overhead.
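
For example, reading the double back is equally direct. A minimal
sketch, assuming the file was written by fwrite on the same
implementation:

#include <stdio.h>

int main(void)
{
    double d;
    FILE *fp = fopen("data.bin", "rb");  /* hypothetical file from an earlier fwrite */

    if (fp == NULL)
        return 1;
    /* The raw bytes go straight into d; no parsing machinery is involved. */
    if (fread(&d, sizeof d, 1, fp) == 1)
        printf("read back %f\n", d);
    fclose(fp);
    return 0;
}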

Speaking of historical questions, were fread() and fwrite() invented
by the C89 committee, or were they based on existing practice? I
suspect the latter, but I'm not sure.
 
Richard Tobin

Keith Thompson said:
Speaking of historical questions, were fread() and fwrite() invented
by the C89 committee, or were they based on existing practice? I
suspect the latter, but I'm not sure.

They were existing practice.

-- Richard
 
Richard Tobin

David Mathog said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and unix
write, but no fwrite. That is, no portable C method for writing binary
data, only system calls which were OS specific. At C89 fwrite/fread
were added to the C standard to allow portable binary IO to files.

No. fwrite() and friends were present in the standard i/o library
introduced in 7th edition unix in 1979.
I wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?

The standard i/o library provides two things: efficient buffering and
formatted i/o. getc(), fwrite(), etc provide buffering. printf() etc
provide formatting on top of that. It makes no sense for you to have
to use the formatting mechanism (and its overhead) just for buffered
i/o, whether text or binary.
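
For illustration, a byte-for-byte copy can sit entirely on the
buffering layer; a minimal sketch:

#include <stdio.h>

/* Copy a stream using only stdio's buffering layer; no format
   string is ever parsed, yet the i/o is still fully buffered. */
int copy_stream(FILE *in, FILE *out)
{
    int c;

    while ((c = getc(in)) != EOF)
        if (putc(c, out) == EOF)
            return -1;
    return ferror(in) ? -1 : 0;
}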

The standard i/o library seems to me to be an excellent balance of
simplicity and functionality, unix and C at their best.

-- Richard
 
J. J. Farrell

Keith said:
David Mathog said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and
unix write, but no fwrite. That is, no portable C method for writing
binary data, only system calls which were OS specific. At C89
fwrite/fread
were added to the C standard to allow portable binary IO to files. I
wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?
[...]

...

Speaking of historical questions, were fread() and fwrite() invented
by the C89 committee, or were they based on existing practice? I
suspect the latter, but I'm not sure.

I believe they first appeared in public as part of the 7th Edition UNIX
standard library in January 1979.
 
Jack Klein

David Mathog said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and unix
write, but no fwrite. That is, no portable C method for writing binary
data, only system calls which were OS specific. At C89 fwrite/fread
were added to the C standard to allow portable binary IO to files. I
wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?

Consider a bit of code like this (error checking and other details omitted):

int ival;
double dval;
char string[10]="not full\0\0";
FILE *fp;

fp = fopen("file.name","w");
(void) fprintf(fp,"%i%f%s",ival,dval,string);

And how would you write a float, or a short, or other types that
undergo default promotions when passed to variadic functions?
It always seemed to me that the natural extension, [...]

It seems to me to be a totally useless and hideously inefficient idea.
Consider a struct with a dozen members. Consider an array of a few
hundred such structs. Both of these numbers are trivially small for
many data-handling programs.

Now you'd have to define a format string that contained specifiers for
each of those dozen members. And call fprintf() in a loop a few
hundred times to save the data to a file. Each time passing each of
the dozen members by value.

And you would have to write a corresponding format string for fscanf()
to read your data back in. Passing a dozen pointers to members of the
structure on each call. And put that in a loop a few hundred times.

Now for a big database, you might have a structure with 50 or 100
members, and you might have a few hundred thousand of them to read and
write.

And, of course, whenever the structure definition changes (oops, this
member needs to be a long because int overflows, or a new member is
added, or the order of members is rearranged), you would have to modify
all input and output format strings -- correctly, of course.
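
By contrast, fwrite() moves the whole array in one call. A minimal
sketch with a hypothetical record type (the file is, of course, only
readable back on the same implementation, given padding and byte
order):

#include <stdio.h>

/* A hypothetical record standing in for the "dozen members". */
struct record {
    int    id;
    double value;
    char   name[16];
};

/* One call each way, no per-member format specifiers, and no source
   change needed when a member is added or reordered. */
size_t save_records(const struct record *r, size_t n, FILE *fp)
{
    return fwrite(r, sizeof *r, n, fp);
}

size_t load_records(struct record *r, size_t n, FILE *fp)
{
    return fread(r, sizeof *r, n, fp);
}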

*printf() and *scanf() are primarily designed to convert between
human-readable text and binary format. Using them when you want no
such conversion does not even seem intuitive.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
 
CBFalconer

Jack said:
.... snip ...

*printf() and *scanf() are primarily designed to convert between
human readable text and binary format. Using them when you want
no such conversion does not even seem intuitive.

However, especially in the embedded field, they are often a
monstrous waste, and also an easy way to inject errors. Simple,
non-variadic functions to output a specific type with possibly a
field width specifier (see Pascal) are much more efficient.
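
A sketch of what such a function might look like (hypothetical, one
fixed type only, left-padded to the field width, and built without the
*printf() machinery):

#include <stdio.h>

/* Hypothetical Pascal-style write(x:width) for a single type; a
   static linker need only pull in this small routine, not the whole
   *printf() engine. */
void put_long(FILE *fp, long value, int width)
{
    char buf[32];                       /* ample for a 64-bit long plus sign */
    int  i = (int)sizeof buf;
    int  len;
    unsigned long u = (value < 0) ? 0UL - (unsigned long)value
                                  : (unsigned long)value;

    do {                                /* convert digits, least significant first */
        buf[--i] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (value < 0)
        buf[--i] = '-';

    len = (int)sizeof buf - i;
    while (width-- > len)               /* left-pad to the field width */
        putc(' ', fp);
    fwrite(buf + i, 1, (size_t)len, fp);
}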

The problem is that those functions, if linked, need to include all
the options, whether used or not. They are just too big and
all-encompassing. It matters much less on an OS where the entire
function lives in one shared library file, but with static linking
the problem reappears everywhere.
 
Eric Sosman

David Mathog wrote on 11/27/07 12:18:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and unix
write, but no fwrite. That is, no portable C method for writing binary
data, only system calls which were OS specific.

There was putc(), which is portable. In fact, the
various Standards for C describe all the output operations
as operating "as if" by repeated putc() calls.
At C89 fwrite/fread
were added to the C standard to allow portable binary IO to files.

Although they are not mentioned in K&R, they are both
older than the ANSI Standard. ANSI did not "add" them; it
codified existing practice.
I wonder though why the choice was made to extend the unix function
write() into a standard C function rather than to extend the existing
standard C function fprintf to allow binary operations?

... but fprintf() *can* generate binary output!

FILE *stream = fopen("data.bin", "wb");
double d = 3.14159;
char *p;
for (p = (char *)&d; p < (char *)(&d + 1); ++p)
    fprintf(stream, "%c", *p);

(Error-checking omitted for brevity.) putc() would be
a better choice, but fprintf() *can* do it, if desired.
Consider a bit of code like this (error checking and other details omitted):

int ival;
double dval;
char string[10]="not full\0\0";
FILE *fp;

fp = fopen("file.name","w");
(void) fprintf(fp,"%i%f%s",ival,dval,string);

It always seemed to me that the natural extension, if the data needed to
be written in binary, would have been either this (which would have
allowed type checking):

(void) fprintf(fp,"%bi%bf%bs",ival,dval,string);

As Charlie Brown said, "Bleah!" Note that this would
offer no way to output a promotable type without performing
the promotion and a subsequent demotion (I'm not worried
about the speed, but about potential changes in the data,
things like a minus zero float losing its minus sign in
the conversion to double and back). Writing out a struct
would be clumsy in the extreme, as you'd need to enumerate
every element, one by one. I can't see any way to write
a bit-field with this scheme, nor any way to write a union
without foreknowledge of which element was current (short
of repeated "%c" as above -- which requires no extensions).
[...] For those of you who were
around for this decision, was extending fprintf considered instead of,
or in addition to fwrite? What was the deciding factor for fwrite?

I wasn't there and don't know, and the Rationale offers
no hints. But the idea of trying to use fprintf() for this
strikes me as tightening screws with hammers: It's the wrong
interface, that's all. Besides, fread() and fwrite() already
existed; they were not "creatures of the committee" in the
way that <stdlib.h> was, for example.
 
David Mathog

Jack said:
And how would you write a float, or a short, or other types that
undergo default promotions when passed to variadic functions?

Good point. The only variadic functions I ever use are those in the
standard libraries, and I had naively always thought that it was fprintf
(and relatives) performing these promotions, in order to reduce the
number of formatting options required. It turns out that it is instead
the mechanism which supports variadic functions per se that causes the
promotions. That is the key point.
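
A small demonstration of that point (the function name is made up): the
float and short below are already a double and an int by the time the
callee can look at anything:

#include <stdarg.h>
#include <stdio.h>

/* The variadic mechanism promotes the arguments before the callee
   runs: va_arg must name the promoted types double and int, because
   "float" and "short" never arrive. */
static void show(int count, ...)
{
    va_list ap;
    double d;
    int i;

    va_start(ap, count);
    d = va_arg(ap, double);  /* the float we passed, already widened */
    i = va_arg(ap, int);     /* the short we passed, already widened */
    va_end(ap);
    printf("%f %d\n", d, i);
}

int main(void)
{
    float f = 1.5f;
    short s = 7;

    show(2, f, s);  /* promoted at the call site, before any format logic */
    return 0;
}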

This topic came up because I was once again looking at reading/writing a
binary file consisting of varying types of data in a known order. In
the past I have either read these sorts of things in as a block and then
chopped it up into the correct-size pieces, or used a series of
fread()'s of the appropriate size. I thought, wouldn't it be nice,
since I already know that it is "byte, byte, byte, 6 character string,
32 bit unsigned int" (etc.) to be able to do something like:


freadf(stream,"%c%c%c%6c%d",&byte1,&byte2,&byte3,character_array,&ival)

to read it, or fwritef (by variable name, not address of variable) to
write. Of course there is no such thing as freadf or fwritef, but the
latter is similar in form to fprintf. (Google turns up freadf/fwritef
functions, but they appear not to be part of any standard.)
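
For comparison, the series-of-fread()s version of that layout might
look like this (names hypothetical; the integer is read in the file's
native byte order, so byte-swapping is left out):

#include <stdio.h>
#include <stdint.h>

/* Read one record laid out as: byte, byte, byte, 6-character string
   (not null-terminated), 32-bit unsigned int. Returns 0 on success,
   -1 on any short read. */
int read_record(FILE *fp, unsigned char *b1, unsigned char *b2,
                unsigned char *b3, char name[6], uint32_t *val)
{
    if (fread(b1, 1, 1, fp) != 1) return -1;
    if (fread(b2, 1, 1, fp) != 1) return -1;
    if (fread(b3, 1, 1, fp) != 1) return -1;
    if (fread(name, 1, 6, fp) != 6) return -1;
    if (fread(val, sizeof *val, 1, fp) != 1) return -1;
    return 0;
}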

The problem is, if I (finally) understand this correctly, that not only
is there no fwritef, there cannot (simply, anyway) be such a function
which accepts arguments by name rather than by address: no matter what
the format string says, arguments specified this way will be promoted
by the variadic function mechanism before the code handling the format
string has a chance to touch the raw data. There is a way around it:
use pointers for fwritef as well as freadf. That makes the idea of
adding binary output modes to fprintf look particularly bad, because
one would write
fprintf(fout,"%c",byte1);

but

fprintf(fout,"%bc",&byte1);

so the "b" modifier would require a change in the address mode of the
argument list. Better to start over with:

fwritef(fout,"%c",&byte1);

so that all the "b" modifiers can be eliminated and there is no change
in the access mode: all arguments are always passed by reference. The
problem of deciding how to handle character strings (by null
termination or by count) remains.
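
To make the idea concrete, a toy fwritef might look like this (entirely
hypothetical, only three specifiers, everything by address):

#include <stdarg.h>
#include <stdio.h>

/* Hypothetical fwritef: every argument is a pointer, so the variadic
   promotion rules touch only the pointers, never the data itself.
   Returns the number of items written, or -1 on error. */
int fwritef(FILE *fp, const char *fmt, ...)
{
    va_list ap;
    int n = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt != '%')
            continue;
        switch (*++fmt) {
        case 'c':  /* one char, by address */
            if (fwrite(va_arg(ap, char *), 1, 1, fp) != 1)
                n = -1;
            break;
        case 'd':  /* one int, by address */
            if (fwrite(va_arg(ap, int *), sizeof(int), 1, fp) != 1)
                n = -1;
            break;
        case 'f':  /* one double, by address */
            if (fwrite(va_arg(ap, double *), sizeof(double), 1, fp) != 1)
                n = -1;
            break;
        default:   /* unknown specifier */
            n = -1;
        }
        if (n < 0)
            break;
        ++n;
    }
    va_end(ap);
    return n;
}

Usage would mirror freadf, e.g. fwritef(fout, "%c%d%f", &byte1, &ival,
&dval);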

Regards,

David Mathog
 
Richard Tobin

David Mathog said:
In the beginning (Kernighan & Ritchie 1978) there was fprintf, and unix
write, but no fwrite. That is, no portable C method for writing binary
data, only system calls which were OS specific.

Eric Sosman said:
There was putc(), which is portable.

To be pedantic, in 1978 there was indeed putc(), but it wasn't the
putc() we know today. The then-current sixth edition unix used a
pointer to a 518-byte "struct buf", which the programmer had to
create, instead of the opaque FILE struct introduced in seventh
edition (1979) along with most of the stdio functions we use today.

-- Richard
 
Eric Sosman

Richard Tobin wrote on 11/28/07 13:00:
Eric Sosman said:
There was putc(), which is portable.


To be pedantic, in 1978 there was indeed putc(), but it wasn't the
putc() we know today. The then-current sixth edition unix used a
pointer to a 518-byte "struct buf", which the programmer had to
create, instead of the opaque FILE struct introduced in seventh
edition (1979) along with most of the stdio functions we use today.

To be pedantic back at'cha: putc() and FILE and fopen()
and so on are described in Chapter 7 of "The C Programming
Language" by Brian W. Kernighan and Dennis M. Ritchie, ISBN
0-13-110163-3. The copyright date is 1978, not 1979 or later,
and the putc() description is on page 152.

Perhaps Unix lagged C by a year or so?
 
Richard Tobin

Eric Sosman said:
To be pedantic back at'cha: putc() and FILE and fopen()
and so on are described in Chapter 7 of "The C Programming
Language" by Brian W. Kernighan and Dennis M. Ritchie, ISBN
0-13-110163-3. The copyright date is 1978, not 1979 or later,
and the putc() description is on page 152.

Perhaps Unix lagged C by a year or so?

I was relying on the date of the unix manuals. I don't think there
was anything except unix to run C on back then. Perhaps the updated
library was available before the new version of unix, or perhaps the
book was written before the corresponding software was generally
available. (I don't seem to have my K&R1 to hand to see if it says
anything about it.)

-- Richard
 
J. J. Farrell

Richard said:
I was relying on the date of the unix manuals. I don't think there
was anything except unix to run C on back then. Perhaps the updated
library was available before the new version of unix, or perhaps the
book was written before the corresponding software was generally
available. (I don't seem to have my K&R1 to hand to see if it says
anything about it.)

UNIX v7 was released in January 1979; I guess that K&R1 was written
based on what was being put together for UNIX v7. Given the difference
between copyright and release dates, it's quite possible that UNIX v7
was finished within Bell Labs before K&R1 was finished.

UNIX v6 had primitive versions of putc() and fopen() which do not match
the definitions in K&R1 or UNIX v7. The Standard I/O library as we know
it today is based on that in UNIX v7 as described in K&R1.
 
