Telling an empty binary file from a "full" one

Bryan Donlan

Michel said:
No, I don't have this problem. The reason is that it's a
configuration file: the program writes what's in memory to a file in
order to use it later. So it works on both big-endian and little-endian
machines, and indeed it can handle absolutely any way of writing
double-precision floats, since it reads only what it writes.

This is true, but note that you'll need to reload it with the same
implementation (compiler, platform, and options) that saved it.
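
A minimal sketch of the "read back only what you wrote" approach, assuming the four doubles and one 32-bit integer mentioned elsewhere in the thread. The struct `freq_config`, its field names, and the functions `save_config` and `load_config` are hypothetical. The resulting file is only meaningful to a program built with the same representation of `double` and `int32_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record layout: four doubles and one 32-bit integer. */
struct freq_config {
    double values[4];
    int32_t count;
};

/* Write the members one by one (not the whole struct) so that struct
   padding never ends up in the file. Returns 0 on success, -1 on error. */
int save_config(const char *path, const struct freq_config *cfg)
{
    assert(path != NULL && cfg != NULL);
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    int ok = fwrite(cfg->values, sizeof cfg->values[0], 4, fp) == 4
          && fwrite(&cfg->count, sizeof cfg->count, 1, fp) == 1;
    if (fclose(fp) != 0)   /* buffered data may only fail at close time */
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the members back in the same order. Returns 0 only if every
   item was read completely, -1 otherwise. */
int load_config(const char *path, struct freq_config *cfg)
{
    assert(path != NULL && cfg != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    int ok = fread(cfg->values, sizeof cfg->values[0], 4, fp) == 4
          && fread(&cfg->count, sizeof cfg->count, 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Both functions open the file in binary mode ("wb" and "rb"), which matters on systems where text streams translate line endings.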
 
Bryan Donlan

Michel said:
That's getting helpful, but I don't really know how to deal with what
fread returns (indeed i have never dealt with size_t's before, nor
included stddef.h).

Anyways, my file has no risk of being over the right size, but only
under, so I guess i should try to read
(4*sizeof(double)+sizeof(int_32)) bytes and see what it returns (when
i'll have figured out what to do with what fread returns)

btw, right now, that file is empty, uneditable and undeletable, and `ls
-l` in cygwin tells me "ls: freq.cfg: No such file or directory", is it
because i killed the process before it fclosed the file?

Sounds like it doesn't exist. Did fopen return NULL?
 
Keith Thompson

Bryan Donlan said:
Michel Rouzic wrote: [...]
btw, right now, that file is empty, uneditable and undeletable, and `ls
-l` in cygwin tells me "ls: freq.cfg: No such file or directory", is it
because i killed the process before it fclosed the file?

Sounds like it doesn't exist. Did fopen return NULL?

<WAY_OT>
I think he means that "ls -l", with no arguments, says
    ls: freq.cfg: No such file or directory
-- i.e., there's a directory entry for it (so it shows up in a plain
"ls"), but any attempt to read the file itself acts as if it doesn't
exist.

See www.cygwin.com for pointers to mailing lists where you can ask
about this.
</WAY_OT>
 
Michel Rouzic

Bryan said:
Sounds like it doesn't exist. Did fopen return NULL?

No, never mind. I had deleted that message and reposted it without the
end so nobody would reply to that, but you did anyway. I don't know how
I did it, but even if it didn't look like it in Cygwin, my program was
still running... I only had to kill it to fix it...
 
Richard Bos

Michel said:
That's getting helpful, but I don't really know how to deal with what
fread returns (indeed i have never dealt with size_t's before, nor
included stddef.h).

One hardly ever #includes <stddef.h>, since nearly everything in it is
also defined in other headers, where necessary (TTBOMK the only
exceptions are ptrdiff_t and offsetof()). For example, size_t is also
defined in <stdio.h>. And you probably _have_ dealt with size_t's
without realising it: sizeof evaluates to a size_t.

FWIW, a size_t is simply an unsigned integer type of the necessary size.
Nothing mysterious about it.
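
As a rough illustration of what to do with the value fread returns: it is the number of complete items read, as a size_t, and the usual check is to compare it with the number requested. The helper name `read_doubles` is hypothetical; `%zu` is the C99 printf conversion for size_t.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to `wanted` doubles from fp into dst.
   Returns how many complete doubles were actually read. */
size_t read_doubles(FILE *fp, double *dst, size_t wanted)
{
    assert(fp != NULL && dst != NULL);
    size_t got = fread(dst, sizeof *dst, wanted, fp);
    if (got < wanted) {
        /* Short read: either end of file or a read error. */
        if (ferror(fp))
            fprintf(stderr, "read error after %zu items\n", got);
        else
            fprintf(stderr, "file ended after %zu of %zu items\n",
                    got, wanted);
    }
    return got;
}
```

A caller that asked for 4 doubles and gets back anything less than 4 knows the file was too short (or unreadable) without ever looking at the file's length.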

Michel said:
Anyways, my file has no risk of being over the right size, but only
under, so I guess i should try to read
(4*sizeof(double)+sizeof(int_32)) bytes and see what it returns (when
i'll have figured out what to do with what fread returns)

Yes... except that if I were you, I'd make that a macro, so you can use
it both for reading and for writing; and people who say "there is no
risk of foo" _will_ encounter foo the day after they hand in their
programs, so I'd make certain and check for over-sized files anyway.

Richard
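
One possible way to follow this advice, sketched under the assumption that the record is exactly four doubles plus one 32-bit integer: define the size once as a macro, then ask fread for one byte more than a full record so that an over-sized file is detected as well as an empty or short one. The names `CONFIG_SIZE`, `config_state`, and `classify_config` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One definition of the record size, shared by reader and writer.
   Assumes the layout from the thread: 4 doubles + one 32-bit integer. */
#define CONFIG_SIZE (4 * sizeof(double) + sizeof(int32_t))

enum config_state {
    CONFIG_UNREADABLE,  /* fopen failed or a read error occurred */
    CONFIG_EMPTY,       /* zero bytes */
    CONFIG_SHORT,       /* some bytes, but less than a full record */
    CONFIG_OK,          /* exactly one record */
    CONFIG_TOO_LONG     /* more than one record's worth of bytes */
};

enum config_state classify_config(const char *path)
{
    assert(path != NULL);
    unsigned char buf[CONFIG_SIZE + 1];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return CONFIG_UNREADABLE;

    /* Request one byte more than a record: getting it means "too long". */
    size_t got = fread(buf, 1, sizeof buf, fp);
    int failed = ferror(fp);
    fclose(fp);

    if (failed)
        return CONFIG_UNREADABLE;
    if (got == 0)
        return CONFIG_EMPTY;
    if (got < CONFIG_SIZE)
        return CONFIG_SHORT;
    if (got == CONFIG_SIZE)
        return CONFIG_OK;
    return CONFIG_TOO_LONG;
}
```

Reading with an item size of 1 makes the return value a byte count, which is what distinguishes "empty" from "short". On a system that pads binary files to whole sectors, the count may come out larger than what was written, so `CONFIG_TOO_LONG` would need gentler handling there.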
 
Chris Torek

I'm curious as to what existing OS's do not accurately
report the lengths of binary files. Does anyone
have any examples?

A whole bunch of old mainframe and minicomputer OSes did this.
They allocated only whole sectors to files, and files were always
sized in whole-sector units. Text files used special encodings so
as to be able to hold "lines of text" that did not come out to an
even number of disk sectors. For instance, each line might be
prefixed by a byte-count indicating how much space the line occupied
within the text file and how many bytes of that were to be treated
as file text (with the extra bytes, if any, being ignored -- this
allows one to shorten lines without rewriting the file). (Each
line might also be numbered, so that lines could be lengthened
without rewriting the entire file, by marking the original line as
deleted -- zero valid bytes -- and placing the new text into whatever
existing space could be found, or at the end of the file.)

VMS's RMS took care of dealing with all the various file-formats
for you; you just told it to open a "text" file and it would map
out the magic. Open the same file as "binary", however, and all
the magic encoding shows up. It was not until VMS version 5 that
"stream-LF" text files appeared; before then, *all* text files had
magic encoding. (The encoding for a "stream-LF" file is basically
the same as that used on Unix systems, i.e., no encoding at all,
just a sequence of bytes with "lines" indicated by newline bytes.)

One interesting consequence of byte-count-encoded (and optionally
numbered) lines is that there is no such thing as a final line that
does not end with a newline. That is:

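/* needs <stdio.h> and <stdlib.h> */
FILE *somefile = fopen("somefile.txt", "w");
if (somefile == NULL) {
    perror("somefile.txt");
    exit(EXIT_FAILURE);
}
fprintf(somefile, "ab\nc\nd");   /* note: no newline after the "d" */
fclose(somefile);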

is faced with a problem: should it write the three lines saying
"line 1: two bytes, ab; line 2: one byte, c; line 3: one byte, d"
-- which is "ab\nc\nd\n", which is not what you wrote -- or should
it write "line 1: two bytes, ab; line 2: one byte, c" -- which is
"ab\nc\n", which is *also* not what you wrote? The file format is
such that it is physically impossible to reproduce what you *did*
write. A file is a sequence of complete lines; there is no such
thing, in this file format, as an incomplete, not-newline-terminated,
line.
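
The dilemma can be modelled in a few lines of C. This is only a toy model of a record-oriented file, not any real system's format; the function name `record_roundtrip` and the `keep_partial` flag are invented for the illustration. It produces the text a program would read back after writing `text`, under each of the two policies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a record-oriented text file stores only complete lines.
   Copies into `out` what would be read back after writing `text`.
   keep_partial != 0: a trailing unterminated line becomes a full record.
   keep_partial == 0: a trailing unterminated line is dropped.
   Returns the number of records stored.
   `out` must have room for strlen(text) + 2 bytes. */
size_t record_roundtrip(const char *text, int keep_partial, char *out)
{
    assert(text != NULL && out != NULL);
    size_t records = 0;
    size_t len = 0;
    const char *line = text;

    while (*line != '\0') {
        const char *nl = strchr(line, '\n');
        size_t n = nl ? (size_t)(nl - line) : strlen(line);
        if (nl == NULL && !keep_partial)
            break;                   /* incomplete last line is lost */
        memcpy(out + len, line, n);  /* record payload: n bytes */
        len += n;
        out[len++] = '\n';           /* reading a record yields a newline */
        records++;
        line += n + (nl ? 1 : 0);
    }
    out[len] = '\0';
    return records;
}
```

With the input "ab\nc\nd", the model returns 3 records and "ab\nc\nd\n" when `keep_partial` is nonzero, and 2 records and "ab\nc\n" when it is zero. Neither result equals the input.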

The C standard allows the runtime library to have either of the
two above behaviors, and different C libraries did different things.
If you want "ab\nc\nd\n" to appear in the file, you must write that
final newline yourself; only then is line 3, consisting of the
letter "d", sure to make its way into the file. (Assuming no disk
errors or other similar problems, of course.)
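
A small sketch of one way to make sure the final newline is always there, assuming the text is available as a single string. The helper name `write_lines` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes text to a stream and adds the final
   newline if the caller left it out. Returns 0 on success, -1 on error. */
int write_lines(FILE *fp, const char *text)
{
    assert(fp != NULL && text != NULL);
    size_t len = strlen(text);
    if (fputs(text, fp) == EOF)
        return -1;
    /* Terminate the last line so it is a complete line everywhere. */
    if (len > 0 && text[len - 1] != '\n' && fputc('\n', fp) == EOF)
        return -1;
    return 0;
}
```

Calling `write_lines(somefile, "ab\nc\nd")` writes "ab\nc\nd\n", so the third line no longer depends on how the library treats an unterminated last line.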
 
