How to detect an empty file?

Chris Torek

stat() works on paths not file descriptors. If you're alluding to race
conditions all file routines have them.

Not *all*: open() with O_CREAT|O_EXCL, for instance, does not.
Indeed, this is the *only* way to guarantee that a file that did
not exist before the call is created by the call and does exist
after the call.
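
A minimal sketch of that exclusive-create idiom, assuming a POSIX system; the helper name create_new_file and the 0644 permission bits are illustrative choices, not something from this thread:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if nothing exists there yet.
 * Returns an open descriptor on success, or -1 with errno set on failure
 * (errno == EEXIST means the name was already taken). */
int create_new_file(const char *path)
{
    /* O_CREAT | O_EXCL makes "check that it is absent" and "create it"
     * a single step, so no other process can slip in between. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

A second call with the same path should fail with EEXIST rather than silently reusing the file.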

Similarly, open() followed by fstat() avoids some (but not all)
races: in particular, the information you get with fstat() will
definitely refer to the underlying file, and the st_ino, st_dev,
and st_rdev fields should not change (of course the size, owner,
and permissions, for instance, could change).

BTW saying "stat" isn't portable is kinda moot since not all platforms
have files anyways.

While the latter is true, the former is an overstatement: there are
hosted systems that have "files" dealt with via fopen, but lack
the stat() family of calls (e.g., C on some Univac and IBM systems).
But even Windows has "stat" like functionality. BSD, MacOs, Linuxes
have it too. So do UNIXes. So very likely using some stat function is
better than opening the file and doing some ftell crap.

Usually it is best to open() the file, then use fstat() if needed.
On many systems, this is essentially just as fast as using stat()
on the file, too; and if you intend to open it after getting the
result from stat(), open() followed by fstat() is faster -- often
significantly faster -- than stat() followed by open(). The reason
is that most of the work (and hence most of the time) goes to
translating path names, rather than dealing with the final on-media
entity representing the file. Doing stat(), then open(), runs the
slow code twice; doing open(), then fstat(), runs it only once.

(On VxWorks, stat() is in fact implemented as open() followed by
fstat() followed by close(). So here it is *definitely* more
efficient to open() first, then fstat().)
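
For the original question (is the file empty?), the open()-then-fstat() order could be sketched like this on a POSIX system. The helper name open_and_check_empty and its return convention are assumptions made for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and report whether it is empty.
 * Returns 1 if empty, 0 if not, -1 on error or if it is not a regular file.
 * On a 0 or 1 result, *fd_out holds the open descriptor; the caller closes it. */
int open_and_check_empty(const char *path, int *fd_out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);   /* path name translated once, here */

    if (fd < 0)
        return -1;
    /* fstat() describes the file we actually opened, not whatever the
     * name happens to refer to by now. */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    *fd_out = fd;
    return st.st_size == 0;
}
```

The size can still change after the call returns, so the answer is only a snapshot.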
 
Keith Thompson

Old Wolf said:
He's opening the file in text mode, so this method won't work.

To be painfully precise, it's not *guaranteed* to work. (It actually
will work on some systems -- which is probably worse than not working
at all.)
 
Keith Thompson

Chris Torek said:
Not *all*: open() with O_CREAT|O_EXCL, for instance, does not.
Indeed, this is the *only* way to guarantee that a file that did
not exist before the call is created by the call and does exist
after the call.
[...]

Not entirely topical, but ...

According to the Linux man page for open(2):

O_EXCL is broken on NFS file systems, programs which rely on it
for performing locking tasks will contain a race condition.

It suggests an alternative, but it's even more off-topic than O_EXCL,
so I'll leave it to anyone interested in the details to look it up.
 
Chris Torek

(More off-topic drift; I plan to stop adding to it after this :) )
Chris Torek said:
... open() with O_CREAT|O_EXCL, for instance, [avoids certain races on
POSIX systems].

Not entirely topical, but ...

According to the Linux man page for open(2):

O_EXCL is broken on NFS file systems, programs which rely on it
for performing locking tasks will contain a race condition.

This is (at least theoretically) fixed in modern versions of NFS,
which use a "create verifier" to make the "idempotent but unrepeatable
due to O_EXCL" operation repeatable (hence "really" idempotent).

The client and server both have to support at least v3, and both
have to implement create-verifiers. Not all NFS-es do so, alas.
 
Clever Monkey

Keith said:
There's a big difference between supporting the stat() function as
defined by the POSIX standard, and having "stat" like functionality.
If Windows has a function that's similar to stat(), but that has a
different name and/or different semantics, then that function can't be
used in portable code. ftell() can (though there are limits to what's
guaranteed for ftell()).

And there are systems other than BSD, MacOS, Linux, and Windows. Does
OpenVMS support stat()? What about IBM's various mainframe operating
systems? (Those are rhetorical questions, BTW.)

If we were looking for an answer, I can supply one. Many IBM mainframes
only have stat() and other POSIX (or POSIX-like) routines if they are
supplied as an add-on. IBM has such routines for the 390, at least in
part, because a company I worked for supplied them.

Sometimes a non-portable solution is the best one, but it's important
to know what's part of the C standard and what isn't, and to know just
what tradeoffs you're making.

Point taken.
 
Joe Wright

Olivier said:
Dear all,

I thought the code
-----------------------------
pt_fichier_probleme = fopen(nom_fichier, "w");

if (pt_fichier_probleme == NULL) {
    message_warning_s("Erreur l'ouverture du fichier\n%s\n", (gchar *)nom_fichier);
    return;
} else {
    rewind(pt_fichier_probleme); /* Be sure we're at beginning */
    if (feof(pt_fichier_probleme) == 0) {
        /* We are not at end of buffer ... It means the
           file already has some content!! */
        if (AskConfirmation(user_data) == 0) {
            /* The user does not want us to write to the file! */
            fclose(pt_fichier_probleme);
            return;
        }
    }
}
-----------------------------

was an excellent way of
-- opening the file nom_fichier for writing,
-- detecting a mistake if it was not possible,
-- if the file was not empty, then asking the
user whether they still want to overwrite it.
(that's AskConfirmation : a window with the question and so on)
As it turns out, confirmation is always asked :-(

Help?
Best !
Amities,
Olivier

long sof(FILE *f) {
    long end, her = ftell(f);
    fseek(f, 0, SEEK_END);
    end = ftell(f);
    fseek(f, her, SEEK_SET);
    return end;
}
 
Walter Roberson

Joe Wright said:
long sof(FILE *f) {
    long end, her = ftell(f);
    fseek(f, 0, SEEK_END);
    end = ftell(f);
    fseek(f, her, SEEK_SET);
    return end;
}

Small problems:

1) fseek() undoes any ungetc()

2) You haven't really defined what the return value means.
If f is a text stream, then the value returned by ftell() is
opaque. Looking at the man page I have handy, I can't tell if
it is even certain to be non-zero for non-empty files

3) You don't check the return value from fseek(SEEK_END) so you
are not certain that the fseek() has succeeded. Your value
"end" might be the same as your value "her", rather than reflecting
the position of the end of file or signalling that the
position is not meaningful (e.g., for a device).
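
A variant that addresses points 2 and 3 might look like the sketch below. The name stream_size is hypothetical, and it assumes a stream opened in binary mode on a system where SEEK_END is meaningful for such streams (the C standard does not require that). It does nothing about point 1: any ungetc() push-back is still lost.

```c
#include <stdio.h>

/* Hypothetical variant of sof(): size in bytes of a binary stream,
 * or -1L if the position cannot be determined or restored. */
long stream_size(FILE *f)
{
    long here, end;

    here = ftell(f);                    /* remember where we were */
    if (here == -1L)
        return -1L;
    if (fseek(f, 0L, SEEK_END) != 0)    /* fails for pipes, terminals, ... */
        return -1L;
    end = ftell(f);                     /* -1L on failure, passed through */
    if (fseek(f, here, SEEK_SET) != 0)  /* put the position back */
        return -1L;
    return end;
}
```

For a text stream the returned number remains opaque, exactly as point 2 says.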
 
ritesh

While going through this thread and writing some of my own code I
noticed this -

1. Suppose I create a text file. Write some data to it. Then close
the FILE* pointer to this file.
2. Since I have the path for the file (as char *), I can always open
the file in append mode to write more data to it.
3. What if I want to open the file in "edit" mode, and need to insert
text at the beginning of the file?

If I open using -
filePtr = fopen(filePath, "wt");
then the file is truncated to zero length and I lose the previous
data.

If I open using
filePtr = fopen(filePath, "at");
then the file pointer is placed at the end of the file. If I try to
do this -
rewind(filePtr);
now the filePtr moves to the end of the file, since it was opened in append
mode, and it doesn't allow me to move back anymore.

I guess the only option left is to -
long int length = someFunctionToGetLengthOfFile(filePtr);
fseek(filePtr, -length, SEEK_SET);

Could someone please tell me which function would give me the length of
the text file? Portable solutions would be best, non-portable would do
as long as they work on Linux, Unix and Solaris.

Thanks,
Ritesh
 
Keith Thompson

ritesh said:
While going through this thread and writing some of my own code I
noticed this -

1. Suppose I create a text file. Write some data to it. Then close
the FILE* pointer to this file.
2. Since I have the path for the file (as char *), I can always open
the file in append mode to write more data to it.
3. What if I want to open the file in "edit" mode, and need to insert
text at the beginning of the file?

There's no standard way to do that, and very likely no system-specific
way to do it directly. The usual way to do that kind of thing is to
create a new file, write the new data to it, then copy the old file,
then rename the new file to the old file's name.
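
That new-file-then-rename approach could be sketched as follows. The helper name prepend_text and the caller-supplied tmp_path are assumptions for illustration. ISO C leaves rename() onto an existing name implementation-defined; on POSIX systems it replaces the old file.

```c
#include <stdio.h>

/* Hypothetical helper: put `text` in front of the contents of `path` by
 * writing a new file at `tmp_path` and renaming it over the original.
 * Returns 0 on success, -1 on failure (the original is left untouched). */
int prepend_text(const char *path, const char *tmp_path, const char *text)
{
    FILE *in, *out;
    int c, ok;

    in = fopen(path, "r");
    if (in == NULL)
        return -1;
    out = fopen(tmp_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    fputs(text, out);                 /* new data first ... */
    while ((c = getc(in)) != EOF)     /* ... then a copy of the old file */
        putc(c, out);

    ok = !ferror(in) && !ferror(out);
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    /* Assumption: rename() may replace an existing file (true on POSIX). */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Because the original is only replaced at the very end, a failure partway through leaves the old data intact.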

If I open using -
filePtr = fopen(filePath, "wt");
then the file is truncated to zero length and I lose the previous
data.

If I open using
filePtr = fopen(filePath, "at");
then the file pointer is placed at the end of the file. If I try to
do this -
rewind(filePtr);
now the filePtr moves to the end of the file, since it was opened in append
mode, and it doesn't allow me to move back anymore.

Both "wt" and "at" are non-standard. The only standard mode arguments
for fopen() are:

r w a
rb wb ab
r+ w+ a+
r+b w+b a+b (or rb+ wb+ ab+)

I guess the only option left is to -
long int length = someFunctionToGetLengthOfFile(filePtr);
fseek(filePtr, -length, SEEK_SET);

Could someone please tell me which function would give me the length of
the text file? Portable solutions would be best, non-portable would do
as long as they work on Linux, Unix and Solaris.

The only portable way to get the length of a text file is to read the
whole file and count the characters. (On some systems, this won't be
the physical size of the file; for example, on Windows each
two-character CR LF end-of-line marker is translated to a single '\n'
-- but only in text mode.)
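
The read-and-count approach is short enough to sketch in standard C; the function name count_text_chars is made up for this example, and it assumes the count fits in a long:

```c
#include <stdio.h>

/* Hypothetical helper: number of characters a program sees when reading
 * `path` in text mode, or -1L if the file cannot be opened or read. */
long count_text_chars(const char *path)
{
    FILE *f = fopen(path, "r");   /* text mode: line endings are translated */
    long n = 0;
    int c;

    if (f == NULL)
        return -1L;
    while ((c = getc(f)) != EOF)
        n++;
    if (ferror(f))                /* EOF was caused by an error, not end-of-file */
        n = -1L;
    fclose(f);
    return n;
}
```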

Non-portably, you might be able to use fseek() to seek to the end of
the file, then ftell() to find out what the offset is. There's no
guarantee that the result of ftell() is meaningful for a text file
(other than as an argument to fseek()), <OT>but it should give you a
simple byte count on Unix-like systems.</OT>

<OT>For Unix-like systems, see also the stat() function</OT>

However, knowing the length of a text file doesn't help you insert
text at the beginning of it. You'll still need to create a new file;
there's no way to shift the existing data.
 
Michael Mair

ritesh said:
While going through this thread and writing some of my own code I
noticed this -

1. Suppose I create a text file. Write some data to it. Then close
the FILE* pointer to this file.
2. Since I have the path for the file (as char *), I can always open
the file in append mode to write more data to it.
3. What if I want to open the file in "edit" mode, and need to insert
text at the beginning of the file?

Then I'd rather open it in "r+" mode.
Note that you cannot "insert" text but that you have to read
all the file into a buffer, write the part to be inserted into
the file and then the buffer contents.
If you want to overwrite something of the same length, then
this of course is possible.
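
A rough sketch of that "r+" approach, under the assumption that the whole file fits in memory; the name prepend_in_place is hypothetical. If a write fails partway through, the file is left damaged, which is the price of not using a second file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of `path` into a buffer, then rewrite the
 * file as `text` followed by the buffered contents.
 * Returns 0 on success, -1 on failure. */
int prepend_in_place(const char *path, const char *text)
{
    FILE *f = fopen(path, "r+");   /* update mode: no truncation */
    char *buf = NULL, *tmp;
    size_t len = 0, cap = 0;
    int c, ok;

    if (f == NULL)
        return -1;
    while ((c = getc(f)) != EOF) {
        if (len == cap) {          /* grow the buffer geometrically */
            cap = cap ? cap * 2 : 4096;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                fclose(f);
                return -1;
            }
            buf = tmp;
        }
        buf[len++] = (char)c;
    }
    if (ferror(f)) {
        free(buf);
        fclose(f);
        return -1;
    }

    rewind(f);                     /* reposition before switching to writing */
    fputs(text, f);
    if (len > 0)
        fwrite(buf, 1, len, f);
    ok = !ferror(f);
    free(buf);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```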

If I open using -
filePtr = fopen(filePath, "wt");

There is no 't' in standard C. You can leave it out.

then the file is truncated to zero length and I lose the previous
data.

If I open using
filePtr = fopen(filePath, "at");
then the file pointer is placed at the end of the file. If I try to
do this -
rewind(filePtr);
now the filePtr moves to the end of the file, since it was opened in append
mode, and it doesn't allow me to move back anymore.

Nonsense. Read your C standard library reference of choice.
rewind() is in terms of positioning equivalent to
(void)fseek(stream, 0L, SEEK_SET)

I guess the only option left is to -
long int length = someFunctionToGetLengthOfFile(filePtr);
fseek(filePtr, -length, SEEK_SET);

Not at all. Read past threads on this issue.

Could someone please tell me which function would give me the length of
the text file?

C works with streams which are not only files.
There is no standard C function to do so as a stream could
have "infinite" length and/or could not be evaluated for length
as this would discard the actual "content".

Portable solutions would be best, non-portable would do
as long as they work on Linux, Unix and Solaris.

Have a look at POSIX functions and/or ask in comp.unix.programmer.


Cheers
Michael
 
av

What happens if someone else appends data after you fseek()?

I don't think that the question has an answer.

The size of a file is a function of time.
For example, finding the size of a file at time t, allocating that much
memory with malloc(), and then loading the file into that memory
could be a bug if the file has grown by time t+1.
 
Gordon Burditt

1. Suppose I create a text file. Write some data to it. Then close
the FILE* pointer to this file.
2. Since I have the path for the file (as char *), I can always open
the file in append mode to write more data to it.
3. What if I want to open the file in "edit" mode, and need to insert
text at the beginning of the file?

There is no mode to write text to the beginning of the file *AND SHOVE
THE REST OF IT TO THE BACK OF THE FILE*. If you write a line of
50 characters, you overwrite the first 50 or so [*] bytes of the
file. If you want to prepend text to the whole file, read the file,
rewind, write the new text, then write all the old stuff.

[*] Due to differences in line endings, a line might occupy more
bytes in the file than it might appear. Consider Windows and
MS-DOS \r\n line endings.

There is no mode "wt" or "at" in standard C. Not very many systems
support on-the-fly tab expansion and compression or translation to
or from Turkish, and hopefully none of them support thermonuclear
file writes.

I think the mode you are looking for is "r+" (not "r+t"), which does
not truncate the file, but will permit fseeking and writing.

If I open using -
filePtr = fopen(filePath, "wt");
then the file is truncated to zero length and I lose the previous
data.

If I open using
filePtr = fopen(filePath, "at");
then the file pointer is placed at the end of the file. If I try to
do this -
rewind(filePtr);
now the filePtr moves to the end of the file, since it was opened in append
mode, and it doesn't allow me to move back anymore.

I guess the only option left is to -
long int length = someFunctionToGetLengthOfFile(filePtr);
fseek(filePtr, -length, SEEK_SET);

If you're thinking of doing that after opening the file in "a" mode,
it may not work. All writes may be forced to the end of the file.
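
A small experiment shows the effect; write_after_rewind is a name invented for this sketch. The C standard says writes to a stream opened in append mode go to the current end of file regardless of intervening fseek() calls, so the rewind() below should not change where the text lands.

```c
#include <stdio.h>

/* Hypothetical demo: open `path` in append mode, rewind, then write `text`.
 * Returns 0 on success, -1 on failure. */
int write_after_rewind(const char *path, const char *text)
{
    FILE *f = fopen(path, "a");

    if (f == NULL)
        return -1;
    rewind(f);                     /* has no effect on where writes go */
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Reading the file back afterwards should show the new text after the old contents, not in front of them.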
Could someone please tell me which function would give me the length of
the text file? Portable solutions would be best, non-portable would do
as long as they work on Linux, Unix and Solaris.

Non-standard: after fseek(filePtr, 0L, SEEK_END), the return value of
ftell() may be the current length of the file if the seek
offset is counted in bytes (rather than, say, bytes, sectors, tracks,
cylinders, and trains, packed into one integer in bitfields). This
should work on UNIX-like systems. It may get confusing on systems
with \r\n line endings.

Non-standard: Use fstat(fileno(filePtr), &statbuffer). This should
work on UNIX-like systems.
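
As a sketch of the fstat() route (POSIX only; the helper name stream_file_size is an assumption): data still sitting in the stdio buffer is not counted, so a stream being written should be flushed with fflush() first.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size in bytes of the file behind `f` as reported
 * by the operating system, or (off_t)-1 on failure. */
off_t stream_file_size(FILE *f)
{
    struct stat statbuffer;

    /* fileno() gives the descriptor underlying the stdio stream. */
    if (fstat(fileno(f), &statbuffer) != 0)
        return (off_t)-1;
    return statbuffer.st_size;
}
```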

Gordon L. Burditt
 
