Strange Behaviour in finding Size of a File


Nick Keighley

This method was written to create a new log file when the size of the current log file reaches a maximum defined by the user [10MB in our case]. Here is the code snippet that does this check:
//-- Code starts here : --
static size_t LogSize = 1048576;

     Ah. This is obviously some strange usage of "10 MB" that
I hadn't previously been aware of.
It is strange that the condition got satisfied when results.st_size = 2589116.

     Not all *that* strange ...

this is why you don't put numbers like that in your code

#define K 1024
#define M (K * K)
#define LOG_SIZE (10 * M)
 

Ben Bacarisse

Nick Keighley said:
this is why you don't put numbers like that in your code

#define K 1024
#define M (K * K)
#define LOG_SIZE (10 * M)

I'd put an 'L' suffix on the 1024 since 10 * M is outside the minimum
range of plain int arithmetic.
 

Eric Sosman

This method was written to create a new log file when the size of the current log file reaches a maximum defined by the user [10MB in our case]. Here is the code snippet that does this check:
//-- Code starts here : --
static size_t LogSize = 1048576;

Ah. This is obviously some strange usage of "10 MB" that
I hadn't previously been aware of.
It is strange that the condition got satisfied when results.st_size = 2589116.

Not all *that* strange ...

this is why you don't put numbers like that in your code

#define K 1024
#define M (K * K)
#define LOG_SIZE (10 * M)

Ugh. If you simply *must* do that sort of thing, at least
use identifiers like KILO and MEGA (or KIBI and MIBI). Myself,
I'd prefer to write

#define LOG_SIZE (10L * 1024 * 1024)

.... with special emphasis on the "L".
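
To make the arithmetic concrete, here is a small sketch. The names ORIGINAL_LIMIT and exceeds() are made up for illustration; only LOG_SIZE comes from the post above. It shows why a 2589116-byte file trips the original check (1048576 is 1 MiB, not 10 MiB) and how the L suffix keeps the product in long, whose guaranteed range comfortably holds 10485760, whereas plain int is only guaranteed up to 32767.

```c
#include <assert.h>

/* Value from the original post: this is 1 MiB, not 10 MiB. */
#define ORIGINAL_LIMIT 1048576L

/* 10 MiB. The L suffix makes the whole product long arithmetic. */
#define LOG_SIZE (10L * 1024 * 1024)

/* Hypothetical helper: nonzero when `size` is larger than `limit`. */
static int exceeds(long size, long limit)
{
    assert(size >= 0);
    return size > limit;
}
```

With these definitions, exceeds(2589116L, ORIGINAL_LIMIT) is true while exceeds(2589116L, LOG_SIZE) is false, which matches the behaviour the OP reported.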
 

Ike Naar

[...]
(As I said earlier, C doesn't guarantee that the value returned by
ftell() is meaningful for text streams, but that's unlikely to be
an issue for the OP.)

If all the logging happens in one place (or a small
number of nearby places), and if it all happens during one
execution of the program, the O.P. can do the entire job in
purely portable C. Something like

static FILE *logStream;
static size_t logLength;

void writeLog(const char *format, ...) {
if (logStream == NULL) {
logStream = openLog(...);
logLength = 0;
}

va_list ap;
va_start(ap, format);
logLength += vfprintf(format, ap);

You mean
logLength += vfprintf(logStream, format, ap);
?
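
For reference, a completed version of that idea might look like the sketch below. It is only one possible reading of the fragment: the file name, the LOG_LIMIT macro, the int return value and the "close and reopen" policy are all assumptions, and fopen() stands in for the elided openLog(...) call. The counter tracks what vfprintf() reports as transmitted, so no stat() or ftell() call is needed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define LOG_LIMIT (10L * 1024 * 1024)   /* assumed limit: 10 MiB */

static FILE *logStream;
static long logLength;
static const char *logPath = "writelog_demo.log";   /* hypothetical name */

/* Returns the number of characters written, or a negative value on failure. */
int writeLog(const char *format, ...)
{
    va_list ap;
    int n;

    assert(format != NULL);

    if (logStream == NULL) {
        /* "w" truncates; a real rotation scheme would rename the old file first. */
        logStream = fopen(logPath, "w");
        if (logStream == NULL)
            return -1;
        logLength = 0;
    }

    va_start(ap, format);
    n = vfprintf(logStream, format, ap);   /* stream first, then format */
    va_end(ap);

    if (n < 0)
        return n;
    logLength += n;

    if (logLength >= LOG_LIMIT) {
        fclose(logStream);
        logStream = NULL;   /* the next call opens a fresh log */
    }
    return n;
}
```

On a text stream the count is in characters as seen by the program, which can differ from bytes on disk on systems that translate newlines.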
 

Ian Collins

I'd put an 'L' suffix on the 1024 since 10 * M is outside the minimum
range of plain int arithmetic.

Or just use

const size_t oneK = 1024;
const size_t oneM = oneK * 1024;
const size_t logSize = 10 * oneM;
 

Ben Bacarisse

Ian Collins said:
Or just use

const size_t oneK = 1024;
const size_t oneM = oneK * 1024;
const size_t logSize = 10 * oneM;

Check the newsgroup! That's more use in C++ than in C -- you can't put
these at file scope in C.
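
The difficulty in C is that a const object such as oneK is not a constant expression, so it cannot appear in the initializer of another file-scope object. A sketch of one way to keep the named constants at file scope in C is to build each initializer from literals only. This assumes size_t is at least 32 bits wide, which holds on any system that can have a 10 MB file but is not guaranteed by the standard.

```c
#include <assert.h>
#include <stddef.h>

/* Every initializer below is an arithmetic constant expression made of
   literals and a cast, so all three are valid at file scope in C. */
static const size_t oneK = 1024;
static const size_t oneM = (size_t)1024 * 1024;
static const size_t logSize = (size_t)10 * 1024 * 1024;
```

The cost is that the relationship between the three values is no longer expressed in terms of the names themselves, which is part of why macros remain common for this in C.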
 

Kenny McCormack

Try being a little less condescending and actually answering the
question.

Try being a little less condescending and actually helping people asking
for help instead of being pedantic (your words) on size units and
everything but the OP's problem.

That's Kiki. It wouldn't be Kiki if it didn't exhibit those sorts of
behaviors.
 

Phil Carmody

Ian Collins said:
I didn't understand it either. I've never seen the expression used in
the context of man pages.

Ditto, clearly our presumably-more-than-two decades of unix experience
are all for nought. And it's not even being compact, as the expression
used is way more characters than the *absolutely standard* notation
which Keith followed up with, viz. ftell(2).

Looks like we've got a new level 2 usenet poster, IYKWIM...

Phil
 

felix

This method was written to create a new log file when the size of the current log file reaches a maximum defined by the user [10MB in our case]. Here is the code snippet that does this check:
//-- Code starts here : --
static size_t LogSize = 1048576;
bool CreateNewLogs = false;


if ( stat ( logFile, &results ) == 0 )



That is presumably the POSIX stat() function, or something similar? If
so, its behavior is defined by the POSIX standard, not the C standard,
and you'll get better answers to your questions in comp.unix.programmer
than in this newsgroup.


    if ( results.st_size > LogSize )
        CreateNewLogs = true;


//-- Code ends here : --

It is strange that the condition got satisfied when results.st_size = 2589116.
And we are sure that each write operation adds between 50 and 100 bytes. This check is done before writing into the log file.



Keep in mind that file I/O is normally buffered, so the buffer size is
more relevant than the size of your individual writes. Still, that seems
to be a rather large jump to explain by buffering.


I am not sure if we are missing anything in our understanding of the stat function. Any input or pointers in this regard would be really helpful.



The people in comp.unix.programmer may need to know more details about
how data is written to the file, and whether or not you've used any
POSIX functions to change the file mode.

Just to get a better idea of what's going on, I'd recommend reporting
the file size somewhere (probably in a separate log file) every time you
call stat().
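
A small wrapper along these lines would do for that kind of reporting. It is a sketch only: report_size() and the diag stream are names invented here, and stat() is the POSIX function, not part of standard C.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>   /* POSIX, not ISO C */

/* Calls stat() on `path`, writes the result to `diag`, and returns the
   reported size in bytes, or -1 if stat() fails. */
long report_size(const char *path, FILE *diag)
{
    struct stat results;

    assert(path != NULL && diag != NULL);

    if (stat(path, &results) != 0) {
        fprintf(diag, "stat(%s) failed\n", path);
        return -1;
    }
    fprintf(diag, "%s: st_size = %ld\n", path, (long)results.st_size);
    return (long)results.st_size;
}
```

Passing a separately opened diagnostics file (or stderr) as diag keeps these reports out of the log whose size is being measured.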


Thanks a lot, James and everyone else who helped me understand this problem.

As said above, the file I/O is buffered. The file size was not getting updated after each fwrite(). We tried calling fflush() after writing the data into the log files, and now we see the correct size of the file.

We are trying to understand ftell()/fseek() also as suggested.

And thanks for suggestions regarding the constant - LogSize.
 

James Kuyper

On Friday, 9 November 2012 17:26:28 UTC+5:30, James Kuyper wrote: ....
As said above, the file I/O is buffered. The file size was not
getting updated after each fwrite(). We tried calling fflush() after
writing the data into the log files, and now we see the correct size of
the file.

Note: the size that you saw before was also the correct size. That is -
it was the actual size currently taken up by the file on disk. It didn't
include the part that was still in the buffers, and had not yet been
written to the disk (that explanation is somewhat incomplete, but it's
good enough for this context). That might not have been the quantity you
wanted to know, but that doesn't make it incorrect.
We are trying to understand ftell()/fseek() also as suggested.

If fflush() caused stat() to give you the size that you wanted, calling
ftell() is almost certainly a more efficient way of getting that size
information - at least if you can call it from within the same program
that's writing to the log file. According to the C standard, this is
only guaranteed if the file is opened in binary mode, but your use of
stat() suggests that you may be using a Unix-like system, in which case
text mode is indistinguishable from binary mode.

Your wording implies that you're having trouble understanding ftell(). Can
you explain the nature of your confusion?
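
The difference between the two numbers can be seen with a short experiment. The sketch below assumes a POSIX system; compare_sizes() and its parameter names are invented for illustration. It writes ten bytes, then records what ftell() says, what stat() says before any flush, and what stat() says after fflush().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>   /* POSIX, not ISO C */

/* Creates `path`, writes 10 bytes, and stores three size readings.
   Returns 0 on success, -1 on any failure. */
int compare_sizes(const char *path, long *by_ftell,
                  long *by_stat_before, long *by_stat_after)
{
    struct stat st;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    if (fputs("0123456789", fp) == EOF)   /* likely still in the stdio buffer */
        goto fail;
    *by_ftell = ftell(fp);                /* counts buffered data too */

    if (stat(path, &st) != 0)
        goto fail;
    *by_stat_before = (long)st.st_size;   /* only what has reached the file */

    if (fflush(fp) != 0 || stat(path, &st) != 0)
        goto fail;
    *by_stat_after = (long)st.st_size;

    return fclose(fp) == 0 ? 0 : -1;

fail:
    fclose(fp);
    return -1;
}
```

On a typical implementation with default buffering the pre-flush stat() reading is smaller than the ftell() reading, but the exact value depends on the library's buffering policy, so treat it as an observation and not a guarantee.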
 

felix

Note: the size that you saw before was also the correct size. That is -
it was the actual size currently taken up by the file on disk. It didn't
include the part that was still in the buffers, and had not yet been
written to the disk (that explanation is somewhat incomplete, but it's
good enough for this context). That might not have been the quantity you
wanted to know, but that doesn't make it incorrect.

If fflush() caused stat() to give you the size that you wanted, calling
ftell() is almost certainly a more efficient way of getting that size
information - at least if you can call it from within the same program
that's writing to the log file. According to the C standard, this is
only guaranteed if the file is opened in binary mode, but your use of
stat() suggests that you may be using a Unix-like system, in which case
text mode is indistinguishable from binary mode.

Your wording implies that you're having trouble understanding ftell(). Can
you explain the nature of your confusion?

First, sorry for the delay in my response.

James, you did explain the information that we were looking for.
We were trying to understand which of those subroutines (ftell(3) or fflush(3)) would be more effective when used within the code. We wanted to make sure that we knew why we are using what we are using.

Once again, thank you James.

Thanks and Regards,
Felix
 

James Kuyper

We were trying to understand which of those subroutines (ftell(3) or
fflush(3)) would be more effective when used within the code. We
wanted to make sure that we knew why we are using what we are using.

That's trivial: ftell() just retrieves data that's stored in the FILE
object (conceptually, at least - implementation details may differ) -
it's very quick. fflush() forces the buffers to be flushed earlier than
they otherwise would be. Assuming a reasonably competent implementation
of the standard I/O library (if you can't make that assumption, change
which library you're using), the flushing policy is presumably close to
optimum, and randomly interfering with that policy will cause a
significant reduction in I/O efficiency. You should interfere with the
buffering only when you have to - and in this case, you don't have to.
Also, ftell(), if executed immediately after the preceding write to the
file, will (on a binary stream, at least) return exactly the quantity
that you're interested in. Calling fflush() improves the relevance of
the numbers you can get from stat(), but it's still not guaranteed to
give you the same number that ftell() would give you.
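
As a sketch of what the ftell()-based check could look like (log_is_full() is a hypothetical name, and the limit is whatever the caller chooses), the test reduces to a few lines with no fflush() and no stat():

```c
#include <assert.h>
#include <stdio.h>

/* Call right after a write to `fp`. Returns 1 if the current position
   has reached `limit` bytes, 0 if not, and -1 if ftell() fails. */
int log_is_full(FILE *fp, long limit)
{
    long pos;

    assert(fp != NULL);
    pos = ftell(fp);   /* includes data still sitting in the buffer */
    if (pos < 0)
        return -1;
    return pos >= limit;
}
```

The position is only meaningful as a size if the stream is positioned at the end of the file, which is the case right after appending a record to it.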
 
