The Best Buffer Size for IO process

Buzz Lightyear

Hi, guys,
If I want to write a large block of data to a disk file at maximum
speed, what's the best way to do it?

It looks like choosing a proper block size for the writes is a key factor.

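#include <cassert>
#include <cstdio>

// BLOCK_SIZE is assumed to be defined elsewhere; its value is the open question.
int writeFile(char* data, unsigned int size)
{
    FILE* pFile = fopen("myfile.bin", "wb");
    if (!pFile)
        return -1;  // could not open the output file

    unsigned int offset(0);

    assert(size > BLOCK_SIZE);

    char* targetData(data);

    // Write full blocks only while at least one whole block remains;
    // looping on (offset < size) would read past the end of the buffer.
    while (size - offset >= BLOCK_SIZE)
    {
        fwrite(targetData, sizeof(char), BLOCK_SIZE, pFile);
        offset += BLOCK_SIZE;
        targetData += BLOCK_SIZE;
    }

    // Write the remaining partial block, if any.
    if (offset < size)
        fwrite(targetData, sizeof(char), size - offset, pFile);

    fclose(pFile);
    return 0;
}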


So how should I choose a proper BLOCK_SIZE: 1 MB, 10 MB, or 4 KB?

Thanks!
 
Ian Collins

Buzz said:
So how should I choose a proper BLOCK_SIZE: 1 MB, 10 MB, or 4 KB?

Measure, but it probably won't make a great deal of difference. There
will be buffering in the library and I/O subsystem.
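
A rough way to measure is sketched below. It is only an illustration, not code from this thread: the function name timeBlockedWrite and its parameters are made up for the example, and it times fwrite() plus fclose() only, so data may still be sitting in the kernel's cache when the clock stops.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes `data` to `path` in chunks of `blockSize` bytes
// and returns the elapsed wall-clock time in seconds, or -1.0 on failure.
double timeBlockedWrite(const char* path, const std::vector<char>& data,
                        std::size_t blockSize)
{
    assert(blockSize > 0);
    std::FILE* file = std::fopen(path, "wb");
    if (!file)
        return -1.0;

    const auto start = std::chrono::steady_clock::now();
    std::size_t offset = 0;
    bool ok = true;
    while (ok && offset < data.size())
    {
        // Clamp the final chunk so we never read past the end of the buffer.
        const std::size_t chunk = std::min(blockSize, data.size() - offset);
        ok = std::fwrite(data.data() + offset, 1, chunk, file) == chunk;
        offset += chunk;
    }
    ok = (std::fclose(file) == 0) && ok;

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return ok ? elapsed.count() : -1.0;
}
```

Calling it on the same buffer with block sizes of 4 * 1024, 1024 * 1024 and 10 * 1024 * 1024 gives numbers you can compare. Repeat each run several times, because caching can skew a single measurement.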
 
Buzz

Ian Collins said:
Measure, but it probably won't make a great deal of difference. There
will be buffering in the library and I/O subsystem.

So application code can't do much to tune I/O performance?

Thanks.
 
Stuart Redmann

[snipped code that writes block of data to file]

So application code can't do much to tune I/O performance?

It is safe to assume that the underlying driver takes care of suitable
buffering. You can run some tests to see whether your block size matters
at all. However, I/O performance tuning is off-topic in this newsgroup,
so you'd better find a group that deals with such hardware- and
platform-specific questions.

Regards,
Stuart
 
Thomas J. Gritzan

1) Why can't you simply write all of it in a single call?

2) Why do you write C-style code in a C++ newsgroup?
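
Both points can be illustrated with a minimal sketch that writes the whole buffer in one call through a C++ stream. The function name writeFileAtOnce and its parameters are invented for this example; it makes no claim about being faster than the blocked version.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>

// Hypothetical helper: writes the whole buffer with a single call and leaves
// the buffering to the stream and the OS. Returns true on success.
bool writeFileAtOnce(const char* path, const char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(data, static_cast<std::streamsize>(size));
    out.close();  // flushes; a failed flush or close sets the fail bit
    return !out.fail();
}
```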
 
Maxim Yegorushkin

Hi, guys,
If I want to write a large block of data into disk file with maximum
speed, what's the best way to do it.

It depends on your definition of speed. Speed may mean that writeFile()
returns as quickly as possible (low latency). Or it may mean that the
data gets written to disk as quickly as possible (low delay).

In the first case, a popular approach is to do asynchronous writes by
letting another thread do the writing. That is, writeFile() tells another
thread to write the data and returns. The POSIX-standard aio_write()
function can be used for that:
http://www.opengroup.org/onlinepubs/000095399/functions/aio_write.html
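
The "hand the data to another thread and return" idea can be sketched in standard C++ as follows. Note that this uses std::async rather than aio_write(), purely to keep the example portable; the function name writeFileAsync and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: starts writing `data` to `path` on another thread and
// returns immediately. The buffer is moved into the task so it stays alive
// until the write finishes. The future yields true if every byte was written
// and the file was closed cleanly.
std::future<bool> writeFileAsync(std::string path, std::vector<char> data)
{
    assert(!path.empty());
    return std::async(std::launch::async,
        [path = std::move(path), data = std::move(data)]() {
            std::FILE* file = std::fopen(path.c_str(), "wb");
            if (!file)
                return false;
            const std::size_t written =
                std::fwrite(data.data(), 1, data.size(), file);
            const bool closed = std::fclose(file) == 0;
            return closed && written == data.size();
        });
}
```

The caller should keep the returned future and call get() on it later. A future obtained from std::async waits for the task in its destructor, so discarding it right away makes the call effectively synchronous.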

In the second case, you may want to call the operating system's file
write function directly, such as the POSIX-standard write(). This avoids
buffering in user space, although the kernel normally caches the data and
does not write it to disk immediately. If you also want to make sure that
the data has actually hit the disk, force a sync operation after write().
The POSIX-standard fsync() does that:
http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html
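
A minimal sketch of the write() plus fsync() combination, assuming a POSIX system. The function name writeFileSynced and the 0644 file mode are choices made for this example only.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: writes `size` bytes with POSIX write(), then calls
// fsync() so the data is pushed to the storage device before returning.
// Returns 0 on success, -1 on error.
int writeFileSynced(const char* path, const char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    const int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    std::size_t offset = 0;
    while (offset < size)
    {
        // write() may transfer fewer bytes than requested, so loop.
        const ssize_t n = write(fd, data + offset, size - offset);
        if (n == -1)
        {
            if (errno == EINTR)
                continue;  // interrupted by a signal before writing; retry
            close(fd);
            return -1;
        }
        offset += static_cast<std::size_t>(n);
    }

    const int syncResult = fsync(fd);
    const int closeResult = close(fd);
    return (syncResult == 0 && closeResult == 0) ? 0 : -1;
}
```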

The best of both worlds is to use memory-mapped files. This way, writing
to the file requires no calls at all, only writing to the memory where
the file is mapped (plus a call to msync() to force the kernel to actually
write the data to disk, if necessary). The drawback is that growing a
memory-mapped file is troublesome, although on Linux the mremap() system
call may help.
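
The memory-mapped approach might look roughly like this on a POSIX system. It is a sketch for a file whose final size is known up front, which sidesteps the growth problem; the function name writeFileMapped and the 0644 file mode are assumptions of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: creates `path` with exactly `size` bytes and fills it
// by copying into a shared mapping. Returns 0 on success, -1 on error.
int writeFileMapped(const char* path, const char* data, std::size_t size)
{
    assert(data != nullptr && size > 0);  // mmap() rejects a zero length
    // O_RDWR: a shared writable mapping needs a readable, writable descriptor.
    const int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    // The file must have its final size before it is mapped.
    if (ftruncate(fd, static_cast<off_t>(size)) == -1)
    {
        close(fd);
        return -1;
    }

    void* mapping =
        mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (mapping == MAP_FAILED)
        return -1;

    std::memcpy(mapping, data, size);  // "writing the file" is a memory copy

    // MS_SYNC blocks until the modified pages have been written out.
    const int syncResult = msync(mapping, size, MS_SYNC);
    const int unmapResult = munmap(mapping, size);
    return (syncResult == 0 && unmapResult == 0) ? 0 : -1;
}
```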
 
Buzz Lightyear

Thanks. I think I get the point. Memory mapping might be the best
choice.

P.S. I also ran some tests and found that write performance was best
with BLOCK_SIZE set to 10 MB.
 
