ifstream buffer size conversion from size_t to std::streamsize --> Is this OK?

Notebooker

Hello,

I'm an intermediate noob reading in data from an ASCII file using an
ifstream object.

I have a C-style string buffer whose size is of type size_t, and I
pass that size to read() as the number of characters to read. The issue
is that read() expects the character count to be of type
std::streamsize, which is apparently a signed int. My buffer size, being
of type size_t, is an unsigned int.

I am getting the following compile-time warning in MSVC++ 2005 EE:

warning C4267: 'argument' : conversion from 'size_t' to
'std::streamsize', possible loss of data

1. What are the implications of this down-cast in this case? My guess
is that I will process the buffer thinking I have read in my specified
size when in fact I have read in only up to the maximum number allowable
by signed int.

2. If I want to read in as much data as possible in one shot, is my
only solution in this case to define the length of the _buffer array
using a signed int?



CODE SNIPPET FOLLOWS:

#include <fstream>
#include <string>
.. . .

// somewhere ...
size_t _nSizeBuf = (int) ( 1024 / sizeof(char) ); // Probably not big enough to cause a problem.
char* _buffer = new char[_nSizeBuf];
std::string _sPathFileName = "C:\\temp.txt"; // backslash escaped; "\t" would be a tab character

.. . .

void myFunction() {

    std::ifstream inStream;
    inStream.open( _sPathFileName.c_str() );
    if( inStream )
    {
        // read() expects the 2nd argument to be signed int;
        // however, _nSizeBuf is unsigned int.
        inStream.read( _buffer, _nSizeBuf );
    }

}


Thanks for any insight!

- direction40
 
Jim Langston

Notebooker said:
1. What are the implications of this down-cast in this case? My guess
is that I will process the buffer thinking I have read-in my specified
size when in fact I have read-in upto only the maximum number allowable
by signed int.

The maximum number of chars to read should be quite large for a signed int,
depending on your implementation. For a 4 byte, 8 bit signed int this would
be 2,147,483,647 characters. As long as your specified size isn't over
2 billion (on a system with 4 byte/8 bit ints) there won't be a problem.

2. If I want to read-in as much data as possible in one-shot, is my
only solution in this case to define the length of the _buffer array
using a signed int?

Actually, an unsigned int can store a number larger than a signed int. For
our 4 byte/8 bit systems, an unsigned int can store a value up to
4,294,967,295. Just make sure you don't specify a number larger than the
signed type can hold, otherwise the converted value will typically wrap
around and become a negative number, which would cause problems (unknown
what .read() would do with a negative value).
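
The limits quoted above can be checked at compile time. This is only a small sketch; the fixed-width types std::int32_t and std::uint32_t stand in for the "4 byte/8 bit" int and unsigned int discussed here, and the constant names are made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical names; the fixed-width types model a 32-bit int / unsigned int.
constexpr std::int32_t  kMaxSigned32   = std::numeric_limits<std::int32_t>::max();
constexpr std::uint32_t kMaxUnsigned32 = std::numeric_limits<std::uint32_t>::max();

// 2^31 - 1 and 2^32 - 1 respectively.
static_assert(kMaxSigned32 == 2147483647, "largest 32-bit signed value");
static_assert(kMaxUnsigned32 == 4294967295u, "largest 32-bit unsigned value");
```

On a real platform, std::numeric_limits<std::streamsize>::max() gives the actual upper bound for read(), whatever width std::streamsize happens to have.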

In your code, there won't be a problem, 1024 is quite a bit smaller by a
number of magnitudes than 2 billion. You can, if you desire, make this
warning go away:

    inStream.read( _buffer, static_cast<signed int>( _nSizeBuf ) );

or probably more preferred:

    inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

In your trivial code this won't be a problem. You can check for overflow if
you want, however, and you should if you will be reading large files or are
unsure of the value of _nSizeBuf. Something like:

    if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
        throw "Buffer Size overflowing std::streamsize!";
    else
        inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

It depends on how the code will be used, if you have control of the
buffersize or it's a user defined value, etc...

In practice, however, you can usually just do the static_cast without
worrying about overflow unless you define a very large buffer.
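
On question 1 (processing the buffer as if the full size had been read): whatever count is passed, read() may extract fewer characters than requested, for example at end of file, and gcount() reports how many it actually extracted. A minimal sketch, assuming a hypothetical helper named read_chunk that is not part of the original code:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical helper: reads up to `size` chars into `buffer` and returns the
// number of chars actually extracted. It assumes `size` fits in
// std::streamsize, which holds for small buffers such as the 1024 used here.
inline std::streamsize read_chunk(std::istream& in, char* buffer, std::size_t size) {
    in.read(buffer, static_cast<std::streamsize>(size));
    return in.gcount();  // may be less than `size`, e.g. at end of file
}
```

Processing only the first read_chunk(...) characters of the buffer avoids treating unread buffer contents as data.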
 
Kirit Sælensminde

I've been coming across this a lot in going through COM interfaces
where 32 bit integers are common, whereas many of the C++ types I use
are 64 bit.

The trick of doing the cast and seeing if the result is negative seems
a bit scary to me. I wouldn't go anywhere near that. Luckily there is a
better solution.

Try something like this (not done with help of compiler - expect the
usual muppetry):

if ( _nSizeBuf > static_cast< size_t >( std::numeric_limits< int >::max() ) )
{
    // Won't work - we will have an overflow
}
else
    inStream.read( _buffer, static_cast< int >( _nSizeBuf ) );


K
 
Erik Wikström

Is there a good reason not to use std::streamsize instead of size_t? By
using the same type as the library you get two things: first, you don't
get the warnings, and second, you can be sure never to get values out of
range.

size_t _nSizeBuf = (int) ( 1024 / sizeof(char) );

Just like to point out that sizeof(char) == 1, always.
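
The suggestion to keep the buffer size in the library's own type could look roughly like the sketch below. The names kBufSize and read_prefix are invented for illustration, and a std::istream is used so that the same function works for an ifstream or a string stream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Assumed buffer size, declared directly as std::streamsize so that no
// conversion is needed when calling read().
constexpr std::streamsize kBufSize = 1024;

// Hypothetical helper: reads at most kBufSize chars and returns them.
inline std::string read_prefix(std::istream& in) {
    char buffer[kBufSize];
    in.read(buffer, kBufSize);  // no size_t -> streamsize conversion, no warning
    // gcount() is never negative, so converting it to size_t is safe here.
    return std::string(buffer, static_cast<std::size_t>(in.gcount()));
}
```

The one remaining conversion goes the other way (streamsize to size_t), and it is applied to a value known to be non-negative and no larger than the buffer.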
 
Jacek Dziedzic

Jim said:
> [...]
In your trivial code this won't be a problem. You can check for overflow if
you want, however, and you should if you will be reading large files or are
unsure of the value of _nSizeBuf. Something like:

    if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
        throw "Buffer Size overflowing std::streamsize!";
    else
        inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

It depends on how the code will be used, if you have control of the
buffersize or it's a user defined value, etc...

I don't mean to be picky, but doesn't this only detect
half of the overflows, i.e. when a variable overflows,
it doesn't necessarily wrap to a negative value, right?

cheers,
- J.
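
The point about detecting only half of the overflows can be illustrated with a case where the source type is wider than the destination. The sketch below models a 64-bit size_t and a 32-bit signed target with fixed-width types; the function names are hypothetical. The result of an out-of-range conversion was implementation-defined before C++20, so the behavior described in the comments assumes the usual two's complement wrap-around (which C++20 guarantees).

```cpp
#include <cstdint>

// Models converting a 64-bit unsigned size to a 32-bit signed count.
inline std::int32_t narrow(std::uint64_t n) {
    return static_cast<std::int32_t>(n);
}

// The "cast, then test for negative" check proposed earlier in the thread.
inline bool negative_check_flags(std::uint64_t n) {
    return narrow(n) < 0;
}

// With wrap-around conversion:
//   2^31     -> negative, so the check catches it;
//   2^32 + 5 -> 5, a small positive value, so the check misses it.
```

So the check would let read() proceed with a count of 5 when roughly four billion characters were intended.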
 
Kai-Uwe Bux

Jim Langston wrote:

[snip]
In your trivial code this won't be a problem. You can check for overflow
if you want, however, and you should if you will be reading large files or
are unsure of the value of _nSizeBuf. Something like:

    if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
        throw "Buffer Size overflowing std::streamsize!";
    else
        inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

Hm: if _nSizeBuf is too large, the conversion cast has either undefined
behavior or at least implementation defined behavior. So when the test is
supposed to kick in it could theoretically fail. I would try to get by
without the cast:

if ( std::numeric_limits<std::streamsize>::max() < _nSizeBuf ) {
...

Now, the issue might be complicated by arithmetic conversions doing
something. Does anybody know how to get the blessings of the standard for
this kind of check? (I hate signed integer types.)


Best

Kai-Uwe Bux
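
One way to sidestep the arithmetic-conversion worry is to convert both operands explicitly to an unsigned type that is wide enough for either of them before comparing. The maximum of a signed type is non-negative, so converting it to std::uintmax_t preserves its value, and the comparison is then between two unsigned values of the same type. A sketch with hypothetical helper names:

```cpp
#include <cstddef>
#include <cstdint>
#include <ios>
#include <limits>

// True if the unsigned value n is representable in the signed type Signed.
// Both sides of the comparison are std::uintmax_t, so no implicit
// signed/unsigned conversion takes place.
template <typename Signed>
inline bool fits_in(std::uintmax_t n) {
    return n <= static_cast<std::uintmax_t>(std::numeric_limits<Signed>::max());
}

// Convenience wrapper for the case discussed in this thread.
inline bool fits_in_streamsize(std::size_t n) {
    return fits_in<std::streamsize>(n);
}
```

A caller would test fits_in_streamsize(_nSizeBuf) first and only then perform the static_cast. Since C++20 the standard library offers std::in_range<std::streamsize>(n) in <utility> for the same purpose.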
 
Notebooker

Thanks all for the great feedback.

Is the result of sizeof not platform / OS dependent? E.g., would a char
be 2 bytes on a 64-bit OS?

Originally I had the size of the buffer defined by a size_t because I
was using a non-dynamic array (no use of "new") and I had read that the
maximum size of an array was defined by a value of size_t. I guess I
interpreted that wrong.



What is a "4 byte / 8 bit" integer? Isn't 4 bytes on a 32-bit OS simply 32 bits?

I like the ideas for checking for overflow.

- direction40
 
Jerry Coffin

Is the result of sizeof not platform / OS dependent? E.g., would a char
be 2 bytes on a 64-bit OS?

No. "sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1;
the result of sizeof applied to any other fundamental type (3.9.1) is
implementation-defined." (§5.3.3/1).
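
The guarantee quoted above can be stated as a compile-time check, which also shows why dividing by sizeof(char) in the original snippet has no effect. The helper name bytes_for_chars is made up for this sketch.

```cpp
#include <cstddef>

// These hold on every conforming implementation, whatever the OS or word size.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1 && sizeof(unsigned char) == 1,
              "the same holds for signed and unsigned char");

// Hypothetical helper: for char, the element count equals the size in bytes.
constexpr std::size_t bytes_for_chars(std::size_t count) {
    return count * sizeof(char);
}
```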
 
Erik Wikström

What is a "4 byte / 8 bit" integer? Isn't 4 bytes on a 32-bit OS simply 32 bits?

Jim used that notation to point out that there is no guarantee in C++
that a byte is 8 bits. While this is true for most modern machines, it
does not have to be. I seem to recall that there have been some with 13
bits per byte (or was it 11?). One could imagine a computer with 16
bits per byte, in which case 4 bytes would be 64 bits.
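
The number of bits in a byte is available as the macro CHAR_BIT from <climits>, so the width of a type in bits can be computed rather than assumed. A small sketch; bits_in is a hypothetical name, and the result counts storage bits, including any padding bits a type might have.

```cpp
#include <climits>
#include <cstddef>

// The standard requires a byte to have at least 8 bits, not exactly 8.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: storage size of T in bits on this implementation.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

On the common "4 byte/8 bit" platforms bits_in<int>() is 32; on a machine with 16-bit bytes and a 4-byte int it would be 64.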
 
