file length...

Steve

Hi,
I'm trying to read a binary file into a buffer:

std::ifstream ifs(fileName,
std::ios::in|std::ios::binary);
if (!ifs)
return;
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];
memcpy(buf,ifs.rdbuf(),len);

However, this is not working as expected since
buf contains not what the original file contains...

I do not want to read character-by-character from
file and the above does not work. So what is the
correct way of doing this "simple" stuff?

Thanks,
Steve
 
Gianni Mariani

Steve said:
Hi,
I'm trying to read a binary file into a buffer:
See:

http://www.cplusplus.com/ref/iostream/istream/read.html
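
A minimal sketch of what the linked page describes, assuming the goal is simply to get the whole file into memory. The helper name read_whole_file is made up for illustration. It uses the seek-based size from the original post only as an allocation hint (that size is not guaranteed to match the number of characters read), and relies on gcount() to learn how many characters actually arrived:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): read a file with istream::read.
// Returns an empty vector if the file is empty or cannot be opened or sized.
std::vector<char> read_whole_file(const std::string& file_name)
{
    std::ifstream ifs(file_name.c_str(), std::ios::in | std::ios::binary);
    if (!ifs)
        return std::vector<char>();

    // Seek-based size, used only as a hint for how much to allocate.
    ifs.seekg(0, std::ios::end);
    const std::streamoff hint = ifs.tellg();
    ifs.seekg(0, std::ios::beg);
    if (!ifs || hint <= 0)
        return std::vector<char>();

    std::vector<char> buf(static_cast<std::size_t>(hint));
    ifs.read(&buf[0], static_cast<std::streamsize>(buf.size()));

    // gcount() reports how many characters read() actually stored;
    // it can never exceed the count that was requested.
    assert(ifs.gcount() <= hint);
    buf.resize(static_cast<std::size_t>(ifs.gcount()));
    return buf;
}
```

A caller would write something like std::vector<char> data = read_whole_file(fileName); and then check data.empty() before using &data[0].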


std::ifstream ifs(fileName,
std::ios::in|std::ios::binary);
if (!ifs)
return;
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];
memcpy(buf,ifs.rdbuf(),len);

However, this is not working as expected since
buf contains not what the original file contains...

I do not want to read character-by-character from
file and the above does not work. So what is the
correct way of doing this "simple" stuff?

Thanks,
Steve
 
Rolf Magnus

Steve said:
Hi,
I'm trying to read a binary file into a buffer:

std::ifstream ifs(fileName,
std::ios::in|std::ios::binary);
if (!ifs)
return;
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];
memcpy(buf,ifs.rdbuf(),len);

ifs.rdbuf() returns a pointer to an instance of class basic_filebuf. Copying
the bytes that this object is made of into an array of char doesn't make
any sense. But you can use its member function sgetn to get data.
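
As a rough illustration of the sgetn route, here is a sketch under the assumption that the caller already has an upper bound on how much to read. The helper name read_via_sgetn and the max_len parameter are invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: pull up to max_len characters straight from the
// stream buffer with sgetn(). The result is shrunk to what was delivered.
std::vector<char> read_via_sgetn(std::ifstream& ifs, std::size_t max_len)
{
    std::vector<char> buf(max_len);
    if (max_len == 0)
        return buf;

    // sgetn() returns the number of characters it stored, which is
    // smaller than the request if the end of the file comes first.
    const std::streamsize got =
        ifs.rdbuf()->sgetn(&buf[0], static_cast<std::streamsize>(max_len));
    assert(got >= 0 && static_cast<std::size_t>(got) <= max_len);

    buf.resize(static_cast<std::size_t>(got));
    return buf;
}
```

Note that sgetn works on the buffer directly, so it does not set eofbit or failbit on ifs; the returned size is the only indication of a short read.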
 
Dietmar Kuehl

Steve said:
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];

This is a pretty dangerous approach: there is actually no
portable approach to determine the number of characters in a
file except counting them, e.g. like this:

/**/ std::streamsize len = std::distance(
/**/ std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>());

Steve said:
memcpy(buf,ifs.rdbuf(),len);

This does not read the files, it copies the contents of
some struct. Don't do it! Probably the most efficient and
portable approach to reading a file in C++ is something like
this:

/**/ std::ostringstream out;
/**/ out << ifs.rdbuf();
/**/ std::string buf = out.str();

My personally preferred notation would be this:

/**/ std::vector<char> buf;
/**/ std::copy(std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>(),
/**/ std::back_inserter(buf));

.... but this is typically considerably slower than the other
approach although it could actually be made faster. :-(
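
For anyone who wants to compile the two variants side by side, here is a self-contained sketch. The function names slurp_via_rdbuf and slurp_via_iterators are made up, and taking a std::istream& is a choice made for this example rather than something stated above:

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Variant 1: stream the whole buffer into a string stream.
std::string slurp_via_rdbuf(std::istream& in)
{
    std::ostringstream out;
    out << in.rdbuf();
    return out.str();
}

// Variant 2: build a vector directly from the stream buffer iterators.
std::vector<char> slurp_via_iterators(std::istream& in)
{
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

Both functions consume the stream from its current position to the end, so each needs its own freshly opened stream (or a seek back to the start) if they are to be compared on the same file.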
 
Alex Vinokur

Dietmar Kuehl said:
Steve said:
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];

This is a pretty dangerous approach: there is actually no
portable approach to determine the number of characters in a
file except counting them, e.g. like this:

/**/ std::streamsize len = std::distance(
/**/ std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>());
[snip]


Should the code below work too?
 
Alex Vinokur

Alex Vinokur said:
Dietmar Kuehl said:
Steve said:
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];

This is a pretty dangerous approach: there is actually no
portable approach to determine the number of characters in a
file except counting them, e.g. like this:

/**/ std::streamsize len = std::distance(
/**/ std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>());
[snip]


Should the code below work too?
[snip]

Class 'ostreambuf_iterator' doesn't have constructor ostreambuf_iterator().
So, the code above doesn't work.

Must we use 'seekp' and 'tellp' to get length of 'ofs'?
 
Dietmar Kuehl

No. Output iterators cannot be compared to each other.
Class 'ostreambuf_iterator' doesn't have constructor
ostreambuf_iterator(). So, the code above doesn't work.

This is another issue. However:
Must we use 'seekp' and 'tellp' to get length of 'ofs'?

No. It is '0' because you haven't opened any file. Of course, if
you had opened a file for output but not for reading or appending,
it would still be '0' as it would be truncated. If you want to
know the size (e.g. the number of characters), open the file for
reading and use 'distance()'. You cannot determine the number of
characters in a file using the result of seeks, e.g. because the
difference type may be too small to represent the result. Also,
depending on how you opened the file, you should use the results
as opaque handles to a position anyway because the difference
would not necessarily match the number of characters between them.

Actually, there is rarely any need to determine the file size in
the first place! If you need this information, you are probably
doing things with files which are non-portable anyway and thus
better addressed using a different, non-standard interface to
files, for example 'mmap()'.
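
The truncation point can be checked with a small sketch. Both helper names below are hypothetical; count_chars counts by reading, as suggested above, and open_for_output_only opens a file with the default ofstream mode without writing anything:

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <iterator>

// Hypothetical helper: count the characters in a file by reading them all.
std::streamsize count_chars(const char* file_name)
{
    std::ifstream in(file_name, std::ios::binary);
    const std::streamsize n = std::distance(
        std::istreambuf_iterator<char>(in),
        std::istreambuf_iterator<char>());
    assert(n >= 0);
    return n;
}

// Hypothetical helper: open for output only (default mode), write nothing.
// The default mode for std::ofstream truncates an existing file.
void open_for_output_only(const char* file_name)
{
    std::ofstream ofs(file_name);
}
```

Calling count_chars on a file, then open_for_output_only on the same name, then count_chars again should give 0 for the second count, whatever the file held before.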
 
Alex Vinokur

Dietmar Kuehl said:
Steve said:
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];

This is a pretty dangerous approach: there is actually no
portable approach to determine the number of characters in a
file except counting them, e.g. like this:

/**/ std::streamsize len = std::distance(
/**/ std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>());
[snip]


Those methods have different behavior (see program below).
Are there other methods with yet another behavior?

Compiler g++ 3.4.1
// ====== foo.cpp : BEGIN ======
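#include <cassert>
#include <cstddef>   // size_t
#include <cstdio>    // FILE, fopen, fseek, ftell, fclose, remove
#include <iterator>  // distance, istreambuf_iterator
#include <string>
#include <iostream>
#include <fstream>
using namespace std;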

// ------------------
#define TXT_FILE_NAME "foo.txt"
#define BIN_FILE_NAME "foo.bin"

enum OpenMode { BIN_MODE, TXT_MODE };

// -----------------
size_t get_filesize_via_fseek_ftell (
const char * const filename_i,
OpenMode mode_i
)
{
FILE* fp = NULL;

assert ((mode_i == TXT_MODE) || (mode_i == BIN_MODE));
fp = fopen(filename_i, ((mode_i == TXT_MODE) ? "r" : "rb"));
assert (fp);

int rc = fseek(fp, 0, SEEK_END);
assert (rc == 0);

const size_t ret_filesize (ftell(fp));

rc = fclose(fp);
assert (rc == 0);

return ret_filesize;
}


// -----------------
size_t get_filesize_via_seekg_tellg (
const char * const filename_i,
OpenMode mode_i
)
{
ifstream fs;

assert ((mode_i == TXT_MODE) || (mode_i == BIN_MODE));

if (mode_i == TXT_MODE) fs.open (filename_i);
else fs.open (filename_i, ios::binary);

assert (fs);
assert (fs.is_open());


fs.seekg(0, ios::beg);
const ios::pos_type start_pos = fs.tellg();

fs.seekg(0, ios::end);
const ios::pos_type end_pos = fs.tellg();

const size_t ret_filesize (static_cast<size_t>(end_pos - start_pos));

fs.close();
assert (!fs.is_open());

return ret_filesize;
}


// -----------------
size_t get_filesize_via_distance (
const char * const filename_i,
OpenMode mode_i
)
{
ifstream fs;

assert ((mode_i == TXT_MODE) || (mode_i == BIN_MODE));

if (mode_i == TXT_MODE) fs.open (filename_i);
else fs.open (filename_i, ios::binary);

assert (fs);
assert (fs.is_open());

const size_t ret_filesize (static_cast<size_t>(distance(
istreambuf_iterator<char>(fs),
istreambuf_iterator<char>()
)
)
);

fs.close();
assert (!fs.is_open());

return ret_filesize;
}


// -----------------
void create_file (
const string& data_i,
const char * const filename_i,
OpenMode mode_i
)
{
remove (filename_i);

ofstream fs;

assert ((mode_i == TXT_MODE) || (mode_i == BIN_MODE));

if (mode_i == TXT_MODE) fs.open (filename_i);
else fs.open (filename_i, ios::binary);

assert (fs);
assert (fs.is_open());

fs << data_i;

fs.close();
assert (!fs.is_open());

}


int main ()
{
const char data[] = "\n";
create_file (data, TXT_FILE_NAME, TXT_MODE);
create_file (data, BIN_FILE_NAME, BIN_MODE);

// -------------------------------------------
cout << endl;

cout << "--- get_filesize_via_fseek_ftell" << endl;

cout << "Created in TXT mode, read in TXT mode: "
<< get_filesize_via_fseek_ftell (TXT_FILE_NAME, TXT_MODE)
<< endl;

cout << "Created in BIN mode, read in BIN mode: "
<< get_filesize_via_fseek_ftell (BIN_FILE_NAME, BIN_MODE)
<< endl;

cout << "Created in TXT mode, read in BIN mode: "
<< get_filesize_via_fseek_ftell (TXT_FILE_NAME, BIN_MODE)
<< endl;

cout << "Created in BIN mode, read in TXT mode: "
<< get_filesize_via_fseek_ftell (BIN_FILE_NAME, TXT_MODE)
<< endl;


// -------------------------------------------
cout << endl;

cout << "--- get_filesize_via_seekg_tellg" << endl;

cout << "Created in TXT mode, read in TXT mode: "
<< get_filesize_via_seekg_tellg (TXT_FILE_NAME, TXT_MODE)
<< endl;

cout << "Created in BIN mode, read in BIN mode: "
<< get_filesize_via_seekg_tellg (BIN_FILE_NAME, BIN_MODE)
<< endl;

cout << "Created in TXT mode, read in BIN mode: "
<< get_filesize_via_seekg_tellg (TXT_FILE_NAME, BIN_MODE) << endl;

cout << "Created in BIN mode, read in TXT mode: "
<< get_filesize_via_seekg_tellg (BIN_FILE_NAME, TXT_MODE)
<< endl;


// -------------------------------------------
cout << endl;

cout << "--- get_filesize_via_distance" << endl;

cout << "Created in TXT mode, read in TXT mode: "
<< get_filesize_via_distance (TXT_FILE_NAME, TXT_MODE)
<< endl;

cout << "Created in BIN mode, read in BIN mode: "
<< get_filesize_via_distance (BIN_FILE_NAME, BIN_MODE)
<< endl;

cout << "Created in TXT mode, read in BIN mode: "
<< get_filesize_via_distance (TXT_FILE_NAME, BIN_MODE)
<< endl;

cout << "Created in BIN mode, read in TXT mode: "
<< get_filesize_via_distance (BIN_FILE_NAME, TXT_MODE)
<< endl;

return 0;
}

// ====== foo.cpp : END ========



// ====== Run Log ========

--- get_filesize_via_fseek_ftell
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_seekg_tellg
Created in TXT mode, read in TXT mode: 2
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

--- get_filesize_via_distance
Created in TXT mode, read in TXT mode: 1
Created in BIN mode, read in BIN mode: 1
Created in TXT mode, read in BIN mode: 2
Created in BIN mode, read in TXT mode: 1

// =======================
 
Alex Vinokur

Alex Vinokur said:
Dietmar Kuehl said:
Steve said:
ifs.seekg(0,std::ios::end);
int len = ifs.tellg();
ifs.seekg(0,std::ios::beg);
char* buf = new char[len];

This is a pretty dangerous approach: there is actually no
portable approach to determine the number of characters in a
file except counting them, e.g. like this:

/**/ std::streamsize len = std::distance(
/**/ std::istreambuf_iterator<char>(ifs),
/**/ std::istreambuf_iterator<char>());
[snip]
[snip]


std::streampos len = ifs.rdbuf()->pubseekoff (0,ios::end,ios::in);
ifs.rdbuf()->pubseekpos (0,ios::in);
 
Dietmar Kuehl

Alex said:

Actually, it is pretty obvious that you will get different results
with the various methods. The real question to ask first is what
you want to do with the file size and then use an adequate method
to determine that number - assuming that it is worth the effort to
determine it in the first place: at least on the systems I work
normally on (variants of UNIX) the file size is rather volatile
and effectively you cannot reliably assume that it does not change
even if you keep the file open. That is, for the intent of the
person who started this thread the file size is useless at best
and dangerous at worst: you may end up overwriting memory, get
fewer bytes than you wanted, etc.

If you just need the file size for informational purposes, e.g. to
display a progress bar, the seek methods should be OK. If you want
to determine the number of characters precisely (which, as I have
discussed above, is probably a pointless endeavour) you are stuck
with counting them: apart from text and binary mode, you also have
different encodings which may influence the number of characters.
Just install a locale with multi-byte encodings or contrive a
corresponding 'codecvt' facet and you get infinitely more fun into
the game.
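
One way to sidestep the size question for the original problem is to read in fixed-size chunks until the stream stops delivering data. This is only a sketch: the name read_in_chunks and the default chunk size of 4096 are arbitrary choices for illustration, not something proposed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <istream>
#include <vector>

// Hypothetical helper: read everything the stream delivers without asking
// for the file size first. The result grows by whatever each read() returns.
std::vector<char> read_in_chunks(std::istream& in, std::size_t chunk_size = 4096)
{
    assert(chunk_size > 0);
    std::vector<char> result;
    std::vector<char> chunk(chunk_size);

    while (in) {
        in.read(&chunk[0], static_cast<std::streamsize>(chunk.size()));
        // The last read() usually comes up short and sets eofbit/failbit,
        // but gcount() still says how many characters it stored.
        result.insert(result.end(), chunk.begin(),
                      chunk.begin() + static_cast<std::ptrdiff_t>(in.gcount()));
    }
    return result;
}
```

With an std::ifstream opened in binary mode, the returned vector holds whatever was readable at the time, even if the file grew or shrank after it was opened.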
 
