fastest way to fill a structure.

Larry I Smith

simon said:
Thanks for that, I get 1.24 sec and 6mb.
I just need to check what the difference is with my code.
Here are the 3 programs:

Regards,
Larry

Thanks for that, this is great.
I wonder if my Trim(...) function was not part of the problem.

After profiling I noticed that delete [], (or even free(..) ) takes around
50% of the whole time.

Maybe I should get rid of the dynamic allocation altogether.

Simon
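
For illustration, one way to drop the per-string new[]/delete[] calls is to keep fixed-size character buffers inside the struct itself. This is only a sketch: the real sFileData was not posted here, and the struct name sFileDataFixed, the 32-byte capacity, and the helpers copyField and makeRecord are all assumptions.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity variant of sFileData: the strings live inside
// the struct, so no per-string new[]/delete[] is needed.
// The capacity of 32 is an assumption, not taken from the thread.
struct sFileDataFixed {
    char sSomeString1[32];
    char sSomeString2[32];
};

// Copy at most capacity-1 characters from src and always NUL-terminate.
void copyField(char* dest, std::size_t capacity,
               const char* src, std::size_t len) {
    if (capacity == 0) return;
    const std::size_t n = len < capacity - 1 ? len : capacity - 1;
    std::memcpy(dest, src, n);
    dest[n] = '\0';
}

// Build one record; over-long input is truncated to fit the buffers.
sFileDataFixed makeRecord(const char* s1, const char* s2) {
    sFileDataFixed d;
    copyField(d.sSomeString1, sizeof d.sSomeString1, s1, std::strlen(s1));
    copyField(d.sSomeString2, sizeof d.sSomeString2, s2, std::strlen(s2));
    return d;
}
```

The trade-off is that every record pays for the full buffer size and long fields are truncated, so this only fits when the field widths are known and small.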

What does your profiler say about simon2.cpp?

Actually 1.24 seconds is pretty good for 100000 records.

As far as the memory usage goes, did you read the 2
articles on malloc that I posted earlier? Whether you
use new/delete or std::string (which does its own new/delete
behind the scenes) doesn't make much difference in performance
or memory usage, but std::string allows you much more
flexibility when manipulating the strings after you've
filled your vector (i.e. later in the program).

Due to the many (200000) tiny memory allocations, your memory
usage would be about:

2.5 * (sizeof(sFileData) * 100000)

when both strings (sSomeString1 & sSomeString2) are small.

16 bytes minimum (plus the pointer kept in sFileData) will
be allocated for each of those strings. So, using pointers
in sFileData, the actual memory used for one sFileData
is at least 48 bytes.
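
The arithmetic behind that estimate can be sketched as below. The function estimateBytes and its parameters are hypothetical, and the value 16 for sizeof(sFileData) is an assumption (the struct was not posted here); only the 16-byte minimum allocation and the two strings per record come from the post.

```cpp
#include <cstddef>

// Rough number of bytes used by `records` structs that each own
// `stringsPerRecord` separately allocated strings.
//   structBytes   - sizeof the struct itself (assumed, not from the thread)
//   minAllocBytes - smallest block the allocator hands out (16 in the post)
std::size_t estimateBytes(std::size_t records, std::size_t structBytes,
                          std::size_t stringsPerRecord,
                          std::size_t minAllocBytes) {
    return records * (structBytes + stringsPerRecord * minAllocBytes);
}
```

With an assumed 16-byte struct, estimateBytes(1, 16, 2, 16) gives the 48 bytes per record mentioned above, and 100000 records come to 4800000 bytes before any vector overhead.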

Regards,
Larry
 
Steven T. Hatton

Larry said:
simon said:
On my PC (an old Gateway PII 450MHz with 384MB of RAM):

simon.cpp runs in 2.20 seconds and uses 5624KB of memory.

Thanks for that, I get 1.24 sec and 6mb.
I just need to check what the difference is with my code.
simon2.cpp runs in 2.22 seconds and uses 6272KB of memory.

Your mileage may vary. I'm running SuSE Linux v9.3 and
using the GCC "g++" compiler v3.3.5.

Regards,
Larry

What am I missing here? There must be some part of the problem that I
missed. The following reads in 100000 data pairs, one per line, in far
less than a second.

#include <algorithm>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using namespace std;

// One whitespace-delimited key/value pair per record.
template<typename Key_T, typename Value_T>
struct Data {
  Key_T _key;
  Value_T _value;

  istream& fromStream(istream& in) {
    return in >> _key >> _value;
  }

  ostream& toStream(ostream& out) const {
    // Left-justify the key, then restore the caller's formatting flags.
    ios::fmtflags oldFlags(out.setf(ios::left, ios::adjustfield));
    out << setw(25) << _key << _value;
    out.flags(oldFlags);
    return out;
  }
};

template<typename Key_T, typename Value_T>
istream& operator>>(istream& in, Data<Key_T, Value_T>& data) {
  return data.fromStream(in);
}

template<typename Key_T, typename Value_T>
ostream& operator<<(ostream& out, const Data<Key_T, Value_T>& data) {
  return data.toStream(out);
}

int main(int argc, char* argv[]) {
  if(!(argc > 1)) {
    cerr << "records filename: " << endl;
    return -1;
  }

  ifstream ifs(argv[1]);

  if(!ifs.is_open()) {
    cerr << "Failed to open file: " << argv[1] << endl;
    return -1;
  }

  typedef Data<string, string> D_T;
  typedef vector<D_T> DV_T;
  DV_T dv;

  // Read records until extraction fails (normally at end of file).
  copy(istream_iterator<D_T>(ifs), istream_iterator<D_T>(), back_inserter(dv));
  // copy(dv.begin(), dv.end(), ostream_iterator<D_T>(cout, "\n"));
}
 
Larry I Smith

Steven said:
What am I missing here? There must be some part of the problem that I
missed. The following reads in 100000 data pairs, one per line, in far
less than a second.

<snip code>

Each of his 100000 records (each 192 bytes long) contains multiple
fields that must be parsed out of the record and then have their
leading/trailing blanks trimmed; additional fields in the record must
be parsed out and converted to int. Each record also contains fields
that are to be skipped over (i.e. ignored). Once all of the fields are
parsed out of a record, an object of class sFileData is constructed
from the parsed data and put into the vector. Only after all of this
is the next record read from the file. So most of the work is data
parsing.
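
As a rough illustration of that per-record work, the steps above might look like the sketch below. The column layout, the iSomeNum1 member, and the helpers trim, parseRecord and parseAll are assumptions made for the example; the real record format and the real sFileData class were not posted here.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for sFileData; the member iSomeNum1 is an assumption.
struct sFileData {
    std::string sSomeString1;
    std::string sSomeString2;
    int iSomeNum1;
};

// Return s without leading/trailing blanks.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Assumed layout, for illustration only:
//   columns  0..19  first string field
//   columns 20..29  skipped (ignored)
//   columns 30..49  second string field
//   columns 50..57  integer field
// Returns false when the record is too short to hold every field.
bool parseRecord(const std::string& record, sFileData& out) {
    if (record.size() < 58) return false;
    out.sSomeString1 = trim(record.substr(0, 20));
    out.sSomeString2 = trim(record.substr(30, 20));
    out.iSomeNum1 = std::atoi(trim(record.substr(50, 8)).c_str());
    return true;
}

// Parse every record and keep only the ones that parse successfully.
std::vector<sFileData> parseAll(const std::vector<std::string>& records) {
    std::vector<sFileData> result;
    result.reserve(records.size());  // one vector allocation up front
    for (std::vector<std::string>::size_type i = 0; i < records.size(); ++i) {
        sFileData item;
        if (parseRecord(records[i], item)) result.push_back(item);
    }
    return result;
}
```

Each record here costs three substr copies plus the trims, which is the kind of work a plain `in >> key >> value` loop never does.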

Larry
 
