Help with file comparison


chris

Does anyone have any tips for comparing large amounts of data?

The data in question is around 1000 lines of up to 400 delimited key value
pairs. Each line has a unique identifier stored in one of the fields. Each
field needs to be compared to the corresponding line/field in a "master file".
In addition, each field has formatting rules, such as trimming spaces and
zeroes, ignoring capitalization, etc., which are stored in a database. I have
done similar things in Java using a HashMap of HashMaps approach, but the
performance was awful.

Would using the STL hash_map in C++ result in the same performance problems?
Would it be better to read the file into memory with mmap() or fopen()/fread()
and work with individually malloc'ed pieces of data, using memcmp() to
compare, and just store the pointers in an array or map? Also, any hints
for dividing up all those pairs? Is strtok() my best bet?

Thanks for any help or feedback.
 

Chris (Val)

| Does anyone have any tips for comparing large amounts of data?
|
| The data in question is around 1000 lines of up to 400 delimited key value
| pairs. Each line has a unique identifier stored in one of the fields. Each
| field needs to be compared to the corresponding line/field in a "master file".
| In addition, each field has formatting rules, such as trimming spaces and
| zeroes, ignoring capitalization, etc., which are stored in a database. I have
| done similar things in Java using a HashMap of HashMaps approach, but the
| performance was awful.
|
| Would using the STL hash_map in C++ result in the same performance problems?
| Would it be better to read the file into memory with mmap() or fopen()/fread()
| and work with individually malloc'ed pieces of data, using memcmp() to
| compare, and just store the pointers in an array or map? Also, any hints
| for dividing up all those pairs? Is strtok() my best bet?

- Read the key-value pairs into 'std::map<>' associative container(s).
- Write a simple trim function.
- Write a simple function to ignore case.
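
A rough sketch of those three steps, in case it helps. The thread never states the record layout, so the ';' pair separator and '=' key/value separator are assumptions, and the names Record, parse_record and count_mismatches are made up for illustration. It trims whitespace only; stripping zeroes and applying per-field rules from the database are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical record type: one line of the file, field name -> raw value.
using Record = std::map<std::string, std::string>;

// Return a copy of s without leading or trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";  // empty or all whitespace
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Return a lower-cased copy, so comparisons can ignore capitalization.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Split one line into key/value pairs. The separators are assumed, not
// taken from the original post. Values are stored untrimmed.
Record parse_record(const std::string& line,
                    char pair_sep = ';', char kv_sep = '=') {
    Record rec;
    std::istringstream in(line);
    std::string pair;
    while (std::getline(in, pair, pair_sep)) {
        const auto pos = pair.find(kv_sep);
        if (pos == std::string::npos) continue;  // skip malformed pairs
        rec[trim(pair.substr(0, pos))] = pair.substr(pos + 1);
    }
    return rec;
}

// Count the master fields whose normalized value is missing from, or
// different in, the candidate. Fields that exist only in the candidate
// are not counted.
int count_mismatches(const Record& master, const Record& candidate) {
    int mismatches = 0;
    for (const auto& field : master) {
        const auto it = candidate.find(field.first);
        if (it == candidate.end() ||
            to_lower(trim(it->second)) != to_lower(trim(field.second))) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

To compare whole files, one option is to parse each line of the master file into a Record, keep those in a second 'std::map<>' keyed by the unique identifier field, and call count_mismatches() for every line of the other file. For about 1000 lines of 400 fields, the logarithmic lookups of 'std::map<>' are unlikely to be the bottleneck.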

Once you have done that, you're almost finished :).

Note, however, that standard C++ knows nothing of databases, so reading the
formatting rules will need a separate database library.

Cheers.
Chris Val
 
