C
chris
Does anyone have any tips for comparing large amounts of data?
The data in question is around 1000 lines of up to 400 delimited key value
pairs. Each line has a unique identifier stored in one of the fields. Each
field needs to be compared to the corresponding line/field in a "master file".
In addition, each field has formatting rules, such as trimming spaces and
zeroes, ignoring capitalization, etc., which are stored in a database. I have
done similar things in Java using a HashMap of HashMaps approach, but the
performance was awful.
Would using the STL hash_map in C++ result in the same performance problems?
Would it be better to read the file into memory with mmap() or fopen/fread
and work with individually malloc'ed pieces of data, using memcmp() to
compare, and just store the pointers in an array or map? Also, any hints
for dividing up all those pairs? Is strtok() my best bet?
Thanks for any help or feedback.