Steve555
Hi
My vector (typedef vector<movieRatingPair*> ratingsV) holds pointers
to this struct:
typedef struct {
    unsigned short movieID;
    unsigned short rating;
} movieRatingPair;
...and on average it will contain 200 such structs.
The structs are always stored in the vector in ascending order of
movieID.
I need a quick way to check whether the vector contains a specific
movieID. I can't use a map instead of a vector, because the extra few
bytes a map uses per node create a problem.(*)
Simply iterating through the vector is proving a bit slow. I could
write some binary-search code, but I wanted to check that I wasn't
re-inventing the wheel, or simply missing a much better way of dealing
with the whole thing.
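Something along these lines is what I have in mind: an untested sketch
using std::lower_bound over the sorted pointers (containsMovie and
byMovieID are just illustrative names):

#include <algorithm>
#include <vector>

typedef struct {
    unsigned short movieID;
    unsigned short rating;
} movieRatingPair;

typedef std::vector<movieRatingPair*> ratingsV;

// Orders the stored pointers by movieID, for use with lower_bound.
static bool byMovieID(const movieRatingPair* p, unsigned short id)
{
    return p->movieID < id;
}

// True if the movieID-sorted vector contains the given movieID.
// lower_bound does O(log n) comparisons, so ~8 for 200 entries
// instead of up to 200 with a linear scan.
bool containsMovie(const ratingsV& v, unsigned short id)
{
    ratingsV::const_iterator it =
        std::lower_bound(v.begin(), v.end(), id, byMovieID);
    return it != v.end() && (*it)->movieID == id;
}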
Many thanks for any advice.
Steve
(*) There are nearly half a million instances of these vectors (one
per customer; they are stored in an STL map, keyed by userID).
In total, therefore, there are 100+ million records. It's effectively
a read-only database: the data is initially fetched from disk, and
then it MUST be kept within 2.5 GB of free memory for quick
access/analysis.
Unfortunately, using maps instead of vectors means each record costs
32 bytes: 32 bytes x 100+ million records is roughly 3.2 GB.
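For clarity, the arithmetic behind that footnote (a rough sketch; 32
bytes is my estimated per-record cost with std::map, actual node
overhead is implementation-dependent, and sizeof(movieRatingPair) is
typically 4 bytes):

#include <cstdio>

int main()
{
    const double records = 100e6;  // 100+ million ratings
    const double mapNode = 32.0;   // estimated per-record cost with std::map
    const double pairRaw = 4.0;    // sizeof(movieRatingPair): two unsigned shorts

    // ~3.2 GB for map nodes, well over the 2.5 GB budget...
    std::printf("map nodes:   %.1f GB\n", records * mapNode / 1e9);
    // ...versus ~0.4 GB of actual payload in the structs.
    std::printf("raw structs: %.1f GB\n", records * pairRaw / 1e9);
    return 0;
}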