A
Amit Bhatia
Hi,
I am trying to write a hash function that works on a pair of integers.
so I have a pair<int,int> of integers. Let the first element be called
"a" and second as "b".
I know before hand that:
0<=a<30
-2^a<=b<=2^a-1.
In more detail, I am using this to keep a record of lattice points seen
on a multi resolution hierarchical grid. A resolution level "l" has
2^{l*d} elements where d is the dimension of the space.
Also, there can be at most two different instantiations of the same
region in the grid for my work. To distinguish them, I add +1 and put
-ve sign in front of one copy.
So <a,b>, and <a,-(b+1)> represent the same region in the grid.
As a result, there can be a key <4,3> and another key <4,-3>
If all the b's were >0, then I could already think of a very simple
perfect hash function:
size_t value = 1<<a+b;
However, if b<0 is also possible, then, I can only think of
size_t value;
if(b>=0) value = 1<<a+b;
else value = 2^30 + 1<<a+-(b+1); // I know a<30 for sure.
Now I used this in my code, and did some profiling and I found that I am
spending almost 80 % of the time just looking up if a key exists in this
hash map. (there are no other major calculations in the program function
where this "find" operation is called).
So for example, in one particular execution of the program, the function
which calls this key lookup is called
158905316 times and takes 100 sec (~80%) of the time.
My question is: can I do much faster than this? I am not an expert, and
I am using g++ 4.2 on a pentium 4 2.4 Ghz with 512 MB RAM. But still
this should be much faster.
What is really surprising is that if I use maps instead, they are
running faster.
So for about, 83127320 calls, I take 16.06 sec (~47.26%) if I use a stl map.
thanks,
-amit.
I am trying to write a hash function that works on a pair of integers.
so I have a pair<int,int> of integers. Let the first element be called
"a" and second as "b".
I know before hand that:
0<=a<30
-2^a<=b<=2^a-1.
In more detail, I am using this to keep a record of lattice points seen
on a multi resolution hierarchical grid. A resolution level "l" has
2^{l*d} elements where d is the dimension of the space.
Also, there can be at most two different instantiations of the same
region in the grid for my work. To distinguish them, I add +1 and put
-ve sign in front of one copy.
So <a,b>, and <a,-(b+1)> represent the same region in the grid.
As a result, there can be a key <4,3> and another key <4,-3>
If all the b's were >0, then I could already think of a very simple
perfect hash function:
size_t value = 1<<a+b;
However, if b<0 is also possible, then, I can only think of
size_t value;
if(b>=0) value = 1<<a+b;
else value = 2^30 + 1<<a+-(b+1); // I know a<30 for sure.
Now I used this in my code, and did some profiling and I found that I am
spending almost 80 % of the time just looking up if a key exists in this
hash map. (there are no other major calculations in the program function
where this "find" operation is called).
So for example, in one particular execution of the program, the function
which calls this key lookup is called
158905316 times and takes 100 sec (~80%) of the time.
My question is: can I do much faster than this? I am not an expert, and
I am using g++ 4.2 on a pentium 4 2.4 Ghz with 512 MB RAM. But still
this should be much faster.
What is really surprising is that if I use maps instead, they are
running faster.
So for about, 83127320 calls, I take 16.06 sec (~47.26%) if I use a stl map.
thanks,
-amit.