R
Robert Klemme
Hugh said:Hugh said:While my brain is behaving like cottage cheese, it's probably not
the time to ask how one might guarantee that you don't stomp on the
hashes of other ojects in the system. If you have an even number of
elements, all the same Fixnum, like [1,1,1,1] then they would hash
to 0, as would [2,2], I "think".
irb(main):004:0> [1,1].inject(0) { |a,b| a ^= b.hash}
=> 0
irb(main):005:0> [2,1,1,2].inject(0) { |a,b| a ^= b.hash}
=> 0
Btw, the assignment is superfluous. The result of a^b.hash is the
next iteration's a.
Yes, good point, the result of the block....
Yep.
------------------------------------------------------
Enumerable#inject enum.inject(initial) {| memo, obj | block }
=> obj enum.inject {| memo, obj | block } => obj
------------------------------------------------------------------------
Combines the elements of _enum_ by applying the block to an
accumulator value (_memo_) and each element in turn. At each
step, _memo_ is set to the value returned by the block. The
first form ================================================
[...]lo>> [1,2, 1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF}irb(main):006:0>
Yes. The algorithm can certainly be improved on. Typically you
rather do something similar to
(a.hash ^ (b.hash << 3) ^ (c.hash << 7)) & MAX_HASH
09:53:59 [~]: irbs=> 2781[2,1,1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF}=> 1885
Ah, so that's what MAX_HASH is, I couldn't remember how big Fixnums
were.
No, that's not really the limit - I just choose 32 bit as it's an often
used size.
13:41:59 [~]: ruby -e 'i=1;i<<=1 while Fixnum === i;i>>=1;puts i.to_s(16),
i.class'
20000000
Fixnum
I was thinking about something like a linear congruential
random number generator like:
brains hgs 22 %> irb
irb(main):001:0> [2,1,1,2].inject(0) {|a,b| ((a * 31)+b.hash) %
4093082899} => 151936
irb(main):002:0> [2,2,1,1].inject(0) {|a,b| ((a * 31)+b.hash) %
4093082899} => 153856
irb(main):003:0> [2,1,2,1].inject(0) {|a,b| ((a * 31)+b.hash) %
4093082899} => 151996
irb(main):004:0>
like
http://www.cs.bell-labs.com/cm/cs/pearls/markovhash.c
from "Programming Pearls".
That code does only one modulus on the NHASH. Your code does it for each
iteration. But otherwise 31 seems a good number - Java containers use 31,
too.
From AbstractList:
public int hashCode() {
int hashCode = 1;
Iterator i = iterator();
while (i.hasNext()) {
Object obj = i.next();
hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
}
return hashCode;
}
with a largish prime grabbed from
http://primes.utm.edu/lists/small/small.html#10
being the biggest I could see less than 0XFFFF_FFFF (4294967295)
------------------------------------------------ Class: Fixnum <
Integer A +Fixnum+ holds +Integer+ values that can be
represented in a native machine word (minus 1 bit). If any
operation on a +Fixnum+
Should that be 0x7FFF_FFFF? (2147483647)
According to
http://www.rsok.com/~jrm/printprimes.html
this would seem to be a prime number, so could be used as the
modulus anyway.
You can even use bit and on this which I would expect to be slightly more
efficient.
Kind regards
robert