Is there any way to enhance the performance of Ruby hashes? My program uses
hashes heavily, so if I could enhance hash performance, or find a
faster alternative to hashes, I could improve the overall performance
of my program. Any suggestions?
Indexing them by symbols instead of strings helps a bit.
But hashes are so fast that it's hard to even make a meaningful
benchmark. For example, in the following meaningless benchmark,
symbol-keyed hashes are faster than string-keyed hashes:
$ time ./meaningless_hash_benchmark.rb --symbols
real 0m16.117s
user 0m14.009s
sys 0m0.485s
$ time ./meaningless_hash_benchmark.rb
real 0m18.525s
user 0m16.306s
sys 0m0.475s
$ cat ./meaningless_hash_benchmark.rb
#!/usr/bin/ruby
N = 2000000
# Lame
if ARGV.empty?
  keys = File.readlines("H").map{|x| x.chomp}
else
  keys = File.readlines("H").map{|x| x.chomp.to_sym}
end
h = {}
keys.each{|k| h[k] = rand }
k0 = keys[0]
k1 = keys[1]
k2 = keys[2]
# A meaningless loop
N.times{|i|
  h[k0] += h[k1]
  h[k1] += h[k2]
  h[k2] += h[k0]
}
$
15% is pretty impressive for a benchmark dominated by method calls.
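
If you want to try the same comparison without the key file "H" (whose contents aren't shown here), something like the following sketch might do. It uses the standard Benchmark module; the generated key names, the key count and the iteration count are all made up for illustration, so don't expect the same percentages as above.

```ruby
require "benchmark"

# Hypothetical stand-in for the key file "H": 1000 generated key names.
string_keys = (0...1000).map { |i| "key_#{i}" }
symbol_keys = string_keys.map(&:to_sym)

# One hash per key type, holding the same values (key => its index).
string_hash = string_keys.each_with_index.to_h
symbol_hash = symbol_keys.each_with_index.to_h

iterations = 200 # assumed value; raise it for steadier timings

# Look up every key `iterations` times and sum the values, so the
# work is dominated by Hash#[] calls.
def lookup_all(hash, keys, iterations)
  total = 0
  iterations.times do
    keys.each { |k| total += hash[k] }
  end
  total
end

string_total = nil
symbol_total = nil
Benchmark.bm(8) do |bm|
  bm.report("strings:") { string_total = lookup_all(string_hash, string_keys, iterations) }
  bm.report("symbols:") { symbol_total = lookup_all(symbol_hash, symbol_keys, iterations) }
end
```

Both runs do the same lookups and end up with the same total, so any timing difference comes from the key type. The numbers will vary with Ruby version and machine.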
That's related to how hashes work. To get a_hash[a_key]:
* a_key is converted to a number (hashed)
* the number is looked up in a_hash's internal array
** we can get 0 answers - then a_key is definitely not there
** we can get 1 answer - then verify it by a_real_key == a_key
** we can get 2+ answers - then verify them one by one until a match is found
(this case is rare, and high numbers here are even more rare)
* So if we found a_real_key, such that a_real_key == a_key, then
return a_value_at_a_real_key
* == can actually cost a lot. (Strictly speaking, Ruby's Hash does this
check with eql?, which for Strings and Symbols gives the same answer as ==.)
If you use Strings, their contents need to be compared. If you use Symbols,
you only check whether they're the same object. Each distinct symbol name
maps to exactly one Symbol object, so this is enough.
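
The steps above can be sketched as a toy hash table in plain Ruby. This is only an illustration, not how the interpreter implements Hash internally: toy_store, toy_fetch and the bucket count of 8 are made-up names and values, and the only real Ruby calls relied on are Object#hash, eql? and equal?.

```ruby
# A toy table: an array of buckets, each bucket a list of [key, value] pairs.
buckets = Array.new(8) { [] } # 8 buckets is an arbitrary choice

# Store a value: hash the key, pick a bucket, replace or append the pair.
def toy_store(buckets, key, value)
  bucket = buckets[key.hash % buckets.size]
  pair = bucket.find { |real_key, _| real_key.eql?(key) }
  if pair
    pair[1] = value
  else
    bucket << [key, value]
  end
  value
end

# Fetch a value: hash the key, then verify each candidate in that bucket.
def toy_fetch(buckets, key)
  bucket = buckets[key.hash % buckets.size]
  bucket.each do |real_key, value|
    return value if real_key.eql?(key) # the verification step
  end
  nil # empty bucket or no match: the key is definitely not there
end

toy_store(buckets, :apple, 1)
toy_store(buckets, "pear", 2)

p toy_fetch(buckets, :apple)      # prints 1
p toy_fetch(buckets, "pear".dup)  # prints 2 (different String object, same contents)
p toy_fetch(buckets, :missing)    # prints nil

# Why the verification is cheaper for Symbols:
p :apple.equal?(:apple)           # prints true  (one object per symbol name)
p "pear".dup.equal?("pear".dup)   # prints false (two separate String objects)
p "pear".dup.eql?("pear".dup)     # prints true  (contents had to be compared)
```

For a Symbol key the verification can stop at the identity check, while two separate String objects with equal contents have to be compared character by character.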