hash key performance

Matthias WÃ¤chter · Oct 2, 2007

I am just wondering why different hash keys result in such a
significantly different performance.

I have written 11 simple test cases, all using the following template:

num=ARGV[0].to_i

var=:key

a={:key => "test"}

num.times {
b=a[:key][0]
}

Beside this test_symbol.rb variant, I replace :key in the example
above with false (test_false.rb), an integer value 1 (test_int.rb),
with nil (test_nil.rb), with "id" (test_string.rb). In six other
variants I replace the key with var, and set var to a symbol
(test_var_symbol.rb), false (test_var_false.rb), an integer value
(test_var_int.rb), nil (test_var_nil.rb), "id" (test_var_string.rb),
and "id".frozen (test_var_frozen.rb).

This is my performance chart using time (dual-core CPU, called with
num=30_000_000). I ran it once (dropped timing values), then three
consecutive times and added the times (which is the sort key as well):

test_int.rb 37.029
test_var_int.rb 37.659
test_symbol.rb 37.773
test_var_symbol.rb 37.785
test_var_frozen.rb 37.852
test_var_string.rb 40.515
test_var_false.rb 45.468
test_false.rb 45.687
test_var_nil.rb 45.894
test_nil.rb 46.068
test_string.rb 53.032

No surprise is that repeated lookup of a string as a key results in
the worst performance, and using a Fixnum gives best performance.

More interesting is a bulk of tests that relate to using symbols and
variables for int, symbol and frozen string, all within timing
accuracy. A noticable, but still small gap can be seen for a
nonfrozen string in a variable.

But here it comes: Using false and, even worse, nil (in either
literal or variable form) is way slower (a good 20%) than using
integers and symbols, even if they follow the same "immutable
object" philosophy of fixed object_ids.

Why is that? What's so special about false and nil to have such a
degraded performance as a hash key?

- Matthias

Marcin MielÅ¼yÅ„ski · Oct 2, 2007

Matthias said:
But here it comes: Using false and, even worse, nil (in either
literal or variable form) is way slower (a good 20%) than using
integers and symbols, even if they follow the same "immutable
object" philosophy of fixed object_ids.

Why is that? What's so special about false and nil to have such a
degraded performance as a hash key?

hash.c:rb_any_hash is the answer to your question (funcall('hash') is
discarded only for Fixnums, Symbols and Strings. Note that String
numbers would be way better if it's hash was cached until first
destructive operation.

lopex

Matthias WÃ¤chter · Oct 3, 2007

hash.c:rb_any_hash is the answer to your question (funcall('hash') is
discarded only for Fixnums, Symbols and Strings.

So the first two cases of "switch (TYPE(a))" should be enhanced with
checks for nil, true and false like this:

switch (TYPE(a)) {
case T_FIXNUM:
case T_SYMBOL:
+ case T_NIL:
+ case T_FALSE:
+ case T_TRUE:
return (int)a;
break;

Objections?

- Matthias

Ryan Davis · Oct 3, 2007

So the first two cases of "switch (TYPE(a))" should be enhanced with
checks for nil, true and false like this:

switch (TYPE(a)) {
case T_FIXNUM:
case T_SYMBOL:
+ case T_NIL:
+ case T_FALSE:
+ case T_TRUE:
return (int)a;
break;

Send that and your original timings to ruby-core@ and/or file a bug =20
on rubyforge in the ruby project.

I made a blockchain and want to make a cryptocurrency, but my code doesn't verify hash of each block	2	Jun 2, 2024
How to change key name in json file with python	0	Oct 2, 2022
Hash key : string or symbol!?	6	Apr 9, 2009
hash example	2	Jul 12, 2010
replace lines in a file using hash key-value pairs	3	May 20, 2011
hash, key, problem...	4	Jun 18, 2008
BITCOIN PROGRAMMING - CODE INCLUDED - needs slight modification in linux terminal - NSA please do not block	0	Nov 2, 2024
I HAVE MADE AN IMPROVED INPUT FOR INTEGERS ONLY	3	Oct 28, 2024

hash key performance

Matthias WÃ¤chter

Marcin MielÅ¼yÅ„ski

Matthias WÃ¤chter

Ryan Davis

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads