Hash keys

J. Cooper

I ran into something I hadn't realized (common occurrence). Keying a
hash with a symbol is not the same as using a string. I guess it makes
sense, but I've seen it done both ways, and I had been always using
symbols for my keys. But then I ran into an issue on loading an external
YAML object into a hash, and didn't realize it was keyed with strings so
I got errors the first time around.
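
The mismatch described above can be reproduced in a few lines. This is only a minimal sketch: the YAML text and the variable names `yaml_text` and `config` are made up for illustration and stand in for the external file.

```ruby
require 'yaml'

# Hypothetical YAML document standing in for the external file.
yaml_text = "name: demo\nport: 8080\n"

config = YAML.load(yaml_text)

p config.keys      # the keys come back as Strings
p config["name"]   # the String key is found
p config[:name]    # nil, because :name and "name" are different keys
```

A lookup with the wrong kind of key does not raise by itself; it quietly returns nil, and the error usually shows up later when that nil is used.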

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?
2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Thanks
 
Rick DeNatale

I ran into something I hadn't realized (common occurrence). Keying a
hash with a symbol is not the same as using a string. I guess it makes
sense, but I've seen it done both ways, and I had been always using
symbols for my keys. But then I ran into an issue on loading an external
YAML object into a hash, and didn't realize it was keyed with strings so
I got errors the first time around.

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

Symbol keys
+ :a is one less keystroke than "a"
+? performance of Hash with symbol keys MIGHT be slightly
faster, but probably insignificant.
- More symbols get interned which can't get garbage
collected, even after the hash is.

String keys
+ keys can be GCed when removed from hash, or when the hash is GCed.

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Rails is probably most responsible for popularizing symbol keys.
ActiveSupport implements a HashWithIndifferentAccess which can use
either symbols or strings interchangeably in access methods.
Internally it uses string keys.
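
To make the idea concrete, here is a rough sketch of how indifferent access could work using only core Ruby. The class name `IndifferentHash` and its helper `convert_key` are made up for illustration. This is not ActiveSupport's actual implementation, just the same basic trick of storing string keys internally.

```ruby
# Hypothetical, simplified illustration; not ActiveSupport's code.
class IndifferentHash < Hash
  # Look up by Symbol or String; both resolve to the String key.
  def [](key)
    super(convert_key(key))
  end

  # Store under the String form of the key.
  def []=(key, value)
    super(convert_key(key), value)
  end

  def key?(key)
    super(convert_key(key))
  end

  private

  # Symbols become Strings; every other key is left untouched.
  def convert_key(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

settings = IndifferentHash.new
settings[:host]  = "localhost"
settings["port"] = 8080

p settings["host"]   # stored via Symbol, read via String
p settings[:port]    # stored via String, read via Symbol
p settings.keys      # only String keys are kept internally
```

A real implementation also has to cover methods such as fetch, merge and delete, which this sketch leaves out.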
 
Robert Dober

I ran into something I hadn't realized (common occurrence). Keying a
hash with a symbol is not the same as using a string. I guess it makes
sense, but I've seen it done both ways, and I had been always using
symbols for my keys. But then I ran into an issue on loading an external
YAML object into a hash, and didn't realize it was keyed with strings so
I got errors the first time around.

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

Symbols, look at this code:
----------------- 8< -----------------
# Only needed on Rubies that lack a built-in Symbol#to_proc.
Symbol.send( :define_method, :to_proc ){
  lambda{|x| x.send self }
} unless Symbol.method_defined?(:to_proc)

string_keys = %w{a b c}
symbol_keys = string_keys.map(&:to_sym)

string_hash = Hash[ *string_keys.zip([42]*3).flatten ]
symbol_hash = Hash[ *symbol_keys.zip([42]*3).flatten ]

p [:symbol, symbol_hash]
p [:string, string_hash]
puts "So far everything looks fine"
# The String keys are frozen, so appending to one raises an error here.
string_hash.each_pair do |k,v| k << "..." end
------------------------ 8< ----------------------
Ruby does a fine job of freezing the String keys of hashes, but for that
very reason I prefer to use immutable objects as keys whenever possible.
In our case that favors Symbols.
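
The freezing mentioned here can be seen in isolation with a small sketch. The variable names are made up for illustration, and `String.new` is used only to be sure of starting with an unfrozen string regardless of any frozen-string-literal setting.

```ruby
key  = String.new("a")   # an unfrozen, mutable String
hash = {}
hash[key] = 42

stored = hash.keys.first
p stored.frozen?         # the hash holds a frozen copy of the key
p stored.equal?(key)     # ... which is a different object from key

key << "..."             # mutating the original does not touch the hash
p hash

error = nil
begin
  stored << "..."        # mutating the stored key is refused
rescue => e
  error = e
end
p error.class
```

With a Symbol key none of this copying is needed, since a Symbol cannot be modified in the first place.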

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Handle them? If I pretended to be even more stupid than I actually
believe myself to be, I would say: yes, sure,
a_hash.clear ;)

But I guess that you want to change from one to the other, let me show
you from String to Symbol
------------------------ 8< ----------------------
### Do not do this at home :)
# Only for old Rubies (before 1.8.7), where each_with_index requires a block.
Array.send :define_method, :each_with_index do
  count = -1
  inject([]){|iwi,e| iwi << [e, count += 1] }
end if RUBY_VERSION < "1.8.7"

string_hash = %w{A Brave New World}.each_with_index.inject({}){|h,(v,i)| h.update v => i }
p string_hash

symbol_hash = Hash[ *string_hash.to_a.map{|k,v|[k.to_sym,v]}.flatten ]
p symbol_hash
------------------------ 8< ----------------------
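
On newer Rubies the same conversion can be written more directly. The sketch below assumes Ruby 2.5 or later, where Hash#transform_keys is available; the hash contents are the same as in the example above.

```ruby
string_hash = { "A" => 0, "Brave" => 1, "New" => 2, "World" => 3 }

# String keys to Symbol keys; returns a new hash, the original is untouched.
symbol_hash = string_hash.transform_keys(&:to_sym)
p symbol_hash

# And back again, from Symbol keys to String keys.
back_again = symbol_hash.transform_keys(&:to_s)
p back_again == string_hash
```

Unlike the flatten approach, this does not get confused when the values are themselves arrays.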
HTH
Robert
 
Robert Dober

Symbol keys
+ :a is one less keystroke than "a"
+? performance of Hash with symbol keys MIGHT be slightly
faster, but probably insignificant.
- More symbols get interned which can't get garbage
collected, even after the hash is.

Hmm, very interesting, but is this not rather an implementation choice?
That does not make the information less valuable, of course; I am just
curious.

Cheers
Robert
 
Robert Klemme

I ran into something I hadn't realized (common occurrence). Keying a
hash with a symbol is not the same as using a string. I guess it makes
sense, but I've seen it done both ways, and I had been always using
symbols for my keys. But then I ran into an issue on loading an external
YAML object into a hash, and didn't realize it was keyed with strings so
I got errors the first time around.

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

It depends: my personal convention is this: use symbols if the set of
keys is limited and probably known beforehand; use strings if the data
is read from an external resource (e.g. a file) and there could be
arbitrary key values.

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

I do not think there is a smooth one-size-fits-all way. You could of
course convert a Hash containing one set of keys to the other one. I
don't think it is worthwhile though and haven't seen it so far.

Kind regards

robert
 
J. Cooper

Alright, so in general if the hash is going to interact with the outside
world, I should use string keys, and it's not worth it particularly to
worry about handling mismatch (unless I'm embarking on a Rails-sized
framework)?

I guess I had figured symbols made more sense, as a key is kinda just an
identifier and there isn't a reason to perform string functions on it.
But I didn't realize the deal with the GC.
 
Justin Collins

Rick said:
Symbol keys
+ :a is one less keystroke than "a"
+? performance of Hash with symbol keys MIGHT be slightly
faster, but probably insignificant.
- More symbols get interned which can't get garbage
collected, even after the hash is.

String keys

+ keys can be GCed when removed from hash, or when the hash is GCed.


Another thing to look at is symbols are unique, whereas each time you
use a string literal to access the hash, you are creating a new object:

h1 = { :a => 1, :b => 2 }
h2 = { "a" => 1, "b" => 2 }

h2["a"] # Creates a one-time use string "a"
h1[:a] #No new object created, :a already exists

But if you are creating lots and lots of symbols, then that's lots of
unique objects being created which are not going to be garbage collected.
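
The identity difference can be checked directly. This is a hedged sketch: whether a given string literal allocates a new object depends on the Ruby version and on the frozen-string-literal setting, so the strings here are built with `String.new` to make the point independent of that.

```ruby
# Two mentions of the same symbol always refer to one object.
p :a.equal?(:a)

# Two separately built strings with equal contents are distinct objects...
s1 = String.new("a")
s2 = String.new("a")
p s1.equal?(s2)

# ...but they still find the same hash entry, because Hash compares keys
# with #hash and #eql?, not by object identity.
h2 = { "a" => 1, "b" => 2 }
p h2[s1]
p h2[s2]
```

Note that on Ruby 2.2 and later, symbols created dynamically (for example with to_sym) can be garbage collected, so the concern about symbols piling up applies mostly to older versions.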

Of course, strings and symbols are not the only choices for hash keys,
any object can be used. What you want may depend on the circumstances.
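
As a small illustration of non-string, non-symbol keys, here is a sketch using a Struct and an Array. The `Point` struct and the `labels` hash are made up for the example. Both Struct and Array compute #hash and #eql? from their contents, which is what lets an equal but separately created object find the entry.

```ruby
# Point is a made-up Struct used only for this illustration.
Point = Struct.new(:x, :y)

labels = {}
labels[Point.new(1, 2)] = "start"
labels[[3, 4]]          = "end"     # an Array works as a key too

p labels[Point.new(1, 2)]   # an equal Point finds the entry
p labels[[3, 4]]
p labels[Point.new(9, 9)]   # no such key
```

Mutable keys like these should not be changed while they are in the hash, since the hash would then look for them in the wrong place.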

-Justin
 
Phrogz

I do not think there is a smooth one-size-fits-all way. You could of
course convert a Hash containing one set of keys to the other one. I
don't think it is worthwhile though and haven't seen it so far.

http://api.rubyonrails.org/classes/HashWithIndifferentAccess.html

http://facets.rubyforge.org/rdoc/core/classes/Hash.html#M000070

(Neither are a refutation of your statements, just throwing a few data
points into a discussion that I'm too busy to formally join at the
moment.)
 
Robert Klemme

Alright, so in general if the hash is going to interact with the outside
world, I should use string keys, and it's not worth it particularly to
worry about handling mismatch (unless I'm embarking on a Rails-sized
framework)?

I am not sure what you mean by "handling mismatch". If by mismatch you
mean access with symbols and strings: I usually do not worry about this,
because I write the code that puts data into the Hash and reads it, so
I know, or can control, what happens. Personally I prefer uniform access.

If by "outside world" you mean a "data source that you do not control"
(e.g. web server logfiles, CSV data) then yes, in those cases I would
use Strings, namely the strings I read from that source.

I guess I had figured symbols made more sense, as a key is kinda just an
identifier and there isn't a reason to perform string functions on it.
But I didn't realize the deal with the GC.

Well, if there is a limited set (e.g. states of an object like :open and
:closed for an IO stream) then it makes perfect sense to use symbols.

Kind regards

robert
 
