Accessing Hash elements in sorted order?

Chris

Problem: I wanted to access the elements of a Hash (specifically from
CGI#params) in sorted order.

My resolution: Hash#sort returns an array which can be accessed via
Array#each, but each item is another array. So the return of
cgi.params.sort is:

[[key1,value1],[key2,value2],[key3,value3],...,[keyn,valuen]]

So I wrote (essentially -- the meat):

require 'cgi'

cgi = CGI.new
cgi.params.sort.each { |pair| puts "#{pair[0]}=#{pair[1]}" }

Is this *really* the best way to access the elements of Hash in sorted
order? I suppose, if at base, this is the way to do it, I could extend
Hash to abstract this...
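
A minimal sketch of the "extend Hash" idea, assuming a plain Hash stands in for cgi.params. The method name each_sorted and the sample data are hypothetical; neither is part of Ruby's standard library or of the original post.

```ruby
# Hypothetical helper: each_sorted is NOT part of Ruby's standard library.
class Hash
  # Yields each key/value pair in key order, as produced by Hash#sort.
  def each_sorted(&block)
    return sort.each unless block  # no block given: return an enumerator
    sort.each(&block)
    self
  end
end

# Assumed sample data standing in for cgi.params.
params = { "zip" => "12345", "city" => "Springfield", "name" => "Pat" }

lines = []
params.each_sorted { |key, value| lines << "#{key}=#{value}" }
puts lines
```

Reopening a core class affects every Hash in the program, so a standalone helper method may be the safer choice in shared code.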

-ceo
 
Florian Gross

Chris said:
cgi = CGI.new
cgi.params.sort.each { |pair| puts "#{pair[0]}=#{pair[1]}" }

One small suggestion:

cgi.params.sort.each do |(key, value)|
  puts "#{key}=#{value}"
end

Chris said:
Is this *really* the best way to access the elements of Hash in sorted
order? I suppose, if at base, this is the way to do it, I could extend
Hash to abstract this...

There has been some discussion going on whether Ruby's Hash should be
sorted by default on this list recently.

Regards,
Florian Gross
 
Chris

Florian said:
Chris said:
cgi = CGI.new
cgi.params.sort.each { |pair| puts "#{pair[0]}=#{pair[1]}" }


One small suggestion:

cgi.params.sort.each do |(key, value)|
  puts "#{key}=#{value}"
end

This is *exactly* what I was looking for. Figured there had to be a way
to access the key/value directly as above.

Florian said:
There has been some discussion going on whether Ruby's Hash should be
sorted by default on this list recently.

I would vote in favor of it.

-ceo
 
Markus

Chris said:
I would vote in favor of it.

I'd vote against, for the following reasons.

1. Every time it comes up the thread goes quite a while before the
advocates realize that they aren't all assuming the same sorting
order. Some are thinking "sorted by key (obviously)" while
others are thinking "sorted by order of insertion (obviously)"
and still others are thinking "sorted by value" or "sorted by
some arbitrary key so long as it's consistent," etc.
2. Many of the proposals are not particularly well defined when you
consider issues like modification and alternate means of
construction.
3. Hash, like String, Array, etc. is a fairly well established data
structure. Although it might for some uses be nice to have a
string where the characters were "automatically" sorted into
alphabetical order, this isn't what anyone familiar with strings
would expect.
4. It may well impose a significant performance penalty
5. It may well break existing code
6. It is easy enough to produce the desired effect(s) by other
means.
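
As an illustration of point 6, here is a small sketch of getting different orderings from an ordinary Hash with Enumerable#sort_by. The hash contents are made-up sample data.

```ruby
# Made-up sample data.
scores = { "carol" => 82, "alice" => 95, "bob" => 70 }

# Each call returns a new array of [key, value] pairs; the Hash is untouched.
by_key        = scores.sort_by { |key, _value| key }
by_value      = scores.sort_by { |_key, value| value }
by_value_desc = scores.sort_by { |_key, value| -value }

by_key.each { |key, value| puts "#{key}=#{value}" }
```

Because the caller states the ordering at the point of use, the "which order?" ambiguity from point 1 does not arise.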

Instead, I support the extension of => to act as a general operator
in all contexts or (much better, IMHO) the addition of many more user
definable/overridable operators that would let people add new classes
(with sweet syntax) that worked the way they wanted (see
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/111627 for
more details on this idea). If any of these became popular, it could
climb its way into the language after being field-tested.

-- Markus
 
Hal Fulton

Markus said:
I'd vote against, for the following reasons.

1. Every time it comes up the thread goes quite a while before the
advocates realize that they aren't all assuming the same sorting
order. Some are thinking "sorted by key (obviously)" while
others are thinking "sorted by order of insertion (obviously)"
and still others are thinking "sorted by value" or "sorted by
some arbitrary key so long as it's consistent," etc.

Let's distinguish between "sorted" and "ordered." If it's ordered, you
can sort it however you like.
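
A small sketch of the distinction, assuming Ruby 1.9 or later, where Hash keeps insertion order (a later development than this thread). The sample data is made up.

```ruby
# Made-up sample data, inserted in a deliberately unsorted order.
stock = {}
stock["pear"]  = 1
stock["apple"] = 3
stock["fig"]   = 2

# "Ordered": iteration follows insertion order (Ruby 1.9+ behavior).
insertion_order = stock.keys

# "Sorted": derived from the ordered collection, by whatever rule you pick.
by_key   = stock.sort.map(&:first)
by_value = stock.sort_by { |_name, count| count }.map(&:first)

puts insertion_order.inspect, by_key.inspect, by_value.inspect
```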


Hal
 
Markus

Hal said:
Let's distinguish between "sorted" and "ordered." If it's ordered, you
can sort it however you like.

Agreed (as a terminology distinction), with the caveat that "you
can sort it however you like" is subject to some reasonability
restrictions.

-- Markus
 
Jim Haungs

One of the nifty abstractions in Smalltalk that is missing in Ruby is
the Association class.

An association is simply a key/value pair, and a dictionary is simply
a Set of Associations, where duplicates are not allowed because the
default equality operation on Association compares two associations by
their keys.

Association also defines a default <= ordering method.

Separating these two concepts supports cleaner sorting by key and by
value, and simplifies various kinds of enumeration.

Associations are also useful by themselves for other kinds of paired
objects.
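
A rough Ruby sketch of the idea, not a port of Smalltalk's actual class. The Association class below is hypothetical: it compares, hashes, and orders by key only, so a Set of them behaves like a small dictionary.

```ruby
require 'set'

# Hypothetical class for illustration; Ruby ships no Association class.
class Association
  include Comparable
  attr_reader :key, :value

  def initialize(key, value)
    @key = key
    @value = value
  end

  # Default ordering (and ==, via Comparable) looks at the key only.
  def <=>(other)
    key <=> other.key
  end

  # Set membership uses eql? and hash, so both are key-based too.
  def eql?(other)
    other.is_a?(Association) && key.eql?(other.key)
  end

  def hash
    key.hash
  end

  def to_s
    "#{key}=#{value}"
  end
end

dictionary = Set.new
dictionary << Association.new("b", 2)
dictionary << Association.new("a", 1)
dictionary << Association.new("a", 99)  # same key: the Set keeps the first one

sorted_by_key   = dictionary.sort.map(&:to_s)
sorted_by_value = dictionary.sort_by(&:value).map(&:to_s)
puts sorted_by_key
```

Unlike Hash#[]=, adding a duplicate key to this Set does not replace the stored value; a real dictionary built this way would need an explicit update step.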
 
Markus

Jim said:
One of the nifty abstractions in Smalltalk that is missing in Ruby is
the Association class.

Yes! That's what I was thinking of when I suggested that '=>'
should be an operator rather than a local syntax hack. I was thinking
it would return a one element Hash (my smalltalk is so rusty I don't
even squeak) but having Associations would be much better!

-- MarkusQ
 
Florian Gross

Gavin said:
cgi.params.sort.each do |key, value|

works as well. Do the parentheses add anything -- like
future-proofing?

I think it is most clear when using parentheses. (Because it looks
exactly like the array structure: [number, [english, german]])

Personally, I would like the x = 1, 2 case to be equivalent to x, = 1,
2. (It would allow blocks to reject arguments easily. E.g. #each could
yield(item, index) and if you were to do |item| it would work correctly.)

Here are some possible styles. Maybe you will prefer a different one than I do:

number_words = [
  [1, %w{one eins}],
  [2, %w{two zwei}],
  [3, %w{three drei}]
]

number_words.each do |(number, (english, german))|
  puts "#{number} is '#{english}' in English and " +
       "'#{german}' in German"
end

number_words.each do |number, words|
  # I think this should be *words for clarity, but in the
  # rhs of the argument list assignment there is no splat. (AFAIK)
  english, german = words
  ...
end

number_words.each do |number, (english, german)|
  # Combined
  ...
end

Hm, I guess it's mostly an "explicit is easier to understand than
implicit" decision for me.

Regards,
Florian Gross
 
Florian Gross

ChrisO said:
When I saw the parens, I thought they might need to be there, but I
digressed from asking since it was simple enough in irb to experiment
and find out they were NOT needed.

Still, it initially led me, as a fairly new player in Ruby, to consider
that the parens were perhaps some "mapping" mechanism since I had a
single item "pair" in that place before. Instead, when I realized the
parens were superfluous, I reasoned that Ruby was smart enough to "map"
the key/value on its own, it would seem. (Similar to #scan on Match
objects from the #scan regex? Kinda? If someone would care to
comment...?)

It's all just Ruby's assignment semantics at work. I'm not sure if
they're documented in the old Pickaxe at all, but if they are the
documentation will be slightly out-dated. (Semantics were different in
1.6.8 and I think that between 1.8.0 and 1.8.1 they changed slightly for
block parameters in respect to the difference between lambda { } and
Proc.new)

You can see the connections here:

a = [1, 2] # a: [1, 2]
a, = [1, 2] # a: 1
a = 1, 2 # a: [1, 2] -- equivalent to yield() sample giving warning
a, b = [1, 2] # a: 1, b: 2
(a, b) = [1, 2] # a: 1, b: 2 -- same as a, b = *[1, 2]

eval("yield [1, 2]") { |a| p a } # a: [1, 2]
eval("yield [1, 2]") { |a,| p a } # a: 1
eval("yield [1, 2]") { |a,b| p [a,b] } # a: 1, b: 2
eval("yield [1, 2]") { |(a,b)| p [a,b] } # a: 1, b: 2

eval("yield(1, 2)") { |a| p a } # a: [1, 2] -- warning
eval("yield(1, 2)") { |a,| p a } # a: 1
eval("yield(1, 2)") { |a,b| p [a,b] } # a: 1, b: 2
eval("yield(1, 2)") { |(a,b)| p [a,b] } # a: 1, b: 2
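
The block-parameter cases above can also be tried with ordinary methods instead of eval. This is a sketch: the helper names yield_array and yield_pair are made up, and the results in the comments are for Ruby 1.9 and later, where the yield(1, 2) with |a| case differs from the 1.8 behavior listed above (a receives 1, with no warning).

```ruby
# Hypothetical helpers standing in for the eval("yield ...") lines.
def yield_array
  yield [1, 2]   # one argument: an array
end

def yield_pair
  yield(1, 2)    # two separate arguments
end

results = {
  array_one_param:  yield_array { |a| a },             # [1, 2]
  array_trailing:   yield_array { |a,| a },            # 1
  array_two_params: yield_array { |a, b| [a, b] },     # [1, 2]
  array_parens:     yield_array { |(a, b)| [a, b] },   # [1, 2]
  pair_one_param:   yield_pair  { |a| a },             # 1 (Ruby 1.9+)
  pair_trailing:    yield_pair  { |a,| a },            # 1
  pair_two_params:  yield_pair  { |a, b| [a, b] }      # [1, 2]
}

results.each { |name, value| puts "#{name}: #{value.inspect}" }
```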

Regards,
Florian Gross
 
Brian Candler

Markus said:
I'd vote against, for the following reasons.

1. Every time it comes up the thread goes quite a while before the
advocates realize that they aren't all assuming the same sorting
order. Some are thinking "sorted by key (obviously)" while
others are thinking "sorted by order of insertion (obviously)"
and still others are thinking "sorted by value" or "sorted by
some arbitrary key so long as it's consistent," etc.
2. Many of the proposals are not particularly well defined when you
consider issues like modification and alternate means of
construction.
3. Hash, like String, Array, etc. is a fairly well established data
structure. Although it might for some uses be nice to have a
string where the characters were "automatically" sorted into
alphabetical order, this isn't what anyone familiar with strings
would expect.
4. It may well impose a significant performance penalty
5. It may well break existing code
6. It is easy enough to produce the desired effect(s) by other
means.

7. It would almost certainly no longer be a Hash!! In other words, to get
any efficiency for the 'each' method, you would have to implement it as a
B-Tree or similar data structure, not a hash. I'm sure there's a tree
library in RAA, so you can always just use that instead.

That's unless 'sorted' simply means 'kept in order of insertion'. In that
case, it could be a combination of a hash and a doubly-linked list. But that
means it still wouldn't be just a hash.
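
A rough sketch of the "hash plus list" combination for insertion order. The class name InsertionOrderedHash is hypothetical, and a plain Array stands in for the doubly-linked list, which keeps the code short but makes delete O(n).

```ruby
# Hypothetical class for illustration only.
class InsertionOrderedHash
  include Enumerable

  def initialize
    @values = {}   # key => value, for fast lookup
    @order  = []   # keys in order of first insertion (stand-in for a linked list)
  end

  def []=(key, value)
    @order << key unless @values.key?(key)  # re-assigning keeps the old position
    @values[key] = value
  end

  def [](key)
    @values[key]
  end

  def delete(key)
    @order.delete(key)   # O(n) with an Array; a linked list would avoid this
    @values.delete(key)
  end

  # Yields key/value pairs in insertion order.
  def each
    return enum_for(:each) unless block_given?
    @order.each { |key| yield key, @values[key] }
  end
end

ordered = InsertionOrderedHash.new
ordered["b"] = 1
ordered["a"] = 2
ordered["b"] = 3   # updates the value, position unchanged

puts ordered.to_a.inspect
puts ordered.sort.inspect
```

Lookup still goes through the inner Hash; only iteration consults the key list, which is the extra structure described above.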

Regards,

Brian.
 
