How to sort a table in Ruby?

N

nkb

Hi.
I've a table with about 10 fields. I would like to be able to sort
ascending or decendingly, based on any one of the fields. I was
wondering if there might an elegant way to do this in Ruby? In C, I do
it by keeping track of the index of the array and then assigning them
individually. Thanks!!!
 
B

Brian Candler

I've a table with about 10 fields. I would like to be able to sort
ascending or decendingly, based on any one of the fields. I was
wondering if there might an elegant way to do this in Ruby? In C, I do
it by keeping track of the index of the array and then assigning them
individually. Thanks!!!

Something like this?

a = [
[5,3,5,"foo"],
[2,1,9,"bar"],
[3,9,3,"baz"],
]
a.sort { |x,y| x[3] <=> y[3] } # sort on 4th element

You can get a reverse sort by

a.sort { |x,y| y[3] <=> x[3] }

Regards,

Brian.
 
R

Robert Klemme

Brian Candler said:
I've a table with about 10 fields. I would like to be able to sort
ascending or decendingly, based on any one of the fields. I was
wondering if there might an elegant way to do this in Ruby? In C, I do
it by keeping track of the index of the array and then assigning them
individually. Thanks!!!

Something like this?

a = [
[5,3,5,"foo"],
[2,1,9,"bar"],
[3,9,3,"baz"],
]
a.sort { |x,y| x[3] <=> y[3] } # sort on 4th element

You can get a reverse sort by

a.sort { |x,y| y[3] <=> x[3] }

That's even better:

a.sort_by {|row| row[3]}

Regards

robert
 
G

Gavin Kistner

a.sort { |x,y| x[3] <=> y[3] } # sort on 4th element
That's even better:
a.sort_by {|row| row[3]}

The reason that #sort_by is better than #sort is described in the
documentation. In short (IIRC) it's because it caches the key values
for each row (performing the block only once for each row) and uses
those to sort, rather than invoking the block for every unique pair it
has to compare.

A downside, however, is the inability to specify something like a
reverse sort. (Although for that case you can just reverse the array
afterwards.)

Actually, does anyone have a compelling example where #sort produce a
result (which is reasonable) which #sort_by cannot? (By 'reasonable' I
mean that something like a.sort{ |x,y| x[3] <=> y[7] } does something
impossible for #sort_by, but probably would produce erratic results,
given no way of knowing in which order the pairs come.)
 
R

Robert Klemme

Gavin Kistner said:
a.sort { |x,y| x[3] <=> y[3] } # sort on 4th element
That's even better:
a.sort_by {|row| row[3]}

The reason that #sort_by is better than #sort is described in the
documentation. In short (IIRC) it's because it caches the key values
for each row (performing the block only once for each row) and uses
those to sort, rather than invoking the block for every unique pair it
has to compare.

Yep. Plus, it's shorter to type, which was the main advantage I wanted to
point out here. :)
A downside, however, is the inability to specify something like a
reverse sort. (Although for that case you can just reverse the array
afterwards.)

Actually, does anyone have a compelling example where #sort produce a
result (which is reasonable) which #sort_by cannot? (By 'reasonable' I
mean that something like a.sort{ |x,y| x[3] <=> y[7] } does something
impossible for #sort_by, but probably would produce erratic results,
given no way of knowing in which order the pairs come.)

The only one that I can think of at the moment is the one where there is
no proper comparision op defined. This is a made up example:
NoMethodError: undefined method `<=>' for #<struct El foo=nil, bar=nil>
from (irb):3

=> [#<struct El foo=6, bar=7>, #<struct El foo=5, bar=9>, #<struct El
foo=4, bar=8>, #<struct El foo=8, bar=9>, #<struct El foo=8, bar=7>,
#<struct El
[QUOTE="# said:
values.sort {|a,b| c = a.foo <=> b.foo; c == 0 ? a.bar <=> b.bar : c}
[/QUOTE]
=> [#<struct El foo=1, bar=2>, #<struct El foo=2, bar=2>, #<struct El
foo=3, bar=5>, #<struct El foo=4, bar=5>, #<struct El foo=4, bar=8>,
#<struct El
foo=5, bar=9>, #<struct El foo=6, bar=7>, #<struct El foo=8, bar=2>,
#<struct El foo=8, bar=7>, #<struct El foo=8, bar=9>]

You can do this with sort_by but you have to create new instances (Arrays
in this case), which might or might not be acceptable depending on the
circumstances:
values.sort_by {|x| [x.foo, x.bar]}
=> [#<struct El foo=1, bar=2>, #<struct El foo=2, bar=2>, #<struct El
foo=3, bar=5>, #<struct El foo=4, bar=5>, #<struct El foo=4, bar=8>,
#<struct El
foo=5, bar=9>, #<struct El foo=6, bar=7>, #<struct El foo=8, bar=2>,
#<struct El foo=8, bar=7>, #<struct El foo=8, bar=9>]

I guess most real world cases can be covered by #sort_by.

Kind regards

robert
 
F

Florian Gross

Gavin said:
a.sort { |x,y| x[3] <=> y[3] } # sort on 4th element
That's even better:
a.sort_by {|row| row[3]}

The reason that #sort_by is better than #sort is described in the
documentation. In short (IIRC) it's because it caches the key values for
each row (performing the block only once for each row) and uses those to
sort, rather than invoking the block for every unique pair it has to
compare.

A downside, however, is the inability to specify something like a
reverse sort. (Although for that case you can just reverse the array
afterwards.)

It is possible when you want to sort by something that can be expressed
as a number:

persons.sort_by { |person| -person.age }

Regards,
Florian Gross
 
F

Florian Frank

The reason that #sort_by is better than #sort is described in the
documentation. In short (IIRC) it's because it caches the key values
for each row (performing the block only once for each row) and uses
those to sort, rather than invoking the block for every unique pair it
has to compare.

sort_by is only better if creating a temporary array of length n for
sort keys is faster than yielding the block or calling the <=> method
on elements ca. n*log(n) times. In theory this is always computational
faster but not necessarily on real machines that have to manage the
required memory. Using sort! instead of sort also can make a
difference.

The trade-off leans towards sort_by if the computations of the keys is
really slow (like accessing the filesystem or doing a dns-lookup) or if
comparing the computed keys is much faster than comparing the elements
directly. The last case can be used to speed up a sort by computing a
string of a complex object which can be compared more efficiently than
comparing the objects themselves. An interesting article on this topic
is "A Fresh Look at Efficient Perl Sorting" by Uri Guttman and Larry
Rosler, even if it's quite perl-centric:
http://www.sysarch.com/perl/sort_paper.html
A downside, however, is the inability to specify something like a
reverse sort. (Although for that case you can just reverse the array
afterwards.)

If your elements respond to a meaningful unary - you can do something
like
[1,3,4,2,5].sort_by { |x| -x }
for reverse sorting.
Actually, does anyone have a compelling example where #sort produce a
result (which is reasonable) which #sort_by cannot? (By 'reasonable' I
mean that something like a.sort{ |x,y| x[3] <=> y[7] } does something
impossible for #sort_by, but probably would produce erratic results,
given no way of knowing in which order the pairs come.)

Well, you could always cheat like in this reverse sort example:

class Cheat < Struct.new:)a)
def <=>(other) other.a <=> a end
end

[1,3,4,2,5].sort_by { |x| Cheat.new(x) }

Florian Frank
 
B

Brian Candler

Using sort! instead of sort also can make a
difference.

remembering that Array#sort is exactly Array#dup.sort! (as it the case with
many other methods, e.g. String#downcase is String#dup.downcase!)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,159
Messages
2,570,879
Members
47,416
Latest member
LionelQ387

Latest Threads

Top