sort_by: multiple fields with reverse sort

R

Rahul Kumar

I need to use *sort_by* to sort a table, since the user could select
columns in any order.

b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
["newton", 10, 3] ]
b.sort_by{|x| [ x[1], x[0] ]}

Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.

I tried:

b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.
Thx, rahul.
 
S

Stefano Crocco

|I need to use *sort_by* to sort a table, since the user could select
|columns in any order.
|
| b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
|["newton", 10, 3] ]
| b.sort_by{|x| [ x[1], x[0] ]}
|
|Works fine. However, often the user will want a reverse sort on some
|field. I do not off-hand know the datatype of the field, but in most
|cases it will be a String.
|
|I tried:
|
| b.sort_by{|x| [ !x[1] ]}
|This works (one criteria), but this does not:
| b.sort_by{|x| [ !x[1], x[0] ]}
|
|Another thread here, suggested using a minus for Numbers but what about
|Strings.
|Thx, rahul.

Are you sure the one criterium version works as you think? !x[1] returns a
boolean value (true if x[1] is false or nil and false otherwise). So, if the
contents of the arrays are all strings or numbers, it'll compare items by
comparing equal arrays only containing the element false. Most likely, this
would give you the elements in the same order.

I don't know how to do what you want using sort_by, but you can do that using
sort. For example, the following code does a reverse sort on the 1 index and a
normal sort on the 0 index

b.sort do |a, b|
res = -(a[1] <=> b[1])
res = a[0] <=> b[0] if res == 0
res
end

For more information, look at the documentation for Enumerable#sort and the
Comparable module (for the <=> operator).

I hope this helps

Stefano
 
Y

Y. NOBUOKA

The one way to do a reverse sort is wrapping a sort target in a class
for a reverse sort.
For instance:

# This class is a class for a reverse sort.
# An Object of various datatypes can be wrapped in an object of this class.
class ItemForReverseSort
def initialize( item )
@item =3D item
end
def item
@item
end
def <=3D>( target )
( self.item <=3D> target.item ) * (-1)
end
end

b =3D [ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5], ["newton", 1=
0, 3] ]

# normal sort
p b.sort_by{|x| [ x[1], x[0] ]}
#=3D> [["newton", 10, 3], ["archie", 20, 5], ["radio", 20, 5], ["radio", =
30, 5]]

# normal sort for 1st item, and reverse sort for 2nd item
p b.sort_by{|x| [ x[1], ItemForReverseSort.new(x[0]) ]}
#=3D> [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5], ["radio", =
30, 5]]


2010/10/13 Rahul Kumar said:
I need to use *sort_by* to sort a table, since the user could select
columns in any order.

=A0 =A0b=3D[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
["newton", 10, 3] ]
=A0 =A0b.sort_by{|x| [ x[1], x[0] ]}

Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.

I tried:

=A0 =A0b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
=A0 =A0b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.
Thx, rahul.

--=20
NOBUOKA Yuya
 
R

Rahul Kumar

Y. NOBUOKA wrote in post #949783:
The one way to do a reverse sort is wrapping a sort target in a class
for a reverse sort.
For instance:

Thanks, i will have to consider this.

To the previous reply,
Are you sure the one criterium version works as you think? !x[1] returns
a
boolean value

Frankly, i am not sure. i just tried it on irb.

My issue is that i do not know how many columns a user will sort on, and
what combination. So the normal sort() appears cumbersome, unless i can
dynamically build a string and eval() it. Not sure i know how to do
that.

sort_by allows me to give a list as argument, so that appears suitable.
 
R

Rahul Kumar

When i sat down to code, i realized that i could not just put the
sort_keys list to sort_by, i would have to unroll the sort_keys array.
So, i might as well use sort. This seems to work in the small case i
have. Could someone comment on this, and suggest cleaner better way to
do. Thanks.

b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5], ["newton",
10, 3] ]
sort_keys=[1,0]
rev_flag = [true, false]

c=b.sort{|x,y|
res = 0
sort_keys.each_with_index { |e,i|
if rev_flag
res = y[e] <=> x[e]
else
res = x[e] <=> y[e]
end
break if res != 0
}
res
}
c
 
R

Rob Biedenharn

When i sat down to code, i realized that i could not just put the
sort_keys list to sort_by, i would have to unroll the sort_keys array.
So, i might as well use sort. This seems to work in the small case i
have. Could someone comment on this, and suggest cleaner better way to
do. Thanks.

b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5], ["newton",
10, 3] ]
sort_keys=[1,0]
rev_flag = [true, false]

c=b.sort{|x,y|
res = 0
sort_keys.each_with_index { |e,i|
if rev_flag
res = y[e] <=> x[e]
else
res = x[e] <=> y[e]
end
break if res != 0
}
res
}
c



While it is not strictly equivalent to array_of_strings.sort.reverse,
this little addition to String allows array_of_strings.sort_by{|s|-s}

class String
RAB_REPLACE = ('a'..'z').to_a.reverse.join.freeze
def -@
self.tr('a-z',RAB_REPLACE)
end
end

Then you can do:

b.sort_by {|ary| [ary[1], -ary[0]] }

irb> b.sort_by {|ary| [ary[1], -ary[0]] }
=> [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5], ["radio",
30, 5]]

If your strings are always simple words, then this may be enough for
you (or at least give you an idea).

If your array is short (i.e., the difference between sort and sort_by
is acceptable), then you can always do:

irb> b.sort {|e1,e2| (e1[1] <=> e2[1]).nonzero? || e2[0] <=> e2[0] }
=> [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5], ["radio",
30, 5]]

( which doesn't rely on String#-@ )

-Rob

Rob Biedenharn
(e-mail address removed) http://AgileConsultingLLC.com/
(e-mail address removed) http://GaslightSoftware.com/
 
R

Ryan Davis

The one way to do a reverse sort is wrapping a sort target in a class
for a reverse sort.
For instance:
=20
# This class is a class for a reverse sort.
# An Object of various datatypes can be wrapped in an object of this = class.
class ItemForReverseSort
def initialize( item )
@item =3D item
end
def item
@item
end
def <=3D>( target )
( self.item <=3D> target.item ) * (-1)
end
end

I like this, but think it is more complex than it needs to be.

module ReverseSort
def <=3D> target
-super
end
end

class Object
def -@
self.dup.extend ReverseSort
end
end

b =3D [ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5], =
["newton", 10, 3] ]

p b.sort_by{|x| [ x[1], -x[0] ]}
# =3D> [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5], =
["radio", 30, 5]]
 
S

Steve Howell

When i sat down to code, i realized that i could not just put the
sort_keys list to sort_by, i would have to unroll the sort_keys array.
So, i might as well use sort. This seems to work in the small case i
have. Could someone comment on this, and suggest cleaner better way to
do. Thanks.

b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5], ["newton",
10, 3] ]
sort_keys=[1,0]
rev_flag = [true, false]

c=b.sort{|x,y|
  res = 0
  sort_keys.each_with_index { |e,i|
    if rev_flag
      res = y[e] <=> x[e]
    else
      res = x[e] <=> y[e]
    end
    break if res != 0
  }
  res}


I definitely feel that sort_by has a compelling advantage over sort
for any large-ish dataset involving keys with nontrivial
transformations from the original object. You obviously want to avoid
doing NlogN transformations when N would suffice.

Does Ruby's sort_by method guarantee a stable sort?

If it does, then you could do sort_by the first field (reversing as
needed), then sort_by the second field (reversing as needed), then
sort_by the third field (reversing as needed), and so on. Doing M
sort passes on 1 key is no less efficient than doing 1 sort pass on M
keys, except to the extent that the latter maybe short-circuits a few
comparisons. At worst you are talking an M-ish difference that is
mostly dwarfed by NlogN.

Your essential underlying question about representing the "inverse" of
a key is pretty intriguing. I don't know of any general way to solve
that problem, other than encapsulating it in a class, as has been
suggested.
 
S

Steve Howell

I need to use *sort_by* to sort a table, since the user could select
columns in any order.

    b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
["newton", 10, 3] ]
    b.sort_by{|x| [ x[1], x[0] ]}

Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.

I tried:

    b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
    b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.
Thx, rahul.

It would be nice if Ruby supported a sort_by on steroids.

sorted_list = list.multi_field_sort_by(
{ |x| x.department.name },
{ |x| x.position.name },
desc { |x| x.level },
desc { |x| x.salary ? x.salary : x.rate * db_lookup(x,
'hours_worked') }
)

I believe you could write a decent multi_field_sort_by in Ruby that
would be efficient for large enough lists to outperform more tedious
approaches, but it would be even better if Ruby natively supported it.

My proposed syntax might be slightly off, but you get the idea. You'd
pass a list of blocks that represent the successive tiebreakers, and
multi_field_sort_by would presumably cache the results from each
transformation, evaluating the blocks only as necessary. The "desc"
thingy would actually produce some kind of wrapper that
multi_field_sort_by could introspect to know that it needs to apply a
particular tiebreaker in reverse order.
 
J

Jeremy Bopp

It would be nice if Ruby supported a sort_by on steroids.

sorted_list = list.multi_field_sort_by(
{ |x| x.department.name },
{ |x| x.position.name },
desc { |x| x.level },
desc { |x| x.salary ? x.salary : x.rate * db_lookup(x,
'hours_worked') }
)

I believe you could write a decent multi_field_sort_by in Ruby that
would be efficient for large enough lists to outperform more tedious
approaches, but it would be even better if Ruby natively supported it.

My proposed syntax might be slightly off, but you get the idea. You'd
pass a list of blocks that represent the successive tiebreakers, and
multi_field_sort_by would presumably cache the results from each
transformation, evaluating the blocks only as necessary. The "desc"
thingy would actually produce some kind of wrapper that
multi_field_sort_by could introspect to know that it needs to apply a
particular tiebreaker in reverse order.

How about something like this:

module Enumerable
# sort_by will take a key generator and an optional comparator
# and perform a Schwartzian Transform over the data.
#
# This is a general solution which is relatively inefficient than
# a purpose-built sorting function if the keys are trivial to
# generate.
def sort_by(cmp = lambda { |a, b| a <=> b }, &key)
collect do |item| # Generate keys from the list items.
[key[item], item]
end.sort do |a, b| # Sort the keys.
cmp[a[0], b[0]]
end.collect do |kv| # Return the items in key sort order.
kv[1]
end
end
end

# This will cause the sort to operate over the second and then
# the first column.
key = lambda { |item| [item[1], item[0]] }

# This will cause the sort to operate normally on the second
# column and in reverse on the first column.
cmp = lambda do |a, b|
res = a[0] <=> b[0]
break res unless res == 0
b[1] <=> a[1]
end

# Sample data from Ryan's post.
a = [
["radio", 30, 5],
["radio", 20, 5],
["archie", 20, 5],
["newton", 10, 3]
]

# Sort over the second and then first columns each in normal order.
p a.sort_by(&key)
# => [["newton", 10, 3], ["archie", 20, 5], ["radio", 20, 5],
# ["radio", 30, 5]]

# Sort over the second column in normal order and then the first
# column in reverse order.
p a.sort_by(cmp, &key)
# => [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5],
# ["radio", 30, 5]]
 
S

Steve Howell

It would be nice if Ruby supported a sort_by on steroids.
  sorted_list = list.multi_field_sort_by(
    { |x| x.department.name },
    { |x| x.position.name },
    desc { |x| x.level },
    desc { |x| x.salary ? x.salary : x.rate * db_lookup(x,
'hours_worked') }
  )
I believe you could write a decent multi_field_sort_by in Ruby that
would be efficient for large enough lists to outperform more tedious
approaches, but it would be even better if Ruby natively supported it.
My proposed syntax might be slightly off, but you get the idea.  You'd
pass a list of blocks that represent the successive tiebreakers, and
multi_field_sort_by would presumably cache the results from each
transformation, evaluating the blocks only as necessary.  The "desc"
thingy would actually produce some kind of wrapper that
multi_field_sort_by could introspect to know that it needs to apply a
particular tiebreaker in reverse order.

How about something like this:

module Enumerable
  # sort_by will take a key generator and an optional comparator
  # and perform a Schwartzian Transform over the data.
  #
  # This is a general solution which is relatively inefficient than
  # a purpose-built sorting function if the keys are trivial to
  # generate.
  def sort_by(cmp = lambda { |a, b| a <=> b }, &key)
    collect do |item|   # Generate keys from the list items.
      [key[item], item]
    end.sort do |a, b|  # Sort the keys.
      cmp[a[0], b[0]]
    end.collect do |kv| # Return the items in key sort order.
      kv[1]
    end
  end
end

# This will cause the sort to operate over the second and then
# the first column.
key = lambda { |item| [item[1], item[0]] }

# This will cause the sort to operate normally on the second
# column and in reverse on the first column.
cmp = lambda do |a, b|
        res = a[0] <=> b[0]
        break res unless res == 0
        b[1] <=> a[1]
      end

# Sample data from Ryan's post.
a = [
  ["radio", 30, 5],
  ["radio", 20, 5],
  ["archie", 20, 5],
  ["newton", 10, 3]
]

# Sort over the second and then first columns each in normal order.
p a.sort_by(&key)
# => [["newton", 10, 3], ["archie", 20, 5], ["radio", 20, 5],
#                                                 ["radio", 30, 5]]

# Sort over the second column in normal order and then the first
# column in reverse order.
p a.sort_by(cmp, &key)
# => [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5],
#                                                 ["radio", 30, 5]]

I think it's a good start, but it still requires you to write a
somewhat customized cmp function. A truly optimized multi-field sort
might also be able to avoid calculating all elements of the key as
well, since they are often only needed as tiebreakers.
 
S

Steve Howell

I need to use *sort_by* to sort a table, since the user could select
columns in any order.

    b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
["newton", 10, 3] ]
    b.sort_by{|x| [ x[1], x[0] ]}

Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.

I tried:

    b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
    b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.

Here is a solution that allows you to lazily evaluate keys and specify
that certain keys are to be reversed.

module Enumerable
def sort_by(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
fields = elem[0]
key = elem[1]
if fields.size <= i
fields = key_methods[0].call(key)
end
end
if key_methods[1] == :desc
a, b = b, a
end
result = (a[0] <=> b[0])
return result unless result == 0
i += 1
end
return result
end

self.collect do |item|
[ [], item ]
end.sort do |a, b|
compare(a, b, key_methods)
end.collect do |kv|
kv[1]
end
end
end

a = [
["radio", 30, 5],
["radio", 40, 5],
["radio", 20, 5],
["archie", 20, 5],
["newton", 10, 3]
]

puts a.sort_by(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect
 
S

Steve Howell

I need to use *sort_by* to sort a table, since the user could select
columns in any order.
    b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],
["newton", 10, 3] ]
    b.sort_by{|x| [ x[1], x[0] ]}
Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.
    b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
    b.sort_by{|x| [ !x[1], x[0] ]}
Another thread here, suggested using a minus for Numbers but what about
Strings.

Here is a solution that allows you to lazily evaluate keys and specify
that certain keys are to be reversed.

  module Enumerable
    def sort_by(*key_methods)
      # allow for multiple key_methods and only
      # evaluate them when they are truly needed
      # for the sort
      def compare(a,b, key_methods)
        i = 0
        while i < key_methods.size do
          for elem in [a, b] do
            fields = elem[0]
            key = elem[1]
            if fields.size <= i
              fields = key_methods[0].call(key)
            end
          end
          if key_methods[1] == :desc
            a, b = b, a
          end
          result = (a[0] <=> b[0])
          return result unless result == 0
          i += 1
        end
        return result
      end

      self.collect do |item|
        [ [], item ]
      end.sort do |a, b|
        compare(a, b, key_methods)
      end.collect do |kv|
        kv[1]
      end
    end
  end

  a = [
    ["radio", 30, 5],
    ["radio", 40, 5],
    ["radio", 20, 5],
    ["archie", 20, 5],
    ["newton", 10, 3]
  ]

  puts a.sort_by(
    [Proc.new { |a| a[0] }],
    [Proc.new { |a| a[1] }, :desc]
  ).inspect


Oops, I forgot to verify that I was actually caching results. Here is
a fixed version.

module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[0].call(key)
puts " #{elem.inspect}"
end
end
if key_methods[1] == :desc
a, b = b, a
end
result = (a[0] <=> b[0])
return result unless result == 0
i += 1
end
return result
end

self.collect do |item|
[ [], item ]
end.sort do |a, b|
compare(a, b, key_methods)
end.collect do |kv|
kv[1]
end
end
end

a = [
["radio", 30, 5],
["radio", 40, 5],
["radio", 20, 5],
["archie", 20, 5],
["newton", 10, 3]
]

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

You can see that it only evaluates the second key for tiebreaker
purposes. (Of course, in this example, the key calculation is
trivial, but in the real world you might have a key that is expensive
to evaluate.)

[["radio"], ["radio", 30, 5]]
[["radio"], ["radio", 20, 5]]
[["radio", 30], ["radio", 30, 5]]
[["radio", 20], ["radio", 20, 5]]
[["newton"], ["newton", 10, 3]]
[["radio"], ["radio", 40, 5]]
[["radio", 40], ["radio", 40, 5]]
[["archie"], ["archie", 20, 5]]
[["archie", 20, 5], ["newton", 10, 3], ["radio", 40, 5], ["radio", 30,
5], ["radio", 20, 5]]
 
S

Steve Howell

This doesn't do what you think (or at least imply that) it does.

Care to explain?

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

$ ruby foo.rb

[["archie", 20, 5], ["newton", 10, 3], ["radio", 40, 5], ["radio",
30, 5], ["radio", 20, 5]]

$ cat foo.rb
module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[0].call(key)
end
end
if key_methods[1] == :desc
a, b = b, a
end
result = (a[0] <=> b[0])
return result unless result == 0
i += 1
end
return result
end

self.collect do |item|
[ [], item ]
end.sort do |a, b|
compare(a, b, key_methods)
end.collect do |kv|
kv[1]
end
end
end

a = [
["radio", 30, 5],
["radio", 40, 5],
["radio", 20, 5],
["archie", 20, 5],
["newton", 10, 3]
]

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect
 
S

Steve Howell

This doesn't do what you think (or at least imply that) it does.

Ok, I found and fixed a bug (*) and reworked the example to
(hopefully) make the implied intent more clear.

rows = [
{:name => "al", :salary => 40},
{:name => 'charlie', :salary => 100},
{:name => "al", :salary => 50},
{:name => 'diane', :salary => 1000, :commission => 40},
{:name => 'diane', :salary => 1000, :commission => 30},
{:name => 'ed', :salary => 20},
]

def do_sql_like_sort(rows)
def asc(field)
[Proc.new { |row| row[field] }, :asc ]
end
def desc(field)
[Proc.new { |row| row[field] }, :desc ]
end
# The semantics for sort_by_multi are similar to
# SQL.
rows.sort_by_multi(
asc:)name),
desc:)salary),
desc:)commission)
)
end

expected_result = [
{:name=>"al", :salary=>50},
{:name=>"al", :salary=>40},
{:name=>"charlie", :salary=>100},
{:name=>"diane", :salary=>1000, :commission => 40},
{:name=>"diane", :salary=>1000, :commission => 30},
{:name=>"ed", :salary=>20},
]


module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[0].call(key)
end
end
x, y = (key_methods[1] == :desc) ? [b, a] : [a, b]
result = (x[0] <=> y[0])
return result unless result == 0
i += 1
end
return result
end

self.collect do |item|
[ [], item ]
end.sort do |a, b|
compare(a, b, key_methods)
end.collect do |kv|
kv[1]
end
end
end

sorted_rows = do_sql_like_sort(rows)
if sorted_rows != expected_result
raise 'fail'
end

* - The previous iteration of this code was swapping a and b instead
of making copies before swapping. This broke the case where two of
the fields were to be sorted in descending order.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,997
Messages
2,570,241
Members
46,831
Latest member
RusselWill

Latest Threads

Top