Jon Egil Stand
# MAIN QUESTION
# Is there a nice and fast way to do set operations on arrays of
# non-trivial objects?
# BACKGROUND
# Set operations are beautiful:
a = [1,2,3,4,5]
b = [2,4,6,8,10]
(a & b)
# => [2,4]
# When comparing lists, which is more or less what administration of
# pension schemes is all about, this is very, very nice.
# I very often use stuff like
(a - (a&b))
# => [1,3,5]
# The array library is blazingly fast, I'm really happy with it.
# The challenge arises when my lists don't consist of Fixnums, but of
# more complex objects. To simplify:
class Person
  def initialize(name, ssid)
    @name = name
    @ssid = ssid
  end
  attr_reader :name, :ssid
end
list = []
list << Person.new('Peter Zapffe', 1)
list << Person.new('Peter Pan', 2)
list << Person.new('Saint Peter', 3)
# Let's say I want to INTERSECT that with (b = [2,4,6,8,10]) from above, using
# ssid as the key. It should in that case return the 'Peter Pan' person, since
# he is the only one with an ssid included in the list.
(list & b)
# => []
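# The result is empty because Array#& compares elements through their hash and
# eql? methods, and a Person is never eql? to an integer. A minimal sketch of
# what would make the operators work between two lists of Person objects,
# assuming identity by ssid is what is wanted. The lists last_year and
# this_year are hypothetical, and this does not help against a plain list of
# integers like b:

```ruby
class Person
  attr_reader :name, :ssid

  def initialize(name, ssid)
    @name = name
    @ssid = ssid
  end

  # Assumption: two Person objects are the same person when their ssid matches.
  def eql?(other)
    other.is_a?(Person) && ssid == other.ssid
  end
  alias_method :==, :eql?

  # Objects that are eql? must return the same hash value.
  def hash
    ssid.hash
  end
end

# Hypothetical sample lists, built from separate Person instances.
last_year = [Person.new('Peter Zapffe', 1), Person.new('Peter Pan', 2)]
this_year = [Person.new('Peter Pan', 2), Person.new('Saint Peter', 3)]

stayed = (last_year & this_year).map(&:name) # => ["Peter Pan"]
left   = (last_year - this_year).map(&:name) # => ["Peter Zapffe"]
all    = (last_year | this_year).map(&:name) # => ["Peter Zapffe", "Peter Pan", "Saint Peter"]
```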
# I can do stuff like:
list_ssid = list.collect { |p| p.ssid }
# => [1,2,3]
common = list_ssid & b
# => [2]
list.select { |p| common.include?(p.ssid) }
# => [#<Person:0x2b8b300 @name="Peter Pan", @ssid=2>]
# This works, and correctness is always nice, but it doesn't scale well,
# basically because common.include? scans the list of common ids from scratch
# for every person.
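# One possible way around the repeated scan, sketched with Set from the standard
# library: build the lookup structure once, so that each membership test is a
# hash lookup instead of a walk through an array. Person and the sample data
# are repeated so the sketch runs on its own; the names wanted and matches are
# made up for illustration.

```ruby
require 'set'

class Person
  attr_reader :name, :ssid

  def initialize(name, ssid)
    @name = name
    @ssid = ssid
  end
end

list = [
  Person.new('Peter Zapffe', 1),
  Person.new('Peter Pan', 2),
  Person.new('Saint Peter', 3)
]
b = [2, 4, 6, 8, 10]

# Build the set of wanted ids once, up front.
wanted = Set.new(b)

# Each include? call is now a hash lookup rather than a linear scan.
matches = list.select { |p| wanted.include?(p.ssid) }
names = matches.map(&:name) # => ["Peter Pan"]
```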
# I have implemented this before by sorting both lists and stepping through
# them one element at a time. That's still correct and much faster, but I have
# a feeling it's possible to do it in a much more elegant way, using some sort
# of set operations.
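# For reference, the sort-and-step approach mentioned above could look roughly
# like this. It is a sketch, not the code I actually used; the helper name
# intersect_by_ssid_sorted is made up for illustration, and Person is repeated
# so the sketch runs on its own.

```ruby
class Person
  attr_reader :name, :ssid

  def initialize(name, ssid)
    @name = name
    @ssid = ssid
  end
end

# Hypothetical helper: returns the people whose ssid appears in ids.
def intersect_by_ssid_sorted(people, ids)
  people = people.sort_by { |p| p.ssid }
  ids = ids.sort
  result = []
  i = 0
  j = 0
  while i < people.size && j < ids.size
    cmp = people[i].ssid <=> ids[j]
    if cmp == 0
      result << people[i]
      i += 1 # advance only the people side, so repeated ssids there still match
    elsif cmp < 0
      i += 1 # this person's ssid is too small to be in ids
    else
      j += 1 # this id is too small to match any remaining person
    end
  end
  result
end

list = [
  Person.new('Saint Peter', 3),
  Person.new('Peter Zapffe', 1),
  Person.new('Peter Pan', 2)
]
b = [10, 8, 6, 4, 2]

found = intersect_by_ssid_sorted(list, b).map(&:name) # => ["Peter Pan"]
```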
# I would appreciate any hints or pointers.
# Thanks for reading.
# JE