--------------020306020109080509080700
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Levin said:
Ah, yes. I'd probably do it with SyncEnumerator rather than zip as
well, then[1]. I just used zip since it's what you started with and you
didn't specify performance as an issue.
Performance would probably be the wrong motivation to switch to
SyncEnumerator. On my machine it is about 100 times slower than
Enumerable#zip.
sync-enum: 1.676
zip-enum: 0.018
sizedqueue: 0.226
-Levin
Thanks for bringing this up.
The variant with each_zip in Enumerable needs only one thread:
                     user     system      total        real
sync-enum:       1.594000   0.500000   2.094000 (  2.110000)
zip-enum:        0.016000   0.000000   0.016000 (  0.015000)
sizedqueue:      0.250000   0.000000   0.250000 (  0.250000)
each_zip:        0.125000   0.000000   0.125000 (  0.125000)
It is no match for zip-enum in terms of speed, though.
Still, it is interesting that threads are roughly 10 times faster than
continuations in this case (sync-enum against sizedqueue and each_zip).
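
One caveat on each_zip as attached: it assumes the second enumerable
yields at least as many elements as the first, otherwise the consumer
ends up waiting on an empty queue. Below is a minimal sketch of one way
around that, assuming an end-of-stream marker and nil padding in the
style of Enumerable#zip. The name each_zip_padded and the marker object
are hypothetical and not part of compare.bm.rb.

```ruby
require 'thread' # harmless on newer Rubies, where SizedQueue is built in

# Hypothetical helper: walks two enumerables in lock-step using one
# producer thread, padding with nil once the second one runs out.
def each_zip_padded(first, second)
  done = Object.new                  # unique end-of-stream marker
  q = SizedQueue.new(1)              # one slot keeps producer and consumer in step
  producer = Thread.new do
    second.each { |item| q << item }
    q << done                        # signal that second is exhausted
  end

  exhausted = false
  first.each do |a|
    b = exhausted ? done : q.pop     # stop popping once the marker was seen
    exhausted = true if b.equal?(done)
    yield a, (exhausted ? nil : b)
  end
ensure
  # If first was shorter, the producer may still be blocked on a push.
  producer.kill if producer
end

each_zip_padded(%w[a b c], %w[a x]) do |a, b|
  puts "Error near: #{b.inspect}" if a != b
end
```

This keeps the single-thread idea and only adds the marker handling, so
I would not expect it to change the timings much, but I have not
measured it.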
cheers
Simon
--------------020306020109080509080700
Content-Type: text/plain;
name="compare.bm.rb"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="compare.bm.rb"
#! /usr/bin/ruby -w
require 'generator'
require 'enumerator'
require 'benchmark'
require 'thread'
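module Enumerable
  # Iterate over self and other in lock-step, yielding one pair at a time.
  # A single producer thread feeds other's elements through a one-slot queue.
  # Assumes other yields at least as many elements as self.
  def each_zip(other)
    q = SizedQueue.new(1)
    Thread.new { other.each { |l| q << l } }
    self.each { |a| yield a, q.pop }
  end
end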
s1 = (%w{this is the first test} * 300).join("\n")
s2 = (%w{this is the first test} * 300).join("\n")
#s2[4321] = "Failure"
Benchmark.bm(15) do |bm|
  # SyncEnumerator from generator.rb
  bm.report("sync-enum:") {
    first, second = [s1, s2].collect { |i| i.to_enum(:each_line) }
    SyncEnumerator.new(first, second).each { |a, b|
      puts "Error near: #{b}" if a != b
    }
  }
  # Plain Enumerable#zip
  bm.report("zip-enum:") {
    first, second = [s1, s2].collect { |i| i.to_enum(:each_line) }
    first.zip(second) { |a, b|
      puts "Error near: #{b}" if a != b
    }
  }
  # Two producer threads, one SizedQueue each; nil marks the end of input
  bm.report("sizedqueue:") {
    q1, q2 = SizedQueue.new(1), SizedQueue.new(1)
    Thread.new { s1.each_line { |l| q1 << l }; q1 << nil }
    Thread.new { s2.each_line { |l| q2 << l }; q2 << nil }
    while true
      a, b = q1.pop, q2.pop
      break if !a || !b
      puts "Error near: #{b}" if a != b
    end
  }
  # Enumerable#each_zip as defined above: one producer thread
  bm.report("each_zip:") {
    first, second = [s1, s2].collect { |i| i.to_enum(:each_line) }
    first.each_zip(second) { |a, b|
      puts "Error near: #{b}" if a != b
    }
  }
end
--------------020306020109080509080700--