Michal Suchanek
So, I was tinkering with ways to build a hash by transforming an
array, knowing the standard/idiomatic approach:
id_list = [:test1, :test2, :test3]
id_list.inject({}) { |a,e| a[e]=e.object_id ; a }
I also decided to try something like this:
Hash[ *id_list.collect { |e| [e,e.object_id]}.flatten]
and further (attempt to) optimize it via
Hash[ *id_list.collect { |e| [e,e.object_id]}.flatten!]
and
Hash[ *id_list.collect! { |e| [e,e.object_id]}.flatten!]
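For reference, here is a minimal sketch that checks the variants above against each other on the three-symbol list. The names scratch and empty_result are only illustrative. It also shows two things worth keeping in mind about the bang methods: flatten! returns nil when there was nothing to flatten, and collect!/flatten! modify the receiver, so the sketch works on a copy.

```ruby
# Small input so the results are easy to inspect.
id_list = [:test1, :test2, :test3]

via_inject  = id_list.inject({}) { |a, e| a[e] = e.object_id; a }
via_flatten = Hash[*id_list.collect { |e| [e, e.object_id] }.flatten]

# flatten! returns nil when nothing was flattened (e.g. an empty list),
# so chaining on its return value relies on nested pairs being present.
empty_result = [].flatten!

# collect! and flatten! modify the receiver, so use a copy
# if the original list is still needed afterwards.
scratch = id_list.dup
via_two_bang = Hash[*scratch.collect! { |e| [e, e.object_id] }.flatten!]

puts via_inject == via_flatten
puts via_inject == via_two_bang
puts empty_result.inspect
puts scratch.length   # 6: scratch now holds the flattened key/value pairs
puts id_list.length   # 3: the original list is untouched
```

Without the dup, id_list itself would end up holding the flattened keys and values after the two-bang version has run.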
Running this via Benchmark.bmbm gives pretty interesting and, to me,
unexpected results (on a 3.2 GHz P4, 1 GB of RAM, FC5 with ruby 1.8.4):
require 'benchmark'
id_list = (1..1_000_000).to_a
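Benchmark::bmbm do |x|
  x.report("inject")   { id_list.inject({}) { |a,e| a[e] = e.object_id ; a } }
  x.report("non-bang") { Hash[ *id_list.collect { |e| [e,e.object_id] }.flatten ] }
  x.report("bang")     { Hash[ *id_list.collect { |e| [e,e.object_id] }.flatten! ] }
  # NOTE: collect! and flatten! modify id_list in place, so everything that
  # runs after this report (including the second bmbm pass) sees the
  # flattened list instead of the original one.
  x.report("two-bang") { Hash[ *id_list.collect! { |e| [e,e.object_id] }.flatten! ] }
end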
Rehearsal --------------------------------------------
inject       16.083333   0.033333   16.116667 (  9.670747)
non-bang   1657.050000   1.800000 1658.850000 (995.425642)
bang       1593.716667   0.016667 1593.733333 (956.334565)
two-bang   1604.816667   1.350000 1606.166667 (963.803356)
-------------------------------- total: 4874.866667sec

               user     system      total        real
inject       5.183333   0.000000    5.183333 (  3.102379)
non-bang   zsh: segmentation fault  ruby
Ow?
Also, I just thought of a similar way to accomplish the same thing:
x.report("zip") { Hash[ *id_list.zip(id_list.collect {|e| e.object_id}).flatten ] }
Array#collect! won't work right with this, of course, but the zip
version seems to have equally bad performance. Is Array#inject just
optimized for this, or something?
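A small sketch of the zip approach on a three-element list, for checking what it produces. zip already yields [key, value] pairs, and the splat form of Hash[] expects alternating keys and values, so the pairs are flattened first. The second form assumes a Ruby version whose Hash[] accepts an array of pairs directly (newer than the 1.8.4 used above); the variable names are only illustrative.

```ruby
id_list = [:test1, :test2, :test3]
ids     = id_list.collect { |e| e.object_id }

# zip yields [[:test1, id1], [:test2, id2], ...]; flatten turns that into
# alternating keys and values, which is what Hash[*args] expects.
via_zip = Hash[*id_list.zip(ids).flatten]

# Assumption: this Ruby accepts an array of [key, value] pairs in Hash[],
# which avoids both the flatten and the splat.
via_pairs = Hash[id_list.zip(ids)]

puts via_zip == via_pairs
puts via_zip.keys.inspect
```

Note that flatten works recursively, so the splat form is only safe when neither the keys nor the values are arrays themselves.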
What you are seeing (performance as well as timing) is most likely
caused by the difference in approach. When you use #inject you create
just one new large object (the Hash). When you use #collect you create
at least one additional copy of the large array plus a ton of
two-element arrays. That's way less efficient in terms of memory usage
and GC. You'll probably see much different results if your input array
is much shorter (try with 10 or 100 elements).
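One rough way to check this on short inputs is to count the objects each approach allocates and time a few repetitions. The sketch below assumes a Ruby recent enough to provide GC.stat(:total_allocated_objects), which is much newer than the 1.8.x versions discussed here. The allocations helper and the BUILDERS table are hypothetical names made up for the illustration, and the 200 repetitions are arbitrary.

```ruby
require 'benchmark'

# Hypothetical helper: number of objects allocated while the block runs.
# Assumes GC.stat(:total_allocated_objects) is available on this Ruby.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# The two non-mutating approaches from the thread, wrapped so they can be
# run against lists of different sizes.
BUILDERS = {
  "inject"   => lambda { |list| list.inject({}) { |a, e| a[e] = e.object_id; a } },
  "non-bang" => lambda { |list| Hash[*list.collect { |e| [e, e.object_id] }.flatten] }
}

[10, 100, 1_000].each do |n|
  id_list = (1..n).to_a
  BUILDERS.each do |name, build|
    objects = allocations { build.call(id_list) }
    seconds = Benchmark.realtime { 200.times { build.call(id_list) } }
    printf("n=%-5d %-9s %7d objects %10.6f s\n", n, name, objects, seconds)
  end
end
```

The collect version should report at least one allocated object per element (the two-element arrays), while the inject version stays at a handful regardless of n. The timings will vary by machine and Ruby version.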
I also get a segfault (with 1.8.5 on Gentoo):
ruby -w test.rb
Rehearsal --------------------------------------------
inject        7.210000   0.870000    8.080000 (  13.011414)
non-bang   4620.790000 179.730000 4800.520000 (6029.976497)
bang       4586.140000 200.530000 4786.670000 (5970.267190)
two-bang   4599.560000 268.080000 4867.640000 (6035.687979)
------------------------------- total: 14462.910000sec

               user     system      total        real
inject      10.740000   1.990000   12.730000 (  15.500874)
non-bang   Segmentation fault
hramrach@hp-tc2110:11(0) 06120053 16]~ $ ruby -v
ruby 1.8.5 (2006-12-04 patchlevel 2) [i686-linux]
Of course, it would be better to test with 1.8.6, but it takes quite
some time to retest ;-)
Thanks
Michal