Ran some tests on my 2.8GHz Pentium D Dual Core, 2GB, 160 GB S-ATA II.
Things were much more like Robert expected: 3.27 times slower.
544-> time perl test.pl
1048576
3.215u 0.143s 0:03.37 99.4% 0+0k 0+0io 0pf+0w
545-> time ruby test.rb
1048576
10.532u 0.350s 0:10.98 99.0% 0+0k 0+0io 8pf+0w
Now then, changing the regexp to a precreated one ran SLOWER for me
(huh?)
549-> time ruby test1.rb
1048576
11.006u 0.323s 0:11.36 99.6% 0+0k 0+0io 0pf+0w
Just for grins, I presized the block array to the full size needed, but
this had no impact whatsoever. Hmmm....
Decided to run the profiler over it. Does it seem strange to you that
IO#each_line would appear to take so long on a system with disk I/O
like mine when sequentially accessing a file?
ruby -r profile test.rb
1048576
     % cumulative     self                self     total
  time    seconds  seconds    calls   ms/call   ms/call  name
 78.91     455.14   455.14        1 455140.00 576810.00  IO#each_line
 15.91     546.91    91.77  3145728      0.03      0.03  String#chomp!
  5.18     576.81    29.90  1048576      0.03      0.03  Array#<<
  0.00     576.81     0.00        2      0.00      0.00  IO#write
  0.00     576.81     0.00        1      0.00      0.00  Array#size
  0.00     576.81     0.00        1      0.00      0.00  Kernel.puts
  0.00     576.81     0.00        1      0.00      0.00  Fixnum#to_s
  0.00     576.81     0.00        1      0.00 576810.00  IO#open
  0.00     576.81     0.00        1      0.00      0.00  File#initialize
  0.00     576.81     0.00        1      0.00 576810.00  #toplevel
Ken
Robert said:
Martin said:
3) test.rb
log = 'log'
block = []
File.open( log ) { |f|
f.each_line { |line|
line.chomp!
if ( line =~ /Start Start Start Start/ ) then
Reversing the RX and the string is usually more efficient.
I was pretty sure the problem was creating a regexp object from a
literal regexp each time, but oddly enough saying rx = /..../ before the
loop and rx =~ line inside made no difference. Does ruby already
optimise this case?
Yes. It's usually more efficient to use the literal inside the code.
Cheers
robert
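
For anyone who wants to try the literal vs. precreated regexp question
on their own box, here is a rough sketch of a comparison. It is not the
original test.rb or test1.rb: the sample data (LINES) and the method
names are made up for illustration, and the data lives in memory so that
disk I/O stays out of the picture. A regexp literal without
interpolation is built once when the code is parsed, which would fit
the observation that hoisting it out of the loop made no real
difference.

```ruby
require 'benchmark'

# Hypothetical sample data standing in for the log file: every fourth
# line contains the marker text, the rest do not.
LINES = Array.new(200_000) do |i|
  i % 4 == 0 ? "Start Start Start Start" : "some other line #{i}"
end

# Regexp literal inside the loop, string on the left (as in test.rb).
def count_literal(lines)
  count = 0
  lines.each { |line| count += 1 if line =~ /Start Start Start Start/ }
  count
end

# Regexp literal inside the loop, regexp on the left (Robert's suggestion).
def count_literal_reversed(lines)
  count = 0
  lines.each { |line| count += 1 if /Start Start Start Start/ =~ line }
  count
end

# Regexp created once before the loop and reused inside it.
def count_precreated(lines)
  rx = /Start Start Start Start/
  count = 0
  lines.each { |line| count += 1 if rx =~ line }
  count
end

if __FILE__ == $0
  # Timings vary by machine and Ruby version, so none are quoted here.
  Benchmark.bm(20) do |bm|
    bm.report("literal:")          { count_literal(LINES) }
    bm.report("literal reversed:") { count_literal_reversed(LINES) }
    bm.report("precreated:")       { count_precreated(LINES) }
  end
end
```

All three variants return the same count, so any difference in the
timings comes from how the match is written rather than from the work
done.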