Possibly improvable code

Ralf Müller · Jun 16, 2004

Hi,

I made a comparison:

ram@lilith:~/src/ruby$time ruby -e 'num=STDIN.gets; a=Array.new; (0..9).each
{|i| a = 0}; (0..num.length).each {|i| a[num[i..i].to_i] += 1 };
a.each_index {|i| print "#{i} #{a}\n"} ' < pi.out
0 16096
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m1.023s
user 0m0.970s
sys 0m0.010s

and

ram@lilith:~/src/ruby$time gawk '{i=0;j=length($0);while( i <= j) {i++;
z=substr($0,i,1); {count=count+ 1; a[z]=a[z]+1; }}}END{ for ( i in a ) {print
i ":" a;}print "Digits: " count;}' < pi.out | sort
0:16095
1:16107
2:15965
3:15915
4:15967
5:16169
6:15934
7:16091
8:15999
9:15790
:1
Digits: 160033

real 0m0.305s
user 0m0.310s
sys 0m0.000s

I'm really new to ruby. So I would like to know, if there is any possible
improvement of the ruby code.
pi.out contains about 160000 digits of PI and is made by "ruby sample/pi.rb >
pi.out' inside the src-dir of the ruby-*.tgz

Any hints???

thanks
ralf

Ruben · Jun 16, 2004

At Wed, 16 Jun 2004 23:13:26 +0900,

Ralf said:
ram@lilith:~/src/ruby$time ruby -e 'num=STDIN.gets; a=Array.new; (0..9).each
{|i| a = 0}; (0..num.length).each {|i| a[num[i..i].to_i] += 1 };
a.each_index {|i| print "#{i} #{a}\n"} ' < pi.out

# get the input
num = STDIN.gets;
# initialize array entries with 0
a = Array.new(255,0);
# count the occurence of each byte in the input
num.each_byte {|i| a += 1};
# only interested in the byte/characters for 0-9
(?0..?9).each {|i| print "#{.pack("c")} #{a}\n"};

I suspect the num[i..i].to_i is pretty expensive... and it's not
really needed in this case.

Cheers,
Ruben

Ralf Müller · Jun 16, 2004

Am Mittwoch, 16. Juni 2004 16:13 schrieb Ralf Müller:

Hi,

I made a comparison:

ram@lilith:~/src/ruby$time ruby -e 'num=STDIN.gets; a=Array.new;
(0..9).each {|i| a = 0}; (0..num.length).each {|i| a[num[i..i].to_i] +=
1 }; a.each_index {|i| print "#{i} #{a}\n"} ' < pi.out
0 16096
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m1.023s
user 0m0.970s
sys 0m0.010s

and

ram@lilith:~/src/ruby$time gawk '{i=0;j=length($0);while( i <= j) {i++;
z=substr($0,i,1); {count=count+ 1; a[z]=a[z]+1; }}}END{ for ( i in a )
{print i ":" a;}print "Digits: " count;}' < pi.out | sort
0:16095
1:16107
2:15965
3:15915
4:15967
5:16169
6:15934
7:16091
8:15999
9:15790

:1

Digits: 160033

real 0m0.305s
user 0m0.310s
sys 0m0.000s

I'm really new to ruby. So I would like to know, if there is any possible
improvement of the ruby code.
pi.out contains about 160000 digits of PI and is made by "ruby sample/pi.rb

pi.out' inside the src-dir of the ruby-*.tgz

Click to expand...

Any hints???

thanks
ralf

tried with Hash instead of Array:
ram@lilith:~/src/ruby$time ruby -e 'str=STDIN.gets; h=Hash.new(); (0..9).each
{|i| h[i.to_s] = 0}; (0..str.length-1).each {|i| h[str[i..i]] += 1 }; h.each
{|k,v| print "#{k} #{v}\n"} ' < pi.out | sort
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.920s
user 0m0.910s
sys 0m0.010s

gains 10%.

why the lucky stiff · Jun 16, 2004

Ralf said:
tried with Hash instead of Array:
ram@lilith:~/src/ruby$time ruby -e 'str=STDIN.gets; h=Hash.new(); (0..9).each
{|i| h[i.to_s] = 0}; (0..str.length-1).each {|i| h[str[i..i]] += 1 }; h.each
{|k,v| print "#{k} #{v}\n"} ' < pi.out | sort

why@stungun$ time ruby -n -e 'c={};$_.each_byte{|b|c||=0; c+=1};
c.sort.each {|k,v| puts "#{k - ?0} #{v}"}' < pi.out
0 3454
1 3514
2 3358
3 3452
4 3483
5 3510
6 3488
7 3431
8 3482
9 3464

real 0m0.052s
user 0m0.051s
sys 0m0.000s

same idea as yours. just runs 3x swift.

_why

Relm · Jun 16, 2004

Ralf said:
Ralf said:

tried with Hash instead of Array:
ram@lilith:~/src/ruby$time ruby -e 'str=STDIN.gets; h=Hash.new(); (0..9).each
{|i| h[i.to_s] = 0}; (0..str.length-1).each {|i| h[str[i..i]] += 1 }; h.each
{|k,v| print "#{k} #{v}\n"} ' < pi.out | sort

Click to expand...

why@stungun$ time ruby -n -e 'c={};$_.each_byte{|b|c||=0; c+=1};
c.sort.each {|k,v| puts "#{k - ?0} #{v}"}' < pi.out

same idea as yours. just runs 3x swift.

I typo a lot, so no command line. Maybe 150-170x speed of OP.

n = STDIN.gets
tot = 0
9.times do |i|
c = n.count(i.to_s)
tot += c
puts "#{i} #{c}"
end
puts "9 #{n.size - tot}"

Ralf Müller · Jun 17, 2004

11111111111111111111111111111111111111111111111111111111111111
ram@lilith:~/src/ruby$cat rec_pi.rb
n = STDIN.gets
tot = 0
9.times do |i|
c = n.count(i.to_s)
tot += c
puts "#{i} #{c}"
end
puts "9 #{n.size - tot}"

ram@lilith:~/src/ruby$time ruby rec_pi.rb < pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.045s
user 0m0.040s
sys 0m0.000s

22222222222222222222222222222222222222222222222222222222222222222
ram@lilith:~/src/ruby$cat rec_pi_a.rb
c={};
$_.each_byte{|b| c||=0; c+=1 };
c.sort.each {|k,v| puts "#{k - ?0} #{v}"}

ram@lilith:~/src/ruby$time ruby -n rec_pi_a.rb < pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.260s
user 0m0.240s
sys 0m0.010s

3333333333333333333333333333333333333333333333333333333333333333333
ram@lilith:~/src/ruby$cat rec_pi_b.rb
num = STDIN.gets;
a = Array.new(255,0);
num.each_byte {|i| a += 1};
(?0..?9).each {|i| print "#{.pack("c")} #{a}\n"};

ram@lilith:~/src/ruby$time ruby rec_pi_b.rb < pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.194s
user 0m0.160s
sys 0m0.010s

####################################################################

The main difference seems to be using "count" instead of "each_byte". But why
is it 5-6x faster to "count" 9.times than "each_byte" only once?
Or did i ignore anything?

ralf
(-,-)Zzz

Ruben · Jun 17, 2004

At Thu, 17 Jun 2004 16:48:21 +0900,

Ralf said:
11111111111111111111111111111111111111111111111111111111111111
ram@lilith:~/src/ruby$cat rec_pi.rb
n = STDIN.gets
tot = 0
9.times do |i|
c = n.count(i.to_s)
tot += c
puts "#{i} #{c}"
end
puts "9 #{n.size - tot}"

The main difference seems to be using "count" instead of "each_byte". But why
is it 5-6x faster to "count" 9.times than "each_byte" only once?
Or did i ignore anything?

The above code is pretty nice.. I didn't know about the
'count'-method. The reason this is faster, is because the 'big loop',
the scan over the string, is in this code done internally, in the C
implementation of the count method. The other code has the loop
explicitly in ruby-code, which is slower.

The difference however is that the above code does the scan over the
String several times (# of chars to check - 1) in C, while all the other
code scans the String only once in ruby.

Probably there is a trade-off... maybe if you would wanna count all
occurences of [a-zA-Z0-9], then you would have to do the
'count'-method 57 times (57 because the above code assumes that only
the characters you count appear in the String), effectively scanning
the String 57 times. This might be slower than scanning the String
only once in ruby-code. (didn't test this though...)

Ruben

Relm · Jun 17, 2004

The difference however is that the above code does the scan over the
String several times (# of chars to check - 1) in C, while all the other
code scans the String only once in ruby.

Probably there is a trade-off... maybe if you would wanna count all
occurences of [a-zA-Z0-9], then you would have to do the
'count'-method 57 times (57 because the above code assumes that only
the characters you count appear in the String), effectively scanning
the String 57 times. This might be slower than scanning the String
only once in ruby-code. (didn't test this though...)

There is a trade-off, but you may need to throw in some kanji or unicode
to get there. With just [a-zA-Z0-9], a single scan in Ruby is still
slower. Here are some timings:

user system total real
array: 0.880000 0.080000 0.960000 ( 0.958946)
hash: 1.210000 0.070000 1.280000 ( 1.330615)
count: 0.100000 0.000000 0.100000 ( 0.091805)
delete: 0.030000 0.020000 0.050000 ( 0.051760)

Both 'array' and 'hash' use #each_byte (M iterations in Ruby code),
'count' uses #count repeatedly (62*M iterations in C code), and 'delete'
uses #delete to partition the string into 5 substrings and #delete! to
count characters (5*M + 62*M/5 = about 17*M iterations in C code).

And the benchmark code:

#!/usr/bin/ruby -w

str = ['a'..'z','A'..'Z',0..9].map {|x| x.to_a}.flatten.join * 10000

require 'benchmark'

Benchmark.bm(8) do |bm|
bm.report('array:') do
a = Array.new(255,0);
str.each_byte {|i| a += 1};
[?0..?9,?a..?z,?A..?Z].map {|x| x.to_a}.flatten.each {|i|
"#{.pack("c")} #{a}"
};
end
bm.report('hash:') do
c={}
str.each_byte{|b|c||=0; c+=1}
c.sort.each {|k,v| "#{k.chr} #{v}"}
end
bm.report('count:') do
['0'..'9','a'..'z','A'..'Z'].map {|x| x.to_a}.flatten.each do |i|
"#{i} #{str.count(i)}"
end
end
bm.report('delete:') do
['0'..'9','a'..'m','n'..'z','A'..'M','N'..'Z'].each do |range|
s = str.delete("^#{range.first}-#{range.last}")
range.each do |i|
n = s.size
s.delete!(i)
"#{i} #{n - s.size}"
end
end
end
end

__END__

Ralf Müller · Jun 18, 2004

Am Donnerstag, 17. Juni 2004 00:00 schrieb Relm:

Ralf said:
Ralf said:

tried with Hash instead of Array:
ram@lilith:~/src/ruby$time ruby -e 'str=STDIN.gets; h=Hash.new();
(0..9).each {|i| h[i.to_s] = 0}; (0..str.length-1).each {|i|
h[str[i..i]] += 1 }; h.each {|k,v| print "#{k} #{v}\n"} ' < pi.out |
sort

Click to expand...

why@stungun$ time ruby -n -e 'c={};$_.each_byte{|b|c||=0; c+=1};
c.sort.each {|k,v| puts "#{k - ?0} #{v}"}' < pi.out

same idea as yours. just runs 3x swift.

Click to expand...

I typo a lot, so no command line. Maybe 150-170x speed of OP.

n = STDIN.gets
tot = 0
9.times do |i|
c = n.count(i.to_s)
tot += c
puts "#{i} #{c}"
end
puts "9 #{n.size - tot}"

I tried this

ram@lilith:~/src/ruby$cat trcount-simple.rb
#!/usr/local/bin/ruby
#num=STDIN.gets
num=STDIN.gets
# prepare an array
a = Array.new;
(0..9).each {|i| a = 0}
# for num in an single line
#(0..num.length).each {|n| a[num[n..n].to_i] += 1 }
(0..num.length-1).each {|n| a[num[n..n].to_i] += 1 }
# print the result
a.each_index {|index| print "#{index} #{a[index]}\n" }
ram@lilith:~/src/ruby$time ruby trcount-simple.rb < pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.964s
user 0m0.950s
sys 0m0.010s
ram@lilith:~/src/ruby$time ruby -e 'num=STDIN.gets; a=Array.new; (0..9).each
{|i| a = 0
}; (0..num.length).each {|i| a[num[i..i].to_i] += 1 }; a.each_index {|i| print
"#{i} #{a[i
]}\n"} ' < pi.out
0 16096
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.955s
user 0m0.940s
sys 0m0.000s

It does not seem to make any difference if the source i given at the command
line or in a separate file:
ram@lilith:~/src/ruby$cat rec_pi.rb
n = STDIN.gets
tot = 0
9.times do |i|
c = n.count(i.to_s)
tot += c
puts "#{i} #{c}"
end
puts "9 #{n.size - tot}"
ram@lilith:~/src/ruby$time ruby rec_pi.rb < pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.048s
user 0m0.040s
sys 0m0.010s
ram@lilith:~/src/ruby$time ruby -e 'n = STDIN.gets; tot = 0; 9.times do |i| c
= n.count(i.to_s); tot += c; puts "#{i} #{c}"; end; puts "9 #{n.size - tot}"'
<pi.out
0 16095
1 16107
2 15965
3 15915
4 15967
5 16169
6 15934
7 16091
8 15999
9 15790

real 0m0.043s
user 0m0.010s
sys 0m0.030s

But maybe, you're right and these conditions do just not lead to that
difference. Under which circumstances does the difference occur?

regards
ralf

Python code problem	2	Apr 23, 2023
Need help for javascript code	3	Sep 28, 2022
I Need Fix In Code	1	Apr 12, 2023
Need help with this Python code.	2	Jun 13, 2023
Help with code	2	Oct 11, 2022
Help with code	0	Jun 12, 2022
Trouble with prediction code, for the life of me I can't figure out why it isnt running properly. Help would be appreciated.	0	Jul 8, 2023
How do I fix this issue in sqaurespace code block?	1	Jul 2, 2024

Possibly improvable code

Ralf Müller

Ruben

Ralf Müller

why the lucky stiff

Relm

Ralf Müller

Ruben

Relm

Ralf Müller

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads