improved mathn.rb and Prime class

P

Phil Duby

Hi,

I have been programming for a lot of years, but just started with Ruby
recently. To help learn some of the basics, I implemented a version of a
sieve prime number finder. It turns out that my version is a **LOT** faster
than the version in mathn.rb. What do I need to do to get this
implementation to someone to add to Ruby / mathn ?
I actually did 2 different types of changes. One was the algorythm, which
should be no problem to drop into mathn. The other was to turn the prime
number storage into class variables, so that only one set of prime numbers
will be calculated (for a single session). Multiple instances all access
the same data, and the first instance to need new prime numbers adds them
for all other instances. That *should* not cause any problems, but as a
Ruby Noobee, I may not know about side effects this could cause in the way
the existing Prime class is used.

To make sure my implementation correctly calculates prime numbers, I located
a file on the Internet with the first 100008 prime numbers (up to 1299827),
I was able to exactly duplicate the file using my implementation. I also
compared my output against the existing mathn, but have not run the mathn
version past 75000 prime numbers. The benchmarks below shows why. For
70001 through 75000, mathn takes over 30 minutes on my 1G windows XP pro
machine with 512M. My version takes just over 9 seconds, and the ratio
between the 2 gets larger as more prime numbers are calculatated

C:\Progs\working\math>ruby bench_math.rb


Starting benchmark script for mathn.rb version of Prime

user system total real
mathn Prime: 1 to 1000 2.718000 0.000000 2.718000 ( 2.719000)
mathn Prime: 1001 to 2000 7.844000 0.000000 7.844000 ( 7.844000)
mathn Prime: 2001 to 3000 12.922000 0.000000 12.922000 ( 12.953000)
mathn Prime: 3001 to 4000 18.047000 0.000000 18.047000 ( 18.079000)
mathn Prime: 4001 to 5000 23.141000 0.000000 23.141000 ( 23.172000)
mathn Prime: 5001 to 6000 28.203000 0.000000 28.203000 ( 28.251000)
mathn Prime: 6001 to 7000 33.375000 0.000000 33.375000 ( 33.422000)
mathn Prime: 7001 to 8000 38.485000 0.000000 38.485000 ( 38.516000)
mathn Prime: 8001 to 9000 43.672000 0.000000 43.672000 ( 43.720000)
mathn Prime: 9001 to 10000 48.781000 0.000000 48.781000 ( 48.798000)
mathn Prime: 10001 to 11000 53.797000 0.000000 53.797000 ( 53.860000)
mathn Prime: 11001 to 12000 58.890000 0.000000 58.890000 ( 59.002000)
mathn Prime: 12001 to 13000 64.813000 0.000000 64.813000 ( 64.853000)
mathn Prime: 13001 to 14000 69.750000 0.000000 69.750000 ( 69.913000)
mathn Prime: 14001 to 15000 74.234000 0.000000 74.234000 ( 74.355000)

C:\Progs\working\math>ruby bench_math.rb


Starting benchmark script for mathn.rb version of Prime

user system total real
mathn Prime: 1 to 5000 64.828000 0.015000 64.843000 ( 64.908000)
mathn Prime: 5001 to 10000 192.656000 0.000000 192.656000 (192.911000)
mathn Prime: 10001 to 15000 320.391000 0.016000 320.407000 (320.884000)
mathn Prime: 15001 to 20000 448.640000 0.000000 448.640000 (449.710000)
mathn Prime: 20001 to 25000 575.797000 0.016000 575.813000 (576.641000)
mathn Prime: 25001 to 30000 703.782000 0.015000 703.797000 (705.333000)
mathn Prime: 30001 to 35000 830.984000 0.016000 831.000000 (832.315000)
mathn Prime: 35001 to 40000 958.281000 0.047000 958.328000 (959.754000)
mathn Prime: 40001 to 45000 1086.016000 0.000000 1086.016000
(1087.716000)
mathn Prime: 45001 to 50000 1213.469000 0.015000 1213.484000
(1215.460000)
mathn Prime: 50001 to 55000 1341.015000 0.032000 1341.047000
(1343.137000)
mathn Prime: 55001 to 60000 1460.656000 0.031000 1460.687000
(1464.024000)
mathn Prime: 60001 to 65000 1583.344000 0.000000 1583.344000
(1583.886000)
mathn Prime: 65001 to 70000 1714.094000 0.000000 1714.094000
(1715.650000)
mathn Prime: 70001 to 75000 1823.421000 0.000000 1823.421000
(1823.438000)

Now numbers for my (best so far) version

C:\Progs\working\math>ruby bench_prime_8.rb


Starting benchmark script for hpdmath8 version of Prime
using 6 primes to build skip known primes offset list

user system total real
Primes 8: 1 to 1000 0.187000 0.000000 0.187000 ( 0.188000)
Primes 8: 1001 to 2000 0.281000 0.000000 0.281000 ( 0.281000)
Primes 8: 2001 to 3000 0.375000 0.000000 0.375000 ( 0.375000)
Primes 8: 3001 to 4000 0.438000 0.000000 0.438000 ( 0.437000)
Primes 8: 4001 to 5000 0.500000 0.000000 0.500000 ( 0.500000)
Primes 8: 5001 to 6000 0.547000 0.000000 0.547000 ( 0.547000)
Primes 8: 6001 to 7000 0.594000 0.000000 0.594000 ( 0.594000)
Primes 8: 7001 to 8000 0.640000 0.000000 0.640000 ( 0.641000)
Primes 8: 8001 to 9000 0.672000 0.000000 0.672000 ( 0.672000)
Primes 8: 9001 to 10000 0.719000 0.000000 0.719000 ( 0.718000)
Primes 8: 10001 to 11000 0.750000 0.000000 0.750000 ( 0.750000)
Primes 8: 11001 to 12000 0.765000 0.000000 0.765000 ( 0.766000)
Primes 8: 12001 to 13000 0.797000 0.000000 0.797000 ( 0.797000)
Primes 8: 13001 to 14000 0.828000 0.000000 0.828000 ( 0.828000)
Primes 8: 14001 to 15000 0.875000 0.000000 0.875000 ( 0.875000)

C:\Progs\working\math>ruby bench_prime_8.rb


Starting benchmark script for hpdmath8 version of Prime
using 6 primes to build skip known primes offset list

user system total real
Primes 8: 1 to 5000 1.766000 0.016000 1.782000 ( 1.781000)
Primes 8: 5001 to 10000 3.156000 0.000000 3.156000 ( 3.156000)
Primes 8: 10001 to 15000 4.016000 0.000000 4.016000 ( 4.016000)
Primes 8: 15001 to 20000 4.734000 0.000000 4.734000 ( 4.734000)
Primes 8: 20001 to 25000 5.328000 0.000000 5.328000 ( 5.328000)
Primes 8: 25001 to 30000 5.797000 0.000000 5.797000 ( 5.798000)
Primes 8: 30001 to 35000 6.312000 0.000000 6.312000 ( 6.312000)
Primes 8: 35001 to 40000 6.766000 0.000000 6.766000 ( 6.766000)
Primes 8: 40001 to 45000 7.125000 0.000000 7.125000 ( 7.125000)
Primes 8: 45001 to 50000 7.516000 0.000000 7.516000 ( 7.516000)
Primes 8: 50001 to 55000 7.859000 0.000000 7.859000 ( 7.859000)
Primes 8: 55001 to 60000 8.188000 0.000000 8.188000 ( 8.188000)
Primes 8: 60001 to 65000 8.484000 0.000000 8.484000 ( 8.485000)
Primes 8: 65001 to 70000 8.766000 0.000000 8.766000 ( 8.766000)
Primes 8: 70001 to 75000 9.078000 0.000000 9.078000 ( 9.078000)
 
P

Phil Duby

Antonio Cangiano said:
Phil said:
Hi,

I have been programming for a lot of years, but just started with Ruby
recently. To help learn some of the basics, I implemented a version of a
sieve prime number finder. It turns out that my version is a **LOT**
faster than the version in mathn.rb. What do I need to do to get this
implementation to someone to add to Ruby / mathn ?
I actually did 2 different types of changes. One was the algorythm,
which should be no problem to drop into mathn. The other was to turn the
prime number storage into class variables, so that only one set of prime
numbers will be calculated (for a single session). Multiple instances
all access the same data, and the first instance to need new prime
numbers adds them for all other instances. That *should* not cause any
problems, but as a Ruby Noobee, I may not know about side effects this
could cause in the way the existing Prime class is used.

To make sure my implementation correctly calculates prime numbers, I
located a file on the Internet with the first 100008 prime numbers (up to
1299827), I was able to exactly duplicate the file using my
implementation. I also compared my output against the existing mathn,
but have not run the mathn version past 75000 prime numbers. The
benchmarks below shows why. For 70001 through 75000, mathn takes over 30
minutes on my 1G windows XP pro machine with 512M. My version takes just
over 9 seconds, and the ratio between the 2 gets larger as more prime
numbers are calculatated

Hi Phil,
I was in a similar scenario. However, please note that while mathn.rb
(Ruby 1.8.x) is extremely slow, mathn.rb coming with Ruby 1.9 is way
faster. It would be nice if you could benchmark your results against the
latter. If your code is an improvement over the current library I am sure
matz will consider it.

Cheers,
Antonio

Here is the code of mathn.rb (Ruby 1.9): [ .. snip .. snip .. ]

The Ruby 1.9 version of mathn.rb is using some of the same concept /
optimization as my version. I took a copy of the Ruby 1.9 mathn.rb,
replaced the implementation of Prime with my version, adjusted my version to
provide the same interface as the 1.9 version, and benchmarked the 2
versions. My version has a slightly higher startup cost, and is a little
slower providing the first few prime numbers. I start with a shorter
hard-coded list of low prime numbers, and I have an array lookup where Ruby
1.9 has a hard-coded value. However, the Ruby 1.9 version optimization is
based on the (product of) the first 2 prime number [2, 3] while mine is
based on the first 6 prime numbers. That means mine becomes faster as more
/ higher prime numbers are being searched for. The benefits of hard-coding
versus working with the product of more primes balances out at about 8900
prime numbers. After that my version becomes increasingly faster. My
version also stores some additional intermediate information, so it does not
need to use modulo. I expected modulo to be a fairily expensive operation,
but some benchmarking on other code shows that modulo is faster than my
accumulated intermediate results, at least for small numbers. Perhaps
modulo becomes more expensive for larger numbers, and that accounts for the
relatively rapid divergence in the 5000 prime step benchmarks below, and
also for the larger than expected (by me) delay in my version catching up to
the modulo version. A previous version of my implementation was only using
the product of the first 2 primes, and I was not seeing as big of
differences with that as I am with Ruby 1.9.

I also need to study the single line from Ruby 1.9 that actually finds the
next prime:
@@primes.push @@next_to_check if @@primes[2..@@ulticheck_index].find
{|prime| @@next_to_check % prime == 0 }.nil?
That seems to have the same result as my code, but I use a 45 line method to
get there. Part of the difference is the logic to avoid using modulo. The
rest is (I think) using iterators and blocks where I have looping structures
(I know, bad habits carried over from other languages). My looping
structure includes early exit and restart cases. I think the .find in above
is handling that.

If there is interest in my current version, I can drop a copy someplace.
For now, I will do some more work on it.

Starting benchmark script for hpdmathn.rb adjusted from Ruby 1.9 version
user system total real
hpdmathn Prime: 1 to 8900 4.172000 0.031000 4.203000 (
4.219000)
Starting benchmark script for mathn.rb from Ruby 1.9 version of Prime
user system total real
mathn 1.9 Prime: 1 to 8900 3.922000 0.297000 4.219000 (
4.219000)

Ruby 1.9 benchmark (1000 primes at a time)
user system total real
mathn 1.9 Prime: 1 to 1000 0.172000 0.031000 0.203000 (
0.203000)
mathn 1.9 Prime: 1001 to 2000 0.266000 0.000000 0.266000 (
0.265000)
mathn 1.9 Prime: 2001 to 3000 0.312000 0.016000 0.328000 (
0.328000)
mathn 1.9 Prime: 3001 to 4000 0.422000 0.015000 0.437000 (
0.438000)
mathn 1.9 Prime: 4001 to 5000 0.438000 0.031000 0.469000 (
0.468000)
mathn 1.9 Prime: 5001 to 6000 0.516000 0.047000 0.563000 (
0.563000)
mathn 1.9 Prime: 6001 to 7000 0.625000 0.016000 0.641000 (
0.640000)
mathn 1.9 Prime: 7001 to 8000 0.641000 0.047000 0.688000 (
0.688000)
mathn 1.9 Prime: 8001 to 9000 0.672000 0.109000 0.781000 (
0.781000)
mathn 1.9 Prime: 9001 to 10000 0.625000 0.203000 0.828000 (
0.828000)
mathn 1.9 Prime: 10001 to 11000 0.828000 0.063000 0.891000 (
0.891000)
mathn 1.9 Prime: 11001 to 12000 0.891000 0.094000 0.985000 (
0.985000)
mathn 1.9 Prime: 12001 to 13000 0.922000 0.109000 1.031000 (
1.031000)
mathn 1.9 Prime: 13001 to 14000 1.000000 0.109000 1.109000 (
1.109000)
mathn 1.9 Prime: 14001 to 15000 1.109000 0.078000 1.187000 (
1.188000)

My Prime class bench (1000 primes at a time)
user system total real
hpdmathn Prime: 1 to 1000 0.219000 0.031000 0.250000 (
0.250000)
hpdmathn Prime: 1001 to 2000 0.281000 0.000000 0.281000 (
0.282000)
hpdmathn Prime: 2001 to 3000 0.359000 0.000000 0.359000 (
0.360000)
hpdmathn Prime: 3001 to 4000 0.438000 0.000000 0.438000 (
0.437000)
hpdmathn Prime: 4001 to 5000 0.500000 0.000000 0.500000 (
0.500000)
hpdmathn Prime: 5001 to 6000 0.547000 0.000000 0.547000 (
0.547000)
hpdmathn Prime: 6001 to 7000 0.593000 0.000000 0.593000 (
0.594000)
hpdmathn Prime: 7001 to 8000 0.641000 0.000000 0.641000 (
0.640000)
hpdmathn Prime: 8001 to 9000 0.672000 0.000000 0.672000 (
0.672000)
hpdmathn Prime: 9001 to 10000 0.703000 0.000000 0.703000 (
0.703000)
hpdmathn Prime: 10001 to 11000 0.750000 0.000000 0.750000 (
0.751000)
hpdmathn Prime: 11001 to 12000 0.766000 0.000000 0.766000 (
0.765000)
hpdmathn Prime: 12001 to 13000 0.796000 0.000000 0.796000 (
0.797000)
hpdmathn Prime: 13001 to 14000 0.829000 0.000000 0.829000 (
0.828000)
hpdmathn Prime: 14001 to 15000 0.875000 0.000000 0.875000 (
0.875000)

Ruby 1.9 benchmark (5000 primes at a time)
user system total real
mathn 1.9 Prime: 1 to 5000 1.532000 0.156000 1.688000 (
1.687000)
mathn 1.9 Prime: 5001 to 10000 3.047000 0.422000 3.469000 (
3.469000)
mathn 1.9 Prime: 10001 to 15000 4.687000 0.547000 5.234000 (
5.235000)
mathn 1.9 Prime: 15001 to 20000 5.922000 0.922000 6.844000 (
6.844000)
mathn 1.9 Prime: 20001 to 25000 7.437000 0.468000 7.905000 (
7.923000)
mathn 1.9 Prime: 25001 to 30000 8.954000 0.313000 9.267000 (
9.266000)
mathn 1.9 Prime: 30001 to 35000 10.406000 0.359000 10.765000 (
10.782000)
mathn 1.9 Prime: 35001 to 40000 12.125000 0.532000 12.657000 (
12.657000)
mathn 1.9 Prime: 40001 to 45000 13.734000 1.000000 14.734000 (
14.734000)
mathn 1.9 Prime: 45001 to 50000 15.391000 2.062000 17.453000 (
17.454000)
mathn 1.9 Prime: 50001 to 55000 16.234000 1.406000 17.640000 (
17.642000)
mathn 1.9 Prime: 55001 to 60000 19.328000 2.063000 21.391000 (
21.392000)
mathn 1.9 Prime: 60001 to 65000 20.500000 2.703000 23.203000 (
23.203000)
mathn 1.9 Prime: 65001 to 70000 22.188000 2.766000 24.954000 (
24.985000)
mathn 1.9 Prime: 70001 to 75000 23.844000 2.968000 26.812000 (
26.813000)

My Prime class bench (5000 primes at a time)
user system total real
hpdmathn Prime: 1 to 5000 1.781000 0.047000 1.828000 (
1.829000)
hpdmathn Prime: 5001 to 10000 3.156000 0.000000 3.156000 (
3.156000)
hpdmathn Prime: 10001 to 15000 4.000000 0.000000 4.000000 (
4.000000)
hpdmathn Prime: 15001 to 20000 4.703000 0.000000 4.703000 (
4.704000)
hpdmathn Prime: 20001 to 25000 5.297000 0.000000 5.297000 (
5.297000)
hpdmathn Prime: 25001 to 30000 5.750000 0.000000 5.750000 (
5.750000)
hpdmathn Prime: 30001 to 35000 6.282000 0.000000 6.282000 (
6.281000)
hpdmathn Prime: 35001 to 40000 6.718000 0.000000 6.718000 (
6.719000)
hpdmathn Prime: 40001 to 45000 7.110000 0.000000 7.110000 (
7.110000)
hpdmathn Prime: 45001 to 50000 7.468000 0.000000 7.468000 (
7.469000)
hpdmathn Prime: 50001 to 55000 7.797000 0.000000 7.797000 (
7.797000)
hpdmathn Prime: 55001 to 60000 8.125000 0.000000 8.125000 (
8.141000)
hpdmathn Prime: 60001 to 65000 8.453000 0.000000 8.453000 (
8.454000)
hpdmathn Prime: 65001 to 70000 8.719000 0.000000 8.719000 (
8.719000)
hpdmathn Prime: 70001 to 75000 9.016000 0.000000 9.016000 (
9.016000)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,969
Messages
2,570,161
Members
46,705
Latest member
Stefkari24

Latest Threads

Top