A algorithm for finding the number

zuerrong · Dec 1, 2010

Hi members,

Maybe this is not a special question to ruby.
But since my ruby app has meet this algorithm problem, so please give
the suggestion if you'd like.

I have these datas in a mysql table:

mysql> select startNum,endNum from ip_data limit 10;
+----------+----------+
| startNum | endNum |
+----------+----------+
| 16777216 | 16777471 |
| 16843008 | 16843263 |
| 16909056 | 16909311 |
| 17367040 | 17498111 |
| 17498112 | 17563647 |
| 17563648 | 17825791 |
| 17825792 | 18153471 |
| 18153472 | 18219007 |
| 18219008 | 18350079 |
| 18350080 | 18874367 |
+----------+----------+

(Those are actually the bigint for ip addresses.)

Given a numer, say 17498200, I want to find this number is in which
row of the table.

I could run the SQL with ruby's mysql API (dbi and mysql::dbd):

select * from ip_data where startNum <= #{number} and endNum >= #{number}

But I found that is much slow.

The talbe is not so large, has only 339542 rows totally.

So I was thinking loading the whole data into memory, discarding
mysql, and find a algorithm for doing it quickly.

Any suggestion please? Thanks.

Regards,
zuerrong

Mike Stephens · Dec 1, 2010

The common sense thing to do is exactly what you have done. Make sure
you have indexes on the fields. If it's still slow, try a different
database. Five minutes and you should be able to move the data into
Access (Jet), like this:

require 'win32ole'
connection = WIN32OLE.new('ADODB.Connection')
connection.Open('Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=c:\Ruby\RubyAccess.mdb')
recordset = WIN32OLE.new('ADODB.Recordset')
sql = 'SELECT * FROM T1'
recordset.Open(sql, connection)
data = recordset.GetRows.transpose
puts data
recordset.close
end

zuerrong · Dec 1, 2010

2010/12/1 Mike Stephens said:
The common sense thing to do is exactly what you have done. Make sure
you have indexes on the fields. If it's still slow, try a different
database.

Yes I have added keys to the columns in mysql.
While it's still slooow.
So I was thinking moving the whole data into memory, and find a good
algorithm for it.

Thanks.

Mike Stephens · Dec 1, 2010

I don't know how SQLite works but Access Jet would only fetch across the
network the index pages and the target data page(s). Perhaps SQLite is
bringing the whole database across. A normal server-based database would
obviously solve your problem.

botp · Dec 1, 2010

So I was thinking loading the whole data into memory, discarding
mysql, and find a algorithm for doing it quickly.
Any suggestion

try doing a soft binary search on the first column

best regards -botp

Barry Smith · Dec 1, 2010

[Note: parts of this message were removed to make it a legal post.]

Anyone know how i Unsubscribe from this ?

Giampiero Zanchi · Dec 1, 2010

zuerrong wrote in post #965316:

+----------+----------+
| startNum | endNum |
+----------+----------+
| 16777216 | 16777471 |
| 16843008 | 16843263 |
| 16909056 | 16909311 |
| 17367040 | 17498111 |
| 17498112 | 17563647 |
| 17563648 | 17825791 |
| 17825792 | 18153471 |
| 18153472 | 18219007 |
| 18219008 | 18350079 |
| 18350080 | 18874367 |
+----------+----------+

select * from ip_data where startNum <= #{number} and endNum >=
#{number}

try with

select * from ip_data where startNum = (select max(startnum) from
ip_data where startnum <= #{number})

Ben Bleything · Dec 1, 2010

I don't know how SQLite works but Access Jet would only fetch across the
network the index pages and the target data page(s). Perhaps SQLite is
bringing the whole database across. A normal server-based database would
obviously solve your problem.

They've said twice they're using mysql, not sqlite. And remember that
most folks aren't on Windows and can't use Access.

Quite aside from that, sqlite doesn't do any network access at all, it
is always filesystem access only.

Ben

Mike Stephens · Dec 1, 2010

Ben Bleything wrote in post #965416:

They've said twice they're using mysql, not sqlite. And remember that
most folks aren't on Windows and can't use Access.

Well spotted. However I don't know why you think Windows is not widely
used. It is in the corporate world.

Quite aside from that, sqlite doesn't do any network access at all, it
is always filesystem access only.

In the corporate world you rarely - if ever - store data except across a
network

Ben Bleything · Dec 1, 2010

Well spotted. However I don't know why you think Windows is not widely
used. It is in the corporate world.

Call it long experience in the Ruby community. At any rate, giving a
platform-specific solution, whether it's windows or not, is a big
assumption.

In the corporate world you rarely - if ever - store data except across a
network

Yes, of course. But since sqlite has no concept of the network, it all
appears to be local disk access as far as the database is concerned.
If you happen to be accessing it over a network filesystem, sqlite has
no idea. My point was that sqlite will never "bring the whole database
across" since it has no idea what that even means.

Ben

Mike Stephens · Dec 1, 2010

Ben Bleything wrote in post #965426:

My point was that sqlite will never "bring the whole database
across" since it has no idea what that even means.

Whilst this is not a Ruby issue, it is worth pointing out to those that
don't already know, that client databases like Jet (and I would wager,
SQLite) do not read all the database file if there are indexes they can
use.

Your program will read the first bit of the file and detect if there are
applicable indexes. It will then fetch the index pages and look up the
addresses (typically blocks that hold data pages) and fetch those from
the file system.

If you are looking at mickey mouse databases or stuff off a local drive,
this might not be of any consequence, but if you start to get hundreds
of
thousands of records, or millions, it will dramatically cut down your
fetches across the network and the work your program has to do to set up
the data.

If you have joins, the program might be able to do them purely off the
indexes, so that next to no real data has to be fetched.

The original poster was talking about MySQL as you rightly said. The
first issue is therefore to check why the Select was running slowly -
outside of Ruby. There are many reasons why that might be.

Databases are sophisticated components that have millions of man hours
lovingly invested in them. It would be entirely illogical to attempt to
reinvent the wheel in your Ruby program.

Problem with a login script, SESSION user rights and put this together so it works with the other pages and MySQL. Code examples.	2	May 5, 2023
find the number of hours in a day	2	Sep 21, 2009
(ArgumentError) - in `initialize': wrong number of arguments (4 for0)	11	Mar 6, 2011
Finding number of file from gzip'ed format	2	Feb 29, 2008
Finding a bit string in another bitstring	9	Oct 30, 2009
undefined method `debug' for nil:NilClass(NoMethodError	2	Aug 4, 2010
MySQL - "create table" creates malfunctioning tables	0	Jan 25, 2013
Multi-threaded DBI causing segfault	3	Jan 20, 2010

A algorithm for finding the number

zuerrong

Mike Stephens

zuerrong

Mike Stephens

botp

Barry Smith

Giampiero Zanchi

Ben Bleything

Mike Stephens

Ben Bleything

Mike Stephens

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads