Justus Ohlhaver
Ken said: I'm going to suggest what Todd Benson and Rolando Abarca
suggested, which is to just work with strings in the database. Don't
bother with computing some kind of (possibly unique) hash. Use a
CREATE INDEX statement to index the headline field, and you'll probably
never notice a speed difference between your roundabout method and
feeding in the string directly to the database.
--Ken
Thanks again for all your help, everyone!
I have already run one small test using an additional integer column
instead of the original headline string. To convert the headline string
into an integer value I used the .hash method. The database I'm using is
MySQL. With a very small sample of entries (about 1000) I found
virtually no difference in the time it took to check the entire table
for existing entries, whether the searches compared against the string
column or the integer column. If there is any difference in the time it
takes, it is less than 1%. Since the integer approach involves an
additional computation (the .hash call), one could perhaps assume that
the integer column by itself must be slightly faster for the database
to check. In any case, I am going to stick with the original string
column for the headline field for now.
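
For anyone who wants to repeat the integer-column experiment, here is a
rough sketch of deriving an integer key from a headline. One caveat: in
current Ruby versions String#hash is seeded per process, so its values
change between runs and are not suitable for storing in a table. The
sketch therefore swaps in Zlib.crc32 from the standard library as a
stable stand-in; that substitution, the method names and the sample
headlines are all assumptions for illustration, not what was used in
the test above.

```ruby
require "zlib"

# Derive a stable 32-bit integer key from a headline string.
# (Stand-in for String#hash, which is not stable across Ruby processes.)
def headline_key(headline)
  Zlib.crc32(headline)
end

# Check whether a headline is already known.
# Different strings can share a key, so the integer match is confirmed
# against the string itself before answering true.
def known_headline?(headline, stored_headlines)
  stored_keys = stored_headlines.map { |h| headline_key(h) }
  stored_keys.include?(headline_key(headline)) &&
    stored_headlines.include?(headline)
end

# Hypothetical sample data.
stored = ["Rails 2.0 released", "MySQL indexing tips"]
puts known_headline?("MySQL indexing tips", stored)
puts known_headline?("Something new", stored)
```

Because two different headlines can map to the same integer, the string
comparison is still needed in the end, which is one more reason the
plain string column is the simpler choice.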
I will try to optimize the table by indexing the headline field as
suggested. One question about this: can it be done from Rails, or
are these MySQL commands ('CREATE INDEX' etc.)?
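
As far as I know, both routes work: a Rails migration can request the
index with add_index, or it can pass raw SQL to execute. The sketch
below only builds the SQL text so it runs without Rails. The table name
articles, the index name and the helper create_index_sql are
hypothetical and would need adjusting to the real schema.

```ruby
# Build a MySQL CREATE INDEX statement for a single column.
# prefix_length is optional; MySQL requires a prefix length when the
# column is a TEXT or BLOB type rather than VARCHAR.
def create_index_sql(table, column, prefix_length: nil)
  indexed = prefix_length ? "#{column}(#{prefix_length})" : column.to_s
  "CREATE INDEX index_#{table}_on_#{column} ON #{table} (#{indexed})"
end

puts create_index_sql(:articles, :headline)
puts create_index_sql(:articles, :headline, prefix_length: 100)

# Inside a Rails migration, the same index can be requested with:
#   add_index :articles, :headline
# or by handing the generated statement to `execute`.
```

The add_index form keeps the schema change in the migration history,
which is usually easier to maintain than running the statement by hand
in the mysql client.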
Thanks again for all the help!
Justus
and found virtually no difference in the time it took to compare about
100,000 rows in MySQL.
using an integer value which was derived using 'headline'.hash
Again, thanks everybody