Hash

R

Ron Green

What is the purpose of string hash? What would you use it for?
EXAMPLE: "This is a test.".hash RETURNS -649841898. WHAT EXACTLY DOES
THIS REPRESENT? WHAT IS IT GOOD FOR? COMPARISONS?
 
P

Phlip

Ron said:
What is the purpose of string hash? What would you use it for?
EXAMPLE: "This is a test.".hash RETURNS -649841898. WHAT EXACTLY DOES
THIS REPRESENT? WHAT IS IT GOOD FOR? COMPARISONS?

Run "another test".hash - you get a different number.

It's good for hashes; look them up in any "data structures" textbook. It's a
unique number useful for rapidly accessing that string.

And please DON'T SCREAM!
 
M

Marcel Molina Jr.

Run "another test".hash - you get a different number.

It's good for hashes; look them up in any "data structures"
textbook. It's a unique number useful for rapidly accessing that

It should be noted though that String#hash isn't garaunteed to be
unique.
 
P

Phlip

Ron said:
Then,again I ask, what is it good for?

Then I answer again: A tutorial on data structures tells how to use them.

(Maybe I shouldn't post when I'm having a bad day, huh?;)
 
R

Ron Green

Peter said:
It's still useful as a hash. Marcel wasn't wrong, but *no* fixed size
hash
is "guaranteed" to be unique as that's absolutely impossible, per the
pigeonhole principle
(http://en.wikipedia.org/wiki/Pigeonhole_principle).
String#hash's hash is of a far lower "quality" than that offered by,
say,
SHA-1 or SHA-2.

Regards,
Peter Cooper
http://www.rubyinside.com/

Peter,
If Its not guaranteed to be unique, then it can't be used for identity.
Can you give me an example of how i would use string.hash?
 
M

Marcin Raczkowski

Ron said:
Then,again I ask, what is it good for?
It's for internal use, it's used in Hash to make accessing and finding
strings in hash faster
 
K

Konrad Meyer

--nextPart9518005.DP6sjhEFFz
Content-Type: text/plain;
charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

=20
Peter,
If Its not guaranteed to be unique, then it can't be used for identity.
Can you give me an example of how i would use string.hash?

Let's put it this way. MD5 and SHA-* hashes aren't *guaranteed* to be
unique either. There's just many more cases where strings will share a hash
with String#hash as opposed to something like MD5/SHA-*.

Hashes are useful for identify strings in hashtables. You use this every
time you say something like:

foo =3D {"bar" =3D> "baz"}
foo["bar"] # =3D> "baz"

HTH,
=2D-=20
Konrad Meyer <[email protected]> http://konrad.sobertillnoon.com/

--nextPart9518005.DP6sjhEFFz
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQBG5EriCHB0oCiR2cwRAvaaAJ91scTFNpTDfPNUaXzOswwfcyvTjwCgoRBH
Qfi/XhKnqsF1CjNQDINtUP8=
=UXmM
-----END PGP SIGNATURE-----

--nextPart9518005.DP6sjhEFFz--
 
A

Alex Young

Ron said:
Peter,
If Its not guaranteed to be unique, then it can't be used for identity.
Can you give me an example of how i would use string.hash?
In general, you wouldn't use String#hash, although you might conceivably
want to override it. It's there for Hash. From the documentation on
Object#hash:

"Generates a Fixnum hash value for this object. This function must have
the property that a.eql?(b) implies a.hash == b.hash. The hash value is
used by class Hash."

Note the direction of implication: a == b => a.hash == b.hash, not
a.hash == b.hash => a == b.
 
R

Ron Green

Alex said:
In general, you wouldn't use String#hash, although you might conceivably
want to override it. It's there for Hash. From the documentation on
Object#hash:

"Generates a Fixnum hash value for this object. This function must have
the property that a.eql?(b) implies a.hash == b.hash. The hash value is
used by class Hash."

Note the direction of implication: a == b => a.hash == b.hash, not
a.hash == b.hash => a == b.

I think I understand. In other words it's not something I would use
directly. I just ran across it in Peter's book and wanted to make sure I
understood. Thanks everybody. Sorry if my ignorance pissed you off
Philip.
 
R

Ron Green

Ron said:
Michael Bevilacqua-Linn wrote:



I think I'll try Data Structures and Algorithms (Addison-Wesley Series
in Computer Science and Information Pr)

When I said it couldn't be used for identity I meant if you can't
guarantee uniqueness.How would you know if you retrieved the correct
data.
 
P

Phlip

Ron said:
When I said it couldn't be used for identity I meant if you can't
guarantee uniqueness.How would you know if you retrieved the correct
data.

You don't need uniqueness. The hash did its job when you can almost
instantly chop billions of strings down to a short list of candidate
strings. After the hash collision, you trivially search the list for the
actual target. The point is to access the stored value, at the target
location, quickly!

This is how Google works, for example...
 
R

Ron Green

Phlip said:
You don't need uniqueness. The hash did its job when you can almost
instantly chop billions of strings down to a short list of candidate
strings. After the hash collision, you trivially search the list for the
actual target. The point is to access the stored value, at the target
location, quickly!

This is how Google works, for example...

Thank you.
 
M

Marcin Mielżyński

Ron Green pisze:
Thank you.

Here is an example:

a class whose instances have always the same #hash and #eql? which
always returns true

class Foo
def hash
puts "hash called"
0
end

def eql? other
puts "eql? called"
true
end
end


h = {}
f1 = Foo.new
f2 = Foo.new

# given

h[f1] = :blah

here, f1.hash is used to locate the bucket to be inserted into, so it
will only output "hash called".

h[f1]

here, the f1.hash is used to locate the bucket and then references will
be compared (f1 is identical to f1 so eql? won't have to be called).

h[f2]

here, the f2.hash is used to locate the bucket, it will be found, but f2
is not identical to f1, so eql? method will have to be used (which in
turn returns true, so the objects are considered equal) and finally the
lookup will be successfull.


In a hash, when two different objects return the same hash value, it's
called a collision. Equality operation here just atcs like a guard to
make sure we are dealing with right object.

lopex
 
S

Sam Smoot

In general, you wouldn't use String#hash, although you might conceivably
want to override it. It's there for Hash. From the documentation on
Object#hash:

"Generates a Fixnum hash value for this object. This function must have
the property that a.eql?(b) implies a.hash == b.hash. The hash value is
used by class Hash."

Note the direction of implication: a == b => a.hash == b.hash, not
a.hash == b.hash => a == b.

I might suggest you should *never* overwrite String#hash, though you
may want to overwrite Object#hash in your own classes. It's a bit
tricky though since the hash should be unique (or close enough), never
change, and two objects that share the same hash should be
Object#equal. You might also overwrite String#hash in a singleton
class instance, but never the base declaration. That's just askin' for
trouble. ;)
 
S

Sam Smoot

Ron Green pisze:
Thank you.

Here is an example:

a class whose instances have always the same #hash and #eql? which
always returns true

class Foo
def hash
puts "hash called"
0
end

def eql? other
puts "eql? called"
true
end
end

h = {}
f1 = Foo.new
f2 = Foo.new

# given

h[f1] = :blah

here, f1.hash is used to locate the bucket to be inserted into, so it
will only output "hash called".

h[f1]

here, the f1.hash is used to locate the bucket and then references will
be compared (f1 is identical to f1 so eql? won't have to be called).

h[f2]

here, the f2.hash is used to locate the bucket, it will be found, but f2
is not identical to f1, so eql? method will have to be used (which in
turn returns true, so the objects are considered equal) and finally the
lookup will be successfull.

In a hash, when two different objects return the same hash value, it's
called a collision. Equality operation here just atcs like a guard to
make sure we are dealing with right object.

lopex

Excellent! Thanks! I don't know if I've ever seen it explained so
clearly. That makes a lot more sense to me now. (And naturally,
disregard where I said Object#equal in my other post. Couldn't
remember positively if it was "equal" or "eql?". ;-) )

Anyways, good stuff. I actually need to do this myself soon enough...
 
L

Lionel Bouton

Ron Green wrote the following on 09.09.2007 21:48 :
I think I understand. In other words it's not something I would use
directly.

You could. Hashes are mainly used to restrict the set of objects you
have to look into to find objects identical to one you have or detect
changes in values (with a small margin for false negatives you must be
able to afford).
The Hash class uses it (I'm guessing storing the objects in a balanced
tree using object hashes as keys for quick access).

I used a hash method (not the Ruby's default one because I wasn't sure
it would still use the same algorithm in Ruby 3.0... :) ) not so long
ago to code a correlation algorithm across text contents. I'll spare the
details, but I used hashes to get both item lookup speed and storage
space efficiency.

Lionel
 
R

Rick DeNatale

I might suggest you should *never* overwrite String#hash, though you
may want to overwrite Object#hash in your own classes. It's a bit
tricky though since the hash should be unique (or close enough), never
change, and two objects that share the same hash should be
Object#equal.

No, just the other way around. Two object which are equal should have
the same hash value. But there's no requirement that two objects with
the same hash value be equal.

The way hash and equal interact in the implementation of the Hash
class is that comparing two objects hash values acts as a quick test
to rule out the objects being equal. If they don't have the same hash
they are assumed NOT to be equal. If they do then equality is tested
for the final determination.

If you want to think of it from an analogy with criminal law, if an
object is suspected of being equal to another object, the hashes must
be the same for an indictment, and the trial consists of actually
testing for equality.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Staff online

Members online

Forum statistics

Threads
474,264
Messages
2,571,323
Members
48,006
Latest member
TerranceCo

Latest Threads

Top