Subclassing Hash to enforce value uniqueness ala key uniqueness.

Adam Gardner · Nov 18, 2008

First of all, hello to everyone. This is my first message to this list.

I find myself needing something very much like a Hash, but with unique
one-to-one mapping in both directions, both key-to-value, and
value-to-key. Now, I've been looking around a bit on Google, then
Rubyforge, the RAA, and the archives of this list, and I'm a bit
surprised I wasn't able to find a prior implimentation of this. Time to
impliment my own, right?

Well, before I got too far into the process, I though I would ask for
people's thoughts here. First off, here is the effect I would like to
see. I'll call my subclass OneToOne for now, though I'm open to better
names.

oto = OneToOne[:a=>1, :b=>2, :c=>3] => {:a=>1, :b=>2, :c=>3} # or on 1.8.x, {:b=>2, :c=>3, :a=>1}
oto[:d] = 1 => 1
oto => {:b=>2,:c=>3,:d=>1}
OneToOne[:a=>1,:b=>1,:c=>1]

Click to expand...

=> {:c=>1}
...
etc

Here is some code which begins to implement such a class:

class OneToOne < Hash

def self.[](*args)
super(*args).invert.invert
end

def initialize(*args)
super(*args)
self.fix
end

def []=(key,val)
self.delete_if {|k,v| v == val}
super(key,val)
end

def store(key,val)
self.delete_if {|k,v| v == val}
super(key,val)
end

def key(val) #this is just a nicety to arrange for code to run in 1.9
and 1.8
if Hash.method_defined?

key)
super(val)
else
self.index(val)
end
end

def fix
self.replace(self.invert.invert)
end
end

This code works to a certain degree, but I see two problems going
forward. First of all, Hash is a built-in in ruby which is written (as
far as I am aware) entirely in C. As such, there is no 'core' method
which can be overridden to affect change in all the methods which build
on it. At least, implementing []= did not give me store, and
implimenting store did not give me []=. Neither gives me merge. How many
other methods will I therefore have to override to ensure value
uniqueness? It seems to me, every method which could modify the contents
of the hash in some way (other than simply removing entries). Even then,
could I guarantee no two keys would have the same value, with the same
level of certainty I have that no two values have the same key?

The second problem is efficiency: It seems to me that this could
probably be done much more efficiently, especially if implemented in C.
It doesn't affect my much for what I need it for right now, but if I
want to reuse it later (or if someone else wants to use it), it could
matter a great deal.

(Incidentally: searching for information about this has been somewhat
hampered by the fact that the phrase 'hash value' is very ambiguous
between the a value contained in a hash and paired with a key, or value
returned by a hashing algorithm such as Object#hash or MD5. It doesn't
help that most of other terms I can think of to search for, such as
bidirectional or unique or one-to-one, are more common when talking
about the latter than the former. Now I understand why several other
languages choose the term 'Dictionary'.)

Anyway, if someone has either a) a prior implementation of this, or b)
some suggestions on improving my implimentation, I'd love to hear it.

Thanks for your time,
- Adam Gardner

Trans · Nov 18, 2008

Anyway, if someone has either a) a prior implementation of this, or b)
some suggestions on improving my implimentation, I'd love to hear it.

You may be able to save yourself some time with a little meta
programming. Eg.

class HashSet < Hash

Hash.instance_methods(false).each do |m|
define_method(m, *a, &b)
r =3D super
fix
r
end
end

def fix
self.replace(self.invert.invert)
end

end

You'll want to improve upon that code of course, probably limit it to
only certain methods. But you get the idea.

T.

Sebastian Hungerecker · Nov 18, 2008

Adam said:
The second problem is efficiency: It seems to me that this could
probably be done much more efficiently, especially if implemented in C.

Every time you add a value you iterate over all the other values to check
whether the value is already there. This makes adding an element O(n). Having
adding to a datastructure be an O(n) operation is usually a bad idea. Here's
how I'd probably do it (untested):

class OneToOne
def initialize()
@key_value = {}
@value_key = {}
end

def [](k)
@key_value[k]
end

def []=(k,v)
if @key_value.has_key?(k)
@value_key.delete(@key_value[k])
end
if @value_key.has_key(?v)
@key_value.delete(@value_key[v])
end
@key_value[k] = v
@value_key[v] = k
end

def to_hash
@key_value
end
def invert
@value_key
end
...
end

HTH,
Sebastian

Robert Klemme · Nov 18, 2008

Every time you add a value you iterate over all the other values to check
whether the value is already there. This makes adding an element O(n). Having
adding to a datastructure be an O(n) operation is usually a bad idea. Here's
how I'd probably do it (untested):

<snip>good example</snip>

Right, my main point would be: why subclass? You inherit a lot of
methods that you need to watch out for. Rather, as Basti showed,
implement a new class with all the necessary methods.

Kind regards

robert

PS: Sorry for the nicknaming.

James Coglan · Nov 18, 2008

[Note: parts of this message were removed to make it a legal post.]

Every time you add a value you iterate over all the other values to check
whether the value is already there. This makes adding an element O(n).
Having
adding to a datastructure be an O(n) operation is usually a bad idea.
Here's
how I'd probably do it (untested):

# example (elided)

This seems like a good idea, I'd have suggested the same. This is exactly
what Set in the standard library does to ensure uniqueness, it puts the
members as keys in a hash and lets Ruby sort out the hashtable as
efficiently as possible. So a pair of mutually inverted hash tables is
probably the way to go.

Sebastian Hungerecker · Nov 19, 2008

Sebastian said:
=A0 def to_hash
=A0 =A0 @key_value
=A0 end

This should return @key_value.dup=20

=A0 def invert
=A0 =A0 @value_key
=A0 end

And this should either return @value_key.dup or another instance of OneToOn=
e=20
where the both hashes are switched.

HTH,
Sebastian
=2D-=20
Jabber: (e-mail address removed)
ICQ: 205544826

uniqueness of keys in group_by	1	Aug 4, 2011
Ensuring uniqueness of an object at creation time	9	Feb 26, 2011
I made a blockchain and want to make a cryptocurrency, but my code doesn't verify hash of each block	2	Jun 2, 2024
How to Recover Data from BitLocker Encrypted Drive with Key?	1	Oct 28, 2024
problem determining uniqueness in array of arrays	2	Feb 4, 2011
Checking for uniqueness	2	Nov 19, 2007
Key Factors in JavaScript Testing from a QA Engineer's Perspective	0	May 29, 2024
Sets, uniqueness not unique.	26	Sep 13, 2005

Subclassing Hash to enforce value uniqueness ala key uniqueness.

Adam Gardner

Trans

Sebastian Hungerecker

Robert Klemme

James Coglan

Sebastian Hungerecker

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads