Help traversing and modifying hash keys and values in place

Trans

Having a little trouble here. It seems that I'm getting some errors trying
to dup certain values (like Fixnum), or if I try clone it says that I
can't modify a frozen string. Is there a way to achieve this
functionality? Here's the code:

class Hash

  # Returns a new hash created by traversing the hash and its
  # subhashes, executing the given block on each key/value pair.
  #
  #   h = { "A"=>"A", "B"=>"B" }
  #   h = h.traverse { |k,v| k.downcase! }
  #   h  #=> { "a"=>"A", "b"=>"B" }
  #
  def traverse( &yld )
    h = {}
    self.each_pair do |k,v|
      q = k.dup
      f = v.dup
      if f.kind_of?(Hash)
        h[q] = f.traverse( &yld )
      else
        yield(q, f)
        h[q] = f
      end
    end
    return h
  end

end


Thanks,
T.
 
Robert Klemme

Unfortunately not all objects can be dup'd or cloned. I wish that #dup
would return self in these cases but Matz decided to do it otherwise (both
alternatives have their merits). You need special treatment for that to
work. Alternatively use Marshal.
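
A minimal sketch of the "special treatment" mentioned above, assuming a hypothetical helper named safe_dup (it is not part of Ruby or of the code in this thread). It falls back to the original object when #dup raises TypeError, which is how older Ruby versions reported values such as Fixnum that cannot be copied:

```ruby
# Hypothetical helper: returns a copy when the object can be dup'd,
# otherwise returns the object itself.
def safe_dup(obj)
  obj.dup
rescue TypeError
  # Raised by objects that refuse to be copied; reuse the original instead.
  obj
end

key  = "A"
copy = safe_dup(key)
copy.downcase!        # changes the copy only; key is still "A"
number = safe_dup(1)  # immutable value, so sharing it is harmless
```

The trade-off is the one discussed in this thread: callers can no longer assume that the result is always a fresh instance.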

Another problem of your implementation is that it doesn't take recursive
structures into account. I also find the copying of values inside the
method suboptimal: there might be cases where you don't want or need copies
(for example if you add 1 to all keys and values, which creates new
instances anyway). So I'd leave the copying to the discretion of the block.

If I wanted to do the conversion you did, I'd probably do this:

h.inject({}) { |acc, (k, v)| acc[k.downcase] = v; acc }

(Yes, I know these are not exactly equivalent.)
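
One way the two differ, shown as a small sketch (an observation about the one-liner, not necessarily the difference Robert had in mind): inject only walks the top level, so nested hashes are neither traversed nor copied.

```ruby
h = { "A" => { "B" => 1 } }

# Top-level keys are downcased; values are stored as they are.
flat = h.inject({}) { |acc, (k, v)| acc[k.downcase] = v; acc }

nested_untouched = (flat == { "a" => { "B" => 1 } })  # inner key is still "B"
nested_shared    = flat["a"].equal?(h["A"])           # same inner hash object
```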

Just some 0.02EUR...

Kind regards

robert
 
Trans

Thanks Robert.

I agree about returning self. Since dup can't be used when the object
is immutable (that's the condition isn't it?) then it would seem
reasonable to return self.

I'm not sure what recursive structures you mean. And I don't see how it
could work if the duping is left to the block. How could a new hash be
built up then?

Hmm... perhaps it would work better if a hash parameter were fed into
the block itself and then that could be used?

h = h.traverse { |n,k,v| n[k.downcase] = v }

Would that be a better approach?

BTW, someone sent me another version similar to my original, but using
inject and somehow getting around the dup problem like so:

def traverse(&b)
  inject({}) do |h,kv|
    k,v = kv.first.dup, kv.last.dup
    b[k,v]
    h[k] = (Hash === v ? v.traverse(&b) : v)
    h
  end
end

T.
 
Robert Klemme

Trans said:
> Thanks Robert.
>
> I agree about returning self. Since dup can't be used when the object
> is immutable (that's the condition isn't it?) then it would seem
> reasonable to return self.

Yeah, but in that case other applications can break because they think they
have a new instance while they have not. As I said, both approaches have
their merits - a classical dilemma. :)

> I'm not sure what recursive structures you mean. And I don't see how it
> could work if the duping is left to the block. How could a new hash be
> built up then?

No, you need to create a new Hash instance. But you don't necessarily need
to dup keys and values. That's the dup I suggested to leave to the block.

> Hmm... perhaps it would work better if a hash parameter were fed into
> the block itself and then that could be used?
>
> h = h.traverse { |n,k,v| n[k.downcase] = v }
>
> Would that be a better approach?

I wouldn't do that.

> BTW, someone sent me another version similar to my original, but using
> inject and somehow getting around the dup problem like so:
>
> def traverse(&b)
>   inject({}) do |h,kv|
>     k,v = kv.first.dup, kv.last.dup
>     b[k,v]
>     h[k] = (Hash === v ? v.traverse(&b) : v)
>     h
>   end
> end

That's not a solution for the dup problem, as "k,v = kv.first.dup,
kv.last.dup" will throw anyway.

If you really always need copies then the easiest might actually be to use
Marshal and work on the copy. Marshal has solved all the problems of
recursion and duping already - so why do the work twice?
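
A minimal sketch of the Marshal approach, assuming the hash holds only values that Marshal can serialize (it raises TypeError for things like Procs, IO objects, or hashes with a default proc). The variable names are illustrative:

```ruby
# Deep copy via a Marshal round trip. Shared and cyclic references
# inside the structure are preserved in the copy.
original = { "A" => { "B" => "b" } }
original["SELF"] = original      # a recursive (self-referencing) hash

copy = Marshal.load(Marshal.dump(original))

copy["A"]["B"] = "changed"       # affects the copy only
cycle_kept = copy["SELF"].equal?(copy)
```

Because the copy shares nothing with the original, the traversal can then modify keys and values freely without calling dup on each one.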

Kind regards

robert
 
Trans

Robert said:
> Yeah, but in that case other applications can break because they think they
> have a new instance while they have not. As I said, both approaches have
> their merits - a classical dilemma. :)

Could Marshal be used to remedy this?

> No, you need to create a new Hash instance. But you don't necessarily need
> to dup keys and values. That's the dup I suggested to leave to the
> block.

Okay, I'll give this some more thought.

>> Hmm... perhaps it would work better if a hash parameter were fed into
>> the block itself and then that could be used?
>>
>> h = h.traverse { |n,k,v| n[k.downcase] = v }
>>
>> Would that be a better approach?
>
> I wouldn't do that.

No good, eh? I was thinking it would work something like inject. But
maybe this overcomplicates the problem.

> That's not a solution for the dup problem, as "k,v = kv.first.dup,
> kv.last.dup" will throw anyway.

Oops. You're right, still some problems there.

> If you really always need copies then the easiest might actually be to use
> Marshal and work on the copy. Marshal has solved all the problems of
> recursion and duping already - so why do the work twice?

Okay, I'll give that a go too. Thanks, robert.

T.
 
Trans

So here's what I've settled on. Good? Or is there still a better way to
do it?

# Returns a new hash created by traversing the hash and its subhashes,
# executing the given block on the key and value. The block should
# return a 2-element array of the form +[key, value]+.
#
#   h = { "A"=>"A", "B"=>"B" }
#   h = h.traverse { |k,v| [k.downcase, v] }
#   h  #=> { "a"=>"A", "b"=>"B" }
#
def traverse(&b)
  inject({}) do |h,(k,v)|
    nk, nv = b[k,v]
    h[nk] = (Hash === nv ? nv.traverse(&b) : nv)
    h
  end
end
 
Robert Klemme

There's a subtle thing: you decide whether you traverse on the other
instance *after* the block got it. Dunno whether that is what you want but
I think I remember the other version decided this based on the original
value. You probably want to leave that out altogether and have the block
decide this. In this case traverse would become even simpler.

Ah, and traverse will crash for recursive structures.
>> h={}
=> {}
>> h[1]=h
=> {1=>{...}}
>> h.traverse {|*a| a}
SystemStackError: stack level too deep
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):5:in `traverse'
.... 11844 levels...
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):5:in `traverse'
from (irb):3:in `inject'
from (irb):3:in `each'
from (irb):3:in `inject'
from (irb):3:in `traverse'
from (irb):12

:)

Kind regards

robert
 
Trans

Robert said:
> There's a subtle thing: you decide whether you traverse on the other
> instance *after* the block got it. Dunno whether that is what you want but
> I think I remember the other version decided this based on the original
> value.

Good catch. Thank you, so:

def traverse(&b)
  inject({}) do |h,(k,v)|
    nk, nv = b[k,v]
    h[nk] = (Hash === v ? v.traverse(&b) : nv)
    h
  end
end

> You probably want to leave that out altogether and have the block
> decide this. In this case traverse would become even simpler.

Hmmm... not sure what "this" refers to. Assuming you mean, whether to
traverse or not. Something like:

def traverse(&b)
  inject({}) do |h,(k,v)|
    nk, nv = b[k,v]
    h[nk] = nv
    h
  end
end

Is that right? In that case I wouldn't call it "traverse" though --more
like a hash version of collect. Actually I think I already have that in
the libs as Enumerable#build_hash.
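
For comparison, a flat "hash version of collect" like this can be sketched with core methods alone, assuming Ruby 2.1 or later for Array#to_h. The signature of Enumerable#build_hash is not shown in this thread, so it is not used here:

```ruby
h = { "A" => "A", "B" => "B" }

# map yields each key/value pair and collects the returned [key, value]
# arrays; to_h turns that list of pairs back into a hash.
remapped = h.map { |k, v| [k.downcase, v] }.to_h
```

As with the simplified method above, this does not descend into nested hashes; the block would have to do that itself.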

> Ah, and traverse will crash for recursive structures.

Should I go through the complexity of preventing that? Well, at very
least I will note it in the docs.
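
A rough sketch of what guarding against recursion could look like, under some assumptions: traverse_safe is a hypothetical standalone function (not the method from this thread), and the seen table is an illustrative name for a hash that compares keys by object identity (Hash#compare_by_identity, Ruby 1.9.2 or later). Each source hash is registered before its entries are visited, so a cycle resolves to the copy already being built instead of recursing forever:

```ruby
# Hypothetical cycle-aware variant. `seen` maps each visited source
# hash (by identity) to the new hash built for it.
def traverse_safe(hash, seen = {}.compare_by_identity, &block)
  return seen[hash] if seen.key?(hash)   # already visited: reuse its copy

  result = {}
  seen[hash] = result                    # register before descending
  hash.each_pair do |k, v|
    nk, nv = block.call(k, v)
    # Recurse based on the original value, as in the corrected version above.
    result[nk] = v.is_a?(Hash) ? traverse_safe(v, seen, &block) : nv
  end
  result
end

h = { "A" => 1 }
h["SELF"] = h                            # recursive structure
r = traverse_safe(h) { |k, v| [k.downcase, v] }
```

Here r["self"] refers back to r itself, mirroring the shape of the original. Whether that bookkeeping is worth it over a note in the docs is a judgment call.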

Thanks robert,
T.
 
Trans

Sigh, sorry if my last message is garbled --blame Google groups.

BTW, speaking of #build_hash: Anyone have a "smoother" name? I was
thinking possibly #remap.

Which leads me to wonder, actually, why is #map the same as #collect?
Always struck me as strange that those two words would be considered
synonymous. I'm guessing it's derived from another language?

T.
 
