small array/hash question

Kev Jackson

Hi all,

I'm trying to loop across a dataset and create a hash where each value
is an array so that later I can loop over the hash and for each key
(it's important I store the key), I can loop over the array contained
and spit out some results.

Without looking at the docs I wanted to do something like...

(in pseudo-code)
loop across data here
  work_types[nsc_id]=do |types|
    types << data[7]
  end
end loop

where work_types is the hash and types is the array I want to accumulate
data in.

This doesn't work, so I'm wondering what the Ruby idiom for this kind of
thing would be. Essentially for each piece of data I want to get the
appropriate value from the Hash and append the value on to the end of
the array associated with the key, or if it doesn't exist in the Hash,
create a new entry with a new array populated with the value.

I'm sure there's a very simple way of doing this, but I can't see the
method I want in the standard library docs. I thought it might be
collect, but it doesn't look like it.

Thanks
Kev
 
Ara.T.Howard

i think you want something like this:

harp:~ > irb
irb(main):001:0> work_types = Hash::new{|h,k| h[k] = []}
=> {}
irb(main):002:0> work_types[ 'foo' ] << 42
=> [42]
irb(main):003:0> work_types[ 'foo' ] << 42
=> [42, 42]
irb(main):004:0> work_types[ 'bar' ] << 'forty-two'
=> ["forty-two"]
irb(main):005:0> work_types
=> {"foo"=>[42, 42], "bar"=>["forty-two"]}

if not you'll have to post more about your exact problem and some sample data.
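
A minimal sketch of how that default-block hash might be used in a loop over
the data. The sample rows and the column layout (id in column 0, work type in
column 7, echoing the data[7] from the question) are made up for illustration
and are not the poster's real data.

```ruby
# Hypothetical sample rows: column 0 holds the nsc_id, column 7 the work type.
rows = [
  ['n1', nil, nil, nil, nil, nil, nil, 'survey'],
  ['n2', nil, nil, nil, nil, nil, nil, 'trawl'],
  ['n1', nil, nil, nil, nil, nil, nil, 'dredge']
]

# The block runs only when a missing key is looked up;
# it stores a fresh array under that key and returns it.
work_types = Hash.new { |hash, key| hash[key] = [] }

rows.each do |data|
  nsc_id = data[0]
  work_types[nsc_id] << data[7]
end

# Later: walk every key together with the array collected for it.
work_types.each do |nsc_id, types|
  puts "#{nsc_id}: #{types.join(', ')}"
end
```

Because the block assigns h[k], the key is stored the first time it is looked
up, so the later loop sees every id that appeared in the data.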

hth.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells among the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
 
Kev Jackson

I got the output I wanted with this

work_types = Hash.new
if work_types.has_key?(nsc_id) then
  work_types[nsc_id] = work_types[nsc_id].include?(work_type) ?
    work_types[nsc_id] : work_types[nsc_id] << work_type
else
  work_types[nsc_id] = [work_type]
end

So the problem is solved, but I wonder if there's a more elegant way of
doing it (especially the check to see if the value is already in the
array). My first assumption was that assignment to a Hash took a block
(hence the pseudo code); I was actually a little surprised that it didn't ;)
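
Plain assignment to a Hash does not take a block, but something close to the
pseudo-code can be written with ||=, and Hash#fetch does accept a block for
the missing-key case. A small sketch with made-up keys and values:

```ruby
work_types = {}

# Create the array the first time the key is seen, then append to it.
(work_types['n1'] ||= []) << 'survey'
(work_types['n1'] ||= []) << 'dredge'

# Hash#fetch takes a block that supplies a value when the key is absent.
# Unlike the ||= lines above, it does not store anything in the hash.
missing = work_types.fetch('n2') { [] }
```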

Kev
 
Dave Burt

How about this?

work_types = Hash.new {|h, k| h[k] = [] }
work_types[nsc_id] << work_type unless work_types[nsc_id].include?(work_type)

Cheers,
Dave
 
Ara.T.Howard

this is one easy way

work_types = Hash::new{|h,k| h[k] = []}

work_types[ nsc_id ].push( work_type ).uniq!


but does a bit of extra work. another way would be to use set

require 'set'

work_types = Hash::new{|h,k| h[k] = Set::new}

work_types[ nsc_id ] << work_type

but you must understand set and its notion of equality. plus you lose data
order but, since you are ignoring dups, i guess this isn't important.
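
To illustrate the point about equality: a Set stores its members the way a
Hash stores keys, comparing them with eql? and hash, whereas Array#include?
compares with ==. A brief sketch with made-up values:

```ruby
require 'set'

work_types = Hash.new { |hash, key| hash[key] = Set.new }

work_types['n1'] << 'survey'
work_types['n1'] << 'survey'   # duplicate string, silently ignored
work_types['n1'] << 1
work_types['n1'] << 1.0        # kept, because 1.eql?(1.0) is false

# Array#include? uses ==, so it treats 1 and 1.0 as the same value.
array_sees_float = [1].include?(1.0)

# Convert back to an array when one is needed for output.
members = work_types['n1'].to_a
```

On the ordering point: in current Ruby versions a Set tends to iterate in
insertion order in practice, but it is still documented as an unordered
collection, so it seems safer not to rely on that.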

or perhaps you can model your data with a nested hash?

work_types = Hash::new{|h,k| h[k] = {}}

work_types[ nsc_id ][ work_type ] = true

and then use

values = work_types[ nsc_id ].keys
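
A short sketch of the nested-hash idea with made-up ids and work types. The
keys of the inner hash act as the de-duplicated list; the true values are only
placeholders.

```ruby
work_types = Hash.new { |hash, key| hash[key] = {} }

pairs = [['n1', 'survey'], ['n1', 'dredge'], ['n1', 'survey'], ['n2', 'trawl']]

pairs.each do |nsc_id, work_type|
  # Assigning to a key that already exists leaves the set of keys unchanged.
  work_types[nsc_id][work_type] = true
end

values = work_types['n1'].keys
```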

or just make your own approach more compact

work_types = Hash::new

work_types[ nsc_id ] = [ work_types[ nsc_id ], work_type ].flatten.compact.uniq
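
A step-by-step sketch of what a one-liner of this shape does, using made-up
values. The helper name add_unique is hypothetical. The sketch relies on a
flatten call: without flattening, the existing array would end up nested
inside the new one from the second insertion onward.

```ruby
# Hypothetical helper that spells out the compact one-liner in steps.
def add_unique(work_types, nsc_id, work_type)
  existing = work_types[nsc_id]      # nil the first time, an array afterwards
  combined = [existing, work_type]   # [nil, "survey"] or [["survey"], "dredge"]
  # flatten merges the old array into one list, compact drops the nil,
  # uniq removes a value that was already present.
  work_types[nsc_id] = combined.flatten.compact.uniq
end

work_types = {}
add_unique(work_types, 'n1', 'survey')
add_unique(work_types, 'n1', 'dredge')
add_unique(work_types, 'n1', 'survey')   # already present, list is unchanged
```

One caveat with this shape: flatten would also flatten a work_type that is
itself an array, so it only suits scalar values such as strings or numbers.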

you have options - and there is always sqlite if you start to feel like you
are rolling query logic on top of this data structure ;-)

regards.

-a
 
Kev Jackson

Ara.T.Howard said:
this is one easy way

work_types = Hash::new{|h,k| h[k] = []}

work_types[ nsc_id ].push( work_type ).uniq!

That looks promising. I see it essentially relies on defining the Hash with
a block that sets the initial value, so that I can avoid the "else
work_types[nsc_id]= [work_type] end" part.

uniq! certainly looks like it would shorten my code.

but does a bit of extra work. another way would be to use set

require 'set'

work_types = Hash::new{|h,k| h[k] = Set::new}

work_types[ nsc_id ] << work_type

Yeah, I was thinking of Set, but I want to require/include as little as
possible to keep complexity down for other people to maintain.

but you must understand set and its notion of equality. plus you lose data
order but, since you are ignoring dups, i guess this isn't important.

or perhaps you can model your data with a nested hash?

work_types = Hash::new{|h,k| h[k] = {}}

work_types[ nsc_id ][ work_type ] = true

and then use

values = work_types[ nsc_id ].keys

or just make your own approach more compact

work_types = Hash::new

work_types[ nsc_id ] = [ work_types[ nsc_id ], work_type ].flatten.compact.uniq

I'm not sure how this works in a manner that's similar to the (verbose but
easy to understand) version I have already. In fact, I can't understand
half of what's going on here!

you have options - and there is always sqlite if you start to feel like you
are rolling query logic on top of this data structure ;-)

Dear god no! :). I'm only munging data dumps from Oracle, I'd hate to
have to store them in a database just to transform them!

Kev
 
