Retrieving and copying element from array

Simon Harrison

If I have an array like this:

["category: cat1",
"item1, item2, item3",
"category: cat2",
"item1",
"category: cat3",
"item1, item2, item3, item4",]

How can I have a new array like this:

[["cat1", "item1", "item2", "item3"],
["cat2", "item1"],
["cat3", "item1", "item2", "item3", "item4"]
]

Thanks for the help.
 
Anurag Priyam

If your initial array is called 'list' :
result = []
list.each_slice(2) {|i, j| result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}

Iterate over your list in pairs (each_slice): remove 'category: ' from
the first element, split the second element on ', ', and append the
results to an array.
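
For what it's worth, the same pairing idea can also produce the nested arrays asked for in the original post. This is only a minimal sketch, assuming the array strictly alternates between a category line and an item line; the variable names and the use of map are illustrative choices and not part of the one-liner as posted:

```ruby
# Sample data copied from the original post (assumed to alternate strictly).
list = [
  "category: cat1", "item1, item2, item3",
  "category: cat2", "item1",
  "category: cat3", "item1, item2, item3, item4"
]

# Walk the list two elements at a time and build one sub-array per pair.
result = list.each_slice(2).map do |category, items|
  name = category.sub(/\Acategory: /, '')  # strip the "category: " prefix
  [name, *items.split(', ')]               # category name followed by its items
end

p result
```

Calling each_slice without a block returns an enumerator, which is why map can be chained onto it here.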

--
Anurag Priyam
http://about.me/yeban/
 
Adam H.

arr.grep(/category:.*/).map{|a| [arr.at(arr.index(a) +1)]}

This will work as long as there is only one element after each 'category' line.
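
To make the caveat concrete, here is a small sketch of what the one-liner returns for the array from the original post, plus a variant that keeps the category name and splits the items. The variant and the names following and with_names are hypothetical additions for illustration, and both versions assume every category string is unique, since index finds the first match:

```ruby
arr = [
  "category: cat1", "item1, item2, item3",
  "category: cat2", "item1",
  "category: cat3", "item1, item2, item3, item4"
]

# The one-liner as posted: for each category line, take the element right after it.
following = arr.grep(/category:.*/).map { |a| [arr.at(arr.index(a) + 1)] }
# Each sub-array holds the single, still unsplit, string that follows a category.

# Hypothetical variant: keep the category name and split the item string.
with_names = arr.grep(/\Acategory: /).map do |a|
  [a.sub(/\Acategory: /, ''), *arr.at(arr.index(a) + 1).split(', ')]
end

p following
p with_names
```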
 
Simon Harrison

Thanks for reply, Anurag. Can't get this to work though:

irb(main):001:0> lines = File.readlines('test.txt')
=> ["category: cat1\n", " item1\n", " item2\n", " item3\n",
"category: cat2\n", " item1\n", "category: cat3\n", " item1\n", " item2\n",
" item3\n", " item4\n", "\n"]
irb(main):002:0> puts lines
category: cat1
item1
item2
item3
category: cat2
item1
category: cat3
item1
item2
item3
item4

=> nil
irb(main):003:0> result = []
=> []
irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object
from (irb):4:in `block in irb_binding'
from (irb):4:in `each'
from (irb):4:in `each_slice'
from (irb):4
from /usr/local/bin/irb:12:in `<main>'
irb(main):005:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); j.push(*j.split(', '))}
NoMethodError: undefined method `push' for " item1\n":String
from (irb):5:in `block in irb_binding'
from (irb):5:in `each'
from (irb):5:in `each_slice'
from (irb):5
from /usr/local/bin/irb:12:in `<main>'
 
Anurag Priyam

My bad; typed in wrong. The 'b' should be 'result' - we want to
collect the processed element in the same array.
result = []
lines.each_slice(2) {|i, j| result.push(i.sub(/category: /, '')); result.push(*j.split(', '))}

--
Anurag Priyam
http://about.me/yeban/
 
Josh Cheek

I recommend you don't store your data like this; it is fragile and
error-prone. You can see that your data already does not look like what you
described in your first post: each item in cat1 is on its own line (i.e.
index 1 in your first post is "item1, item2, item3", but in your actual data
it is "item1\n"). So even after you fix the part where he wrote b.push
instead of result.push, it will still be wrong.

I recommend using a real data format such as YAML, XML, or JSON. It's
actually much easier to get started with this than you might think: you can
just build the data in memory how you want it to look, then tell YAML to
convert it, and store it in a file. Ta-da, a valid YAML representation of
your data. Here is an example with this data: https://gist.github.com/778772
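
As a rough illustration of the "build it in memory, then convert" idea (this is not the content of the gist, just a sketch using Ruby's standard yaml library), the hash layout and key names below are assumptions picked for the example:

```ruby
require 'yaml'

# In-memory shape chosen for illustration: one hash per category.
data = [
  { 'category' => 'cat1', 'items' => ['item1', 'item2', 'item3'] },
  { 'category' => 'cat2', 'items' => ['item1'] },
  { 'category' => 'cat3', 'items' => ['item1', 'item2', 'item3', 'item4'] }
]

yaml_text = data.to_yaml          # serialize to a YAML string
restored  = YAML.load(yaml_text)  # parse it back into plain arrays and hashes

puts yaml_text
# To persist it, something like File.write('categories.yml', yaml_text) would do
# ('categories.yml' is a made-up file name).
```

Because the data consists only of arrays, hashes, and strings, the round trip gives back a structure equal to the original.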

It is slightly different in that I read them into hashes, because I dislike
storing category and items in the same array -- if it were me, I might even
go a step further and store them in a struct instead of a hash.
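
For the struct idea, a minimal sketch might look like the following. The member names name and items, and the local variable category_struct, are hypothetical choices for the example:

```ruby
# Struct.new builds a small class with accessors for the given members.
category_struct = Struct.new(:name, :items)

categories = [
  category_struct.new('cat1', ['item1', 'item2', 'item3']),
  category_struct.new('cat2', ['item1'])
]

# The category name and its items are now separate, named fields.
categories.each do |category|
  puts "#{category.name}: #{category.items.join(', ')}"
end
```

In a real program the struct would more commonly be assigned to a constant such as Category; a local variable is used here only to keep the snippet self-contained.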
 
Simon Harrison

Josh: Thanks for the reply and link to your github example. The thing
is, this data is coming from a text file. An export from an MS Access
database. I wouldn't choose to save in that format.
 
Josh Cheek

Hi, Simon. Okay, well, if we assume that all data will be nested below a
category, that a category is denoted by "category: ", that there isn't
leading or trailing whitespace, and that each item under a category is given
on its own line, then this should work with your data format.



# goal format for the data, as given in the original post
goal = [
  ["cat1", "item1", "item2", "item3"],
  ["cat2", "item1"],
  ["cat3", "item1", "item2", "item3", "item4"]
]


categories = Array.new
File.foreach "test.txt" do |line|
  if line =~ /^category:/
    categories << [ line.sub(/^category: /,'').chomp ]
  else
    categories.last << line.strip
  end
end

goal == categories # => true



puts File.read('test.txt')
# >> category: cat1
# >> item1
# >> item2
# >> item3
# >> category: cat2
# >> item1
# >> category: cat3
# >> item1
# >> item2
# >> item3
# >> item4
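
The File.readlines output shown earlier in the thread has leading spaces on the item lines and a trailing "\n" element, which the loop above would append as an empty string. A slightly more defensive sketch of the same approach, working on an in-memory array of lines, could look like this; skipping blank lines and ignoring items that appear before any category are assumptions about what is wanted, not something the thread specifies:

```ruby
# Lines as returned by File.readlines earlier in the thread.
lines = [
  "category: cat1\n", " item1\n", " item2\n", " item3\n",
  "category: cat2\n", " item1\n",
  "category: cat3\n", " item1\n", " item2\n", " item3\n", " item4\n",
  "\n"
]

categories = []
lines.each do |line|
  text = line.strip
  next if text.empty?                     # skip blank lines (assumption)

  if text.start_with?('category:')
    # Start a new group holding just the category name.
    categories << [text.sub(/\Acategory:\s*/, '')]
  elsif categories.any?
    # Attach the item to the most recent category.
    categories.last << text
  end
end

p categories
```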
 
Simon Harrison

That works great, thanks Josh. A couple of questions if you don't mind.

1. What is the purpose of the [] in the line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]

2. I've tried to achieve the same result using an existing array, rather
than reading from the file and I'm stuck. I'm using JRuby 1.6RC1 and
getting this error about NilClass. Any ideas?


irb(main):038:0> arr2 = []
irb(main):039:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

irb(main):040:0> arr.map do |item|
irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end

NoMethodError: undefined method `<<' for nil:NilClass
from (irb):44:in `evaluate'
from org/jruby/RubyArray.java:2460:in `collect'
from (irb):40:in `evaluate'
from org/jruby/RubyKernel.java:1091:in `eval'
from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
from org/jruby/RubyKernel.java:1421:in `loop'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'

irb(main):047:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]
irb(main):048:0> arr2
=> []
irb(main):049:0> arr2.empty?
=> true

irb(main):052:0> arr.each do |item|
irb(main):053:1* if item =~ /^cat/
irb(main):054:2> arr2 << item
irb(main):055:2> else
irb(main):056:2* arr2.last << item
irb(main):057:2> end
irb(main):058:1> end

NoMethodError: undefined method `<<' for nil:NilClass
from (irb):56:in `evaluate'
from org/jruby/RubyArray.java:1671:in `each'
from (irb):52:in `evaluate'
from org/jruby/RubyKernel.java:1091:in `eval'
from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
from org/jruby/RubyKernel.java:1421:in `loop'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'
irb(main):059:0>
 
Josh Cheek

1. What is the purpose of the [] in line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]
Yes, but not whatever matches. The square brackets are an array literal, so
they wrap the result of the expression in a new one-element array. The call
to #sub, with the second arg being an empty string, says to remove
"category: " from the string, if it is at the beginning. And the chomp
removes the newline. So if line is "category: cat1\n", then
line.sub(/^category: /,'').chomp will return "cat1". Then we stick that in
the Array.
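
Broken into separate steps, the expression behaves roughly like this (a small sketch; the intermediate variable names are only for illustration):

```ruby
line = "category: cat1\n"

without_prefix = line.sub(/^category: /, '')  # "cat1\n": the prefix is removed
name           = without_prefix.chomp         # "cat1": the trailing newline is removed
wrapped        = [name]                       # ["cat1"]: a new one-element array

categories = []
categories << wrapped                         # categories is now [["cat1"]]

p categories
```

Later items can then be pushed onto categories.last, which is that inner array.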

2. I've tried to achieve the same result using an existing array, rather
than reading from the file and I'm stuck. I'm using JRuby 1.6RC1 and
getting this error about NilClass. Any ideas?


irb(main):038:0> arr2 = []
irb(main):039:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

irb(main):040:0> arr.map do |item|
irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end
You are right on, here, just getting confused about your data format, again.
Your code will work correctly if arr is an array of the lines of your file,
such as you would get with File.readlines.

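In other words, in your irb example, arr is
[["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]
so every item is itself an Array rather than a String. The test
item =~ /^cat/ is therefore falsy for the very first item, the else branch
runs while arr2 is still empty, and arr2.last returns nil, which is where
the NoMethodError comes from.

In mine, the data was read in straight from the file, so it would be a flat
array of strings:
["cat1", "1", "2", "3", "cat2", "1", "2", "cat3", "1", "2"]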

If you fix that, it will work correctly.
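
For reference, here is a sketch of the same loop run against the flat array of strings, using each since nothing is being collected from the block:

```ruby
# Flat input: category names and items are all plain strings.
arr  = ["cat1", "1", "2", "3", "cat2", "1", "2", "cat3", "1", "2"]
arr2 = []

arr.each do |item|
  if item =~ /^cat/
    arr2 << [item]      # start a new group for this category
  else
    arr2.last << item   # add the item to the current group
  end
end

p arr2
```

This assumes the first element is always a category; if an item came first, arr2.last would again be nil.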


As a side note, you are doing arr.map
(http://ruby-doc.org/core/classes/Enumerable.html#M001491), but what you
really mean is arr.each
(http://ruby-doc.org/core/classes/Array.html#M000231). It isn't harming
anything, but it is misleading, because map implies you are trying to create
a new array by collecting the results of the blocks for each element, but
really you are just trying to iterate.
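
A tiny sketch of the difference, with made-up sample values:

```ruby
numbers = [1, 2, 3]

# map builds a new array from the value of the block for each element.
doubled = numbers.map { |n| n * 2 }

# each only iterates; its return value is the original array itself.
seen = []
returned = numbers.each { |n| seen << n }

p doubled   # the collected block results
p returned  # the same object as numbers
```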
 
