Marc Hoeppner
Hi,
I've been playing around with Ruby for a while now, but wouldn't consider
myself an experienced user.
For a new project I want to use an XML parser to extract some information
from a file. I understand that REXML is the tool of choice and that it
offers several ways to perform this task (tree parsing, stream
parsing).
My question concerns how to access multiple children of an element in
one go... I guess that requires some explanation:
This is, in principle, what the XML source looks like:
<entry>
  <name>...</name>
  <feature1>...</feature1>
  <feature2>...</feature2>
  <feature3>...</feature3>
</entry>
<entry>
  <name>...</name>
  <feature1>...</feature1>
  <feature2>...</feature2>
</entry>
and so on.
In reality, we are dealing with a file that holds information about
genes, their name, their location and some other features. Each gene
needs to be dealt with individually (e.g. iterating) as I have some
methods that need to be applied to each entry or rather some of its
features. What I can't figure out is this:
How do I get the entry (I figure it's Element.elements.each('entry')) and
then in the same "go" also some, but not all, of its children? These
children are at different levels, too. If I use
Element.elements.each('entry'), the whole entry gets stored as one
element in an array. That on its own is not a big problem, but at that
point I haven't even touched the children yet. If I try to further
treat them as if I were dealing with XML (i.e. filter for elements),
it won't work. But isn't there a way other than normal array methods and
simple text parsing to get the children?
Most, if not all, of the tutorials I found focus specifically on
attributes after filtering on the "primary" level, which is no good to
me since I don't have any attributes in my XML file (although having
those would make things much easier...).
Or in other words:
How can I filter for an entry and then puts (or store in a variable)
some of its children like
'puts Element.elements.each('entry') do {|output| puts
output('//feature1', '//feature2')}'.
The last bit is obviously nonsense, but it is, in principle, what I am
looking for.
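To make the intent concrete, here is a minimal sketch of the kind of loop being asked about, using REXML's tree API. The sample data is hypothetical: the <genes> root, the <nested> wrapper, and the names feature1 and feature2 are assumptions for illustration, not taken from the real file. It relies on each yielded entry being a REXML::Element, so that a relative XPath such as ".//feature2" can be evaluated against it again to reach children at any depth.

```ruby
require "rexml/document"

# Hypothetical sample data; the real file's tag names and nesting may differ.
xml = <<~XML
  <genes>
    <entry>
      <name>geneA</name>
      <feature1>a1</feature1>
      <nested><feature2>a2</feature2></nested>
      <feature3>a3</feature3>
    </entry>
    <entry>
      <name>geneB</name>
      <feature1>b1</feature1>
      <feature2>b2</feature2>
    </entry>
  </genes>
XML

doc = REXML::Document.new(xml)

results = []
doc.root.elements.each("entry") do |entry|
  # entry is a REXML::Element, so it can be queried like any other node.
  name = entry.elements["name"]&.text
  # ".//" searches below this entry only, at any depth.
  f1 = REXML::XPath.first(entry, ".//feature1")&.text
  f2 = REXML::XPath.first(entry, ".//feature2")&.text
  results << [name, f1, f2]
end

results.each { |name, f1, f2| puts "#{name}: #{f1}, #{f2}" }
```

The safe-navigation operator (&.) is used because an entry may lack one of the requested children, in which case the corresponding value is nil rather than an error.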
Anyhow, I hope someone understands what I am trying to say here and can
point me in the right direction.
Cheers,
Marc