REXML advice - output

Stuart Clarke · Sep 16, 2010

Hey all,

I would like to pick your brains about REXML and how to report from it.
For example, I am reading an XML file using references to each XML tag
like so:

doc.root.each_element("/UserData/List/ItemInfo/Title") {|e|
report.puts "Title: #{e.text}"
}
doc.root.each_element("/UserData/List/ItemInfo/Date") {|e|
report.puts "Date: #{e.text}"
}

The 'report.puts' writes this data out to a CSV file. At present I get a
list of all the titles in the XML file followed by a list of the dates.
What I need is to get them side by side in a CSV file, like so:

Title Date
Item1 20th Jan 2009
Item2 12th Feb 2010

Does anyone have any suggestions on a suitable workflow for this?

Many thanks

Robert Klemme · Sep 16, 2010

Stuart said:
Does anyone have any suggestions on a suitable workflow for this?

Just iterate over all "ItemInfo" elements and print values from sub
elements (which you can select via a relative XPath).

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Stuart Clarke · Sep 17, 2010

Robert said:
Just iterate over all "ItemInfo" elements and print values from sub
elements (which you can select via a relative XPath).

Kind regards

robert

Thanks for getting back to me. I will look into this and see how I get
on.

Thanks a lot Robert.

Stuart Clarke · Sep 20, 2010

Robert said:
Just iterate over all "ItemInfo" elements and print values from sub
elements (which you can select via a relative XPath).

Kind regards

robert

To confirm I am following you correctly, I have now got the following:

info = doc.elements.to_a("//UserData/List/ItemInfo/")

Printing out info gives one entry per line for every child under the
ItemInfo tag.

First of all, is this what you meant? Am I correct to assume that at
this point, you are suggesting I write this data to a CSV file stripping
off the tags with a regex or something? Is this correct?

Many thanks and apologies if I have misunderstood.
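
On the regex question: as far as I know, no tag stripping should be needed, because `to_a` hands back REXML::Element objects rather than strings, and each element exposes its content through `text`. A minimal sketch, assuming a made-up one-item document and an XPath without the trailing slash so that it selects the ItemInfo elements themselves (the sample data and the `rows` name are illustrative only):

```ruby
require 'rexml/document'

# Hypothetical sample input; the real file is not shown in the thread.
xml = '<UserData><List><ItemInfo><Title>Item1</Title>' \
      '<Date>20th Jan 2009</Date></ItemInfo></List></UserData>'
doc = REXML::Document.new(xml)

# No trailing slash: the path selects the ItemInfo elements themselves.
items = doc.elements.to_a('//UserData/List/ItemInfo')

rows = items.map do |item|
  # Each child is an REXML::Element; #name is the tag, #text its content.
  item.elements.to_a.map { |child| "#{child.name}=#{child.text}" }
end

rows.each { |row| puts row.join(', ') }
```

Printing an element directly serializes it with its tags, which is probably why the tags show up in the output; asking for `text` avoids that.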

brabuhr · Sep 20, 2010

I managed to mess-up clicking "Send" on Friday, so I'm trying again (-:

Stuart said:

Thanks for getting back to me. I will look into this and see how I get
on.

I pulled this out of a script I use quite a bit and hacked your XPath into it:

require 'rexml/document'

ARGV.each do |filename|
  # Parse each file named on the command line
  doc = REXML::Document.new(File.new(filename))

  # One ItemInfo per output line: Title, a tab, then Date
  doc.elements.each("/UserData/List/ItemInfo") do |e|
    print e.elements["Title"].text, "\t"
    puts e.elements["Date"].text
  end
end
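
Since the goal was a CSV file, the same per-ItemInfo loop can feed Ruby's `csv` library instead of printing tab-separated text. This is only a sketch under assumptions: the sample XML is invented to match the XPath in this thread, `csv_text` is an illustrative name, and the `&.` guards are there in case an item lacks a Title or Date.

```ruby
require 'rexml/document'
require 'csv'

# Hypothetical sample input matching the XPath used in this thread.
xml = <<~XML
  <UserData>
    <List>
      <ItemInfo><Title>Item1</Title><Date>20th Jan 2009</Date></ItemInfo>
      <ItemInfo><Title>Item2</Title><Date>12th Feb 2010</Date></ItemInfo>
    </List>
  </UserData>
XML

doc = REXML::Document.new(xml)

# Build one CSV row per ItemInfo so Title and Date stay side by side.
csv_text = CSV.generate do |csv|
  csv << %w[Title Date]
  doc.elements.each('/UserData/List/ItemInfo') do |item|
    # Relative paths: looked up among the children of this ItemInfo only.
    csv << [item.elements['Title']&.text, item.elements['Date']&.text]
  end
end

puts csv_text
```

Letting the library build each row also takes care of quoting if a title ever contains a comma.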

Stuart Clarke · Oct 26, 2010

Could anybody help me with an issue I am having with some XML I am
reading. I am using xpath to read 2 different parts of an XML file,
which looks a lot like this

<Data>
<DoneList><Vector><Count>84</Count>
<FullItemInfo>
<Count>0</Count>
<ItemInfo>
<Title>BLAH LAH</Title>
<Id>12345</Id>
</ItemInfo>
</Vector></DoneList>
<FullItemInfo>
NEXT ITEM AS ABOVE

Then I have further data, which is slightly different
<NotDoneList><Vector><Count>84</Count>
<FullItemInfo>
<Count>0</Count>
<ItemInfo>
<Title>BLAH LAH</Title>
<Id>12345</Id>
</ItemInfo>
</Vector></DoneList>
<FullItemInfo>
</Data>

As you can see, the tags are the same but the first is DoneList and the
second NotDoneList. I need to process each set separately and each set
can contain more than 1 entry. My code to give a CSV file is

doc = REXML::Document.new(d) # call REXML to open the XML file
#To get NotDoneList data
doc.elements.each("//NotDoneList/Vector/Count/FullItemInfo") do |e|
detail =
(
e.elements['ItemInfo/Title'].text << "," <<
e.elements['ItemInfo/Id'].text
)
puts detail
end

#To get DoneList data
doc.elements.each("//DoneList/Vector/Count/FullItemInfo") do |e|
detail =
(
e.elements['ItemInfo/Title'].text << "," <<
e.elements['ItemInfo/Id'].text
)
puts detail
end

When I run this, no data is extracted and no errors are given. In
contrast if I do
doc.elements.each("//FullItemInfo") do |e|
I am able to extract all the information for both the NotDoneList and
DoneList, however this is not what I want. I want to address each data
set separately. The eventual idea will be to produce a report of all
items in the NotDoneList and another report for those in the DoneList.
I guess I am doing something wrong but I cannot see it.

Can anyone see what I am doing wrong with this? I would really
appreciate any help as I cannot figure it out.

Many thanks

brabuhr · Oct 27, 2010

<DoneList><Vector><Count>84</Count>
<FullItemInfo>
<Count>0</Count>
<ItemInfo>
<Title>BLAH LAH</Title>
<Id>12345</Id>
</ItemInfo>
</Vector></DoneList>
<FullItemInfo>
NEXT ITEM AS ABOVE

Data
  DoneList
    Vector
      Count /Count
      FullItemInfo
        Count /Count
        ItemInfo
          Title /Title
          Id /Id
        /ItemInfo
    /Vector
  /DoneList
  FullItemInfo

The XML example you provided seems to have mismatched tags?

Then I have further data, which is slightly different
<NotDoneList><Vector><Count>84</Count>
<FullItemInfo>
<Count>0</Count>
<ItemInfo>
<Title>BLAH LAH</Title>
<Id>12345</Id>
</ItemInfo>
</Vector></DoneList>
<FullItemInfo>
</Data>

  NotDoneList
    Vector
      Count /Count
      FullItemInfo
        Count /Count
        ItemInfo
          Title /Title
          Id /Id
        /ItemInfo
    /Vector
  /DoneList
  FullItemInfo
/Data

doc.elements.each("//NotDoneList/Vector/Count/FullItemInfo")
doc.elements.each("//DoneList/Vector/Count/FullItemInfo") do |e|

Can you verify and re-post a clean XML snippet? (That may help debug
your XPath.) I'm going to guess:

<Data>
<DoneList>
<Vector>
<Count/>
<FullItemInfo/>
</Vector>
</DoneList>
</Data>

In which case, the XPath might be: '//DoneList/Vector/FullItemInfo'?
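
To illustrate that guess, here is a rough sketch against a hand-written, well-formed stand-in for the file. The layout, the item values, and the `item_rows` helper are all assumptions, not the real data. With FullItemInfo as a sibling of Count under Vector, each list can be addressed on its own, while a path that goes through Count selects nothing:

```ruby
require 'rexml/document'

# Hypothetical well-formed version of the data; the real layout is a guess.
xml = <<~XML
  <Data>
    <DoneList><Vector><Count>1</Count>
      <FullItemInfo><Count>0</Count>
        <ItemInfo><Title>Done item</Title><Id>12345</Id></ItemInfo>
      </FullItemInfo>
    </Vector></DoneList>
    <NotDoneList><Vector><Count>1</Count>
      <FullItemInfo><Count>0</Count>
        <ItemInfo><Title>Pending item</Title><Id>67890</Id></ItemInfo>
      </FullItemInfo>
    </Vector></NotDoneList>
  </Data>
XML

doc = REXML::Document.new(xml)

# Illustrative helper: "Title,Id" strings for one named list.
def item_rows(doc, list_name)
  doc.elements.to_a("//#{list_name}/Vector/FullItemInfo").map do |info|
    [info.elements['ItemInfo/Title'].text,
     info.elements['ItemInfo/Id'].text].join(',')
  end
end

done     = item_rows(doc, 'DoneList')
not_done = item_rows(doc, 'NotDoneList')

# Count is a sibling of FullItemInfo here, not its parent, so no match.
via_count = doc.elements.to_a('//DoneList/Vector/Count/FullItemInfo')

puts done, not_done
puts "matches via Count: #{via_count.size}"
```

An XPath that matches nothing is not an error; the block simply never runs, which would fit the "no data and no errors" symptom.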

Stuart Clarke · Oct 29, 2010

My issue was actually due to the mismatched tags; it was a broken XML
file.

Thanks for identifying that.
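
For anyone hitting the same thing, a quick well-formedness check before running any XPath can save some head-scratching. This is a sketch, assuming the parser reports a closing tag that does not match the open one by raising REXML::ParseException (which is what I would expect from REXML); `xml_error` is a made-up helper name and both sample strings are invented. Why the original script saw no error is not known from this thread.

```ruby
require 'rexml/document'

# Illustrative helper: nil when the text parses, else the parser's message.
def xml_error(text)
  REXML::Document.new(text)
  nil
rescue REXML::ParseException => e
  e.message
end

# FullItemInfo is never closed before </Vector> in the first sample.
broken = '<Data><DoneList><Vector><FullItemInfo><Count>0</Count>' \
         '</Vector></DoneList></Data>'
fixed  = '<Data><DoneList><Vector><FullItemInfo><Count>0</Count>' \
         '</FullItemInfo></Vector></DoneList></Data>'

[broken, fixed].each do |text|
  error = xml_error(text)
  puts error ? "not well-formed: #{error.lines.first}" : 'well-formed'
end
```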
