Will XML::Simple work with keys, strings, integers, and dates?

Wes Barris

Hi,

I am trying to use XML::Simple to parse an xml file. However, the xml file
that I am trying to parse is not in the same format as any of the XML::Simple
examples that I have seen. In all of the examples I have seen, the xml tags
are specific to their contents. In my xml file, the tag names are generic.
Here is a short sample of the xml that I am trying to parse:

<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
<key>Genre</key><string>Specialty Rock</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>6459519</integer>
<key>Total Time</key><integer>322951</integer>
<key>Date Modified</key><date>2005-02-16T12:03:00Z</date>
<key>Date Added</key><date>2005-02-16T11:59:14Z</date>
<key>Bit Rate</key><integer>160</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Track Type</key><string>File</string>

<key>Location</key><string>file://localhost/M:/1970s/Alice%20Bowie%20-%20Earache%20My%20Eye.mp3/</string>
<key>File Folder Count</key><integer>2</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech & Chong</string>
<key>Genre</key><string>Specialty Rock</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>1875968</integer>
<key>Total Time</key><integer>156204</integer>
<key>Date Modified</key><date>2005-02-16T12:03:21Z</date>
<key>Date Added</key><date>2005-02-16T11:59:15Z</date>
<key>Bit Rate</key><integer>96</integer>
<key>Sample Rate</key><integer>32000</integer>
<key>Track Type</key><string>File</string>

<key>Location</key><string>file://localhost/M:/1970s/Cheech%20&%20Chong%20-%20Earache%20My%20Eye.mp3/</string>
<key>File Folder Count</key><integer>2</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>

I would like to be able to extract things like the "Name", "Artist", and
"Location" but I don't understand how to associate one of the elements of
the key array with one of the elements of the resulting string array.
 
John Bokma

Jim said:
<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>

[XML lines snipped]

Is that really your XML? You have nested tags with the same name:
<dict>. You also have <key> tags at different levels. That is going to
make parsing more difficult.

Nope, for an XML parser that really doesn't matter.
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech & Chong</string>

[more lines snipped]
</dict>

I would like to be able to extract things like the "Name", "Artist",
and "Location" but I don't understand how to associate one of the
elements of the key array with one of the elements of the resulting
string array.

You have some very poorly designed XML there. It would be better if it
were something like

<attribute name="Track ID" value="35"> ...

In some cases that will be considered poor design too. Some people
recommend using elements rather than attributes unless the value is
meta-information about the element itself.

Better: <track>36</track>
<name>.......</name>
If you cannot change the XML definition, then you are probably better
off using a SAX parser. XML::SAX::PurePerl works, but it is slow. For
big files, try XML::Parser and the expat library. I found it about 75
times faster in my one use.

In a SAX parser, you define a handler package with callbacks that are
called for each element in the XML. Then, you will be able to
associate the <key> value with the subsequent value attribute because
you will get the callbacks sequentially.
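
For illustration, here is a minimal sketch of the callback approach described above. It uses XML::Parser's handler interface, which is event-driven in the same way as SAX but is not the SAX API itself. The variable names, the trimmed sample data, and the assumption that every track lives in a second-level <dict> are mine, not part of the original posts.

```perl
use strict;
use warnings;
use XML::Parser;

# Trimmed sample (assumption: the ampersand is escaped in the real file).
my $xml = <<'XML';
<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
</dict>
</dict>
XML

my %is_value = map { $_ => 1 } qw(string integer date);
my (%tracks, %current, $last_key);
my $text  = '';    # character data collected for the current element
my $depth = 0;     # how many <dict> elements we are inside

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            $text = '';
            if ($element eq 'dict') {
                $depth++;
                %current = () if $depth == 2;    # a new track begins
            }
        },
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;    # may arrive in several pieces
        },
        End => sub {
            my ($expat, $element) = @_;
            if ($element eq 'key') {
                $last_key = $text;               # remember for the next value
            }
            elsif ($is_value{$element} && $depth == 2) {
                $current{$last_key} = $text;     # pair the value with its key
            }
            elsif ($element eq 'dict') {
                $tracks{ $current{'Track ID'} } = {%current} if $depth == 2;
                $depth--;
            }
        },
    },
);
$parser->parse($xml);

print "$_: $tracks{$_}{Name} by $tracks{$_}{Artist}\n" for sort keys %tracks;
```

The state that has to be carried between callbacks ($last_key, $depth, %current) is what makes this style more work than a tree-based parser.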

Extremely cumbersome since, as you already noted, the same elements are
used at different levels.

I would probably use XML::Twig for this.
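
A rough sketch of what the XML::Twig approach might look like, assuming each track is a <dict> directly inside the outer <dict> and that every <key> is immediately followed by its value element. The handler path, variable names, and trimmed sample data are illustrative choices, not something taken from the original posts.

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed sample data for the sketch.
my $xml = <<'XML';
<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
</dict>
</dict>
XML

my %tracks;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once per inner <dict>, after it has been fully parsed.
        'dict/dict' => sub {
            my ($t, $dict) = @_;
            my %track;
            for my $key ($dict->children('key')) {
                # The value is the element right after the <key>.
                $track{ $key->text } = $key->next_sibling->text;
            }
            $tracks{ $track{'Track ID'} } = \%track;
        },
    },
);
$twig->parse($xml);

print "$_: $tracks{$_}{Name}\n" for sort keys %tracks;
```

Because the handler sees a whole track at once, no state has to be kept between callbacks.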
 
ko

Wes said:
Hi,

I am trying to use XML::Simple to parse an xml file.

XML::LibXML is nice - I use it to parse HTML too.

[snip sample data, included below with code]
I would like to be able to extract things like the "Name", "Artist", and
"Location" but I don't understand how to associate one of the elements of
the key array with one of the elements of the resulting string array.

Quick example, using the sample data you posted:

use strict;
use warnings;
use Data::Dumper;
use XML::LibXML;

# Slurp the XML that follows __DATA__ and parse it.
my $x = do { local $/; <DATA>; };
my $p = XML::LibXML->new;
my $d = $p->parse_string($x);
my $track = 'Track ID';
my %wanted = map { $_ => 1 } ($track, qw[Name Artist Location] );
my $data;
my $current_track = '';
# Visit every <key> in document order; its value is the element after it.
foreach my $n ( $d->findnodes('//key') ) {
    my $t = $n->textContent;
    next unless $wanted{$t};
    # Skip any whitespace-only text node between the key and its value.
    my $v = $n->nextNonBlankSibling->textContent;
    if ($t eq $track) {
        # 'Track ID' starts a new track; later keys attach to it.
        $data->{$v} = {};
        $current_track = $v;
    } else {
        $data->{$current_track}{$t} = $v;
    }
}

print Dumper($data);

__DATA__
<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key><string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
<key>Genre</key><string>Specialty Rock</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>6459519</integer>
<key>Total Time</key><integer>322951</integer>
<key>Date Modified</key><date>2005-02-16T12:03:00Z</date>
<key>Date Added</key><date>2005-02-16T11:59:14Z</date>
<key>Bit Rate</key><integer>160</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Track Type</key><string>File</string>

<key>Location</key><string>file://localhost/M:/1970s/Alice%20Bowie%20-%20Earache%20My%20Eye.mp3/</string>
<key>File Folder Count</key><integer>2</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
<key>Genre</key><string>Specialty Rock</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>1875968</integer>
<key>Total Time</key><integer>156204</integer>
<key>Date Modified</key><date>2005-02-16T12:03:21Z</date>
<key>Date Added</key><date>2005-02-16T11:59:15Z</date>
<key>Bit Rate</key><integer>96</integer>
<key>Sample Rate</key><integer>32000</integer>
<key>Track Type</key><string>File</string>

<key>Location</key><string>file://localhost/M:/1970s/Cheech%20&amp;%20Chong%20-%20Earache%20My%20Eye.mp3/</string>
<key>File Folder Count</key><integer>2</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
</dict>
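
A possible variation on the script above, sketched under the assumption that every track is a <dict> directly under the root <dict>. Instead of tracking the "current" track while walking all keys, it visits one track at a time and uses the XPath following-sibling axis to fetch each key's value, which also tolerates whitespace between a key and its value. The variable names and the trimmed sample data are illustrative only.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed sample; note the line break between one key and its value.
my $xml = <<'XML';
<dict>
<key>35</key>
<dict>
<key>Track ID</key><integer>35</integer>
<key>Name</key>
<string>Earache My Eye (Full Version)</string>
<key>Artist</key><string>Alice Bowie</string>
</dict>
<key>36</key>
<dict>
<key>Track ID</key><integer>36</integer>
<key>Name</key><string>Earache My Eye</string>
<key>Artist</key><string>Cheech &amp; Chong</string>
</dict>
</dict>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

my %tracks;
for my $dict ($doc->findnodes('/dict/dict')) {
    my %track;
    for my $key ($dict->findnodes('./key')) {
        # First element sibling after this <key>, whatever its tag name.
        $track{ $key->textContent } = $key->findvalue('following-sibling::*[1]');
    }
    $tracks{ $track{'Track ID'} } = \%track;
}

print "$_: $tracks{$_}{Name} by $tracks{$_}{Artist}\n" for sort keys %tracks;
```

This keeps every key of each track; filter with a %wanted hash as in the script above if only a few are needed.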


HTH - keith
 
