XML Parsing

xhoster

I'm currently using XML::Parser to process some XML. I am wondering what
else is out there. A search of CPAN for 'XML' reveals that there is way
more out there than I can possibly evaluate, but most of it is not
general XML processing modules, but rather modules for very specific tasks
which just happen to involve XML.

I'd like to be familiar with the major alternatives to XML::Parser, without
having to read 3689 different perldoc pages to find those major
alternatives. So, what are your favorite Perl modules for general-purpose
XML processing?

Thanks,

Xho
 
Henry Law

And I thought it was just me, being easily confused! For my part I use
XML::Simple occasionally (for config files and straightforward things
like that), but more often XML::Twig.
 
ced

XML::LibXML is an alternative (I seem to recall reading somewhere that
XML::LibXML has some advantages over XML::Parser, but the details
escape me).
 
Michel Rodriguez

It depends what you are looking for:

XML::Simple is often enough to load XML data into a Perl data structure,
XML::LibXML gives you the speed and power of libxml2
XML::Twig is perlish and convenient (and written by me ;--)
XML::SAX::Machine gives you a framework for stream processing

I guess those are the ones I would recommend these days.

The Perl XML FAQ ( http://perl-xml.sourceforge.net/faq/ ) has some more
information. I also have a few articles on my site that show examples of
using the various modules: http://www.xmltwig.com/article/index_wtr.html
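To give a feel for how three of the modules listed above differ, here is a minimal sketch (not taken from the FAQ or the articles) that pulls the same titles out of one small document with XML::Simple, XML::LibXML and XML::Twig. The sample document and the variable names are invented for illustration, and XML::SAX::Machine is left out to keep it short.

```perl
use strict;
use warnings;
use XML::Simple;
use XML::LibXML;
use XML::Twig;

# Hypothetical sample document, invented for this illustration.
my $xml = <<'XML';
<library>
  <book id="b1"><title>Programming Perl</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</library>
XML

# XML::Simple: load the whole document into nested Perl data.
# ForceArray and KeyAttr are set explicitly so the shape is predictable.
my $data = XMLin($xml, ForceArray => ['book'], KeyAttr => []);
my @simple_titles = map { $_->{title} } @{ $data->{book} };

# XML::LibXML: build a DOM and query it with XPath.
my $dom = XML::LibXML->load_xml(string => $xml);
my @libxml_titles =
    map { $_->textContent } $dom->findnodes('/library/book/title');

# XML::Twig: handle each <book> as soon as it is parsed, then free it.
my @twig_titles;
my $twig = XML::Twig->new(
    twig_handlers => {
        book => sub {
            my ($t, $book) = @_;
            push @twig_titles, $book->first_child_text('title');
            $t->purge;    # release what has been processed so far
        },
    },
);
$twig->parse($xml);

print join(', ', @$_), "\n"
    for \@simple_titles, \@libxml_titles, \@twig_titles;
```

All three approaches should print the same list of titles; the difference is in how much of the document is held in memory and how you navigate it.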
 
robic0

I guess I'll try to revisit your post, even though it's 2 weeks later.
It isn't possible to know from your question what would be good for
you. I use multiple methods to manipulate XML. In Perl, that's just the
way it is right now. There's simple XML and there is complex XML with
nested elements. Knowing what it is you're trying to do helps. In
reality, there is no "one source" XML solution.
Some of the basic concerns are whether you're reading or writing,
whether you need validation, and whether you want DOM and/or SAX. The
current trend for reading is SAX, a "roll your own" event-driven
solution, as opposed to the "node" approach of a DOM.
With SAX-style parsing (XML::Parser; I use Expat, which is the layer
underneath it, I think) you can set handlers for start tags (with
attributes) and end tags, as well as handlers for character content and
for whatever other W3C constructs you wish. Be careful: this is only
easy to process when the elements are simple (not nested).
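As a rough illustration of the handler setup described above, here is a minimal sketch using XML::Parser's Start, End and Char handlers. The input string and the @events list are invented for this example; it is not code from an actual application.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical input, invented for this illustration.
my $xml = '<order id="42"><item>widget</item><item>gadget</item></order>';

my (@events, $text);

my $parser = XML::Parser->new(
    Handlers => {
        # Called at each opening tag; attributes arrive as name/value pairs.
        Start => sub {
            my ($expat, $element, %attr) = @_;
            $text = '';
            push @events, "start $element"
                . join('', map { " $_=$attr{$_}" } sort keys %attr);
        },
        # Called for character data; may fire more than once per text run.
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;
        },
        # Called at each closing tag.
        End => sub {
            my ($expat, $element) = @_;
            push @events, "text '$text'" if length $text;
            push @events, "end $element";
            $text = '';
        },
    },
);

$parser->parse($xml);
print "$_\n" for @events;
```

Note that the handlers only see one event at a time, so any notion of "where am I in the document" (here just the $text buffer) has to be tracked by your own code. That bookkeeping is what gets harder once elements nest.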
For example, let's say a known, compound (nested) structure is
coming down the pipe. After you start filling the container, you know
a certain inner container arrives as the sequence
<BigTag><tag1>content</tag1><tag2>content</tag2><tag3>content</tag3><tag4>content</tag4></BigTag>.
As soon as the "BigTag" start handler is triggered, you start
accumulating everything between <BigTag> and </BigTag>. When you have
the string, you pass it to XML::Simple to create a hash that gets
embedded into your container structure, the "tag#" being the key and
the content the value.
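One way to sketch that accumulate-then-convert idea, assuming XML::Parser for the event stream and XML::Simple for the conversion: the handlers below use the Expat object's recognized_string method to collect the raw markup of each <BigTag>, and the collected fragments are handed to XMLin after the parse finishes. The <feed> and <other> elements, the variable names, and the assumption that <BigTag> never nests inside itself are all invented for this example.

```perl
use strict;
use warnings;
use XML::Parser;
use XML::Simple;

# Hypothetical input; <BigTag> and <tag1>..<tag4> follow the post, the
# <feed> and <other> elements are invented for this illustration.
my $xml = '<feed><other>skip me</other>'
        . '<BigTag><tag1>a</tag1><tag2>b</tag2>'
        . '<tag3>c</tag3><tag4>d</tag4></BigTag>'
        . '</feed>';

my @fragments;    # raw markup of each completed <BigTag>
my $buffer;       # markup of the <BigTag> being collected, if any

# While inside <BigTag>, append the markup that triggered the current event.
my $collect = sub {
    $buffer .= $_[0]->recognized_string if defined $buffer;
};

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            $buffer = '' if $element eq 'BigTag';    # start accumulating
            $collect->($expat);
        },
        Char => $collect,
        End  => sub {
            my ($expat, $element) = @_;
            $collect->($expat);
            if ($element eq 'BigTag') {              # fragment is complete
                push @fragments, $buffer;
                undef $buffer;
            }
        },
    },
);
$parser->parse($xml);

# Turn each fragment into a hash: tag name => content.
my @containers =
    map { XMLin($_, KeyAttr => [], ForceArray => 0) } @fragments;

print "$_ => $containers[0]{$_}\n" for sort keys %{ $containers[0] };
```

Everything outside <BigTag> is ignored here; in a real handler set you would process those elements in the same Start/End callbacks.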
So, you have to break it up this way and use what is known. XML::Simple
will work on simple XML (no nested elements) to get what you're
interested in. XML::Simple has to be tweaked too.
While all this is going on, parser calls have to be wrapped in evals
and the resulting errors acted upon.
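A minimal sketch of the eval wrapping, assuming XML::Parser (which dies on input that is not well-formed). The helper name parse_ok and the two test strings are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;
use XML::Parser;

# Parse a string and return (1, undef) on success or (0, $message) on
# failure. parse_ok is a hypothetical helper name for this illustration.
sub parse_ok {
    my ($xml) = @_;
    my $parser = XML::Parser->new;    # no handlers: well-formedness only
    my $ok  = eval { $parser->parse($xml); 1 };
    my $err = $ok ? undef : ($@ || 'unknown parse error');
    return ($ok ? 1 : 0, $err);
}

for my $doc ('<a><b>fine</b></a>', '<a><b>broken</a>') {
    my ($ok, $err) = parse_ok($doc);
    print $ok ? "ok\n" : "parse failed: $err\n";
}
```

The trailing 1 inside the eval block makes success detectable from the return value instead of relying only on $@ being empty.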
For casual reading of "unknown" XML for display purposes, use
Microsoft's browser (it doesn't use Perl).
I use ActiveState's Perl with Expat, XML::Simple, and Xerces (Apache).
I also use them with Perl2Exe (-tiny) in a commercial-level app.
So I guess you have to ask yourself what exactly it is you want
to do. Perl and XML don't fit hand in glove; if you're looking for
something "quick & dirty", you will not find the solution in Perl.
Hope this helps a little....
 
robic0

Followup: you might find the "quick & dirty" in Perl, but it
won't be the complete answer. I have found the complete "read"
answer in Perl, including validation (using Xerces). I would
only use Xerces for the "write" answer, not XML::Simple or
any other solution. To use Xerces, you have to scour the
C++ source code, and it's hit and miss, mostly miss. As
of this date, schema checking works excellently using the Xerces
pm interface. It took me so much time to test the interface
(but only as it pertains to schema) that I may revisit the C
prototypes and publish an actual implementation (the author
just used a bot creator).
 
