List all elements in xml with full paths

fazl.rahman

Hi

I've not been able to find nor concoct a simple filter to convert an
xml file into a list of all its elements (with full paths). This
seems to be a simple requirement...!

As an example I would like this input ...:

<?xml version="1.0"?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>

....To result in this output :

/Person
/Person/FirstName
/Person/LastName

Any gurus care to share a concise xsl that does this ?

I happened across the 'navigation shell' feature of xmllint during my
fumbling attempts recently - so, lacking a suitable xslt script, I
guess I could resort to post-processing the output of the shell's "du"
command, which looks like this:

$ xmllint --shell Person.xml
/ > du
/
Person
FirstName
LastName

(If the navigating shell from xmllint incorporated a unix-like 'find'
command, sheesh.. that would be so cool.)

Appreciate (any friendly :) feedback..
Thanks, Fazl
 
Martin Honnen

I've not been able to find nor concoct a simple filter to convert an
xml file into a list of all its elements (with full paths). This
seems to be a simple requirement...!

As an example I would like this input ...:

<?xml version="1.0"?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>

...To result in this output :

/Person
/Person/FirstName
/Person/LastName

Any gurus care to share a concise xsl that does this ?

Well what do you want to generate if there are namespaces used? What do
you want to generate if there are several FirstName and/or LastName
child elements?
 
Jürgen Kahrs

I've not been able to find nor concoct a simple filter to convert an
xml file into a list of all its elements (with full paths). This
seems to be a simple requirement...!

As an example I would like this input ...:

<?xml version="1.0"?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>

...To result in this output :

/Person
/Person/FirstName
/Person/LastName

You might consider using XMLgawk:

$ xgawk -lxml 'XMLSTARTELEM {print XMLPATH}' persons.xml
/Person
/Person/FirstName
/Person/LastName
 
Jürgen Kahrs

Correctly indented, it should look like this:

$ xgawk -lxml 'XMLSTARTELEM {print XMLPATH}' persons.xml
/Person
/Person/FirstName
/Person/LastName
 
fazl.rahman

Well what do you want to generate if there are namespaces used? What do
you want to generate if there are several FirstName and/or LastName
child elements?

I think I'd like to see this kind of output (if these elements are in
a namespace Foo)

Eg:

/
/Foo:person
/Foo:person/FirstName
/Foo:person/LastName

If there are several child elements, I want to see them all listed in
document order.

-Fazl
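For comparison, here is a minimal sketch of that wish list outside XSLT, using Python's standard xml.dom.minidom. The sample document, the namespace URI http://example.org/foo and the function name qualified_paths are made up for illustration. It keeps prefixes exactly as written in the source and lists repeated children in document order, without telling them apart.

```python
from xml.dom import Node, minidom


def qualified_paths(xml_text):
    """Return '/' followed by the path of every element, in document order."""
    paths = ["/"]

    def walk(node, parent_path):
        for child in node.childNodes:
            # Skip text, comments and processing instructions.
            if child.nodeType == Node.ELEMENT_NODE:
                # tagName is the qualified name, prefix included.
                path = parent_path + "/" + child.tagName
                paths.append(path)
                walk(child, path)

    walk(minidom.parseString(xml_text), "")
    return paths


# Hypothetical sample: a prefixed root with a repeated child element.
SAMPLE = (
    '<Foo:person xmlns:Foo="http://example.org/foo">'
    "<FirstName>Elvis</FirstName>"
    "<FirstName>Aaron</FirstName>"
    "<LastName>Presley</LastName>"
    "</Foo:person>"
)

if __name__ == "__main__":
    print("\n".join(qualified_paths(SAMPLE)))
```

Note that the prefix shown is whatever the document happens to use, so two documents binding different prefixes to the same namespace would print different paths.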
 
fazl.rahman

<xsl:for-each select="ancestor-or-self::*">
    <xsl:value-of select="concat('/',name())"/>
    <!--xsl:value-of
select="concat('[',count(preceding-sibling::*[name()=name(current())])+1,']')"/-->

sz.

I tried wrapping this suggestion in an <xsl:transform> tag and invoked
it thus:

$ xsltproc.exe pathify.xsl Person.xml

Well, a blank line comes out.

:-(

Then I noticed that your for-each tag is missing its closing tag.

Adding that, and adding a <xsl:template> container too also gives no
output.

Is this the kind of xslt script you mean by your code snippet ?

<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text"/>
<xsl:template match="*">

<xsl:for-each select="ancestor-or-self::*">
<xsl:value-of select="concat('/',name())"/>
</xsl:for-each>

</xsl:template>
</xsl:transform>


(If you tested it, can you share the actual script with us ?)

What am I doing wrong ?

Fazl
 
fazl.rahman

On 20 Jun., 18:15, "szomiz" wrote: [...]
Then I noticed that your for-each tag is missing its closing tag. [...]
What am I doing wrong ?

After a little tinkering (xsltproc's -v switch is useful), the
transform below delivers what I wanted. (BTW I expect to find this
useful to quickly check whether an element with a given name appears
in a large xml file and if so, at what level of nesting.)

I had to put your code in an xsl:template matching "node()" and
calling itself on the current node's contents after performing its
ancestral rites. Note the xsl:text element I added to force a new
line for each new element output - Is there a cleaner way to do
this ?

I don't like having to break the indentation of the script, but if I
line up the end tag with the start tag it indents the output by the
same amount of space.

Thanks, Fazl

PS Here's how it looks now:

fazl@ubuntu:~/bin/xslt$ cat Person.xml
<?xml version="1.0"?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>

fazl@ubuntu:~/bin/xslt$ cat szomiz.xsl
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text" />

<xsl:template match="node()">
<xsl:for-each select="ancestor-or-self::*">
<xsl:value-of select="concat('/',name())"/>
</xsl:for-each>
<xsl:text>
</xsl:text>

<xsl:apply-templates select="*"/>
</xsl:template>

</xsl:transform>

fazl@ubuntu:~/bin/xslt$ xsltproc szomiz.xsl Person.xml
/Person
/Person/FirstName
/Person/LastName
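As a cross-check on the transform's output, here is a small sketch of the same traversal in Python, using the standard xml.etree.ElementTree module instead of XSLT. The function name element_paths is only an illustrative choice. The sketch assumes the document uses no namespaces, since ElementTree would otherwise report tags as {uri}name.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the full path of every element, in document order."""
    paths = []

    def walk(elem, parent_path):
        path = parent_path + "/" + elem.tag
        paths.append(path)
        # With the default parser, iterating an element yields its
        # child elements only (comments and PIs are dropped).
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


PERSON = """<?xml version="1.0"?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>"""

if __name__ == "__main__":
    print("\n".join(element_paths(PERSON)))
```

The recursion mirrors the template: emit the current element's path, then descend into its element children.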
 
fazl.rahman

<xsl:template match="*">
    <xsl:param name="parPath"/>
    <xsl:variable name="locPath" select="concat($parPath,'/',name())"/>
    <xsl:value-of select="concat($locPath,'
&#10')"/>
    <xsl:apply-templates>
        <xsl:with-param name="parPath" select="$locPath"/>
    </
</

Well, I don't want to sound too ungrateful but this doesn't really
appeal over the one I managed to put together (with help from your
earlier hint - thanks).

- It's not shorter.
- It's not simpler.
- And it's not working...

:)

Beyond the broken (obviously, even to me) closing tags, semicolons
seem to be missing in the newline reference.

In fact, simply replacing my:

<xsl:text>
</xsl:text>

with :

<xsl:text>&#10;</xsl:text>

is enough ( and not only is shorter, it also works :)

[node()] = [* or text() or comment() or processing-instruction()]

This I *didn't* know. So in XSLT, * means something less general than
node() ?

I find this less than intuitive, *surprising* is the word.
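The distinction can be seen outside XSLT too. The sketch below uses Python's standard xml.dom.minidom rather than XPath, so it is only an analogy: childNodes plays roughly the role of child::node(), and filtering on ELEMENT_NODE plays roughly the role of child::*. The sample document and the names node_kinds and element_names are made up for illustration.

```python
from xml.dom import Node, minidom

# Hypothetical sample with one child of each kind under <Person>.
DOC = (
    "<Person><!-- a comment -->some text"
    "<FirstName>Elvis</FirstName><?target data?></Person>"
)

KINDS = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}

root = minidom.parseString(DOC).documentElement

# Every child, whatever its kind (compare child::node()).
node_kinds = [KINDS[n.nodeType] for n in root.childNodes]

# Element children only (compare child::*).
element_names = [
    n.tagName for n in root.childNodes if n.nodeType == Node.ELEMENT_NODE
]

if __name__ == "__main__":
    print(node_kinds)
    print(element_names)
```

Of the four children here, only FirstName survives the element filter, which is why a template matching * never fires for text, comments or processing instructions.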

But the funniest thing is, to spit out a list of all directories
under / in unix, all you need type is :

find / -type d

Now, XML is basically a tree of elements encoded in text form -- and
XSLT is supposed to be the 'native' way to process XML, yet it takes
an 8-line template (embedded inside a four-line wrapper to make it a
script) to do this simple task.

(Actually I don't mind the number of lines so much as the fact that it
just looks greek compared to "list all element xpaths".)

Am I the only one who suspects there is something rotten in the land
of XSLT ?
(Am I going to get flamed for heretical talk ? :)

Anyway, I'd be interested if anyone can put together a 'better' xslt
version than below ?
(Better includes 'shorter', 'easier to understand', but not 'really
cool AND really cryptic'..).

fazl@ubuntu:~/bin/xslt$ cat szomiz.xsl
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text" />

<xsl:template match="*">
<xsl:for-each select="ancestor-or-self::*">
<xsl:value-of select="concat('/',name())"/>
</xsl:for-each>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="*"/>
</xsl:template>
</xsl:transform>

fazl@ubuntu:~/bin/xslt$ cat Person.xml
<?xml version="1.0" standalone='yes'?>
<Person>
<FirstName>Elvis</FirstName>
<LastName>Presley</LastName>
</Person>

fazl@ubuntu:~/bin/xslt$ xsltproc szomiz.xsl Person.xml
/Person
/Person/FirstName
/Person/LastName


Thanks, Fazl
 
Hermann Peifer

I find this less than intuitive, *surprising* is the word.

But the funniest thing is, to spit out a list of all directories
under / in unix, all you need type is :

find / -type d
(...)

Am I the only one who suspects there is something rotten in the land
of XSLT ?
(Am I going to get flamed for heretical talk ? :)

If you feel that XSLT might not be the best choice for this task, you could give it a try with xmlgawk and use the one-liner (half-liner ;-) posted earlier by Juergen:

xgawk -lxml 'XMLSTARTELEM {print XMLPATH}'

For me, this comes closest to your find command example.

Hermann
 
Peter Flynn

I think I'd like to see this kind of output (if these elements are in
a namespace Foo)

Eg:

/
/Foo:person
/Foo:person/FirstName
/Foo:person/LastName

If there are several child elements, I want to see them all listed in
document order.

-Fazl

If you don't have xmlgawk, this works with SP and regular awk or gawk:

$ onsgmls -wxml xml.dcl myfile.xml | grep '^[()]' | \
  awk '/^\(/ {path = path "/" substr($0,2); print path}
       /^\)/ {sub("/[^/]+$", "", path)}'

///Peter
 
Joseph J. Kesselman

Martin said:
Well what do you want to generate if there are namespaces used?

The portable answer would be to move the name tests into the predicates,
using something like *[local-name()="bar" and
namespace-uri()="http://my-ns"] (or @*[....] for an attribute, of
course). More than I wanted to explain in the original article, and the
editors couldn't be persuaded to let me go back and attach a sidebar later.

What do you want to generate if there are several FirstName and/or LastName
child elements?

The generator I pointed to does the right thing in that regard by
attaching a positional predicate to distinguish which one was intended.
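To make both points concrete, here is a rough sketch in Python (standard xml.etree.ElementTree). It is not the generator referred to above, just an illustration of the idea. Namespaced names are moved into local-name()/namespace-uri() predicates, and a positional predicate is attached only where same-named siblings need telling apart. The sample document and the names step and portable_paths are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def step(tag):
    """Build one location step; namespaced names go into a predicate."""
    if tag.startswith("{"):
        # ElementTree reports namespaced tags as "{uri}local".
        uri, local = tag[1:].split("}", 1)
        return '*[local-name()="%s" and namespace-uri()="%s"]' % (local, uri)
    return tag


def portable_paths(xml_text):
    """Return one prefix-independent path per element, in document order."""
    paths = []

    def walk(elem, path):
        paths.append(path)
        totals = Counter(child.tag for child in elem)
        seen = Counter()
        for child in elem:
            seen[child.tag] += 1
            child_step = step(child.tag)
            if totals[child.tag] > 1:
                # Repeated name: add a 1-based positional predicate.
                child_step += "[%d]" % seen[child.tag]
            walk(child, path + "/" + child_step)

    root = ET.fromstring(xml_text)
    walk(root, "/" + step(root.tag))
    return paths


# Hypothetical sample: namespaced root, repeated FirstName child.
SAMPLE = (
    '<Foo:person xmlns:Foo="http://my-ns">'
    "<FirstName>Elvis</FirstName>"
    "<FirstName>Aaron</FirstName>"
    "<LastName>Presley</LastName>"
    "</Foo:person>"
)

if __name__ == "__main__":
    print("\n".join(portable_paths(SAMPLE)))
```

Because the paths never mention a prefix, they should select the same nodes regardless of which prefix a given document binds to the namespace.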
 
