Tip: Finding & counting unique nodes in XSL

Victor Engmark

When looking for a way to fetch unique elements and count the
number of occurrences of each of them, I found quite a lot of gross
examples of complex XSL. But after realizing the subtle difference
between "." and "current()", I found a neat way of doing the same
without keys or generate-id():

<xsl:template match="/">
  <!-- Select each Name that has no equal Name earlier in the document -->
  <xsl:for-each select="//Name[not(.=preceding::Name)]">
    <!-- Display the element -->
    <xsl:value-of select="."/>
    <!-- Count the number of occurrences of the element -->
    <xsl:value-of select="count(//Name[.=current()])"/>
  </xsl:for-each>
</xsl:template>

The key to why the last "value-of" works is that ".", inside the
predicate, refers in turn to each of the "Name" elements in the file,
while "current()" refers to the "Name" element currently selected by the
surrounding "for-each" element.
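
To try the tip out, here is a minimal sketch of a complete stylesheet built around the template above. The sample input, the text output method and the ": " separator are assumptions added for illustration; they are not part of the original tip.

```xml
<?xml version="1.0"?>
<!-- Assumed sample input:
     <People><Name>Ann</Name><Name>Bob</Name><Name>Ann</Name></People> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Outer context: each Name with no equal Name earlier in the document -->
    <xsl:for-each select="//Name[not(. = preceding::Name)]">
      <xsl:value-of select="."/>
      <xsl:text>: </xsl:text>
      <!-- Inside the predicate, "." is the Name being tested, while
           current() is still the Name picked by the for-each -->
      <xsl:value-of select="count(//Name[. = current()])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input this should print "Ann: 2" and "Bob: 1", one per line. Note that the comparison is on the exact string value, so "Ann" and "ann" would count as different names.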

If YOU have found a "neat" way of doing something with XSL, it would be
great if you could post it here as well, preferably as a separate thread
with something like "Tip:" in the beginning of the subject.
 
Piet Blok

Thanks Victor, a very neat solution. Easier to apply than the keys and
generate-id() approach.
 
Piet Blok

Victor, I tried to enhance one of my XSL stylesheets that I made a while
ago to create a birthday calendar. To my own surprise I found that I
had already abandoned the use of keys and generate-id() (though not as
nicely and cleanly as you did). My problem was that I had to check only
a part of some value (the month number that was embedded between year
and day in the same tag). Therefore I had to do some string manipulation.

I changed that implementation today and now I use the preceding:: axis,
so it already looks better.

However, I just cannot get rid of an ugly xsl:if. In my solution I
select all elements and then I decide if the current element is a "new"
element. See below:


<xsl:template match="/">
  <!-- Select all Name elements -->
  <xsl:for-each select="//Name">
    <!-- Test if this Name element is new, based on the first character -->
    <xsl:if test="not(substring(current(),1,1) =
        substring(preceding::Name[substring(.,1,1)=substring(current(),1,1)],1,1))">
      <!-- Display the first character -->
      <xsl:value-of select="substring(.,1,1)"/>
      <!-- Count occurrences of elements with the same starting character -->
      <xsl:value-of select="count(//Name[substring(.,1,1)=substring(current(),1,1)])"/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>

Can you think of any way to move the if structure into the select
clause?

In a way the problem seems to be that there are three contexts to deal
with: the context node (in this case the root node), the node that is
currently under investigation for selection and the node that is
compared to (between the square brackets).

Any thoughts would be welcome.

Piet
 
Victor

First, if you can split the Name into several strings, that would be a
good start. If you have control over the structure of the XML file, keep
each piece of information separate. Otherwise you'll get into the same
mess I was in when trying to make sense of MS Project XML files (gross!).

Then there is a funny thing about the test (changed the substring
function to a() to make it obvious):
a(current) = a(preceding::Name[a(.)=a(current)])
So first you find a preceding Name which equals the current one, and
then compare it to the current one? This sounds like double work.
a(current) = a(preceding::Name) should be enough.

By changing according to the previous paragraphs, you would have
something like a Name element and a New element, and you can do the
for-each as follows (provided that the New element occurs after the Name):
<xsl:for-each select="//Name[not(./following-sibling::New =
preceding::Name/following-sibling::New)]">

I haven't tested this, but perhaps someone with more XSL experience can
point out any obvious errors.
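
One caveat on the simplified test, and a possible way around the xsl:if: in XPath 1.0, substring() applied to a node-set uses only the string value of the first node in document order, so a(current) = a(preceding::Name) would compare against the first preceding Name only. Also, inside the select of the for-each, current() is still the template's current node (here the root), not the Name being tested, which is why the test is hard to move into the select. The sketch below falls back on the keys and generate-id() approach that this thread set out to avoid, because a key can index each Name by a computed value. The key name "by-initial", the text output and the ": " separator are hypothetical choices for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every Name by its first character (key name is hypothetical) -->
  <xsl:key name="by-initial" match="Name" use="substring(., 1, 1)"/>

  <xsl:template match="/">
    <!-- Keep only the Name that is first in its first-character group -->
    <xsl:for-each select="//Name[generate-id() =
        generate-id(key('by-initial', substring(., 1, 1))[1])]">
      <!-- Display the first character -->
      <xsl:value-of select="substring(., 1, 1)"/>
      <xsl:text>: </xsl:text>
      <!-- Count all Name elements sharing that first character -->
      <xsl:value-of select="count(key('by-initial', substring(., 1, 1)))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The same pattern should carry over to other derived values, such as a month number cut out of a date string, by changing the use expression of the key and the matching substring() calls.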
 
Piet Blok

Thanks Victor,

To answer your first piece of advice: yes, I have full control over the
XML files, so I could decide to split up all information into separate
elements (most of my XML work is just experimental). However, there are
reasons why I sometimes deliberately choose not to do so. In the case I
presented as an example, I wanted to do something with the first
character of some element, typically to create an index or something
similar. In another case I tried to deal with a date, in order to create
a birthday calendar. In yet another case I wanted to do something with a
week number derived from a date in some XML element. What I want to
achieve is to construct XML files that are as simple as possible and
then, when the need arises, create XSL transformation sheets for specific
purposes. Actually, when I am designing some XML format, I don't want to
take into account all possible uses I might make of it. This is exactly
what makes XML so attractive to me. Naturally, since my XML formats are
not optimized for any specific use, my XSL stylesheets may from time to
time be somewhat complex.

The second part of your comment is something that needs more thinking,
so I cannot answer you instantly. I will study it and reply when I have
done so (these are things that I have to do in my spare time, mostly
weekends).

I greatly appreciate the neat solution you found for a common problem.
If I ever find something that may be as useful as your tip, I will
certainly post it the way you did.

Piet
 
