Disambiguating separated node sets

Andy Dingley

I have a publishing application. The database layer queries various
sources and produces an XML document, then XSLT processes this into HTML
or RSS.

In this particular case, there are several queries for the latest
articles from various "chapters" (news, reviews etc.). The results are
then placed as sub-trees in the XML. I then need to generate a single
list of articles by selecting the articles from each sub-tree.

It's possible for an article to be "multi-homed", so it might appear in
two chapters. My output must filter these, so that an article appears no
more than once.

As the XML document contains duplicated articles (duplicates of single
database articles), it's not possible to use generate-id() here. Instead
I must use the @articleid attribute. I'm also finding it impractical to
use a preceding-sibling:: axis, because these articles come from
disjoint sub-trees and so aren't siblings. For platform portability
reasons, I'm reluctant to use node-set().

What's the best way to disambiguate these?

At present it works, but the code is a mess. Rather than filtering them
and then passing the filtered set neatly to the output routine, I'm
having to pass the set with duplicates to a named template, then filter
it inside that, using position(). I'd prefer to decouple the filter and
the loop processing, for other reasons of good code structure.

Thanks for any comments


<xsl:variable name="items" select="$item-headline-article
    | $items-news[position() &lt;= 4]
    | $items-reviews[position() &lt;= 3]
    | $items-competition" />

[...]

<xsl:for-each select="$items">
  <xsl:variable name="entry" select="." />
  <xsl:variable name="articleid" select="$entry/@articleid" />
  <xsl:variable name="idx" select="position()" />

  <xsl:if test="not($entries[($articleid = ./@articleid)
                             and (position() &lt; $idx)])">

    [...]
  </xsl:if>
</xsl:for-each>
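
One possible way to separate the filter from the loop without node-set() is the XPath 1.0 set-membership idiom, count(. | $set) = count($set), combined with the preceding:: axis, which works across sub-trees. What follows is only a sketch: the real document layout is not shown in the post, so the page, news, reviews and article element names and the unique-items variable are assumptions made for illustration, and it assumes that articles are never nested inside one another.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input, for illustration only:
<page>
  <news><article articleid="10"/><article articleid="11"/></news>
  <reviews><article articleid="11"/><article articleid="12"/></reviews>
</page>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/page">
    <!-- The set with duplicates, built as in the original post -->
    <xsl:variable name="items"
        select="news/article[position() &lt;= 4]
              | reviews/article[position() &lt;= 3]" />

    <!-- Keep an item only if no earlier node that is itself a member
         of $items carries the same articleid. The inner predicate is
         true only for nodes that belong to $items. -->
    <xsl:variable name="unique-items"
        select="$items[not(@articleid =
                  preceding::*[count(. | $items) = count($items)]/@articleid)]" />

    <!-- The output loop no longer needs to know about duplicates -->
    <list>
      <xsl:for-each select="$unique-items">
        <xsl:copy-of select="." />
      </xsl:for-each>
    </list>
  </xsl:template>

</xsl:stylesheet>
```

For the hypothetical input above, the list contains articles 10, 11 and 12 once each. The comparison is quadratic in the size of $items, which should not matter for a handful of headlines but may for large sets.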
 
Gomolyako Eduard

I had the same problem, and here is my solution:

xml:
<root>
<item id="1" />
<item id="2" />
<item id="3" />
<item id="2" />
</root>

As I understand it, you want to get something like this:
<another_root>
<item id="1" />
<item id="2" />
<item id="3" />
</another_root>

xslt:

<xsl:template match="root">
  <another_root>
    <!-- Start with the first item and an empty list of handled ids -->
    <xsl:apply-templates select="item[1]">
      <xsl:with-param name="handled-items" select="''" />
    </xsl:apply-templates>
  </another_root>
</xsl:template>

<xsl:template match="item">
  <xsl:param name="handled-items" />

  <!-- Delimit the id on both sides so that id 2 is not found inside id 21 -->
  <xsl:variable name="id" select="concat('.', @id, '.')" />
  <xsl:variable name="seen" select="contains($handled-items, $id)" />

  <!-- Copy the item only the first time its id is met -->
  <xsl:if test="not($seen)">
    <xsl:copy-of select="." />
  </xsl:if>

  <!-- Always move on to the next sibling, so that a duplicate in the
       middle of the list does not stop the walk. Appending an id a
       second time is harmless. -->
  <xsl:apply-templates select="following-sibling::item[1]">
    <xsl:with-param name="handled-items"
        select="concat($handled-items, $id)" />
  </xsl:apply-templates>
</xsl:template>


I hope this helps you.

Best, Ed.
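
For the simple case in this reply, where every item in the document is a candidate, a commonly used alternative to the recursive walk is Muenchian grouping with xsl:key. This is a sketch against the sample input above; the key name items-by-id is made up for illustration. Here generate-id() only tests whether a node is the first one the key returns for its id; the grouping itself is done on the attribute value.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every item element by the value of its id attribute -->
  <xsl:key name="items-by-id" match="item" use="@id" />

  <xsl:template match="root">
    <another_root>
      <!-- Keep an item only if it is the first node, in document order,
           that the key returns for its id -->
      <xsl:copy-of
          select="item[generate-id() =
                       generate-id(key('items-by-id', @id)[1])]" />
    </another_root>
  </xsl:template>

</xsl:stylesheet>
```

Note that a key indexes the whole document, so on its own it does not respect a restricted selection such as "the first four news items"; in that situation the first node for an id may lie outside the selected set.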


Dimitre Novatchev

Why do so many people forget to provide a source XML document (as minimal
as possible)?


Cheers,
Dimitre Novatchev.
 
