XSLT. How do I approach a complex grouping problem?

peterwilson_69

The basic idea is to remove duplicates, and move unique items and
configurations to the top of the document.

Example XML Source:
<xml>
  <segment>
    <laser wavelength="532" />
    <detector polarizer="true" />
    <data />
    <data />
    <data />
    <data />
    <data />
  </segment>

  <segment>
    <laser wavelength="532" />
    <detector polarizer="false" />
    <data />
    <data />
    <data />
    <data />
    <data />
  </segment>

  <segment>
    <laser wavelength="846" />
    <detector polarizer="true" />
    <data />
    <data />
    <data />
    <data />
    <data />
  </segment>
</xml>


Desired Result, where <uniqueid> is an auto-generated value to provide
temporary links between values.
1) Find unique values for Laser and Detector; easy to do.
2) Find unique combinations of Laser and Detector; this is where I'm
getting stuck!
3) Output segments (and data) using a link to the combination's unique
ID.

<xml>
  <laser id="<uniqueid>" wavelength="532" />
  <laser id="<uniqueid>" wavelength="846" />
  <detector id="<uniqueid>" polarizer="true" />
  <detector id="<uniqueid>" polarizer="false" />
  <combination id="<uniqueid>" laserid="<link-to-above>" detectorid="<link-to-above>" />
  <combination id="<uniqueid>" laserid="<link-to-above>" detectorid="<link-to-above>" />
  <combination id="<uniqueid>" laserid="<link-to-above>" detectorid="<link-to-above>" />
  <segment using-combination="<link-to-above-combination>" />
  <data />
  <data />
  <data />
  <data />
  <data />
  <segment using-combination="<link-to-above-combination>" />
  <data />
  <data />
  <data />
  <data />
  <data />
  <segment using-combination="<link-to-above-combination>" />
  <data />
  <data />
  <data />
  <data />
  <data />
</xml>

I am part-way there, but am getting stuck. Any help would be
appreciated.
 
Joe Kesselman

You might want to start by reviewing standard solutions, such as those
in both the Grouping and Sorting sections of the XSLT FAQ:

http://www.dpawson.co.uk/xsl/sect2/N4486.html
http://www.dpawson.co.uk/xsl/sect2/N6280.html

If one of those doesn't do it for you, post again and I'll take another
look. (I haven't reviewed your question in detail, but it didn't look
particularly unusual at first glance.)

XSLT 2.0 improves some of the sorting capabilities; the FAQ covers how
to accomplish things even if all you have is 1.0.
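For readers who don't want to dig through the FAQ first, here is a minimal sketch of one standard XSLT 1.0 technique, usually called Muenchian grouping, applied to the sample input from the question. It only lists the distinct laser/detector combinations. The key name `combo`, the `|` separator, and the `wavelength`/`polarizer` attributes on the output `combination` element are illustrative assumptions, not something taken from the original posts.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every segment by its (wavelength, polarizer) pair.
       The '|' separator is an assumption; any character that cannot
       occur in the values themselves will do. -->
  <xsl:key name="combo" match="segment"
           use="concat(laser/@wavelength, '|', detector/@polarizer)"/>

  <xsl:template match="/xml">
    <xml>
      <!-- A segment starts a group when it is the first node the key
           returns for its own composite value. -->
      <xsl:for-each select="segment[generate-id() =
          generate-id(key('combo',
            concat(laser/@wavelength, '|', detector/@polarizer))[1])]">
        <combination wavelength="{laser/@wavelength}"
                     polarizer="{detector/@polarizer}"/>
      </xsl:for-each>
    </xml>
  </xsl:template>
</xsl:stylesheet>
```

For the three sample segments this should emit three `combination` elements (532/true, 532/false, 846/true). With an XSLT 2.0 processor, `xsl:for-each-group` with a `group-by` built from the same `concat()` expression should give the same grouping without a key.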
 
Pavel Lepin

peterwilson_69 wrote:

> The basic idea is to remove duplicates, and move unique
> items and configurations to the top of the document.
>
> Example XML Source:
> [snipped]
>
> Desired Result, where <uniqueid> is an auto generated
> value to provide temporary links between values.

generate-id()

> 1) Find unique values for Laser and Detector; easy to do.
> 2) Find unique combinations of Laser and Detector; this
> is where I'm getting stuck!

Search group archives for a thread with 'xpath select
distinct over 2 elements' in subject. I posted a solution
to a similar problem there.
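As a rough illustration of how generate-id() can supply the temporary links, here is a sketch that produces the whole flat layout from the question. It is not the solution from the archived thread. It assumes an XSLT 1.0 processor, and the key names (`laser`, `detector`, `combo`) and the `|` separator are made up for this example. Each link is simply the generate-id() of the first node in the relevant group, so the same expression can be repeated wherever the ID is needed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical key names; one key per kind of "unique thing". -->
  <xsl:key name="laser" match="laser" use="@wavelength"/>
  <xsl:key name="detector" match="detector" use="@polarizer"/>
  <xsl:key name="combo" match="segment"
           use="concat(laser/@wavelength, '|', detector/@polarizer)"/>

  <xsl:template match="/xml">
    <xml>
      <!-- 1) Unique lasers and detectors: first node of each group. -->
      <xsl:for-each select="segment/laser[generate-id() =
          generate-id(key('laser', @wavelength)[1])]">
        <laser id="{generate-id()}" wavelength="{@wavelength}"/>
      </xsl:for-each>
      <xsl:for-each select="segment/detector[generate-id() =
          generate-id(key('detector', @polarizer)[1])]">
        <detector id="{generate-id()}" polarizer="{@polarizer}"/>
      </xsl:for-each>

      <!-- 2) Unique combinations, linked to the IDs emitted above. -->
      <xsl:for-each select="segment[generate-id() =
          generate-id(key('combo',
            concat(laser/@wavelength, '|', detector/@polarizer))[1])]">
        <combination id="{generate-id()}"
            laserid="{generate-id(key('laser', laser/@wavelength)[1])}"
            detectorid="{generate-id(key('detector', detector/@polarizer)[1])}"/>
      </xsl:for-each>

      <!-- 3) Every segment, pointing at its combination's ID,
           followed by its data elements (flat, as in the question). -->
      <xsl:for-each select="segment">
        <segment using-combination="{generate-id(key('combo',
            concat(laser/@wavelength, '|', detector/@polarizer))[1])}"/>
        <xsl:copy-of select="data"/>
      </xsl:for-each>
    </xml>
  </xsl:template>
</xsl:stylesheet>
```

The generated IDs are only guaranteed to be consistent within a single transformation run, which matches the "temporary links" requirement but means they should not be stored and compared across runs.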
> <segment using-combination="<link-to-above-combination>" />
> <data />
> <data />
> <data />
> <data />
> <data />

Now THAT is a VERY bad idea. You have hierarchical data. XML
is supremely suited to storing hierarchical data. Why would
you want to unroll that hierarchical data into this
abomination imitating a flat data file, causing pain and
anguish for whoever is going to be using it?
 
peterwilson_69

I was unaware of those links, thanks. Some good information there.
 
peterwilson_69

Thanks for the help.

The decision to unroll the hierarchy is still under consideration. I
am working with people who are new to XML / XSLT and have not yet
grasped how powerful it is. The XML data is the new file format, and I
am writing a transform to place data in the "old-way-of-thinking" [but
newly developed] format.

Cheers,
Peter.
 
peterwilson_69

Your posted "solution", and I use the term loosely, contains a call to
the key() function within an xsl:key element, which isn't even valid
syntax! I have a hard enough time trying to work through the official
specs without having to debug your faulty knowledge of XSLT.
 
Pavel Lepin

peterwilson_69 wrote:

> Your posted "solution" and I use the term loosely,
> contains a key() function within an xsl:key() element,
> which isn't even valid syntax! I have a hard enough time
> trying to work through the official specs without having
> to debug your faulty knowledge of XSLT.

I see no clear indication whether E12 is proposed or
normative. libxslt and Saxon-8 both conform to official
recommendation (see 12.2). Xalan-C++ implements the
recommendation using the wording in errata:

pavel@debian:~/dev/xslt$ xsltproc dist.xsl dist.xml
<?xml version="1.0"?>
<result>
<test cnt="2">32354, 2</test>
<test cnt="1">32354, 4</test>
<test cnt="1">32356, 4</test>
<test cnt="2">32357, 2</test>
<test cnt="1">32358, 5</test>
</result>
pavel@debian:~/dev/xslt$ saxon -t dist.xml dist.xsl
Saxon 8.8J from Saxonica
Java version 1.5.0_11
Warning: at xsl:stylesheet on line 2 of
file:/var/www/dev/xslt/dist.xsl:
Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
Stylesheet compilation time: 518 milliseconds
Processing file:/var/www/dev/xslt/dist.xml
Building tree for file:/var/www/dev/xslt/dist.xml using
class net.sf.saxon.tinytree.TinyBuilder
Tree built in 4 milliseconds
Tree size: 68 nodes, 196 characters, 0 attributes
<?xml version="1.0" encoding="UTF-8"?>
<result>
<test cnt="2">32354, 2</test>
<test cnt="1">32354, 4</test>
<test cnt="1">32356, 4</test>
<test cnt="2">32357, 2</test>
<test cnt="1">32358, 5</test>
</result>Execution time: 99 milliseconds
Memory used: 955104
NamePool contents: 25 entries in 23 chains. 7 prefixes, 8
URIs
pavel@debian:~/dev/xslt$ xalan -in dist.xml -xsl dist.xsl

XSLException Type is: XPathParserException
Message is: The value of either the 'use' attribute or
the 'match' attribute of xsl:key cannot contain a call to
the key() function.
expression = 'count(.|key('k',concat(id,', ',rate))[1])'
Remaining tokens are:
( 'key' '(' ''k'' ',' 'concat' '(' 'id' ',' '', '' ',' 'rate' ')' ')' '[' '1' ']' ')')
(dist.xsl, line 7, column 54)

Another reason to always mention the implementation you're
using. Have a nice day.
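For anyone stuck on a processor that enforces the erratum, here is a sketch of one way to sidestep the restriction: keep the xsl:key declaration plain and move the "first in group" test, including its key() call, into an ordinary select expression, where key() is allowed. The stylesheet dist.xsl itself was not posted in this thread, so the input shape below (`item` elements with `id` and `rate` children) is a guess based only on the expression in the Xalan error message and the output shown above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Plain key: neither match nor use contains a key() call.
       The element name 'item' is an assumption about dist.xml. -->
  <xsl:key name="k" match="item" use="concat(id, ', ', rate)"/>

  <xsl:template match="/">
    <result>
      <!-- The first-in-group test lives in the select expression.
           count(. | X[1]) = 1 holds only when the context node is
           itself the first node of its group. -->
      <xsl:for-each select="//item[count(. |
          key('k', concat(id, ', ', rate))[1]) = 1]">
        <!-- cnt is the size of the group for this (id, rate) pair. -->
        <test cnt="{count(key('k', concat(id, ', ', rate)))}">
          <xsl:value-of select="concat(id, ', ', rate)"/>
        </test>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

Because it uses only one simple key, this form should be accepted by processors on either side of the recommendation/erratum split, though I have only reasoned through it and not run it against all three.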
 
peterwilson_69

Your post isn't quite clear to me, but basically I am getting the same
exception as you when trying to run your solution. Are you trying to
prove to me that your own solution is faulty, or am I missing
something?

Peter


 
Pavel Lepin

That should be E13, of course. I suppose it's my
triskaidekaphobia rearing its ugly head again.

peterwilson_69 wrote:

> Your post isn't quite clear to me, but basically I am
> getting the same exception as you when trying to run your
> solution. Are you trying to prove to me that your own
> solution is faulty, or am I missing something?

All I'm saying is that the official W3C recommendation does
not stipulate the limitation you've run into with your
processor, that the errata item describing this limitation
does not clearly indicate whether its official status
is 'proposed' or 'normative', and that different vendors
chose different interpretations of the
recommendation+errata for their implementations. What of
the above are you unable to comprehend?
 
