XSLT to "normalize" weight attribute

arnold · Mar 2, 2006

Hi,

I've been knocking my head against the wall trying to create an
XSL transform to perform "normalizations" of a set of XML files
that have a common structure.

% XML file before transform

<base>
<foo>
<bar weight="20">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
<bar weight="5">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
<bar weight="30">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
</foo>
</base>

% XML file after transform

<base>
<foo weightSum="55">
<bar weight="20" lower="1" upper="20">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
<bar weight="5" lower="21" upper="25">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
<bar weight="30" lower="26" upper="55">
<elementOne>asfd</elementOne>
<elementTwo>qwer</elementTwo>
</bar>
</foo>
</base>

The idea is that a random number between 1 and weightSum would be
selected, and then the child of the element with the weightSum
attribute that satisfies lower <= randomNum <= upper would be selected.
This transformation would only be run when the underlying XML files
have been updated, so speed of transformation is not an issue.
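
The lookup described above can be sketched in a few lines of Python. This is only an illustration of the intended use, not part of the requested transform: the helper name pick_child and the abbreviated ANNOTATED string are made up here, and integer attribute values are assumed.

```python
import xml.etree.ElementTree as ET

# The "after transform" document from the post, reduced to the attributes that matter.
ANNOTATED = """<base>
  <foo weightSum="55">
    <bar weight="20" lower="1" upper="20"/>
    <bar weight="5" lower="21" upper="25"/>
    <bar weight="30" lower="26" upper="55"/>
  </foo>
</base>"""

def pick_child(container, number):
    """Return the child whose inclusive range lower..upper contains number."""
    for child in container:
        if int(child.get("lower")) <= number <= int(child.get("upper")):
            return child
    raise ValueError(f"{number} is outside 1..{container.get('weightSum')}")

foo = ET.fromstring(ANNOTATED).find("foo")
print(pick_child(foo, 21).get("weight"))  # prints 5
```

Because the ranges are inclusive and start at 1, the random number has to be drawn from 1..weightSum (both ends included).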

Constraints: I have many files with this 'weightSum'-'weight' pattern,
and the element names ('foo' and 'bar' in the example above) differ
from file to file. Furthermore, it would be great if the transform
worked on nested 'weightSum'-'weight' patterns, such as the following.

% XML file before transform

<base>
<foo>
<bar weight="20">
<elementOne weight="3">asfd</elementOne>
<elementTwo weight="8">qwer</elementTwo>
</bar>
...

</foo>
</base>

% XML file after transform

<base>
<foo weightSum="55">
<bar weight="20" lower="1" upper="20" weightSum="11">
<elementOne weight="3" lower="1" upper="3">asfd</elementOne>
<elementTwo weight="8" lower="4" upper="11">qwer</elementTwo>
</bar>
...
</foo>
</base>

Any help would be appreciated.

- Arnold

arnold · Mar 5, 2006

Here is a solution to my earlier post. I used the Saxon8.7b processor.
I don't know whether the solution relies on any XSLT 2.0 capabilities;
I still need to test it with an XSLT 1.0 compliant processor.

The setup is as follows: A parent "container" element holds a
number of child elements with the same tag name. You want to
make it easy for a program to randomly select a child element with
a frequency that varies for each child. So in the example 'XML
input file' below, the first parent "container" element is named
'people' and there are three children with the tag name 'person'.
The weights for the three children are '80', '10' and '40'. So
I want to select the first 'person' element 80/(80+10+40) of the
time (about 62%). Likewise, within the first person element, I want
to select the first 'given' element 35/(35+25+10) of the time (50%).
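
To make the arithmetic concrete, here is a small Python sketch that turns a list of weights into selection probabilities. The function name selection_probabilities is made up for illustration; the weights are the ones from the example.

```python
from fractions import Fraction

def selection_probabilities(weights):
    """Return each weight divided by the sum of all weights, as exact fractions."""
    total = sum(weights)
    return [Fraction(weight, total) for weight in weights]

person_probabilities = selection_probabilities([80, 10, 40])
given_probabilities = selection_probabilities([35, 25, 10])

print([str(p) for p in person_probabilities])  # ['8/13', '1/13', '4/13']
print([str(p) for p in given_probabilities])   # ['1/2', '5/14', '1/7']
```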

Notes:
- The solution seems to work on nested weightSum-weight
combinations.
- For reasons I don't understand, simply applying the
transformation to the XML input file results in extra blank
lines. I use awk in a shell script to get rid of the blank
lines.
- Referring to the 'XML output file', a program would randomly
select (say) a 'person' by
1- reading the value of the 'weightSum' attribute for the
parent element 'people'
2- randomly drawing an integer between 0 and weightSum-1
3- locating the 'person' element s.t. the random number is
>= the 'lower' attribute value and < the 'upper'
attribute value.
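
The three steps above could look roughly like this in Python. It is a sketch under assumptions: the names locate and draw are hypothetical, the OUTPUT string is a shortened copy of the 'XML output file', and all weights are assumed to be integers.

```python
import random
import xml.etree.ElementTree as ET

# Shortened copy of the 'XML output file'; only the outer level is kept.
OUTPUT = """<people weightSum="130">
  <person weight="80" lower="0" upper="80"><family>Newman</family></person>
  <person weight="10" lower="80" upper="90"><family>Newman</family></person>
  <person weight="40" lower="90" upper="130"><family>Newman</family></person>
</people>"""

def locate(container, number):
    """Step 3: find the child with lower <= number < upper (half-open range)."""
    for child in container:
        lower, upper = child.get("lower"), child.get("upper")
        if lower is None or upper is None:
            continue  # children without bounds are not candidates
        if int(lower) <= number < int(upper):
            return child
    raise ValueError(f"{number} is outside 0..weightSum-1")

def draw(container, rng=random):
    """Steps 1 and 2, then the lookup from step 3."""
    weight_sum = int(container.get("weightSum"))  # step 1: read weightSum
    number = rng.randrange(weight_sum)            # step 2: integer in 0..weightSum-1
    return locate(container, number)

people = ET.fromstring(OUTPUT)
print(locate(people, 85).get("weight"))  # prints 10
```

Because each 'upper' equals the next sibling's 'lower', the half-open test guarantees that every number in 0..weightSum-1 matches exactly one child.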

%------------------- XML input file ------------------------------
<?xml version="1.0"?>
<people weightSum="100">
<person weight="80">
<givens weightSum="0">
<given weight="35">Alfred</given>
<given weight="25">Fred</given>
<given weight="10">Wilfred</given>
</givens>
<family>Newman</family>
</person>
<person weight="10">
<givens>
<given>Leslie</given>
</givens>
<family>Newman</family>
</person>
<person weight="40">
<givens>
<given>Maria</given>
</givens>
<family>Newman</family>
</person>
</people>

%------------------- XML output file -----------------------------
<?xml version="1.0" encoding="UTF-8"?>
<people weightSum="130">
<person weight="80" lower="0" upper="80">
<givens weightSum="70">
<given weight="35" lower="0" upper="35">Alfred</given>
<given weight="25" lower="35" upper="60">Fred</given>
<given weight="10" lower="60" upper="70">Wilfred</given>
</givens>
<family>Newman</family>
</person>
<person weight="10" lower="80" upper="90">
<givens>
<given>Leslie</given>
</givens>
<family>Newman</family>
</person>
<person weight="40" lower="90" upper="130">
<givens>
<given>Maria</given>
</givens>
<family>Newman</family>
</person>
</people>
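
The 'lower' and 'upper' values in the output above are running sums of the sibling weights. As a cross-check, here is a minimal Python sketch that recomputes them. It is not the stylesheet and uses a different technique (an in-place tree walk). The names annotate and SAMPLE are hypothetical, weights are assumed to be integers, and, like the stylesheet, it only treats elements that already carry a weightSum attribute as containers. Children without a weight attribute are simply skipped, which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Reduced version of the 'XML input file'.
SAMPLE = """<people weightSum="100">
  <person weight="80">
    <givens weightSum="0">
      <given weight="35">Alfred</given>
      <given weight="25">Fred</given>
      <given weight="10">Wilfred</given>
    </givens>
  </person>
  <person weight="10"/>
  <person weight="40"/>
</people>"""

def annotate(root):
    """Set weightSum/lower/upper in place, using half-open ranges [lower, upper)."""
    for container in root.iter():
        if "weightSum" not in container.attrib:
            continue  # only elements already marked with weightSum are containers
        running = 0
        for child in container:
            if "weight" not in child.attrib:
                continue  # assumption: unweighted children get no range
            child.set("lower", str(running))
            running += int(child.get("weight"))
            child.set("upper", str(running))
        container.set("weightSum", str(running))
    return root

people = annotate(ET.fromstring(SAMPLE))
print(people.get("weightSum"))  # prints 130
```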

%------------------- XSLT file -----------------------------------
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="@*|node()">
<xsl:choose>
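<!-- Skip elements that have a weight attribute here; the weightSum
template below re-creates them (this assumes their parent carries a
weightSum attribute). -->
<xsl:when test="@weight"></xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:otherwise>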
</xsl:choose>
</xsl:template>


<xsl:template match="attribute::weightSum">
<xsl:attribute name="weightSum">
<xsl:value-of select="sum(../child::*/attribute::weight)" />
</xsl:attribute>

<xsl:for-each select="../child::*">
<xsl:variable name="weight" select="attribute::weight" />
<xsl:variable name="from"
select="sum(./preceding-sibling::*/attribute::weight)" />
<xsl:variable name="to"
select="sum(./preceding-sibling::*/attribute::weight)+$weight" />

<xsl:copy>
<xsl:attribute name="weight" >
<xsl:value-of select="$weight" />
</xsl:attribute>
<xsl:attribute name="lower">
<xsl:value-of select="$from" />
</xsl:attribute>
<xsl:attribute name="upper">
<xsl:value-of select="$to" />
</xsl:attribute>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:for-each>

</xsl:template>

</xsl:stylesheet>

%------------- Script to remove extra blank lines ----------------
#!/bin/bash

argc="$#"

if [ "$argc" -lt 1 ] || [ "$argc" -gt 2 ]; then
printf "\n\n"
printf " Usage: NormalizeWeights.sh data.xml [output_file]"
printf "\n\n"
exit 1
fi

if [ "$argc" -eq 1 ]; then
inputXmlFname=$1
/usr/bin/java -jar "$HOME/sbox/software/lib/saxon8.7/saxon8.jar" -t \
"$inputXmlFname" NormalizeWeights.xsl | /usr/bin/awk '!/^( )+$/{print $0;}'
elif [ "$argc" -eq 2 ]; then
inputXmlFname=$1
outputXmlFname=$2
if [ -f "$outputXmlFname" ]; then
backupName="${outputXmlFname}.bac"
echo "File $outputXmlFname exists, making backup named $backupName"
/bin/cp "$outputXmlFname" "$backupName"
fi
/usr/bin/java -jar "$HOME/sbox/software/lib/saxon8.7/saxon8.jar" -t \
-o "$outputXmlFname" "$inputXmlFname" NormalizeWeights.xsl
# Drop the whitespace-only lines, then replace the output file with the cleaned copy.
/usr/bin/awk '!/^( )+$/{print $0;}' "$outputXmlFname" > "tmp$$"
/bin/mv "tmp$$" "$outputXmlFname"
fi
