Malcolm Dew-Jones
(First: my terminology may well be bogus, but I hope you can understand me
anyway.)
I have some MS Word documents that will be used as the input for a
different purpose in a database. To ease this process I want to take
certain adjacent sections of the documents and "congeal" them into single
sections.
The format of the XML output from MS Word is simple to follow, but I haven't
played with XSLT for a while (and not much at that), so I am looking for
examples or suggestions of how to do the following.
For example, I have the following two "sections":
<w:r wsp:rsidR="00A5105E" wsp:rsidRPr="00EC0118">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART </w:t>
</w:r>
<w:r wsp:rsidR="00A61057">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>AND THE SECOND PART</w:t>
</w:r>
I want to end up with a single section that has the FIRST PART AND THE
SECOND PART combined. I don't think I need to care about the ID numbers,
but even if I do I will worry about that later. The result would then look
like this:
<w:r wsp:rsidR="*"> (the value of * doesn't matter to me yet)
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART AND THE SECOND PART</w:t>
</w:r>
What makes the sections the same is that the <w:rPr>...</w:rPr> blocks
are identical in the adjacent sections, and I only care about the sections
that have the exact formatting shown above (i.e. w:ascii="Arial" etc.), so
a transform could have those values hard-coded if that makes it easier.
Anyway, as I said, examples or suggestions for setting up an XSLT
stylesheet to do this would be appreciated.
Thanks
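One possible starting point is a minimal XSLT 1.0 sketch built on the identity transform. It is not a definitive solution. It assumes the w prefix is bound to the Word 2003 namespace URI shown below (check the root element of the real document), that every run of interest holds just a <w:rPr> and a single <w:t>, and that testing w:ascii="Arial" plus the yellow highlight is enough to identify "the same formatting". The mode name "collect" is made up for illustration. The first run in each group of adjacent matching runs is kept, the text of the runs that follow it is appended to its <w:t>, and those later runs are dropped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. ASSUMPTION: "w" is the Word 2003 WordprocessingML
     namespace; copy the real URI from the root element of the document. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <!-- Identity template: copy anything no other template handles. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A target run (hard-coded Arial + yellow highlight) that starts a group.
       XSLT 1.0 forbids variables in match patterns, so the predicate is
       repeated in each template below. -->
  <xsl:template match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                           and w:rPr/w:highlight/@w:val='yellow']">
    <xsl:copy>
      <!-- Keep the first run's attributes (rsid values) and formatting. -->
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="w:rPr"/>
      <w:t>
        <!-- Own text (first w:t only), then text of the adjacent runs. -->
        <xsl:value-of select="w:t"/>
        <xsl:apply-templates select="following-sibling::*[1]" mode="collect"/>
      </w:t>
    </xsl:copy>
  </xsl:template>

  <!-- A target run whose nearest preceding sibling element is also a target
       run has already been merged into that group, so emit nothing. -->
  <xsl:template priority="1"
      match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                 and w:rPr/w:highlight/@w:val='yellow']
                [preceding-sibling::*[1][self::w:r]
                   [w:rPr/w:rFonts/@w:ascii='Arial'
                    and w:rPr/w:highlight/@w:val='yellow']]"/>

  <!-- "collect" mode: walk right one sibling at a time, emitting text
       for as long as the sibling is another target run. -->
  <xsl:template mode="collect"
      match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                 and w:rPr/w:highlight/@w:val='yellow']">
    <xsl:value-of select="w:t"/>
    <xsl:apply-templates select="following-sibling::*[1]" mode="collect"/>
  </xsl:template>

  <!-- Any other element ends the group. -->
  <xsl:template match="*" mode="collect"/>

</xsl:stylesheet>
```

With libxslt installed, something like `xsltproc merge-runs.xsl input.xml > merged.xml` should apply it (both file names are placeholders). Note that a run carrying extra properties, such as bold, would still match this predicate and be merged, so the test may need tightening for real documents.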