Q: using xslt to "congeal" adjacent sections


Malcolm Dew-Jones

(First - my terminology may well be bogus, I hope you understand me
though.)

I have some ms word documents that will be used as the input for a
different purpose in a database. To ease this process I want to take
certain adjacent sections of the documents and "congeal" them into single
sections.

The format within the xml output of msword is simple to see, but I haven't
played with xslt for a while (and not much at that), so I am looking for
examples or suggestions of how to do the following.

For example, I have the following two "sections"


<w:r wsp:rsidR="00A5105E" wsp:rsidRPr="00EC0118">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART </w:t>
</w:r>
<w:r wsp:rsidR="00A61057">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>AND THE SECOND PART</w:t>
</w:r>

I want to end up with a single section that has the FIRST PART AND THE
SECOND PART combined. I don't think I need to care about the id numbers,
but even if I do I will worry about that later. The result would then look
like this

<w:r wsp:rsidR="*"> (the value of * doesn't matter to me yet)
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART AND THE SECOND PART</w:t>
</w:r>

The thing that makes each section the same is that the <w:rPr>...</w:rPr>
are the same in the adjacent sections, and I only care about the sections
that have that exact formatting shown above (i.e. w:ascii="Arial" etc.) so
a transform could have those values hard-coded if it makes it easier.

Anyway, as I said, examples or suggestions for setting up an xslt to do
this would be appreciated.

Thanks
 

Joseph Kesselman

Off-the-cuff answer:


The usual approach for this sort of thing is to write two templates to
handle the two distinct cases.

Start by figuring out a match pattern that selects all the elements
you're interested in.

Modify that to create two match patterns: one that matches the first
such instance (one with no immediately preceding matching sibling) and one that
matches all the others. (Or the last and all-the-rest; either way.)

Make a template fired by the first pattern that gathers the contents of
it and its adjacent matching siblings.

Make a template fired by the second pattern which discards the elements
which match it, since they were handled by the other template.

Plug those two into a stylesheet which handles the rest of the document,
typically the identity transformation.

Done.


The XSLT FAQ website should have some examples. I suspect that, given the
complexity of what you're matching on, you'll want to take advantage of
keys.
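
As a rough illustration of the two-template recipe above, here is a minimal XSLT 1.0 sketch. It rests on several assumptions that are not confirmed in this thread: the w: prefix is bound to the Word 2003 WordprocessingML namespace (check the root element of your own document), "same formatting" is approximated by the hard-coded Arial font and yellow highlight tests, and adjacency means "the immediately preceding element sibling". It uses plain sibling recursion rather than keys, which may be enough for a case this small.

```xml
<?xml version="1.0"?>
<!-- Sketch only. The namespace URI below is an assumption; copy the one
     declared for the w: prefix in your own Word XML file. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <!-- Identity transformation: copy everything we do not handle below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Case 1: any yellow-highlighted Arial run. Because case 2 has a higher
       priority, this only fires for the FIRST run of an adjacent group.
       It keeps this run's attributes and w:rPr and gathers the group's text. -->
  <xsl:template match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                           and w:rPr/w:highlight/@w:val='yellow']">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="w:rPr"/>
      <w:t>
        <xsl:apply-templates select="." mode="gather"/>
      </w:t>
    </xsl:copy>
  </xsl:template>

  <!-- Case 2: a matching run whose immediately preceding element sibling is
       also a matching run. Its text was already gathered, so discard it. -->
  <xsl:template priority="2"
      match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                 and w:rPr/w:highlight/@w:val='yellow']
                [preceding-sibling::*[1][self::w:r]
                   [w:rPr/w:rFonts/@w:ascii='Arial'
                    and w:rPr/w:highlight/@w:val='yellow']]"/>

  <!-- Gather mode: emit this run's text, then step to the next element
       sibling only if it is also a matching run. -->
  <xsl:template match="w:r" mode="gather">
    <xsl:for-each select="w:t">
      <xsl:value-of select="."/>
    </xsl:for-each>
    <xsl:apply-templates mode="gather"
        select="following-sibling::*[1][self::w:r]
                  [w:rPr/w:rFonts/@w:ascii='Arial'
                   and w:rPr/w:highlight/@w:val='yellow']"/>
  </xsl:template>

</xsl:stylesheet>
```

XSLT 1.0 does not allow variables in match patterns, which is why the formatting test is repeated. The merged run keeps the attributes of the first run in each group, and anything in a run other than w:rPr and w:t is dropped, so treat this as a starting point rather than a finished transform.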
 
