xsl to retrieve text in table elements in xml

Figo 775

Hi,
I have an xml document which is as below:
<?xml version="1.0" encoding="UTF-8"?>
<fragment name="htmlPart">
<value>&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;b&gt;Feature&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b
&gt;Benefit&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some sample text runs here&lt;/td&gt;&lt;td&gt;some
sample text runs here too&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some sample text runs here&lt;/td&gt;&lt;td&gt;some
sample text runs here too&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Now I am here &lt;/td&gt;&lt;td&gt;Now what am I
going to do here&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some stuff lies here too&lt;/td&gt;&lt;td&gt;Please
give suggestions&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
</value>
</fragment>

Three <fragment name="htmlPart"> elements are present in this document.

For each <fragment name="htmlPart">, the XSL I need should create a
<fragment name="tr"> element and put the tr content from the text above
into its <value> element.
The same needs to be done for td.
Please provide an answer to this.

Thanks,
figo
 
Martin Honnen

I am afraid that, as your XML input doesn't contain <tr> or <td>
elements but just text with such tags, there is no easy way to process
the input with XSLT/XPath.
All you could do is use the XPath string functions (contains,
substring-before, substring-after), but there are languages with more
powerful text processing than XSLT/XPath 1.0.
 
Joris Gillis

The HTML encoded as text looks like valid XHTML, so you should include it as elements rather than as raw text. So the first step would be to unescape the &lt; and &gt; entities. (It can even be done with XSLT.)
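One way to do that unescaping step in XSLT 1.0 is a first-pass stylesheet that copies the document and writes the text of each value element with disable-output-escaping. The following is only a minimal sketch under some assumptions: the processor serializes the result itself (disable-output-escaping is an optional feature and is ignored when the result tree is handed on as a DOM), and the embedded markup is well-formed. The output of this pass then has to be parsed again before a second stylesheet can see tr and td as elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Identity template: copy every node that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Write the escaped markup inside value without escaping it again,
       so that &lt;table&gt; is serialized as <table>.
       Assumption: the serializer honours disable-output-escaping. -->
  <xsl:template match="fragment[@name='htmlPart']/value/text()">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

If the processor does not support disable-output-escaping, the unescaping has to happen outside XSLT before the transformation runs.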

Once you have this XML:
<fragment name="htmlPart">
<value><table>
<tr><td><b>Feature</b></td><td><b>Benefit</b></td></tr>
<tr><td>some sample text runs here</td><td>some sample text runs here too</td></tr>
<tr><td>some sample text runs here</td><td>some sample text runs here too</td></tr>
<tr><td>Now I am here </td><td>Now what am I going to do here</td></tr>
<tr><td>some stuff lies here too</td><td>Please give suggestions</td></tr>
</table>
</value>
</fragment>

the job is easily done with XSLT like this:
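<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Wrap the converted rows of each htmlPart fragment in a value element -->
<xsl:template match="fragment[@name='htmlPart']">
<value>
<xsl:apply-templates select="value/*"/>
</value>
</xsl:template>

<!-- Turn each tr and td element into a fragment named after it -->
<xsl:template match="tr|td">
<fragment name="{local-name()}">
<value>
<xsl:apply-templates/>
</value>
</fragment>
</xsl:template>

</xsl:stylesheet>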


I'm not sure, though, whether this is really what you mean by "mapping the tr values from the text into its value element". Please explain your goal a bit more clearly.

regards,
 
Figo 775

Hi Joris,

I think I can be clearer now, though this is only a small portion of a
bigger document. Please let me know if I am not.

<?xml version="1.0" encoding="UTF-8"?>
<fragment name="htmlPart">
<value>&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;b&gt;Feature&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b
&gt;Benefit&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some sample text runs here&lt;/td&gt;&lt;td&gt;some
sample text runs here too&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some sample text runs here&lt;/td&gt;&lt;td&gt;some
sample text runs here too&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Now I am here &lt;/td&gt;&lt;td&gt;Now what am I
going to do here&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;some stuff lies here too&lt;/td&gt;&lt;td&gt;Please
give suggestions&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
</value>
</fragment>

Three <fragment name="htmlPart"> elements are present (similar to the
above format).

My XSL should convert each htmlPart element into three nested elements:
<fragment name="tr">
<value>
<fragment name="td">
<value>
<fragment name="tablecell">
<value>
This should contain the text between the first '&lt;tr&gt;&lt;td&gt;'
and '&lt;/td&gt;&lt;/tr&gt;'
</value>
</fragment>
....
This structure should be repeated for each piece of text between
'&lt;tr&gt;&lt;td&gt;' and '&lt;/td&gt;&lt;/tr&gt;'.
So I guess I need to use the substring functions recursively to search
for the above patterns and map the content. I hope this gives a better
picture. Any help would be greatly appreciated.
Thanks,
figo
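The recursive substring idea described above could be sketched in XSLT 1.0 roughly as follows. This is an illustration, not a tested solution for the full document: the template name "split" and its parameters "text" and "tag" are made up for the example, the nesting tr > td > tablecell is one reading of the structure requested above, and the tags are assumed to appear exactly as '<tr>' and '<td>' without attributes or nesting. Inline markup such as <b> inside a cell is kept as plain text, so it is escaped again on output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Replace each htmlPart fragment by the fragments built from its text. -->
  <xsl:template match="fragment[@name='htmlPart']">
    <xsl:call-template name="split">
      <xsl:with-param name="text" select="string(value)"/>
      <xsl:with-param name="tag" select="'tr'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: emit one fragment per <tag>...</tag> pair found
       in $text, then recurse on whatever follows the closing tag. -->
  <xsl:template name="split">
    <xsl:param name="text"/>
    <xsl:param name="tag"/>
    <xsl:variable name="open" select="concat('&lt;', $tag, '&gt;')"/>
    <xsl:variable name="close" select="concat('&lt;/', $tag, '&gt;')"/>
    <xsl:if test="contains($text, $open)
                  and contains(substring-after($text, $open), $close)">
      <xsl:variable name="rest" select="substring-after($text, $open)"/>
      <xsl:variable name="inner" select="substring-before($rest, $close)"/>
      <fragment name="{$tag}">
        <value>
          <xsl:choose>
            <!-- A row is split again into its cells. -->
            <xsl:when test="$tag = 'tr'">
              <xsl:call-template name="split">
                <xsl:with-param name="text" select="$inner"/>
                <xsl:with-param name="tag" select="'td'"/>
              </xsl:call-template>
            </xsl:when>
            <!-- A cell keeps its text in a tablecell fragment. -->
            <xsl:otherwise>
              <fragment name="tablecell">
                <value><xsl:value-of select="normalize-space($inner)"/></value>
              </fragment>
            </xsl:otherwise>
          </xsl:choose>
        </value>
      </fragment>
      <!-- Continue with the text after this closing tag. -->
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="substring-after($rest, $close)"/>
        <xsl:with-param name="tag" select="$tag"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Because the sample has fragment as its document element, this sketch writes the tr fragments at the top level of the result; in the bigger document a template for the surrounding element would be needed to give the output a single root.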
 
