Chris Gallagher
Hi,
I'm trying to convert part of an HTML document to REXML, and it's throwing
it back at me because the HTML isn't well-formed XML: it contains
one closing tag that doesn't have an opener.
Here's the XML:
<table class="index" width="100%">
<thead class="index-header">
<tr class="header-row">
<td><a class="sorted" href="/index.jsp?sort=status">Status <em>(since)</em></a></td>
<td><a class="sorted" href="/index.jsp?sort=last failure">Last failure</a></td>
<td><a class="sorted" href="/index.jsp?sort=last successful">Last successful</a></td>
<td>Label</td>
<td></td>
</tr>
</thead>
<tbody>
<tr class="odd-row ">
<td class="data"><a href="buildresults/henry-mobile-server">henry-mobile-server</a></td>
<td class="data date status-dull">? <em>(10:05)</em></td>
<td class="data date failure"></td>
<td class="data date">09:31</td>
<td class="data">build.19</td>
<td class="data"><input id="force_henry-mobile-server"
type="button"
onclick="callServer('http://etab-va:8000/invoke?operation=build&objectname=CruiseControl+Project%3Aname%3Dhenry-mobile-server', 'henry-mobile-server')"
value="Build"/></td>
</tr>
</tbody>
<tr class="even-row ">
<td class="data"><a href="buildresults/henry-mobile-server-nightly-build">henry-mobile-server-nightly-build</a></td>
<td class="data date status-dull">? <em>(10:05)</em></td>
<td class="data date failure"></td>
<td class="data date"></td>
<td class="data"> </td>
<td class="data"><input id="force_henry-mobile-server-nightly-build"
type="button"
onclick="callServer('http://etab-va:8000/invoke?operation=build&objectname=CruiseControl+Project%3Aname%3Dhenry-mobile-server-nightly-build', 'henry-mobile-server-nightly-build')"
value="Build"/></td>
</tr>
</tbody>
</table>
What I need to do is strip out the final "</tbody>" tag from the
file.
The HTML is being fetched using net/http, and that part of the page
is then extracted from the full page with the following line:
gathered_data = response.body[table_start_pos,height]
Any ideas on how I should remove that tag?
Cheers,
Chris
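For illustration, here is a minimal sketch of one way the stray tag could be dropped before the string is handed to REXML. It only assumes that the unmatched tag is always the last "</tbody>" in the extracted fragment, as in the sample above. The helper name strip_last_closing_tag and the shortened sample fragment are made up for this example; they are not part of net/http or REXML.

```ruby
# Hypothetical helper: removes only the LAST occurrence of "</tag>" from a
# string and leaves every earlier occurrence untouched.
def strip_last_closing_tag(fragment, tag)
  closing = "</#{tag}>"
  idx = fragment.rindex(closing)       # position of the last closing tag
  return fragment if idx.nil?          # nothing to strip
  fragment[0...idx] + fragment[(idx + closing.length)..-1]
end

# Shortened stand-in for response.body[table_start_pos, height]
gathered_data = <<~HTML
  <table>
  <tbody>
  <tr><td>a</td></tr>
  </tbody>
  <tr><td>b</td></tr>
  </tbody>
  </table>
HTML

cleaned = strip_last_closing_tag(gathered_data, 'tbody')
puts cleaned
```

The cleaned string keeps the first, properly paired "</tbody>" and could then be passed to REXML::Document.new. If the page can contain the stray tag somewhere other than the end, this approach would not be enough and a tolerant HTML parser may be a better fit.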