Hi,
I'm using this chunk of code to extract some content from a piece of
HTML:
($parse) = ($html =~ /<tbody>(.*?)<\/tbody>/sg);
so that I can grab everything between <tbody> and </tbody>.
Now, I have some <tr> tags I'd like to parse as follows:
<tr class="odd"></tr>
<tr class="even"></tr>
However, I want to skip the attribute. I am new to regular expressions and I am stuck
at this:
(@parse) = ($parse =~ /<tr (\W+)>(.*?)<\/tr>/sg);
but it's not working. How should I go about it?
Thanks
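
A likely reason the second pattern fails: \W matches only non-word characters, so \W+ cannot match class="odd", which begins with word characters. A character class such as [^>]* skips whatever attributes the tag carries. Below is a minimal sketch of that fix, assuming the rows are not nested and no attribute value contains ">". The sample $html and the names $tbody and @rows are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the poster's $html.
my $html = <<'HTML';
<table>
<tbody>
<tr class="odd"><td>first</td></tr>
<tr class="even"><td>second</td></tr>
</tbody>
</table>
HTML

# Grab the body of the first <tbody>...</tbody>; a single match needs no /g.
my ($tbody) = $html =~ m{<tbody[^>]*>(.*?)</tbody>}si;
die "no <tbody> found\n" unless defined $tbody;

# [^>]* consumes any attributes up to the closing ">" of the opening tag.
# With /g in list context, the match returns every captured row body.
my @rows = $tbody =~ m{<tr\b[^>]*>(.*?)</tr>}sig;

print "row: $_\n" for @rows;
```

For nested tables, comments containing markup, or otherwise messy HTML, this simple pattern will miscount rows, and a stricter pattern or a real HTML parser is the safer tool.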
Something like this maybe.
-sln
-------------------
use strict;
use warnings;
use 5.010;
## Requires 5.10 or above (the pattern uses (?1) recursion)
## OP: Now, I have some <tr> tags I'd like to parse as follows:
## <tr class="odd"></tr>
## <tr class="even"></tr>
## Yet, I want to skip the attribute.
##
my $xml = join '', <DATA>;
##
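## Opening <tr ...> tag (attributes captured) and closing </tr> tag;
## both are interpolated into the /x pattern below
my $open  = q{ <tr\s*( [^>]*? )(?<!\/)> };
my $close = q{ <\/tr\s*> };
my $regx  = qr/
    # skip script blocks
    <script\s*[^>]*?(?<!\/)> .*? <\/script\s*>
  |
    # skip CDATA sections, comments and marked sections
    (?-i: <!(?:\[CDATA\[.*?\]\]|--.*?--|\[[A-Z][A-Z\ ]*\[.*?\]\])> )
  |
    (                           # 1: a whole row, open tag to close tag
      (?: $open )               # 2: attribute text, captured inside the open tag
      (                         # 3: row content
        (?:
          (?>
            (?:
              (?-i: <!(?:\[CDATA\[.*?\]\]|--.*?--|\[[A-Z][A-Z\ ]*\[.*?\]\])> )
            | (?! $open | $close ) .
            )+
          )
        | (?1)                  # nested row: recurse into group 1
        )*
      )
      $close
    )
/ixs;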
##
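my @records;
while ( $xml =~ /$regx/g )
{
    ## $1 is defined only when the <tr> alternative matched;
    ## script blocks, CDATA and comments match without setting it
    if (defined $1) {
        print "-->\$2 = '$2'\n";    ## attribute text of the opening tag
        print "-->\$3 = '$3'\n";    ## everything between the open and close tags
        push @records, $3;
    }
}
print "---------\nDone!\n";
exit;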
__DATA__
<![CDATA[
<tr class="odd">
trodd
</tr>
<tr class="even">
treven
</tr>
]]>
<script>
function search0(){
document.forms[0].submit()
}
function Upper()
{
var up = document.getElementById("h_sn");
return up.value = up.value.toUpperCase();
}
</script>
<TABLE width="800" BORDER=1 align="center" CELLPADDING=1 CELLSPACING=0>
<TR VALIGN="TOP" >
<TD colspan="11" align="left" valign="middle" class="style31"><img
src="image1/Home/Export.png" width="45" height="13" /></TD>
</TR>
<TR VALIGN="TOP" >
<TD WIDTH="33" align="center" class="style25">Item</TD>
<TD width="73" align="center" class="style25">AWB No </TD>
<TD WIDTH="69" align="center" class="style25">Flight No </TD>
<TD WIDTH="87" align="center" class="style25">Flight Date</TD>
<TD WIDTH="42" align="center" class="style25">Origin</TD>
<TD WIDTH="42" align="center" class="style25">Dest</TD>
<TD WIDTH="99" align="center" class="style25">ULD No </TD>
<TD WIDTH="105" align="center" class="style25">Status</TD>
<TD WIDTH="50" align="center" class="style25"> Pieces </TD>
<TD WIDTH="58" align="center" class="style25">Weight </TD>
<TD WIDTH="96" align="center" class="style25">Time </TD>
</TR>
<TR bgcolor='#99CCFF' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">1</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 419</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 15 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Flight Change </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Export
Transshipment</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 5:37PM </TD>
</TR>
<TR bgcolor='#99FFCC' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">2</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 419</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 15 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12"> </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Accepted</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 5:37PM </TD>
</TR>
<TR bgcolor='#99CCFF' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 373</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 15 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Flight Change </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Export
Transshipment</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 6:12PM </TD>
</TR>
<TR bgcolor='#99FFCC' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">4</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 373</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 15 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">SHC </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Export
Transshipment</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 6:12PM </TD>
</TR>
<TR bgcolor='#99CCFF' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">5</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 373</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 14 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Flight Change </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Export
Transshipment</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 6:42PM </TD>
</TR>
<TR bgcolor='#99FFCC' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">6</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 373</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 14 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">PMC31131EK </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Manifested</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 6:57PM </TD>
</TR>
<TR bgcolor='#99CCFF' >
<TD ALIGN="center" NOWRAP="TRUE" class="style12">7</TD>
<TD ALIGN="left" NOWRAP="TRUE" class="style12">176-75064953</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">EK 373</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Oct 14 2010 </TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">BKK</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12">DXB</TD>
<TD ALIGN="center" NOWRAP="TRUE" class="style12"> </TD>
<!--// This is check status -->
<TD ALIGN="center" NOWRAP="TRUE" class="style12">Departed</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">3</TD>
<TD ALIGN="RIGHT" NOWRAP="TRUE" class="style12">743.00</TD>
<TD ALIGN="right" NOWRAP="TRUE" class="style12">Oct 14 2010 9:54PM </TD>
</TR>
</TABLE>
<script>
function show_adv(){
var ko = document.getElementById("showadv");
//var ko2 = document.getElementById("showadv2");
ko.style.display="";
//ko2.style.display="";
}
</script>