Geoff said:
Gunnar said:
if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
.+?
Address.+?<TD[^>]+>([^<]+)
/isx ) {
print "Name: $1\nAddress: $2\n";
}
The above is not working for me at the moment. If you have the
time (and patience!), it would really help me if you could "talk" me
through it ...
I'd prefer not to. Besides the character classes, which have now been
explained, and a couple of modifiers, whose meaning you can read about
in 'perldoc perlre', it doesn't include anything that was not already
in the regex you posted yourself.
OK, will do. I follow the above, except I would have expected that
if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
would need
if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)<
and
Address.+?<TD[^>]+>([^<]+)
would need
Address.+?<TD[^>]+>([^<]+)<
i.e. a "<" to signify where the ([^<]+) ends, since you do have a "<" in
the .+?<TD[^>]+> section. I must be missing something?
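On the trailing "<" question, a small check may help. The class [^<] cannot match a "<", so the greedy ([^<]+) runs up to the next "<" and stops there by itself, whether or not a "<" follows in the pattern. The sketch below assumes a made-up one-line sample modelled on the HTML in this post; first_capture is a hypothetical helper, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical one-line sample, modelled on the HTML in this post.
my $html = '<TD><B>Head Teacher</B></TD><TD vAlign=top>Fred Green</TD>';

# Hypothetical helper: return the first capture, or undef if the match fails.
sub first_capture {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

# Same pattern twice: once as posted, once with the extra trailing "<".
my $without = first_capture(qr/Head\s+Teacher.+?<TD[^>]+>([^<]+)/is,  $html);
my $with    = first_capture(qr/Head\s+Teacher.+?<TD[^>]+>([^<]+)</is, $html);

print "without trailing <: $without\n";
print "with trailing <:    $with\n";
```

Both lines print "Fred Green" for this sample, so the trailing "<" is harmless but not required here.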
My code is as follows but it does not work!
---------------------------
use strict;
print ("name of html file?\n");
my $namehtml = <STDIN>;
print ("name of email list file?\n");
my $newhtml = <STDIN>;
open(IN, "$namehtml");
open(OUT, ">>$newhtml");
my $line = <IN>;
while (defined($line=<IN>)) {
# if ($line =~ / (.*?)<\/H6>/i) {
# print OUT ("$1 \n");
# }
if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
.+?
Address.+?<TD[^>]+>([^<]+)
/isx ) {
print OUT ("Name: $1\nAddress: $2\n");
}
}
close (IN);
close (OUT);
-----------------------------
which is meant to work on input such as:
<TD align=left width="20%" colSpan=2><B>Head Teacher</B></TD>
<TD vAlign=top width="80%" colSpan=2>Fred Green</TD></TR>
<TR>
<TD align=left width="20%" colSpan=2><B>Address</B></TD>
<TD vAlign=top width="80%" colSpan=2>Park Road, Northgate,
London N88 5XX</TD></TR>
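One likely reason the script finds nothing: the while loop tests one line at a time, but "Head Teacher" and "Address" sit on different lines, so no single line can satisfy the whole pattern. (The extra my $line = <IN>; before the loop also throws away the first line.) A minimal sketch of reading the whole file into one string first is below. It assumes the sample above is the entire input and uses an in-memory filehandle purely to stay self-contained; the variable names are illustrative.

```perl
use strict;
use warnings;

# Sample copied from this post; in the real script it would come from the HTML file.
my $html = <<'END_HTML';
<TD align=left width="20%" colSpan=2><B>Head Teacher</B></TD>
<TD vAlign=top width="80%" colSpan=2>Fred Green</TD></TR>
<TR>
<TD align=left width="20%" colSpan=2><B>Address</B></TD>
<TD vAlign=top width="80%" colSpan=2>Park Road, Northgate,
London N88 5XX</TD></TR>
END_HTML

# The regex from the thread, unchanged.
my $re = qr/Head\s+Teacher.+?<TD[^>]+>([^<]+)
            .+?
            Address.+?<TD[^>]+>([^<]+)
           /isx;

# Line by line, as in the posted script: count how many lines match.
open my $in, '<', \$html or die "Cannot open in-memory file: $!";
my $line_hits = 0;
while (defined(my $line = <$in>)) {
    $line_hits++ if $line =~ $re;
}
close $in;

# Whole file at once: undefine $/ locally so <$in> returns everything.
open $in, '<', \$html or die "Cannot open in-memory file: $!";
my $content = do { local $/; <$in> };
close $in;

my ($name, $address);
if ($content =~ $re) {
    ($name, $address) = ($1, $2);
    $address =~ s/\s+/ /g;    # fold the line break inside the address
}

print "Line-by-line matches: $line_hits\n";
print "Name: $name\nAddress: $address\n" if defined $name;
```

With a real file, the same idea would apply after open(my $in, '<', $namehtml) or die ...; with the file name chomp()ed first. If the page holds several schools, a while loop with the /g modifier over $content would be one way to pick up each pair.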
Cheers
Geoff