j.vimal
Hi
I would like to extract the anchors from a page. This is the simple
pattern I wrote:
/(<[aA]\\s[^>]*>[^<]*<\/a>)/
Note that it is to be used from a programming language, say PHP, but
the syntax is (almost) the same as Perl's, except for escape sequences.
Now, after I have got all the anchors, I want to parse them, to get the
href and title attributes.
For the href, I wrote
\\bhref\\s*=\\s*(["'])([^\\1])\\1
I search for href at a word boundary, then skip spaces, then match the
equals sign, then skip spaces, then capture the quote character. This is
reference 1. Now, I want to continue for as long as I don't encounter
the same reference 1. Then, the last character is again reference 1.
So, is this syntax right? It doesn't seem to work for me ...
And, of course, the quotes need not be there. I will change the pattern
to handle that case.
Thanks!
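
For reference, here is a minimal sketch of the matching described above, written in Python as a stand-in (assuming its `re` syntax is close enough to Perl's for this pattern). It rests on two observations: inside a character class `\1` is read as a literal character rather than a backreference, and `([^\1])` has no quantifier, so it consumes only one character. A lazy `.*?` followed by the backreference is one common way to express "continue until the same quote appears again". The sample anchor and the helper name `attribute` are made up for illustration.

```python
import re

# Hypothetical sample anchor, not taken from the original post.
anchor = '<a href="http://example.com/page" title=\'A "quoted" title\'>link</a>'

# The pattern from the post (single backslashes, since this is a raw string).
# [^\1] is a character class containing a literal character, not a
# backreference, and it matches exactly one character.
ORIGINAL_RE = re.compile(r'''\bhref\s*=\s*(["'])([^\1])\1''')

# Group 1 captures the opening quote; the lazy (.*?) stops at the first
# later occurrence of that same quote, which the backreference \1 matches.
HREF_RE = re.compile(r'''\bhref\s*=\s*(["'])(.*?)\1''', re.IGNORECASE | re.DOTALL)
TITLE_RE = re.compile(r'''\btitle\s*=\s*(["'])(.*?)\1''', re.IGNORECASE | re.DOTALL)

def attribute(pattern, text):
    """Return the quoted attribute value (group 2), or None if absent."""
    m = pattern.search(text)
    return m.group(2) if m else None

original_match = ORIGINAL_RE.search(anchor)   # no match on this sample
href = attribute(HREF_RE, anchor)
title = attribute(TITLE_RE, anchor)

print(original_match)
print(href)
print(title)
```

This sketch still requires quotes around the value; unquoted attribute values would need an extra alternative in the pattern.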