Can you help with this Regular Expression?

Danny

Hello

Here is a pattern I am using in the regexp module:
href(.*?)[\x22>][^\x22]*\x22

This pattern seems to work, in that it returns
href = "http://www.domain.com"

from anchor tags such as this:
<a href = "http://www.domain.com">This is a link</a>

It handles the spaces that may exist before and after the equals sign.
(Basically, I am trying to extract the links from the HTML at my URL.)

Is there a way I can take this pattern a bit further so that it returns just the URL:
http://www.domain.com

I am new to regexp but have managed to get this far.

Thanks in advance

Danny
 
Chris Hohmann

Wrap the part you want in parentheses and reference the SubMatches
collection:

href(.*?)[\x22>]([^\x22]*)\x22

In this case it's the second submatch you're interested in. Here's the
documentation for the SubMatches collection, which includes a code example:
http://msdn.microsoft.com/library/en-us/script56/html/vscolSubMatches.asp
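
For anyone who wants to try the capture-group idea outside VBScript, here is a minimal sketch in Python. It is an assumption that the pattern ports over unchanged; the thread itself uses the VBScript RegExp object, where the same value would be read from the SubMatches collection (which is zero-based, so the second submatch is index 1). The names HREF_PATTERN and extract_links are made up for this illustration.

```python
import re

# Same pattern as in the reply above; \x22 is the double-quote character.
# Group 1 captures whatever sits between "href" and the opening quote
# (spaces and the equals sign); group 2 captures the URL itself.
# HREF_PATTERN and extract_links are hypothetical names for this sketch.
HREF_PATTERN = re.compile(r'href(.*?)[\x22>]([^\x22]*)\x22')


def extract_links(html):
    """Return the second capture group (the URL) for every match."""
    return [match.group(2) for match in HREF_PATTERN.finditer(html)]


sample = '<a href = "http://www.domain.com">This is a link</a>'
print(extract_links(sample))
```

Note that Python numbers groups from 1, so group(2) here corresponds to the second submatch mentioned above. The first group is not needed for the result; it is kept only so the pattern stays identical to the one posted.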
 
Danny

That is great, and it worked for me.

Thanks very much


 
