Regular Expression for URL

JJ

I can get a set of URL matches by using a match expression that has a
named capture group in it, e.g.:

Regex reg_linkTags = new Regex(
    "(?:.....long regular expression......)" +
    "(?<url>\\w+|\"[^\"]*\"|'[^']*')" +
    "(?:(?:\\s+\\w+\\s*=\\s*)(?:......long regular expression......)",
    RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Multiline);
MatchCollection tagMatches = reg_linkTags.Matches(myString);
MatchCollection tagMatches = reg_linkTags.Matches(myString);

Then get the url named capture group by:

for (int i = 0; i < tagMatches.Count; i++)
{
    // Result is a method of Match, so index into the collection first
    CurrentUrl = tagMatches[i].Result("${url}");
}


But how do I _replace_ the URL in the original string ('myString')? I can
match the URLs, but I cannot seem to replace them. I tried using
Regex.Replace, but I don't think I can use it with either non-capturing
groups (which I wrap the named capture group 'url' in) or with named capture
groups, can I?

Thanks,
JJ
 
Guest

What about a simple replace?

CurrentUrl = tagMatches[i].Result("${url}");
myString = myString.Replace(CurrentUrl, newUrl);
 
JJ

Ah, that's what I did in the first place. However, when the href and the src
parts of the tag had identical beginning sections, both were replaced. In
my case I didn't want that to happen.

What I ended up doing was creating another regular expression to pull out
just the href URL so that I could replace it. I just thought there might be
an easier way to use named capture groups to replace (not just capture) text.
Maybe there is, but as yet I haven't a clue how to do it.
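
To make the problem concrete, here is a minimal sketch of why a plain string replace touches both attributes. The markup and the URLs are invented for illustration, and the class and method names are hypothetical.

```csharp
using System;

public static class SimpleReplaceDemo
{
    // Plain string replacement: every occurrence of oldUrl is swapped,
    // regardless of which attribute it appears in.
    public static string ReplaceEverywhere(string html, string oldUrl, string newUrl)
    {
        return html.Replace(oldUrl, newUrl);
    }

    public static void Main()
    {
        // Hypothetical markup: the src value starts with the href value.
        string myString = "<a href=\"/images\"><img src=\"/images/logo.gif\"></a>";

        Console.WriteLine(ReplaceEverywhere(myString, "/images", "/pics"));
        // prints: <a href="/pics"><img src="/pics/logo.gif"></a>
    }
}
```

String.Replace has no notion of where the regex matched, so the src attribute changes as well.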

Thanks,

JJ
 
Guest

I think there is one more possibility.

Change your pattern so that it returns all of the content in named groups:

(?<textbefore>...)(?<url>...)(?<textafter>...)

In this case you could use the following replacement statement:

string newUrl = "http://.......";
string result = reg_linkTags.Replace(text,
    "${textbefore}" + newUrl + "${textafter}");

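As a rough, runnable sketch of the three-group idea: the pattern below is a simplified stand-in for the elided one in the thread, and it assumes the URL sits in the href of an <a> tag. The class and method names are hypothetical. Because "$" has a special meaning in a replacement string, the sketch doubles any "$" in the new URL before concatenating it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupReplaceDemo
{
    // Simplified stand-in for the thread's pattern (an assumption):
    // everything in the match is captured in one of three named groups.
    static readonly Regex LinkTag = new Regex(
        @"(?<textbefore><a\s+href=)(?<url>\w+|""[^""]*""|'[^']*')(?<textafter>[^>]*>)",
        RegexOptions.IgnoreCase);

    public static string ReplaceHref(string html, string newUrl)
    {
        // "$$" in a replacement pattern produces one literal "$".
        string safeUrl = newUrl.Replace("$", "$$");

        // Put back the text around the URL and swap only the URL itself.
        return LinkTag.Replace(html, "${textbefore}\"" + safeUrl + "\"${textafter}");
    }

    public static void Main()
    {
        string text = "<a href=\"/images\"><img src=\"/images/logo.gif\"></a>";

        Console.WriteLine(ReplaceHref(text, "/pics"));
        // prints: <a href="/pics"><img src="/images/logo.gif"></a>
    }
}
```

Only the href changes here, because the src attribute is never part of a match.
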
You can also pass a MatchEvaluator delegate: a custom function that is
called for each match and returns the replacement value.

string result = reg_linkTags.Replace(text, new MatchEvaluator(NewUrl));

string NewUrl(Match m)
{
    // read the named group defined in the pattern
    string x = m.Groups["url"].ToString();
    ....
    return "something";
}
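
A fuller sketch of the MatchEvaluator route, under some assumptions: the pattern is a simplified stand-in for the elided one, the helper ReplaceGroup and the base URL are made up for illustration, and the surrounding groups stay non-capturing as in the original question. The helper uses Group.Index and Group.Length to splice new text over just the named group inside each match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorDemo
{
    // Simplified stand-in pattern (an assumption); only 'url' is captured.
    static readonly Regex LinkTag = new Regex(
        @"(?:<a\s+href=)(?<url>\w+|""[^""]*""|'[^']*')(?:[^>]*>)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: rebuilds the match text with one group swapped out.
    static string ReplaceGroup(Match m, string groupName, string replacement)
    {
        Group g = m.Groups[groupName];
        if (!g.Success) return m.Value;

        // Group.Index is relative to the input, so shift it into the match.
        int offset = g.Index - m.Index;
        return m.Value.Substring(0, offset)
             + replacement
             + m.Value.Substring(offset + g.Length);
    }

    public static string PrefixHrefs(string html, string baseUrl)
    {
        // The lambda is converted to a MatchEvaluator and runs once per match.
        return LinkTag.Replace(html, m =>
        {
            string bare = m.Groups["url"].Value.Trim('"', '\'');
            return ReplaceGroup(m, "url", "\"" + baseUrl + bare + "\"");
        });
    }

    public static void Main()
    {
        string text = "<a href=\"/images\"><img src=\"/images/logo.gif\"></a>";

        Console.WriteLine(PrefixHrefs(text, "http://example.com"));
        // prints: <a href="http://example.com/images"><img src="/images/logo.gif"></a>
    }
}
```

The evaluator returns an ordinary string, so nothing in the new URL is interpreted as a substitution token.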
 
JJ

I didn't think of that. Thanks,

JJ
