Martin Evans
Sorry, yet another regex question. I've been struggling to get
a regular expression to handle the following example in Python:
Search and replace all instances of "sleeping" with "dead".
This parrot is sleeping. Really, it is sleeping.
to
This parrot is dead. Really, it is dead.
But not if part of a link or inside a link:
This parrot <a href="sleeping.htm" target="new">is sleeping</a>. Really, it
is sleeping.
to
This parrot <a href="sleeping.htm" target="new">is sleeping</a>. Really, it
is dead.
This is the full extent of the "html" that would be seen in the text; the
rest of the page has already been processed. Luckily I can rely on the
formatting always being consistent with the above example (although the url
will normally be much longer in reality). There may, though, be more than
one link present.
I'm hoping to use this to implement the automatic addition of links to other
areas of a website based on keywords found in the text.
I'm guessing this is a bit too much to ask of a regex. If this is the case,
I'll add some more manual Python parsing of the string, but I was hoping to
use it to learn more about regex.
Any pointers would be appreciated.
Martin
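
For illustration, here is a minimal sketch of one way the requirement above might be approached with re.sub. It assumes the links always look like the example (a single, properly closed <a ...>...</a> with no nesting). The pattern matches either a whole link or the keyword, and a replacement function hands links back unchanged. The name replace_outside_links and its parameters are hypothetical, chosen only for this example.

```python
import re

def replace_outside_links(text, keyword="sleeping", replacement="dead"):
    """Replace keyword with replacement, except inside <a ...>...</a> links.

    Assumes links are never nested and are always closed, as in the
    example text. This is a sketch, not a general HTML parser.
    """
    # Alternative 1 (captured as group 1): a complete link, tag to closing tag.
    # Alternative 2: the bare keyword, escaped so it is matched literally.
    pattern = re.compile(
        r'(<a\b[^>]*>.*?</a>)|' + re.escape(keyword),
        re.DOTALL,
    )

    def _sub(match):
        # If the link alternative matched, return it untouched.
        if match.group(1) is not None:
            return match.group(1)
        # Otherwise the keyword matched outside any link.
        return replacement

    return pattern.sub(_sub, text)


plain = "This parrot is sleeping. Really, it is sleeping."
linked = ('This parrot <a href="sleeping.htm" target="new">is sleeping</a>. '
          'Really, it\nis sleeping.')

print(replace_outside_links(plain))
print(replace_outside_links(linked))
```

Because the regex engine scans left to right and consumes each link as a single match, any keyword inside the href or the link text is never seen by the second alternative. Several links in the same string are handled the same way.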