Pattern matching with string and list

olaufr

Hi,

I need to perform simple pattern matching within a string using a
list of possible patterns. For example, I want to know whether the substring
starting at position n matches any of the strings in a list, as
below:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find($)
# here I need to find whether what's after 'pos' matches any of the
# strings of my 'patterns' list
bmatch = ismatching( sentence[pos:], patterns)

Does an equivalent of this ismatching() function exist in some Python
library?

Thanks,

Olivier.
 
Michael Spencer

As I think you define it, ismatching can be written as:
>>> import re
>>> def ismatching(sentence, patterns):
...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
...     return bool(re_pattern.match(sentence))
...
>>> ismatching(sentence[pos+1:], patterns)
True
>>> ismatching(sentence[pos+1:], ["green", "blue"])
False
>>>
(For help with regular expressions, see: http://www.amk.ca/python/howto/regex/)


Or you can ask the regexp engine to start looking at a position you specify:
>>> def ismatching(sentence, patterns, startingpos):
...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
...     return bool(re_pattern.match(sentence, startingpos))
...

But you may be able to save the separate step of determining pos by including
the "$" in the regexp, e.g.,
>>> def matching(patterns, sentence):
...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
...     return bool(re_pattern.search(sentence))
...
>>> matching(patterns, sentence)
True
>>> matching(["green", "blue"], sentence)
False
>>>

Then, it might be more generally useful to return the match rather than the
boolean value. You can still use it in truth testing, since a no-match will
evaluate to False:
>>> def matching(patterns, sentence):
...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
...     return re_pattern.search(sentence)
...
>>> if matching(patterns, sentence): print("Match")
...
Match

Finally, if you are going to be doing a lot of these, it would be faster to take
the pattern compilation out of the function and simply use the pre-compiled
regexp, or its bound search method.
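
A minimal sketch of the pre-compiled variant, assuming the same sentence and patterns as in the question. The name find_color is made up for this example, and the re.escape call is an extra precaution that the snippets above do not use.

```python
import re

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]

# Compile once, outside any function. re.escape guards against regex
# metacharacters in the patterns (an addition made for this sketch).
alternatives = "|".join(re.escape(p) for p in patterns)

# find_color is a hypothetical name for the bound search method.
find_color = re.compile(r"\$(%s)" % alternatives).search

match = find_color(sentence)
if match:
    print(match.group(1))  # prints: red
```

Calling find_color repeatedly reuses the compiled pattern, so the cost of building the regexp is paid only once.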

HTH

Michael
 
Tom Anderson

> I need to perform simple pattern matching within a string using a list
> of possible patterns. For example, I want to know whether the substring
> starting at position n matches any of the strings in a list, as
> below:
>
> sentence = "the color is $red"
> patterns = ["blue","red","yellow"]
> pos = sentence.find($)

I assume that's a typo for "sentence.find('$')", rather than some new
syntax i've not learned yet!

> # here I need to find whether what's after 'pos' matches any of the
> # strings of my 'patterns' list
> bmatch = ismatching( sentence[pos:], patterns)
>
> Does an equivalent of this ismatching() function exist in some Python
> library?

I don't think so, but it's not hard to write:

def ismatching(target, patterns):
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

You don't say what bmatch should be at the end of this, so i'm going with
a boolean; it would be straightforward to return the pattern which
matched, or the index of the pattern which matched in the pattern list, if
that's what you want.
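
A rough sketch of those two variants, assuming the same startswith-based loop. The names first_match and first_match_index are invented for this example.

```python
def first_match(target, patterns):
    """Return the first pattern that target starts with, or None."""
    for pattern in patterns:
        if target.startswith(pattern):
            return pattern
    return None


def first_match_index(target, patterns):
    """Return the index of the first matching pattern, or -1."""
    for index, pattern in enumerate(patterns):
        if target.startswith(pattern):
            return index
    return -1


sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")

print(first_match(sentence[pos + 1:], patterns))        # red
print(first_match_index(sentence[pos + 1:], patterns))  # 1
```

For the plain boolean case, str.startswith also accepts a tuple of prefixes, so target.startswith(tuple(patterns)) gives the same answer as the loop.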

The tough guy way to do this would be with regular expressions (in the re
module); you could do the find-the-$ and the match-a-pattern bit in one
go:

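import re
# The outer group keeps the "\$" attached to all three alternatives;
# without it, a bare "red" or "yellow" would match with no "$" in front.
patternsRe = re.compile(r"\$((blue)|(red)|(yellow))")
bmatch = patternsRe.search(sentence)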

At the end, bmatch is None if it didn't match, or an instance of re.Match
(from which you can get details of the match) if it did.
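
To illustrate what those details look like, here is a small sketch. The pattern and the name colour_re are written just for this example and are not the ones from the post.

```python
import re

sentence = "the color is $red"

# One capturing group around the alternatives, right after the "$".
colour_re = re.compile(r"\$(blue|red|yellow)")

bmatch = colour_re.search(sentence)
if bmatch is not None:
    print(bmatch.group(0))  # whole match: $red
    print(bmatch.group(1))  # captured colour: red
    print(bmatch.span())    # start and end offsets: (13, 17)
```

Because a failed search returns None, the same object works directly in an if test.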

If i was doing this myself, i'd be a bit cleaner and use non-capturing
groups:

patternsRe = re.compile(r"\$(?:(?:blue)|(?:red)|(?:yellow))")

And if i did want to capture the colour string, i'd do it like this:

patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")

If this all looks like utter gibberish, DON'T PANIC! Regular expressions
are quite scary to begin with (and certainly not very regular-looking!),
but they're actually quite simple, and often a very powerful tool for text
processing (don't get carried away, though; regular expressions are a bit
like absinthe, in that a little helps your creativity, but overindulgence
makes you use perl).

In fact, we can tame the regular expressions quite neatly by writing a
function which generates them:

def regularly_express_patterns(patterns):
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(patterns)
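
The point of re.escape shows up when a pattern contains regex metacharacters. The sketch below repeats the generator in a compact form so that it runs on its own; the patterns "c++" and "a.b" and the name tricky_re are hypothetical, chosen only to show the escaping.

```python
import re


def regularly_express_patterns(patterns):
    # Same idea as the function above: escape each pattern, then join
    # the alternatives inside one capturing group after the "$".
    escaped = ["(?:%s)" % re.escape(pattern) for pattern in patterns]
    return re.compile(r"\$(" + "|".join(escaped) + ")")


# Hypothetical patterns containing regex metacharacters.
tricky_re = regularly_express_patterns(["c++", "a.b"])

print(tricky_re.search("written in $c++").group(1))  # c++
print(tricky_re.search("value of $axb"))             # None
```

Without the escaping, "c++" would not even compile as a regexp, and "a.b" would also match "axb".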

tom
 
BartlebyScrivener

Taking you literally, I'm not sure you need regex. If you know or can
find position n, then can't you just:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find("$")
for x in patterns:
    if x == sentence[pos+1:]:
        print(x, pos+1)

But maybe I'm oversimplifying.

rpd
 
BartlebyScrivener

Even without the marker, can't you do:

sentence = "the fabric is red"
colors = ["red", "white", "blue"]

for color in colors:
    # find() returns -1 when absent and 0 for a match at the very start
    if sentence.find(color) >= 0:
        print(color, sentence.find(color))
 
Brett g Porter

BartlebyScrivener said:
Even without the marker, can't you do:

sentence = "the fabric is red"
colors = ["red", "white", "blue"]

for color in colors:
    if sentence.find(color) >= 0:
        print(color, sentence.find(color))

That depends on whether you're only looking for whole words:
>>> colors = ['red', 'green', 'blue']
>>> def findIt(sentence):
...     for color in colors:
...         if sentence.find(color) >= 0:
...             print(color, sentence.find(color))
...
It's easy to see all the cases that this approach will fail for...
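
One way to restrict the search to whole words is a regexp with word boundaries. This is only a sketch: the function name find_whole_words and the test sentences are made up for the example.

```python
import re

colors = ["red", "green", "blue"]

# \b matches at a word boundary, so "red" inside "bored" is skipped.
color_re = re.compile(r"\b(%s)\b" % "|".join(re.escape(c) for c in colors))


def find_whole_words(sentence):
    """Return (color, position) pairs for whole-word matches only."""
    return [(m.group(1), m.start()) for m in color_re.finditer(sentence)]


print(find_whole_words("This is red"))     # [('red', 8)]
print(find_whole_words("a bored reader"))  # []
print("a bored reader".find("red"))        # 4, the substring false positive
```

The plain find() call reports "red" inside "bored", while the boundary-anchored regexp does not.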
 
