template strings for matching?

Joe Strout

Catching up on what's new in Python since I last used it a decade ago,
I've just been reading up on template strings. These are pretty
cool! However, just as a template string has some advantages over %
substitution for building a string, it seems like it would have
advantages over manually constructing a regex for string matching.

So... is there any way to use a template string for matching? I
expected something like:

templ = Template("The $object in $location falls mainly in the $subloc.")
d = templ.match(s)

and then d would either be None (if s doesn't match), or a dictionary
with values for 'object', 'location', and 'subloc'.

But I couldn't find anything like that in the docs. Am I overlooking
something?

Thanks,
- Joe
 
Peter Otten

Joe said:
> Catching up on what's new in Python since I last used it a decade ago,
> I've just been reading up on template strings. These are pretty
> cool!

I don't think they've gained much traction, and I expect them to be superseded
by PEP 3101 (see http://www.python.org/dev/peps/pep-3101/ ).
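
For comparison, here is a small sketch of the same substitution written once with string.Template and once with the str.format method that PEP 3101 describes (available since Python 2.6 and 3.0). The sample values are made up for illustration.

```python
from string import Template

# Illustrative values only; they are not taken from the thread.
values = {"object": "rain", "location": "Spain"}

# $-style substitution with string.Template
print(Template("The $object in $location").substitute(values))

# {}-style substitution with str.format (PEP 3101)
print("The {object} in {location}".format(**values))
```

Both calls build the same string; neither API offers the reverse operation of matching text against the template.
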

> However, just as a template string has some advantages over %
> substitution for building a string, it seems like it would have
> advantages over manually constructing a regex for string matching.
>
> So... is there any way to use a template string for matching? I
> expected something like:
>
> templ = Template("The $object in $location falls mainly in the $subloc.")
> d = templ.match(s)
>
> and then d would either be None (if s doesn't match), or a dictionary
> with values for 'object', 'location', and 'subloc'.
>
> But I couldn't find anything like that in the docs. Am I overlooking
> something?

I don't think so. Here's a DIY implementation:

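import re

def _replace(match):
    # group(2) is either "$" (from an escaped "$$") or a placeholder name
    word = match.group(2)
    if word == "$":
        return "[$]"
    return "(?P<%s>.*)" % word

def extract(template, text):
    # Turn each $name into a named group, then match the text against it
    r = re.compile(r"([$]([$]|\w+))")
    pattern = r.sub(_replace, template)
    m = re.compile(pattern).match(text)
    # Return None rather than raising when the text doesn't match
    return m.groupdict() if m else None


print(extract("My $$ is on the $object in $location...",
              "My $ is on the biggest bird in the highest tree..."))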

As always with regular expressions I may be missing some corner cases...
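
Two of those corner cases are easy to spell out. Literal regex metacharacters in the template are not escaped (the dots in "..." above match any character), and the greedy ".*" groups can swallow more text than intended. Here is a minimal variant, assuming Python 3; the names template_to_regex and extract_strict are hypothetical helpers, not part of any library. It escapes the literal parts with re.escape, uses non-greedy groups, and anchors the pattern at the end of the text:

```python
import re

# Matches "$$" (an escaped dollar sign) or a "$name" placeholder.
_PLACEHOLDER = re.compile(r"\$(\$|[A-Za-z_]\w*)")

def template_to_regex(template):
    """Translate a $-style template into a compiled regex (hypothetical helper)."""
    parts = []
    pos = 0
    for m in _PLACEHOLDER.finditer(template):
        # Literal text between placeholders is escaped, so "." means a dot.
        parts.append(re.escape(template[pos:m.start()]))
        name = m.group(1)
        if name == "$":
            parts.append(re.escape("$"))
        else:
            # Non-greedy, so each field takes as little text as possible.
            parts.append("(?P<%s>.+?)" % name)
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    # \Z requires the whole text to be consumed.
    return re.compile("".join(parts) + r"\Z")

def extract_strict(template, text):
    """Return a dict of captured fields, or None if text does not match."""
    m = template_to_regex(template).match(text)
    return m.groupdict() if m else None

print(extract_strict("The $object in $location falls mainly in the $subloc.",
                     "The rain in Spain falls mainly in the plain."))
print(extract_strict("My $$ is on the $object.", "No match here"))
```

This sketch still ignores the ${name} form, and a template that repeats a placeholder name will fail to compile, because a regex cannot define the same named group twice.
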

Peter
 
Paul McGuire

Pyparsing makes building expressions with named fields pretty easy.

from pyparsing import Word, alphas, ParseException

wrd = Word(alphas)

templ = "The" + wrd("object") + "in" + wrd("location") + \
    "stays mainly in the" + wrd("subloc") + "."

tests = """\
The rain in Spain stays mainly in the plain.
The snake in plane stays mainly in the cabin.
In Hempstead, Haverford and Hampshire hurricanes hardly ever happen.
""".splitlines()
for t in tests:
    t = t.strip()
    try:
        match = templ.parseString(t)
        print(match.object)
        print(match.location)
        print(match.subloc)
        print("Fields are: %(object)s %(location)s %(subloc)s" % match)
    except ParseException:
        # parseString raises ParseException when the text doesn't fit
        print("'" + t + "' is not a match.")
    print()

Read more about pyparsing at http://pyparsing.wikispaces.com.
-- Paul
 
