Hi Duncan
This of course gives priority to colours and only looks for garments or
footwear if it hasn't matched on a prior pattern. If you actually
wanted to match the first occurrence of any of these (or if the condition
was re.match instead of re.search) then named groups can be a nice way of
simplifying the code:
A good point. And a good example of when to use named
capture group references. This is easily extended
to 'spit out' all other occurring categories
(see below).
This is one nice thing in Python's regex syntax:
you have to emulate the ?P-thing in other
regex systems more or less 'awk'-wardly ;-)
For something this simple the titles and group names could be the
same, but I'm assuming real code might need a bit more.
No no, this is quite good because it involves
some math-generated table-code lookup.
I managed somehow to extend your example in order
to spit out all matches and their corresponding
categories:
    import re

    # re.VERBOSE ignores the padding whitespace inside the pattern
    PATTERN = r"""
        (?P<c>blue |white |red    )
      | (?P<g>socks|tights        )
      | (?P<f>boot |shoe  |trainer)
    """
    PATTERN = re.compile(PATTERN, re.VERBOSE)
    TITLES = { 'c': 'Colour', 'g': 'Garment', 'f': 'Footwear' }

    t = 'blue socks and red shoes'
    for match in PATTERN.finditer(t):
        # lastgroup is the name of the alternative that matched
        grp = match.lastgroup
        print("%s: %s" % (TITLES[grp], match.group(grp)))
which writes out the expected:
Colour: blue
Garment: socks
Colour: red
Footwear: shoe
The corresponding Perl program would look like this:
    $PATTERN = qr/
        (blue |white |red    )(?{'c'})
      | (socks|tights        )(?{'g'})
      | (boot |shoe  |trainer)(?{'f'})
    /x;
    %TITLES = (c =>'Colour', g =>'Garment', f =>'Footwear');
    $t = 'blue socks and red shoes';
    print "$TITLES{$^R}: $^N\n" while( $t=~/$PATTERN/g );
and prints the same:
Colour: blue
Garment: socks
Colour: red
Footwear: shoe
Perl 5 (before 5.10) has no named match references
like (?P<..>), so you have to emulate them with an ordinary
code assertion (?{..}) and set some value ($^R) on
the fly - which is not that bad in the end (imho).
(?{..}) is a "zero-width code assertion":
it sets Perl's predefined $^R to the value that
the {...} block evaluates to, while $^N holds the
text of the most recently closed group.
As you can see, the pattern matching related part
reduces from 4 lines to one line.
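For comparison, on the Python side the roles of $^R and $^N are
played by the match object itself. A minimal sketch, using a
two-category pattern shortened from the example above (the sample
string is made up for illustration):

```python
import re

# Shortened, hypothetical variant of the colour/garment pattern.
pattern = re.compile(r"(?P<c>blue|white|red)|(?P<g>socks|tights)")

m = pattern.search("white tights")
category = m.lastgroup        # name of the group that matched, like $^R
text = m.group(category)      # text captured by that group, like $^N
print("%s: %s" % (category, text))
print(m.groupdict())          # alternatives that did not match are None
```

The difference is that lastgroup is fixed to the group name, whereas
the (?{..}) block can return any value you like.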
If you didn't need the dictionary lookup and
could get away with putting the category names
straight into the pattern, all you'd have to do
would be this:
    $PATTERN = qr/
        (blue |white |red    )(?{'Colour'})
      | (socks|tights        )(?{'Garment'})
      | (boot |shoe  |trainer)(?{'Footwear'})
    /x;
    $t = 'blue socks and red shoes';
    print "$^R: $^N\n" while( $t=~/$PATTERN/g );
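The same shortcut is possible in Python when the titles happen to be
valid identifiers, as mentioned in the quoted remark about titles and
group names being the same. This is only a sketch under that
assumption; the helper name categorise is made up for illustration:

```python
import re

# Assumption: every title is a valid Python identifier, so it can serve
# directly as the group name. Titles with spaces would still need a dict.
PATTERN = re.compile(r"""
    (?P<Colour>   blue  | white  | red     )
  | (?P<Garment>  socks | tights           )
  | (?P<Footwear> boot  | shoe   | trainer )
""", re.VERBOSE)

def categorise(text):
    """Return (category, matched text) pairs in order of occurrence."""
    # Each match is exactly one alternative, so m.group() is its text
    # and m.lastgroup is the name of the group that produced it.
    return [(m.lastgroup, m.group()) for m in PATTERN.finditer(text)]

for title, word in categorise('blue socks and red shoes'):
    print("%s: %s" % (title, word))
```

This drops the TITLES dictionary, at the price of tying the output
text to what the regex syntax accepts as a group name.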
What's the point of all that? IMHO, Python's
Regex support is quite good and useful, but
won't give you an edge over Perl's in the end.
Thanks & Regards