John Nagle
Here's a large Perl regular expression, from a Perl address parser in CPAN:
    use re 'eval';
    $Addr_Match{street} = qr/
        (?:
          # special case for addresses like 100 South Street
          (?:($Addr_Match{direct})\W+           (?{ $_{street} = $^N })
             ($Addr_Match{type})\b              (?{ $_{type}   = $^N }))
          |
          (?:($Addr_Match{direct})\W+           (?{ $_{prefix} = $^N }))?
          (?:
            ([^,]+)                             (?{ $_{street} = $^N })
            (?:[^\w,]+($Addr_Match{type})\b     (?{ $_{type}   = $^N }))
            (?:[^\w,]+($Addr_Match{direct})\b   (?{ $_{suffix} = $^N }))?
           |
            ([^,]*\d)                           (?{ $_{street} = $^N })
            ($Addr_Match{direct})\b             (?{ $_{suffix} = $^N })
           |
            ([^,]+?)                            (?{ $_{street} = $^N })
            (?:[^\w,]+($Addr_Match{type})\b     (?{ $_{type}   = $^N }))?
            (?:[^\w,]+($Addr_Match{direct})\b   (?{ $_{suffix} = $^N }))?
          )
        )
    /ix;
I'm trying to convert this to Python.
Those entries like "$Addr_Match{direct}" are other regular expressions,
being used here as subexpressions. Those have already been converted
to forms like "Addr_Match.direct" in Python. But how do I reference them
from inside a larger regular expression?
Is that possible in Python, and if so, where is it documented?
John Nagle
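
One possible approach, as a sketch rather than a definitive port: Python's re module has no syntax for interpolating one pattern into another and no equivalent of Perl's (?{ code }) blocks, so the usual workaround is to keep the subexpressions as strings (or take the .pattern attribute of a compiled pattern), splice them in with ordinary string formatting, and use named groups in place of the $^N assignments. Everything named below is an assumption for illustration: DIRECT and TYPE are short stand-ins for the real tables, and the numbered group names and parse_street() are hypothetical.

```python
import re

# Hypothetical stand-ins for the already-converted subexpressions; the real
# tables in the Perl module are much longer.
DIRECT = r"north|south|east|west|[nsew]"
TYPE = r"street|st|avenue|ave|road|rd"

# re.VERBOSE plays the role of Perl's /x and re.IGNORECASE the role of /i.
# Each (?P<name_N>...) stands in for a Perl (?{ $_{name} = $^N }) block; the
# numeric suffix is there because re rejects duplicate group names.
STREET = re.compile(
    rf"""
    (?:
        # special case for addresses like 100 South Street
        (?: (?P<street_1>{DIRECT}) \W+
            (?P<type_1>{TYPE}) \b )
      |
        (?: (?P<prefix_2>{DIRECT}) \W+ )?
        (?:
            (?P<street_2>[^,]+)
            (?: [^\w,]+ (?P<type_2>{TYPE}) \b )
            (?: [^\w,]+ (?P<suffix_2>{DIRECT}) \b )?
          |
            (?P<street_3>[^,]*\d)
            (?P<suffix_3>{DIRECT}) \b
          |
            (?P<street_4>[^,]+?)
            (?: [^\w,]+ (?P<type_4>{TYPE}) \b )?
            (?: [^\w,]+ (?P<suffix_4>{DIRECT}) \b )?
        )
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_street(text):
    """Match at the start of text and merge the numbered groups by base name."""
    m = STREET.match(text)
    if m is None:
        return None
    fields = {}
    for name, value in m.groupdict().items():
        if value is not None:
            # "street_2" -> "street"; only one alternative can have matched.
            fields[name.rsplit("_", 1)[0]] = value
    return fields


print(parse_street("South Street"))  # {'street': 'South', 'type': 'Street'}
print(parse_street("N Main St"))
```

Two caveats with this sketch: under re.VERBOSE any literal whitespace or "#" inside a spliced-in subexpression is ignored unless escaped, and flags that a subexpression was compiled with are not carried along when only its .pattern string is reused. Named groups and the VERBOSE flag are both described in the documentation for the re module.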