Alan Kennedy
The possibility of the same feature occurred to me. However, I'm still
not sure if this would solve the problem. How would the "pivot point"
be recognised by such an augmented regex engine? i.e. how would it
recognise the point at which it should stop capturing, reverse the
sequence and start matching again?
Back-tracking. Suppose there was such a thing as
(.*).?\~1
where the \~1 matches the reverse of the first group.
Now match that pattern against
NOON
The (.*) matches all the way up to "NOON" then tries and
failes to match both the .? and the \~1
It backtracks one so that
(.*) matches "NOO"
and the N remains. The .? matches the N but the \~1 does not
so it backtracks so that the .? matches nothing, but still the \~1
does not match the N.
It backtracks another so that
(.*) matches "NO"
and the "ON" remains. The .? first matches with the "O" leaving
the "N", but \~1 doesn't match that, so it backtracks so that
the .? matchs nothing, leaving "ON" for the \~1. This matches!
In other words, it goes through a lot of work to do the matching,
and would take O(N) backtracks to work.
But that's not quite correct. An implementation is free to
analyze the pattern and notice that it's being asked to search
for a palindrome then insert special case code for that pattern.
Perhaps one would need to also implement a feature whereby the length
of the entire string could be made available within expressions, so
that the size of a capture group could be limited to the first half of
the string? I.E. Something along the lines of
^(.{strlen/2}).?\~1$
Yup. That would work as well. There are all sorts of
special things one could capture in a regex-like language.
Perl 'solves' it by allowing any match to call out to embedded
Perl code, which is free to tell the matcher to stop or
continue matching.
One of these days I'll find the time to dig out my old course notes
and books :#)
Friedl's regexp book (from ORA) is quite good.
Andrew
(e-mail address removed)