negative lookbehind and \.

Isaac Councill

Hello,

I am trying to extract capitalized words from text that are not the
first words in a sentence. I thought it would be straightforward
using negative lookbehind for periods:

m/((?:(?<!\. )|(?<!\.  ))[A-Z][a-z]+)/g

but that is not the case.

The above regexp will match all capitalized words not preceded by one
or two spaces. For instance, if I use a lookbehind with \. followed
by one space, I get no results unless there are capitalized words not
preceded by a space (e.g. "(Something" - I can extract "Something").
It seems that the matcher is simply ignoring the "\.". What am I
missing here?

Thanks,
Isaac
 
Malcolm Dew-Jones

Isaac Councill wrote:
: Hello,

: I am trying to extract capitalized words from text that are not the
: first words in a sentence. I thought it would be straightforward
: using negative lookbehind for periods:

: m/((?:(?<!\. )|(?<!\.  ))[A-Z][a-z]+)/g

: but that is not the case.

: The above regexp will match all capitalized words not preceded by one
: or two spaces.

I don't think so. I think the double negative has confused you.

E.g.

`one space. Word'

(?<!\. ) is false but (?<!\.  ) is true, therefore the | is true,
therefore Word will be matched.

`two spaces.  Word'

(?<!\. ) is now true even though (?<!\.  ) is false, therefore once again
the | is true, therefore Word is matched.
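
To make the point concrete, here is a small sketch in Python rather than Perl (an assumption on my part that the behaviour carries over; the fixed-width negative lookbehind syntax is the same in both). The sample text and variable names are made up for illustration. It compares the alternation from the original post with one possible fix: writing the two lookbehinds one after the other, so that both have to succeed before the word is accepted.

```python
import re

# Hypothetical sample: "Then" follows a period and one space,
# "Nobody" follows a period and two spaces.
text = "We saw Perl. Then Isaac asked.  Nobody told Malcolm."

# Alternation, as in the original post. A word is rejected only if BOTH
# lookbehinds fail, which would need the preceding text to end in ". "
# and in ".  " at the same time. That cannot happen, so nothing is excluded.
alternation = re.compile(r"(?:(?<!\. )|(?<!\.  ))[A-Z][a-z]+")

# Sequence: both lookbehinds must succeed, so a word is rejected as soon
# as either ". " or ".  " comes right before it.
sequence = re.compile(r"(?<!\. )(?<!\.  )[A-Z][a-z]+")

matched_by_alternation = alternation.findall(text)
matched_by_sequence = sequence.findall(text)

print(matched_by_alternation)
print(matched_by_sequence)
```

With this sample the alternation returns every capitalized word, while the sequence drops "Then" and "Nobody". Note that "We" is still matched by both patterns, because nothing precedes it at the start of the string; excluding that case would need a separate check.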
 
