Regular Expression Problem

Kryten

Hi,
I'd be grateful for assistance with a regular expression problem I
have.

My text takes the format of:

aaabbbcccorangesbananashhhjjjapples
qqqqyy333366sssssorangesbananasuuu555449

What I want is to match bananas but only when immediately preceded by
oranges.
So ggggggggeeeeeeeorangestbananasrrrrr would not match. Neither would
ffffiiiiiiiiiiiqa2332332orangebananasjjj33yyttti

Is this positive lookbehind? Or am I getting my regex terms mixed up?

Thanks

Stuart
 
Dave B

Kryten said:
Hi,
I'd be grateful for assistance with a regular expression problem I
have.

My text takes the format of:

aaabbbcccorangesbananashhhjjjapples
qqqqyy333366sssssorangesbananasuuu555449

What I want is to match bananas but only when immediately preceded by
oranges.
So ggggggggeeeeeeeorangestbananasrrrrr would not match. Neither would
ffffiiiiiiiiiiiqa2332332orangebananasjjj33yyttti

Is this positive lookbehind? Or am I getting my regex terms mixed up?

Yes, you need something like

/(?<=oranges)bananas/

(tweak as needed)
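
To check the suggestion against the sample lines from the question, here is a small sketch. It uses Python's `re` module rather than Perl (the lookbehind syntax is the same for a fixed string like this), and the `samples` list and variable names are only for illustration.

```python
import re

# Match "bananas" only when the text immediately before it is "oranges".
pattern = re.compile(r"(?<=oranges)bananas")

samples = [
    "aaabbbcccorangesbananashhhjjjapples",               # should match
    "qqqqyy333366sssssorangesbananasuuu555449",          # should match
    "ggggggggeeeeeeeorangestbananasrrrrr",               # a "t" sits in between
    "ffffiiiiiiiiiiiqa2332332orangebananasjjj33yyttti",  # "orange" without the "s"
]

# True where the pattern is found, False otherwise.
results = [bool(pattern.search(s)) for s in samples]
for s, ok in zip(samples, results):
    print(ok, s)
```

The lookbehind only checks what comes before; the matched text itself is just "bananas".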
 
Kryten

Hi,

To be completely truthful with you...nothing at all.

Actually I'm learning PowerShell, but I had noticed that the denizens of
this group seem to really know their regular expressions, so I thought
I'd ask here as the concepts seem pretty similar.

I'm trying to figure out a way to break a text string into smaller
text strings, where a simple .split() method won't do it.

Eg. If I have (forgive the $ prefix, but it's the only way I know to
express it!)

$a = "tty01 17 07 08 15:45 sch0999 some more text here"

I want to find a way to carve off "tty01 17 07 08 15:45" into one
variable.
"sch0999" into another variable (this is the important one) and lastly
"some more text here" into the last variable.

So that I can then go back and process the "sch0999" string later.

Now, I will never know what sch0999 will look like exactly but I can
describe what it will look like as the regex : "\w{3}\d{4}"
and while I will never know what is going to precede it exactly, I do
know that it will be the time plus whitespace : "\d{2}:\d{2}.+".

I don't know much about regex at all! But I had *heard* of positive
lookbehind and it sounded about right for what I'll need to do, that
is do a positive lookbehind for the first regex on the basis the
second can be matched as well.

I haven't finished "Mastering Regular Expressions" yet .. so if I'm
getting this all wrong please be gentle!

Thanks,

Stuart
 
Tad J McClellan

Leon Timmermans said:
LOL. Yeah, since there is no regular expressions newsgroup this would
probably be the one where you'll get most response on such a question.


The easiest regexp for that would be: /^(.*?) (\w{3}\d{4}) (.*)/ . No
lookbehinds needed for that.
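
As a rough check of that pattern on the sample line from the question, here is a sketch in Python's `re` module (not Perl, though the pattern is unchanged). The variable names `prefix`, `ident` and `rest` are made up for the example.

```python
import re

line = "tty01 17 07 08 15:45 sch0999 some more text here"

# Lazy prefix, then the ID (three word characters plus four digits), then the rest.
m = re.match(r"^(.*?) (\w{3}\d{4}) (.*)", line)
if m:
    prefix, ident, rest = m.groups()
    print(repr(prefix), repr(ident), repr(rest))
```

The three captures line up with the three variables asked for: everything up to the time, the ID, and the trailing text.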


But what if the pattern matches "too early", as with:

tty0001 17 07 08 15:45 sch0999 other stuff

...then we _are_ back to needing look-around.
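
A small sketch of the "too early" concern, using Python's `re` module rather than Perl. It assumes the ID always directly follows the HH:MM time and a single space, as described in the question; the variable names are illustrative only.

```python
import re

line = "tty0001 17 07 08 15:45 sch0999 other stuff"

# Unanchored search: \w also matches digits, so "tty0001" fits \w{3}\d{4}.
early = re.search(r"\w{3}\d{4}", line).group()

# Fixed-width lookbehind: the ID must directly follow "HH:MM" and one space.
wanted = re.search(r"(?<=\d{2}:\d{2} )\w{3}\d{4}", line).group()

# Same idea without look-around: make the time part of the first capture.
parts = re.match(r"^(.*?\d{2}:\d{2})\s+(\w{3}\d{4})\s+(.*)$", line).groups()

print(early, wanted, parts)
```

On this particular line the anchored pattern with literal spaces still happens to split correctly, because it requires a space before the ID. The early match shows up when the ID pattern is searched on its own.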
 
Dave B

Kryten said:
Eg. If I have (forgive the $ prefix, but it's the only way I know to
express it!)

$a = "tty01 17 07 08 15:45 sch0999 some more text here"

I want to find a way to carve off "tty01 17 07 08 15:45" into one
variable.
"sch0999" into another variable (this is the important one) and lastly
"some more text here" into the last variable.

So that I can then go back and process the "sch0999" string later.

Now, I will never know what sch0999 will look like exactly but I can
describe what it will look like as the regex : "\w{3}\d{4}"
and while I will never know what is going to precede it exactly, I do
know that it will be the time plus whitespace : "\d{2}:\d{2}.+".

I don't know much about regex at all! But I had *heard* of positive
lookbehind and it sounded about right for what I'll need to do, that
is do a positive lookbehind for the first regex on the basis the
second can be matched as well.

Ok. In your first post you mentioned lookbehind, so I gave you an example.
But as others have already said, you may be able to do what you want to
without using lookaround, which is certainly more efficient.
Also, if it turns out that you absolutely need lookbehind (seems unlikely),
keep in mind that Perl (AFAIK) does not support variable-length lookbehind,
so you can't say for instance

/(?<=\d+)bananas/

but again, you probably don't need lookbehind at all in your case.
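
For anyone who wants to see this kind of restriction in action, here is a sketch in Python, whose standard `re` module also insists on fixed-width lookbehind. It illustrates the limitation in general and is not a statement about Perl's exact error behaviour; the names `compiled` and `fixed` are just for the example.

```python
import re

# A lookbehind containing \d+ has no fixed width, so compilation fails.
try:
    re.compile(r"(?<=\d+)bananas")
    compiled = True
except re.error as exc:
    compiled = False
    print("rejected:", exc)

# A fixed-width lookbehind (exactly three digits here) is accepted.
fixed = re.compile(r"(?<=\d{3})bananas")
print(fixed.search("abc123bananas").group())
```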
 
Ben Morrow

Quoth Dave B said:
Ok. In your first post you mentioned lookbehind, so I gave you an example.
But as others have already said, you may be able to do what you want to
without using lookaround, which is certainly more efficient.
Also, if it turns out that you absolutely need lookbehind (seems unlikely),
keep in mind that Perl (AFAIK) does not support variable-length lookbehind,
so you can't say for instance

/(?<=\d+)bananas/

but again, you probably don't need lookbehind at all in your case.

With 5.10, or if you use Regexp::Keep, you can say

/\d+\Kbananas/

to get the most useful case of variable-length look-behind (at the
beginning of the match).
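
Where \K is not available, a capture group gives a similar effect. The sketch below is in Python, whose standard `re` module has no \K, so this emulates the idea and is not the Perl feature itself. The sample text and variable names are assumptions for the example.

```python
import re

text = "abc123bananas"

# Match the digits as well, but capture only the part to treat as "the match".
m = re.search(r"\d+(bananas)", text)
kept = m.group(1)    # the captured word
start = m.start(1)   # where that word begins in text

# For substitution, put the digits back with a group reference.
replaced = re.sub(r"(\d+)bananas", r"\g<1>apples", text)

print(kept, start, replaced)
```

The digits still have to be present for a match, but only the captured word is used or replaced.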

Ben
 
Dave B

Ben said:
With 5.10, or if you use Regexp::Keep, you can say

/\d+\Kbananas/

to get the most useful case of variable-length look-behind (at the
beginning of the match).

Ah thanks, I haven't looked into 5.10 yet, and I didn't know Regexp::Keep
either.
 
