random underslashes and single regex

Bill · Jun 16, 2004

Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do it
with one m// statement.

Thanks.

Gunnar Hjalmarsson · Jun 16, 2004

Bill said:
Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do
it with one m// statement.

Why?? Please explain.

Without having seen the regex, I suppose that character classes are the
key.

Glenn Jackman · Jun 16, 2004

Bill said:
Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do it
with one m// statement.

my $regex = join('_?', map { quotemeta } split(//, $known_string));
if ($some_text =~ /$regex/) {
    # $some_text contains the known string, possibly with underscore(s)
}
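
The join above also matches when the text has several underscores, or none at all. If exactly one underscore is required, a lookahead can count it before the body matches. This is only a minimal sketch: it assumes the whole string is being matched (not a substring) and that the known string contains no underscore of its own. The name one_underscore_regex is a hypothetical helper, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex matching $known with exactly one '_'
# inserted at any position, including the very start or the very end.
sub one_underscore_regex {
    my ($known) = @_;
    # Optional '_' before, between, and after the escaped characters.
    my $body = join '_?', '', (map { quotemeta } split //, $known), '';
    # The lookahead insists on exactly one '_' in the whole string.
    return qr/\A(?=[^_]*_[^_]*\z)$body\z/;
}

my $re = one_underscore_regex('spammer');
for my $from (qw(spam_mer _spammer spammer sp_am_mer)) {
    printf "%-10s %s\n", $from, $from =~ $re ? 'match' : 'no match';
}
```

Since the result is a single qr// object, it can be dropped into a list of regexps like any other pattern.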

Bill · Jun 17, 2004

Gunnar said:
Why?? Please explain.

Without having seen the regex, I suppose that character classes are the
key.

It is for a mail filter: a spammer mutates the From by moving a _
character at random.

Gunnar Hjalmarsson · Jun 17, 2004

Bill said:
It is for a mail filter: a spammer mutates the From by moving a _
character at random.

Aha, thanks, but I meant why are you anxious to do it in one
statement? To me, using tr/// to remove it seems both more
straightforward and more efficient.
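
For reference, the tr/// route could look roughly like the sketch below. It assumes the comparison happens in ordinary Perl code rather than inside a filter that only accepts patterns; matches_with_one_underscore is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $from is $known with exactly one '_' added.
sub matches_with_one_underscore {
    my ($from, $known) = @_;
    # tr/_// with an empty replacement list and no /d only counts.
    return 0 unless ($from =~ tr/_//) == 1;
    # Work on a copy so the caller's string is left untouched.
    (my $stripped = $from) =~ tr/_//d;
    return $stripped eq $known ? 1 : 0;
}

for my $from (qw(spam_mer spammer sp_am_mer)) {
    printf "%-10s %s\n", $from,
        matches_with_one_underscore($from, 'spammer') ? 'match' : 'no match';
}
```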

Bill · Jun 17, 2004

Gunnar Hjalmarsson said:
Aha, thanks, but I meant why are you anxious to do it in one
statement? To me, using tr/// to remove it seems both more
straightforward and more efficient.

Yes, probably, but the underlying program wants a list of regexps.

The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities.
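
Since the program wants a list of regexps and a single moving underscore has only length + 1 possible positions, the variation can also be enumerated outright. The following is just a sketch under that assumption; underscore_variants is a hypothetical name and the patterns are anchored to the whole string.

```perl
use strict;
use warnings;

# Hypothetical helper: one anchored regex per possible '_' position.
sub underscore_variants {
    my ($known) = @_;
    my @regexes;
    for my $pos (0 .. length($known)) {
        my $variant = $known;
        substr($variant, $pos, 0) = '_';    # insert '_' at offset $pos
        push @regexes, qr/\A\Q$variant\E\z/;
    }
    return @regexes;
}

my @list = underscore_variants('spam');
print scalar(@list), " patterns generated\n";
print "s_pam is covered\n" if grep { 's_pam' =~ $_ } @list;
```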

Jeff 'japhy' Pinyan · Jun 17, 2004

Bill said:
Yes, probably, but the underlying program wants a list of regexps.

The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities.

Yeah. Here's an irregular expression:

if (subset("alphabet", "ape", my($extra))) {
    print "'ape' + '$extra' = 'alphabet'\n";
}
# or
# my $is_subset = subset("alphabet", "ape");

sub subset {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

It works with Perl 5.8.4; before that I can't be sure.

Jeff 'japhy' Pinyan · Jun 18, 2004

Bill said:
The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities.

Yeah. Here's an irregular expression:

if (subset("alphabet", "ape", my($extra))) {
    print "'ape' + '$extra' = 'alphabet'\n";
}
# or
# my $is_subset = subset("alphabet", "ape");

sub subset {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

It's not just a "subset", it's a subset in order. Anyway, here are three
functions that do the same thing (as subset() above). The first is the
above, the other two are variations. in1() is the slowest, in3() is the
fastest. They're all the same number of bytes. Creepy. (I worked them
that way -- I know they can be golfed shorter, but I wanted all of them to
be the same length, and to be a nice rectangle.)

sub in1 {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

sub in2 {
@_=(our$i=\$_[2],@_);my$R;$R=qr/(?{substr$_[2],$%,1,''})(
(??{''ne$^R&&"[^\Q$^R\E]*\Q$^R"}))(?{''ne$^R&&chop($$i.=$
1);$^R})(??{''ne$^R&&$R})(.*)(?{$$i.=$2})/sx;$_[1]=~/^$R/
}

sub in3 {
@_=(our$i=\$_[2],@_);$_[1]=~m{\A(?>\z|(?{"\Q@{[substr$_[2
],0,1,'']}"})((??{"[^$^R]*$^R"}))(?{''ne$1&&chop($$i.=$1)
})(?(?{$_[2]eq''})(?{$$i.=$'}).*))+(?(?{$_[2]ne''}).^)}xs
}
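
For checking the regex versions against something readable, an ordinary loop can do the same "subset in order" test. This is only a sketch, not one of the functions above: in_plain is a hypothetical name, and it takes the leftmost possible match for each wanted character, so when several decompositions exist the extra characters it reports may differ from what a backtracking regex finds.

```perl
use strict;
use warnings;

# Hypothetical reference version: true if $want occurs in $text as an
# in-order subsequence; the leftover characters are stored in $_[2].
sub in_plain {
    my ($text, $want) = @_;
    my ($extra, $i) = ('', 0);
    for my $ch (split //, $text) {
        if ($i < length($want) && $ch eq substr($want, $i, 1)) {
            $i++;                 # consumed the next wanted character
        }
        else {
            $extra .= $ch;        # anything else counts as extra
        }
    }
    return 0 unless $i == length($want);
    $_[2] = $extra if @_ > 2;     # @_ aliases the caller's variable
    return 1;
}

if (in_plain("alphabet", "ape", my $extra)) {
    print "'ape' + '$extra' = 'alphabet'\n";   # prints 'ape' + 'lhabt' = 'alphabet'
}
```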

Jeff 'japhy' Pinyan · Jun 18, 2004

sub in3 {
@_=(our$i=\$_[2],@_);$_[1]=~m{\A(?>\z|(?{"\Q@{[substr$_[2
],0,1,'']}"})((??{"[^$^R]*$^R"}))(?{''ne$1&&chop($$i.=$1)
})(?(?{$_[2]eq''})(?{$$i.=$'}).*))+(?(?{$_[2]ne''}).^)}xs
}

It turns out this one is broken.
