Random underscores and a single regex

Bill

Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do it
with one m// statement.

Thanks.
 
Gunnar Hjalmarsson

Bill said:
Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do
it with one m// statement.

Why?? Please explain.

Without having seen the regex, I suppose that character classes are the
key.
 
Glenn Jackman

Bill said:
Anyone know of a _single regex_ for the following:

Match a string which has a single '_' character placed at random
within it. The string without the '_' is known, the position of the
'_' is not known.

I know one can strip out the '_' and then match, but I'd like to do it
with one m// statement.

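# quotemeta keeps characters such as '.' in the known string literal
my $regex = join('_?', map { quotemeta($_) } split(//, $known_string));
if ($some_text =~ /$regex/) {
    # $some_text contains the known string, possibly with underscore(s)
    # between its characters
}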
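
The pattern above accepts zero or several underscores. If the filter should accept exactly one, a lookahead can count the underscores before the body is matched. This is only a sketch: the helper name one_underscore_regex and the example addresses are made up for illustration, and it assumes the known string contains no '_' of its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build one pattern that matches
# $known with exactly one '_' inserted at any position.
sub one_underscore_regex {
    my ($known) = @_;
    # Optional '_' between every pair of literal characters.
    my $body = join '_?', map { quotemeta($_) } split //, $known;
    # The lookahead insists on exactly one '_' in the whole string; the body
    # then lets that '_' sit before, between, or after the known characters.
    return qr/\A(?=[^_]*_[^_]*\z)_?${body}_?\z/;
}

my $re = one_underscore_regex('spammer@example.com');
for my $from ('spam_mer@example.com', 'spammer@example.com', 'sp_am_mer@example.com') {
    printf "%-24s %s\n", $from, ($from =~ $re ? 'match' : 'no match');
}
```

Because the result is a single compiled pattern, it can be handed to a program that only accepts regexes.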
 
Bill

Gunnar said:
Why?? Please explain.

Without having seen the regex, I suppose that character classes are the
key.

It is for a mail filter: a spammer mutates the From by moving a _
character at random.
 
Gunnar Hjalmarsson

Bill said:
It is for a mail filter: a spammer mutates the From by moving a _
character at random.

Aha, thanks, but I meant why are you anxious to do it in one
statement? To me, using tr/// to remove it seems both more
straightforward and more efficient.
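
For comparison, the tr/// route might look roughly like this. The helper name matches_after_strip and the addresses are invented for the example, and the check for exactly one underscore is an assumption about what the filter wants.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true when $text is $known
# with exactly one '_' added somewhere.
sub matches_after_strip {
    my ($text, $known) = @_;
    my $underscores = ($text =~ tr/_//);    # no /d: counts, leaves $text alone
    (my $stripped = $text) =~ tr/_//d;      # copy, then delete every '_'
    return $underscores == 1 && $stripped eq $known;
}

my $ok = matches_after_strip('spam_mer@example.com', 'spammer@example.com');
printf "%s\n", ($ok ? 'match' : 'no match');
```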
 
Bill

Gunnar Hjalmarsson said:
Aha, thanks, but I meant why are you anxious to do it in one
statement? To me, using tr/// to remove it seems both more straight
forward and more efficient.

Yes, probably, but the underlying program wants a list of regexps.

The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities :).
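
Since the program takes a list of regexps, one option is to generate one anchored pattern per possible underscore position. A minimal sketch, assuming the filter accepts Perl-style patterns as plain strings; underscore_variants is a hypothetical name, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return one anchored pattern string for each place
# a single '_' could be inserted into $known.
sub underscore_variants {
    my ($known) = @_;
    return map {
        my $variant = $known;
        substr($variant, $_, 0, '_');          # insert '_' at offset $_
        '\A' . quotemeta($variant) . '\z';     # literal match, whole string
    } 0 .. length($known);
}

print "$_\n" for underscore_variants('abc');
```

A string of length n gives n + 1 patterns, so the list stays small for something the size of a From address.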
 
Jeff 'japhy' Pinyan

Yes, probably, but the underlying program wants a list of regexps.

The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities :).

Yeah. Here's an irregular expression:

if (subset("alphabet", "ape", my($extra))) {
    print "'ape' + '$extra' = 'alphabet'\n";
}
# or
# my $is_subset = subset("alphabet", "ape");

sub subset {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

It works with Perl 5.8.4; before that I can't be sure.
 
Jeff 'japhy' Pinyan

The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities :).

Yeah. Here's an irregular expression:

if (subset("alphabet", "ape", my($extra))) {
    print "'ape' + '$extra' = 'alphabet'\n";
}
# or
# my $is_subset = subset("alphabet", "ape");

sub subset {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

It's not just a "subset", it's a subset in order. Anyway, here are three
functions that do the same thing (as subset() above). The first is the
above, the other two are variations. in1() is the slowest, in3() is the
fastest. They're all the same number of bytes. Creepy. (I worked them
that way -- I know they can be golfed shorter, but I wanted all of them to
be the same length, and to be a nice rectangle.)

sub in1 {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}

sub in2 {
@_=(our$i=\$_[2],@_);my$R;$R=qr/(?{substr$_[2],$%,1,''})(
(??{''ne$^R&&"[^\Q$^R\E]*\Q$^R"}))(?{''ne$^R&&chop($$i.=$
1);$^R})(??{''ne$^R&&$R})(.*)(?{$$i.=$2})/sx;$_[1]=~/^$R/
}

sub in3 {
@_=(our$i=\$_[2],@_);$_[1]=~m{\A(?>\z|(?{"\Q@{[substr$_[2
],0,1,'']}"})((??{"[^$^R]*$^R"}))(?{''ne$1&&chop($$i.=$1)
})(?(?{$_[2]eq''})(?{$$i.=$'}).*))+(?(?{$_[2]ne''}).^)}xs
}
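
For readers who want the "subset in order" check without (?{...}) code blocks, an ordinary generated regex can do a similar job. This is a plain sketch, not a rewrite of the functions above: the name in_order is made up, it returns the leftover characters instead of writing into a third argument, and which leftovers it reports may differ from in1() when more than one alignment exists.

```perl
use strict;
use warnings;

# Hypothetical helper: if the characters of $needle occur in $haystack in
# order, return the leftover characters; otherwise return undef.
sub in_order {
    my ($haystack, $needle) = @_;
    # "ape" becomes (.*?)a(.*?)p(.*?)e ; each group captures a gap.
    my $pattern = join '', map { '(.*?)' . quotemeta($_) } split //, $needle;
    my @gaps = $haystack =~ /\A${pattern}(.*)\z/s
        or return;
    return join '', @gaps;
}

my $extra = in_order('alphabet', 'ape');
print "'ape' + '$extra' = 'alphabet'\n" if defined $extra;
```

The result is tested with defined because the leftover is the empty string when both arguments are equal.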
 
Jeff 'japhy' Pinyan

sub in3 {
@_=(our$i=\$_[2],@_);$_[1]=~m{\A(?>\z|(?{"\Q@{[substr$_[2
],0,1,'']}"})((??{"[^$^R]*$^R"}))(?{''ne$1&&chop($$i.=$1)
})(?(?{$_[2]eq''})(?{$$i.=$'}).*))+(?(?{$_[2]ne''}).^)}xs
}

It turns out this one is broken. :(
 
