The replies I have seen tend to support a suspicion of mine: _regular_
expressions have problems dealing with even minor amounts of truly
_random_ variation. I guess that is probably due to the fact they deal
with regularities.
Yeah. Here's an irregular expression:
if (subset("alphabet", "ape", my($extra))) {
print "'ape' + '$extra' = 'alphabet'\n";
}
# or
# my $is_subset = subset("alphabet", "ape");
sub subset {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}
It's not just a "subset", it's a subset in order (a subsequence). Anyway,
here are three functions that do the same thing as subset() above. The
first is subset() itself; the other two are variations. in1() is the
slowest, in3() is the fastest. They're all the same number of bytes.
Creepy. (I worked them that way -- I know they can be golfed shorter, but
I wanted all of them to be the same length, and to form a nice rectangle.)
sub in1 {
$_[0]=~/^(?{[("")x2]})(.*)(?{[$^R->[0].$1,$^R->[1]]})(?:(
.)(?{[$^R->[0],$^R->[1].$2]})(.*)(?{[$^R->[0].$3,$^R->[1]
]}))*\z(?(?{$^R->[0]eq$_[1]})(?{$_[2]=$^R->[1]})|(?!))/sx
}
sub in2 {
@_=(our$i=\$_[2],@_);my$R;$R=qr/(?{substr$_[2],$%,1,''})(
(??{''ne$^R&&"[^\Q$^R\E]*\Q$^R"}))(?{''ne$^R&&chop($$i.=$
1);$^R})(??{''ne$^R&&$R})(.*)(?{$$i.=$2})/sx;$_[1]=~/^$R/
}
sub in3 {
@_=(our$i=\$_[2],@_);$_[1]=~m{\A(?>\z|(?{"\Q@{[substr$_[2
],0,1,'']}"})((??{"[^$^R]*$^R"}))(?{''ne$1&&chop($$i.=$1)
})(?(?{$_[2]eq''})(?{$$i.=$'}).*))+(?(?{$_[2]ne''}).^)}xs
}
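For anyone who would rather not untangle the rectangles, here is a rough plain-loop sketch of what the three functions compute. subset_plain() is a made-up name, not one of the functions above, and it assumes that a simple leftmost scan is an acceptable reading of "subset in order". When the characters can be split in more than one way, the leftover it reports may differ from what the regex versions find.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): a plain-loop version of the
# same check. Returns true if the characters of $needle appear in $haystack
# in order. If a third argument is passed, it is set to the characters of
# $haystack that were left over.
sub subset_plain {
    my ($haystack, $needle) = @_;
    my @want  = split //, $needle;
    my $extra = '';
    for my $ch (split //, $haystack) {
        if (@want && $ch eq $want[0]) {
            shift @want;        # consumed the next character of $needle
        } else {
            $extra .= $ch;      # not needed at this point, so it is "extra"
        }
    }
    return 0 if @want;          # part of $needle was never found
    $_[2] = $extra if @_ > 2;   # write back through the @_ alias, like subset()
    return 1;
}

if (subset_plain("alphabet", "ape", my $extra)) {
    print "'ape' + '$extra' = 'alphabet'\n";   # 'ape' + 'lhabt' = 'alphabet'
}
```

It has none of the charm and is not a rectangle, but it makes the contract visible: a true/false result, plus the leftover characters written into the caller's third argument.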