partially matching a regexp

Thomas Koenig

Assume I have a regexp, /^(hello)|(goodbye)$/ for example.

I want to see whether a particular string matches part of that
particular regexp, so "", "h", "he", "hel", "hell", "hello", "g",
"go", "goo", "good", "goodb", "goodby" and "goodbye" would be ok,
and anything else wouldn't.

I could hand-craft this example easily enough, but it grows
tedious and error-prone for more general regular expressions,
and automation would be much preferred.

Ideas?
 
Brian McCauley

Thomas said:
Assume I have a regexp, /^(hello)|(goodbye)$/ for example.

I want to see whether a particular string matches part of that
particular regexp, so "", "h", "he", "hel", "hell", "hello", "g",
"go", "goo", "good", "goodb", "goodby" and "goodbye" would be ok,
and anything else wouldn't.

I could hand-craft this example easily enough, but it grows
tedious and error-prone for more general regular expressions,
and automation would be much preferred.

Ideas?

Take a look at the source of File::Stream. A very similar problem is
solved in File::Stream::find and the solution to that problem could
probably be easily adapted (simplified!) to solve your problem.
 
Andrew Palmer

Thomas Koenig said:
Assume I have a regexp, /^(hello)|(goodbye)$/ for example.

I want to see whether a particular string matches part of that
particular regexp, so "", "h", "he", "hel", "hell", "hello", "g",
"go", "goo", "good", "goodb", "goodby" and "goodbye" would be ok,
and anything else wouldn't.

I could hand-craft this example easily enough, but it grows
tedious and error-prone for more general regular expressions,
and automation would be much preferred.

Ideas?

Something like:

if (test($input, "hello") || test($input, "goodbye"))
{
    # stuff
}

sub test
{
    my ($s1, $s2) = @_;
    for my $len (0 .. length($s2))
    {
        return 1 if (substr($s2, 0, $len) eq $s1);
    }
    return 0;
}
 
Charles DeRykus

Assume I have a regexp, /^(hello)|(goodbye)$/ for example.

I want to see whether a particular string matches part of that
particular regexp, so "", "h", "he", "hel", "hell", "hello", "g",
"go", "goo", "good", "goodb", "goodby" and "goodbye" would be ok,
and anything else wouldn't.

I could hand-craft this example easily enough, but it grows
tedious and error-prone for more general regular expressions,
and automation would be much preferred.

Ideas?

Sounds as if 'index' (perldoc -f index) might be easier and
in some cases faster than a regex:


my $substring = ...
for ( qw/hello goodbye/ ) {
    print "$substring matched $_\n" if index($_, $substring) != -1;
}
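
A small sketch of what index returns may help when reading the test above. The variable names are only illustrative. Note that a result other than -1 means the substring occurs somewhere in the word, while a result of 0 means it occurs at the very start, which is the case the original question asks about.

```perl
use strict;
use warnings;

# index(STR, SUBSTR) returns the zero-based position of the first
# occurrence of SUBSTR in STR, or -1 if it does not occur.
my $anywhere = index('hello', 'ell');   # found, but not at the start
my $prefix   = index('hello', 'hel');   # found at the start
my $missing  = index('hello', 'bye');   # not found
my $empty    = index('hello', '');      # the empty string is found at the start

print "ell: $anywhere, hel: $prefix, bye: $missing, empty: $empty\n";
# prints "ell: 1, hel: 0, bye: -1, empty: 0"
```

So a check with != -1 would accept "ell", whereas a check with == 0 accepts only leading parts such as "hel".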
 
Jay Tilton

: Assume I have a regexp, /^(hello)|(goodbye)$/ for example.
:
: I want to see whether a particular string matches part of that
: particular regexp, so "", "h", "he", "hel", "hell", "hello", "g",
: "go", "goo", "good", "goodb", "goodby" and "goodbye" would be ok,
: and anything else wouldn't.
:
: I could hand-craft this example easily enough, but it grows
: tedious and error-prone for more general regular expressions,
: and automation would be much preferred.

This sounds like an opportunity to abuse Perl's
(?(condition)pattern) regex feature.

#!/perl
use strict;
use warnings;

my $pat_h = buildpattern( 'hello' );
my $pat_g = buildpattern( 'goodbye' );

while (<DATA>) {
    chomp;
    print "matched '$1' in '$_'\n"
        if /$pat_h/ or /$pat_g/;
}

sub buildpattern {
    my @lets = $_[0] =~ /./g;
    my $pat;
    for ( 0 .. $#lets ) {
        $pat .= $_ == 0 ? $lets[$_] :
                $_ == 1 ? "($lets[$_])?" :
                          "((?($_)$lets[$_]))?"
    }
    return qr/(^$pat)/;
}

__DATA__
hello
helpme
haveaniceday
goodbye
goodgrief
gabbagabbahey
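
For literal words, a different way to get the same effect is to nest optional groups, so that each character is allowed only if every character before it matched. The sketch below is not the conditional-pattern approach above; prefix_pattern is a hypothetical helper name, and it only handles plain strings, not arbitrary regular expressions. It anchors at both ends so the whole input must be a leading part of one of the words.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a literal word into a pattern that matches
# any prefix of it, e.g. 'abc' becomes (?:a(?:b(?:c)?)?)?
sub prefix_pattern {
    my ($word) = @_;
    my $pat = '';
    # Build from the last character outward, wrapping what has been
    # built so far, so a character can match only after its predecessors.
    for my $char ( reverse split //, $word ) {
        $pat = '(?:' . quotemeta($char) . $pat . ')?';
    }
    return $pat;
}

my $hello   = prefix_pattern('hello');
my $goodbye = prefix_pattern('goodbye');
my $partial = qr/\A(?:$hello|$goodbye)\z/;

for my $input ( '', 'hel', 'goodb', 'hello', 'helpme', 'ello' ) {
    printf "%-8s %s\n", "'$input'", $input =~ $partial ? 'ok' : 'no';
}
```

With these inputs, the first four are reported as ok and the last two as no. quotemeta is used so that characters such as "." or "+" in a word are treated literally.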
 
Thomas Koenig

Take a look at the source of File::Stream. A very similar problem is
solved in File::Stream::find and the solution to that problem could
probably be easily adapted (simplified!) to solve your problem.

I'm currently doing that, and trying to understand what YAPE::Regex
does (which isn't too easy :)

If I get anywhere, I'll let the newsgroup know.
 
Jeff 'japhy' Pinyan

I'm currently doing that, and trying to understand what YAPE::Regex
does (which isn't too easy :)

I'd suggest switching over to Regexp::Parser. I'm hoping to get
Regexp::Explain out in the near future.

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
RPI Corporation Secretary % have long ago been overpaid?
http://japhy.perlmonk.org/ %
http://www.perlmonks.org/ % -- Meister Eckhart
 
Andrew Palmer

Andrew Palmer said:
Something like:

if (test($input, "hello") || test($input, "goodbye"))
{
    # stuff
}

sub test
{
    my ($s1, $s2) = @_;
    for my $len (0 .. length($s2))
    {
        return 1 if (substr($s2, 0, $len) eq $s1);
    }
    return 0;
}

Clearly the above code is needlessly complicated. The equivalent using index() is simply:

if (index("hello", $input) == 0 || index("goodbye", $input) == 0)
{
    # stuff
}

The regex tokenizing modules seem unnecessary for the example you posted. Is
your actual test significantly more complicated?
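
If the real test is just a longer list of literal words, the index() check can be wrapped in a small helper. This is only a sketch under that assumption; is_partial and @words are made-up names, and it does not handle general regular expressions.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $input is a leading part (possibly empty,
# possibly the whole word) of at least one of the literal words in @words.
sub is_partial {
    my ( $input, @words ) = @_;
    # grep in scalar context returns the number of words that qualify.
    return scalar grep { index( $_, $input ) == 0 } @words;
}

my @words = qw(hello goodbye);
for my $input ( '', 'h', 'hell', 'goodby', 'ell', 'hello!' ) {
    print "'$input': ", ( is_partial( $input, @words ) ? 'ok' : 'rejected' ), "\n";
}
```

Here the first four inputs are reported as ok, while "ell" and "hello!" are rejected.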
 
