How do I turn a list of strings into a list of regexps?

Mahurshi Akilla

I want to turn a list of strings into a list of regexps. Is there a
built-in module or an easy algorithm to do this?

******original_list******

string1
string2
string23
string24
anotherstring2
anothertstring5

******processed_compressed_list******

string*
anotherstring*
 
Jürgen Exner

Mahurshi Akilla said:
I want to turn a list of strings into a list of regexps. Is there a
built-in module or an easy algorithm to do this?

I don't understand your question.

Regular expressions are (double-quoted) strings. What makes them
regular expressions is that they are used as a particular argument of
a particular operator or function, such as the first argument of s, m,
or split. There is no property or scalar class of "regexp", and you
cannot "turn a string into a regexp".

jue
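
The point about a string acting as a pattern only where it is used can be illustrated with a minimal sketch. The variable names and sample data here are made up for the example and do not come from the thread.

```perl
use strict;
use warnings;

# A plain string holding pattern text (hypothetical example data).
my $pattern = '^string\d+$';

# The string is interpolated into the match operator; only there is it
# treated as a regular expression.
my @matches = grep { /$pattern/ } qw(string1 string23 anotherstring2);

print "@matches\n";   # prints "string1 string23"
```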
 
Tad J McClellan

Mahurshi Akilla said:
I want to turn a list of strings into a list of regexps. Is there a
built-in module or an easy algorithm to do this?

******original_list******

string1
string2
string23
string24
anotherstring2
anothertstring5
       ^
       ^

******processed_compressed_list******

string*
anotherstring*


All of these also match all of the strings above...

/^string/
/^another/

/string/

/s/

/\d/

/.*/

Which is "more correct"?

Why?
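
A quick way to check the claim that every one of these candidates covers the whole list is sketched below. It assumes the corrected spelling "anotherstring5" and treats the two anchored patterns as one set, where a string counts as covered if at least one pattern in the set matches it.

```perl
use strict;
use warnings;

my @strings = qw(string1 string2 string23 string24 anotherstring2 anotherstring5);

# Candidate pattern sets, in the order they appear in the post.
my @candidates = (
    [ qr/^string/, qr/^another/ ],
    [ qr/string/ ],
    [ qr/s/ ],
    [ qr/\d/ ],
    [ qr/.*/ ],
);

my @covers_all;
for my $set (@candidates) {
    # Count the strings that no pattern in this set matches.
    my $uncovered = grep {
        my $s = $_;
        !grep { $s =~ $_ } @$set;
    } @strings;
    push @covers_all, $uncovered == 0 ? 1 : 0;
}

print "@covers_all\n";   # prints "1 1 1 1 1"
```

Since all five sets cover the input, coverage alone cannot decide which result is the "right" one; some extra rule is needed about which part of each string is allowed to vary.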
 
Mahurshi Akilla

Ben said:
Those regexes you list will match a lot more than just the strings you
have provided. How are you choosing which bits are important?

If you have a set of regexes and you want to combine them,
Regexp::Assemble may be what you want.

Ben

Thanks. This is similar to what I am looking for but not quite the
same. I do find this module interesting though.

http://search.cpan.org/dist/Regexp-Assemble/Assemble.pm#LIMITATIONS
Regexp::Assemble does not attempt to find common substrings. For
instance, it will not collapse /cabababc/ down to /c(?:ab){3}c/. If
there's a module out there that performs this sort of string analysis
I'd like to know about it. But keep in mind that the algorithms that
do this are very expensive: quadratic or worse.

Basically, what I am looking for is something that can do this string
"collapsing" for a list of strings. I am fine if it just puts a "*"
for the most common stuff. It doesn't have to be all that accurate,
and it is okay if it matches other stuff as well.

If there is nothing out there, I will probably have to write my own
algorithm. I was just hoping I don't have to reinvent the wheel. :)
 
Mahurshi Akilla

Ben said:
In fact, looking at it again, neither of the patterns given will match
any of the given strings... they only match things like

    strin
    string
    stringg
    stringgg

:)

Ben


Assume there is a ^ in front of every regexp

string
string1
string2
anotherstring2
anotherstring4234

should return something like:

string*
anotherstring*


Again, for "*" .. I don't care if it returns .* or * or [0-9]+ .. I am
only looking for a "collapsed" list of fewer elements. I believe
simply returning a "*" would be easier and is good enough for what I
am trying to do. :)
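
One very simple way to get this kind of collapsed list is sketched below. It rests on a strong assumption that is not stated anywhere in the thread: the varying part of each string is always a trailing run of digits. The sub name collapse_trailing_digits is made up for this example.

```perl
use strict;
use warnings;

# Collapse strings that differ only in a trailing run of digits.
# Assumption: the variable part is always trailing digits, as in the
# examples in this thread; other kinds of variation are not detected.
sub collapse_trailing_digits {
    my @strings = @_;
    my (%seen, @patterns);
    for my $s (@strings) {
        (my $stem = $s) =~ s/\d+\z//;   # drop trailing digits, if any
        next if $seen{$stem}++;         # keep each stem only once
        push @patterns, "${stem}*";
    }
    return @patterns;
}

my @collapsed = collapse_trailing_digits(
    qw(string string1 string2 anotherstring2 anotherstring4234)
);
print "$_\n" for @collapsed;   # prints "string*" then "anotherstring*"
```

The result is a list of glob-style patterns in first-seen order. Anything that varies elsewhere in the string, such as a differing prefix, would need a real common-substring analysis instead.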
 
Bart Lateur

Mahurshi said:
I want to turn a list of strings into a list of regexps. Is there a
built-in module or an easy algorithm to do this?

******original_list******

string1
string2
string23
string24
anotherstring2
anothertstring5

******processed_compressed_list******

string*
anotherstring*

That looks more like a list of glob patterns than like a list of
regexps.

Anyway, there's more than one module that does something somewhat like
what you want, for example, Regex::PreSuf.

Test code:

use Regex::PreSuf;
print presuf(qw(
string1
string2
string23
string24
anotherstring2
anotherstring5
));

Output:

(?:anotherstring[25]|string(?:2[34]|[12]))
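
To confirm that this pattern covers exactly the input list, it can be tried directly without the module. The sketch below copies the pattern text from the output above and anchors it at both ends, which is an addition for this check, so that only whole-string matches count.

```perl
use strict;
use warnings;

# Pattern text copied from the output above, with anchors added.
my $re = qr/\A(?:anotherstring[25]|string(?:2[34]|[12]))\z/;

my @inputs = qw(string1 string2 string23 string24 anotherstring2 anotherstring5);
my @missed = grep { $_ !~ $re } @inputs;

print @missed ? "not matched: @missed\n" : "all inputs matched\n";

# A string outside the original list is rejected by the anchored form.
print "string3 ", ("string3" =~ $re ? "matches" : "does not match"), "\n";
```

Unlike the "string*" style, this pattern is exact: it accepts the six inputs and nothing else, which is why some postprocessing is needed to get a looser, collapsed form.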


You could choose to postprocess it, replacing the character classes with
a simpler pattern, or you could use the source of the module as a basis
to write your own routine.
 
