regexp to sql wildcard conversion

N

Neil Shadrach

I have a regular expression to filter results from an sql query.
It would help improve the overall efficiency if the result set could
be reduced by adding constraints to the query based on the regular
expression.
For example given the expression /^a (fatal|critical) error has
occurred$/
add "like 'a % error has occurred'" to the sql.
Are there any modules [1] or techniques which can derive an sql
wildcarded string from a regular expression? The regular expression
will still be applied to the results so it doesn't matter that the
wildcard string will match more.

The context is searching a historical database of error messages so
fixed strings with a few embedded variables are the norm.

Apologies if this message appears twice - I first posted it via an
internal news server last week but it doesn't appear to have got out
into the wider world.

Thanks

Neil Shadrach

[1] I have looked on CPAN but didn't find anything
 
P

pkent

For example given the expression /^a (fatal|critical) error has
occurred$/
add "like 'a % error has occurred'" to the sql.

Interestingly (as I often say...) I've got some code at work that could
do with this kind of "best effort" regex->SQL conversion but doesn't yet
do it. I'm assuming your database doesn't support any kind of regexes,
and that it supports the % and _ wildcards.

My idea would be this sort of conversion (pseudocode):

if the first token is a '^'
remove that token
else
sql = '%'
end unless

if the last token is a '$'
remove that token
else
sqlend = '%'
end unless

foreach token in the regex
if the token is a constant character
sql .= escaped form of that character
else if the token represents exactly one unknown character
sql .= '_'
else
sql .= '%'
end if
end foreach
sql .= sqlend


Now, I haven't tested that at all but it looks like it would do the
right thing. The tricky bit is turning the regex into a string of tokens
- maybe this would be of use there:

http://search.cpan.org/~pinyan/Regexp-Parser-0.10/lib/Regexp/Parser.pm
(although you might not need to walk the whole regex tree, just to a
depth of one level, to determine what to do to each token?)

P
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,185
Members
46,736
Latest member
AdolphBig6

Latest Threads

Top