Regexp.escape with un-escapes

Intransition

Hi--

I want to translate a string into a regular expression, but I want to
"un-escape" portions as raw regexp. For example:

"here is a setting 'a' equal to ((\d+))"

So I want to Regexp.escape the string, before I pass it to Regexp.new,
but I want what's in the (( )) to stay exactly the same with the
double-parens removed.

I know there has to be a fairly concise way to do this, but all I've
come up with is some very ugly brute force code that iterates back and
forth using index '((' and index '))'.

Any suggestions?
 
Marnen Laibow-Koser

Thomas said:
Any suggestions?

Why, yes! Use a regexp to find the (( )) bits, and then extract them,
don't escape them, and paste the whole regexp together. It's a bit
ironic to me that in a regexp-handling routine, you're doing brute-force
index searches. :)
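
A minimal sketch of that idea, assuming the (( )) markers never nest and that each raw chunk should end up as a capture group. The method name template_to_regexp is made up for illustration and does not come from this thread.

```ruby
# Hypothetical helper: escape everything except the ((...)) chunks.
def template_to_regexp(template)
  source = String.new
  rest = template
  # Repeatedly find the next ((...)) chunk with a regexp.
  while (m = rest.match(/\(\((.*?)\)\)/))
    source << Regexp.escape(m.pre_match)  # literal text before the chunk
    source << "(#{m[1]})"                 # raw regexp, kept as one group
    rest = m.post_match
  end
  source << Regexp.escape(rest)           # literal text after the last chunk
  Regexp.new(source)
end

re = template_to_regexp("here is a setting 'a' equal to ((\\d+))")
puts re.match("here is a setting 'a' equal to 42")[1]
```

Because only the text outside the markers goes through Regexp.escape, literal parentheses elsewhere in the string are still matched literally.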

Best,
-- 
Marnen Laibow-Koser
http://www.marnen.org
 
Intransition

Why, yes! Use a regexp to find the (( )) bits, and then extract them,
don't escape them, and paste the whole regexp together. It's a bit
ironic to me that in a regexp-handling routine, you're doing brute-force
index searches. :)

You are right, it is ironic! Your suggestion helped some. I came up
with:

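def __when_string_to_regexp(str)
  rexps = []
  # Pull out each ((...)) chunk, leaving a "%s" placeholder behind.
  str = str.gsub(/\(\((.*?)\)\)/) do |m|
    rexps << '(' + $1 + ')'
    "%s"
  end
  str = Regexp.escape(str)
  # Put the raw regexp chunks back in place of the placeholders.
  str = str % rexps
  # Let any run of escaped spaces match any run of whitespace.
  str = str.gsub(/(\\\ )+/, '\s+')
  Regexp.new(str, Regexp::IGNORECASE)
end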

It's still not perfect because it means end users won't be able to use
"%s" in a string without it messing up. Stitching it back together
seems to be the hard part, but I'll keep working on it.
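
One possible way around the "%s" clash, as a sketch only: double every literal "%" in the same pass that inserts the placeholders, so that String#% turns "%%" back into a single "%". This assumes Regexp.escape leaves "%" untouched. The method name when_string_to_regexp_safe is hypothetical, and the whitespace-loosening step is left out to keep the example short.

```ruby
# Hypothetical variant of the method above that tolerates "%" in the input.
def when_string_to_regexp_safe(str)
  raw = []
  templ = str.gsub(/\(\((.*?)\)\)|%/) do |m|
    if m == "%"
      "%%"                                   # literal percent, protected
    else
      raw << "(#{Regexp.last_match(1)})"     # stash the raw regexp chunk
      "%s"                                   # placeholder for String#%
    end
  end
  # Escape the literal text, then drop the raw chunks into the placeholders.
  source = Regexp.escape(templ) % raw
  Regexp.new(source, Regexp::IGNORECASE)
end

re = when_string_to_regexp_safe("rate is %s at ((\\d+))%")
puts re.match("Rate is %s at 95%")[1]
```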

Thanks.
 
Brian Candler

There is a feature in String#split (documented in 1.9, undocumented in
1.8.6 but still present) whereby if the split RE contains capture
groups, those capture groups will be included in the resulting array.
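
>> s = "here is (a) setting 'a' equal to ((\\d+))"
=> "here is (a) setting 'a' equal to ((\\d+))"
>> Regexp.new(s.split(/(\(\(.*?\)\))/).map { |x| x =~ /\A\((\(.*\))\)\z/ ? $1 : Regexp.escape(x) }.join)
=> /here\ is\ \(a\)\ setting\ 'a'\ equal\ to\ (\d+)/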
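
To make the split behaviour easier to see, here is a small sketch that prints the intermediate array and wraps the technique in a method. The name split_template_to_regexp is invented for this example; it assumes the (( )) markers do not nest.

```ruby
# Hypothetical helper; the technique is the String#split call itself.
def split_template_to_regexp(template)
  # The pattern has a capture group, so split keeps the ((...)) delimiters
  # in the result alongside the literal text between them.
  pieces = template.split(/(\(\(.*?\)\))/)
  source = pieces.map do |piece|
    if piece =~ /\A\(\((.*)\)\)\z/
      "(#{Regexp.last_match(1)})"   # raw chunk: keep it as a capture group
    else
      Regexp.escape(piece)          # literal text
    end
  end.join
  Regexp.new(source)
end

template = "here is (a) setting 'a' equal to ((\\d+))"
p template.split(/(\(\(.*?\)\))/)
# prints ["here is (a) setting 'a' equal to ", "((\\d+))"]
```

Each array element is either a whole (( )) chunk or plain text, so a single map is enough to decide what to escape.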
 
Intransition

There is a feature in String#split (documented in 1.9, undocumented in
1.8.6 but still present) whereby if the split RE contains capture
groups, those capture groups will be included in the resulting array.


>> s = "here is (a) setting 'a' equal to ((\\d+))"
=> "here is (a) setting 'a' equal to ((\\d+))"
>> Regexp.new(s.split(/(\(\(.*?\)\))/).map { |x| x =~ /\A\((\(.*\))\)\z/ ? $1 : Regexp.escape(x) }.join)
=> /here\ is\ \(a\)\ setting\ 'a'\ equal\ to\ (\d+)/

Weird, but cool. Does the trick. Thanks.
 
