Regexp riddle; escaping escapes

Phlip

Rubies:

Someone didn't escape their & in their HTML correctly. Let's fix it.

This regexp correctly does not escape &dude, because we only want to escape
raw & markers:

p "yo &dude".gsub(/&([^a-z])/i, '&amp;\1')

That passed "yo &dude" thru unchanged. (I am aware "dude" has no ; on the
end; we are leaving that optional, for whatever reason...)

Now escape & followed by a non-alphabetic character:

p "yo & dude".gsub(/&([^a-z])/i, '&amp;\1')

That correctly provides: "yo &amp; dude"

Now how to escape "yo && dude"? Note that the ([^a-z]) consumes the second
&, leading to this incorrect output:

"yo &amp;& dude"
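
A small sketch of why that happens, for illustration only: scan returns the text of each match, and the consuming form of the pattern (written here without the capture group) finds just one match in "yo && dude", because it swallows both ampersands at once. The variable names are made up for this example.

```ruby
# Illustrative only: show what the consuming pattern actually matches.
input = "yo && dude"

# Without a capture group, scan returns the full text of each match.
consumed = input.scan(/&[^a-z]/i)
p consumed   # => ["&&"]  (a single match covering both ampersands)

# So one gsub pass escapes only the first & and skips past the second.
once = input.gsub(/&([^a-z])/i, '&amp;\1')
p once       # => "yo &amp;& dude"
```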

The only workaround I can think of is to run the Regexp twice:

x = "yo && dude"
2.times{ x.gsub!(/&([^a-z])/i, '&amp;\1') }
p x

Can someone help my feeb Regexp skills and get a "yo &amp;&amp; dude" in one
line?
 
Tim Pease

str = "yo && dude"
str.gsub!( %r/&(?=[^a-z])/i, '&amp;')
p str
=> "yo &amp;&amp; dude"


The regular expression trick here is the (?=re). That's called the
"zero-width positive lookahead". It has to match, but it does not consume
any of the string, so the gsub! only replaces the characters that are NOT
inside (?=re). Here that means just the & itself.
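
As a rough sketch of how this behaves on a few inputs: the helper names escape_lookahead and escape_negative below are hypothetical, and the negative-lookahead variant (?![a-z]) is an addition for comparison, not something from the original question. The positive lookahead needs some character after the &, so a trailing & at the very end of the string is left alone; the negative form also catches that case.

```ruby
# Sketch only: both helper names are hypothetical, made up for this example.

# Positive lookahead: & must be followed by a non-letter character.
def escape_lookahead(text)
  text.gsub(/&(?=[^a-z])/i, '&amp;')
end

# Negative lookahead (an assumed variant): & must NOT be followed by a
# letter, which also holds when & is the last character of the string.
def escape_negative(text)
  text.gsub(/&(?![a-z])/i, '&amp;')
end

p escape_lookahead("yo && dude")  # => "yo &amp;&amp; dude"
p escape_lookahead("yo &dude")    # => "yo &dude"
p escape_lookahead("dude &")      # => "dude &"  (nothing follows the &)
p escape_negative("dude &")       # => "dude &amp;"
```

Which variant fits depends on whether a trailing & should be escaped as well.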

Blessings,
TwP
 
Tim Pease

Tim said:
str.gsub!( %r/&(?=[^a-z])/i, '&amp;')
Thanks!

"zero-width positive lookahead"

Man, that was right there, but I was blocking on it. (-;

I had to pull my pickaxe off the shelf and look it up, too. Page 327
in the second edition, if you're interested in reading about it. It's
in the first edition, too, which is available online.

Blessings,
TwP
 
