Concatenating Regex Smartly

Shak Shak · Apr 8, 2009

Is there a way of quickly concatenating two full string patterns in a
way that takes into account the boundaries? So for example:

\A\d+\Z and \A[a-z]+\Z

would give:

\A\d+[a-z]+\Z

?

Or is this a context sensitive situation where I'd have to parse and
join it myself? If so, what is the best way to "tokenise" a pattern?

Shak

Robert Klemme · Apr 8, 2009

Shak said:
Is there a way of quickly concatenating two full string patterns in a
way that takes into account the boundaries? So for example:

\A\d+\Z and \A[a-z]+\Z

IIRC the "Z" must be lower case.

Shak said:
would give:

\A\d+[a-z]+\Z

?

Or is this a context sensitive situation where I'd have to parse and
join it myself? If so, what is the best way to "tokenise" a pattern?

Why do you have to parse them? There is a bit of context missing, but
without further facts I would recommend keeping the individual patterns
free of start and end anchors and applying the anchors only after
constructing the full regexp that you want to use. My 0.02 EUR...
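
A minimal sketch of that approach, assuming the pieces are stored unanchored. The names `digits`, `letters` and the helper `anchored` are made up for illustration and are not from the thread:

```ruby
# Building blocks kept free of \A and \z (names are illustrative).
digits  = /\d+/
letters = /[a-z]+/

# Hypothetical helper: concatenate the parts in order, then anchor once.
# Regexp#to_s wraps each part in its own (?-mix:...) group.
def anchored(*parts)
  /\A#{parts.map(&:to_s).join}\z/
end

full = anchored(digits, letters)
p full =~ "123abc"    # 0: digits followed by letters
p full =~ "abc123"    # nil: wrong order
p "x42y"[digits]      # "42": the unanchored piece still works for searching
```

Keeping the anchors out of the pieces means the same pattern can be reused both for searching inside a string and for validating a whole string.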

Kind regards

robert

Brian Candler · Apr 8, 2009

Shak said:
\A\d+\Z and \A[a-z]+\Z

These are two regular expressions both anchored to the start and end of
the string.

If you want to match one or the other:

re1 = /\A\d+\z/
re2 = /\A[a-z]+\z/

re3 = /#{re1}|#{re2}/
=> /(?-mix:\A\d+\z)|(?-mix:\A[a-z]+\z)/

But to "concatenate" in the sense of making a regexp which matches
digits followed by letters, you need to remove the anchors.

re1 = /\d+/
re2 = /[a-z]+/

re3 = /\A#{re1}#{re2}\z/
=> /\A(?-mix:\d+)(?-mix:[a-z]+)\z/

Note that #{re1} and #{re2} are each wrapped in a non-capturing group,
(?-mix:...), when they are interpolated into re3. So it should also work
properly for more complex REs, e.g.

re1 = /a|b/
re2 = /c|d/
re3 = /\A#{re1}#{re2}\z/

But if you want to be extra-certain that it's done correctly, you can
always add your own additional layer of grouping:

re3 = /\A(?:#{re1}#{re2})\z/
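
If the patterns already exist in anchored form, one rough option is to strip the anchors from `Regexp#source` before interpolating. This is only a sketch under stated assumptions: `strip_anchors` and `concat_anchored` are hypothetical helpers, and they only handle a literal `\A` at the very start and a literal `\Z` or `\z` at the very end. They do not understand `^`/`$`, anchors nested inside groups, or a source that happens to end in an escaped backslash followed by `z`.

```ruby
# Hypothetical helper: remove one leading \A and one trailing \Z or \z
# from the pattern's source text, keeping the i/m/x flags.
def strip_anchors(re)
  src   = re.source.sub(/\A\\A/, "").sub(/\\[zZ]\z/, "")
  flags = re.options & (Regexp::IGNORECASE | Regexp::MULTILINE | Regexp::EXTENDED)
  Regexp.new(src, flags)
end

# Hypothetical helper: strip each pattern, join them, and anchor once.
def concat_anchored(*res)
  body = res.map { |re| strip_anchors(re).to_s }.join
  /\A#{body}\z/
end

re3 = concat_anchored(/\A\d+\Z/, /\A[a-z]+\Z/)
p re3.source          # "\\A(?-mix:\\d+)(?-mix:[a-z]+)\\z"
p re3 =~ "123abc"     # 0
p re3 =~ "123"        # nil
```

Because each stripped pattern is rebuilt with its own flags, a case-insensitive piece such as /\A[a-z]+\z/i stays case-insensitive inside the combined regexp.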

Robert Dober · Apr 8, 2009

Shak said:
Is there a way of quickly concatenating two full string patterns in a
way that takes into account the boundaries? So for example:

\A\d+\Z and \A[a-z]+\Z

Not that I am aware of. Note, however, that \Z and \z have slightly
different semantics:

irb(main):001:0> "abc\n" =~ /.\Z/ # \Z matches just before the trailing \n
=> 2
irb(main):002:0> "abc\n" =~ /.\z/ # \z does not match before the \n and neither does .
=> nil
irb(main):003:0> "abc\n" =~ /.\z/m # Now, in multiline mode, the . matches the \n
=> 3

This is on 1.9; it may not hold for 1.8.
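
The same difference shows up when the patterns are used to validate a whole string. A small sketch (the names `loose` and `strict` are illustrative only):

```ruby
loose  = /\A\d+\Z/   # \Z: end of string, or just before one trailing newline
strict = /\A\d+\z/   # \z: only the very end of the string

p "123\n" =~ loose    # 0: the trailing newline is tolerated
p "123\n" =~ strict   # nil: the newline makes the match fail
p "123"   =~ strict   # 0
```

So which end anchor to apply after concatenating depends on whether a trailing newline should be accepted.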
Cheers
R.

--
If you want to build a ship, don't herd people together to collect
wood and don't assign them tasks and work, but rather teach them to
long for the endless immensity of the sea.
-- Antoine de Saint-Exupery
