Odd behavior of String#scan

Warren Brown

First off, the problem I am trying to solve can be simplified down
to:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START([^,]*,)*END/)

What I want is [["def,"], ["ghi,"], ["jkl,"]] (or the same thing
without the commas), and I still need a way to achieve this. I can
accomplish it with:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START(.*)END/)[0][0].split(/,/)

But this does two operations where it seems like one should suffice.
Does someone know of a way to do this in a single operation?


Back to the odd behavior, the first expression actually returns
[["jkl,"]]. I can't figure out how that is the correct answer by any
reasonable definition of "scan". However, the equivalent String#match
does the same kind of thing, so I must be missing something. Can
someone please explain this behavior?


Thanks,

- Warren Brown
 
Bob Showalter

Warren said:
First off, the problem I am trying to solve can be simplified down
to:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START([^,]*,)*END/)

What I want is [["def,"], ["ghi,"], ["jkl,"]] (or the same thing
without the commas), and I still need a way to achieve this. I can
accomplish it with:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START(.*)END/)[0][0].split(/,/)

But this does two operations where it seems like one should suffice.
Does someone know of a way to do this in a single operation?

I would suggest:

'abcSTARTdef,ghi,jkl,ENDmno'.match(/START(.*)END/)[1].split(',')

I don't know of a way to accomplish it in one step. String#scan tries to
match the whole regex at successive positions within the string, but here
you need START and END to delimit the substring over which the scan operates.
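To illustrate the point about scan matching the whole regex at multiple places, here is a minimal sketch. The input string with two START...END blocks is made up for the example (it is not from Warren's data), and the non-greedy .*? is my own choice so that each block is matched separately.

```ruby
# Hypothetical input containing two delimited blocks.
text = 'xSTARTa,b,ENDySTARTc,ENDz'

# scan applies the whole pattern repeatedly, so it yields one
# entry per START...END block, not one entry per field.
blocks = text.scan(/START(.*?)END/)
p blocks  # [["a,b,"], ["c,"]]

# A second step is still needed to break each block into fields.
fields = blocks.map { |captures| captures.first.split(',') }
p fields  # [["a", "b"], ["c"]]
```

So scan is the right tool for finding every delimited block, while the split inside each block remains a separate operation.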

Warren said:
Back to the odd behavior, the first expression actually returns
[["jkl,"]]. I can't figure out how that is the correct answer by any
reasonable definition of "scan". However, the equivalent String#match
does the same kind of thing, so I must be missing something. Can
someone please explain this behavior?

Because you have this:

([^,]*,)*

The final * lets the group match repeatedly, once per field. The MatchData
keeps only the group's last match, however, which is "jkl,".
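A short sketch of that behavior, using Warren's original string. The second pattern, which moves the repetition inside a non-capturing group so the outer group captures the whole run, is my own variation for comparison and not something proposed in the thread.

```ruby
s = 'abcSTARTdef,ghi,jkl,ENDmno'

# The repeated group matches three times, but the capture slot is
# overwritten on each repetition, so only the last one survives.
m = s.match(/START([^,]*,)*END/)
p m[0]  # "STARTdef,ghi,jkl,END"
p m[1]  # "jkl,"

# Variation: repeat inside a non-capturing group and capture the whole run.
whole = s.match(/START((?:[^,]*,)*)END/)
p whole[1]  # "def,ghi,jkl,"

# The individual fields still need a second pass over the captured run.
parts = whole[1].scan(/[^,]*,/)
p parts  # ["def,", "ghi,", "jkl,"]
```

The overall match does cover all three fields; it is only the numbered capture that is limited to one value per group.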
 
Dan Diebolt

Will this do?

s = "abcSTARTdef,ghi,jkl,ENDmno"
# join rather than to_s: since Ruby 1.9, Array#to_s returns the inspect form
s.scan(/START(.*)END/).join.split(",")

=> ["def", "ghi", "jkl"]
