DRY gsub...

Jan Svitok · Jan 12, 2007

Josselin said:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line endings with a space
d = d.gsub(/;/,' ') # replace semicolon with a space
d = d.gsub(/,/,' ') # replace comma with a space
a = d.split(' ') # split into components, whitespace as divider

What's wrong with:

a = d.split(/\r\n|[;, ]/)

Or do you need d to be mangled as before?

Although I probably would do something even shorter like this:

a = d.split(/[;,\s]+/)

However, for certain inputs that won't give exactly the same as your
initial multi-step procedure.
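
To make that caveat concrete, here is a small sketch with a made-up input (the string is an assumption, chosen to have a leading delimiter and a comma followed by a space). The variants disagree on empty fields:

```ruby
# Hypothetical input: leading ";" and a ", " pair.
d = ";a, b\r\nc"

# The original multi-step procedure: split(' ') ignores leading
# whitespace and treats runs of whitespace as one divider.
multi_step = d.gsub(/\r\n/, ' ').gsub(/;/, ' ').gsub(/,/, ' ').split(' ')

# One delimiter per match: adjacent delimiters yield empty strings.
single_delims = d.split(/\r\n|[;, ]/)

# Runs of delimiters collapse, but a leading delimiter still
# produces an empty first field.
delim_runs = d.split(/[;,\s]+/)

p multi_step     # => ["a", "b", "c"]
p single_delims  # => ["", "a", "", "b", "c"]
p delim_runs     # => ["", "a", "b", "c"]
```

If the empty fields matter, something like `reject(&:empty?)` on the result would bring the regexp versions back in line with the original.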

Also, any time you write:

d = d.gsub(...)

You're probably better off with:

d.gsub!(...)

...unless you don't want to modify the original object passed as
argument (I'm not sure if this is proper English construct ;-) I mean
in that case the caller will see the modifications as well)
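
A minimal sketch of the difference, using two hypothetical helper methods (the names are mine, not from the original code). Note also that gsub! returns nil when nothing was replaced, so its return value should not be chained blindly:

```ruby
# Hypothetical helpers for illustration only.
def cleaned_copy(text)
  text = text.gsub(/;/, ' ')   # rebinds the local name to a new string
  text
end

def cleaned_in_place(text)
  text.gsub!(/;/, ' ')         # mutates the object the caller passed in
  text
end

original = "a;b".dup           # dup so the string is safely mutable

copy = cleaned_copy(original)
unchanged_after_copy = (original == "a;b")   # caller's string untouched

same = cleaned_in_place(original)
changed_in_place = (original == "a b")       # caller sees the change

# gsub! returns nil when no substitution was made.
no_match = "abc".dup.gsub!(/;/, ' ')

p copy, unchanged_after_copy, changed_in_place, no_match
```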

Simon Strandgaard · Jan 12, 2007

(?: ) is a non-capturing group

example if you want to match a repeating pattern,
but don't want the repeating stuff in your output

"abcde xyx xyx xyx abcde".scan(/(?:xyx ){2,}.*(b.*d)/)
#=> [["bcd"]]

if you use ( ) then it shows up in the output

"abcde xyx xyx xyx abcde".scan(/(xyx ){2,}.*(b.*d)/)
#=> [["xyx ", "bcd"]]

James Edward Gray II · Jan 12, 2007

Phrogz said:

James said:

a = d.split(/(?:\r\n|[;, ])/)

Way more elegant. Way to see beyond the step-by-step process to
the end goal.

Except that there's no need for the non-capturing group, so
(simplifying, not golfing):

a = d.split( /\r\n|[;, ]/ )

You're right, it's not needed. I'm just in the habit of always
surrounding | options of a regex with grouping to control their
scope. I guess I've been bitten by those matching issues one time
too many.
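
Two small illustrations of why the grouping can matter (the sample strings are made up for this sketch). First, anchors bind to a single branch of an ungrouped alternation; second, with split, a capturing group puts the delimiters into the result while a non-capturing one does not:

```ruby
# Ungrouped: reads as (\Afoo) or (bar\z), so "foo123" matches.
ungrouped_matches = /\Afoo|bar\z/.match?("foo123")

# Grouped: both anchors apply to either branch, so it does not match.
grouped_matches = /\A(?:foo|bar)\z/.match?("foo123")

# split includes captured text in its output.
with_capture    = "a;b,c".split(/(;|,)/)
without_capture = "a;b,c".split(/(?:;|,)/)

p ungrouped_matches  # => true
p grouped_matches    # => false
p with_capture       # => ["a", ";", "b", ",", "c"]
p without_capture    # => ["a", "b", "c"]
```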

James Edward Gray II

James Britt · Jan 12, 2007

Phrogz said:
BTW, that is already reasonably DRY, in my opinion. Calling the same
method repeatedly but with different parameters is not "repeating
yourself".

Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the "replace with space"
set.

The use of compact regular expressions doesn't make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

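# Patterns that should each be replaced by a single space.
REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map{ |c| Regexp.new(c) }

class String
  # Returns a copy with every REPLACE_WITH_SPACE match turned into a space.
  def swap_to_spaces
    s = self.dup  # work on a copy so the receiver is left untouched
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!( re, ' ')
    end
    s
  end
end

a = d.swap_to_spaces.split( ' ' )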

Or something along those lines.

--
James Britt

http://www.ruby-doc.org - Ruby Help & Documentation
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

James Edward Gray II · Jan 12, 2007

I know you got lots of answers but what about

a = d.gsub(/;|,/," ").split

No need for a Regexp there:

a = d.tr(";,", " ").split
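
A quick check of the tr version on a made-up input. When the replacement set is shorter than the source set, tr pads it with its last character, so both ";" and "," map to a space; split with no argument then handles the CRLF along with any other whitespace:

```ruby
# Hypothetical sample input.
d = "a;b,c\r\nd e"

translated = d.tr(";,", " ")   # character-for-character translation
a = translated.split           # no argument: split on runs of whitespace

p translated  # => "a b c\r\nd e"
p a           # => ["a", "b", "c", "d", "e"]
```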

James Edward Gray II

Andy Lester · Jan 12, 2007

There's nothing in these four lines of code that violates the idea of
DRY. There is no repeated code. Multiple calls to the same method
are perfectly OK.

xoa

Henrik Schmidt · Jan 12, 2007

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line endings with a space
d = d.gsub(/;/,' ') # replace semicolon with a space
d = d.gsub(/,/,' ') # replace comma with a space
a = d.split(' ') # split into components, whitespace as divider

tfyl

Joss

I would probably go with

a = d.chop.split(/[\s,;]/)

Best regards,
Henrik Schmidt

Henrik Schmidt · Jan 12, 2007

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line endings with a space
d = d.gsub(/;/,' ') # replace semicolon with a space
d = d.gsub(/,/,' ') # replace comma with a space
a = d.split(' ') # split into components, whitespace as divider

tfyl

Joss

I would probably go with

a = d.chomp.split(/[\s,;]/)
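
For anyone wondering why chomp rather than chop: chop always drops the last character (or a trailing "\r\n" pair), while chomp only drops a trailing line ending. A small sketch with made-up inputs:

```ruby
# Hypothetical inputs: one without and one with a trailing CRLF.
no_newline   = "a,b;c"
with_newline = "a,b;c\r\n"

chopped_plain  = no_newline.chop     # loses the final "c"
chomped_plain  = no_newline.chomp    # nothing to remove
chopped_crlf   = with_newline.chop   # "\r\n" is removed as a pair
chomped_crlf   = with_newline.chomp

p chopped_plain.split(/[\s,;]/)   # => ["a", "b"]
p chomped_plain.split(/[\s,;]/)   # => ["a", "b", "c"]
p chopped_crlf                    # => "a,b;c"
p chomped_crlf                    # => "a,b;c"
```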

Best regards,
Henrik Schmidt

Mike Harris · Jan 12, 2007

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line endings with a space
d = d.gsub(/;/,' ') # replace semicolon with a space
d = d.gsub(/,/,' ') # replace comma with a space
a = d.split(' ') # split into components, whitespace as divider

tfyl

Joss

Specific to this example, everyone else is right, and the best way is to
consolidate the regex or simply use a condensed split call. However, in
the general case, you could do this

[ /\r\n/ , /;/ , /,/].inject(d) { |s,reg| s.gsub(reg,' ') }.split(' ')
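
Spelled out on a made-up input, the inject call threads the string through each pattern in turn, starting from d and feeding each gsub result into the next step. Because it uses gsub rather than gsub!, d itself is not modified:

```ruby
# Hypothetical sample input.
d = "a;b,c\r\nd e"

patterns = [/\r\n/, /;/, /,/]

# s starts as d; each block result becomes s for the next pattern.
cleaned = patterns.inject(d) { |s, reg| s.gsub(reg, ' ') }
a = cleaned.split(' ')

p cleaned  # => "a b c d e"
p a        # => ["a", "b", "c", "d", "e"]
p d        # => "a;b,c\r\nd e"
```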

Gregory Seidman · Jan 12, 2007

Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the "replace with space"
set.

The use of compact regular expressions doesn't make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

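# Patterns that should each be replaced by a single space.
REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map{ |c| Regexp.new(c) }

class String
  # Returns a copy with every REPLACE_WITH_SPACE match turned into a space.
  def swap_to_spaces
    s = self.dup  # work on a copy so the receiver is left untouched
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!( re, ' ')
    end
    s
  end
end

a = d.swap_to_spaces.split( ' ' )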

Or something along those lines.

Cleaned up:

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)
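
As a side note, Ruby's built-in Regexp.union does the escape-and-join step itself when given strings, so the same idea can be sketched more briefly. The constant names and the sample input below are placeholders for this example only:

```ruby
# Hand-built version, as in the snippet above.
MANUAL_DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map { |c| Regexp.escape(c) }.join("|"))

# Regexp.union escapes each string and joins the alternatives with "|".
UNION_DELIMITERS = Regexp.union(" ", "\r\n", ";", ",")

d = "a;b,c\r\nd e"   # hypothetical input
manual_result = d.split(MANUAL_DELIMITERS)
union_result  = d.split(UNION_DELIMITERS)

p manual_result  # => ["a", "b", "c", "d", "e"]
p union_result   # => ["a", "b", "c", "d", "e"]
```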

James Britt

--Greg

James Britt · Jan 13, 2007

Gregory said:
Cleaned up:

The whole point was *not* to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what's driving the split.

Gregory Seidman · Jan 14, 2007

Gregory said:

Cleaned up:

The whole point was *not* to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what's driving the split.

The cleaned up version includes the delimiters in an array of individual
strings. Your original complaint was about readability and code
maintenance. While I agree that a long literal Regexp can be hard to read
and hard to maintain, you can achieve the same efficiency of that Regexp
without sacrificing readability using the solution above. Perhaps the
following would make you happier?

module Whatever
  DELIMITERS = [
    " ",
    "\r\n",
    ";",
    ","
  ]

  # Builds the delimiter regexp once, on first use, and reuses it afterwards.
  def split_string(str)
    @delimiter_regexp ||= Regexp.new(DELIMITERS.map{ |c| Regexp.escape(c) }.join("|"))
    str.split(@delimiter_regexp)
  end
  extend self
end

a = Whatever.split_string(d)

(If you want to make it even fancier so you can modify DELIMITERS at
runtime you'll have to do something clever with hashes.)
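
One possible reading of "something clever with hashes", offered only as a sketch: cache the compiled regexp in a hash keyed by a snapshot of the delimiter list, so that changing the list at runtime simply leads to a new cache entry. The module name, the accessor, and the cache variable are all hypothetical:

```ruby
module DelimiterSplitter   # hypothetical name
  @delimiters = [" ", "\r\n", ";", ","]
  @regexp_cache = {}

  class << self
    # Mutable list; callers may add or remove delimiters at runtime.
    attr_reader :delimiters

    def split_string(str)
      # Key the cache on a copy so later mutation of @delimiters
      # cannot alter a key already stored in the hash.
      key = @delimiters.dup
      regexp = (@regexp_cache[key] ||= Regexp.union(key))
      str.split(regexp)
    end
  end
end

before = DelimiterSplitter.split_string("a|b;c")
DelimiterSplitter.delimiters << "|"
after = DelimiterSplitter.split_string("a|b;c")

p before  # => ["a|b", "c"]
p after   # => ["a", "b", "c"]
```

Arrays hash by content, so asking again with the same delimiter list reuses the cached regexp instead of rebuilding it.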

If the code above does not fulfill what you were intending, please do
explain why; if I've missed the point, I'd like to know it and to try again
at understanding.

James Britt

--Greg
