DRY gsub...

Jan Svitok

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line breaks with a space
d = d.gsub(/;/,' ') # replace semicolons with a space
d = d.gsub(/,/,' ') # replace commas with a space
a = d.split(' ') # split into components, using space as the divider

What's wrong with:

a = d.split(/\r\n|[;, ]/)

Or do you need d to be mangled as before?

Although I probably would do something even shorter like this:

a = d.split(/[;,\s]+/)

However, for certain inputs that won't give exactly the same as your
initial multi-step procedure.
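
To make the "certain inputs" caveat concrete, here is a small sketch comparing the three variants discussed so far. The method names are made up for illustration and the sample strings are assumptions, not data from the original question.

```ruby
# The original four-step version from the question, wrapped in a method.
def split_in_steps(text)
  d = text.gsub(/\r\n/, ' ')
  d = d.gsub(/;/, ' ')
  d = d.gsub(/,/, ' ')
  d.split(' ')               # split(' ') collapses runs of whitespace
end

# One delimiter per match: adjacent delimiters produce empty strings.
def split_single(text)
  text.split(/\r\n|[;, ]/)
end

# Runs of delimiters count as one, but a leading delimiter still
# produces a leading empty string.
def split_runs(text)
  text.split(/[;,\s]+/)
end

p split_in_steps("a;b,,c\r\nd")  # ["a", "b", "c", "d"]
p split_single("a;b,,c\r\nd")    # ["a", "b", "", "c", "d"]
p split_runs("a;b,,c\r\nd")      # ["a", "b", "c", "d"]

p split_in_steps(" a;b")         # ["a", "b"]
p split_runs(" a;b")             # ["", "a", "b"]
```

So the one-liners only match the multi-step version when the input has no doubled or leading delimiters.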

Also, any time you write:

d = d.gsub(...)

You're probably better off with:

d.gsub!(...)

...unless you don't want to modify the original object that was passed
in as an argument; with gsub! the caller will see the modifications as
well.
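
A minimal sketch of that difference, using two hypothetical helper methods. It also shows one thing worth keeping in mind: gsub! returns nil when nothing was replaced, so its return value should not be assigned back or chained.

```ruby
# Rebinds the local variable to a new string; the caller's object is untouched.
def tokens_from_copy(text)
  text = text.gsub(/[;,]/, ' ')
  text.split
end

# Modifies the very object the caller passed in.
def tokens_in_place(text)
  text.gsub!(/[;,]/, ' ')
  text.split
end

caller_string = String.new("a;b,c")

tokens_from_copy(caller_string)
p caller_string                       # "a;b,c"  (unchanged)

tokens_in_place(caller_string)
p caller_string                       # "a b c"  (changed for the caller too)

# gsub! returns nil when no substitution happened.
p String.new("abc").gsub!(/;/, ' ')   # nil
```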
 
Simon Strandgaard

(?: ) is a non-capturing group.

For example, if you want to match a repeating pattern but don't want
the repeated part in your output:

"abcde xyx xyx xyx abcde".scan(/(?:xyx ){2,}.*(b.*d)/)
#=> [["bcd"]]



If you use ( ) instead, the group shows up in the output:

"abcde xyx xyx xyx abcde".scan(/(xyx ){2,}.*(b.*d)/)
#=> [["xyx ", "bcd"]]
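
The same distinction matters for split, which is what this thread is about: when the pattern contains a capturing group, String#split includes the captured delimiters in the result. A short sketch with hypothetical method names and a made-up sample string:

```ruby
# Capturing group: the matched delimiters are returned along with the pieces.
def split_with_capture(text)
  text.split(/(\r\n|[;, ])/)
end

# Non-capturing group: only the pieces are returned.
def split_without_capture(text)
  text.split(/(?:\r\n|[;, ])/)
end

p split_with_capture("a;b c")     # ["a", ";", "b", " ", "c"]
p split_without_capture("a;b c")  # ["a", "b", "c"]
```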
 
James Edward Gray II

Phrogz said:
James said:
a = d.split(/(?:\r\n|[;, ])/)

Way more elegant. Way to see beyond the step-by-step process to
the end goal.

Except that there's no need for the non-capturing group, so
(simplifying, not golfing):

a = d.split( /\r\n|[;, ]/ )

You're right, it's not needed. I'm just in the habit of always
surrounding | options of a regex with grouping to control their
scope. I guess I've been bitten by those matching issues one time
too many.
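
A sketch of the kind of scoping problem this habit presumably guards against (the patterns and words are illustrative assumptions, not taken from the thread): without a group, | splits the whole pattern, anchors included.

```ruby
# Reads as: (\Afoo) OR (bar\z)
UNGROUPED = /\Afoo|bar\z/

# Reads as: the whole string is "foo" or "bar"
GROUPED = /\A(?:foo|bar)\z/

p UNGROUPED.match?("foolish")  # true  (starts with "foo")
p UNGROUPED.match?("rebar")    # true  (ends with "bar")
p GROUPED.match?("foolish")    # false
p GROUPED.match?("rebar")      # false
p GROUPED.match?("bar")        # true
```

In the split pattern above the alternation is the entire regex, which is why the group can be dropped there.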

James Edward Gray II
 
James Britt

Phrogz said:
BTW, that is already reasonably DRY, in my opinion. Calling the same
method repeatedly but with different parameters is not "repeating
yourself".


Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the "replace with space"
set.

The use of compact regular expressions doesn't make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map{ |c| Regexp.new(c) }

class String
  # Returns a copy with every REPLACE_WITH_SPACE match replaced by a space.
  def swap_to_spaces
    s = self.dup
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!( re, ' ')
    end
    s
  end
end


a = d.swap_to_spaces.split( ' ' )



Or something along those lines.

--
James Britt

http://www.ruby-doc.org - Ruby Help & Documentation
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys
 
James Edward Gray II

I know you got lots of answers but what about

a = d.gsub(/;|,/," ").split

No need for a Regexp there:

a = d.tr(";,", " ").split
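
A quick sketch of why the tr version also copes with the "\r\n" case even though it never mentions it: split with no argument splits on any run of whitespace, line breaks included. The method name and sample string are made up for illustration.

```ruby
def split_with_tr(text)
  # tr pads the shorter replacement set with its last character,
  # so both ";" and "," become " ".
  text.tr(";,", " ").split
end

p split_with_tr("a;b,c\r\nd e")  # ["a", "b", "c", "d", "e"]
```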

James Edward Gray II
 
Andy Lester

There's nothing in these four lines of code that violates the idea of
DRY. There is no repeated code. Multiple calls to the same method
are perfectly OK.

xoa
 
Henrik Schmidt

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line breaks with a space
d = d.gsub(/;/,' ') # replace semicolons with a space
d = d.gsub(/,/,' ') # replace commas with a space
a = d.split(' ') # split into components, using space as the divider

tfyl

Joss

I would probably go with

a = d.chop.split(/[\s,;]/)

Best regards,
Henrik Schmidt
 
Henrik Schmidt

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line breaks with a space
d = d.gsub(/;/,' ') # replace semicolons with a space
d = d.gsub(/,/,' ') # replace commas with a space
a = d.split(' ') # split into components, using space as the divider

tfyl

Joss

I would probably go with

a = d.chomp.split(/[\s,;]/)
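
The two suggestions in this thread differ only in chop versus chomp. A sketch of why that matters, with hypothetical method names and sample strings: chop always removes the last character, while chomp only removes a trailing line break.

```ruby
def split_after_chop(text)
  text.chop.split(/[\s,;]/)
end

def split_after_chomp(text)
  text.chomp.split(/[\s,;]/)
end

# With a trailing newline the two agree.
p split_after_chop("a,bc\n")    # ["a", "bc"]
p split_after_chomp("a,bc\n")   # ["a", "bc"]

# Without one, chop eats the last real character.
p split_after_chop("a,bc")      # ["a", "b"]
p split_after_chomp("a,bc")     # ["a", "bc"]

# Note: with a single-character class, an embedded "\r\n" counts as two
# delimiters and leaves an empty string between them.
p split_after_chomp("a\r\nb")   # ["a", "", "b"]
```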

Best regards,
Henrik Schmidt
 
Mike Harris

Josselin said:
I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

d = d.gsub(/\r\n/,' ') # replace CRLF line breaks with a space
d = d.gsub(/;/,' ') # replace semicolons with a space
d = d.gsub(/,/,' ') # replace commas with a space
a = d.split(' ') # split into components, using space as the divider

tfyl

Joss

Specific to this example, everyone else is right: the best way is to
consolidate the regex or simply use a condensed split call. However, in
the general case, you could do this:

[ /\r\n/ , /;/ , /,/].inject(d) { |s,reg| s.gsub(reg,' ') }.split(' ')
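
The same inject idea spelled out as a reusable helper, as a rough sketch. The constant and method names are invented for illustration. inject starts with the input string and feeds the result of each gsub into the next one, so the input itself is never modified.

```ruby
PATTERNS_TO_BLANK = [/\r\n/, /;/, /,/].freeze

# Applies each pattern in turn, threading the intermediate string through.
def blank_out(text, patterns = PATTERNS_TO_BLANK)
  patterns.inject(text) { |result, pattern| result.gsub(pattern, ' ') }
end

p blank_out("a;b,c\r\nd")             # "a b c d"
p blank_out("a;b,c\r\nd").split(' ')  # ["a", "b", "c", "d"]
```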
 
Gregory Seidman

James Britt said:
Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the "replace with space"
set.

The use of compact regular expressions doesn't make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map{ |c| Regexp.new(c) }

class String
  # Returns a copy with every REPLACE_WITH_SPACE match replaced by a space.
  def swap_to_spaces
    s = self.dup
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!( re, ' ')
    end
    s
  end
end


a = d.swap_to_spaces.split( ' ' )



Or something along those lines.

Cleaned up:

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)
--Greg
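
As a side note, the escape-and-join step can probably be replaced by Regexp.union, which escapes plain strings itself. The sketch below compares the two on a made-up input; the constant and method names are illustrative and not part of the posts above.

```ruby
DELIMITER_STRINGS = [" ", "\r\n", ";", ","].freeze

# Hand-built alternation, as in the post above.
JOINED  = Regexp.new(DELIMITER_STRINGS.map { |c| Regexp.escape(c) }.join("|"))

# Regexp.union builds the alternation and escapes string arguments.
UNIONED = Regexp.union(DELIMITER_STRINGS)

def split_joined(text)
  text.split(JOINED)
end

def split_unioned(text)
  text.split(UNIONED)
end

p split_joined("a;b,c\r\nd e")   # ["a", "b", "c", "d", "e"]
p split_unioned("a;b,c\r\nd e")  # ["a", "b", "c", "d", "e"]
```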
 
James Britt

Gregory said:
Cleaned up:

The whole point was *not* to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

Gregory said:
DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what's driving the split.
 
Gregory Seidman

James Britt said:
Gregory said:
Cleaned up:

The whole point was *not* to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

Gregory said:
DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what's driving the split.

The cleaned-up version includes the delimiters in an array of individual
strings. Your original complaint was about readability and code
maintenance. While I agree that a long literal Regexp can be hard to read
and hard to maintain, you can achieve the same efficiency as that Regexp
without sacrificing readability by using the solution above. Perhaps the
following would make you happier?

module Whatever
  DELIMITERS = [
    " ",
    "\r\n",
    ";",
    ","
  ]

  def split_string(str)
    @delimiter_regexp ||= Regexp.new(DELIMITERS.map{ |c| Regexp.escape(c) }.join("|"))
    str.split(@delimiter_regexp)
  end
  extend self
end

a = Whatever.split_string(d)

(If you want to make it even fancier so you can modify DELIMITERS at
runtime you'll have to do something clever with hashes.)
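
One guess at what "something clever with hashes" could look like: cache the compiled Regexp in a hash keyed by a snapshot of the current delimiter list, so changing the list at runtime simply leads to a new cache entry. The module name, the cache, and the use of Regexp.union are all assumptions made for this sketch, not the poster's design.

```ruby
module RuntimeSplitter
  @delimiters   = [" ", "\r\n", ";", ","]
  @regexp_cache = {}

  class << self
    # The list may be modified at runtime, e.g. delimiters << "|".
    attr_reader :delimiters

    def split_string(str)
      # Use a copy as the hash key so later changes to the list
      # cannot alter a key that is already stored.
      key = @delimiters.dup
      regexp = (@regexp_cache[key] ||= Regexp.union(key))
      str.split(regexp)
    end
  end
end

p RuntimeSplitter.split_string("a;b|c")  # ["a", "b|c"]
RuntimeSplitter.delimiters << "|"
p RuntimeSplitter.split_string("a;b|c")  # ["a", "b", "c"]
```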

If the code above does not fulfill what you were intending, please do
explain why; if I've missed the point, I'd like to know it and to try again
at understanding.
--Greg
 
