Aaron D. Gifford
Hi,
I find I periodically need to iterate over slices of a string.
Enumerable has the useful each_slice method, but in Ruby 1.9, I don't
see an equivalent for the String class.
So I've monkey-patched String a bit like this:
## Monkeypatch String to add some each_*slice* methods:
class String
  ## Like Enumerable#each_slice() only it yields a string
  ## of chars characters (the slice):
  def each_slice(chars)
    self.scan(/.{1,#{chars}}/m).each do |s|
      yield s
    end
  end

  ## Like Enumerable#each_slice() only it yields an array
  ## of Fixnum bytes from the string (the slice):
  def each_byteslice(bytes)
    self.bytes.to_a.each_slice(bytes) do |s|
      yield s
    end
  end

  ## Like Enumerable#each_slice() only it yields a binary
  ## string of specified bytes (the slice):
  def each_bslice(bytes)
    if encoding == Encoding::BINARY
      str = self
    else
      str = self.dup.force_encoding(Encoding::BINARY)
    end
    str.scan(/.{1,#{bytes}}/m).each do |s|
      yield s
    end
  end
end
So now for the question. Is there a better way to accomplish
something similar? I'm not debating whether to do it as a monkey
patch or not--that's irrelevant to me. But is there a more efficient
way to slice up strings and iterate over fixed-size chunks?
One alternative each_bslice implementation I tried used
str.bytes.to_a.map(&:chr).each_slice(x){|c| p c.join} but it was a bit
slower in benchmarks versus the str.scan method.
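For reference, here is roughly what that alternative looks like written out as a full method. The name each_bslice_alt is made up for this sketch so it can sit next to each_bslice for comparison, and it yields each joined chunk instead of printing it:

```ruby
class String
  ## Hypothetical alternative to each_bslice (the name is only for
  ## this sketch): turn every byte into a one-byte string, group
  ## those into slices, and join each group back into one string.
  def each_bslice_alt(bytes)
    self.bytes.to_a.map(&:chr).each_slice(bytes) do |c|
      yield c.join
    end
  end
end

"abcdef".each_bslice_alt(4) { |s| p s }
# prints "abcd" then "ef"
```

It builds one small string per byte before joining, which may be part of why it came out slower than str.scan for me.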
Aaron out.