linefilter - looking for suggestions

Martin DeMello

I'm writing a small wrapper around ruby, meant to be used as part of a
unix pipeline filter - e.g.

ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'

It basically consists of a small 'rbx' executable, and a 'linefilter'
library, which contains useful extensions to String - ideally, it'll let
rbx be used as a convenient replacement for awk, sed and perl for quick
one-liners (code below).

Any suggestions for improvements or useful additional methods? One thing
I'm considering is an Array#apply, which invokes a different method on
each member of an array (for instance, the above example could have it
inserted as .mapf(:to_s).apply(:ljust, :rjust).formatrow(...)) - do
people like the name?
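
A minimal sketch of what such an Array#apply might look like. This is an assumption, not code from linefilter: it takes one method spec per element, where a spec is either a method name or a [name, *args] array (ljust and rjust need a width argument), and elements without a spec pass through unchanged.

```ruby
# Hypothetical sketch of Array#apply; not part of the original linefilter code.
class Array
  # Calls the i-th method spec on the i-th element.
  # A spec is a method name or [name, *args]; a missing spec leaves
  # the element untouched.
  def apply(*specs)
    each_with_index.map do |value, i|
      name, *args = specs[i]
      name ? value.send(name, *args) : value
    end
  end
end

padded = ["ab", "cd"].apply([:ljust, 4], [:rjust, 4])
mixed  = ["ab", "cd", "ef"].apply(:upcase, :reverse)
p padded
p mixed
```

Passing arguments as arrays keeps the call short for the common no-argument case while still allowing things like a column width.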

martin

rbx (for want of a better short name) is

#----------------------------------------------------------------------------
#!/usr/bin/ruby

class Array
  def take_while!
    r = []
    while yield(at(0))
      r << shift
    end
    r
  end
end

# -- MAIN --

flags = ARGV.take_while! {|i| i =~ /^-/}
command = "'print \$_.to_s.instance_eval {#{ARGV.shift}}'"
files = ARGV.dup

system("ruby -rlinefilter #{flags.join(" ")} -ne #{command} #{files.join(" ")}")

#----------------------------------------------------------------------------

and linefilter contains the following methods (mostly from standard
class extensions):

module Enumerable
  def map_with_index
    a = []
    each_with_index {|e, i| a << yield(e,i)}
    a
  end

  def mapf(method, *args)
    collect do |value|
      value.send(method, *args)
    end
  end

end


class String

  def endl
    concat("\n")
  end

  def cols(*args)
    a = split(/\s+/)
    args.map {|i| a[i]}
  end

  def spjoin
    join(" ")
  end

  # from http://www.rubygarden.org/ruby?StringSub

  # 'number' leftmost chars
  def left(number = 1)
    self[0..number-1]
  end

  # 'number' rightmost chars
  def right(number = 1)
    self[-number..-1]
  end

  # 'number' chars starting at position 'from'
  def mid(from, number=1)
    self[from..from+number-1]
  end

  # chars from beginning to 'position'
  def head(position = 0)
    self[0..position]
  end

  # chars following 'position'
  def tail(position = 0)
    self[position+1..-1]
  end

  # Tabs left or right by n chars, using spaces
  def tab(n)
    if n >= 0
      gsub(/^/, ' ' * n)
    else
      gsub(/^ {0,#{-n}}/, "")
    end
  end

  alias_method :indent, :tab

  # Preserves relative tabbing.
  # The first non-empty line ends up with n spaces before nonspace.
  def tabto(n)
    if self =~ /^( *)\S/
      tab(n - $1.length)
    else
      self
    end
  end

end

class Array
  def formatrow(separator, *widths)
    map_with_index {|a,i|
      w = widths[i]
      (a.to_s).slice(0..(w-1)).ljust(w)
    }.join(separator)
  end
end
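
To see what the opening one-liner does with these methods, here is a self-contained sketch run on a hard-coded line. The sample line and the column widths are made up for illustration, and cols and formatrow are re-declared in trimmed form so the snippet runs without the linefilter file.

```ruby
# Trimmed re-declarations of two linefilter methods so the example runs
# on its own; the sample input below is invented.
class String
  # Picks the whitespace-separated fields at the given indices.
  def cols(*indices)
    fields = split(/\s+/)
    indices.map { |i| fields[i] }
  end
end

class Array
  # Truncates and left-justifies each element to its column width.
  def formatrow(separator, *widths)
    each_with_index.map { |value, i|
      w = widths[i]
      value.to_s.slice(0, w).ljust(w)
    }.join(separator)
  end
end

line = "-rw-r--r-- 1 martin users 1234 Jan 5 10:30 notes.txt"
row = line.cols(8, 4).formatrow(" ", 12, 8)
puts row
```

Field 8 is the file name and field 4 the size, so the row holds the name padded to 12 characters, a space, and the size padded to 8.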
 
Robert Klemme

Martin DeMello said:
I'm writing a small wrapper around ruby, meant to be used as part of a
unix pipeline filter - e.g.

ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'

I'd prefer to have something in front of cols() that does the splitting -
otherwise it's always space separated, which might limit usefulness.

You could do

class String
  alias split_old split

  def split(x)
    case x
    when Regexp
      split_old(x)
    when :space
      split_old( /\s+/ )
    when String
      split_old( x )
    else
      split_old( x.to_s )
    end
  end
end
It basically consists of a small 'rbx' executable, and a 'linefilter'
library, which contains useful extensions to String - ideally, it'll let
rbx be used as a convenient replacement for awk, sed and perl for quick
one-liners (code below).

Any suggestions for improvements or useful additional methods? One thing
I'm considering is an Array#apply, which invokes a different method on
each member of an array (for instance, the above example could have it
inserted as .mapf(:to_s).apply(:ljust, :rjust).formatrow(...)) - do
people like the name?

martin

rbx (for want of a better short name) is

#----------------------------------------------------------------------------
#!/usr/bin/ruby

class Array
def take_while!

This name is misleading since it suggests that array is manipulated in
place. Better remove the "!".

Even better, use optparse to process options.
r = []
while yield(at(0))
r << shift
end
r
end
end

# -- MAIN --

flags = ARGV.take_while! {|i| i =~ /^-/}
command = "'print \$_.to_s.instance_eval {#{ARGV.shift}}'"
files = ARGV.dup

system("ruby -rlinefilter #{flags.join(" ")} -ne #{command} #{files.join(" ")}")

Why do you spawn an extra process here? IMHO that's superfluous. If you
put the command into a block, your main loop will look like this:

while ( line = gets )
  line.chomp!
  command.call line
end

To do that you just need

command = eval %Q{ lambda {|line| puts line.instance_eval( '#{ARGV.shift}' ) } }
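
A self-contained sketch of that in-process approach, under a few assumptions: the one-liner arrives as a plain string, and it is handed straight to instance_eval rather than being interpolated into an eval'd lambda (which sidesteps quoting problems when the one-liner itself contains quotes). build_command and run_filter are illustrative names, not part of rbx, and StringIO stands in for the real input and output.

```ruby
require "stringio"

# Hypothetical helper: wraps the one-liner in a lambda that evaluates it
# with the current line as self.
def build_command(expr)
  lambda { |line| line.instance_eval(expr) }
end

# Hypothetical helper: runs the command over every input line and
# writes each result on its own output line.
def run_filter(expr, input, output)
  command = build_command(expr)
  input.each_line do |line|
    output.puts command.call(line.chomp)
  end
end

out = StringIO.new
run_filter("upcase.reverse", StringIO.new("abc\ndef\n"), out)
print out.string
```

In a real script the input would be ARGF and the output $stdout, with the expression taken from ARGV.shift before the loop starts.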

Regards

robert


 
Martin DeMello

Robert Klemme said:
I'd prefer to have something in front of cols() that does the splitting -
otherwise it's always space separated which might limit usefulness.

Hm - I was trying to avoid an extraneous 'split', since cols always
requires one. Maybe check if the first argument is a string and split on
that, otherwise split on space.
This name is misleading since it suggests that array is manipulated in
place. Better remove the "!".

It is - I'm using 'shift'. I figured optparse was overkill since I
just wanted to pass the options along to ruby.
Why do you spawn an extra process here? IMHO that's superfluous. If you
put the command into a block, your main loop will look like this:

Mostly because it started life as a shellscript, and then migrated over
when I couldn't figure out how to do option parsing properly :)
while ( line = gets )
  line.chomp!
  command.call line
end

To do that you just need

command = eval %Q{ lambda {|line| puts line.instance_eval( '#{ARGV.shift}' ) } }

But how will that let us pass options to the ruby interpreter?

martin
 
Robert Klemme

Martin DeMello said:
Hm - I was trying to avoid an extraneous 'split', since cols always
requires one. Maybe check if the first argument is a string and split on
that, otherwise split on space.

I'd find it cleaner if it were not a parameter to cols but a separate
operation before it. You can leave things as they are, with two changes, if
you just add a cols method to Array (or Enumerable). Then you can use the old
behavior (i.e. implicit split by white space), and additionally you can use
String#split to do the splitting.
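
A rough sketch of how that could look, assuming String#cols keeps the whitespace split and simply delegates to a new Array#cols (this is an illustration, not code from linefilter):

```ruby
# Sketch of the Array/String split of responsibilities described above.
class Array
  # Picks the elements at the given indices, in the order requested.
  def cols(*indices)
    values_at(*indices)
  end
end

class String
  # Old behavior: implicit split on whitespace, then pick the columns.
  def cols(*indices)
    split(/\s+/).cols(*indices)
  end
end

implicit = "a b c d".cols(3, 1)
explicit = "root:x:0:0".split(":").cols(2, 0)
p implicit
p explicit
```

The one-liner stays short for whitespace-separated input, and any other separator is just an explicit split placed in front of cols.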
It is - I'm using 'shift'. I figured optparse was overkill since I
just wanted to pass the options along to ruby.

Oh, yeah. Sorry, I overlooked that.
Mostly because it started life as a shellscript, and then migrated over
when I couldn't figure out how to do option parsing properly :)

"Historic reasons". :)
But how will that let us pass options to the ruby interpreter?

Which options do you need to pass on? If it's not too esoteric, you might
be able to set them via global variables.

Regards

robert
 
Martin DeMello

Robert Klemme said:
I'd find it cleaner if it were not a parameter to cols but a separate
operation before it. You can leave things as they are, with two changes, if
you just add a cols method to Array (or Enumerable). Then you can use the old
behavior (i.e. implicit split by white space), and additionally you can use
String#split to do the splitting.

Good point - and it's the nicely polymorphic way to do it too. I'll have
to see if the array/string idea can extend to other methods too. A third
option is simply to use -F, though when you're writing a one-liner and
only realise you need it while typing in 'cols', going back to add it
is a pain.
Which options do you need to pass on? If it's not too esoteric, you might
be able to set them via global variables.

The main ones I've found useful in practice are -i and -rwhatever
(ideally I'd like an interpreter switch for the $_.instance_eval loop,
parallel to -n and -p, but when I RCRd it it wasn't too popular). You're
right, they probably could be set via globals, which would save us the
extra process spawn. I might think up some options that make sense for
rbx but not for ruby too, in which case optparse is definitely the way
to go.
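
For what it's worth, a sketch of how optparse could pick out -r and -i without a second process. parse_rbx_args and the option set are assumptions made for illustration; the only requirements taken from the thread are that options come first, then the one-liner, then the files.

```ruby
require "optparse"

# Hypothetical helper, not part of rbx: splits the arguments into
# interpreter-style options, the one-liner, and the input files.
def parse_rbx_args(argv)
  opts = { requires: [], inplace: nil }
  parser = OptionParser.new do |o|
    # -rLIBRARY may be given several times, as with ruby itself.
    o.on("-rLIBRARY", "library to require") { |lib| opts[:requires] << lib }
    # -i[EXT]: the extension is optional and must be attached to the flag.
    o.on("-i[EXT]", "edit files in place") { |ext| opts[:inplace] = ext || "" }
  end
  # order stops at the first non-option argument, which is the one-liner.
  rest = parser.order(argv)
  expr = rest.shift
  [opts, expr, rest]
end

opts, expr, files = parse_rbx_args(["-rset", "-i.bak", "cols(8,4).spjoin", "a.txt"])
p opts, expr, files
```

The collected libraries could then be loaded with require, and the in-place extension applied through the corresponding global, before the line loop starts.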

martin
 
