Ruby script to Module/Class refactor

J

James B. Byrne

I have written my first live ruby script that actually performs useful
work as an exercise in syntax discovery. It is very much a straight
line utility that could just as easily be written in bash. What I would
like is for someone to look at the code reproduced below and guide me
through the steps to convert this into something approximating the ruby
OOP idiom. I come from a 4GL universe working on mid-range and high end
iron and OS rather than unix/linux and OOP. So, I have a major
challenge getting my head out of first gear.

Looking at this I wondered what can be placed into separate classes. It
seems to me that the check for environment variables and defaluts and
then creating the resulting directory strings and regexp objects is
either one functional group or two, depending if one considers the
search for environmental settings totally separate from the directory
validation process or not. I tend to see two classes; GetDirs and
CheckDirs. Another class would likely be MoveFiles.

module mv2csv

class GetDirs(envlist[],arglist[]=nil)

def initalize
...
end

def envlist.each do |env|
...
end
end

My difficulty is that I do not see how all this hangs together after the
classes and methods are written. Does the script end up as a ruby
module? Then what do I do? Do I have a "main" procedure below the class
definitions in the module? I would really appreciate it if someone
could invest the time to walk me through this because once I see how to
map what I presently do to the ruby way I expect that many things I am
unsure about will suddenly become clear. I am particularly interested
in how to redo this so that Test::Unit can be used to drive it for
testing.

Regards,
Jim



The working code:

#!/usr/bin/env ruby
#
#--
# Put RDOC comments between ++ and -- comment tags.
#--
#++
# hll-qpxfr.rb - Byrne, James B. - JBB8 -
# version: 1.0 2006 Feb 15 - Initial draught.
#
# A ruby script to transfer csv data files delivered via FTP to
# an anonymous world-readable site to a secure data directory
# and append .csv extension to each.
#
# The files originate on an HP3000 and are ascii csv files
# sans header line. Their names take the form QP?????#
#--
#
#### Standard code layout follows.
#
## require <library> # loads library first time encountered.

require 'find'
require 'fileutils'
include FileUtils::DryRun
require 'syslog'

## GLOBAL CONSTANTS
#
# Run unit tests and traces?
VERBOSE_ON = TRUE

# If the appropriate environmental variables are not set then
# find files matching this pattern:
DFLT_FILE_RE = "^QPCCCMR[[:digit:]]$" # alternate "^QPCCCMR[1-3]$"

# using these options:
DFLT_FILE_RE_OPT = ""

# and add this extension to them:
DFLT_FILE_EXT = ".csv"

# while searching this directory:
DFLT_DIR_IN = "./ruby/data/source"

# and move the result to here.
DFLT_DIR_OUT = "./ruby/data/target"

# Munge output filenames to lowercase? uppercase? Ignore? (niu:
2006-02-10)
#DFLT_FILE_OUT_CASE = "IGNORE"
#DFLT_FILE_OUT_CASE = "DNCASE" # down-shift case
#DFLT_FILE_OUT_CASE = "UPCASE" # up-shift case

#
## module <Modulename>
#
## class <Classname>
#
## include <library> # add library code to class defintion
#
## def <methodname>
#
## load <library> # loads library each time encountered.
#
## main line code after here
#
begin

#
# Set up SYSLOG
log = Syslog.open($0)
log.info("Searching for updated files from HP3000 in ftp drop.")

# ENV check for: HLL_QP_DIR_IN, HLL_QP_DIR_OUT, HLL_QP_REGEX,
# HLL_QP_EXT and overide DEFAULT values if found.
# Regardless, verify that a valid directory is provided.

f_dir_in = ENV["HLL_QP_DIR_IN"] ? ENV["HLL_QP_DIR_IN"].to_s :
DFLT_DIR_IN
if !FileTest.directory?(f_dir_in) then
$stderr.print "Invalid source directory " + f_dir_in + " specified.\n"
log.err(("Invalid source directory " + f_dir_in + " specified"))
if ENV["HLL_QP_DIR_IN"] then
$stderr.print "Check setting of ENV: HLL_QP_DIR_IN\n"
log.err(("Check setting of ENV: HLL_QP_DIR_IN"))
end
Process.exit
else
log.info(("Source directory is: " + f_dir_in))
if !File.readable?(f_dir_in) then
$stderr.print f_dir_in + " is NOT readable. Process aborting.\n"
log.err((f_dir_in + " is NOT readable. Process aborting"))
Process.exit
end
end

f_dir_out = ENV["HLL_QP_DIR_OUT"] ? ENV["HLL_QP_DIR_OUT"].to_s :
DFLT_DIR_OUT
if !FileTest.directory?(f_dir_out) then
$stderr.print "Invalid target directory " + f_dir_out + "
specified.\n"
log.err(("Invalid target directory " + f_dir_out + " specified"))
if ENV["HLL_QP_DIR_OUT"] then
$stderr.print "Check setting of ENV: HLL_QP_DIR_OUT\n"
log.err(("Check setting of ENV: HLL_QP_DIR_OUT"))
end
Process.exit
else
log.info(("Target directory is: " + f_dir_out))
if !File.writable?(f_dir_out) then
$stderr.print f_dir_out + " is NOT writable. Process aborting.\n"
log.err((f_dir_out + " is NOT writable. Process aborting"))
Process.exit
end
end

# set the file extension value if present. May be nil.
f_nam_ext = ENV["HLL_QP_EXT"] ? ENV["HLL_QP_EXT"].to_s : DFLT_FILE_EXT

# create a valid RE for search. Note that setting options in
# Regexp.new(RE[,opt]) requires a bit pattern expressed as an integer or
the
# equivilent library CONSTANTS (REGEXP::INSENSITIVE, etc.) not, as might
# be expected, a character string. Further, if any non-nil value is
# passed as an option then the /i switch is turned on even if the option
# value is invalid. (So much for the principle of least surprise!)
f_nam_re_s = ENV["HLL_QP_REGEX"] ? ENV["HLL_QP_REGEX"].to_s :
DFLT_FILE_RE
f_nam_re = Regexp.new(f_nam_re_s)

# check that any files in the source directory are read-write
# enabled for current process gid or uid.
Find.find(f_dir_in) do |filename|
if VERBOSE_ON then
puts filename + " " + File.basename(filename) + " " +
f_nam_re.to_s
end
if FileTest.file?(filename) and
File.basename(filename) =~ f_nam_re
then
if VERBOSE_ON then
puts filename
puts FileTest.readable?(filename) ? "\treadable" \
: "\tnot readable"
puts FileTest.writable?(filename) ? "\twritable" \
: "\tnot writable"
puts filename + " moves to " + \
(File.join(f_dir_out, (File.basename(filename) +
f_nam_ext)))
else
# Move oldfile to same name in target dir with extension
appended.
# If we ever enable case munging then this is where it will
happen.
FileUtils.move((filename), \
(File.join(f_dir_out, (File.basename(filename) +
f_nam_ext))))
log.info(("Moving " + filename + " to " + \
(File.join(f_dir_out, (File.basename(filename) +
f_nam_ext)))))
end
next
else
# do not recurse directories.
if FileTest.directory?(filename)
if VERBOSE_ON then puts filename + " is a directory." end
if File.basename(filename)!= File.basename(f_dir_in)
if VERBOSE_ON then
puts File.basename(filename) + " is not a regular file."
end
Find.prune
else
next
end
end
end
end

#
## catch problems below here
rescue
if VERBOSE_ON then
puts "rescue block entered: " + $!.to_s
puts
end

#
## put finalizer code below here
ensure
if VERBOSE_ON then
puts "ensure block entered: " + $!.to_s
puts
end

##-- put test code below here
if VERBOSE_ON

#
## Display working data set.

puts "\t Input directory: " + f_dir_in.to_s
puts "\tTarget directory: " + f_dir_out.to_s
puts "\tSet extension to: " + f_nam_ext.to_s
puts "\nThe passed regexp is: \n\t" + f_nam_re_s
puts "The regexp used is: "
p f_nam_re
puts "$= is: " + $=.to_s

# Set up a set of strings for possible file names
qpa =
["qpcccmr1","QPCCCMR1","QPCCMMR1","qpCCCMR1","QPCCCMRC","QPCCCMR1.csv"]
qpa += ["QPCCCMR9","XXQPCCCMR9","QPCCCMR9yyy","XyZQPCCCMR3AbC"]

# Test to see which names match
qpa.each do |a|
case a
when f_nam_re then puts a + " matches"
else puts a + " does not match"
end
end

puts "Ruby run of " + __FILE__.to_s + " as " + $0.to_s + \
" ends at line: " + __LINE__.to_s

#end VERBOSE_ON block
end

log.info(($0 + " finished"))
#end main
end
 
E

Eric Schwartz

James B. Byrne said:
I have written my first live ruby script that actually performs useful
work as an exercise in syntax discovery. It is very much a straight
line utility that could just as easily be written in bash. What I would
like is for someone to look at the code reproduced below and guide me
through the steps to convert this into something approximating the ruby
OOP idiom. I come from a 4GL universe working on mid-range and high end
iron and OS rather than unix/linux and OOP. So, I have a major
challenge getting my head out of first gear.

OOP is a different mindset. But given that you work on big iron, you
might have an advantage that you don't know about. I find that when I
describe OOP to people unfamiliar with it, it's easiest to understand
if you start with the data, and add the methods later. Given that
your universe revolves around data, this might not be so hard.
Looking at this I wondered what can be placed into separate classes. It
seems to me that the check for environment variables and defaluts and
then creating the resulting directory strings and regexp objects is
either one functional group or two, depending if one considers the
search for environmental settings totally separate from the directory
validation process or not. I tend to see two classes; GetDirs and
CheckDirs.

Here, your focus is on what your code is *doing*. Think, instead,
about what it *is*. That is, what data are you collecting, and what
are you doing with it? I would argue that instead what you need is a
Configuration object that stores data like, "What directories should I
be fetching from which machines, and where should I put them?", and
give that object methods like validate_directories() and so on.

Then think of what other things you want to store data about. Say,
your machines. You could define a Machine class that stored generic
data, such as its name, location, and other generic info. You then
could subclass that into HP3000, ZOS_box, and so on. Each of those
would contain methods like get_file(), which would fetch a file from
that machine to the local directory, and maybe reboot(), which would
annoy everybody else using that machine. ;-)

A very loose description might be something like this:

class Configuration
def initialize()
# set all variables to their default values,
# overriding with ENV vars as required.
end

def validate_directories()
# ensure all directories you rely on are present
end
end

class Machine
def intialize(name, location)
# set generic info about Machine here
end
end

class HP3000
def initialize(name, location)
super(); # let Machine handle this
end

def get_file(filename, destination)
# do stuff to put filename on the hp3000 into
# destination
end
end

Notice again how my focus is on the nouns (what things *are*) as
opposed to the verbs (what things *do*). Also notice that the verbs
are attached to the thing that does them in most cases. You could
make an argument that get_file doesn't belong with the HP3000, that it
should be its own top-level subroutine. My reason for placing it
there is this: there is probably a different procedure for each type
of machine you deal with for getting files off of it. By putting the
get_file() method in the machine-specific class, you are putting all
the machine class-specific information in one place, instead of
scattering it all around.

My two major principles when breaking things down are:

* What is the data?
* What does it do?

(In that order)
My difficulty is that I do not see how all this hangs together after the
classes and methods are written. Does the script end up as a ruby
module?

A module is a chunk of code you want to re-use elsewhere. If this
script is supposed to be invoked on its own, then no, you don't want
to make it a module. You may, however, want to put the various
classes you create into their own files so you can re-use them
elsewhere.
Then what do I do? Do I have a "main" procedure below the class
definitions in the module?

Any code you do not put in a method or other limited scope gets
executed in the order you see it. So basically, there is no main(),
you just start doing stuff (much as in a shell script).
I would really appreciate it if someone could invest the time to
walk me through this because once I see how to map what I presently
do to the ruby way I expect that many things I am unsure about will
suddenly become clear. I am particularly interested in how to redo
this so that Test::Unit can be used to drive it for testing.

I'm in a bit of a rush right now, but maybe someone else will explain
how a method such as I proposed makes it easy to test using
Test::Unit-- otherwise, I'll get back to you later and try and explain
it.

-=Eric
 
J

James b. Byrne

Eric said:
Here, your focus is on what your code is *doing*. Think, instead,
about what it *is*. That is, what data are you collecting, and what
are you doing with it? I would argue that instead what you need is a
Configuration object that stores data like, "What directories should I
be fetching from which machines, and where should I put them?", and
give that object methods like validate_directories() and so on.

...
Then think of what other things you want to store data about. Say,
your machines. You could define a Machine class that stored generic
data, such as its name, location, and other generic info. You then
could subclass that into HP3000, ZOS_box, and so on. Each of those
would contain methods like get_file(), which would fetch a file from
that machine to the local directory, and maybe reboot(), which would
annoy everybody else using that machine. ;-)

...
I'm in a bit of a rush right now, but maybe someone else will explain
how a method such as I proposed makes it easy to test using
Test::Unit-- otherwise, I'll get back to you later and try and explain
it.

-=Eric

Many thanks. I have clue where to begin now. I will try and redo this
more or less along the lines you recommend and post the result back
here. Just to clarify one point, the HP3000 drops off the files onto
the host that is running the Ruby script. There is no service running
on the HP3000 that will respond to a file transfer request, although, as
you point out, that does not preclude a scenario where the drop machine
is not the script runner.

Regards,
Jim
 
E

Eric Schwartz

James b. Byrne said:
Many thanks. I have clue where to begin now. I will try and redo this
more or less along the lines you recommend and post the result back
here. Just to clarify one point, the HP3000 drops off the files onto
the host that is running the Ruby script. There is no service running
on the HP3000 that will respond to a file transfer request, although, as
you point out, that does not preclude a scenario where the drop machine
is not the script runner.

Exactly. HP3000#get_file might in that case look something like:

class HP3000
def initialize(hostname)
@drop_dir = "/path/to/dropdir/#{hostname}"
end

def get_file(filename, destination)
system("cp #{@dropdir}/#{filename} #{destination}")
end
end

Or some such silliness-- and yes, there are more Rubyish ways to do
that, but, like Perl, one of the things I like about Ruby is that
you can do more or less in pure Ruby, depending on your taste and your
application.

-=Eric
 
D

Dave Cantrell

James said:
I have written my first live ruby script that actually performs useful
work as an exercise in syntax discovery. It is very much a straight
line utility that could just as easily be written in bash. What I would
like is for someone to look at the code reproduced below and guide me
through the steps to convert this into something approximating the ruby
OOP idiom. I come from a 4GL universe working on mid-range and high end
iron and OS rather than unix/linux and OOP. So, I have a major
challenge getting my head out of first gear.

Looking at this I wondered what can be placed into separate classes. It
seems to me that the check for environment variables and defaluts and
then creating the resulting directory strings and regexp objects is
either one functional group or two, depending if one considers the
search for environmental settings totally separate from the directory
validation process or not. I tend to see two classes; GetDirs and
CheckDirs. Another class would likely be MoveFiles.

James,

Try reading the RubyGarden wiki page WhatIsAnObject[0] and Allen Holub's
"Why Getter and Setter Methods are Evil" article and you should have a
better idea of what OO is supposed to look and feel like. *

And I just found "Tell, Don't Ask"[2] by the Pragmatic Programmers,
which helps me better understand OO design.

HTH,
-dave

* Both links were recently posted in other threads, and I learned a lot
from them. Thanks! :)

[0] http://www.rubygarden.org/ruby/ruby?WhatIsAnObject
[1] http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html
[2] http://www.pragmaticprogrammer.com/ppllc/papers/1998_05.html
 
J

James B. Byrne

James,

Try reading the RubyGarden wiki page WhatIsAnObject[0] and Allen Holub's
"Why Getter and Setter Methods are Evil" article and you should have a
better idea of what OO is supposed to look and feel like. *

And I just found "Tell, Don't Ask"[2] by the Pragmatic Programmers,
which helps me better understand OO design.

HTH,
-dave

* Both links were recently posted in other threads, and I learned a lot
from them. Thanks! :)

[0] http://www.rubygarden.org/ruby/ruby?WhatIsAnObject
[1] http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html
[2] http://www.pragmaticprogrammer.com/ppllc/papers/1998_05.html

Thanks, I will check these out directly. In the mean time I have come
up with this as my initial take on things. It does not yet actually
move things about but it seems to encompass most of the directory stuff
contained in the original script.

------------------------------------------------------------------------------
#++
# mv2dir.rb
# version 1.0 - 2006 Feb 16 - James B. Byrne -
#
# find, move and optionally rename data files that match glob expression
# provided.
#
# ruby mv2dir.rb source_dir glob_exp [target_dir=./] [ext]
# or
# ruby mv2dir.rb -s source_dir -g glob_exp [-t target_dir] [-e ext]
#
# TODO:
# Options:
# -e --ext <str> Append <str> as suffix to each
# filename in source.
# -g --glob <str> Use <str> as glob pattern to
# select source files.
# -h --help Display version, help text and exit.
# -l --log Log activity to syslog (DEFAULT).
# -L --nolog Do not log activity to syslog.
# -v --version Display version and exit.
# -V --verbose Display activity to STDOUT.
# -s --source <str> Use <str> as path to source directory.
# -t --target <str> Use <str> as path to target directory.
#<<TODO:
#--

class Mv2Dir

attr_reader :path, :message

def initialize(path)
@path = path
@message = nil

if !File.exists?(path) then
@message = "#{path} does not exist on this system."
elsif
!File.directory?(path) then
@message = "#{path} exists but is not a directory."
else
@message = "#{path} is a valid directory."
end
end

def log
#TODO add singleton check for open logging process
# open one if none exists. Log message to syslog.
puts "Log this message #{@message}"
end

def readable?
if !File.readable?(@path) then
@message = "#{@path} is not readable."
FALSE
else
@message = "#{@path} is readable."
TRUE
end
end

def writable?
if !File.writable?(@path) then
@message = "#{@path} is not writeable."
FALSE
else
@message = "#{@path} is writeable."
TRUE
end
end

def writeable?
self.writable?
end

end # End class Mv2Dir

# Run as script after here:

if __FILE__ == $0 then
# Run as script after here:

require 'optparse'

options = OptionParser.new

puts ARGV.to_s

options.on("-h", "-?", "--help", "--about") do |opt|
puts "Help text goes here."
Process.exit
end

# define input and output directories with test data files:
dir_data_in = "/home/byrnejb/ruby/data/source"
dir_data_out = "/home/byrnejb/ruby/data/target"
fnm_glob = "QPCCCMR[1-3]"

# reference these here lest they go out of scope below.
test_in = nil
test_out = nil

# place these last in test case so that they are available later:
["/","/tmp/","/tmp","no such place/",dir_data_in].each do |din|
puts "testing input dir: #{din}"
test_in = Mv2Dir.new(din)
puts "#{test_in.path.to_s} #{test_in.message.to_s}"
puts "#{test_in.readable?.to_s} #{test_in.message.to_s}"
puts "#{test_in.writeable?.to_s} #{test_in.message.to_s}"
end

["./",dir_data_out].each do |dout|
puts "Testing output dir: #{dout}"
test_out = Mv2Dir.new(dout)
puts "#{test_out.path.to_s} #{test_out.message.to_s}"
puts "#{test_out.writeable?.to_s} #{test_out.message.to_s}"
end

puts test_in.path
puts test_out.path

end

----------------------------------------------------------------

I have to ask why in Ruby 'elsif' is not 'elseif' and 'writable?' is not
'writeable?' Is there a shortage of 'e's in the the ruby world?

On a more serious note, can anyone tell me why the optparse construct
given below does not behave as I intend when the script is called thus:

#ruby mv2dir.rb --help

options.on("-h", "-?", "--help", "--about") do |opt|
puts "Help text goes here."
Process.exit
end


Finally, I can use suggestions on how to code the log method. I would
like to be able to write things like: test_dir =
Mv2Dir.new("/some/path").log or test_dir.writeable?.log but as
(test_dir.writeable? = TRUE) this becomes TRUE.log which is, of course,
undefined. I am not sure how to do this but what I want is to force a
logging event force any Mv2Dir method by appending the .log. Any ideas?
 
W

William James

James said:
# find, move and optionally rename data files that match glob expression
# provided.
#
# ruby mv2dir.rb source_dir glob_exp [target_dir=./] [ext]

# ruby mv2dir.rb source_dir glob_exp target_dir
require 'fileutils'
Dir[ File.join(ARGV[0], "*") ].select{ |p|
Regexp.new(ARGV[1]).match File.basename( p ) }.each{ |p|
FileUtils.move( p, ARGV[2] ) }
 
W

William James

William said:
James said:
# find, move and optionally rename data files that match glob expression
# provided.
#
# ruby mv2dir.rb source_dir glob_exp [target_dir=./] [ext]

# ruby mv2dir.rb source_dir glob_exp target_dir
require 'fileutils'
Dir[ File.join(ARGV[0], "*") ].select{ |p|
Regexp.new(ARGV[1]).match File.basename( p ) }.each{ |p|
FileUtils.move( p, ARGV[2] ) }

Please pardon previous prolixity.

require 'fileutils'
FileUtils.move Dir[ File.join(ARGV[0], "*") ].select{ |p|
Regexp.new(ARGV[1]).match File.basename( p ) }, ARGV[2]
 
J

James B. Byrne

James said:
There appears to be something wrong with either my appreciation of how
Syslog is supposed to work, with its implementation, or with how I
understand class variables are preserved.

I misunderstood where @@variables have to be declared. I moved @@logid
from the initialize method into the class definition and thereafter.log
worked as I intended.
 
J

James B. Byrne

I really would appreciate a critque of class Mv2Dir by someone with a
deal more experience with OOP and Ruby than I.
 
J

James Byrne

unknown wrote:

harp:~ > cat a.rb
(src, glob, dst, ext = ARGV) and (dst ||= ".") and (f = File)
Dir::glob(f.join(src, glob)){|e|
f.rename(e, f.join(dst, f.basename(e) + ext.to_s))}

Thank you.
Jim
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,969
Messages
2,570,161
Members
46,708
Latest member
SherleneF1

Latest Threads

Top