Making a simple parser


Felipe Balbi

Hi all,

To automate some of the tests I have to run, I decided to use Ruby to
generate some script files in a particular (very simple) language, based
on several possible input files. My first approach was to use the
inherited hook on a parent class (which I called InputFormat) to hold
all the children in an array. Then each format could become a
child class of InputFormat and return a known data format (I decided to
use an array of hashes because the output is really simple) to
the output generator code.

So the idea is something like:

class InputFormat
  @children = []

  def initialize(input)
    @input = input
  end

  def parse
    @children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    @children << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end


Then on the core file I would have something like:

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ? Any
tips you guys could give me to achieve what I want ? (from several
possible input formats generate one output format)
 

7stud --

Felipe Balbi wrote in post #993252:
As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ?

No:

1)
class A
  def self.inherited(child)
    puts 'inherited called'
  end
end

class B < A
end

--output:--
inherited called


2)
class A
  def self.inherited(child)
    puts 'inherited called'
  end
end

B = Class.new(A)

--output:--
inherited called
 

jake kaiden

hi felipe,

well, i don't know how many possible input types you have, but if
they're not too very many, you could try a very different approach:

## for this test i created three files, "input.eng," "input.esp," and
"input.fr," each with a few lines of random text...

class Parser
  attr_reader :output

  def initialize(inputfile)
    @output = [] ## this can of course be changed to what best suits your purposes
    @oktypes = %w[.eng .esp .fr]
    checkType(inputfile)
  end

  ## make sure the file exists and has an extension we know how to parse
  def checkType(file)
    if File.exist?(file)
      if @oktypes.include?(File.extname(file).downcase)
        parseInput(file)
      else
        puts "Unrecognized File Type"
      end
    else
      puts "File Not Found"
    end
  end

  ## pick the parsing method that matches the file extension
  def parseInput(file)
    case File.extname(file).downcase
    when ".eng"
      engParse(file)
    when ".esp"
      espParse(file)
    when ".fr"
      frParse(file)
    end
  end

  ## read the file into @data, one chomped line per element
  def loadData(inputfile)
    @data = File.readlines(inputfile).map { |line| line.chomp }
  end

  ## here's where you do whatever parsing you need to, my examples are
  ## dumb... but the important thing is that you end up with @output

  def engParse(file)
    loadData(file)
    @data.each { |line| @output << line.reverse }
  end

  def espParse(file)
    loadData(file)
    @data.each { |line| @output << line.upcase }
  end

  def frParse(file)
    loadData(file)
    @data.each { |line| @output << line.upcase.reverse }
  end

end #class


test = Parser.new("input.esp")
puts test.output

this may be WAY too simple for what you're trying to do, but hey,
maybe not! ;)

- j
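
For what it's worth, the extension-based dispatch above could also be written as a lookup table instead of a case statement. This is only a sketch: the PARSERS constant and the parse_lines method are made-up names, and it works on an array of lines rather than reading a file, to keep it short. The three transformations are the same toy ones as in the Parser class.

```ruby
# Hypothetical table: file extension => transformation applied to each line.
PARSERS = {
  ".eng" => lambda { |line| line.reverse },
  ".esp" => lambda { |line| line.upcase },
  ".fr"  => lambda { |line| line.upcase.reverse }
}

# Returns the transformed lines, or nil when the extension is not in the table.
def parse_lines(filename, lines)
  parser = PARSERS[File.extname(filename).downcase]
  return nil unless parser
  lines.map { |line| parser.call(line.chomp) }
end

p parse_lines("input.esp", ["hola\n", "mundo\n"])   # ["HOLA", "MUNDO"]
p parse_lines("input.txt", ["x"])                   # nil
```

Adding a format then means adding one entry to the table, although the table itself still lives in the core file.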
 

Felipe Balbi

Hi Jake,

jake kaiden wrote in post #993355:
[...]

Initially I thought about taking this approach, but frankly I don't
know how many input files I will have, so I wanted an
approach where I don't need to touch the core classes or
any core file. I wanted changes to be local to the place where they
are necessary. I mean, when I want to add another input format,
all I would have to do is create a new class and the code
would just work.

With this approach, I would have to keep on adding more and more
methods for doing the actual parsing of different formats, and what
I wanted was to offload that to another class without touching the
caller code.

Oh well, I'll keep on trying. I'm sure there's some pattern for doing
just that, maybe I just didn't implement correctly :p
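
One pattern that might fit this goal is to keep each format in its own file and have the core code load every file in a directory, so the inherited hook registers whatever it finds. The following is a rough sketch under assumptions: load_formats and BInputFormat are hypothetical names, and a temporary directory stands in for a real formats folder so that the example runs on its own.

```ruby
require "tmpdir"

class InputFormat
  @children = []

  class << self
    attr_reader :children
  end

  # Called by Ruby as soon as a subclass is defined.
  def self.inherited(child)
    super
    InputFormat.children << child
  end
end

# Hypothetical helper: load every .rb file in dir. Each file only has to
# define a subclass of InputFormat to get registered.
def load_formats(dir)
  Dir.glob(File.join(dir, "*.rb")).sort.each { |path| load path }
end

# Throwaway directory standing in for a real formats folder.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "b_format.rb"),
             "class BInputFormat < InputFormat; end\n")
  load_formats(dir)
end

p InputFormat.children   # [BInputFormat]
```

With this layout, adding a format would mean dropping one new file into the directory and leaving the core file alone.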
 

jake kaiden

Felipe Balbi wrote in post #993383:
Hi Jake,
I want to add another input format
all I would have to do would be to create a new class and the code
would just work.

With this approach, I would have to keep on adding more and more
methods for doing the actual parsing of different formats and what
I wanted was to offload that to another class without touching the
caller code.

a very good point - and really what inheritance is for. good luck
with a solution, i (and i imagine those who read this post) will keep
playing with the idea...

hasta otro...

-j
 

Jesús Gabriel y Galán

Felipe Balbi wrote in post #993252:
Hi all,

To automate some of the tests I have to run, I decided to use ruby to
generate some script files on a particular (very simple) language based
on several possible input files. My first approach at this was to use
inherited method on a parent class (which I called InputFormat) to hold
all the children in an array. Then, different formats could become a
child class of InputFormat and return a known data format (I decided to
use an array of hashes because the output is really really simple) to
the output generator code.

So the idea is something like:

class InputFormat
  @children = []

  def initialize(input)
    @input = input
  end

  def parse
    @children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    @children << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end


Then on the core file I would have something like:

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ? Any
tips you guys could give me to achieve what I want ? (from several
possible input formats generate one output format)

The above doesn't work, because the @children inside the instance
method "parse" is not the same as the @children inside the class
method "self.inherited". You have to give access to the class instance
variable, and then use that one from the parse method (then you will
see the next problem):

class InputFormat
  class << self
    attr_accessor :children
  end

  def initialize(input)
    @input = input
  end

  def parse
    self.class.children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    (@children ||= []) << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

ruby-1.8.7-p334 :028 > input = InputFormat.new("test")
 => #<InputFormat:0xb738bf50 @input="test">
ruby-1.8.7-p334 :029 > input.parse
NoMethodError: undefined method `supported?' for AInputFormat:Class
  from (irb):11:in `parse'
  from (irb):11:in `each'
  from (irb):11:in `parse'
  from (irb):29

The next problem, as you see, is that you are defining instance
methods in the subclasses, but are calling them on the class. Maybe
the methods parse and supported? in the children could be class
methods, or maybe what you store in @children could be an instance of
the class.

Jesus.
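
To make the first point concrete, here is a small sketch of how an @children set in the class body differs from the @children seen inside an instance method. The names Registry, Sub and children_seen_by_instance are invented for the illustration and are not part of the original code.

```ruby
class Registry
  @children = []              # belongs to the Registry class object itself

  def self.children           # class-level reader for that variable
    @children
  end

  def self.inherited(child)
    super
    Registry.children << child
  end

  def children_seen_by_instance
    @children                 # a separate, never-assigned variable on the instance
  end
end

class Sub < Registry; end

p Registry.children                        # [Sub]
p Registry.new.children_seen_by_instance   # nil
```

Calling each on that nil is what makes the original parse method fail before it ever reaches a child.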
 

7stud --

I'm pretty unclear about what you are trying to do, but maybe this will
help:

class InputFormat
  @children = []

  def self.children
    @children
  end

  def initialize(input)
    @input = input
  end

  def parse
    InputFormat.children.each { |child|
      child.parse(@input) if child.supported?
    }
  end

  def self.inherited(sub_class)
    @children << sub_class.new('dummy')
  end
end



class InputFormatA < InputFormat
  def supported?
    true
  end

  def parse(str)
    puts "InputFormatA is parsing #{str}"
  end
end

class InputFormatB < InputFormat
  def supported?
    true
  end

  def parse(str)
    puts "InputFormatB is parsing #{str}"
  end
end



input = InputFormat.new('hello world')
input.parse

--output:--
InputFormatA is parsing hello world
InputFormatB is parsing hello world


Note that when inherited() is called, the methods of the subclass are
not defined yet, so the inherited initialize() is called.
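
The timing described in that note can be checked with a short sketch like the one below (Base, Child and the seen hash are made-up names, and the symbols in the output assume Ruby 1.9 or later). The hook records which instance methods the subclass defines at the moment inherited() runs.

```ruby
class Base
  def self.inherited(sub_class)
    super
    # The body of the subclass has not been evaluated yet at this point.
    (@seen ||= {})[sub_class] = sub_class.instance_methods(false)
  end

  def self.seen
    @seen
  end
end

class Child < Base
  def parse; end
end

p Base.seen[Child]                # []
p Child.instance_methods(false)   # [:parse]
```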
 

7stud --

7stud -- wrote in post #993588:
Note that when inherited() is called, the methods of the subclass are
not defined yet, so if you create objects of the subclass inside
inherited(), the initialize() method in the parent is called.

And you can get around that problem by letting InputFormat#parse create
the objects:

class InputFormat
  @children = []

  def self.children
    @children
  end

  def initialize(input)
    @input = input
  end

  def parse
    InputFormat.children.each { |child|
      instance = child.new
      instance.parse(@input) if instance.supported?
    }
  end

  def self.inherited(sub_class)
    @children << sub_class
  end
end



class InputFormatA < InputFormat
  def initialize
    puts "Initializing instance of #{self.class}"
  end

  def supported?
    true
  end

  def parse(str)
    puts "InputFormatA is parsing #{str}"
  end
end

class InputFormatB < InputFormat
  def initialize
    puts "Initializing instance of #{self.class}"
  end

  def supported?
    true
  end

  def parse(str)
    puts "InputFormatB is parsing #{str}"
  end
end



input = InputFormat.new('hello world')
input.parse

--output:--
Initializing instance of InputFormatA
InputFormatA is parsing hello world
Initializing instance of InputFormatB
InputFormatB is parsing hello world
 

Felipe Balbi

Hi,

Jesús Gabriel y Galán wrote in post #993452:

The above doesn't work, because the @children inside the instance
method "parse" is not the same as the @children inside the class
method "self.inherited". You have to give access to the class instance
variable, and then use that one from the parse method (then you will
see the next problem):

aaa, you're right :) Good point.

[...]

The next problem, as you see, is that you are defining instance
methods in the subclasses, but are calling them on the class. Maybe
the methods parse and supported? in the children could be class
methods, or maybe what you store in @children could be an instance of
the class.

I'm not instantiating AInputFormat in any part of the code... so making those
class methods is the way to go for me :) Thanks for the tip :)

--
balbi
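
For readers following along, the class-method variant settled on here might look roughly like this. It is a sketch and not the code that was actually used: CsvInputFormat, its extension check and the hash it returns are hypothetical, and the dispatcher hands the input to the first format that claims it instead of looping over every child.

```ruby
class InputFormat
  @children = []

  class << self
    attr_reader :children
  end

  def self.inherited(child)
    super
    InputFormat.children << child
  end

  def initialize(input)
    @input = input
  end

  # Hand the input to the first registered format that supports it.
  def parse
    format = InputFormat.children.find { |child| child.supported?(@input) }
    raise ArgumentError, "no registered format supports #{@input.inspect}" unless format
    format.parse(@input)
  end
end

# Hypothetical format: the .csv check and the returned data are placeholders.
class CsvInputFormat < InputFormat
  def self.supported?(input)
    File.extname(input) == ".csv"
  end

  def self.parse(input)
    [{ :source => input, :format => "csv" }]
  end
end

p InputFormat.new("data.csv").parse
```

Because supported? and parse are class methods, the registry can hold the classes themselves and nothing has to be instantiated inside inherited().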

 
