I need a formula parser

D

Daniel Berger

Hi all,

I'm slowly but surely working on the next major release of my
spreadsheet package (a port of John McNamara's Spreadsheet::WriteExcel
Perl module, 2.x).

However, I still do not have a formula parser. John's module uses
Parse::RecDescent to parse formulas. I need something similar (or
better). Preferably something generic that everyone can use and enjoy,
but I'll take a more tailored parser if necessary.

Can anyone please help me?

For more information on Spreadsheet::WriteExcel, see
http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel-2.13/

The new project page is at http://rubyforge.org/projects/spreadsheet.

Regards,

Dan

PS - And, if anyone is up to porting OLE::Storage or OLE::Storage_Lite,
that would be swell.
 
L

Lyndon Samson

Hi all,

I'm slowly but surely working on the next major release of my
spreadsheet package (a port of John McNamara's Spreadsheet::WriteExcel
Perl module, 2.x).

However, I still do not have a formula parser. John's module uses
Parse::RecDescent to parse formulas. I need something similar (or
better). Preferably something generic that everyone can use and enjoy,
but I'll take a more tailored parser if necessary.

Can anyone please help me?

Sounds like another fine RubyQuiz!!!
 
J

James Edward Gray II

Sounds like another fine RubyQuiz!!!

I miss Parse::RecDescent myself, but porting it is not exactly a quiz
size task. Or perhaps I didn't understand what you meant...

James Edward Gray II
 
D

Dmitry V. Sabanin

Hi all,

I'm slowly but surely working on the next major release of my
spreadsheet package (a port of John McNamara's
Spreadsheet::WriteExcel Perl module, 2.x).

However, I still do not have a formula parser. John's module uses
Parse::RecDescent to parse formulas. I need something similar (or
better). Preferably something generic that everyone can use and
enjoy, but I'll take a more tailored parser if necessary.

Can anyone please help me?
Did you check http://rubyforge.org/projects/lxl/ ?
 
L

Lyndon Samson

I miss Parse::RecDescent myself, but porting it is not exactly a quiz
size task. Or perhaps I didn't understand what you meant...
I meant a generic formula parser

ie.
"(1+2)*pow(2,2)", block callback for function execution
"(1+2)*pow($a,2)", passed in hash lookup for variable sub

etc
 
J

James Edward Gray II

I meant a generic formula parser

ie.
"(1+2)*pow(2,2)", block callback for function execution
"(1+2)*pow($a,2)", passed in hash lookup for variable sub

etc

Okay, that makes more sense. Write it up and send it in... :)

James Edward Gray II
 
C

Charles Mills

Daniel said:
Hi all,

I'm slowly but surely working on the next major release of my
spreadsheet package (a port of John McNamara's Spreadsheet::WriteExcel
Perl module, 2.x).

However, I still do not have a formula parser. John's module uses
Parse::RecDescent to parse formulas. I need something similar (or
better). Preferably something generic that everyone can use and enjoy,
but I'll take a more tailored parser if necessary.

Can anyone please help me?

For more information on Spreadsheet::WriteExcel, see
http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel-2.13/

The new project page is at http://rubyforge.org/projects/spreadsheet.

Regards,

Dan

PS - And, if anyone is up to porting OLE::Storage or OLE::Storage_Lite,
that would be swell.

Below is a shameless copy of LXL I wrote a while back. lxl_parse.rb is
a command line app. lxl.y requires RACC.

-Charlie

lxl.y
------------------------------------------------------------------

class LXL::parser

# see http://www.informit.com/articles/article.asp?p=328639&seqNum=2
# for XL operator precidence table
prechigh
left ':'
nonassoc UMINUS PERCENT
left '^' '*' '/'
left '+' '-'
left '&' # concatination
left '=' '<' '>' '<=' '>=' '<>'
preclow

token INTEGER FLOAT STRING SHEET PATH IDENTIFIER
PERCENT UMINUS TRUE FALSE


rule

target
: statement
| statement ';'
;

statement
: formula { result = val[0,1] }
| statement ';' expression { result = val[0] << val[2] }
;

formula
: '=' expression { result = val[1] }
;

base_range
: PATH SHEET IDENTIFIER { result = self.lookup(val[0], val[1],
val[2]) }
| SHEET IDENTIFIER { result = self.lookup(nil, val[0], val[1]) }
| IDENTIFIER { result = self.lookup(nil, nil, val[0]) }
;

range
: base_range { result = val[0].Value }
# | base_range ':' IDENTIFIER
# | base_range ' ' IDENTIFIER
# | base_range ',' IDENTIFIER
;

constant
: TRUE { result = true }
| FALSE { result = false }
| INTEGER { result = val[0].to_f }
| FLOAT { result = val[0].to_f }
| STRING { result = LXL::XL.unquote(val[0]) }
;

expression
: constant
| range
| function_call
# unary operations
| '-' expression = UMINUS
{ result = LXL::XL.uminus(val[1]) }
| '%' expression = PERCENT
{ result = LXL::XL.percent(val[1]) }
# arithmatic
| expression '^' expression
{ result = LXL::XL.power(val[0], val[2]) }
| expression '*' expression
{ result = LXL::XL.multiply(val[0], val[2]) }
| expression '/' expression
{ result = LXL::XL.divide(val[0], val[2]) }
| expression '+' expression
{ result = LXL::XL.add(val[0], val[2]) }
| expression '-' expression
{ result = LXL::XL.subtract(val[0], val[2]) }
# concat
| expression '&' expression
{ result = LXL::XL.concat(val[0], val[2]) }
# comparison
| expression '=' expression
{ result = LXL::XL.eq(val[0], val[2]) }
| expression '<' expression
{ result = LXL::XL.lt(val[0], val[2]) }
| expression '>' expression
{ result = LXL::XL.gt(val[0], val[2]) }
| expression '<=' expression
{ result = LXL::XL.le(val[0], val[2]) }
| expression '>=' expression
{ result = LXL::XL.ge(val[0], val[2]) }
| expression '<>' expression
{ result = LXL::XL.ne(val[0], val[2]) }
;

function_call
: IDENTIFIER '(' args ')' { result = self.call_function(val[0],
*val[2]) }
| IDENTIFIER '(' ')' { result = self.call_function(val[0]) }
;

args
: expression { result = val[0,1] }
| ',' expression { result = [0.0, val[1]] }
| ',' { result = [0.0, 0.0] }
| args ',' expression { result = val[0] << val[2] }
| args ',' { result = val[0] << 0.0 }
;

---- header ----

require 'win32ole'
require 'strscan'

# redefine to_s methods so they behave like XL
class Float
def to_s
if self % 1.0 == 0.0
self.to_i.to_s
else
self.to_s
end
end
end

class True
def to_s
"TRUE"
end
end

class False
def to_s
"FALSE"
end
end

module LXL
class FormulaError < StandardError; end

def LXL.value_error(msg)
raise(LXL::FormulaError, "#VALUE: #{msg}")
end

def LXL.name_error(msg)
raise(LXL::FormulaError, "#NAME: #{msg}")
end

module XL
def XL.to_num(v)
LXL.value_error("#{v.inspect} is not a number") \
if v.is_a? String and v !~ /[0-9]+(?:\.[0-9]*)?/
v.to_f # all Excel numbers are floats
end
def XL.to_bool(v)
LXL.value_error("cannot use #{v.inspect} as a boolean") if v.is_a?
String

if v == true or v == false
v
else
v != 0.0 # non zero values are true
end
end
def XL.uminus(a)
-to_num(a)
end
def XL.percent(a)
to_num(a) / 100.0
end
def XL.concat(a,b)
a.to_s + b.to_s
end
# to arithmetic with Floats
[ ['power', '**'],
['multiply', '*'],
['divide', '/'],
['add', '+'],
['subtract', '-'] ].each do |p|
#puts(%{
self.module_eval(%{
def XL.#{p[0]}(a,b)
to_num(a) #{p[1]} to_num(b)
end
})
end
# do comparison with Strings
[ ['eq', '=='],
['lt', '<'],
['gt', '>'],
['le', '<='],
['ge', '>='],
['ne', '!='] ].each do |p|
self.module_eval(%{
def XL.#{p[0]}(a,b)
a.to_s #{p[1]} b.to_s
end
})
end
def XL.unquote(a)
a = a[1...-1] # remove quotes
a.gsub!('""','"')
a
end
end # XL
end # LXL

---- inner ----

def Parser.open(*args)
v = self.new(*args)
if block_given?
begin
yield(v)
ensure
v.close
end
else
v
end
end

s = [ '<=', '>=', '=', '<>', '<', '>',
'+', '-', '*', '^', '/', '(', ')',
'&', ',', ';' ].collect! do |op|
Regexp.quote(op)
end.join('|')
SYMBOL = /#{s}/

def initialize(base_wkbk=nil, base_sheet=nil)
@xl_functions = {}
self.add_xl_functions
if base_wkbk and base_sheet
@wkbks = {}
@xl = WIN32OLE.new('EXCEL.Application')
begin
# turn off screen updating, etc
#@xl
rescue
warn $!
end
@base_wkbk = @xl.Workbooks.Open(base_wkbk)
@base_sheet = @base_wkbk.Sheets(base_sheet) if base_sheet
end
end

def add_function(name)
name = name.upcase.to_sym unless name.is_a? Symbol
warn "function #{name} already exists, overwriting" if
@xl_functions[name]
@xl_functions[name] = Proc.new
self # return self
end

def add_xl_functions
add_function:)AND) { |a,b| LXL::XL.to_bool(a) and LXL::XL.to_bool(b)
}
add_function:)OR) { |a,b| LXL::XL.to_bool(a) or LXL::XL.to_bool(b) }
add_function:)NOT) { |a| not LXL::XL.to_bool(a) }
add_function:)IF) { |a,b,c| LXL::XL.to_bool(a) ? b : c }
end

def closed?
@xl.nil?
end

def close
if @wkbks
begin
@wkbks.each do |name, wkbk|
wkbk.Close
end .clear
ensure
begin
@base_sheet = nil
@base_wkbk.Close
ensure
@base_wkbk = nil
@xl.Quit
@xl = nil
end
end
end
self # return self
end

def eval(str)
# check for initial '=' sign
return str unless str[0] == ?=
s = StringScanner.new(str)

tokens = []
until s.empty?
case
when m = s.scan(/\s+/)
# ignore whitespace
when m = s.scan(/[0-9]+\.[0-9]+/)
# float
tokens << [:FLOAT, m]
when m = s.scan(/[0-9]+/)
# integer
tokens << [:INTEGER, m]
when m = s.scan(/L?"(?:""|[^"])*"/)
# string
tokens << [:STRING, m]
when m = s.scan(/[_A-Za-z]\w*/)
# identifier
m.upcase!
tokens << \
[ if m == "TRUE"
:TRUE
elsif m == "FALSE"
:FALSE
else
:IDENTIFIER
end, m ]
when m = s.scan(SYMBOL)
tokens << [m, m]
when m = s.scan(/(?:\[([^\]]+)\])?(\w+)!/)
tokens << [:pATH, s[1]] if s[1]
tokens << [:SHEET, s[2]]
else
m = s.getch
raise ParseError, "unrecognized token #{m}"
end
end # until
tokens << [false, false] # end of tokens
yyparse(tokens, :each)
end

protected
def workbook(name)
@wkbks[name] || @wkbks[name] = @xl.Workbooks.Open(name)
end

def lookup(wkbk, sheet, address)
LXL.name_error("no base workbook given cannot resolve #{address}") if
self.closed?

if wkbk
wkbk = self.workbook(wkbk)
else
wkbk = @base_wkbk
end
if sheet
sheet = wkbk.Sheets(sheet)
elsif @base_sheet
sheet = @base_sheet
else
LXL.name_error("no base worksheet given cannot resolve #{address}")
end
sheet.Range(address)
end

def call_function(name, *args)
LXL.name_error("no such function #{name}") \
unless f = @xl_functions[name.to_sym]
f.call(*args)
end

def on_error(error_token_id, error_value, value_stack)
raise ParseError, "parse error on #{error_value.to_s.inspect}"
end


lxl_parse.rb
--------------------------------------------------------------

#!/usr/bin/env ruby

require 'optparse'

# create output directories if they don't exist
def create_dest_dirs(dest_file)
dest_dir = File.dirname(dest_file)
dirs = []
until File.directory?(dest_dir)
# [ "/home/boson/fake", "/home/boson" ]
# "/home/boson" exists so loop ends
dirs << dest_dir
dest_dir = File.dirname(dest_dir)
end
dirs.reverse_each do |d|
Dir.mkdir(d)
end
nil
end

class OptionParser
def error(msg)
raise OptionParser::InvalidArgument, msg
end
end

command =File.basename($0)
if $0 != command
$:.unshift(File.join(File.dirname($0),'..','lib'))
end
require 'lxl'

# initialize with defaults
base_wkbk = nil
base_sheet = nil
dest_file = nil
in_file = nil

opts = OptionParser.new do |opts|
opts.banner = "Usage: #{command} [options] [input formulas]"
opts.separator ""
opts.separator "Specific options:"

opts.on("-o", "--output file", String, \
"Output to file (default STDOUT)") do |file|
opts.error("multiple output files give") if dest_file
opts.error("destination file '#{file}' exists") if File.exists?(file)
dest_file = file
end
opts.on("-i", "--input file", String, \
"Input formulas from file") do |file|
opts.error("multiple input files give") if in_file
opts.error("input file #{file} does not exist") unless
File.exists?(file)
in_file = file
end
opts.on("-w", "--wkbk=(workbook)", String, "Set base workbook") do
|wkbk|
opts.error("#{wkbk} does not exist") unless File.exists?(wkbk)
base_wkbk = wkbk
end
opts.on("-s", "--sheet=(worksheet)", String, "Set base worksheet") do
|sheet|
base_sheet = sheet
end

opts.on("-h", "--help", "Print this message") do
puts opts
exit(0)
end

opts.separator ""
opts.separator "Reads from STDIN if input file is '-'"
opts.separator "Formulas read from a file or from STDIN must be
seperated by a newline"
end

begin
opts.parse!(ARGV)
rescue
warn "#{$!} (-h will show vaild options)"
exit(2)
end

# make sure opts gets gc'ed
opts = nil

if dest_file
begin
create_dest_dirs(dest_file)
rescue
warn "could not create destination directory: #{$?}"
exit(2)
end
end

if in_file == '-'
f = STDIN.readlines
elsif in_file
f = File.open(in_file) { |f| f.readlines }
end

if f
f += ARGV
else
f = ARGV
end

begin
out = LXL::parser.open(base_wkbk, base_sheet) do |xl|
f.collect do |text|
r = xl.eval(text)
r.is_a?(Array) ? r.join('; ') : r
end
end
out = out.join("\n")
if dest_file
File.open(dest_file, 'w') { |f| f.write(out) }
else
$stdout.write(out)
end
rescue ParseError, LXL::FormulaError
warn $!
exit(2)
end
 
G

George Ogata

Joel VanderWerf said:

I have a more complete C parser at cast.rubyforge.org, if it's not too
overkill for you. Here's a demo:

irb(main):001:0> require 'cast/cast'
=> true
irb(main):002:0> C::Expression.parse('sin(x / 180.0 * M_PI)')
=> Call
expr: Variable
name: "sin"
args:
- Multiply
expr1: Divide
expr1: Variable
name: "x"
expr2: FloatLiteral
val: 180.0
expr2: Variable
name: "M_PI"
 

Members online

No members online now.

Forum statistics

Threads
474,171
Messages
2,570,935
Members
47,472
Latest member
KarissaBor

Latest Threads

Top