MooMaster
I'm trying to develop a little script that does some string
manipulation. I have a few hundred strings that currently look like
this:
cond(a,b,c)
and I want them to look like this:
cond(c,a,b)
but it gets a little more complicated because the conds themselves may
have conds within, like the following:
cond(0,cond(c,cond(e,cond(g,h,(a<f)),(a<d)),(a<b)),(a<1))
What I want to do in this case is move the last parameter to the front
and then work backwards all the way out (if you're thinking recursion
too, I'm vindicated) so that it ends up looking like this:
cond((a<1), 0, cond((a<b),c,cond((a<d), e, cond((a<f), g, h))))
Furthermore, the conds may be multiplied by an expression, such as the
following:
cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
Here, all I want to do is switch the parameters of the conds without
touching the expression, like so:
cond(f,-1,1)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
So that's the gist of my problem statement. I immediately thought that
regular expressions would provide an elegant solution. I would go
through the string cond by cond, stripping each "cond" and its
parentheses off until I got to the innermost level, then move the
parameters and work backwards. That thought process became this:
-------------------------------------CODE--------------------------------------------------------
import re

def swap(left, middle, right):
    left = left.replace("(", "")
    right = right.replace(")", "")
    temp = left
    left = right
    right = temp
    temp = middle
    middle = right
    right = temp
    whole = 'cond(' + left + ',' + middle + ',' + right + ')'
    return whole

def condReplacer(string):
    #regex = re.compile(r'cond\(.*,.*,.+\)')
    regex = re.compile(r'cond\(.*,.*,.+?\)')
    if not regex.search(string):
        print("whole string is: " + string)
        [left, middle, right] = string.split(',')
        right = right.replace('\'', ' ')
        string = swap(left.strip(), middle.strip(), right.strip())
        print("the new string is:" + string)
        return string
    else:
        more_conds = regex.search(string)
        temp_string = more_conds.group()
        firstParen = temp_string.find('(')
        temp_string = temp_string[firstParen:]
        print("there are more conditionals!" + temp_string)
        condReplacer(temp_string)

def lineReader(file):
    for line in file:
        regex = r'cond\(.*,.*,.+\)?'
        if re.search(regex,line,re.DOTALL):
            condReplacer(line)

if __name__ == "__main__":
    input_file = open("only_conds2.txt", 'r')
    lineReader(input_file)
-------------------------------------CODE--------------------------------------------------------
I think my problem lies in my regular expression. If I use the one
that is commented out, the search is greedy, and in my test case where
a conditional is multiplied by an expression, the match swallows the
expression too, like so:
INPUT:
cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
OUTPUT:
whole string is: (-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
the new string is:cond(f*((float(e*(2**4+(float(d*8+(float(c*4+(float(b*2+float(a,-1,1)
when all I really want to do is grab the part associated with the cond.
If I do a non-greedy search instead, I avoid that problem, but the match
stops too early (at the first closing parenthesis) when I have an
expression like this:
INPUT:
cond(a,b,(abs(c) >= d))
OUTPUT:
whole string is: (a,b,(abs(c)
the new string is:cond((abs(c,a,b)
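To show the two failure modes in isolation, here is a minimal sketch that runs both patterns from the script above against both inputs. The variable names (greedy, lazy, with_expr, nested_parens) are mine and only for illustration; the patterns and inputs are the ones from this post.

```python
import re

# The two patterns from the script: the commented-out greedy one
# and the non-greedy one currently in use.
greedy = re.compile(r'cond\(.*,.*,.+\)')
lazy = re.compile(r'cond\(.*,.*,.+?\)')

# The two test inputs from this post.
with_expr = ('cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)'
             '+(float(c)*4)+(float(b)*2)+float(a))')
nested_parens = 'cond(a,b,(abs(c) >= d))'

# Print what each pattern actually matches on each input.
for name, regex in (('greedy', greedy), ('lazy', lazy)):
    for text in (with_expr, nested_parens):
        print(name + ' matched: ' + regex.search(text).group())
```

The greedy pattern runs to the last ")" on the line, so it takes the multiplied expression along with it. The non-greedy one stops at the first ")" after the second comma, which cuts off (abs(c) >= d). Neither pattern counts parentheses, so neither can tell which ")" actually closes the cond.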
Can anyone help me with the regular expression? Is this even the best
approach to take? Anyone have any thoughts?
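For comparison, here is a rough sketch of one non-regex alternative: count parenthesis depth to find the ")" that closes each cond, split its arguments only on top-level commas, and recurse into each argument. The helper names (find_close, split_args, rotate_conds) are hypothetical. The sketch assumes every cond has exactly three arguments, that parentheses are balanced, and that no other identifier ends in "cond". It is not a tested solution for my real data.

```python
def find_close(text, open_idx):
    """Return the index of the ')' that matches the '(' at open_idx."""
    depth = 0
    for i in range(open_idx, len(text)):
        if text[i] == '(':
            depth += 1
        elif text[i] == ')':
            depth -= 1
            if depth == 0:
                return i
    raise ValueError('unbalanced parentheses in %r' % text)


def split_args(body):
    """Split body on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch == ',' and depth == 0:
            args.append(body[start:i].strip())
            start = i + 1
    args.append(body[start:].strip())
    return args


def rotate_conds(text):
    """Rewrite every cond(a,b,c) in text as cond(c,a,b), nested ones too."""
    out = []
    pos = 0
    while True:
        # Assumption: 'cond(' never appears as the tail of another name.
        hit = text.find('cond(', pos)
        if hit == -1:
            out.append(text[pos:])  # keep everything after the last cond
            return ''.join(out)
        open_idx = hit + len('cond')
        close_idx = find_close(text, open_idx)
        # Recurse so that conds inside the arguments are rotated first.
        args = [rotate_conds(arg)
                for arg in split_args(text[open_idx + 1:close_idx])]
        if len(args) != 3:
            raise ValueError('expected 3 arguments, got %d' % len(args))
        out.append(text[pos:hit])  # text before this cond is left untouched
        out.append('cond(%s,%s,%s)' % (args[2], args[0], args[1]))
        pos = close_idx + 1


print(rotate_conds('cond(a,b,(abs(c) >= d))'))
```

Text outside a cond, such as a multiplied expression, is copied through unchanged. The output has no spaces after the commas, so it differs from my hand-written target strings only in whitespace.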
Thanks for your time!