Nathan Harmston
Hi,
I know this isn't the pyparsing list, but there doesn't seem to be
one. I'm trying to use pyparsing to parse a file, but I can't get
the Optional expression to work. My file generally looks like this:
ALIGNMENT 1020 YS2-10a02.q1k chr09 1295 42 141045
142297 C 1254 95.06 1295 reject_bad_break 0
or this:
ALIGNMENT 36 YS2-10a08.q1k chrm 208 165 10745
10788 C 44 95.45 593 reject_low 10,14
and my grammar works well for these lines. However, sometimes a row looks like this:
ALIGNMENT 53 YS2-10b03.p1k chr12 180 125 1067465
1067520 C 56 98.21 532|5,2 reject_low 25
So I try to parse the 532 (and the optional part after the "|") using:

from pyparsing import *

integer = Word( nums )
# named float_num rather than float, so the built-in float is not shadowed
float_num = Word( nums+"." )
identifier = Word( alphanums+"-_." )
alignment = Literal("ALIGNMENT ").suppress()
row_1 = integer.setResultsName("row_1")#.setParseAction(make_int)
src_id = identifier.setResultsName("src_id")
dest_id = identifier.setResultsName("dest_id")
src_start = integer.setResultsName("src_start")#.setParseAction(make_int)
src_stop = integer.setResultsName("src_stop")#.setParseAction(make_int)
dest_start = integer.setResultsName("dest_start")#.setParseAction(make_int)
dest_stop = integer.setResultsName("dest_stop")#.setParseAction(make_int)
row_8 = oneOf("F C").setResultsName("row_8")
length = integer.setResultsName("length")#.setParseAction(make_int)
percent_id = float_num.setResultsName("percent_id")#.setParseAction(make_float)
# an integer, optionally followed by "|" and a comma-separated list
row_11 = ( integer + Optional(Literal("|") + commaSeparatedList ) )#.setResultsName("row_11")#.setParseAction(make_int)
result = Word(alphas+"_").setResultsName("result")
row_13 = commaSeparatedList.setResultsName("row_13")

def make_alilines_status_parser():
    return ( alignment + row_1 + src_id + dest_id + src_start + src_stop
             + dest_start + dest_stop + row_8 + length + percent_id + row_11
             + result + row_13 )

def parse_alilines_status(ifile):
    alilines = make_alilines_status_parser()
    for l in ifile:
        yield alilines.parseString( l )
However, my parser always fails on lines of the third type. Does anyone
know why the Optional part is not working?
Many thanks in advance,
Nathan
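
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the items matched by commaSeparatedList are not limited to digits and may contain embedded spaces, so after the "|" it could consume "5,2 reject_low 25" in one go and leave nothing for the result field. The sketch below replaces it with an explicit list of integers for the tail of the row only. The names int_list, tail and row_11_extra are hypothetical and do not come from the grammar above.

```python
from pyparsing import Group, Optional, Suppress, Word, ZeroOrMore, alphas, nums

integer = Word(nums)

# Comma-separated integers only, so the list cannot run on into later fields.
int_list = integer + ZeroOrMore(Suppress(",") + integer)

# An integer, optionally followed by "|" and a list of integers.
row_11 = integer("row_11") + Optional(Suppress("|") + Group(int_list)("row_11_extra"))
result = Word(alphas + "_")("result")
row_13 = Group(int_list)("row_13")

# Only the last three fields of a row, to keep the example short.
tail = row_11 + result + row_13

with_bar = tail.parseString("532|5,2 reject_low 25")
without_bar = tail.parseString("1295 reject_bad_break 0")

print(with_bar.asList())
print(without_bar.asList())
```

If this assumption holds, the same int_list could stand in for commaSeparatedList in the full grammar.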