James Stroud
Hello,
I have strings represented as a combination of an alphabet (AGCT) and an
operator "/", that signifies degeneracy. I want to split these strings into
lists of lists, where the degeneracies are members of the same list and
non-degenerates are members of single item lists. An example will clarify
this:
"ATT/GATA/G"
gets split to
[['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
I have written a very ugly function to do this (listed below for the curious),
but intuitively I think this should only take a couple of lines for one
skilled in regex and/or listcomp. Any takers?
James
p.s. Here is the ugly function I wrote:
def build_consensus(astr):
    consensus = []      # the lol that will be returned
    possibilities = []  # one element of consensus
    consecutives = 0    # keeps track of how many in a row
    for achar in astr:
        if (achar == "/"):
            consecutives = 0
            continue
        else:
            consecutives += 1
        if (consecutives > 1):
            consensus.append(possibilities)
            possibilities = [achar]
        else:
            possibilities.append(achar)
    if possibilities:
        consensus.append(possibilities)
    return consensus
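
For comparison, here is a rough sketch of what the regex-plus-listcomp version might look like. The name split_degenerate and the pattern are illustrative assumptions, not part of the function above. It assumes the input contains only A, G, C, T and "/", and that every "/" sits between two bases.

```python
import re

def split_degenerate(astr):
    # Hypothetical helper: each token is one base followed by zero or
    # more "/base" alternatives, e.g. "T/G" or a lone "A".
    tokens = re.findall(r"[AGCT](?:/[AGCT])*", astr)
    # Splitting each token on "/" yields the inner list of possibilities.
    return [token.split("/") for token in tokens]

print(split_degenerate("ATT/GATA/G"))
```

Unlike the loop version, this sketch silently skips any character outside AGCT, so malformed input would need a separate check.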
--
James Stroud, Ph.D.
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095
http://www.jamesstroud.com/