O
Ognen Duzlevski
Hi all,
I have rewritten a C program to solve a bioinformatics problem. Portion where most of the time is spent is:
def DynAlign(scoremat,insertmat,delmat,tseq,qseq,tlen,qlen):
global ONEINDELPENALTY,OPENGAPPENALTY
for ndx1 in range(1,tlen+1):
for ndx2 in range(1,qlen+1):
delmat[ndx1][ndx2] = Min(delmat[ndx1-1][ndx2]+ONEINDELPENALTY, \
Min(scoremat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY, \
insertmat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY))
insertmat[ndx1][ndx2] = Min(insertmat[ndx1][ndx2-1]+ONEINDELPENALTY, \
Min(scoremat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY, \
delmat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY))
scoremat[ndx1][ndx2] = Min(scoremat[ndx1-1][ndx2-1], \
Min(delmat[ndx1-1][ndx2-1], insertmat[ndx1-1][ndx2-1])) + \
GetFitnessScore(tseq,ndx1-1,qseq,ndx2-1)
def Min(a,b):
if a< b:
return a
else:
return b
In C this is not a big deal, delmat, scoremat and insertmat are int matrices dynamically allocated and the
loop-inside-a-loop is pretty fast. However, on python (2.3) it is very slow. So for comparison, my C version takes 8
seconds to execute and the python version takes around 730 seconds. I have narrowed down through visual observation
(print before and print after entry into the above nested loop) that most of the time is spent in there. I have also
tried the Numeric package with their arrays. It speeded things up somewhat but not considerably.
Any suggestions?
Thank you,
Ognen
I have rewritten a C program to solve a bioinformatics problem. Portion where most of the time is spent is:
def DynAlign(scoremat,insertmat,delmat,tseq,qseq,tlen,qlen):
global ONEINDELPENALTY,OPENGAPPENALTY
for ndx1 in range(1,tlen+1):
for ndx2 in range(1,qlen+1):
delmat[ndx1][ndx2] = Min(delmat[ndx1-1][ndx2]+ONEINDELPENALTY, \
Min(scoremat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY, \
insertmat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY))
insertmat[ndx1][ndx2] = Min(insertmat[ndx1][ndx2-1]+ONEINDELPENALTY, \
Min(scoremat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY, \
delmat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY))
scoremat[ndx1][ndx2] = Min(scoremat[ndx1-1][ndx2-1], \
Min(delmat[ndx1-1][ndx2-1], insertmat[ndx1-1][ndx2-1])) + \
GetFitnessScore(tseq,ndx1-1,qseq,ndx2-1)
def Min(a,b):
if a< b:
return a
else:
return b
In C this is not a big deal, delmat, scoremat and insertmat are int matrices dynamically allocated and the
loop-inside-a-loop is pretty fast. However, on python (2.3) it is very slow. So for comparison, my C version takes 8
seconds to execute and the python version takes around 730 seconds. I have narrowed down through visual observation
(print before and print after entry into the above nested loop) that most of the time is spent in there. I have also
tried the Numeric package with their arrays. It speeded things up somewhat but not considerably.
Any suggestions?
Thank you,
Ognen