algorithm, optimization, or other problem?

B

Brian Blais

Hello,

I am trying to translate some Matlab/mex code to Python, for doing neural
simulations. This application is definitely computing-time limited, and I need to
optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x". right now I have it as an
array, and I choose columns from that array randomly. in other cases, I may need to
take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so

y=dot(x,w)

3) modify the weight vector based on a matrix equation, like:

w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant

4) repeat steps 1-3 many times

I've organized it like:

for e in 100: # outer loop
for i in 1000: # inner loop
(steps 1-3)

display things.

so that the bulk of the computation is in the inner loop, and is amenable to
converting to a faster language. This is my issue:

straight python, in the example posted below for 250000 inner-loop steps, takes 20
seconds for each outer-loop step. I tried Pyrex, which should work very fast on such
a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex
file in matlab takes 1.5 seconds per outer-loop step.

Given the huge difference between the Pyrex and the Mex, I feel that there is
something I am doing wrong, because the C-code for both should run comparably.
Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind
coding some in C, but the Python API seemed a bit challenging to me.

One note: I am using the Numeric package, not numpy, only because I want to be able
to use the Enthought version for Windows. I develop on Linux, and haven't had a
chance to see if I can compile numpy using the Enthought Python for Windows.

If there is anything else anyone needs to know, I'll post it. I put the main script,
and a dohebb.pyx code below.


thanks!

Brian Blais

--
-----------------

(e-mail address removed)
http://web.bryant.edu/~bblais




# Main script:

from dohebb import *
import pylab as p
from Numeric import *
from RandomArray import *
import time

x=random((100,1000)) # 1000 input vectors

numpats=x.shape[0]
w=random((numpats,1));

th=random((1,1))

params={}
params['eta']=0.001;
params['tau']=100.0;
old_mx=0;
for e in range(100):

rnd=randint(0,numpats,250000)
t1=time.time()
if 0: # straight python
for i in range(len(rnd)):
pat=rnd
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w=w+params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
else: # pyrex
dohebb(params,w,th,x,rnd)
print time.time()-t1


p.plot(w,'o-')
p.xlabel('weights')
p.show()


#=============================================

# dohebb.pyx

cdef extern from "Numeric/arrayobject.h":

struct PyArray_Descr:
int type_num, elsize
char type

ctypedef class Numeric.ArrayType [object PyArrayObject]:
cdef char *data
cdef int nd
cdef int *dimensions, *strides
cdef object base
cdef PyArray_Descr *descr
cdef int flags


def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):


cdef int num_iterations
cdef int num_inputs
cdef int offset
cdef double *wp,*xp,*thp
cdef int *rndp
cdef double eta,tau

eta=params['eta'] # learning rate
tau=params['tau'] # used for variance estimate

cdef double y
num_iterations=rnd.dimensions[0]
num_inputs=w.dimensions[0]

# get the pointers
wp=<double *>w.data
xp=<double *>X.data
rndp=<int *>rnd.data
thp=<double *>th.data

for it from 0 <= it < num_iterations:

offset=rndp[it]*num_inputs

# calculate the output
y=0.0
for i from 0 <= i < num_inputs:
y=y+wp*xp[i+offset]

# change in the weights
for i from 0 <= i < num_inputs:
wp=wp+eta*(y*xp[i+offset] - y*y*wp)

# estimate the variance
thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
 
G

Gerard Flanagan

Brian said:
Hello,

I am trying to translate some Matlab/mex code to Python, for doing neural
simulations. This application is definitely computing-time limited, and I need to
optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x". right now I have it as an
array, and I choose columns from that array randomly. in other cases, I may need to
take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so

y=dot(x,w)

3) modify the weight vector based on a matrix equation, like:

w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant

4) repeat steps 1-3 many times


Brian

ETA = 0.001
INV_TAU = 1.0/100.0

rather than:
params={}
params['eta']=0.001;
params['tau']=100.0;

ie. take the dictionary lookup (and the repeated division) out?

If you have a collection of random vectors do you gain anything by
*choosing* them randomly?

I'm not exactly sure if it's equivalent to your existing code, but how
about:

x=random((100,250000))
w=random((100,1));
th=random((1,1))
for vector in x:
...
w=w+ ETA * (y*vector - y**2*w)
th = th + INV_TAU *(y**2-th)
...

? (I'm not familar with Numeric)

Gerard
old_mx=0;
for e in range(100):

rnd=randint(0,numpats,250000)
t1=time.time()
if 0: # straight python
for i in range(len(rnd)):
pat=rnd
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w=w+params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
else: # pyrex
dohebb(params,w,th,x,rnd)
print time.time()-t1
 
G

Gerard Flanagan

Gerard said:
Brian said:
Hello,

I am trying to translate some Matlab/mex code to Python, for doing neural
simulations. This application is definitely computing-time limited, and I need to
optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x". right now I have it as an
array, and I choose columns from that array randomly. in other cases, I may need to
take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so

y=dot(x,w)

3) modify the weight vector based on a matrix equation, like:

w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant

4) repeat steps 1-3 many times


Brian

ETA = 0.001
INV_TAU = 1.0/100.0

rather than:
params={}
params['eta']=0.001;
params['tau']=100.0;

or:

ETA = params['eta']
INV_TAU = 1.0/params['tau']


Gerard
ie. take the dictionary lookup (and the repeated division) out?

If you have a collection of random vectors do you gain anything by
*choosing* them randomly?

I'm not exactly sure if it's equivalent to your existing code, but how
about:

x=random((100,250000))
w=random((100,1));
th=random((1,1))
for vector in x:
...
w=w+ ETA * (y*vector - y**2*w)
th = th + INV_TAU *(y**2-th)
...

? (I'm not familar with Numeric)

Gerard
old_mx=0;
for e in range(100):

rnd=randint(0,numpats,250000)
t1=time.time()
if 0: # straight python
for i in range(len(rnd)):
pat=rnd
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w=w+params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
else: # pyrex
dohebb(params,w,th,x,rnd)
print time.time()-t1
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,154
Members
46,702
Latest member
LukasConde

Latest Threads

Top