memory management


Sudheer Joseph

Hi,
I have been trying to compute the cross correlation between a time series at a location f(1) and the time series of spatial data f(X,Y,T), saving the resulting correlation coefficients and lags in a fairly big 3-dimensional array. The code works for a few iterations, then hangs due to an apparent memory crunch. Can anybody suggest a better way to handle this situation so that the computation and data storage can be done without hangups? Finally, I intend to save the data as a netcdf file, which is not implemented as of now. Below is the piece of code I wrote for this purpose.

from mpl_toolkits.basemap import Basemap as bm, shiftgrid, cm
import numpy as np
import matplotlib.pyplot as plt
from netCDF4 import Dataset
from math import pow, sqrt
import sys
from scipy.stats import t
indep = 120
nlags = 365
ncin = Dataset('qu_ru.nc', 'r')
lons = ncin.variables['LON421_600'][:]
lats = ncin.variables['LAT81_220'][:]
dep = ncin.variables['DEPTH1_29'][:]
adep = (dep == indep).nonzero()
didx = int(adep[0])
qu = ncin.variables['qu'][:, :, :]
#qv = ncin.variables['QV'][0,:,:]
ru = ncin.variables['ru'][:, didx, 0, 0]
ncin.close()
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
# use major and minor sphere radii from WGS84 ellipsoid.
m = bm(projection='cyl', llcrnrlon=30, llcrnrlat=-40, urcrnrlon=120, urcrnrlat=30)
# transform to nx x ny regularly spaced 5km native projection grid
nx = int((m.xmax - m.xmin)) + 1; ny = int((m.ymax - m.ymin) + 1)
q = ru[1:2190]
qmean = np.mean(q)
qstd = np.std(q)
qnorm = (q - qmean) / qstd
# Preallocate float output arrays. The original np.arange(...).reshape(...)
# gives integer arrays, which silently truncate the correlation
# coefficients when they are stored.
lags3d = np.empty((2 * nlags + 1, 140, 180))
r3d = np.empty((2 * nlags + 1, 140, 180))
for i in range(len(lons)):
    for j in range(len(lats)):
        print i, j
        p = qu[1:2190, j, i].squeeze()
        pmean = np.mean(p)
        pstd = np.std(p)
        pnorm = (p - pmean) / pstd
        n = len(p)
        c = plt.xcorr(p, q, usevlines=True, maxlags=nlags, normed=True, lw=2)
        acp = plt.acorr(p, usevlines=True, maxlags=nlags, normed=True, lw=2)
        acq = plt.acorr(q, usevlines=True, maxlags=nlags, normed=True, lw=2)
        acp[1][nlags] = 0
        acq[1][nlags] = 0
        lags = c[0]
        r = c[1]
        lags3d[:, j, i] = lags
        r3d[:, j, i] = r
        # xcorr/acorr draw vertical-line artists into the axes on every
        # call; clear them each iteration or the figure grows without bound.
        ax.cla()
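A likely culprit for the memory growth is that plt.xcorr and plt.acorr draw vertical-line artists into the current axes on every call, so the 140*180 iterations accumulate millions of matplotlib objects. Since only the numbers are kept, a matplotlib-free version of the same normalized cross-correlation could look like this (a sketch; xcorr_normed is a made-up helper name, and it mirrors what plt.xcorr computes with normed=True):

```python
import numpy as np

def xcorr_normed(p, q, maxlags):
    """Normalized cross-correlation, same values plt.xcorr(normed=True)
    returns, but without creating any matplotlib artists."""
    n = len(p)
    c = np.correlate(p, q, mode='full')           # length 2n-1, zero lag at n-1
    c = c / np.sqrt(np.dot(p, p) * np.dot(q, q))  # the normed=True scaling
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, c[n - 1 - maxlags: n + maxlags]  # keep +/- maxlags around zero
```

Calling this in the inner loop keeps the working set at a handful of 1-d arrays per iteration.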
 

Dave Angel

HI,
I have been trying to compute cross correlation between a time series at a location f(1) and the timeseries of spatial data f(XYT) and saving the resulting correlation coefficients and lags in a 3 dimensional array which is of fairly big size. Though the code I made for this purpose works up to few iterations then it hangs due to apparent memory crunch. Can anybody suggest a better way to handle this situation so that the computation and data storing can be done with out hangups. Finally I intend to save the data as netcdf file which is not implemented as of now. Below is the piece of code I wrote for this purpose.

Python version and OS please. And is the Python 32-bit or 64-bit? How
much RAM does the computer have, and how big are the swapfiles?

"Fairly big" is fairly vague. To some people, a list with 100k members
is huge, but not to a modern computer.

How have you checked whether it's running out of memory? Have you run
'top' on it? Or is that just a guess?

I haven't used numpy, scipy, nor matplotlib, and it's been a long time
since I did correlations. But are you sure you're not just implementing
an O(n**3) algorithm or something, and it's just extremely slow?

from mpl_toolkits.basemap import Basemap as bm, shiftgrid, cm
import numpy as np
import matplotlib.pyplot as plt
from netCDF4 import Dataset
from math import pow, sqrt
import sys
from scipy.stats import t

<snip>
 

Sudheer Joseph

Python version and OS please. And is the Python 32-bit or 64-bit? How
much RAM does the computer have, and how big are the swapfiles?

Python 2.7.3
Ubuntu 12.04, 64-bit
4 GB RAM
"Fairly big" is fairly vague. To some people, a list with 100k members
is huge, but not to a modern computer.

I have data loaded into memory from a netcdf file of 2091*140*180 grid points (2091 time steps, 140 latitudes, 180 longitudes). Apart from this I define two 3-d arrays, r3d and lags3d, to store the output for writing to a netcdf file after completion.
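For scale, the arrays named here are nowhere near 3 GB on their own, which points at growth somewhere else (e.g. objects accumulating per loop iteration). A quick back-of-the-envelope check, assuming float64 (8-byte) storage:

```python
GIB = 1024.0 ** 3

qu_bytes = 2091 * 140 * 180 * 8      # the input field from the netcdf file
out_bytes = 2 * 731 * 140 * 180 * 8  # r3d and lags3d together

print("input:  %.2f GiB" % (qu_bytes / GIB))
print("output: %.2f GiB" % (out_bytes / GIB))
# together well under 1 GiB, so the arrays alone cannot explain 3 GB
```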
How have you checked whether it's running out of memory? Have you run
'top' on it? Or is that just a guess?

I have not done this, but the progress (assessed from the printed grid indices i and j) stops after j=6, i.e. after running 6 longitude grids. I will check top as you suggested.

Here is the result of top; it used about 3 GB of memory:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3069 sjo 20 0 3636m 3.0g 2504 D 3 78.7 3:07.44 python
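If RAM stays tight even after any leak is fixed, one option (a sketch, with made-up filenames) is to back the two output arrays with files on disk via numpy's memmap, so they never have to fit in RAM; converting them to netCDF would then be a separate final pass:

```python
import numpy as np

# Shapes match the thread: 731 lags x 140 lats x 180 lons.
# mode='w+' creates the backing files; pages are flushed to disk by the OS.
r3d = np.memmap('r3d.dat', dtype='float32', mode='w+', shape=(731, 140, 180))
lags3d = np.memmap('lags3d.dat', dtype='float32', mode='w+', shape=(731, 140, 180))

# ... fill r3d[:, j, i] and lags3d[:, j, i] inside the loop as before ...

r3d.flush()      # push any dirty pages to disk when done
lags3d.flush()
```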
I haven't used numpy, scipy, nor matplotlib, and it's been a long time
since I did correlations. But are you sure you're not just implementing
an O(n**3) algorithm or something, and it's just extremely slow?

Correlation does not normally involve that kind of computation; I am not sure whether Python does something like that internally.
with best regards,
Sudheer
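On the complexity point: a plain sliding-sum cross-correlation (what np.correlate does) is O(n**2) per grid point, while an FFT-based version is O(n log n). A minimal sketch of the FFT route, arranged to match np.correlate's 'full' output ordering:

```python
import numpy as np

def xcorr_fft(p, q):
    """Full cross-correlation via FFT, O(n log n) instead of the O(n**2)
    sliding sum; same ordering as np.correlate(p, q, 'full')."""
    n = len(p)
    nfft = 2 * n - 1
    # correlation theorem: correlate(p, q) = ifft(fft(p) * conj(fft(q)))
    c = np.fft.irfft(np.fft.rfft(p, nfft) * np.conj(np.fft.rfft(q, nfft)), nfft)
    # circular result: lags 0..n-1 sit at the front, -(n-1)..-1 at the back
    return np.concatenate((c[-(n - 1):], c[:n]))
```

For 2189-point series over 140*180 grid points the difference is substantial.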
 
 
