Memory error

J

Jamie Mitchell

Hello all,

I'm afraid I am new to all this so bear with me...

I am looking to find the statistical significance of the relationship between two large netCDF data sets.

Firstly I've loaded the two files into Python:

swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')

swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')

I have then isolated the variables I want to perform the Pearson correlation on:

hs=swh.variables['hs']

hs_2050s=swh_2050s.variables['hs']

Here is the metadata for those files:

print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)

print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
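As an aside, the metadata shows the data is stored packed: int16 values with a scale_factor of 0.002 and a _FillValue of -32767. By default python-netCDF4 applies the scaling automatically on read, promoting the values to floating point, which multiplies their size in memory. A minimal sketch of what that unpacking does (the sample values here are made up):

```python
import numpy as np

# Made-up packed values, using the attributes from the metadata above
packed = np.array([1000, 2500, -32767], dtype=np.int16)
scale_factor, add_offset, fill_value = 0.002, 0.0, -32767

# CF-style unpacking: real = packed * scale_factor + add_offset,
# with the fill value mapped to NaN
unpacked = np.where(packed == fill_value,
                    np.nan,
                    packed * scale_factor + add_offset)
# unpacked is float64, four times the size of the int16 original
```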


Then to perform the Pearson correlation:

from scipy.stats.stats import pearsonr

pearsonr(hs,hs_2050s)

I then get a memory error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError

This also happens when I try to create numpy arrays from the data.
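(For scale: np.asarray tries to materialize the whole variable at once, and assuming python-netCDF4's default auto-scaling of the int16 data to float64 on read, one fully materialized array would need roughly:)

```python
# Back-of-the-envelope memory estimate for one array of
# shape (86400, 350, 227), unpacked to float64 (8 bytes per value)
n_values = 86400 * 350 * 227        # time * latitude * longitude
size_gib = n_values * 8 / 1024 ** 3  # about 51 GiB per array
```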

Does anyone know how I can alleviate these memory errors?

Cheers,

Jamie
 
J

Jamie Mitchell


Just realised that obviously the Pearson correlation requires two 1-D arrays and mine are 3-D, silly mistake!
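For anyone finding this later, a sketch of the two usual ways to handle the 3-D shape (with small random stand-in arrays in place of the netCDF variables):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the (time, latitude, longitude) variables
a = rng.normal(size=(100, 5, 4))
b = a + rng.normal(scale=0.5, size=(100, 5, 4))

# Option 1: flatten to 1-D and compute a single overall correlation
r_overall = np.corrcoef(a.ravel(), b.ravel())[0, 1]

# Option 2: correlate the time series at each grid point,
# producing a (latitude, longitude) map of correlations
am = a.reshape(a.shape[0], -1) - a.reshape(a.shape[0], -1).mean(axis=0)
bm = b.reshape(b.shape[0], -1) - b.reshape(b.shape[0], -1).mean(axis=0)
r_map = (am * bm).sum(axis=0) / np.sqrt(
    (am ** 2).sum(axis=0) * (bm ** 2).sum(axis=0))
r_map = r_map.reshape(a.shape[1:])
```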
 
G

Gary Herron

Jamie Mitchell said:
...

This is not really a Python question. It's a question about netCDF
(whatever that may be), or perhaps its Python interface, netCDF4.

You may get an answer here, but you are far more likely to get one
quickly and accurately from a forum dedicated to netCDF or netCDF4.

Good luck.

Gary Herron
 
D

dieter

Jamie Mitchell said:
...
I then get a memory error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError

"MemoryError" means that Python cannot get sufficient memory
from the operating system.


You have already found out one mistake. Should you continue to
get "MemoryError" after this is fixed, then your system does not
provide enough resources (memory) to solve the problem at hand.
You would need to find a way to provide more resources.
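Alternatively, if more memory isn't available, Pearson's r can be accumulated incrementally so that only one chunk of each variable is in memory at a time. A rough sketch (using random stand-in arrays; with netCDF4 variables the chunks would come from slicing along the time axis):

```python
import numpy as np

def pearson_chunked(pairs):
    """Accumulate the running sums for Pearson's r over chunk pairs,
    so the full arrays never need to be in memory at once."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for x, y in pairs:
        x = np.asarray(x, dtype=np.float64).ravel()
        y = np.asarray(y, dtype=np.float64).ravel()
        n += x.size
        sx += x.sum()
        sy += y.sum()
        sxx += (x * x).sum()
        syy += (y * y).sum()
        sxy += (x * y).sum()
    return (n * sxy - sx * sy) / np.sqrt(
        (n * sxx - sx * sx) * (n * syy - sy * sy))

rng = np.random.default_rng(1)
a = rng.normal(size=10000)
b = 2 * a + rng.normal(size=10000)
chunks = ((a[i:i + 1000], b[i:i + 1000]) for i in range(0, 10000, 1000))
r = pearson_chunked(chunks)
```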
 
