Jamie Mitchell
Hello all,
I'm afraid I'm new to all this, so bear with me...
I am looking to test the statistical significance of the relationship between two large netCDF data sets.
First, I've loaded the two files into Python:
import netCDF4

swh = netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')
swh_2050s = netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')
I have then isolated the variables I want to perform the Pearson correlation on:
hs = swh.variables['hs']
hs_2050s = swh_2050s.variables['hs']
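(As far as I understand, these netCDF4.Variable objects are lazy: nothing is read from disk until you index them, and slicing reads only the requested hyperslab. A minimal check, assuming a single time step fits comfortably in memory:)

# Slicing reads only the requested hyperslab from disk,
# so a single time step of shape (350, 227) is cheap.
first_step = hs[0, :, :]
print first_step.shape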
Here is the metadata for those variables:
print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
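For scale, each variable holds 86400 x 350 x 227 values; a rough back-of-the-envelope check of the memory involved:

# Each variable holds roughly 6.9 billion values.
n_values = 86400 * 350 * 227
print n_values * 2 / 1e9   # ~13.7 GB packed as int16
print n_values * 8 / 1e9   # ~54.9 GB if unpacked to float64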
Then, to perform the Pearson correlation:
from scipy.stats.stats import pearsonr
pearsonr(hs, hs_2050s)
I then get a memory error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError
This also happens when I try to create numpy arrays from the data.
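For example, roughly this sort of thing (both calls try to materialise the full array in one go):

import numpy as np
# Both attempts load all ~6.9 billion values at once and
# die with the same MemoryError as above:
hs_array = np.asarray(hs)
hs_array = hs[:]   # indexing a netCDF4.Variable returns a full numpy array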
Does anyone know how I can alleviate these memory errors?
Cheers,
Jamie