xskillscore.pearson_r_eff_p_value

xskillscore.pearson_r_eff_p_value(a, b, dim=None, skipna=False, keep_attrs=False)

2-tailed p-value associated with Pearson’s correlation coefficient, accounting for autocorrelation.

Note

This metric should only be applied over the time dimension, since it is designed for temporal autocorrelation. Weights are not included due to the reliance on temporal autocorrelation.

The effective p value is computed by replacing the sample size \(N\) in the t-statistic with the effective sample size, \(N_{eff}\). The same Pearson product-moment correlation coefficient \(r\) is used as when computing the standard p value.

\[t = r\sqrt{ \frac{N_{eff} - 2}{1 - r^{2}} },\]

where \(N_{eff}\) is computed via the autocorrelation in the forecast and observations.

\[N_{eff} = N\left( \frac{1 - \rho_{f}\rho_{o}}{1 + \rho_{f}\rho_{o}} \right),\]

where \(\rho_{f}\) and \(\rho_{o}\) are the lag-1 autocorrelation coefficients for the forecast and observations.

Parameters

a (xarray.Dataset or xarray.DataArray) – Labeled array(s) over which to apply the function.
b (xarray.Dataset or xarray.DataArray) – Labeled array(s) over which to apply the function.
dim (str, list) – The dimension(s) to compute the p value over. Note that this dimension will be reduced as a result. Defaults to None reducing all dimensions.
skipna (bool) – If True, skip NaNs when computing function.
keep_attrs (bool) – If True, the attributes (attrs) will be copied from the first input to the new one. If False (default), the new object will be returned without attributes.

Returns

2-tailed p-value of Pearson’s correlation coefficient, accounting for autocorrelation.

Return type

xarray.Dataset or xarray.DataArray

See also

scipy.stats.pearsonr

References

Bretherton, Christopher S., et al. “The effective number of spatial degrees of freedom of a time-varying field.” Journal of climate 12.7 (1999): 1990-2009.
Wilks, Daniel S. Statistical methods in the atmospheric sciences. Vol. 100. Academic press, 2011.

Examples

>>> a = xr.DataArray(np.random.rand(5, 3, 3),
...                  dims=['time', 'x', 'y'])
>>> b = xr.DataArray(np.random.rand(5, 3, 3),
...                  dims=['time', 'x', 'y'])
>>> xs.pearson_r_eff_p_value(a, b, dim='time')
<xarray.DataArray (x: 3, y: 3)>
array([[0.82544245,        nan, 0.25734167],
       [0.78902959, 0.57503354, 0.8059353 ],
       [0.79242625, 0.66792245,        nan]])
Dimensions without coordinates: x, y