xskillscore.halfwidth_ci_test
- xskillscore.halfwidth_ci_test(forecasts1, forecasts2, observations=None, metric=None, dim=None, time_dim='time', alpha=0.05, **kwargs)
Returns the results of the Jolliffe and Ebert significance test.
Tests whether forecasts1 and forecasts2 have a different distance from observations at significance level alpha. See https://www.cawcr.gov.au/projects/verification/CIdiff/FAQ-CIdiff.html
Note
alpha is the desired significance level and the maximum acceptable risk of falsely rejecting the null hypothesis. The smaller the value of α, the greater the strength of the test. The confidence level of the test is defined as 1 - alpha and is often expressed as a percentage. For example, a significance level of 0.05 is equivalent to a 95% confidence level. Source: NIST/SEMATECH e-Handbook of Statistical Methods. https://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
- Parameters
forecasts1 (xarray.Dataset or xarray.DataArray) – first forecast to be compared to the observations.
forecasts2 (xarray.Dataset or xarray.DataArray) – second forecast to be compared to the observations.
observations (xarray.Dataset or xarray.DataArray, optional) – observations to be compared to both forecasts. If None, assumes that arguments forecasts1 and forecasts2 are already MAEs. Defaults to None.
metric (str, optional) – name of the distance metric function used to compute the error between forecasts and observations. It can be any of the xskillscore distance metric functions except mape. Valid metrics are me, rmse, mse, mae, median_absolute_error and smape. Note that if metric is None, observations must also be None. Defaults to None.
time_dim (str, optional) – time dimension over which to compute the temporal correlation. Defaults to 'time'.
dim (str or list of str, optional) – dimensions to apply the metric function to. Cannot contain time_dim. Defaults to None, which is then converted to [] since dim=None must not be passed to metric functions.
alpha (float, optional) – significance level alpha for testing whether forecasts1 is different from forecasts2. Defaults to 0.05.
**kwargs (dict, optional) – optional keyword arguments passed directly on to the call to metric, excluding dim.
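The constraints in the parameter list can be sketched as a small pre-flight check. This is an illustration only: `check_halfwidth_args` and `VALID_METRICS` are hypothetical names, not part of xskillscore, and the library's own validation and error messages may differ.

```python
from typing import List, Optional, Sequence, Union

# Metric names listed as valid in the parameter description.
VALID_METRICS = {"me", "rmse", "mse", "mae", "median_absolute_error", "smape"}


def check_halfwidth_args(
    metric: Optional[str],
    has_observations: bool,
    dim: Union[None, str, Sequence[str]] = None,
    time_dim: str = "time",
) -> List[str]:
    """Hypothetical helper: check the documented constraints, return dim as a list."""
    if metric is None and has_observations:
        raise ValueError("if metric is None, observations must also be None")
    if metric is not None and metric not in VALID_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    # dim=None is converted to an empty list, as the parameter description states.
    if dim is None:
        dims = []
    elif isinstance(dim, str):
        dims = [dim]
    else:
        dims = list(dim)
    if time_dim in dims:
        raise ValueError("dim cannot contain time_dim")
    return dims


print(check_halfwidth_args("mae", True, dim=None))       # []
print(check_halfwidth_args("rmse", True, dim="member"))  # ['member']
```

Running a check like this before calling the function surfaces an invalid combination (for example, `dim=['time']` with `time_dim='time'`) early and with a clear message.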
- Returns
xarray.DataArray or xarray.Dataset – boolean indicating whether the difference in scores (score(f2) - score(f1)) is significant.
xarray.DataArray or xarray.Dataset – difference in scores (score(f2) - score(f1)) reduced by dim and time_dim.
xarray.DataArray or xarray.Dataset – half-width of the confidence interval at the significance level alpha.
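To make the relationship between alpha and the confidence level concrete, the sketch below converts a significance level into a confidence level and, purely as an illustration, into a two-sided standard normal critical value. Which reference distribution halfwidth_ci_test uses internally is not stated here, so the critical value is an assumption and not a description of the library's computation.

```python
from statistics import NormalDist

alpha = 0.05  # significance level (the default in the signature)

# The confidence level is defined as 1 - alpha.
confidence_level = 1 - alpha
print(f"{confidence_level:.0%}")  # 95%

# Assumption: two-sided test against a standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))  # 1.96
```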
Examples
>>> import numpy as np
>>> import xarray as xr
>>> import xskillscore as xs
>>> f1 = xr.DataArray(np.random.normal(size=(30)),
...      coords=[('time', np.arange(30))])
>>> f2 = xr.DataArray(np.random.normal(size=(30)),
...      coords=[('time', np.arange(30))])
>>> o = xr.DataArray(np.random.normal(size=(30)),
...      coords=[('time', np.arange(30))])
>>> significantly_different, diff, hwci = xs.halfwidth_ci_test(
...     f1, f2, o, "mae", time_dim='time', dim=[], alpha=0.05
... )
>>> significantly_different
<xarray.DataArray ()>
array(False)
>>> diff
<xarray.DataArray ()>
array(-0.01919449)
>>> hwci
<xarray.DataArray ()>
array(0.38729387)
>>> # absolute magnitude of difference is smaller than half-width of
>>> # confidence interval, therefore not significant at level alpha=0.05
>>> # now comparing against an offset f2, the difference in MAE is significant
>>> significantly_different, diff, hwci = xs.halfwidth_ci_test(
...     f1, f2 + 2., o, "mae", time_dim='time', dim=[], alpha=0.05
... )
>>> significantly_different
<xarray.DataArray ()>
array(True)
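The decision rule described in the example comments can be written out directly. The sketch below uses plain floats copied from the first example output instead of calling xskillscore. `is_significant` is a hypothetical helper, and the strict inequality is an assumption.

```python
def is_significant(diff: float, hwci: float) -> bool:
    """Hypothetical helper: significant when |diff| exceeds the half-width."""
    return abs(diff) > hwci


# Values copied from the first example output.
diff = -0.01919449
hwci = 0.38729387
print(is_significant(diff, hwci))  # False

# Equivalent view: the interval diff ± hwci contains zero
# exactly when the difference is not significant.
lower, upper = diff - hwci, diff + hwci
print(lower < 0 < upper)  # True
```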
References