How confident are predictability estimates of the winter North Atlantic Oscillation?

Antje Weisheimer, Damien Decremer, David MacLeod, Christopher O’Reilly, T.N. Stockdale, S. Johnson, T.N. Palmer
Quarterly Journal of the Royal Meteorological Society, Royal Meteorological Society, UK, 2018
DOI: 10.1002/qj.3446

Abstract

Atmospheric seasonal predictability in winter over the Euro‐Atlantic region is studied with an emphasis on the signal‐to‐noise paradox of the North Atlantic Oscillation. Seasonal hindcasts of the ECMWF model for the recent period 1981–2009 show, in agreement with other studies, that correlation skill over Greenland and parts of the Arctic is higher than the signal‐to‐noise ratio implies. This leads to the paradoxical situation where the real world appears more predictable than the models suggest, with the forecast ensembles being overly dispersive (or underconfident). However, it is demonstrated that these conclusions are not supported by the diagnosed relationship between ensemble mean root‐mean‐square error (RMSE) and ensemble spread, which indicates a slight under‐dispersion (overconfidence). Furthermore, long atmospheric seasonal hindcasts suggest that over the 110‐year period from 1900 to 2009 the ensemble system is well calibrated (neither over‐ nor under‐dispersive). The observed skill changed drastically in the middle of the twentieth century, and regions that appear paradoxical in more recent hindcast periods were strongly under‐dispersive during mid‐century decades.
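The diagnostics contrasted above can be illustrated with a minimal sketch on synthetic data. It is not the paper's analysis or the ECMWF hindcast data: the toy "perfect model" (a shared Gaussian signal plus independent Gaussian noise for the observation and for each member), the function names, and all parameter values are assumptions made for illustration. The ratio of predictable components (RPC) is computed with one commonly used definition, the ensemble-mean correlation divided by the square root of the ratio of ensemble-mean variance to the average member variance.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance(x):
    """Unbiased sample variance."""
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / (len(x) - 1)


def make_hindcast(n_years, n_members, signal_sd, noise_sd, rng):
    """Toy perfect-model hindcast (assumption): obs and members share one signal."""
    obs, ens = [], []
    for _ in range(n_years):
        signal = rng.gauss(0.0, signal_sd)
        obs.append(signal + rng.gauss(0.0, noise_sd))
        ens.append([signal + rng.gauss(0.0, noise_sd) for _ in range(n_members)])
    return obs, ens


def diagnostics(obs, ens):
    """Correlation skill, ensemble-mean RMSE, ensemble spread and RPC."""
    n_years, n_members = len(ens), len(ens[0])
    ens_mean = [sum(members) / n_members for members in ens]
    corr = pearson(ens_mean, obs)
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in zip(ens_mean, obs)) / n_years)
    # Spread: square root of the year-averaged variance about the ensemble mean.
    spread = math.sqrt(sum(variance(members) for members in ens) / n_years)
    # Signal variance: interannual variance of the ensemble mean.
    var_signal = variance(ens_mean)
    # Total variance: interannual variance of single members, averaged over members.
    var_total = sum(
        variance([year[m] for year in ens]) for m in range(n_members)
    ) / n_members
    rpc = corr / math.sqrt(var_signal / var_total)
    return {"corr": corr, "rmse": rmse, "spread": spread, "rpc": rpc}


rng = random.Random(42)
obs, ens = make_hindcast(n_years=2000, n_members=20, signal_sd=1.0, noise_sd=1.0, rng=rng)
result = diagnostics(obs, ens)
print({name: round(value, 3) for name, value in result.items()})
```

In this idealised setting the RPC and the spread-to-RMSE ratio should both come out near 1, up to finite-ensemble and sampling effects. An RPC well above 1 would correspond to the paradoxical, apparently underconfident case, while a spread clearly smaller than the RMSE would indicate under-dispersion (overconfidence).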

Due to non‐stationarities of the climate system in the form of decadal variability, relatively short hindcasts are not sufficiently representative of longer‐term behaviour. In addition, small hindcast sample size can lead to skill estimates, in particular of correlation measures, that are not robust. It is shown that the relative uncertainty due to small hindcast sample size is often larger for correlation‐based than for RMSE‐based diagnostics. Correlation‐based measures such as the ratio of predictable components (RPC) are shown to be highly sensitive to the strength of the predictable signal, implying that disentangling physical deficiencies in the models from the effects of sampling uncertainty is difficult. Given the current lack of a causal physical mechanism to unravel the puzzle, our hypotheses of non‐stationarity and sampling uncertainty provide simple yet plausible explanations for the paradox.
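The sampling argument can be sketched with a small Monte Carlo experiment. This is an illustrative toy, not the method used in the paper: it assumes the same kind of Gaussian signal-plus-noise model, a 29-year hindcast (the length of 1981–2009), a hypothetical 20-member ensemble, and arbitrary signal and noise amplitudes. For many synthetic hindcasts it computes the ensemble-mean correlation and RMSE, then compares their relative uncertainty, measured here as standard deviation divided by the absolute mean across repeats.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_hindcast(n_years, n_members, signal_sd, noise_sd, rng):
    """Return (correlation, RMSE) of the ensemble mean for one synthetic hindcast."""
    obs, ens_mean = [], []
    for _ in range(n_years):
        signal = rng.gauss(0.0, signal_sd)
        obs.append(signal + rng.gauss(0.0, noise_sd))
        # The mean of n_members independent noise draws has sd noise_sd / sqrt(n_members).
        ens_mean.append(signal + rng.gauss(0.0, noise_sd / math.sqrt(n_members)))
    corr = pearson(ens_mean, obs)
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in zip(ens_mean, obs)) / n_years)
    return corr, rmse


def relative_uncertainty(values):
    """Standard deviation divided by the absolute mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / abs(mean)


def compare(n_years=29, n_members=20, signal_sd=0.5, noise_sd=1.0,
            n_repeats=1000, seed=0):
    """Relative sampling uncertainty of correlation and RMSE across repeats."""
    rng = random.Random(seed)
    results = [
        one_hindcast(n_years, n_members, signal_sd, noise_sd, rng)
        for _ in range(n_repeats)
    ]
    corrs = [corr for corr, _ in results]
    rmses = [rmse for _, rmse in results]
    return relative_uncertainty(corrs), relative_uncertainty(rmses)


cv_corr, cv_rmse = compare()
print(f"relative uncertainty: correlation {cv_corr:.2f}, RMSE {cv_rmse:.2f}")
```

Rerunning `compare` with a larger `signal_sd` shrinks the relative uncertainty of the correlation considerably in this toy model, which is one way to see why correlation-based measures depend so strongly on the strength of the predictable signal.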