Diagnosis and impacts of non-Gaussianity of innovations in data assimilation

Pires C.A., Talagrand O., Bocquet M.
PHYSICA D-NONLINEAR PHENOMENA, 239, 17, 1701-1717, doi:10.1016/j.physd.2010.05.006


Most atmospheric and oceanic data assimilation (DA) schemes rely on the Best Linear Unbiased Estimator (BLUE), which is sub-optimal if the errors of the assimilated data are non-Gaussian, thus calling for a full Bayesian data assimilation. This paper contributes to the study of the non-Gaussianity of errors in observation space. Possible sources of non-Gaussianity include the inherent statistical skewness and positiveness of some physical observables (e.g. moisture, chemical species) and the nonlinearity of both the data assimilation models and the observation operators, among others. Deviations from Gaussianity can be justified from a priori hypotheses or inferred from statistical diagnostics of innovations (observation minus background), leading to consistency relationships between the error statistics. From samples of observations and backgrounds, as well as their specified error variances, we evaluate some measures of the innovation non-Gaussianity, such as the skewness, kurtosis and negentropy. Under the assumption of additive errors, and by relating statistical moments of both data errors and innovations, we identify potential sources of the innovation non-Gaussianity: (1) univariate error non-Gaussianity, (2) nonlinear correlations between errors, (3) spatio-temporal variability of error variances (heteroscedasticity) and (4) multiplicative noise. Observational and background errors are often assumed independent. This leads to variance-dependent bounds for the skewness and the kurtosis of errors. From innovation statistics, we assess the potential DA impact of some scenarios of non-Gaussian errors. This impact is measured through the mean square difference between the BLUE and the Minimum Variance Unbiased Estimator (MVUE), obtained with univariate observations and background estimates.
In order to accomplish this, we compute maximum entropy probability density functions (pdfs) of the errors, constrained by the first four moments. These pdfs are then used to compute the Bayesian posterior pdf and the MVUE. This impact is studied for a wide range of statistical moments; it is higher for skewed innovations and grows on average with the skewness of data errors, especially if the skewnesses have the same sign. The approach has been applied to the quality-accepted ECMWF innovations of brightness temperatures for a set of High Resolution Infrared Sounder (HIRS) channels. In this context, the MVUE has led in some extreme cases to a potential reduction of 20%-60% of the posterior error variance as compared to the BLUE, especially for extreme values of the innovations.
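
The scalar setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: errors are additive and independent, so the innovation is d = e_o - e_b; a shifted gamma pdf stands in for the maximum entropy pdfs of the paper; and negentropy is replaced by the standard cumulant-based approximation, which is only reasonable close to Gaussianity. All function names are hypothetical.

```python
import math
import numpy as np

def skew_exkurt(x):
    """Sample skewness and excess kurtosis of a 1-D sample."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)

def negentropy_approx(skew, exkurt):
    """Cumulant-based approximation J ~ s^2/12 + k^2/48 (assumption: near-Gaussian)."""
    return skew**2 / 12.0 + exkurt**2 / 48.0

def innovation_skewness(s_o, s_b, sig_o, sig_b):
    """Skewness of d = e_o - e_b for independent additive errors (third cumulants add)."""
    return (s_o * sig_o**3 - s_b * sig_b**3) / (sig_o**2 + sig_b**2) ** 1.5

def gaussian_pdf(sigma):
    """Zero-mean Gaussian pdf with standard deviation sigma."""
    return lambda e: np.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def shifted_gamma_pdf(sigma, skew):
    """Zero-mean pdf with std sigma and nonzero skewness skew.
    Illustrative stand-in for a maximum entropy pdf, not the paper's choice."""
    k = 4.0 / skew**2                  # gamma shape: skewness = 2 / sqrt(k)
    theta = sigma * abs(skew) / 2.0    # gamma scale: variance = k * theta^2
    def pdf(e):
        # Shift to zero mean; mirror the variable when the skewness is negative.
        g = np.sign(skew) * np.asarray(e, dtype=float) + k * theta
        out = np.zeros_like(g)
        pos = g > 0
        out[pos] = np.exp((k - 1) * np.log(g[pos]) - g[pos] / theta
                          - math.lgamma(k) - k * math.log(theta))
        return out
    return pdf

def blue_background_error(d, sig_o, sig_b):
    """Linear (BLUE) estimate of the background error e_b given the innovation d."""
    return -sig_b**2 / (sig_o**2 + sig_b**2) * d

def mvue_background_error(d, pdf_b, pdf_o, grid):
    """Posterior mean of e_b given d on a uniform grid.
    Uses p(e_b | d) proportional to p_b(e_b) * p_o(d + e_b), since e_o = d + e_b."""
    w = pdf_b(grid) * pdf_o(d + grid)
    return float(np.sum(grid * w) / np.sum(w))

# Compare the two estimates for unit error variances and positively skewed errors.
grid = np.linspace(-10.0, 10.0, 4001)
p_b = shifted_gamma_pdf(1.0, 0.5)
p_o = shifted_gamma_pdf(1.0, 1.0)
for d in (-3.0, 0.0, 3.0):
    print(d, blue_background_error(d, 1.0, 1.0), mvue_background_error(d, p_b, p_o, grid))
```

With Gaussian pdfs for both errors the posterior mean coincides with the BLUE, so any difference printed above comes from the assumed skewness of the error pdfs. The analysis in state space follows as the background minus the estimated background error.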