Statistical Modelling 15 (2) (2015), 191–213

Regression with compositional response having unobserved components or below detection limit values

Karl Gerald van den Boogaart
Department of Modelling and Valuation,
Helmholtz Institute Freiberg for Resources Technology,
Germany
e-mail: boogaart@hzdr.de

and

Institute for Stochastics,
Technical University Bergakademie Freiberg,
Freiberg,
Germany


Raimon Tolosana-Delgado
Department of Modelling and Valuation,
Helmholtz Institute Freiberg for Resources Technology,
Germany


Matthias Templ
Department of Statistics and Probability Theory,
Vienna University of Technology,
Wien,
Austria


Abstract:

The typical way to deal with zeros and missing values in compositional data sets is to impute them with a reasonable value and then estimate the desired statistical model, e.g., a regression model, on the imputed data set. This contribution presents alternative approaches to this problem within the framework of Bayesian regression with a compositional response. In the first step, a compositional data set with missing data is assumed to follow a normal distribution on the simplex, whose mean value is given as an Aitchison affine linear combination of some fully observed explanatory variables. Both the coefficients of this linear combination and the missing values can be estimated with standard Gibbs sampling techniques. In the second step, a normally distributed additive error is considered superimposed on the compositional response, and values are taken as ‘below the detection limit’ (BDL) if they are ‘too small’ in comparison with the additive standard deviation of each variable. Within this framework, the regression parameters and all missing values (including BDL values) can be estimated with a Metropolis-Hastings algorithm. Both methods estimate the regression coefficients without any preliminary imputation step and adequately propagate the uncertainty arising from the fact that the missing values and BDL values are not actually observed, something imputation methods cannot achieve.

Keywords:

Bayesian regression; compositional regression; missing values; nondetects; MCMC.
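
As a rough illustration of the data model outlined in the abstract, the following Python sketch simulates a compositional response whose mean depends linearly (in isometric log-ratio coordinates) on one explanatory variable, superimposes an additive normal error, and flags small values as BDL. It is a sketch under stated assumptions, not the authors' implementation: the ilr basis, the function names, the single covariate, the common additive standard deviation `sigma_add`, and the threshold rule `y < k * sigma_add` are all hypothetical choices made for the example. No estimation step (Gibbs or Metropolis-Hastings) is included.

```python
import numpy as np


def ilr_basis(D):
    """Orthonormal basis of the clr hyperplane, shape (D-1, D) (an assumed choice of basis)."""
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i          # first i parts share equal positive weight
        V[i - 1, i] = -1.0              # contrasted against part i+1
        V[i - 1] *= np.sqrt(i / (i + 1.0))  # normalize the row to unit length
    return V


def ilr_inv(z, V):
    """Map ilr coordinates back to the simplex (closure of exp of the clr scores)."""
    clr = z @ V
    x = np.exp(clr - clr.max(axis=-1, keepdims=True))  # shift for numerical stability
    return x / x.sum(axis=-1, keepdims=True)


def simulate(n=200, D=3, sigma_add=0.01, k=2.0, seed=0):
    """Simulate a compositional response with additive error and BDL flags.

    sigma_add and k are hypothetical: a value counts as BDL here when it is
    smaller than k times the additive standard deviation.
    """
    rng = np.random.default_rng(seed)
    V = ilr_basis(D)
    t = rng.uniform(-1.0, 1.0, size=n)      # fully observed explanatory variable
    a = rng.normal(size=D - 1)              # intercept in ilr coordinates
    b = rng.normal(size=D - 1)              # slope in ilr coordinates
    # Normal distribution on the simplex: Gaussian in ilr coordinates
    z = a + np.outer(t, b) + 0.3 * rng.normal(size=(n, D - 1))
    x_true = ilr_inv(z, V)                  # error-free compositions (rows sum to 1)
    y = x_true + sigma_add * rng.normal(size=(n, D))  # additive measurement error
    bdl = y < k * sigma_add                 # assumed 'too small' rule
    y_obs = np.where(bdl, np.nan, y)        # BDL entries are treated as unobserved
    return t, x_true, y_obs, bdl
```

Calling `simulate()` returns the covariate, the error-free compositions, the observed data with BDL entries set to NaN, and the BDL mask; a sampler along the lines described in the abstract would then treat the NaN entries as unknowns alongside the regression coefficients.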