Statistical Modelling 18 (1) (2018), 24–49

Extended Poisson–Tweedie: Properties and regression models for count data

Wagner H. Bonat
Laboratory of Statistics and Geoinformation,
Department of Statistics,
Paraná Federal University,
Curitiba,
Brazil
e-mail: wbonat@ufpr.br

and

Department of Mathematics and Computer Science,
University of Southern Denmark,
Odense,
Denmark


Bent Jørgensen
Department of Mathematics and Computer Science,
University of Southern Denmark,
Odense,
Denmark


Célestin C. Kokonendji
Laboratoire de Mathématiques de Besançon,
Bourgogne Franche-Comté University,
Besançon,
France


John Hinde
School of Mathematics, Statistics and Applied Mathematics,
National University of Ireland Galway,
Galway,
Ireland


Clarice G. B. Demétrio
Departamento de Ciências Exatas,
Escola Superior de Agricultura Luiz de Queiroz,
São Paulo University,
Piracicaba,
Brazil


Abstract:

We propose a new class of discrete generalized linear models based on the class of Poisson–Tweedie factorial dispersion models with variance of the form μ + ϕμ^p, where μ is the mean and ϕ and p are the dispersion and Tweedie power parameters, respectively. The models are fitted by using an estimating function approach obtained by combining the quasi-score and Pearson estimating functions for the estimation of the regression and dispersion parameters, respectively. This provides a flexible and efficient regression methodology for a comprehensive family of count models including Hermite, Neyman Type A, Pólya–Aeppli, negative binomial and Poisson-inverse Gaussian. The estimating function approach allows us to extend the Poisson–Tweedie distributions to deal with underdispersed count data by allowing negative values for the dispersion parameter ϕ. Furthermore, the Poisson–Tweedie family can automatically adapt to highly skewed count data with excessive zeros, without the need to introduce zero-inflated or hurdle components, by the simple estimation of the power parameter. Thus, the proposed models offer a unified framework to deal with under-, equi-, overdispersed, zero-inflated and heavy-tailed count data. The computational implementation of the proposed models is fast, relying only on a simple Newton scoring algorithm. Simulation studies showed that the estimating function approach provides unbiased and consistent estimators for both regression and dispersion parameters. We highlight the ability of the Poisson–Tweedie distributions to deal with count data through a consideration of dispersion, zero-inflated and heavy tail indices, and illustrate their application with four data analyses. We provide an R implementation and the datasets as supplementary materials.
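
As a rough illustration of the variance form μ + ϕμ^p described above, the following Python sketch evaluates the variance and the resulting dispersion index (variance divided by mean) for given values of μ, ϕ and p. It is not the authors' R implementation; the function names are hypothetical, and the mapping from p to named special cases is taken from the wider Poisson–Tweedie literature rather than from this abstract. The positivity check reflects only the elementary requirement that a variance be positive when ϕ is negative.

```python
# Minimal sketch of the extended Poisson-Tweedie variance function.
# All names here are hypothetical and chosen for illustration only.

# Assumed mapping of the power parameter p to named special cases
# (from the wider literature, not stated in the abstract itself).
SPECIAL_CASES = {
    0.0: "Hermite",
    1.0: "Neyman Type A",
    1.5: "Polya-Aeppli",
    2.0: "negative binomial",
    3.0: "Poisson-inverse Gaussian",
}


def pt_variance(mu, phi, p):
    """Return mu + phi * mu**p, rejecting parameter values with no valid variance."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    var = mu + phi * mu ** p
    if var <= 0:
        # A negative phi is allowed only while the variance stays positive.
        raise ValueError("phi is too negative for this mu and p")
    return var


def dispersion_index(mu, phi, p):
    """Variance-to-mean ratio; equals 1 + phi * mu**(p - 1)."""
    return pt_variance(mu, phi, p) / mu


def classify(mu, phi, p):
    """Label the implied dispersion relative to the Poisson case."""
    di = dispersion_index(mu, phi, p)
    if di < 1:
        return "underdispersed"
    if di > 1:
        return "overdispersed"
    return "equidispersed"


if __name__ == "__main__":
    for p, name in SPECIAL_CASES.items():
        print(name, pt_variance(2.0, 0.5, p))
    print(classify(2.0, -0.2, 1.0))
```

For example, with μ = 2, ϕ = 0.5 and p = 2 the variance is 2 + 0.5 × 4 = 4, twice the mean, whereas ϕ = −0.2 with p = 1 gives a variance of 1.6, below the mean.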

Keywords:

count data; estimating functions; overdispersion; underdispersion; Poisson–Tweedie distribution.

Downloads:

Example data and code in zipped archive.