Statistical Modelling 3 (2003), 193-213
Regression analysis of variates observed on (0,1): percentages,
proportions, and fractions
Robert Kieschnick
University of Texas at Dallas
P. O. Box 830688, JO5.1
Richardson, Texas 75083-0688
U.S.A.
eMail: rkiesch@utdallas.edu
B. D. McCullough
Department of Decision Sciences
Academic Building, Room 230
Philadelphia, PA 19104
U.S.A.
Abstract:
Many types of studies examine the influence of selected variables
on the conditional expectation of a proportion or vector of
proportions, e.g., market shares, rock composition, etc. We identify
four distributional categories into which such data can be put, and
focus on regression models for the first category, for proportions
observed on the open interval (0, 1). For these data, we identify
different specifications used in prior research and compare these
specifications using two common samples and specifications of the
regressors. Based upon our analysis, we recommend that researchers
use either a parametric regression model based upon the beta
distribution or a quasi-likelihood regression model developed by
Papke and Wooldridge (1997) for these data. Concerning the choice
between these two regression models, we recommend that researchers
use the parametric regression model unless their sample size is
large enough to justify the asymptotic arguments underlying the
quasi-likelihood approach.
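
As an illustration of the quasi-likelihood approach mentioned above, the following is a minimal sketch of a fractional logit fit, assuming a logistic conditional mean and a Bernoulli quasi-likelihood maximized by Newton's method. It is not the authors' TSP macro. The function name fit_fractional_logit, the synthetic data, and all parameter values are assumptions made for this example only.

```python
import numpy as np


def fit_fractional_logit(X, y, n_iter=50, tol=1e-10):
    """Fit E[y | X] = 1 / (1 + exp(-X @ beta)) by Bernoulli quasi-likelihood.

    Hypothetical helper for illustration; y may take any value in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))    # fitted conditional mean
        w = mu * (1.0 - mu)                     # derivative of the logistic mean
        score = X.T @ (y - mu)                  # quasi-score vector
        hessian = X.T @ (X * w[:, None])        # negative Hessian of the objective
        step = np.linalg.solve(hessian, score)  # Newton step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Synthetic proportions on (0, 1): beta-distributed around a logistic mean.
# The coefficients and the precision value are arbitrary choices for this sketch.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0])
mean = 1.0 / (1.0 + np.exp(-X @ true_beta))
precision = 20.0
y = rng.beta(mean * precision, (1.0 - mean) * precision)

beta_hat = fit_fractional_logit(X, y)
print(beta_hat)
```

The sketch returns point estimates only. Inference under the quasi-likelihood approach relies on robust (sandwich) standard errors, which are omitted here.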
Keywords:
Percentages, proportions, fractions, regression
Downloads:
Data and TSP macro file in zipped archive.
NB: The paper uses subsets of the data in the data files. The
authors provide the larger data files so that readers can explore
alternative specifications if they wish.