Statistical Modelling 3 (2003), 193–213

Regression analysis of variates observed on (0,1): percentages, proportions, and fractions

Robert Kieschnick
University of Texas at Dallas
P. O. Box 830688, JO5.1
Richardson, Texas 75083-0688
U.S.A.
eMail: rkiesch@utdallas.edu

B. D. McCullough
Department of Decision Sciences
Academic Building, Room 230
Philadelphia, PA 19104
U.S.A.

Abstract:

Many types of studies examine the influence of selected variables on the conditional expectation of a proportion or vector of proportions, e.g., market shares or rock composition. We identify four distributional categories into which such data can be classified, and focus on regression models for the first category: proportions observed on the open interval (0, 1). For these data, we identify the different specifications used in prior research and compare them on two common samples with common specifications of the regressors. Based upon our analysis, we recommend that researchers use either a parametric regression model based upon the beta distribution or the quasi-likelihood regression model developed by Papke and Wooldridge (1996). As for the choice between these two models, we recommend the parametric regression model unless the sample size is large enough to justify the asymptotic arguments underlying the quasi-likelihood approach.
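
To make the two recommended models concrete, the following is a minimal sketch in Python, not the authors' TSP code. It assumes a logit link for the conditional mean in both models and, for the beta model, a mean-precision parameterization (shape parameters mu*phi and (1 - mu)*phi), which may differ from the parameterization used in the paper. The function names, the simulated data, and the "true" parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def fit_fractional_logit(X, y):
    """Bernoulli quasi-likelihood with logit mean: E[y | x] = expit(x @ b)."""
    def neg_qll(b):
        eta = X @ b
        # y*log(mu) + (1 - y)*log(1 - mu), rewritten as y*eta - log(1 + exp(eta))
        return -np.sum(y * eta - np.logaddexp(0.0, eta))

    res = minimize(neg_qll, np.zeros(X.shape[1]), method="BFGS")
    return res.x


def fit_beta_regression(X, y):
    """Beta ML with logit mean and one precision parameter phi (an assumption)."""
    def neg_ll(theta):
        b, log_phi = theta[:-1], theta[-1]
        mu = expit(X @ b)
        phi = np.exp(log_phi)  # optimise log(phi) so that phi stays positive
        a, c = mu * phi, (1.0 - mu) * phi
        ll = (gammaln(phi) - gammaln(a) - gammaln(c)
              + (a - 1.0) * np.log(y) + (c - 1.0) * np.log1p(-y))
        return -np.sum(ll)

    res = minimize(neg_ll, np.zeros(X.shape[1] + 1), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])


# Simulated proportions strictly inside (0, 1); all values here are illustrative.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
true_b = np.array([0.5, -1.0])
true_phi = 20.0
mu = expit(X @ true_b)
y = rng.beta(mu * true_phi, (1.0 - mu) * true_phi)

b_ql = fit_fractional_logit(X, y)
b_beta, phi_hat = fit_beta_regression(X, y)
print("quasi-likelihood coefficients:", b_ql)
print("beta regression coefficients:", b_beta, "precision:", phi_hat)
```

In this sketch both estimators target the same conditional mean, so on data simulated from the beta model their coefficient estimates should be close. The beta fit additionally estimates the precision, while the quasi-likelihood fit leaves the distribution around the mean unspecified.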

Keywords:

Percentages, proportions, fractions, regression
 

Downloads:

Data and TSP macro file in zipped archive.

NB: The paper uses subsets of the data in these files. The authors provide the larger data files so that readers can explore alternative specifications if they wish.

