Statistical Modelling 8 (2008), 23–39

Simultaneous inference for multiple testing and clustering via a Dirichlet process mixture model

David B Dahl
Department of Statistics,
Texas A&M University,
College Station, TX 77843
USA
Email: dahl@state.tamu.edu

Qianxing Mo
Memorial Sloan-Kettering Cancer Center
USA

Marina Vannucci
Rice University
USA

Abstract:

We propose a Bayesian nonparametric regression model that exploits clustering for increased sensitivity in multiple hypothesis testing. We build on the recently proposed BEMMA (Bayesian Effects Models for Microarrays) method, which models dependence among objects through clustering and then estimates hypothesis-testing parameters averaged over clustering uncertainty. We propose several improvements. First, we separate the clustering of the regression coefficients from the part of the model that accommodates heteroscedasticity. Second, our model accommodates a wider class of experimental designs: for example, it permits covariates and does not require independent sampling. Third, we provide a more satisfactory treatment of nuisance parameters and some hyperparameters. Finally, we do not require the arbitrary designation of a reference treatment. The proposed method is compared with ANOVA and the BEMMA method in a simulation study.

Keywords:

Bayesian nonparametrics; correlated hypothesis tests; model-based clustering; multiple comparisons