Statistical Modelling 10 (2010), 265–290

Finite mixture models for clustering multilevel data with multiple cluster structures

Giuliano Galimberti
Department of Statistics,
University of Bologna
Italy

Gabriele Soffritti
Department of Statistics,
University of Bologna,
via Belle Arti, 41
I–40126 Bologna
Italy
Email: gabriele.soffritti@unibo.it

Abstract:

Finite mixture models are useful tools for clustering two-way datasets within a sound statistical framework that can address important questions, such as how many clusters there are in the data. Mixture models for clustering multilevel data have also been proposed; they aim to cluster the units at every level on the basis of all the available variables while taking the hierarchical structure of the dataset into account. This paper introduces a new class of mixture models for two-level datasets that makes it possible to discover a clustering of the level 2 units together with different clusterings of the level 1 units, each corresponding to a different subset of the variables (multiple cluster structures). The new class is obtained by adapting to the multilevel setting a mixture model originally proposed to identify multiple cluster structures in a data matrix. The usefulness of the new method is illustrated using simulated data and a real example.

Keywords:

cluster analysis; cluster structure; mixture model; model selection; multilevel data