Model selection in block clustering by the integrated classification likelihood





1 Heudiasyc - Heuristique et Diagnostic des Systèmes Complexes Compiègne 2 DI Heudiasyc - Heuristique et Diagnostic des Systèmes Complexes Compiègne

Abstract: Block clustering, or co-clustering, aims at simultaneously partitioning the rows and columns of a data table to reveal homogeneous block structures. This structure can stem from the latent block model, which provides a probabilistic model of data tables whose block pattern is defined by the row and column classes. For continuous data, each table entry is typically assumed to follow a Gaussian distribution. For a given data table, several candidate models are usually examined: they may differ in the number of clusters or in the number of free parameters. Model selection then becomes a critical issue, for which the tools derived for model-based one-way clustering need to be adapted. In one-way clustering, most selection criteria are based on asymptotic considerations that are difficult to carry over to block clustering because of the dual nature of rows and columns. We circumvent this problem by developing a non-asymptotic criterion based on the Integrated Classification Likelihood. This criterion can be computed in closed form once a proper prior distribution has been defined on the parameters. The experimental results show steady performance for medium to large data tables with well-separated and moderately separated clusters.
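To make the generative assumption concrete, the following is a minimal sketch of drawing a data table from a Gaussian latent block model. It is an illustration only, not the authors' code: the function name `sample_latent_block_model`, its parameters, and the choice of a single standard deviation shared by all blocks are assumptions made here for brevity.

```python
import random


def sample_latent_block_model(n_rows, n_cols, row_props, col_props, means, sigma, seed=0):
    """Draw a table from a Gaussian latent block model (illustrative sketch).

    Assumption: every block shares the same standard deviation `sigma`;
    `means[k][l]` is the mean of the block for row class k and column class l.
    """
    rng = random.Random(seed)
    n_row_classes, n_col_classes = len(row_props), len(col_props)
    # Latent row and column classes are drawn independently from their proportions.
    row_labels = rng.choices(range(n_row_classes), weights=row_props, k=n_rows)
    col_labels = rng.choices(range(n_col_classes), weights=col_props, k=n_cols)
    # Each entry is Gaussian, with a mean set by its (row class, column class) block.
    table = [[rng.gauss(means[k][l], sigma) for l in col_labels] for k in row_labels]
    return table, row_labels, col_labels


# Hypothetical example: 2 row classes and 2 column classes, well-separated block means.
means = [[0.0, 5.0], [10.0, -5.0]]
table, row_labels, col_labels = sample_latent_block_model(
    n_rows=6, n_cols=4, row_props=[0.5, 0.5], col_props=[0.5, 0.5],
    means=means, sigma=0.1, seed=1,
)
```

Sorting the rows by `row_labels` and the columns by `col_labels` would expose the homogeneous blocks that block clustering tries to recover when the labels are unknown.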

Keywords: block clustering, integrated classification likelihood, model selection, Gaussian data





Authors: Aurore Lomet, Gérard Govaert, Yves Grandvalet

Source: https://hal.archives-ouvertes.fr/






