Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering - Descarga este documento en PDF. Documentación en PDF para descargar gratis. Disponible también para leer online.

1 IPHT - Institut de Physique Théorique - UMR CNRS 3681 2 Santa Fe Institute 3 LPS - Laboratoire de Physique Statistique de l-ENS

Abstract : We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m ightarrow \infty$ and $\alpha = m-n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.

Keywords : Cond-mat.dis-nn Cond-mat Cs Cs.IT Math Stat

Autor: Thibault Lesieur - Caterina De Bacco - Jess Banks - Florent Krzakala - Cris Moore - Lenka Zdeborová -

Fuente: https://hal.archives-ouvertes.fr/

DESCARGAR PDF