Separating populations with wide data: A spectral analysis - Statistics > Machine Learning





Abstract: In this paper, we consider the problem of partitioning a small data sample drawn from a mixture of $k$ product distributions. We are interested in the case that individual features are of low average quality $\gamma$, and we want to use as few of them as possible to correctly partition the sample. We analyze a spectral technique that is able to approximately optimize the total data size (the product of the number of data points $n$ and the number of features $K$) needed to correctly perform this partitioning, as a function of $1/\gamma$ for $K>n$. Our goal is motivated by an application in clustering individuals according to their population of origin using markers, when the divergence between any two of the populations is small.
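
To make the setting concrete, here is a minimal sketch of the general idea of spectral partitioning on wide data ($K>n$). It is not the algorithm analyzed in the paper. It assumes $k=2$ populations, binary markers drawn independently per feature (a product distribution), and a plain SVD of the column-centered data matrix followed by a sign split. The function names and the parameters `n`, `K`, `delta`, and `seed` are illustrative assumptions, not taken from the source.

```python
import numpy as np

def simulate_two_populations(n=60, K=5000, delta=0.1, seed=0):
    """Draw n individuals with K binary markers from two populations.

    Assumption (illustrative only): each marker frequency is 0.5 + delta in
    one population and 0.5 - delta in the other, with a random sign per
    marker, so any single marker separates the populations only weakly.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)          # hidden population of each individual
    signs = rng.choice([-1.0, 1.0], size=K)      # which population has the higher frequency
    freqs = np.stack([0.5 + delta * signs, 0.5 - delta * signs])  # shape (2, K)
    # Independent Bernoulli draws per feature: a product distribution per population.
    X = (rng.random((n, K)) < freqs[labels]).astype(float)
    return X, labels

def spectral_partition(X):
    """Split the rows of X in two using the top left singular vector."""
    centered = X - X.mean(axis=0)                # remove the per-marker mean
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return (U[:, 0] > 0).astype(int)             # sign of each individual's score

def agreement(pred, labels):
    """Fraction of correctly grouped points, up to swapping the two group names."""
    acc = np.mean(pred == labels)
    return max(acc, 1.0 - acc)

X, labels = simulate_two_populations()
pred = spectral_partition(X)
accuracy = agreement(pred, labels)
print(f"n={X.shape[0]}, K={X.shape[1]}, agreement with hidden labels: {accuracy:.2f}")
```

In this toy setup, shrinking `delta` (lower feature quality) generally requires a larger `n * K` before the sign split recovers the hidden populations, which is the trade-off the abstract refers to.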



Authors: Avrim Blum, Amin Coja-Oghlan, Alan Frieze, Shuheng Zhou

Source: https://arxiv.org/






