Optimal properties of centroid-based classifiers for very high-dimensional data - Mathematics > Statistics Theory





Abstract: We show that scale-adjusted versions of the centroid-based classifier enjoy optimal properties when used to discriminate between two very high-dimensional populations where the principal differences are in location. The scale adjustment removes the tendency of scale differences to confound differences in means. Certain other distance-based methods, for example, those founded on nearest-neighbor distance, do not have optimal performance in the sense that we propose. Our results permit varying degrees of sparsity and signal strength to be treated, and require only mild conditions on dependence of vector components. Additionally, we permit the marginal distributions of vector components to vary extensively. In addition to providing theory we explore numerical properties of a centroid-based classifier, and show that these features reflect theoretical accounts of performance.
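
The abstract does not spell out the form of the scale adjustment, so the following is only an illustrative sketch under an assumed form, not the authors' definition. It assumes the adjustment subtracts an estimate of tr(Sigma)/n from each squared distance to a centroid, which removes the part of the expected distance that is due to noise in the sample mean rather than to a difference in location. All function names here are hypothetical.

```python
def centroid(sample):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(sample)
    return [sum(col) / n for col in zip(*sample)]


def total_variance(sample):
    """Sum over components of the unbiased sample variance
    (the trace of the sample covariance matrix)."""
    n = len(sample)
    mean = centroid(sample)
    return sum(
        sum((row[j] - mean[j]) ** 2 for row in sample) / (n - 1)
        for j in range(len(mean))
    )


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def classify(z, xs, ys, scale_adjust=True):
    """Assign z to population 'X' or 'Y' by the nearer centroid.

    With scale_adjust=True, each squared distance is reduced by
    total_variance / sample_size. This is an ASSUMED form of the
    adjustment, chosen for illustration only.
    """
    dx = sq_dist(z, centroid(xs))
    dy = sq_dist(z, centroid(ys))
    if scale_adjust:
        dx -= total_variance(xs) / len(xs)
        dy -= total_variance(ys) / len(ys)
    return "X" if dx <= dy else "Y"


# Toy training samples (made up for the example).
xs = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
ys = [[5.0, 5.0], [7.0, 5.0], [5.0, 7.0], [7.0, 7.0]]
label = classify([1.0, 1.0], xs, ys)
```

Under this assumed form, the correction matters most when the two training samples differ in size or in total variance, because the uncorrected distance to the noisier centroid is then inflated even if the means coincide.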



Authors: Peter Hall, Tung Pham

Source: https://arxiv.org/






