A comprehensive evaluation of multicategory classification methods for microbiomic data


Microbiome, 1:11

Received: 10 January 2013
Accepted: 13 March 2013
First Online: 05 April 2013


Background
Recent advances in next-generation DNA sequencing enable rapid high-throughput quantitation of microbial community composition in human samples, opening up a new field of microbiomics. One of the promises of this field is linking abundances of microbial taxa to phenotypic and physiological states, which can inform development of new diagnostic, personalized medicine, and forensic modalities. Prior research has demonstrated the feasibility of applying machine learning methods to perform body site and subject classification with microbiomic data. However, it is currently unknown which classifiers perform best among the many available alternatives for classification with microbiomic data.

Results
In this work, we performed a systematic comparison of 18 major classification methods, 5 feature selection methods, and 2 accuracy metrics using 8 datasets spanning 1,802 human samples and various classification tasks: body site classification, subject classification, and diagnosis.
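The abbreviations list names the two accuracy metrics as PCC (proportion of correct classifications) and RCI (relative classifier information). The sketch below illustrates how such metrics could be computed from true and predicted labels. It is not the authors' code: the function names `pcc` and `rci` and the example labels are hypothetical, and the RCI formula shown (the reduction in entropy of the true class once the prediction is known, divided by the entropy of the true class) is one common entropy-based formulation assumed here, since the abstract does not state the formula.

```python
import math
from collections import Counter


def pcc(y_true, y_pred):
    """Proportion of correct classifications: fraction of matching labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def rci(y_true, y_pred):
    """Assumed RCI formulation: (H(true) - H(true | predicted)) / H(true)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    h_true = _entropy(Counter(y_true).values())
    if h_true == 0:
        # Only one true class is present: nothing to be informed about.
        return 0.0
    # Group the true labels by the label the classifier predicted.
    true_by_pred = {}
    for t, p in zip(y_true, y_pred):
        true_by_pred.setdefault(p, Counter())[t] += 1
    # Conditional entropy: weighted average entropy within each predicted group.
    h_cond = sum(
        sum(group.values()) / n * _entropy(group.values())
        for group in true_by_pred.values()
    )
    return (h_true - h_cond) / h_true


# Hypothetical body-site labels, for illustration only.
y_true = ["gut", "gut", "skin", "skin", "oral", "oral"]
y_pred = ["gut", "gut", "skin", "oral", "oral", "oral"]
print("PCC:", pcc(y_true, y_pred))
print("RCI:", rci(y_true, y_pred))
```

Under this formulation, a classifier that always predicts the same class gets an RCI of 0 however frequent that class is, whereas its PCC equals the frequency of that class. This makes an entropy-based metric a useful complement to PCC when class sizes are unbalanced.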

Conclusions
We found that random forests, support vector machines, kernel ridge regression, and Bayesian logistic regression with Laplace priors are the most effective machine learning techniques for performing accurate classification from these microbiomic data.

Keywords
Microbiomic data; Machine learning; Classification; Feature selection

Abbreviations
BLR: Bayesian logistic regression (machine learning method)
KNN: K-nearest neighbors (machine learning method)
KRR: kernel ridge regression (machine learning method)
L1-LR: logistic regression regularized by an L1 penalty (machine learning method)
L2-LR: logistic regression regularized by an L2 penalty (machine learning method)
OTU: operational taxonomic unit
PCC: proportion of correct classifications (classification accuracy metric)
PNN: probabilistic neural networks (machine learning method)
QIIME: Quantitative Insights Into Microbial Ecology
RCI: relative classifier information (classification accuracy metric)
RF: random forests (machine learning method)
SVM: support vector machine (machine learning method)

Electronic supplementary material
The online version of this article (doi:10.1186/2049-2618-1-11) contains supplementary material, which is available to authorized users.



Source: https://link.springer.com/article/10.1186/2049-2618-1-11
