Ranking Forests

1 LTCI - Laboratoire Traitement et Communication de l'Information

Abstract: This paper examines how the aggregation and feature randomization principles underlying the RANDOM FOREST algorithm [1], originally proposed for classification and regression, can be adapted to bipartite ranking in order to improve the scoring rules produced by TREERANK [2], a recently developed tree induction method tailored to this global learning problem. Since TREERANK may be viewed as a recursive implementation of a cost-sensitive version of the popular classification algorithm CART [3], with a cost that depends locally on the data lying within the node to be split, various strategies can be considered for "randomizing" the features involved in the tree-growing stage. In parallel, ranking trees can be combined or averaged in several ways, including techniques inspired by the rank aggregation methods recently popularized in Web applications. Ranking procedures based on such approaches are called RANKING FORESTS. Beyond preliminary theoretical background, experimental results on simulated data are provided as evidence of their statistical performance.
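To make the aggregation idea concrete, here is a minimal sketch of one possible median-based rank aggregation, scored with the empirical AUC. It is an illustration only and not the procedure of the paper: the function names (`auc`, `ranks`, `median_rank_aggregate`) and the toy scores are hypothetical, and the three score lists merely stand in for the outputs of individual ranking trees, which are not grown here.

```python
from statistics import median


def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ordered correctly.

    Ties between a positive and a negative score count for one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ranks(scores):
    """Mid-ranks starting at 0: a higher score receives a higher rank."""
    return [
        sum(t < s for t in scores) + 0.5 * (sum(t == s for t in scores) - 1)
        for s in scores
    ]


def median_rank_aggregate(score_lists):
    """Aggregate several scorings by taking each item's median rank."""
    per_scorer = [ranks(scores) for scores in score_lists]
    return [median(column) for column in zip(*per_scorer)]


# Hypothetical toy data: six items with binary labels, scored by three
# stand-in "trees" (assumed values, not taken from the paper).
labels = [1, 1, 0, 0, 1, 0]
tree_scores = [
    [0.9, 0.4, 0.5, 0.1, 0.8, 0.3],
    [0.7, 0.6, 0.2, 0.5, 0.9, 0.1],
    [0.8, 0.9, 0.6, 0.2, 0.3, 0.1],
]

aggregated = median_rank_aggregate(tree_scores)
for i, scores in enumerate(tree_scores):
    print(f"tree {i}: AUC = {auc(scores, labels):.3f}")
print(f"median-rank aggregate: AUC = {auc(aggregated, labels):.3f}")
```

Working on ranks rather than raw scores makes the aggregate depend only on the ordering each scorer induces, which is the quantity the AUC criterion measures; the median is one choice among several, and averaging the ranks would be an equally simple alternative.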

Keywords: feature randomization, bipartite ranking, data with binary labels, ROC optimization, AUC criterion, tree-based ranking rules, bootstrap, bagging, rank aggregation, median procedure.

Author: Stéphan Clémençon

Source: https://hal.archives-ouvertes.fr/

