Scaling Out Link Prediction with SNAPLE: 1 Billion Edges and Beyond


1 ASAP - As Scalable As Possible: foundations of large scale dynamic distributed systems, Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
2 UR1 - Université de Rennes 1

Abstract: In this paper, we consider how the emblematic problem of link prediction can be implemented efficiently in gather-apply-scatter (GAS) platforms, a popular distributed graph-computation model. Our proposal, called SNAPLE, exploits a novel, highly localized vertex-scoring technique and minimizes the cost of data flow while maintaining prediction quality. When used within GraphLab, SNAPLE can scale to extremely large graphs that a standard implementation of link prediction on GraphLab cannot handle. More precisely, we show that SNAPLE can process a graph containing 1.4 billion edges on a 256-core cluster in less than three minutes, with no penalty in the quality of predictions. This result corresponds to an over-linear speedup of 30 against a 20-core standalone machine running a non-distributed state-of-the-art solution.
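The "over-linear" claim can be checked with simple arithmetic on the figures quoted in the abstract: moving from 20 to 256 cores multiplies the core count by 12.8, so a speedup of 30 is more than core count alone would predict. The sketch below only restates that arithmetic. The function name is illustrative and does not come from the paper, and since the baseline is a different, non-distributed system, the ratio is indicative rather than a strict scaling measurement.

```python
def linear_speedup(distributed_cores: int, standalone_cores: int) -> float:
    """Speedup that perfectly linear scaling in core count would predict."""
    return distributed_cores / standalone_cores


# Figures quoted in the abstract.
expected = linear_speedup(256, 20)   # ratio of core counts
reported = 30                        # speedup reported for SNAPLE

# How far the reported speedup exceeds the linear prediction.
excess = reported / expected
is_over_linear = reported > expected

print(f"linear prediction: {expected:.1f}x, reported: {reported}x, "
      f"excess factor: {excess:.2f}")
```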

Keywords: big data, distributed systems, graph, link prediction

Authors: Anne-Marie Kermarrec, François Taïani, Juan Manuel Tirado Martin


