Space-efficient and exact de Bruijn graph representation based on a Bloom filter





* Corresponding author
1 ENS Cachan Bretagne - École normale supérieure - Cachan, antenne de Bretagne
2 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics, Inria Rennes – Bretagne Atlantique, IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
3 Algorizk Paris

Abstract: The de Bruijn graph data structure is widely used in next-generation sequencing (NGS). Many programs, e.g. de novo assemblers, rely on an in-memory representation of this graph. However, current techniques for representing the de Bruijn graph of a human genome require a large amount of memory (> 30 GB). We propose a new encoding of the de Bruijn graph, which occupies an order of magnitude less space than current representations. The encoding is based on a Bloom filter, with an additional structure to remove critical false positives. Minia, an assembler implementing this structure, performed a complete de novo assembly of human genome short reads using 5.7 GB of memory in 23 hours.
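
The encoding described in the abstract can be illustrated with a small Python sketch. It is not Minia's implementation. It assumes that a "critical" false positive is a k-mer that is absent from the true node set, is accepted by the Bloom filter, and is adjacent to a true node, so that a traversal starting from true nodes could otherwise step onto it. The names BloomFilter, neighbors, build and is_node, the BLAKE2b-based hashing, and the filter parameters are all hypothetical choices made for this illustration, and reverse complements are ignored for brevity.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not Minia's layout)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash function by salting BLAKE2b.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def neighbors(kmer):
    """Yield the 8 candidate successors and predecessors of a k-mer."""
    for base in "ACGT":
        yield kmer[1:] + base   # successor: shift left, append a base
        yield base + kmer[:-1]  # predecessor: shift right, prepend a base


def build(kmers, num_bits, num_hashes):
    """Return a Bloom filter of the k-mers plus the set of critical false positives."""
    true_nodes = set(kmers)
    bloom = BloomFilter(num_bits, num_hashes)
    for kmer in true_nodes:
        bloom.add(kmer)
    # Assumption: only false positives adjacent to a true node need to be stored.
    critical_fp = {
        n
        for kmer in true_nodes
        for n in neighbors(kmer)
        if n in bloom and n not in true_nodes
    }
    return bloom, critical_fp


def is_node(kmer, bloom, critical_fp):
    """Membership test that is exact for true nodes and their neighbors."""
    return kmer in bloom and kmer not in critical_fp


if __name__ == "__main__":
    seq, k = "ACGTTGCATGGAC", 4
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # A deliberately tiny filter, so that false positives are likely.
    bloom, critical_fp = build(kmers, num_bits=16, num_hashes=2)
    for kmer in sorted(set(kmers)):
        reachable = sorted(n for n in neighbors(kmer) if is_node(n, bloom, critical_fp))
        print(kmer, "->", reachable)
```

Under these assumptions, the true node set itself is only needed while building the structure. Afterwards, a traversal that starts from a true node and moves only to neighbors accepted by is_node never leaves the true graph, even though the Bloom filter alone would accept some spurious k-mers.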





Authors: Rayan Chikhi, Guillaume Rizk

Source: https://hal.archives-ouvertes.fr/






