Tile QR Factorization with Parallel Panel Processing for Multicore Architectures





1 ICL - Innovative Computing Laboratory, Knoxville
2 HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations, LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
3 LaBRI - Laboratoire Bordelais de Recherche en Informatique

Abstract: To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist of scheduling a Directed Acyclic Graph (DAG) of fine-grained tasks, where nodes represent tasks (either the factorization of a panel or the update of a block-column) and edges represent the dependencies among them. Although past approaches already achieve high performance on moderate and large square matrices, they process each panel sequentially, which limits performance when factorizing tall and skinny matrices or small square matrices. We present a new, fully asynchronous method for computing a QR factorization on shared-memory multicore architectures that overcomes this bottleneck. Our contribution is to adapt an existing algorithm that performs the panel factorization in parallel, named Communication-Avoiding QR (CAQR) and initially designed for distributed-memory machines, to the context of tile algorithms using asynchronous computations. An experimental study shows a significant improvement, with speedups of up to almost 10 times over state-of-the-art approaches. We aim to eventually incorporate this work into the Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) library.
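The idea of factorizing a panel in parallel can be illustrated with a small NumPy sketch. This is not the authors' implementation and not PLASMA code: it only shows the general reduction pattern behind communication-avoiding QR on a tall and skinny matrix, computing the R factor alone. The function name `tsqr_r` and the parameter `num_blocks` are hypothetical, the reduction tree is assumed to be binary, and each row block is assumed to have at least as many rows as the matrix has columns.

```python
import numpy as np


def tsqr_r(a, num_blocks):
    """Return the R factor of a tall matrix using a binary reduction tree.

    Illustrative sketch only: assumes every row block has at least as many
    rows as `a` has columns. The Q factor is not formed.
    """
    # Leaf step: factor each row block on its own. These factorizations do
    # not depend on each other, so they could run as concurrent tasks.
    blocks = np.array_split(a, num_blocks, axis=0)
    rs = [np.linalg.qr(block, mode="r") for block in blocks]

    # Reduction step: stack neighbouring R factors and factor the stack,
    # halving the number of R factors at each level of the tree.
    while len(rs) > 1:
        merged = []
        for i in range(0, len(rs) - 1, 2):
            stacked = np.vstack((rs[i], rs[i + 1]))
            merged.append(np.linalg.qr(stacked, mode="r"))
        if len(rs) % 2 == 1:
            # An unpaired factor is carried up to the next level unchanged.
            merged.append(rs[-1])
        rs = merged
    return rs[0]
```

Calling `tsqr_r(a, 4)` on a 64-by-4 matrix factors four 16-by-4 blocks independently and then merges them in two levels. The result agrees with the R factor from a single `np.linalg.qr` call up to the sign of each row. In the sequential panel approach described in the abstract, the same panel would instead be processed one block after another.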





Authors: Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack Dongarra

Source: https://hal.archives-ouvertes.fr/






