Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines





1 RUNTIME - Efficient runtime systems for parallel architectures: Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
2 LaBRI - Laboratoire Bordelais de Recherche en Informatique

Abstract: We discuss performance issues of the tiled Cholesky factorization on non-uniform memory access (NUMA) shared-memory machines. We show how to optimize thread placement and data placement to achieve performance gains of up to 50% compared to state-of-the-art libraries such as PLASMA or MKL.
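
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of a generic right-looking tiled Cholesky factorization in pure Python. It is a textbook formulation, not the authors' code and not the PLASMA or MKL implementation. The function names (`potrf_tile`, `trsm_tile`, `update_tile`, `tiled_cholesky`) are hypothetical, borrowed loosely from the usual LAPACK/BLAS kernel names, and the sketch assumes the matrix order is a multiple of the tile size. It does not model NUMA effects; it only shows the tile-level kernel calls that a tiled library would typically schedule as tasks, which is where thread and data placement come into play.

```python
import math


def potrf_tile(a, k0, nb):
    """Factor the diagonal tile starting at (k0, k0) in place (lower triangle)."""
    for j in range(k0, k0 + nb):
        s = a[j][j] - sum(a[j][p] ** 2 for p in range(k0, j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(s)
        for i in range(j + 1, k0 + nb):
            dot = sum(a[i][p] * a[j][p] for p in range(k0, j))
            a[i][j] = (a[i][j] - dot) / a[j][j]


def trsm_tile(a, i0, k0, nb):
    """Solve X * L_kk^T = A_ik in place for the tile at (i0, k0)."""
    for r in range(i0, i0 + nb):
        for c in range(k0, k0 + nb):
            dot = sum(a[r][p] * a[c][p] for p in range(k0, c))
            a[r][c] = (a[r][c] - dot) / a[c][c]


def update_tile(a, i0, j0, k0, nb):
    """Trailing update A_ij -= A_ik * A_jk^T (SYRK when i0 == j0, else GEMM)."""
    for r in range(i0, i0 + nb):
        for c in range(j0, j0 + nb):
            a[r][c] -= sum(a[r][p] * a[c][p] for p in range(k0, k0 + nb))


def tiled_cholesky(a, nb):
    """Overwrite the symmetric positive definite matrix a with L, where a = L * L^T.

    a is a list of row lists; nb is the tile size and must divide len(a).
    """
    n = len(a)
    if nb <= 0 or n % nb != 0:
        raise ValueError("tile size must be positive and divide the matrix order")
    t = n // nb
    for k in range(t):
        k0 = k * nb
        potrf_tile(a, k0, nb)                  # factor the diagonal tile
        for i in range(k + 1, t):
            trsm_tile(a, i * nb, k0, nb)       # finish the panel below it
        for i in range(k + 1, t):
            for j in range(k + 1, i + 1):
                update_tile(a, i * nb, j * nb, k0, nb)  # update trailing tiles
    for r in range(n):                         # clear the unused upper triangle
        for c in range(r + 1, n):
            a[r][c] = 0.0
    return a
```

Each call to `potrf_tile`, `trsm_tile`, or `update_tile` touches at most three tiles, so in a task-based setting the core that runs the call and the memory node that holds those tiles can be chosen per task. The sketch itself runs sequentially and makes no placement decisions.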





Author: Emmanuel Jeannot

Source: https://hal.archives-ouvertes.fr/






