LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System





ICL - Innovative Computing Laboratory, Knoxville

Abstract: LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance Linpack benchmark. This article presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. The optimizations include lookahead, dynamic task scheduling, fine-grained parallelism for memory-bound operations, autotuning, and a data layout geared towards complex memory hierarchies. Performance in excess of one teraflop/s is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.
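For readers unfamiliar with the underlying procedure, the following is a minimal sequential sketch of LU factorization with partial pivoting in plain Python. It is a textbook, unblocked version and not the implementation described in the article: it uses none of the optimizations listed in the abstract (lookahead, dynamic scheduling, autotuning, GPU offload). The function names lu_partial_pivot and reconstruct are hypothetical and chosen only for illustration.

```python
def lu_partial_pivot(a):
    """Factor a square matrix so that P*A = L*U using row swaps.

    Returns (lu, perm). lu stores U on and above the diagonal and the
    multipliers of the unit lower triangular L below it. perm[i] is the
    index of the original row of A that ended up in row i.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]  # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Pivot search: largest-magnitude entry in column k, rows k..n-1.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            # Swap whole rows, including multipliers already stored.
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate entries below the pivot and update the trailing block.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
    return lu, perm


def reconstruct(lu, perm):
    """Multiply L by U; the result should equal A with rows ordered by perm."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]  # unit diagonal of L
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out


if __name__ == "__main__":
    A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
    lu, perm = lu_partial_pivot(A)
    print("row order:", perm)
    print("L*U:", reconstruct(lu, perm))
```

In this sketch the pivot search and row swap are the memory-bound part of each step, while the trailing-block update carries most of the arithmetic; high-performance implementations typically reorganize the latter into blocked matrix-matrix products.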





Authors: Jakub Kurzak, P. Luszczek, Mathieu Faverge, Jack J. Dongarra

Source: https://hal.archives-ouvertes.fr/






