Bound the Peak Performance of SGEMM on GPU with software-controlled fast memory





1 ALF - Amdahl's Law is Forever, Inria Rennes – Bretagne Atlantique, IRISA-D3 - ARCHITECTURE

Abstract: In this paper, we studied the characteristics of the NVIDIA GPU architecture that affect the SGEMM routine, and the potential peak performance of SGEMM on the Fermi GPU. Guided by this analysis, our SGEMM routine achieved about 11% (NN), 4.5% (TN), 3% (NT), and 9% (TT) better performance than the cuBLAS routine in the CUDA 4.1 package for large matrices on a GTX580 Fermi card. We also described how to use native assembly language directly in the CUDA runtime source code.





Authors: Junjie Lai, André Seznec

Source: https://hal.archives-ouvertes.fr/






