Concurrent Number Cruncher: An Efficient Sparse Linear Solver on the GPU


1 ALICE - Geometry and Lighting, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
2 CRPG - Centre de Recherches Pétrographiques et Géochimiques

Abstract: A wide class of geometry processing and PDE resolution methods needs to solve a linear system, where the non-zero pattern of the matrix is dictated by the connectivity matrix of the mesh. The advent of GPUs with their ever-growing amount of parallel horsepower makes them a tempting resource for such numerical computations. This can be helped by new APIs (CTM from ATI and CUDA from NVIDIA) which give direct access to the multithreaded computational resources and associated memory bandwidth of GPUs; CUDA even provides a BLAS implementation, but only for dense matrices (CuBLAS). However, existing GPU linear solvers are restricted to specific types of matrices, or use non-optimal compressed row storage strategies. By combining recent GPU programming techniques with supercomputing strategies (namely block compressed row storage and register blocking), we implement a sparse general-purpose linear solver which outperforms leading-edge CPU counterparts (MKL, ACML).
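
To make the storage strategy named in the abstract more concrete, here is a minimal CPU-side sketch of block compressed row storage (BCRS) and the matrix-vector product built on it, which is the core kernel of iterative sparse solvers. It is an illustration only, not the authors' GPU implementation: the names `BCRSMatrix`, `to_bcrs`, and `bcrs_spmv`, the row-major block layout, and the choice of Python are all assumptions made for this example. The per-block-row accumulator `acc` only mirrors the loop structure of register blocking; in compiled CPU or GPU code those partial sums would be the values kept in registers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BCRSMatrix:
    """Hypothetical BCRS container with fixed-size square blocks."""
    block_size: int      # edge length b of each dense block
    row_ptr: List[int]   # blocks of block row i are row_ptr[i] .. row_ptr[i+1]-1
    col_idx: List[int]   # block-column index of each stored block
    values: List[float]  # b*b entries per block, row-major, blocks contiguous


def to_bcrs(dense: List[List[float]], block_size: int) -> BCRSMatrix:
    """Convert a square dense matrix to BCRS, keeping only non-zero blocks."""
    n = len(dense)
    b = block_size
    if n % b != 0 or any(len(row) != n for row in dense):
        raise ValueError("matrix must be square with size divisible by block_size")
    row_ptr, col_idx, values = [0], [], []
    for bi in range(n // b):
        for bj in range(n // b):
            block = [dense[bi * b + r][bj * b + c]
                     for r in range(b) for c in range(b)]
            if any(v != 0 for v in block):
                col_idx.append(bj)
                values.extend(float(v) for v in block)
        row_ptr.append(len(col_idx))
    return BCRSMatrix(b, row_ptr, col_idx, values)


def bcrs_spmv(A: BCRSMatrix, x: List[float]) -> List[float]:
    """Compute y = A @ x, one block row at a time."""
    b = A.block_size
    n_block_rows = len(A.row_ptr) - 1
    y = [0.0] * (n_block_rows * b)
    for bi in range(n_block_rows):
        # Partial sums for the b rows of this block row; these are reused
        # across every block in the row before being written back once.
        acc = [0.0] * b
        for k in range(A.row_ptr[bi], A.row_ptr[bi + 1]):
            x_off = A.col_idx[k] * b   # one column index serves a whole block
            v_off = k * b * b
            for r in range(b):
                for c in range(b):
                    acc[r] += A.values[v_off + r * b + c] * x[x_off + c]
        y[bi * b:(bi + 1) * b] = acc
    return y
```

Compared with plain compressed row storage, one column index is stored and fetched per block rather than per scalar entry, and each loaded slice of `x` is reused for all rows of the block. This reduces index traffic at the cost of storing explicit zeros inside partially filled blocks.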

Authors: Luc Buatois, Guillaume Caumon, Bruno Lévy


