Lessons learned from contrasting BLAS kernel implementations

This work reviews the experience of implementing different versions of the SSPR rank-one update operation of the BLAS library. The main objective was to contrast CPU versus GPU implementation effort and complexity of an optimized BLAS routine, not considering performance. This work contributes with...


Bibliographic Details
Main author: More, Andres
Format: Conference object
Language: English
Published: 2013
Online access: http://sedici.unlp.edu.ar/handle/10915/31702
