Lessons learned from contrasting a BLAS kernel implementations

This work reviews the experience of implementing different versions of the SSPR rank-one update operation of the BLAS library. The main objective was to contrast CPU versus GPU implementation effort and complexity of an optimized BLAS routine, not considering performance. This work contributes with...

Bibliographic Details
Main author: More, Andres
Format: Conference object
Language: English
Published: 2013
Online access: http://sedici.unlp.edu.ar/handle/10915/31702
id I19-R120-10915-31702
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
BLAS libraries
SSPR kernel
CPU architecture
GPU architecture
performance analysis
performance measurement
software optimization
Software libraries
Optimization
PROCESSOR ARCHITECTURES
Performance Analysis and Design Aids
description This work reviews the experience of implementing different versions of the SSPR rank-one update operation of the BLAS library. The main objective was to contrast the implementation effort and complexity of an optimized BLAS routine on CPU versus GPU, without considering performance. The work contributes a sample procedure for comparing BLAS kernel implementations, guidance on how to start using GPU libraries and offloading and on how to analyze their performance, and an account of the issues faced and how they were solved.
format Conference object
author More, Andres
title Lessons learned from contrasting a BLAS kernel implementations
publishDate 2013
url http://sedici.unlp.edu.ar/handle/10915/31702