Publications of Torsten Hoefler  
Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler:
 
High Performance Unstructured SpMM Computation Using Tensor Cores
   (In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24), presented in Atlanta, GA, USA, pages 154:1-154:14, IEEE Press, ISBN: 979-8-3503-5291-7, Nov. 2024) 
 
Abstract
High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. Yet existing hardware, such as Tensor Cores (TCs), is ill-suited for SpMM, as it imposes strict constraints on data structures that cannot be met by the unstructured sparsity found in many applications. To address this, we introduce (S)parse (Ma)trix Matrix (T)ensor Core-accelerated (SMaT): a novel SpMM library that utilizes TCs for unstructured sparse matrices. Our block-sparse library leverages the low-level CUDA MMA (matrix-matrix-accumulate) API, maximizing the performance offered by modern GPUs. Algorithmic optimizations, such as sparse matrix permutation, further improve performance by minimizing the number of non-zero blocks. The evaluation on NVIDIA A100 shows that SMaT outperforms state-of-the-art (SotA) libraries (DASP, cuSPARSE, and Magicube) by up to 125x (on average 2.6x). SMaT can be used to accelerate many workloads in scientific computing, large model training, inference, and others.
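The abstract's point that permutation helps "by minimizing the number of non-zero blocks" can be illustrated with a small sketch. It counts the tiles of a sparse matrix that contain at least one non-zero, before and after a row permutation. This is not the SMaT implementation: the function names, the set-per-row matrix representation, and the greedy ordering heuristic are all assumptions made for illustration, and the permutation algorithm actually used in the paper is not described on this page.

```python
# Illustrative sketch only; names and the heuristic are hypothetical, not SMaT's.

def nonzero_blocks(rows, block_h, block_w):
    """Return the set of (block_row, block_col) tiles holding at least one non-zero.

    rows[i] is the set of column indices of the non-zeros in row i.
    """
    blocks = set()
    for i, cols in enumerate(rows):
        for j in cols:
            blocks.add((i // block_h, j // block_w))
    return blocks


def permute_rows(rows, order):
    """Reorder rows so that new row k is old row order[k]."""
    return [rows[i] for i in order]


def cluster_rows_by_pattern(rows, block_w):
    """Assumed greedy heuristic: order rows by the block columns they touch,
    so rows with similar sparsity patterns end up in the same tile row."""
    def key(i):
        return sorted({j // block_w for j in rows[i]})
    return sorted(range(len(rows)), key=key)


# A 4x4 matrix with 2x2 tiles: rows 0 and 2 touch the left tile column,
# rows 1 and 3 touch the right one.
rows = [{0, 1}, {2, 3}, {0}, {3}]
before = nonzero_blocks(rows, 2, 2)
order = cluster_rows_by_pattern(rows, 2)
after = nonzero_blocks(permute_rows(rows, order), 2, 2)
print(len(before), len(after))  # 4 2
```

In a block-sparse SpMM, each stored tile costs one dense tile multiplication regardless of how many non-zeros it holds, so halving the tile count in this toy case would halve the dense work. The output rows must be permuted back afterwards.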
 
Documents
Publisher URL: https://ieeexplore.ieee.org/document/10793184

BibTeX
@inproceedings{okanovic2024high,
  author={Patrik Okanovic and Grzegorz Kwasniewski and Paolo Sylos Labini and Maciej Besta and Flavio Vella and Torsten Hoefler},
  title={{High Performance Unstructured SpMM Computation Using Tensor Cores}},
  year={2024},
  month={Nov.},
  pages={154:1-154:14},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24)},
  location={Atlanta, GA, USA},
  publisher={IEEE Press},
  isbn={979-8-3503-5291-7},
  source={http://www.unixer.de/~htor/publications/},
}