Publications of Torsten Hoefler
Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren, Luigi Fusco, Matteo Turisini, Daniele Cesarini, Kurt Lust, Animesh Trivedi, Duncan Roweth, Filippo Spiga, Salvatore Di Girolamo, Torsten Hoefler:

 Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

(In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24), presented in Atlanta, GA, USA, pages 33:1-33:15, IEEE Press, ISBN: 979-8-3503-5291-7, Nov. 2024)

Publisher Reference

Abstract

Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency is challenging due to different technologies, design options, and software layers. This paper comprehensively characterizes three supercomputers - Alps, Leonardo, and LUMI - each with a unique architecture and design. We focus on performance evaluation of intra-node and inter-node interconnects on up to 4,096 GPUs, using a mix of intra-node and inter-node benchmarks. By analyzing their limitations and opportunities, we aim to offer practical guidance to researchers, system architects, and software developers dealing with multi-GPU supercomputing. Our results show that there is untapped bandwidth, and there are still many opportunities for optimization, ranging from network to software optimization.

Documents

Publisher URL: https://ieeexplore.ieee.org/document/10793179
 

BibTeX

@inproceedings{sensi2024exploring,
  author={Daniele De Sensi and Lorenzo Pichetti and Flavio Vella and Tiziano De Matteis and Zebin Ren and Luigi Fusco and Matteo Turisini and Daniele Cesarini and Kurt Lust and Animesh Trivedi and Duncan Roweth and Filippo Spiga and Salvatore Di Girolamo and Torsten Hoefler},
  title={{Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects}},
  year={2024},
  month={Nov.},
  pages={33:1-33:15},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24)},
  location={Atlanta, GA, USA},
  publisher={IEEE Press},
  isbn={979-8-3503-5291-7},
  source={http://www.unixer.de/~htor/publications/},
}

