Publications of Torsten Hoefler  
Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 
 Accurately Measuring Overhead, Communication Time and Progression of Blocking and Nonblocking Collective Operations at Massive Scale
   (International Journal of Parallel, Emergent and Distributed Systems. Vol 25, Nr. 4, pages 241-258, Taylor & Francis Group, ISSN: 1744-5779, Jul. 2010) 
 
 Abstract
    Accurate, reproducible and comparable measurement of the overheads,
    communication times and progression behavior of blocking and
    nonblocking collective operations is a complicated task.
    Although different measurement schemes for blocking collective
    operations are implemented in well-known benchmarks, many of these
    schemes introduce different systematic errors in their measurements.
    We characterize these errors and select a window-based approach as the
    most accurate method. However, this approach complicates measurements
    significantly and introduces clock synchronization as a new source
    of errors.
    We analyze approaches to avoid or correct those errors and develop a
    scalable synchronization scheme to conduct benchmarks on massively
    parallel systems. Our results are compared to the window-based scheme
    implemented in the SKaMPI benchmarks and show a reduction of the
    synchronization overhead by a factor of 16 on 128 processes.
    We also describe two different measurement schemes for the overhead and
    asynchronous progress of nonblocking collective communications. An
    implementation and results of both measurement schemes are
    presented.
 
 BibTeX
 @article{hoefler-collmea,
   author={Torsten Hoefler and Timo Schneider and Andrew Lumsdaine},
   title={{Accurately Measuring Overhead, Communication Time and Progression of Blocking and Nonblocking Collective Operations at Massive Scale}},
   journal={International Journal of Parallel, Emergent and Distributed Systems},
   year={2010},
   month={Jul.},
   pages={241-258},
   volume={25},
   number={4},
   publisher={Taylor \& Francis Group},
   issn={1744-5779},
   source={http://www.unixer.de/~htor/publications/},
 }