How fast will your application run at <next>-scale? Static and dynamic techniques for application performance modeling

Colloq: Speaker: 
Torsten Hoefler
Colloq: Speaker Institution: 
ETH Zürich, Switzerland
Colloq: Date and Time: 
Tue, 2015-06-30 14:00
Colloq: Location: 
Building 5200, Room 214 (Emory)
Colloq: Host: 
Jeff Vetter
Colloq: Host Email:
Colloq: Abstract: 
Many parallel applications suffer from latent performance limitations that may prevent them from utilizing resources efficiently when scaling to larger parallelism. Often, such scalability bugs manifest themselves only when an attempt to scale the code is actually being made, a point where remediation can be difficult. However, creating analytical performance models that would allow such issues to be pinpointed earlier is so laborious that application developers attempt it at most for a few selected kernels, running the risk of missing harmful bottlenecks. We discuss dynamic techniques to generate performance models for program scalability to identify scaling bugs early and automatically. This automation enables a new set of parallel software development techniques. We demonstrate the practicality of this method with various real-world applications but also point out limitations of the dynamic approach. We then discuss a static analysis that establishes close provable bounds for the number of loop iterations and the scalability of parallel programs. While this analysis captures more loops than existing techniques based on the Polyhedral model, no analysis can count all loops statically. We conclude by briefly discussing how to combine these two approaches into an integrated framework for scalability and performance analysis.
Colloq: Speaker Bio: 
Torsten is an Assistant Professor of Computer Science at ETH Zürich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum, where he chairs the "Collective Operations and Topologies" working group. Torsten won best paper awards at the ACM/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI 2013, IPDPS 2015, and other conferences. He has published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. His research interests revolve around the central topic of "Performance-centric Software Development" and include scalable networks, parallel programming techniques, and performance modeling. Additional information about Torsten can be found on his homepage at