Performance Modeling: Understanding the Past and Predicting the Future in HPC

Colloq: Speaker: 
Allan Snavely
Colloq: Speaker Institution: 
San Diego Supercomputer Center, University of California at San Diego
Colloq: Date and Time: 
Thu, 2008-01-17 10:00
Colloq: Location: 
Colloq: Host: 
Jeffrey Vetter
Colloq: Host Email:
Colloq: Abstract: 
In the context of High Performance Computing, a performance model is a calculable expression that takes as parameters attributes of application software, input data, and target machine hardware (and possibly other factors) and computes, as output, expected performance. Via parameterization, performance models enable exploration of performance as a function of possible enhancements to machine or code, and can thereby yield greater understanding of performance as well as opportunities to improve it. The models are therefore useful for exploring and improving the design of future machines in the context of application requirements, for exploring and improving application algorithm choice and software implementation in the context of machine capabilities, and for determining affinities of applications to machines.

Because of the historic difficulty of producing truly general models, prior art generally limited the scope of models to a single system and application, allowing only the system size and job size to vary. This talk will describe methods that can be effective over a broader range of system/application choices. The models can then be used to improve architecture design, inform procurement, and guide application tuning, as well as to improve resource allocation (scheduling). In our research, we are exploring these multiple uses and applying the results to several real-world problems, as will be described. The process of producing performance models has historically been rather time-consuming, requiring large amounts of computer time and highly expert human effort. This has severely limited the number of high-end applications that can be modeled and studied.
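The notion of a performance model as a calculable, parameterized expression can be illustrated with a toy sketch. The attribute names and rates below are hypothetical, chosen only to show the what-if exploration the abstract describes; they are not the speaker's actual model.

```python
# Toy parameterized performance model (illustrative only, not PMaC's model):
# application attributes (operation counts) and machine attributes (rates)
# are combined into an expected runtime.

def predicted_time(flops, mem_bytes, msgs, msg_bytes,
                   flop_rate, mem_bw, net_latency, net_bw):
    """Expected runtime in seconds from application and machine attributes."""
    compute = flops / flop_rate                         # arithmetic time
    memory = mem_bytes / mem_bw                         # memory-traffic time
    network = msgs * net_latency + msg_bytes / net_bw   # communication time
    return compute + memory + network

# What-if exploration via parameterization: double the memory bandwidth of a
# hypothetical machine and ask how much a memory-intensive workload benefits.
app = dict(flops=1e12, mem_bytes=4e12, msgs=1e5, msg_bytes=1e9)
base = dict(flop_rate=1e11, mem_bw=1e11, net_latency=2e-6, net_bw=1e9)
upgraded = dict(base, mem_bw=2e11)   # same machine, 2x memory bandwidth

t_base = predicted_time(**app, **base)
t_up = predicted_time(**app, **upgraded)
print(f"speedup from 2x memory bandwidth: {t_base / t_up:.2f}x")
```

Because the model is an explicit function of both code and machine attributes, the same expression answers design questions ("what if the machine were faster here?") and tuning questions ("what if the code demanded less of this resource?").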
It has been observed that, due to the difficulty of developing performance models for new applications, as well as the increasing complexity of new systems, supercomputers have become better at predicting and explaining natural phenomena (such as the weather) than at predicting and explaining their own performance or that of other computers! In our research we are addressing these challenges by automating the formation of models, making the process of acquiring application data faster and the resulting data easier to store, and representing the performance of target machines by a few simple, orthogonal benchmarks. The result is increased scope, accuracy, and applicability of performance models.
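The idea of representing each target machine by a few orthogonal benchmarks can be sketched as follows: each machine is reduced to a small vector of independently measured rates, each application to a signature of demands on those same resources, and a prediction combines the two. The machine names, numbers, and combining rule here are illustrative assumptions, not measured data.

```python
# Hedged sketch of prediction from orthogonal machine benchmarks
# (hypothetical machines and rates, for illustration only).

# Each machine: a few independently benchmarked rates (units per second).
machines = {
    "machine_A": {"mem_bw": 1.0e11, "flop_rate": 2.0e11, "net_bw": 2.0e9},
    "machine_B": {"mem_bw": 2.5e11, "flop_rate": 1.0e11, "net_bw": 1.0e9},
}

# Application signature: total demand placed on each resource by one run.
app_signature = {"mem_bw": 5.0e12, "flop_rate": 1.0e12, "net_bw": 2.0e9}

def predict(signature, machine):
    # Combine demands with rates: sum of demand/rate over each orthogonal
    # resource dimension gives an expected runtime in seconds.
    return sum(signature[k] / machine[k] for k in signature)

# The same application signature yields a prediction on every machine,
# which is how affinities of applications to machines can be ranked.
times = {name: predict(app_signature, m) for name, m in machines.items()}
best = min(times, key=times.get)
print(best, times)
```

Because the application signature is measured once and the machine benchmarks are measured once per machine, predictions across many application/machine pairs come cheaply, which is what makes the broader scope described above practical.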
Colloq: Speaker Bio: 
Allan Snavely leads the Performance Modeling and Characterization (PMaC) lab at the San Diego Supercomputer Center. He is also an Adjunct Professor in the Department of Computer Science and Engineering at the University of California, San Diego. His research, and practical applications of it, is supported by several federal agencies including the Department of Defense, the National Science Foundation, and the Department of Energy. He and his lab work closely with these agencies to understand and influence the capabilities of supercomputers to support the national strategic interest.