Memphis: Understanding the Memory System Performance of Current and Future Micro-Processor Architectures

Colloq: Speaker: 
Dr. Collin McCurdy
Colloq: Speaker Institution: 
University of Tennessee/ORNL
Colloq: Date and Time: 
Fri, 2010-10-15 10:00
Colloq: Location: 
5100, Room 128 JICS Lecture Hall
Colloq: Host: 
Dr. Jeffrey Vetter
Colloq: Host Email: 
vetter@ornl.gov
Colloq: Abstract: 
Current predictions call for each chip in an Exascale system to contain hundreds or even thousands of processing cores. Even at today’s levels of fewer than 10 cores per chip, memory limitations and performance considerations are forcing application teams to consider multi-threading as an alternative to the one-MPI-process-per-core model. At the same time, however, trends in micro-processor design are pushing performance problems associated with Non-Uniform Memory Access (NUMA) in multi-threaded applications to ever-smaller scales. This talk will examine the current state of NUMA and make several contributions. First, I will demonstrate that NUMA can indeed be a significant problem for scientific applications, showing that it can mean the difference between an application scaling perfectly and failing to scale at all. Second, I will summarize the performance problems that NUMA can present for multi-threaded applications and describe methods of addressing them. Third, I will describe several methods of using newly available hardware performance counters to aid in finding NUMA-related problems. Finally, I will introduce Memphis, a data-centric toolset that uses Instruction-Based Sampling hardware counters to help pinpoint problematic memory accesses, and demonstrate how we have used it to significantly improve dual-socket multi-threaded performance of several production-level codes, including XGC1 and CAM-HOMME.
Colloq: Speaker Bio: 
Collin McCurdy is a Post-Doctoral Research Associate in the Future Technologies Group at Oak Ridge National Laboratory. His research focuses on memory system designs in current and future processor architectures and their implications for scientific applications. He received his PhD in Computer Science from the University of Wisconsin–Madison in 2008.