Enhanced Operating System Support for MPI on Multi-Core Processors

Colloq: Speaker: 
Ron Brightwell
Colloq: Speaker Institution: 
Sandia National Laboratories
Colloq: Date and Time: 
Mon, 2009-04-20 15:00
Colloq: Location: 
ORNL, Bldg. 5100, Room 125
Colloq: Host: 
Jeffrey Vetter
Colloq: Host Email: 
Colloq: Abstract: 
The impact of commodity multi-core processors on the general computing community has been extensive. One can no longer assume that an application will run faster on successive generations of processors unless the application has been parallelized to take advantage of the increasing core count. The parallel computing community has also been adversely impacted by multi-core processors. While the increase in computation power and density from multi-core processors is encouraging, the majority of scientific parallel computing applications depend as much on memory subsystem performance as on compute performance. To make matters worse, the MPI model, upon which nearly all scalable parallel applications are based, puts further pressure on the limited memory bandwidth available to a processor.
We have developed an operating system page table mapping strategy called SMARTMAP that allows processes on a multi-core processor to directly access each other's memory through simple virtual address bit manipulation. The SMARTMAP capability allows the cooperating parallel processes on a compute node to run independently in separate address spaces, but also provides the ability for the processes to act as threads running in a single address space. When used to implement MPI, SMARTMAP eliminates all extraneous memory-to-memory copies imposed by UNIX-based shared memory strategies, significantly reducing pressure on the memory subsystem for intra-node data transfers. In addition, SMARTMAP can easily support operations that UNIX-based shared memory cannot, such as direct, in-place, threaded MPI reduction operations and one-sided get/put operations.
This talk will describe the implementation of SMARTMAP in the Catamount lightweight kernel that runs on the Cray XT-based Red Storm platform at Sandia National Labs. We will show performance results comparing a SMARTMAP-enabled MPI to traditional UNIX-based shared memory approaches for MPI. We will also briefly describe several related ongoing research projects, including the next-generation Portals high-performance network programming interface.
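As a rough illustration of the "simple virtual address bit manipulation" mentioned in the abstract, the sketch below shows one way such an address calculation could look. It is not taken from the talk. It assumes x86-64 four-level paging, where a single top-level page-table entry spans 2^39 bytes, and it assumes a layout in which each process's own memory sits in slot 0 and the memory of the peer with local rank r is aliased at slot r + 1. The names remote_address, SLOT_SHIFT, and MAX_LOCAL_RANKS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64 four-level paging, where one top-level
 * page-table entry covers 2^39 bytes (512 GiB). */
#define SLOT_SHIFT 39

/* Hypothetical limit: keeps every alias in slots 1..255, i.e. in the
 * lower canonical half of a 48-bit virtual address space. */
#define MAX_LOCAL_RANKS 255u

/* Hypothetical helper: given an address that is valid in a peer's own
 * slot 0, compute the alias under which the caller would see the same
 * byte, assuming the kernel mapped that peer at slot (local_rank + 1). */
static inline uint64_t remote_address(unsigned local_rank, uint64_t vaddr)
{
    assert(local_rank < MAX_LOCAL_RANKS);  /* alias slot must exist */
    assert((vaddr >> SLOT_SHIFT) == 0);    /* input must lie in slot 0 */
    return vaddr | ((uint64_t)(local_rank + 1) << SLOT_SHIFT);
}
```

The function only performs the arithmetic; dereferencing the result would be meaningful only on a kernel that has actually installed the corresponding page-table mappings. Under these assumptions, an intra-node transfer reduces to an ordinary memory copy from the computed alias, with no intermediate shared-memory buffer.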
Colloq: Speaker Bio: 
Ron Brightwell received his BS in mathematics in 1991 and his MS in computer science in 1994 from Mississippi State University. He joined Sandia National Laboratories in 1995 and is currently a Principal Member of Technical Staff. While at Sandia, he has designed and developed software for lightweight compute node operating systems and high-performance networks on several large-scale massively parallel systems, including the Intel Paragon and TeraFLOPS, and the Cray T3 and XT series of machines. His research interests include high-performance, scalable communication interfaces and protocols for system area networks, operating systems for massively parallel processing machines, and parallel program performance analysis libraries and tools.