Designing Next Generation High Performance Clusters and Datacenters with InfiniBand

Colloq: Speaker: 
Prof. Dhabaleswar K. (DK) Panda
Colloq: Speaker Institution: 
The Ohio State University
Colloq: Date and Time: 
Fri, 2005-03-18 11:00
Colloq: Location: 
ORNL 5100-Aud
Colloq: Host: 
Jeffrey S. Vetter
Colloq: Host Email: 
vetter@ornl.gov
Colloq: Abstract: 
The emerging InfiniBand Architecture (IBA) is generating a lot of excitement as an open interconnect standard for building next generation high-end systems in a radically different manner. This presentation will focus on research challenges and state-of-the-art solutions for designing HPC clusters and multi-tier datacenters with IBA. For HPC clusters, the focus will be on designing scalable, high performance implementations of the Message Passing Interface (MPI) standard (both MPI-1 and MPI-2). Issues, challenges and solutions for designing efficient support for point-to-point communication, collective communication (broadcast, barrier, all-to-all, etc.), flow control, datatypes, and synchronization on clusters with different processors, PCI interfaces (PCI-X and PCI-Express), and networks (single-rail and multi-rail) will be presented. The presentation will be based on our experiences in designing MVAPICH (MPI-1 over VAPI) and MVAPICH2 (MPI-2 over VAPI), which are being used in many IBA clusters to extract performance with IBA. On the datacenter front, issues, challenges and solutions related to designing high performance and scalable multi-tier datacenters with IBA will be highlighted. Performance benefits of the Sockets Direct Protocol (SDP) stack compared to the IP over IB (IPoIB) stack will be presented for various workloads. The impact of the latency, throughput, and CPU utilization of the protocol stacks on the overall performance of the datacenter will be highlighted. It will be shown how VAPI-level RDMA techniques can be used to provide strong coherency and reconfigurability with low overhead when designing next generation datacenters with dynamic data.
Colloq: Speaker Bio: 
Dhabaleswar K. (DK) Panda is a Professor of Computer Science at The Ohio State University. His research interests include parallel computer architecture, high performance networking, and network-based computing. He has published over 150 papers in these areas. His research group is currently collaborating with National Laboratories and leading companies on designing various communication and I/O subsystems of next generation HPC systems and datacenters with modern interconnects. The MVAPICH (MPI over VAPI for InfiniBand) package developed by his research group (http://nowlab.cis.ohio-state.edu/projects/mpi-iba/) is being used by more than 190 organizations worldwide to extract the potential of InfiniBand-based clusters for HPC applications. Dr. Panda is a recipient of the NSF CAREER Award, the OSU Lumley Research Award (1997 and 2001), and an Ameritech Faculty Fellow Award.