The NIH Biomedical Analysis and Simulation Supercomputer: Standing up a CPU+GPU Cluster for Biomedicine

Colloq: Speaker: 
Dr. Russell Taylor
Colloq: Speaker Institution: 
University of North Carolina Chapel Hill
Colloq: Date and Time: 
Fri, 2009-10-09 10:00
Colloq: Location: 
Bldg. 5100, RM 128 (JICS Lecture Hall)
Colloq: Host: 
Jeff Vetter and David Banks
Colloq: Host Email: 
vetter@ornl.gov and dbanks@eecs.utk.edu
Colloq: Abstract: 
The Biomedical Analysis and Simulation Supercomputer (BASS) at UNC Chapel Hill is a 256-node CPU cluster connected to a 180-node GPU cluster designed to support a range of biomedical jobs from UNC and affiliated groups. See http://wwwx.cs.unc.edu/Research/bass/. There were a number of successes as the resource was brought online: fully funded scientist support, facilities staff owning the installation, insisting on wiping the machine clean after install, scheduling maintenance windows, treating the system as an appliance, and encouraging an ongoing series of background jobs. There were also a number of surprises along the way: the vendor going out of business, power draws exceeding available circuits, user groups changing, funding delayed by a year, PCI device BIOS incompatibilities, and shell environment space limitations. This talk describes the good, the bad, and the humorous aspects of standing up the system and getting the load above 90% while preserving low-latency access for foreground jobs.
Colloq: Speaker Bio: 
Dr. Taylor is the PI for the Biomedical Analysis and Simulation Supercomputer. He holds positions in Computer Science, Physics & Astronomy, and Applied & Materials Sciences at UNC. He is the co-director of the UNC NIH National Research Resource for Computer Integrated Systems for Microscopy and Manipulation and directs the CS team in the UNC Nanoscale Science Research Group. He is chairman of the board of NanoManipulator Incorporated and a co-founder of Navitas Research, LLC.