| Date | Title | Description |
| 2009-10-21 | ORNL FT participates in NSF-funded partnership for innovative supercomputer based on graphics processors | The Georgia Institute of Technology today announced its receipt of a five-year, $12 million Track 2 award from the National Science Foundation's (NSF) Office of Cyberinfrastructure to lead a partnership of academic, industry, and government experts in the development and deployment of an innovative and experimental high-performance computing (HPC) system. The award provides for the creation of two heterogeneous HPC systems that will expand the range of research projects that scientists and engineers can tackle, including computational biology, combustion, materials science, and massive visual analytics. The project brings together leading expertise and technology resources from Georgia Tech's College of Computing, Oak Ridge National Laboratory (ORNL), University of Tennessee, National Institute for Computational Sciences, HP, and NVIDIA. |
| 2009-08-19 | ORNL and SNL researchers accelerate combustion simulation using GPUs | A team of researchers from ORNL (Kyle Spafford, Jeremy Meredith, Jeffrey Vetter, and Ramanan Sankaran) and Sandia National Laboratories (Jacqueline Chen and Ray Grout) has explored the performance benefits and accuracy tradeoffs of using graphics processors (GPUs) to accelerate S3D, one of DOE’s leading computational science applications, which simulates turbulent combustion. Although they were initially designed for 3D graphics, GPUs have evolved into an exciting platform for scientific computing due to their impressive processing capabilities and relatively low cost. The results show that computation on the GPU can preserve accuracy by using double precision and can execute the application’s most time-consuming code up to nine times faster than a traditional CPU. These results will be presented in a paper entitled “Accelerating S3D: A GPGPU Case Study” at the upcoming International Workshop on Algorithms, Models, and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar 2009) http://sips.inesc-id.pt/heteropar/. |
| 2009-06-15 | ORNL researchers demonstrate benefits of asynchronous programming in UPC | Aniruddha Shet (from the Computer Science Research Group, CSMD) and Vinod Tipparaju (from the Future Technologies group) at ORNL have implemented a prototype and demonstrated advantages of asynchronous remote methods (ARM) in UPC. In a paper presented at the APGAS 2009 workshop, held in conjunction with the 23rd International Conference on Supercomputing (ICS 2009), the team adopted the asynchronous style of programming to parallelize a nested, tree-based code in UPC. To maximize performance without losing the ease of application programming, the team designed Asynchronous Remote Methods as a potential extension to the UPC standard. Their prototype implementation of the ARM construct in Berkeley UPC comes within 7% of ideal performance and yields a 20-fold improvement over the original standard UPC solution for the Refine kernel in the MADNESS application. More information can be found at APGAS09. |
| 2009-05-23 | ORNL and Sun Researchers Demonstrate Parallel NFS Over Lustre at IPDPS 2009 | Researchers from the Future Technologies group at ORNL (Jeffrey Vetter and Weikuan Yu) and at Sun Microsystems (Oleg Drokin) have designed and implemented the first prototype of Parallel NFS over the scalable Lustre file system. Parallel NFS (pNFS) is an emerging open standard for parallelizing data transfer over a variety of I/O protocols. In a paper presented at the 23rd IEEE International Parallel and Distributed Processing Symposium, the team described the design, implementation, and evaluation of lpNFS, a Lustre-based Parallel NFS. The benefits of using pNFS include portability across a range of back-end file systems with no changes to the client-side operating system. The initial performance evaluation shows that the performance of pNFS is comparable to that of native Lustre under many I/O workloads. Given these results, lpNFS appears to be a promising approach to providing a scalable, high-performance, portable, datacenter-wide file system for DOE computing facilities. |
| 2009-05-23 | ORNL Researchers accelerate materials application with Graphics Processing Units (GPUs) | In a recent article in the journal ‘Parallel Computing,’ a team of ORNL researchers (Jeremy S. Meredith, Gonzalo Alvarez, Thomas A. Maier, Thomas C. Schulthess, Jeffrey S. Vetter) show how they have accelerated the Quantum Monte Carlo simulation code DCA++ using graphics processing units (GPUs) as general-purpose computational devices (also known as GP-GPUs). Although GPUs were initially designed for real-time rendering, their high performance and relatively low cost make them a desirable target for scientific computation. Recent efforts in the community have been addressing the programming challenges, with new languages such as CUDA and OpenCL being widely adopted. However, the original task of GPUs, rendering, has traditionally kept accuracy as a secondary goal, and sacrifices have sometimes been made as a result. In fact, much deployed GPU hardware is capable only of single-precision arithmetic, and even this accuracy is not always equivalent to that of a commodity CPU. In this paper, the team investigated the accuracy and performance characteristics of GPUs on DCA++, including results from a preproduction double-precision-capable GPU. They then accelerated the full DCA++ application while concurrently investigating its tolerance to the different levels of arithmetic precision available in GPUs. The results show that while DCA++ has some sensitivity to arithmetic precision, the single-precision GPU results were comparable to single-precision CPU results. Acceleration of the code on a fully GPU-enabled cluster showed that any remaining inaccuracy in GPU precision was negligible. Sufficient accuracy was retained for scientifically meaningful results while still showing significant speedups; the full parallel runtimes on the GPU cluster were five times faster than those on commodity microprocessors alone. |
| 2009-05-06 | ORNL researcher brings scalable tool infrastructure to Cray XT | At the Cray User Group (CUG) 2009 annual meeting in Atlanta, Philip Roth of the ORNL Future Technologies Group gave a presentation titled "Scalable Tool Infrastructure for the Cray XT Using Tree-Based Overlay Networks." Roth described his recent work in bringing the MRNet infrastructure to the Cray XT platform, including support for a new MRNet process placement strategy that co-locates all MRNet processes on compute nodes with application processes. |