Future Technologies Colloquium Series


Supporting High Performance I/O with Effective Caching and Prefetching


Song Jiang
Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI
July 26, 2007
10:00 AM

ORNL, 5700-L202

Host: Weikuan Yu (wyu@ornl.gov)


ABSTRACT:

A large spectrum of data-intensive applications, ranging from small system tools such as CVS and grep to terascale simulation applications that process huge amounts of scientific data, demands efficient I/O support. On almost all computing platforms, the ubiquitous hard disk remains the most cost-effective medium for on-line storage. While the growth of hard-disk capacity nicely matches the rapidly increasing demand for storage, the disk's electromechanical nature means that its performance improvements lag far behind those of processors. We continue to observe that the disk bottleneck is worsening in modern computer systems. In this talk I will present our research on improving disk I/O performance through better utilization of the disk buffer cache. I will describe an integrated caching and prefetching scheme, called DiskSeen, that not only makes the access patterns of applications exploitable by the buffer cache, but also makes the data layout of the disk visible to and exploitable by the buffer cache. By making the disk layout visible to the buffer cache, DiskSeen provides functionality that existing systems do not have. Examples include treating random disk accesses differently from sequential accesses so that disk accesses become more sequential, and carrying out prefetching directly on disk blocks using access history information so that metadata and inter-file prefetching are enabled. Using Linux kernel implementations, I will demonstrate that this technique can significantly improve the performance of a wide variety of applications.

BIO:

Dr. Song Jiang is an assistant professor in the ECE department at Wayne State University. He received his Ph.D. in computer science from the College of William and Mary in 2004. He then spent two years as a postdoctoral researcher at Los Alamos National Laboratory. His current research focuses on alleviating the I/O performance bottleneck in various computer system architectures and on I/O performance management in networked storage systems. His research has been published regularly at leading conferences such as USENIX and FAST. His work on process/memory scheduling to prevent process thrashing has been incorporated into the official version of the current Linux kernel.

# # #