Scalable Event Tracing on High-End Parallel Systems

Colloq: Speaker: 
Kathryn Mohror
Colloq: Speaker Institution: 
Portland State University
Colloq: Date and Time: 
Fri, 2009-08-07 09:30
Colloq: Location: 
ORNL, Bldg. 5700, Room MS-A106
Colloq: Host: 
Philip C. Roth
Colloq: Abstract: 
Event traces are required to correctly diagnose a number of performance problems that arise on today's highly parallel systems. Unfortunately, collecting event traces can produce a volume of data that is difficult, or even impossible, to store and analyze. In this talk, we present the results of two studies. The first is a measurement study of the overheads of collecting event traces. We examine several sources of tracing overhead: instrumentation, trace buffer size, periodic buffer flushes to disk, and the number of processors used by the target application. As expected, the overhead of instrumentation correlates strongly with the number of events; however, our results also indicate that the overhead of writing the trace buffer grows as the number of processors increases. The second study compares methods for reducing the size of trace files. A promising approach is to identify repeating patterns in a trace and retain only one representative of each pattern. However, identifying these patterns is not straightforward. We compare several pattern-based reduction methods and evaluate them for size reduction, introduced error, and retention of performance trends, using an application and a set of benchmarks with carefully chosen performance behaviors.
Colloq: Speaker Bio: 
Kathryn Mohror is a Ph.D. candidate in Computer Science at Portland State University in Portland, Oregon, advised by Karen L. Karavanic. Kathryn received her B.S. in Chemistry from Portland State University in 1999. While developing computer programs for her graduate work in Chemistry, she discovered her passion for computing and left the Chemistry department to pursue a career in Computer Science. She received her M.S. in Computer Science from Portland State University in 2004. Her Master's thesis explored parallel performance tool support for MPI-2. For her dissertation work, Kathryn is researching scalable collection of event-based performance data. Her research interests include scalable performance measurement and analysis for emerging high-end computing systems and file systems.