Future Technologies Colloquium Series


Timestamp Synchronization for Event Traces of Large-Scale Message-Passing Applications


Daniel Becker
Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH, Jülich, Germany
October 18, 2007
02:00 PM

ORNL, 5700-L202

Host: Philip Roth (rothpc@ornl.gov)


ABSTRACT:

Identifying wait states in event traces of message-passing applications requires measuring temporal displacements between concurrent events. In the absence of synchronized hardware clocks, linear interpolation techniques can already account for differences in offset and drift, assuming that the drift of an individual processor is not time dependent. However, inaccuracies and time-varying drift can still cause violations of the logical event ordering, for example a message that appears to be received before it was sent. The controlled logical clock algorithm accounts for such violations in point-to-point communication by shifting message events in time as much as needed while trying to preserve the length of intervals between local events. In this talk, I describe how the controlled logical clock is extended to collective communication to enable a more complete correction of realistic message-passing traces. In addition, I present a parallel version of the algorithm that is intended to scale to thousands of application processes and outline its implementation within the framework of the Scalasca toolkit.
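
For readers unfamiliar with the two ideas named in the abstract, the following is a minimal Python sketch, not the algorithm presented in the talk and not Scalasca code. It assumes two offset measurements per process (one near the start and one near the end of the run) and a known minimum message latency. All function names and parameters are hypothetical. The second function is a deliberately simplified forward shift that restores the send-before-receive ordering and keeps the intervals between later local events unchanged. The actual controlled logical clock algorithm is more elaborate.

```python
def interpolate_offset(t, t0, off0, t1, off1):
    """Estimate the clock offset at local time t by linear interpolation.

    (t0, off0) and (t1, off1) are two offset measurements against a
    reference clock. This assumes the drift is constant over time.
    """
    drift = (off1 - off0) / (t1 - t0)
    return off0 + drift * (t - t0)


def to_reference_time(t, t0, off0, t1, off1):
    """Map a local timestamp onto the reference clock."""
    return t + interpolate_offset(t, t0, off0, t1, off1)


def shift_after_receive(local_ts, recv_index, send_ts, min_latency):
    """Simplified correction of one violated point-to-point message.

    If the receive event at local_ts[recv_index] is earlier than
    send_ts + min_latency, move it forward by the missing amount and
    move all later local events by the same amount, so the intervals
    between them are preserved. Earlier events are left untouched.
    """
    delta = max(0.0, send_ts + min_latency - local_ts[recv_index])
    return [
        ts + delta if i >= recv_index else ts
        for i, ts in enumerate(local_ts)
    ]


if __name__ == "__main__":
    # Offset grows from 2.0 to 4.0 over the run; query the midpoint.
    print(to_reference_time(2.0, 0.0, 2.0, 4.0, 4.0))
    # Receive at 12.0 precedes the matching send at 13.0: a violation.
    print(shift_after_receive([10.0, 12.0, 15.0], 1, 13.0, 0.5))
```

In this sketch the receive event and its successors are shifted by the same amount, which is the simplest way to keep local interval lengths intact. It ignores the effect of the shift on messages sent later by the same process, which a full correction has to propagate.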


# # #