Science Automation with the Pegasus Workflow Management System

Colloq: Speaker: 
Ewa Deelman
Colloq: Speaker Institution: 
University of Southern California and USC Information Sciences Institute
Colloq: Date and Time: 
Tue, 2014-09-09 11:00
Colloq: Location: 
Conference Center (Building 5200), Tennessee Room (202-C)
Colloq: Host: 
Jeffrey S. Vetter
Colloq: Host Email:
Colloq: Abstract: 
Scientific workflows allow scientists to declaratively describe potentially complex applications that are composed of individual computational components. Workflows also include a description of the data and control dependencies between the components. This talk will describe example workflows in various science domains, including astronomy, bioinformatics, earthquake science, and gravitational-wave physics. It will examine the challenges faced by workflow management systems when executing workflows in distributed and high-performance computing environments. In particular, the talk will describe the Pegasus Workflow Management System developed at USC/ISI. Pegasus bridges the scientific domain and the execution environment by automatically mapping high-level workflow descriptions onto distributed resources. It locates the input data and computational resources necessary for workflow execution, and it restructures the workflow to improve performance and reliability. Pegasus can execute workflows on a laptop, a campus cluster, grids, and clouds. It can handle workflows ranging from a single task to millions of tasks and has been used to manage workflows that access and generate terabytes of data. The talk will describe the capabilities of Pegasus and how it manages heterogeneous computing environments.
Colloq: Speaker Bio: 
Ewa Deelman is a Research Associate Professor at the USC Computer Science Department and the Assistant Director of Science Automation Technologies at the USC Information Sciences Institute. Dr. Deelman's research interests include the design and exploration of collaborative, distributed scientific environments, with particular emphasis on workflow management and on the management of large amounts of data and metadata. In 2007, Dr. Deelman edited the book “Workflows in e-Science: Scientific Workflows for Grids”, published by Springer. She is also the founder of the annual Workshop on Workflows in Support of Large-Scale Science, which is held in conjunction with the Supercomputing conference. In 1997, Dr. Deelman received her PhD in Computer Science from Rensselaer Polytechnic Institute.