Workshop Program

 

Workshop Location: Room 504

Opening Remarks: 8:30-8:45 [Slides]

Yong Chen, Texas Tech University
Philip C. Roth, Oak Ridge National Laboratory
Xian-He Sun, Illinois Institute of Technology

Session 1: Keynote by Lucy T. Nowell, PhD

Time: 8:45 - 10:00
Session Chair: Phil Roth, Oak Ridge National Laboratory

  • Title: Data Intensive Science at Extreme Scale: Where Supercomputing and Big Data Meet
  • Abstract: Management, analysis and visualization of extreme-scale scientific data will undergo a radical change during the coming decade. Coupled with changes in the hardware architecture for next-generation supercomputers, explosive growth in the volume and velocity of scientific data results in significant challenges for researchers in computer science, mathematics and statistics, and domain sciences. New government policies aimed at preservation and sharing of digital data and publications that result from government-funded research present additional challenges, especially given worsening I/O bottlenecks and the high power costs of data movement that inhibit saving and reuse of data. Failure to develop new data management, analysis and visualization technologies that operate effectively on the changing supercomputer architecture will cripple scientific discovery and put national security at risk. Using examples from climate science, Dr. Lucy Nowell will explore technical and scientific drivers and opportunities for data science research of interest to the Advanced Scientific Computing Research program in the Department of Energy’s Office of Science.
  • Bio: Dr. Lucy Nowell is a computer scientist and program manager in the Office of Advanced Scientific Computing Research (ASCR) within the Department of Energy’s Office of Science. She manages a broad spectrum of ASCR-funded computer science research, with a particular emphasis on scientific data management and analysis. She has also served as a research program manager for the National Science Foundation and the Department of Defense, managing a variety of programs related to the management and preservation of digital data, data integration and analysis, and scientific and information visualization. Dr. Nowell moved to ASCR in the spring of 2009 from Pacific Northwest National Laboratory, where she was a Chief Scientist in the Information Analytics group. She earned her Master of Science and Doctor of Philosophy degrees in Computer Science at Virginia Tech, where the Computer Science Department recently awarded her a Distinguished Alumna Award. She also holds a Master of Fine Arts degree in Drama from the University of New Orleans and Master of Arts and Bachelor of Arts degrees in Theatre from the University of Alabama.

Session 2: Data-Intensive Architectures and Runtime Systems

Time: 10:30-12:30
Session Chair: Dries Kimpe, Argonne National Laboratory

  • Jiangling Yin, Junyao Zhang, Jun Wang and Wu-chun Feng. "SDAFT: A Novel Scalable Data Access Framework for Parallel BLAST". University of Central Florida and Virginia Tech [Slides]

  • Rohit Shivaswamy, Abani Patra and Vipin Chaudhary. "Large Data and Computation in a Hazard Map Workflow Using Hadoop and Netezza Architectures". SUNY at Buffalo [Slides]

  • Michael Sevilla, Ike Nassi, Kleoni Ioannidou, Scott Brandt and Carlos Maltzahn. "A Framework for an In-depth Comparison of Scale-up and Scale-out". University of California, Santa Cruz [Slides]

  • Dominique Lasalle and George Karypis. "BDMPI: Conquering BigData with Small Clusters using MPI". University of Minnesota [Slides]

Session 3: Data-Intensive Programming Models and File Systems

Time: 1:30-3:00
Session Chair: Weijun Xiao, Virginia Commonwealth University

  • Joong-Yeon Cho, Hyun-Wook Jin, Min Lee and Karsten Schwan. "On the Core Affinity and File Upload Performance of Hadoop". Konkuk University and Georgia Institute of Technology [Slides]

  • Yong Li, Dan Feng and Zhan Shi. "Enhancing Both Fairness and Performance Using Rate-Aware Dynamic Storage Cache Partitioning". Huazhong University of Science and Technology, China [Slides]

  • Patrick Donnelly and Douglas Thain. "Design of an Active Storage Cluster File System for DAG Workflows". University of Notre Dame [Slides]

Session 4: Data Analytics and Tools

Time: 3:30-5:00
Session Chair: Shane Canon, Lawrence Berkeley National Laboratory

  • Peter Coetzee and Stephen Jarvis. "CRUCIBLE: Towards Unified Secure On- and Off-Line Analytics at Scale". University of Warwick, UK [Slides]

  • Lan Vu and Gita Alaghband. "Novel Parallel Method for Mining Frequent Patterns on Multi-core Shared Memory Systems". University of Colorado Denver [Slides]

  • Alan Chappell, Sutanay Choudhury, John Feo, David Haglin, Alessandro Morari, Sumit Purohit, Karen Schuchardt, Antonino Tumeo, Jesse Weaver and Oreste Villa. "Toward a Data Scalable Solution for Facilitating Discovery of Scientific Data Resources". Pacific Northwest National Laboratory and NVIDIA [Slides]

Closing Remarks and Open Discussions: 5:00 - 5:30

Phil Roth, Oak Ridge National Laboratory
Yong Chen, Texas Tech University [Slides]
