CCT Colloquium Series
SPRUCE: An Infrastructure for Emergency, On-Demand, Urgent Computing
Pete Beckman, University of Chicago Computation Institute
Chief Architect, Argonne Leadership Computing Facility
Johnston Hall 338
January 18, 2008 - 11:30 am
High-performance modeling and simulation are playing a driving role in decision making and prediction. For time-critical emergency support applications such as severe weather prediction, flood and wildfire modeling, and influenza modeling, late results can be useless. With HPC resources distributed in a naturally fault-tolerant way across the nation, the community can build infrastructures and policy frameworks for utilizing high-end computational resources in support of emergency computation. A specialized infrastructure is needed to provide computing resources quickly, automatically, and reliably. SPRUCE is a system to support urgent or event-driven computing on both traditional supercomputers and distributed Grids. Currently, SPRUCE is deployed at several large supercomputer centers. The Linked Environments for Atmospheric Discovery (LEAD) project is one of our initial applications. LEAD embedded the SPRUCE system in its portal to simulate and predict severe weather events during the tornado season. We are replicating a similar framework for the SCOOP Project as well. We are also beginning to work with projects modeling wildfires, influenza, and pandemics.
Speaker's Bio:
Peter Beckman has worked in systems software for parallel computing, operating systems, and Grid computing for 20 years. After receiving a Ph.D. in computer science from Indiana University, he helped create the Extreme Computing Laboratory, which focused on parallel C++, portable run-time systems, and collaboration technology. In 1997, Peter joined the Advanced Computing Laboratory at Los Alamos National Laboratory, where he founded the ACL's Linux cluster team and organized the Extreme Linux series of workshops and activities that helped catalyze the high-performance Linux computing cluster community. Peter has also worked in industry. For example, in 2000 he founded a research laboratory in Santa Fe (sponsored by Turbolinux Inc.), which developed the world's first dynamic provisioning system for large clusters and data centers. The following year, Peter became vice president of Turbolinux's worldwide engineering efforts, managing development offices in Japan, China, Korea, and Slovenia. Peter began working at Argonne National Laboratory in 2002. As Director of Engineering for the TeraGrid, a $150 million effort sponsored by the National Science Foundation to build the world's largest open Grid computing environment, he designed and deployed the world's most advanced Grid system for linking production HPC computing centers. After the TeraGrid became fully operational, Peter started a research team focusing on petascale high-performance software systems, wireless sensor networks, Linux, and the SPRUCE system to provide urgent computing for critical, time-sensitive decision support. He is Chief Architect for the Argonne Leadership Computing Facility, which is deploying a 500 TF next-generation Blue Gene supercomputer. He has published numerous articles, served on national program committees, and presented invited papers and tutorials. Peter, his wife, and their two children live in Naperville, Illinois.