Cyberinfrastructure Seminar Series

Tuesday, August 2, 2005

Data Grids, Digital Libraries, and Persistent Archives: An Integrated Approach to Sharing, Publishing, and Archiving Data
Reagan Moore, SDSC
11:00 AM - 12:30 PM (PDT)
1:00 PM - 2:30 PM (CDT)
5239 Beckman Institute (NCSA) via AG

Live webcast at: www.cichannel.org (Real Player is required)

Applications on the TeraGrid generate simulation output that can be measured in the tens of terabytes and millions of files. Data grid technology simplifies the management of these massive data sets. Data grids organize data, which may be distributed across multiple sites, into a collection hierarchy. Descriptive metadata is associated with each file to support browsing and discovery. Digital library services, such as those provided by DSpace, are used to interact with the collection.
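
As a rough illustration of the idea above, the following Python sketch models a logical collection hierarchy whose files carry descriptive metadata and may have physical copies at several sites. It is a toy in-memory model only: the class names, site names, paths, and metadata attributes are all hypothetical and do not represent the SDSC Storage Resource Broker interface or any other real data grid API.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str  # hypothetical storage site name
    path: str  # hypothetical physical path at that site


@dataclass
class DataObject:
    logical_path: str  # position in the collection hierarchy
    metadata: dict = field(default_factory=dict)   # descriptive attributes
    replicas: list = field(default_factory=list)   # physical copies


class Catalog:
    """Toy catalog mapping logical paths to metadata and replicas."""

    def __init__(self):
        self._objects = {}

    def register(self, obj):
        self._objects[obj.logical_path] = obj

    def list_collection(self, collection):
        # Browsing: every logical path below the given collection.
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self._objects if p.startswith(prefix))

    def find(self, **criteria):
        # Discovery: logical paths whose metadata matches all criteria.
        return sorted(
            path
            for path, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


catalog = Catalog()
catalog.register(DataObject(
    "/project/run01/step0001.dat",
    {"variable": "velocity"},
    [Replica("site-a", "/tape/0001"), Replica("site-b", "/disk/0001")],
))
catalog.register(DataObject(
    "/project/run01/step0002.dat",
    {"variable": "pressure"},
    [Replica("site-a", "/tape/0002")],
))
catalog.register(DataObject(
    "/project/run02/step0001.dat",
    {"variable": "velocity"},
    [Replica("site-c", "/disk/0101")],
))

browse = catalog.list_collection("/project/run01")
hits = catalog.find(variable="velocity")
print(browse)
print(hits)
```

The point of the sketch is that browsing and discovery work on logical names and metadata alone, regardless of which site holds each physical copy.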

The integration of digital library technology with data grids makes it possible to share data easily within a scientific community. Multiple scientific disciplines are now assembling digital libraries composed of digital reference data sets that represent standard observational or simulation scenarios. Because they can both publish data for external researchers and share data under access controls within a project, data grids enable scientific research. Examples include the National Virtual Observatory, the Southern California Earthquake Center, the Biomedical Informatics Research Network, and the Alliance for Cell Signaling.

Data grids also provide support for incorporating new technology, including new storage systems and new access methods. The ability to support new technologies is called infrastructure independence. Along with mechanisms for authenticity and integrity, infrastructure independence mechanisms form the critical components of a preservation environment. The ability to use the same software infrastructure to implement a data sharing environment, a digital library, and a preservation environment is one of the great advances provided by data management systems available on the TeraGrid. Examples of these technologies will be illustrated using the SDSC Storage Resource Broker Data Grid.

The Cyberinfrastructure Seminar Series is a set of presentations on cyberinfrastructure and related research organized by NCSA and SDSC. These seminars are available on site at the presenting institution and remotely via the Access Grid. For more details regarding the AG venue for this seminar, please refer to: http://agschedule.ncsa.uiuc.edu/meetingdetails.asp?MID=9810. All Access Grid sites are welcome to participate in this seminar. If you have any questions, contact Jennie File, NCSA Training & Outreach Group.