U23A-0045
Developing Best Practices for Scientific Data Stewardship (SDS)
Science Data Stewardship (SDS) is the art of 'maintaining the science integrity and long term utility of scientific records' and 'the actions which maximize the return on investment for archived scientific data'. This paper presents a series of best practices for SDS developed under the Electronic Geophysical Year (eGY). These practices cover areas such as Storage and Preservation, Ease of Use, Interoperability, Quality Information and Metadata, Data Availability, User Presentation, Attribution and Accountability, and Electronic Data Preservation. These practices are of use to anyone concerned with the long-term stewardship and preservation of electronic records. This set of practices is currently being extended by the CODATA working group on the eGY. http://www.sciencedatastewardship.org
U23A-0046
Accessing seismic data through geological interpretation: Challenges and solutions
Between them, the world's research programs, national institutions and corporations, especially oil and gas companies, have acquired substantial volumes of seismic reflection data. Although the vast majority are proprietary and confidential, significant data are released and available for research, including those in public data libraries. The challenge now is to maximise use of these data by providing routes to seismic data not simply on the basis of acquisition or processing attributes but via the geology they image. The Virtual Seismic Atlas (VSA: www.seismicatlas.org) meets this challenge by providing an independent, free-to-use, community-based internet resource that captures and shares the geological interpretation of seismic data globally. Images and associated documents are explicitly indexed by extensive metadata trees, using not only existing survey and geographical data but also the geology they portray. The solution uses a Documentum database interrogated through Endeca Guided Navigation to search, discover and retrieve images. The VSA allows users to compare contrasting interpretations of clean data, thereby exploring the ranges of uncertainty in the geometric interpretation of subsurface structure. The metadata structures can be used to link reports and published research together with other data types such as wells. The VSA can also link to existing data libraries. Searches can take different paths, revealing arrays of geological analogues and new datasets while providing entirely novel insights and genuine surprises. This can then drive new creative opportunities for research and training, and expose the contents of seismic data libraries to the world.
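The idea of indexing images by metadata trees and refining a search step by step can be illustrated with a minimal Python sketch. It is not the VSA implementation (which, per the abstract, uses Documentum and Endeca Guided Navigation); the records, tag paths, and function names below are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical records: each image is tagged with paths into metadata trees
# (one tree for geology, one for location). These are illustrative only.
IMAGES = [
    {"id": "img-1", "tags": {"geology/structure/fault/thrust", "location/europe/alps"}},
    {"id": "img-2", "tags": {"geology/structure/fault/normal", "location/europe/north-sea"}},
    {"id": "img-3", "tags": {"geology/structure/fold", "location/europe/alps"}},
]


def matches(tag, selection):
    """True if the tag is the selected tree node or lies beneath it."""
    return tag == selection or tag.startswith(selection + "/")


def search(images, selections):
    """Return images that carry at least one tag under every selected node."""
    return [
        img for img in images
        if all(any(matches(tag, sel) for tag in img["tags"]) for sel in selections)
    ]


def refinements(images, parent):
    """Count images under each immediate child of `parent` to guide the next step."""
    depth = parent.count("/") + 1  # number of path segments in the parent node
    counts = Counter()
    for img in images:
        children = {
            "/".join(tag.split("/")[: depth + 1])
            for tag in img["tags"]
            if tag.startswith(parent + "/")
        }
        counts.update(children)  # a set, so each image counts once per child
    return dict(counts)
```

For example, `search(IMAGES, ["geology/structure/fault", "location/europe/alps"])` narrows the hypothetical collection to the single Alpine fault image, while `refinements(IMAGES, "geology/structure")` reports how many images sit under each child node, which is the kind of count a guided-navigation interface would show beside each option.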
U23A-0047
Collaborative Establishment of a Long-Term Archive for Stewardship of Interdisciplinary Scientific Data
Much of the scientific data that are being collected today cannot be recreated if they are not properly
preserved and documented. Establishment of reliable long-term digital archives is essential to preserving
these data and associated documentation beyond the working lifetimes of current scientists. Numerous
challenges, both technical and institutional, need to be addressed before these data or their documentation
become lost or inaccessible. Direct collaboration between university research libraries and active scientific
data centers is one approach to addressing these challenges. We report here on the collaboration between
the Columbia Libraries / Information Services and the Center for International Earth Science Information
Network (CIESIN) to establish an interdisciplinary long-term archive for data from the NASA Socioeconomic
Data and Applications Center (SEDAC). The SEDAC long-term archive serves as a trustworthy digital
repository to support preparation, submission, appraisal, ingest, discovery, integration, and interoperability of
scientific data that are expected to be of long-term interest to both natural and social scientists. Significant
progress has been made in establishing the necessary policies and procedures, implementing needed
standards and technologies, and assessing strengths and possible weaknesses in the long-term
sustainability of the archive. Benefits have included sharing approaches and best practices for information
technology solutions and scientific data stewardship. A key issue is the expected future integration of this
specialized archive into the long-term digital repository currently being developed by the University. Planned
activities include testing the migration of selected data from the SEDAC long-term archive to the forthcoming
Libraries repository and the development of interfaces between the digital object management systems being
implemented by SEDAC and the Libraries, which are both based on the Flexible Extensible Digital Object
and Repository Architecture (Fedora).
http://sedac.ciesin.columbia.edu/lta/
U23A-0048
Data Storage Systems at the National Snow and Ice Data Center: Evolution in Response to Change
The National Snow and Ice Data Center (NSIDC) has been managing earth observation data for more than
30 years. What started as a small collection of analog data is now a 150-terabyte archive of more than 650
datasets. Over the years, the data storage systems at NSIDC have evolved in response to growing data
volumes, advancing technologies, and changing user needs.
The life cycle of data storage is rapid in terms of both technology changes and cost reduction. The volumes
of data archives, specifically remotely sensed data, grow exponentially. The increasing availability of
technology and information to end users has resulted in a rapid change in user expectations. Striking a
balance between all of these factors is both an opportunity and a challenge for any data center.
NSIDC data storage evolutions have included media and mass storage system migrations as well as a shift
from tape-based archives to online disk archives. Data distribution systems have evolved to larger media
formats and online access to meet user needs. Additionally, online archives have enabled the expansion of
web services and on-demand data access and analysis tools. Throughout all of the evolutions there has
been the underlying need to ensure data integrity and viable backups. Here we will examine the major
changes to the NSIDC data systems over the past three decades as well as lessons learned and ongoing
challenges.
http://nsidc.org
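The need to ensure data integrity across media and storage system migrations can be illustrated with a short Python sketch that verifies a fixity checksum before and after a copy. This is an assumption about how such a check might look, not a description of NSIDC's actual systems; the function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source, destination):
    """Copy a file to new storage and confirm the copy is bit-identical.

    Returns the checksum so it can be recorded alongside the archived file.
    """
    source, destination = Path(source), Path(destination)
    expected = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"checksum mismatch after copying {source} to {destination}")
    return expected
```

Recording the returned checksum with each file would allow later audits of both the primary archive and its backups to detect silent corruption, whatever the underlying media.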