SF43B-01 13:40h
Integrating Science Information Systems with Geographic Information Systems: The OPeNDAP Experience
The OPeNDAP data access protocol is in increasing use for remote access to data in oceanography, meteorology, land cover studies and the space sciences. At the same time there is increasing use in the geographic information science (GIS) community of protocols to remotely access data. The fundamental difference between the two approaches for remote data access relates to the semantic metadata associated with the data. In the GIS case the semantic metadata are tightly coupled with the geographic needs of GISs, specifically, earth location, projection, etc. By contrast the OPeNDAP data access protocol does not impose a semantic metadata requirement on the data provider, although the protocol does provide a mechanism for the data provider to append semantic information to the data stream. This allows for use of the protocol over a much broader range of scientific disciplines, but also results in no consistency in the semantic metadata provided with the data. The OPeNDAP protocol could be viewed as a lower level protocol than those being developed for use in the GIS community. Despite the lack of consistency in semantic metadata there is a great deal of interest in the GIS community in access to OPeNDAP-enabled servers and conversely in the scientific community in access to GIS data via OPeNDAP-enabled clients. In this presentation we explore efforts undertaken within the OPeNDAP community to address these issues. In particular, we discuss issues related to access via the Web Mapping Server, EASy and MapServer as well as efforts to serve data stored in formats designed for use in GISs such as GeoTIFF.
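The contrast drawn above, between metadata that a GIS client requires and metadata that an OPeNDAP provider may or may not attach, can be illustrated with a minimal sketch. It assumes a dataset's attributes have already been retrieved into a plain dictionary. The attribute names, the `REQUIRED_FOR_GIS` table and the helper `missing_gis_metadata` are illustrative assumptions, not part of the OPeNDAP protocol or of any GIS protocol.

```python
# Hypothetical sketch: attribute names and the helper are assumptions made
# for illustration; they are not defined by OPeNDAP or by any GIS protocol.

# What a GIS-style client might insist on before it can place data on a map.
REQUIRED_FOR_GIS = {
    "lat": ["units"],
    "lon": ["units"],
    "crs": ["grid_mapping_name"],
}


def missing_gis_metadata(attributes):
    """Return a sorted list of 'variable.attribute' names a GIS client would miss.

    `attributes` maps variable name -> {attribute name: value}, the kind of
    free-form table a data provider may (but need not) attach to a dataset.
    """
    missing = []
    for variable, needed in REQUIRED_FOR_GIS.items():
        present = attributes.get(variable, {})
        for name in needed:
            if name not in present:
                missing.append(f"{variable}.{name}")
    return sorted(missing)


# A provider that supplied only partial semantic metadata.
sparse = {
    "lat": {"units": "degrees_north"},
    "sst": {"long_name": "sea surface temperature"},
}
# A provider that supplied everything the hypothetical client needs.
complete = {
    "lat": {"units": "degrees_north"},
    "lon": {"units": "degrees_east"},
    "crs": {"grid_mapping_name": "latitude_longitude"},
}

print(missing_gis_metadata(sparse))
print(missing_gis_metadata(complete))
```

Both datasets would be equally valid to serve under a protocol that imposes no semantic requirement; only the second would satisfy the assumed GIS-side check.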
SF43B-02 13:55h
The IRI Climate Data Library: translating between data cultures
The IRI Climate Data Library is a library of datasets. By "library" we mean a collection of things, collected from both near and far, designed to make them more accessible for the library's users. Our datasets come from many different sources, many different "data cultures", many different formats. By "dataset" we mean a collection of data organized as multidimensional dependent variables, independent variables, and sub-datasets, along with the metadata (particularly use-metadata) that makes it possible to interpret the data in a meaningful manner. Ingrid, which provides the infrastructure for the Data Library, is an environment that lets one work with datasets: read, write, request, serve, view, select, calculate, transform, and so on. It hides an extraordinary amount of technical detail from the user, letting the user think in terms of manipulations of datasets rather than manipulations of files of numbers. Among other things, this hidden technical detail could be accessing data on servers in other places, doing only the small needed portion of an enormous calculation, or translating to and from a variety of formats and between data cultures. Our datasets have been primarily climate, both oceanographic and meteorological, and are thus of that data culture. Our data are multi-dimensional, and our geolocation has been mostly either gridded longitude/latitude or point locations in longitude/latitude. In order to access and serve data from and to a broader community, we are expanding our holdings and tools in three new directions structurally: Geographical Information Systems (GIS) image data (similar to most of our holdings except that geolocation frequently requires interpreting the projection), GIS vector data (geolocation is by specifying vector geometries, i.e. lines or polygons), and named locations (data georeferenced only by named location). Our multidimensional data structure permits us to organize and analyze sets of images easily, unlike most GIS software.
On the other hand, adding the new geolocation methods gives our users access to data from many more sources. Finally, by translating these datasets from different data cultures into a common structure with standard use-metadata, we can translate between those cultures, and provide the infrastructure necessary for cross-disciplinary research.
http://iridl.ldeo.columbia.edu/dochelp/topics/DATASETS/
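The definition of a dataset given in the abstract above (dependent variables, independent variables, sub-datasets and use-metadata) can be sketched as a small data structure. This is an illustration only: the class names, fields and the `select` method are assumptions and do not reproduce Ingrid or the Data Library's internals.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: these classes are illustrative only and do not
# reproduce Ingrid or the Data Library's actual internals.


@dataclass
class IndependentVariable:
    name: str            # e.g. "lon", "lat", "time"
    values: list         # coordinate values along this dimension
    units: str = ""      # use-metadata needed to interpret the values


@dataclass
class DependentVariable:
    name: str
    dims: tuple          # names of the independent variables it depends on
    data: list           # nested lists, one nesting level per dimension
    units: str = ""


@dataclass
class Dataset:
    name: str
    independent: dict = field(default_factory=dict)
    dependent: dict = field(default_factory=dict)
    subdatasets: dict = field(default_factory=dict)

    def select(self, variable, **coords):
        """Pick one value of `variable` by coordinate value, not by index."""
        var = self.dependent[variable]
        node = var.data
        for dim in var.dims:
            index = self.independent[dim].values.index(coords[dim])
            node = node[index]
        return node


sst = Dataset(name="example")
sst.independent["lat"] = IndependentVariable("lat", [-10.0, 0.0, 10.0], "degrees_north")
sst.independent["lon"] = IndependentVariable("lon", [100.0, 110.0], "degrees_east")
sst.dependent["sst"] = DependentVariable(
    "sst", ("lat", "lon"), [[28.1, 28.4], [29.0, 29.3], [28.7, 28.9]], "degree_Celsius"
)

print(sst.select("sst", lat=0.0, lon=110.0))
```

Selecting by coordinate value rather than by array index is one way to let a user think in terms of datasets instead of files of numbers; other geolocation methods (projected images, vector geometries, named locations) would need different independent variables.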
SF43B-03 14:10h
An overview of the Live Access Server
In recent years, the Live Access Server (LAS) has gained recognition as a significant component in the design of data portals to provide access to gridded and non-gridded ocean, climate and atmosphere data. LAS is "freeware" available from NOAA's Pacific Marine Environmental Laboratory (http://www.ferret.noaa.gov). It is a named component of the plan for Data Management and Communications (DMAC) within the US Integrated Ocean Observing System (IOOS). LAS has been installed at well over 50 institutions worldwide and provides visualization, browsing, and subsetting services for many terabytes of data from a broad range of environmental fields. LAS is a configurable scientific data "product" server, providing a friendly user interface for 4-dimensional visualization, download, and comparison of ocean, atmosphere and climate data on the Web. The user may compare variables from distributed sites, either through visual inspection (graphical overlays and side-by-side plots) or by numerically differencing them (with automated regridding). LAS does not itself generate products. Rather, it functions as a broker that directs requests for specific types of products to various "back end" applications that perform the work. Back end applications may include commercial off-the-shelf software such as IDL, standardized product request protocols such as the Open GIS Web Mapping Services, scientific applications such as Ferret and GrADS, and UNIX utilities such as the netCDF operators (NCO). The "intelligence" of an LAS site lies entirely in its XML configuration files and associated resources. These files define not only the operations that are possible, but also the data sets, variables, units, space-time domains, etc. -- in short, the information necessary to present the user an interface to the data. By sharing copies of their XML configuration files and associated resources with a consolidated site, a group of LAS sites can be configured as a collaborating "sister server" cluster.
The consolidated sisters' (virtual) site provides access to the collective holdings, making it possible for distributed groups of researchers to pool their data holdings, while still retaining independent control. This mode of operation is particularly well suited to model intercomparison projects.
http://www.ferret.noaa.gov/Ferret/LAS/
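The broker role described in the abstract above, in which an XML configuration defines the available operations and datasets and the actual work is delegated to back-end applications, can be sketched as follows. The XML layout, the back-end functions and `handle_request` are all invented for illustration; they are not the real LAS configuration schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: this XML layout and these back-end functions are
# invented for illustration and are NOT the real LAS configuration schema.
CONFIG = """
<site>
  <operations>
    <operation name="plot" backend="ferret"/>
    <operation name="subset" backend="nco"/>
  </operations>
  <datasets>
    <dataset name="sst_monthly">
      <variable name="sst" units="degree_Celsius"/>
    </dataset>
  </datasets>
</site>
"""


def ferret_backend(dataset, variable):
    # Stand-in for handing the request to a plotting application.
    return f"ferret would plot {variable} from {dataset}"


def nco_backend(dataset, variable):
    # Stand-in for handing the request to a subsetting utility.
    return f"nco would subset {variable} from {dataset}"


BACKENDS = {"ferret": ferret_backend, "nco": nco_backend}


def handle_request(config_xml, operation, dataset, variable):
    """Act as a broker: validate against the config, then delegate the work."""
    root = ET.fromstring(config_xml)
    ops = {op.get("name"): op.get("backend") for op in root.iter("operation")}
    if operation not in ops:
        raise ValueError(f"operation not configured: {operation}")
    known = {
        (ds.get("name"), var.get("name"))
        for ds in root.iter("dataset")
        for var in ds.iter("variable")
    }
    if (dataset, variable) not in known:
        raise ValueError(f"unknown variable: {dataset}/{variable}")
    return BACKENDS[ops[operation]](dataset, variable)


print(handle_request(CONFIG, "plot", "sst_monthly", "sst"))
```

Because everything the broker knows comes from the configuration text, a consolidated site could in principle serve the holdings of several sites by reading copies of their configurations, which is the idea behind the "sister server" arrangement.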
SF43B-04 14:25h
Digital Libraries as a Component of Cyberinfrastructure: DLESE and the NSDL as Examples of Providing a Knowledge Layer to CI
Digital libraries have been identified in many reports and workshops as one component of cyberinfrastructure, not only to support the aggregation and dissemination of materials but also as new architectures to support the development of networks of knowledge and understanding. The Digital Library for Earth System Education (DLESE) aims to support Earth system education, but is itself a component of a larger network of digital libraries in the sciences such as the National Science Digital Library (NSDL). These networks of libraries are envisioned to provide a "knowledge layer" over a broad array of heterogeneous components from data sets to learned articles, from data analysis to complex simulation environments. As cyberinfrastructure develops the tools and technologies to access and interact with data (for example in projects such as LEAD, GEON, THREDDS), the distinction between scientific data archiving, access and analysis, and the functions of a digital library continues to blur as we concentrate on the needs of the user, be they researcher, teacher or learner. Data analysis may be incorporated into more complex artifacts such as compound documents, simulation environments, and learning objects, which themselves must contain attributes and descriptions for understanding their context and use. Thus a cyberinfrastructure that incorporates the range of technologies of data access and analysis and digital libraries that can provide broad context must include mechanisms for interoperability through all levels of these infrastructures. New tools and approaches in digital library and information sciences can leverage the materials and relationships provided through this interoperability.
SF43B-05 14:40h
High Performance Geostatistical Modeling of Biospheric Resources
We are using parallel geostatistical codes to study spatial relationships among biospheric resources in several study areas. For example, spatial statistical models based on large- and small-scale variability have been used to predict species richness of both native and exotic plants (hot spots of diversity) and patterns of exotic plant invasion. However, broader use of geostatistics in natural resource modeling, especially at regional and national scales, has been limited due to the large computing requirements of these applications. To address this problem, we implemented two parallel versions of the kriging spatial interpolation algorithm. The first uses the Message Passing Interface (MPI) in a master/slave paradigm on an open source Linux Beowulf cluster, while the second is implemented with the new proprietary Xgrid distributed processing system on an Xserve G5 cluster from Apple Computer, Inc. These techniques are proving effective and provide the basis for a national decision support capability for invasive species management that is being jointly developed by NASA and the US Geological Survey.
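The decomposition described above, in which a coordinating process hands out independent kriging work units and gathers the results, can be sketched in a few lines. This is not the authors' code: it assumes ordinary kriging with an exponential variogram and zero nugget, uses made-up sample values, and substitutes a Python thread pool for MPI ranks or Xgrid agents purely to show the split/compute/gather pattern.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch: the variogram model, its parameters, and the use of a
# thread pool in place of MPI ranks are assumptions, not the authors' code.


def variogram(h, sill=1.0, corr_range=5.0):
    """Exponential variogram with zero nugget (an assumed model)."""
    return sill * (1.0 - math.exp(-h / corr_range))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def krige_point(samples, target):
    """Ordinary kriging estimate at `target` from (x, y, value) samples."""
    n = len(samples)
    pts = [(x, y) for x, y, _ in samples]
    # Kriging system: variogram matrix bordered by the unbiasedness constraint.
    a = [[variogram(math.dist(p, q)) for q in pts] + [1.0] for p in pts]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in pts] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * s[2] for w, s in zip(weights, samples))


def krige_chunk(args):
    """Work unit a worker would receive from the coordinating process."""
    samples, targets = args
    return [krige_point(samples, t) for t in targets]


def krige_parallel(samples, targets, workers=2, chunk_size=2):
    """Coordinator: split targets into chunks, farm them out, gather in order."""
    chunks = [targets[i:i + chunk_size] for i in range(0, len(targets), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(krige_chunk, [(samples, c) for c in chunks])
    return [value for chunk in results for value in chunk]


samples = [(0.0, 0.0, 3.0), (4.0, 0.0, 5.0), (0.0, 4.0, 7.0), (4.0, 4.0, 6.0)]
targets = [(0.0, 0.0), (2.0, 2.0), (1.0, 3.0)]
print(krige_parallel(samples, targets))
```

Each target location requires its own linear solve and depends only on the shared sample set, which is why the problem divides cleanly among workers. With a zero nugget the first target, which coincides with a sample, reproduces that sample's value up to rounding.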
SF43B-06 14:55h
The USGODAE Monterey Data Server
With oversight from the U.S. Global Ocean Data Assimilation Experiment (GODAE) Steering Committee and funding from the Office of Naval Research, the USGODAE Monterey Data Server has been established at the Fleet Numerical Meteorology and Oceanography Center (FNMOC) as an explicit U.S. contribution to GODAE. Support of the Monterey Data Server is accomplished by a cooperative effort between FNMOC and NOAA's Pacific Marine Environmental Laboratory (PMEL) in the on-going development of the server and the support of a collaborative network of GODAE assimilation groups. This server hosts near real-time in-situ oceanographic data, atmospheric forcing fields suitable for driving ocean models, and unique GODAE data sets, including demonstration ocean model products. GODAE is envisioned as a global system of observations, communications, modeling and assimilation, which will deliver regular, comprehensive information on the state of the oceans in a way that will promote and engender wide utility and availability of this resource for maximum benefit to society. It aims to make ocean monitoring and prediction a routine activity in a manner similar to weather forecasting. GODAE will contribute to an information system for the global ocean that will serve interests from climate and climate change to ship routing and fisheries. The USGODAE Server is developed and operated as a prototypical node for this global information system. Because of the broad range and diverse formats of data used by the GODAE community, presenting data with a consistent interface and ensuring its availability in standard formats is a primary challenge faced by the USGODAE Server project. To this end, all USGODAE data sets are available via HTTP and FTP. In addition, USGODAE data are served using Local Data Manager (LDM), THREDDS cataloging, OPeNDAP, and Live Access Server (LAS) from PMEL. 
Every effort is made to serve USGODAE data through the standards specified by the National Virtual Ocean Data System (NVODS) and the Integrated Ocean Observing System Data Management and Communications (IOOS/DMAC). To provide surface forcing, fluxes, and boundary conditions for ocean model research, USGODAE serves global data from the Navy Operational Global Atmospheric Prediction System (NOGAPS) and regional data from the Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS). Global meteorological data and observational data from the FNMOC Ocean QC process are posted in near real-time to USGODAE. These include T/S profiles, in-situ and satellite sea surface temperature (SST), satellite altimetry, and SSM/I sea ice. They contain all of the unclassified in-situ and satellite observations used to initialize the FNMOC NOGAPS model. Also, the Naval Oceanographic Office provides daily satellite SST and SSH retrievals to USGODAE. The USGODAE Server functions as one of two Argo Global Data Assembly Centers (GDACs), hosting the complete collection of quality-controlled Argo T/S profiling float data. USGODAE Argo data are served through OPeNDAP and LAS, providing complete integration into NVODS and the IOOS/DMAC. Due to its high reliability, ease of data access, and increasing breadth of data, the USGODAE Server is becoming an invaluable resource for both the GODAE community and the general oceanographic community. Continued integration of model, forcing, and in-situ data sets from providers throughout the world is making the USGODAE Monterey Data Server a key part of the international GODAE project.
http://www.usgodae.org
SF43B-07 15:10h
Marine Metadata Interoperability: A Community Framework
There are a number of independent metadata projects throughout the ocean science community, each developing some combination of standards, tools, and ontologies. Some are project-specific, and others address entire disciplines. Few of these efforts have engaged the wider ocean science community, or provided a coordinated set of resources that can guide the development of distributed, integrated and interoperable ocean-data systems. The Marine Metadata Interoperability team plans to implement a community-based framework to coordinate developments in usable, interoperable marine metadata. The project will build from the myriad existing efforts, encourage widespread community participation (national and international), and demonstrate the use and benefits of metadata standardization through development of prototype interoperable data management solutions. Our goals are to engage the ocean science community by: a) providing technical guidance and reference documentation on using and developing metadata solutions; b) encouraging community involvement in the development and evaluation of those documents; and c) establishing two test-bed activities to demonstrate cross-platform, cross-disciplinary, interoperable distributed data systems. The two coordinated test-bed demonstrations will leverage metadata work across at least three different types of data acquisition systems: (1) cabled or moored platforms, (2) mobile autonomous systems, and (3) remote-sensing platforms. Multiple instances of each system, from different institutions, will be included in the application of existing, modular and scalable data systems. The proposed work will provide resources for metadata development to the ocean science community, building from developments in computer science and other geosciences. It will support the NSF emphasis on needed interoperability between data systems, as demonstrated through ORION and several recent interoperability workshops.
It will also address a top priority in the OceanUS Data Management and Communications (DMAC) plan.
http://wiki.mbari.org/marinemetadatawiki/MetadataInteroperabilityProposal
SF43B-08 15:25h
MERSEA, the European Gate to Ocean Data
Mersea ('Marine Environment and Security for the European Area'), a European project to manage the oceans, aims to develop by 2008 the ocean component of GMES ('Global Monitoring for Environment and Security'), a system for operational monitoring and forecasting, on global and regional scales, of ocean physics, biogeochemistry and ecosystems. The Mersea project started on April 1st, 2004. This ocean monitoring system is envisioned as an operational network that systematically acquires data and disseminates information to serve the needs of intermediate users and policy makers, in support of safe and efficient off-shore activities, environmental management, security, and sustainable use of marine resources. Three real-time data streams have been identified: remotely sensed data from satellites, in situ data from ocean observing networks, and surface forcing fields from numerical weather prediction agencies. Mersea will ensure the availability of near-real-time and delayed-mode products over the period 2004-2008, with global and regional products optimised for supporting operational oceanography. Historical data sets for the last 15 years will also be prepared. Mersea is also the European center serving the goals of GODAE ('Global Ocean Data Assimilation Experiment', 2003-2005). The timely delivery of high-quality and reliable information to many user categories is essential for the success of such an integrated project. There is consequently a large effort to coordinate all delivery actions, giving special attention to users' needs. This effort will cover many issues, such as product presentation, the catalogue of products and web services, and how to support interdisciplinary and integrated use. A first major difficulty is to achieve, at many levels, product coherency and standardisation, which are needed to facilitate the visibility, understanding and exchange of ocean observing data.
A first task will therefore be to write a common unified framework guide, a kind of member charter, which will require partners to: (1) apply or define standards for ocean products and associated information; (2) harmonise the data exchange procedure (i.e. rely on a decentralised but compatible system architecture for distribution on the Internet); and (3) allow federation and clustering of individual data centers in order to facilitate the routine exchange of high-quality and appropriate environmental information, both in real time and in delayed mode (i.e. set up a single common ocean portal, an information management system for ocean products and services on the Internet, which will allow users to identify and access spatial or geographical information from a wide range of sources, from the local level to the global level, in real time and in an interoperable way for a variety of uses). This presentation will comment on this member charter and its requirements, and on the ocean portal concept. We will also review what has been done for the precursor project Mersea Strand-1 and the strengths and weaknesses it revealed (the participating ocean models FOAM (United Kingdom), Mercator (France), MFS (Italy) and Topaz (Norway); the unified framework; the choice of OPeNDAP technology for distribution; and manipulation and visualisation, for education and public outreach or for real-time complementary intercomparison expertise, with a Live Access Server). [See also S. Baudel's presentation for an overview of what already exists.]
http://www.mersea.eu.org