Earth and Space Science Informatics [IN]

IN31D
 MC:3022  Wednesday  0800h

Building Interoperability Across the Geosciences I


Presiding:  L Gundersen, U.S. Geological Survey; L Allison, Arizona Geological Survey

IN31D-01 INVITED

Before you make the data interoperable you have to make the people interoperable

* Jackson, I ij@bgs.ac.uk, British Geological Survey, Keyworth, Nottingham, NG12 5GG, United Kingdom

In February 2006 a deceptively simple concept was put forward. Could we use the International Year of Planet Earth 2008 as a stimulus to begin the creation of a digital geological map of the planet at a target scale of 1:1 million? Could we design and initiate a project that uniquely mobilises geological surveys around the world to act as the drivers and sustainable data providers of this global dataset? Further, could we synergistically use this geoscientist-friendly vehicle of creating a tangible geological map to accelerate progress of an emerging global geoscience data model and interchange standard? Finally, could we use the project to transfer know-how to developing countries and reduce the length and expense of their learning curve, while at the same time producing geoscience maps and data that could attract interest and investment? These aspirations, plus the chance to generate a global digital geological dataset to assist in the understanding of global environmental problems and the opportunity to raise the profile of geoscience as part of IYPE, seemed more than enough reasons to take the proposal to the next stage. In March 2007, in Brighton, UK, 81 delegates from 43 countries gathered together to consider the creation of this global interoperable geological map dataset. The participants unanimously agreed the Brighton "Accord" and kicked off "OneGeology", an initiative that now has the support of more than 85 nations. Brighton was never designed to be a scientific or technical meeting: it was overtly about people and their interaction. Would these delegates, with their diverse cultural and technical backgrounds, be prepared to work together to achieve something which, while technically challenging, was not complex in the context of leading edge geoscience informatics? Could we scale up what is a simple informatics model at national level, to deliver global coverage and access?
The major challenges for OneGeology (and the deployment of interoperability) are rarely scientific or technical; they were and are the significantly more difficult logistical and "geopolitical - cultural" issues. OneGeology has grown and progressed rapidly to be an international project. It has not only achieved its first phase scientific and technical goals in launching its web map portal with map data from 30 nations at the International Geological Congress in August 2008, but has also attracted substantial scientific, public and media interest around the world. OneGeology is, in every sense, a child of its time - an agile Internet paradigm - a project whose informatics interoperability goals are in reality the total project ethos. The project has been allowed to grow and extend just as fast and as wide as its actors agree to take it, for the most part free from the territoriality and bureaucracy that all too often inhibit such initiatives. It is beyond doubt that a conventionally run (and thus constrained) OneGeology would not have achieved its goals. The OneGeology team has taken enormous strides in a very short space of time and the achievements are considerable. But some new challenges now arise. How will we sustain the project? Where do we take it next? Can OneGeology continue its "liberal" modus operandi? How should we fund and provide continuity for a growing and thus more demanding infrastructure and user base? Should we expand the portal to include map data from academia, commerce and the public (and, if so, how do we maintain authentication)? How fast do we increase the sophistication of the informatics and the resolution and diversity of the data? The presentation will describe OneGeology, its current status and the technical and cultural issues involved in trying to move forward interoperability on a global scale.

IN31D-02

Geosciences Information Network (GIN): A Distributed, Interoperable Data Network for the Geosciences

* Allison, L Lee.allison@azgs.az.gov, Arizona Geological Survey, 416 W. Congress St., Suite 100, Tucson, AZ 85701-1381, United States
Gundersen, L C lgundersen@usgs.gov, U.S. Geological Survey, 12201 Sunrise Valley Drive National Center MS 911, Reston, VA 20192, United States
Richard, S M Steve.richard@azgs.az.gov, Arizona Geological Survey, 416 W. Congress St., Suite 100, Tucson, AZ 85701-1381, United States
Dickinson, T L tdickinson@usgs.gov, U.S. Geological Survey, 12201 Sunrise Valley Drive National Center MS 911, Reston, VA 20192, United States

A coalition of the state geological surveys (AASG), the U.S. Geological Survey (USGS), and other partners will receive NSF funding over the next 3 years under the INTEROP solicitation to start building a distributed, interoperable data network that will make thousands of databases from the geological surveys and their partners available, searchable, and interoperable. This Geosciences Information Network (GIN) will focus on both spatial and analytical geologic data collected across the country for the past 150 years. Key components of the proposed network include: 1) catalog systems for data discovery; 2) service definitions that define interfaces for searching catalogs and accessing resources; 3) shared interchange formats to encode information for transmission; 4) data providers that publish information using standardized services defined by the network; and 5) client applications enabled to utilize information resources provided by the network. The GIN will integrate and utilize catalog resources that currently exist or are in development. We are working closely with the USGS National Geologic Map Database and its existing map catalog; with the USGS National Geological and Geophysical Data Preservation project, which is developing a metadata catalog for geoscience information resource discovery; and with the GEON catalog. Existing and emerging extensible mark-up languages such as GeoSciML, ChemML, and Open Geospatial Consortium sensor, observation and measurement MLs will provide the necessary interchange formats. Client application development will be fostered by collaboration with industry partners such as ESRI, whose Geology Data Model for ArcGIS software is being designed to be compatible with GIN. The GIN project will focus on development of the remaining aspects of the system, including service definitions, technical assistance to data providers to implement the services and bring content online, and system integration.
The Geosciences Information Network project will be managed by the Arizona Geological Survey on behalf of the Association of American State Geologists (AASG) in partnership with the USGS. Other collaborations include the OneGeology-Europe (www.onegeology.org) consortium of 27 nations that is building a similar network under the EU INSPIRE initiative, GEON (www.geongrid.org), and Earthchem (www.earthchem.org). OneGeology-Europe and GIN have agreed to integrate their networks and work towards the goal of developing global standards among geological surveys.

IN31D-03

GeosciNET: Building a Global Geoinformatics Partnership

* Snyder, W S wsnyder@boisestate.edu, Department of Geosciences Boise State University, 1910 University Drive, Boise, ID 83725, United States
Lehnert, K A lehnert@ldeo.columbia.edu, Lamont-Doherty Earth Observatory Columbia University, 61 Route 9W, Palisades, NY 10964-8000, United States
Ito, E eito@umn.edu, Department of Geology and Geophysics, University of Minnesota, 310 Pillsbury Drive SE, Minneapolis, MN 55455, United States
Harms, U ulrich@gfz-potsdam.de, GeoForschungsZentrum Potsdam, Telegrafenberg A3, Potsdam, D-14473, Germany
Klump, J jens.klump@gfz-potsdam.de, GeoForschungsZentrum Potsdam, Telegrafenberg A3, Potsdam, D-14473, Germany

GeosciNET is a collaboration of several existing geoinformatics efforts organized to provide a more effective data system for geoscience projects. Current members are: CoreWall (www.corewall.org), Geoinformatics for Geochemistry (GfG; www.geoinfogeochem.org), System for Earth Sample Registration (SESAR; www.geosamples.org), GeoStrat SYS (www.geostratsys.org (formerly: PaleoStrat, www.paleostrat.org)), and the International Continental Drilling Program (ICDP; www.icdp-online.org). GeosciNET's basic goal is to advance coordination, complementarity, and interoperability, and minimize duplication of efforts among the involved partner systems in order to streamline the development and operation of geoinformatics efforts. We believe that by advancing the development and data holdings of its member groups, the overall value of each site will be significantly enhanced and better meet the needs of the users. With the existing membership, GeosciNET can offer a comprehensive, integrated system for data acquisition, dissemination, archiving, visualization, integration, and analysis. The system will enable a single researcher or a group of collaborators to keep track of, visualize, and digitally archive any type of sample- or stratigraphic-based data produced from drill holes, dredges, measured stratigraphic sections, the field, or the laboratory. The challenge is to build a linked system that provides users a library of research data as well as tools to input, discover, access, integrate, manipulate, analyze, and model interdisciplinary data - all without corrupting the original data and ensuring that the data are attributed to the originator at all times. Science runs on data, but despite the importance of data (legacy or otherwise), there are currently few convenient mechanisms that enable users to easily input their data into databases.
While some efforts such as GfG databases, PetDB and SedDB have worked hard to compile such data, only users' active participation can capture the major part of critical legacy data, and ensure that new data enter the digital stream as they are generated. GeosciNET wants to lower the barriers so users can take advantage of geoinformatics resources and embrace its promise as the platform for doing the science of the future. Once these benefits are understood by the user community, the obstacles that currently exist in building a larger geoinformatics system will start to erode. User participation requires the proper tools such as translators that can recognize tags and parse the data accordingly, and incentives such as tools for visualization, synthesis and analysis, and digital collaboration space. A major focus for GeosciNET is to support individual researchers and projects that do not have their own dedicated data management and education and outreach programs. One of the greatest challenges for geoinformatics lies in being perceived as a friendly resource by its users where they can easily link their observations and analyses and integrate them with other data. GeosciNET will be experimenting with mechanisms to accomplish these goals.

IN31D-04

Full and Open Access to Data in the Global Earth Observing System of Systems (GEOSS): Implementing the GEOSS Data Sharing Principles

* Chen, R S bchen@ciesin.columbia.edu, Committee on Data for Science and Technology (CODATA), 5 rue Auguste Vacquerie, Paris, 75016, France
* Chen, R S bchen@ciesin.columbia.edu, CIESIN, Columbia University, 61 Route 9W, Palisades, NY 10956, United States
Uhlir, P F puhlir@nas.edu, Office of International Scientific Information Programs, the National Academies, 500 Fifth Street NW, Washington, DC 20001, United States
Gabrynowicz, J I jgabryno@olemiss.edu, National Center for Remote Sensing, Air, and Space Law, The University of Mississippi School of Law, P.O. Box 1848, University, MS 38677-1848, United States

Full and open access to data from remote sensing platforms and other sources can facilitate not only scientific research but also the more widespread and effective use of scientific data for the benefit of society. The Global Earth Observing System of Systems (GEOSS) is a major international initiative of the Group on Earth Observations (GEO) to develop "coordinated, comprehensive and sustained Earth observations and information." In 2005, GEO adopted the GEOSS Data Sharing Principles, which call for the "full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation." These Principles also note that "All shared data, metadata, and products will be made available with minimum time delay and at minimum cost" and that "All shared data, metadata, and products being free of charge or no more than cost of reproduction will be encouraged for research and education." GEOSS Task DA-06-01, aimed at developing a set of recommended implementation guidelines for the Principles, was established in 2006 under the leadership of CODATA, the Committee on Data for Science and Technology of the International Council for Science (ICSU). An international team of authors has developed a draft White Paper on the GEOSS Data Sharing Principles and a proposed set of implementation guidelines. These have been carefully reviewed by independent reviewers, various GEO Committees, and GEO National Members and Participating Organizations. It is expected that the proposed implementation guidelines will be discussed at the GEO-V Plenary in Budapest in November 2008. The current version of the proposed implementation guidelines recognizes the importance of good faith, voluntary adherence to the Principles by GEO National Members and Participating Organizations. 
It underscores the value of reuse and re-dissemination of GEOSS data with minimum restrictions, not only within GEOSS itself but on the part of GEOSS users. Consistency with relevant international instruments and applicable policies and legislation is essential, and therefore clarification and coordination of applicable policies and procedures are needed. Pricing of GEOSS data, metadata, and products should be based on the premise that the data and information within GEOSS are a public good for public-interest use in the nine societal benefit areas. Time delays for data access from both operational and research systems should be kept to a minimum, reflecting the norms of the relevant scientific communities or data processing centers. The proposed guidelines also emphasize the need to better define research and education uses and to develop and collect usage metrics and indicators. The draft White Paper provides a more detailed review of past and current data policies related to space-based and spatial data, assesses the implications of the Data Sharing Principles for selected case studies, and discusses a number of other important implementation issues. Successful implementation of the GEOSS Data Sharing Principles is likely to be a critical element in the future effectiveness and value of GEOSS.

http://www.codata.org/GEOSS/index.html

IN31D-05

GeoSciML 2: Enabling Enhanced Geologic Information Interoperability

* Brodaric, B brodaric@nrcan.gc.ca, Geological Survey of Canada, 234B - 615 Booth St., Ottawa, ON K1A0E9, Canada
Richard, S M steve.richard@azgs.az.gov, Arizona Geological Survey, 416 W. Congress St., #100, Tucson, AZ 85701, United States
Interoperability Working Group, C jll@bgs.ac.uk, IUGS, www.cgi-iugs.org

Interchange and mark-up languages such as the Geography Markup Language (GML) provide standard structures for transferring geospatial information within cyber-based infrastructures. In 2006 the CGI-IUGS Interoperability Working Group (IWG) released GeoSciML 1 as an application of GML for basic geologic information. Since then, further testing and use case analysis have resulted in enhancements to the design of GeoSciML, which have increased both the depth and breadth of the representation of geologic units, earth materials, structures, and associated vocabularies. After careful testing of these enhancements in a recent implementation testbed, the IWG is officially releasing GeoSciML 2. The release includes a GeoSciML schema representation in UML and XML formats, text descriptions of schema components, and example data files. This paper will describe GeoSciML 2, focussing on significant changes and practical implementations, including its utilization in geospatial standard technologies such as those being deployed by emerging geoscience information networks. As a result of these advancements GeoSciML 2 is poised to become a key vehicle for the delivery of basic geologic information within such networks.

IN31D-06

QuakeML: Recent Development and First Applications of the Community-Created Seismological Data Exchange Standard

* Euchner, F fabian@sed.ethz.ch, Swiss Seismological Service, ETH Zurich, Zurich, 8093, Switzerland
Schorlemmer, D ds@usc.edu, Department of Earth Sciences, University of Southern California, Los Angeles, CA 90089, United States
Kästli, P kaestli@sed.ethz.ch, Swiss Seismological Service, ETH Zurich, Zurich, 8093, Switzerland
QuakeML Group, t quakeml@sed.ethz.ch

QuakeML is an XML-based exchange format for seismological data which is being developed using a community-driven approach. It covers basic event description, including picks, arrivals, amplitudes, magnitudes, origins, focal mechanisms, and moment tensors. Contributions have been made from ETH, GFZ, USC, SCEC, USGS, IRIS DMC, EMSC, ORFEUS, GNS, ZAMG, BRGM, and ISTI. The current release (Version 1.1, Proposed Recommendation) reflects the results of a public Request for Comments process which has been documented online at http://quakeml.org/RFC_BED_1.0. QuakeML has recently been adopted as a distribution format for earthquake catalogs by GNS Science, New Zealand, and the European-Mediterranean Seismological Centre (EMSC). These institutions provide prototype QuakeML web services. Furthermore, integration of the QuakeML data model in the CSEP (Collaboratory for the Study of Earthquake Predictability, http://www.cseptesting.org) testing center software developed by SCEC is under way. QuakePy is a Python-based seismicity analysis toolkit built on the QuakeML data model. Recently, QuakePy has been used to implement the PMC method for calculating network recording completeness (Schorlemmer and Woessner 2008, in press). Completeness results for seismic networks in Southern California and Japan can be retrieved through the CompletenessWeb (http://completenessweb.org). Future QuakeML development will include an extension for macroseismic information. Furthermore, development on seismic inventory information, resource identifiers, and resource metadata is under way. Online resources: http://www.quakeml.org, http://www.quakepy.org

IN31D-07

Design Drivers of Water Data Services

* Valentine, D valentin@sdsc.edu, San Diego Supercomputer Center, U. Cal., San Diego, 9500 Gilman Drive #0505, La Jolla, CA 92093-0505, United States
Zaslavsky, I zaslavsk@sdsc.edu, San Diego Supercomputer Center, U. Cal., San Diego, 9500 Gilman Drive #0505, La Jolla, CA 92093-0505, United States

The CUAHSI Hydrologic Information System (HIS) is being developed as a geographically distributed network of hydrologic data sources and functions that are integrated using web services so that they function as a connected whole. The core of the HIS service-oriented architecture is a collection of water web services, which provide uniform access to multiple repositories of observation data. These services use SOAP protocols communicating WaterML (Water Markup Language). When a client makes a data or metadata request using a CUAHSI HIS web service, the request is made in a standard manner, following the CUAHSI HIS web service signatures, regardless of how the underlying data source may be organized. Also, regardless of the format in which the data are returned by the source, the web services respond to requests by returning the data in the standard WaterML format. The goal of WaterML design has been to capture the semantics of hydrologic observations discovery and retrieval and to express the point observations information model as an XML schema. To a large extent, it follows the representation of the information model as adopted by the CUAHSI Observations Data Model (ODM) relational design. Another driver of WaterML design is the set of specifications and metadata adopted by USGS NWIS, EPA STORET, and other federal agencies, as it seeks to provide a common foundation for exchanging both agency data and data collected in multiple academic projects. Another WaterML design principle was to create, in version 1 of HIS in particular, a fairly rigid and simple XML schema which is easy to generate and parse, thus creating the least barrier for adoption by hydrologists. WaterML includes a series of elements that reflect common notions used in describing hydrologic observations, such as site, variable, source, observation series, seriesCatalog, and data values.
Each of the three main request methods in the water web services (GetSiteInfo, GetVariableInfo, and GetValues) has a corresponding response element in WaterML: SitesResponse, VariableResponse, and TimeSeriesResponse. The WaterML specification is being adopted by federal agencies. The experimental USGS NWIS Daily Values web service returns a WaterML-compliant TimeSeriesResponse. The National Climatic Data Center is also prototyping WaterML for data delivery, and has developed a REST-based service that generates WaterML-compliant output for the NCDC ASOS network. Such agency-supported web services coming online provide a much more efficient way to deliver agency data compared to the web site scraper services that the CUAHSI HIS project initially developed. The CUAHSI water data web services will continue to serve as the main communication mechanism within CUAHSI HIS, connecting a variety of data sources with a growing set of web service clients being developed in both academia and the commercial sector. The driving forces for the development of web services continue to be: 1) application experience and needs of the growing number of CUAHSI HIS users, who experiment with additional data types, analysis modes, data browsing and searching strategies, and provide feedback to WaterML developers; 2) data description requirements posed by various federal and state agencies; and 3) harmonization with standards being adopted or developed in neighboring communities, in particular the relevant standards being explored within the Open Geospatial Consortium. CUAHSI WaterML is a standard output schema for CUAHSI HIS water web services. Its formal specification is available as an OGC discussion paper at www.opengeospatial.org/standards/dp/
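The pairing of request methods and response elements described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the CUAHSI HIS client or the WaterML schema: the method and root element names come from the abstract, while the namespace URI, the child elements (values, value), the dateTime attribute, and the helper functions are hypothetical and chosen purely for the example.

```python
import xml.etree.ElementTree as ET

# Request method -> response element, as listed in the abstract.
RESPONSE_ELEMENT = {
    "GetSiteInfo": "SitesResponse",
    "GetVariableInfo": "VariableResponse",
    "GetValues": "TimeSeriesResponse",
}

# Hypothetical, heavily simplified payload. The namespace and every element
# below the root are illustrative assumptions, not real WaterML.
SAMPLE = """<TimeSeriesResponse xmlns="urn:example:waterml-sketch">
  <values>
    <value dateTime="2008-01-01T00:00:00">12.5</value>
    <value dateTime="2008-01-02T00:00:00">13.1</value>
  </values>
</TimeSeriesResponse>"""


def local_name(tag):
    """Strip a namespace prefix of the form '{uri}name' from an element tag."""
    return tag.rsplit("}", 1)[-1]


def matches_request(method, xml_text):
    """Return True if the root element is the one expected for `method`."""
    root = ET.fromstring(xml_text)
    return local_name(root.tag) == RESPONSE_ELEMENT[method]


def read_values(xml_text):
    """Collect (timestamp, number) pairs from the illustrative payload."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("dateTime"), float(el.text))
        for el in root.iter()
        if local_name(el.tag) == "value"
    ]


if __name__ == "__main__":
    print(matches_request("GetValues", SAMPLE))
    print(read_values(SAMPLE))
```

Because every source answers in the same format, a client written this way would not need to know how the underlying repository is organized, which is the design point the abstract makes.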

IN31D-08

Surviving the Transition from FGDC to ISO Metadata Standards

Fox, C G christopher.g.fox@noaa.gov, NOAA National Geophysical Data Center, E/GC 325 Broadway, Boulder, CO 80305-3328, United States
* Milan, A Anna.Milan@noaa.gov, Cooperative Institute for Research in Environmental Sciences, University of Colorado, University of Colorado at Boulder CIRES 216 UCB, Boulder, CO 80309-0216, United States
Sylvester, D Denise.R.Sylvester@noaa.gov, NOAA National Geophysical Data Center, E/GC 325 Broadway, Boulder, CO 80305-3328, United States
Habermann, T Ted.Habermann@noaa.gov, NOAA National Geophysical Data Center, E/GC 325 Broadway, Boulder, CO 80305-3328, United States
Kozimor, J John.Kozimor@noaa.gov, Cooperative Institute for Research in Environmental Sciences, University of Colorado, University of Colorado at Boulder CIRES 216 UCB, Boulder, CO 80309-0216, United States
Froehlich, D David.Froehlich@noaa.gov, Cooperative Institute for Research in Environmental Sciences, University of Colorado, University of Colorado at Boulder CIRES 216 UCB, Boulder, CO 80309-0216, United States

The NOAA Metadata Manager and Repository (NMMR) has served a well established group of data managers at NOAA's National Data Centers for over a decade. It provides a web interface for managing FGDC-compliant metadata and publishing that metadata to several large data discovery systems (GeoSpatial One-Stop, NASA's Global Change Master Directory, the Comprehensive Large-Array data Stewardship System, and FirstGov). The Data Centers are now faced with migrating these metadata to new international metadata standards (ISO 19115, 19115-2, …). We would like to accomplish this migration while minimizing disruption to the current users and supporting significant new capabilities of the ISO standards. Our current approach involves relational ISO views on top of the existing XML database to convert FGDC content into ISO without changing the data manager interface. These views are the foundation for ISO-compliant XML metadata access via REST-like web services. Additionally, new database tables provide information required by ISO that is not included in the FGDC standard. This approach allows us to support the new standard without disrupting the current system.
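The view-based approach outlined in this abstract might look roughly like the following Python sketch using the standard sqlite3 module. It is an illustration under stated assumptions rather than the NMMR design: the table names (fgdc_record, iso_supplement), the view name (iso_record), all column names, and the sample rows are hypothetical, and a plain relational table stands in for the existing metadata store.

```python
import sqlite3

# In-memory database standing in for the existing metadata store (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fgdc_record (        -- hypothetical existing FGDC content
    id       INTEGER PRIMARY KEY,
    title    TEXT,
    abstract TEXT
);
CREATE TABLE iso_supplement (     -- hypothetical new table for ISO-only content
    record_id         INTEGER REFERENCES fgdc_record(id),
    metadata_language TEXT
);
-- The view exposes FGDC content under ISO-style names and joins in the
-- supplementary information; the FGDC table itself is left unchanged.
CREATE VIEW iso_record AS
SELECT f.id                AS file_identifier,
       f.title             AS citation_title,
       f.abstract          AS abstract,
       s.metadata_language AS language
FROM fgdc_record AS f
LEFT JOIN iso_supplement AS s ON s.record_id = f.id;
""")

# Illustrative sample data, not real metadata records.
conn.execute(
    "INSERT INTO fgdc_record VALUES (1, 'Sample bathymetry grid', 'Illustrative record.')"
)
conn.execute("INSERT INTO iso_supplement VALUES (1, 'eng')")

rows = conn.execute(
    "SELECT file_identifier, citation_title, language FROM iso_record"
).fetchall()
print(rows)
```

Data managers would keep writing to the FGDC table through their existing interface, while ISO consumers read only from the view, so neither side has to change when the other does.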

IN31D-09

NASA's Earth Science Data Systems Standards Process

* Enloe, Y yonsook@mindspring.com, SGT Inc. Yonsook Enloe, 7701 Greenbelt Rd, Suite 400, Greenbelt, MD 20770,
Ullman, R richard.ullman@nasa.gov, NASA Richard Ullman, Goddard Space Flight Center, Greenbelt, MD 20771,

NASA's Standards Process Group (SPG) facilitates the approval of proposed standards that have proven implementation and operational benefit for use in NASA's Earth science data systems. After some initial experience in approving proposed standards, the SPG has tailored its Standards Process to remove redundant reviews and so shorten the review process. We have found that the candidate submissions that self-defined communities are proposing for endorsement to the SPG fall into one of four types: (1) a NASA community developed standard used within at least one self-defined community, where the proposed standard has not been approved or adopted by an external standards organization and where new implementations are expected to be developed from scratch, using the proposed standard as the implementation specification; (2) a standard already approved by an external standards organization that is being proposed for use by the NASA Earth science community; (3) a de facto standard already widely used; or (4) a Technical Note. We will discuss real examples of the different types of candidate standards that have been proposed and endorsed (i.e. OPeNDAP's Data Access Protocol, Open Geospatial Consortium's Web Map Server, and the Hierarchical Data Format). We will discuss a potential de facto standard (NASA's Global Change Master Directory (GCMD) Directory Interchange Format (DIF)) that is currently being reviewed. This past year, the SPG has modified its Standards Process to provide a comprehensive but not redundant review of the submitted RFC. The end result of the process tailoring is that the reviews will be completed faster. At each RFC submission, the SPG will decide which reviews will be performed.
These reviews are conducted simultaneously and can include these three types: (1) a Technical review to review the technical specification and associated implementations; (2) an Operational Readiness review to evaluate whether the proposed standard works in a NASA environment with NASA Earth science data with the volume of users; (3) a Usefulness review to determine whether the candidate standard is useful or helpful or fits the purpose for the users. Some submissions, particularly the de facto standards or standards already approved by other standards organizations, will not need all three types of reviews. As an internal advisory group, the SPG has a NASA agency centered focus. At the same time, there is growing awareness that interagency and international standards are extremely relevant to addressing the regional and global science and decision support applications. The Global Earth Observing System of Systems (GEOSS) Architecture and Data Management (AMD) Standards Interoperability Forum (SIF) is designed to encourage the use of standards in contributed components. It is clear that some of the standards endorsed by the NASA SPG could be important contributions to the GEOSS. The GEOSS recognized standards can also be reviewed as 'de facto' standards by the SPG. NASA stakeholders are often also NOAA stakeholders. Members of the NASA SPG have been working with members of the NOAA standards endorsement process to provide mutual benefit. We will also discuss the role of the NASA SPG participation with these and other cross-agency and international standards initiatives.

http://www.esdswg.org/spg

IN31D-10 INVITED

Arctic Observing Network (AON): Enhancing Observing, Data Archiving and Data Discovery Capabilities as Arctic Environmental System Change Continues

* Jeffries, M O mjeffrie@nsf.gov, National Science Foundation, 4201 Wilson Boulevard, Arlington, VA 22230, United States

The National Science Foundation (NSF) and the National Oceanic and Atmospheric Administration, under the auspices of the U.S. Inter-Agency Arctic Research Policy Committee, are leading the development of the Arctic Observing Network (AON) as part of the implementation of the Study of Environmental Arctic Change (SEARCH) and as a legacy of International Polar Year (IPY). As the Observing Change component of SEARCH, AON complements the Understanding Change and Responding to Change components. AON addresses the need to enhance observing capabilities in a data-sparse region where environmental system changes are among the most rapid on Earth. AON data will contribute to research into understanding the causes and consequences of Arctic environmental system change and its global connections, and to improving predictive skill. AON is also a contribution to the development of a multi-nation, pan-Arctic observing network that is being discussed at the IPY 'Sustaining Arctic Observing Networks' (SAON) workshops. Enhancing Arctic observing capabilities faces many challenges, including coordination and integration of disparate observing elements and data systems that operate according to diverse policies and practices. There is wide agreement that data systems that provide archiving and discovery services are essential and integral to AON. In recognition of this, NSF is supporting the development of CADIS (Cooperative Arctic Data and Information Service) as an AON portal for data discovery, a repository for data storage, and a platform for data analysis. NSF is also supporting ELOKA (Exchange for Local Observations and Knowledge in the Arctic), a pilot project for a data management and networking service for community-based observing that keeps control of data in the hands of data providers while still allowing for broad searches and sharing of information.
CADIS and ELOKA represent the application of cyberinfrastructure to meet AON data system needs that might also contribute to the virtual, physical and social coordination and integration that will be required to create a functioning network that will realise Arctic and global value-added services and societal benefits. Enhanced Arctic observing and data systems also need a data policy. The AON data policy is clear and the same as the SEARCH data policy: data must be fully, freely and openly available as quickly as possible after collection and quality control. That is, with few exceptions, AON data are community data and not subject to embargo. Successful implementation of this policy will require something of a cultural shift among many scientists and northern residents. Sustained, inter-operable data systems that make it easy to deposit and discover data have a role to play in achieving that cultural shift.

IN31D-11

Semantics-Based Interoperability Framework for the Geosciences

* Sinha, A pitlab@vt.edu, Department of Geosciences, Virginia Tech, Blacksburg, VA 24061, United States
Malik, Z zaki@vt.edu, Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
Raskin, R raskin@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109, United States
Barnes, C CAL.BARNES@ttu.edu, Department of Geosciences, Texas Tech University, Lubbock, TX 79409, United States
Fox, P pfox@ucar.edu, High Altitude Observatory/NCAR, P.O. Box 3000, Boulder, CO 80307, United States
McGuinness, D dlm@cs.rpi.edu, Department of Computer Science and Cognitive Science, Rensselaer Polytechnic Institute (RPI), Troy, NY 12180, United States
Lin, K klin@sdsc.edu, San Diego Super Computer Center, University of California San Diego, San Diego, CA 92093, United States

Interoperability between heterogeneous data, tools and services is required to transform data to knowledge. To meet geoscience-oriented societal challenges such as forcing of climate change induced by volcanic eruptions, we suggest the need to develop semantic interoperability for data, services, and processes. Because such scientific endeavors require integration of multiple databases associated with global enterprises, implicit semantic-based integration is impossible. Instead, explicit semantics are needed to facilitate interoperability and integration. Although different types of integration models are available (syntactic or semantic), we suggest that semantic interoperability is likely to be the most successful pathway. Clearly, the geoscience community would benefit from utilization of existing XML-based data models, such as GeoSciML, WaterML, etc., to rapidly advance semantic interoperability and integration. We recognize that such integration will require a "meanings-based search, reasoning and information brokering", which will be facilitated through inter-ontology relationships (ontologies defined for each discipline). We suggest that markup languages (MLs) and ontologies can be seen as "data integration facilitators", working at different abstraction levels. Therefore, we propose to use an ontology-based data registration and discovery approach to complement mark-up languages through semantic data enrichment. Ontologies allow the use of formal and descriptive logic statements, which permit expressive query capabilities for data integration through reasoning. We have developed domain ontologies (EPONT) to capture the concept behind data. EPONT ontologies are associated with existing ontologies such as SUMO, DOLCE and SWEET. Although significant efforts have gone into developing data (object) ontologies, we advance the idea of developing semantic frameworks for additional ontologies that deal with processes and services.
This evolutionary step will facilitate the integrative capabilities of scientists as we examine the relationships between data and external factors such as processes that may influence our understanding of "why" certain events happen. We emphasize the need to go from analysis of data to concepts related to scientific principles of thermodynamics, kinetics, heat flow, mass transfer, etc. Towards meeting these objectives, we report on a pair of related service engines: DIA (Discovery, integration and analysis), and SEDRE (Semantically-Enabled Data Registration Engine) that utilize ontologies for semantic interoperability and integration.