Earth and Space Science Informatics [IN]

IN41B MCC:3007 Thursday

Ontologies for Earth and Space Sciences I

Presiding: N Hurlburt, Lockheed Martin Solar and Astrophysics Laboratory; P Fox, NCAR High Altitude Observatory

IN41B-01 INVITED

Ontologies and the Semantic Web: Key Enablers for Earth and Space Science Advances

* McGuinness, D L (dlm@ksl.stanford.edu) , Stanford University and McGuinness Associates, 20 Peter Coutts Circle, Stanford, CA 94305

Ontologies and ontology web languages, such as OWL, are beginning to have significant impacts in a broad range of scientific and other disciplines. Specifically, the opportunity to unambiguously provide declarative and operational specifications of term meanings, and to connect those specifications to documents and data sets containing those terms, has the potential to revolutionize interoperability of data and documents. We have begun a few efforts to leverage scientific ontologies to support interoperability of scientific data by providing tools and infrastructure that utilize definitions of scientific terms that are captured in ontologies and linked on the web. We will discuss our infrastructure and provide an introduction to how the emerging semantic web may be leveraged to advance earth and space science.

IN41B-02 INVITED

Next Generation Medical Informatics Powered by Ontologies and the Semantic Web

* Musen, M , Stanford University Medical Informatics, Stanford University, Stanford, CA 94305 United States

Ontologies and environments for creating and manipulating ontologies are growing well beyond the fields of computer science and philosophy. They have been embraced in fields including medicine and medical informatics. Stanford Medical Informatics has been a leader in providing open source ontology tools with the Protege toolkit and has also been a leader in using these tools to develop a number of medical ontologies and applications. Recently SMI has taken the next step in leading new work on bio ontologies and has been selected to lead a new bio ontology institute.

IN41B-03 INVITED

Leveraging Ontologies in Online Business

* Brachman, R , Yahoo! Inc, 701 First Avenue, Sunnyvale, CA 94089 United States

Yahoo! has been a leader in using ontologies to support a structured browsing experience. This is just one way that ontologies may be used in a commercial or online search paradigm. We will discuss some other potential options and impacts.

IN41B-04

Geo-Ontology: Empowering new Discoveries in Earth Sciences

* Sinha, A (pitlab@vt.edu) , Department of Geosciences, Virginia Tech, Blacksburg, VA 24061 United States
Lin, K (klin@sdsc.edu) , San Diego Supercomputer Center, University of California, San Diego, San Diego, CA 92093 United States
Raskin, R (raskin@jpl.nasa.gov) , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109 United States
Barnes, C (Cal.Barnes@ttu.edu) , Department of Geosciences, Texas Tech University, Lubbock, TX 79409 United States
McGuinness, D (dlm@ksl.stanford.edu) , Knowledge Systems Laboratory, Stanford University, Stanford, CA 94305 United States
Najdi, J (jnajdi@vt.edu) , Department of Geosciences, Virginia Tech, Blacksburg, VA 24061 United States

The rapid growth of data-rich resources associated with Earth and other planetary studies, including maps created by in-situ and remote sensing techniques, as well as spatial and aspatial relational databases, is driving new requirements for an information infrastructure that will facilitate scientific discovery. Ongoing research suggests that an ontology-based framework will facilitate registration, management, integration and analysis of databases and other data objects in a web-based environment. For earth scientists, ontologies can be viewed as a representation paradigm that can be used to capture formal declarative specifications of geologic objects, phenomena, and their interrelationships (e.g., subclass, part of, above). Ontologies may be used to capture classification schemes such as those for minerals, rocks, geologic time scale, or geologic structures, and thereby provide an organizational structure for automatically classifying earth science data. This is only possible because ontologies contain explicit definitions of terms used by scientists to associate meaning with the data or with relationships between datasets. Ongoing development and growth of an ontology-based framework for the solid earth requires utilization of existing community-accepted high level ontologies such as SWEET (Semantic Web for Earth and Environmental Terminology) and NADM (North American Geological Data Model). The high level SWEET ontology contains formal definitions for terms used in earth and space sciences, and it encodes structure that recognizes the spatial distribution of earth environments (earth realm) and the interfaces between different realms. These earth realms have associated properties with appropriate units and provide an extensible upper level terminology. Extension of these concepts to high-resolution ontologies where data reside is well underway.
For example, we have developed new ontology-based packages containing Planetary Materials (elements, isotopes, rocks and minerals, as well as State of Matter), Planetary Structure, Location and Physical Properties to extend NADM and SWEET to include both relational and spatial databases that contain chemical, modal, textural, isotopic or structural data for rocks and minerals, and their location in 2-D or 3-D space. These extensions allow searches and navigation across ontologies with distinct granularities to discover and integrate appropriate and conceptually related databases. We also recognize that the full power of ontologies will be realized when earth scientists can utilize this framework to conduct both object-based and process-based integrative science.
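
As a minimal sketch of how a subclass relation can support automatic classification of earth science data, the following Python fragment walks a small class hierarchy to select samples by a general concept. The class names, the sample records and the helper functions are hypothetical illustrations and are not taken from SWEET, NADM or the packages described above.

```python
# Hypothetical mini-ontology; names are illustrative only.
# Each key is a class and each value is its direct superclass.
SUBCLASS_OF = {
    "Granite": "IgneousRock",
    "Basalt": "IgneousRock",
    "IgneousRock": "Rock",
    "Quartz": "Mineral",
}


def ancestors(term):
    """Return every superclass of `term`, nearest first."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def is_a(term, cls):
    """True when `term` is `cls` or a (transitive) subclass of it."""
    return term == cls or cls in ancestors(term)


# Hypothetical sample records, each tagged with its most specific class.
SAMPLES = [
    {"id": "s1", "type": "Granite"},
    {"id": "s2", "type": "Quartz"},
    {"id": "s3", "type": "Basalt"},
]


def select(samples, cls):
    """Return ids of samples whose type falls under `cls`."""
    return [s["id"] for s in samples if is_a(s["type"], cls)]


if __name__ == "__main__":
    print(ancestors("Granite"))
    print(select(SAMPLES, "Rock"))
```

A query for the general concept "Rock" returns the granite and basalt samples even though neither record mentions "Rock" directly, which is the kind of cross-granularity search the abstract describes.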

IN41B-05

Reasoning with Quantitative Data in Ontologically-Based Software Systems

* Smyth, C (cpsmyth@georeferenceonline.com) , Georeference Online Ltd, 301-850 West Hastings St, Vancouver, BC V6C 1E1 Canada
Poole, D (poole@cs.ubc.ca) , Department of Computer Science, University Of British Columbia, Vancouver, BC V6T 1Z4 Canada
Huang, E (ehuang@georeferenceonline.com) , Georeference Online Ltd, 301-850 West Hastings St, Vancouver, BC V6C 1E1 Canada

Most ontologically-oriented data structures such as RDF, RDF-S and OWL focus on the flexible and efficient representation of qualitative information. They do not make any special provisions for quantitative data, which require a special form of reasoning. In particular, quantitative data, which may pertain to points or ranges on the number line, require reasoning capabilities with respect to equality, overlap, and lack of overlap. This reasoning capability is important for making useful inferences from large volumes of quantitative data, the likes of which are often produced in the earth sciences. An ontologically-framed structure for describing the quantitative attributes of any instance or model (concept) is presented. The structure is specifically designed to facilitate specifying the quantitative data that may be referred to within an ontology, reasoning about those data, and explaining to the human user of the system why certain conclusions have been reached at the end of any cycle of reasoning. Use of the data structure is illustrated in an exercise comparing widely-varying rock compositions, and in the automatic generation of similarity-rankings between the different rock types.
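
The three relations named above (equality, overlap, lack of overlap) can be sketched for points and closed ranges on the number line as follows. This is a minimal illustration and not the authors' data structure: the `Quantity` class, the `compare` function, and the composition values are assumptions chosen for the example, and the values are not measurements. Each result carries a short explanation string, in the spirit of explaining conclusions to the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A point (low == high) or closed range on the number line."""
    attribute: str
    low: float
    high: float
    unit: str

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low must not exceed high")

    def __str__(self):
        if self.low == self.high:
            return f"{self.attribute} = {self.low} {self.unit}"
        return f"{self.attribute} in [{self.low}, {self.high}] {self.unit}"


def compare(a, b):
    """Return (relation, explanation) for two quantities."""
    if (a.attribute, a.unit) != (b.attribute, b.unit):
        return "incomparable", f"'{a}' and '{b}' differ in attribute or unit"
    if (a.low, a.high) == (b.low, b.high):
        return "equal", f"'{a}' and '{b}' have identical bounds"
    if a.high < b.low or b.high < a.low:
        return "disjoint", f"'{a}' and '{b}' share no value"
    lo, hi = max(a.low, b.low), min(a.high, b.high)
    return "overlap", f"'{a}' and '{b}' share [{lo}, {hi}] {a.unit}"


# Illustrative ranges only (hypothetical values, not measured data).
granite = Quantity("SiO2", 68.0, 76.0, "wt%")
basalt = Quantity("SiO2", 45.0, 52.0, "wt%")
rhyolite = Quantity("SiO2", 70.0, 78.0, "wt%")

if __name__ == "__main__":
    for other in (basalt, rhyolite, granite):
        relation, why = compare(granite, other)
        print(relation, "-", why)
```

Treating a point as a range with equal bounds keeps a single comparison routine for both cases; a similarity ranking could be built on top of such comparisons, but the scoring used in the work described here is not specified in the abstract.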

IN41B-06

SWEET: An Upper Level Ontology for Earth System Science

* Raskin, R (raskin@seastar.jpl.nasa.gov) , Jet Propulsion Laboratory, 4800 Oak Grove Dr, 300-320, Pasadena, CA 91109

The Semantic Web for Earth and Environmental Terminology (SWEET) provides a set of upper-level ontologies constituting a concept space of Earth system science. These ontologies can be used, mapped, or extended by developers of specialized domain ontologies. SWEET components are being adopted within a diverse range of applications, including: the Geosciences Network (GEON), the Marine Metadata Interoperability (MMI) Initiative, the Virtual Solar Terrestrial Observatory (VSTO), and the Earth Science Markup Language (ESML). SWEET includes 12 ontologies, decomposed into component parts that can be reassembled to meet the needs of user communities. For example, the Property ontology terms (e.g., temperature, pressure) can be associated with measurable (observable) quantities of a dataset. The Substance ontology provides representations of the substance in which a property is being measured (e.g., air, water, rock). The Earth Realm ontology provides representations for the environmental regions of the Earth (e.g., atmospheric boundary layer, ocean mixed layer). The Data and Service ontology enables representations of how data are captured, stored, and accessed. The Numerics ontology entries represent 2-D and 3-D objects or spatial/temporal entities and relations. The Human Activities ontology captures the human side or applications of Earth science. The Phenomena ontology describes major geophysical or geophysical-related events. All of the ontologies are written in the OWL-DL language to give domain specialists a starting vocabulary, over which layers, synonyms, or extensions can be applied.
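
To illustrate how terms from separate component ontologies might be combined to describe a dataset variable, the sketch below tags each variable with a property, a substance and an earth realm, then searches by any combination of those facets. It is a plain Python stand-in rather than OWL: the `Annotation` class, the `find` function and the variable names (`sst`, `t2m`, `slp`) are hypothetical, and only the example terms themselves come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One dataset variable described by terms from three facets."""
    variable: str   # hypothetical variable name in some dataset
    prop: str       # Property-style term, e.g. "temperature"
    substance: str  # Substance-style term, e.g. "air"
    realm: str      # Earth-Realm-style term, e.g. "ocean mixed layer"


# Hypothetical catalog; the variable names are made up for illustration.
CATALOG = [
    Annotation("sst", "temperature", "water", "ocean mixed layer"),
    Annotation("t2m", "temperature", "air", "atmospheric boundary layer"),
    Annotation("slp", "pressure", "air", "atmospheric boundary layer"),
]


def find(catalog, **facets):
    """Return variable names whose annotation matches every given facet."""
    return [
        a.variable
        for a in catalog
        if all(getattr(a, key) == value for key, value in facets.items())
    ]


if __name__ == "__main__":
    print(find(CATALOG, prop="temperature"))
    print(find(CATALOG, substance="air", realm="atmospheric boundary layer"))
```

Keeping the facets independent mirrors the idea of component ontologies that can be reassembled: a new facet can be added without changing how existing ones are queried.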

IN41B-07

Speeding up ontology creation of scientific terms

* Bermudez, L E (bermudez@mbari.org) , Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039 United States
Graybeal, J (graybeal@mbari.org) , Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039 United States

An ontology is a formal specification of a controlled vocabulary. Ontologies are composed of classes (similar to categories), individuals (members of classes) and properties (attributes of the individuals). Having vocabularies expressed in a formal specification like the Web Ontology Language (OWL) enables interoperability because OWL is comprehensible to software programs. Two main non-inclusive strategies exist when constructing an ontology: a top-down approach and a bottom-up approach. The former is directed towards the creation of top classes first (main concepts) and then finding the required subclasses and individuals. The latter approach starts from the individuals and then finds similar properties, promoting the creation of classes. At the Marine Metadata Interoperability (MMI) Initiative we used a bottom-up approach to create ontologies from simple-vocabularies (those that are not expressed in a conceptual way). We found that the vocabularies were available in different formats (relational databases, plain files, HTML, XML, PDF) and sometimes were composed of thousands of terms, making the ontology creation process a very time-consuming activity. To expedite the conversion process we created a tool, VOC2OWL, that takes a vocabulary in a table-like structure (CSV or TAB format) and a conversion-property file and automatically creates an ontology. We identified two basic structures of simple-vocabularies: flat vocabularies (e.g., phone directory) and hierarchical vocabularies (e.g., taxonomies). The property file defines a list of attributes for the conversion process for each structure type. The attributes include metadata information (title, description, subject, contributor, urlForMoreInformation), conversion flags (treatAsHierarchy, generateAutoIds) and other conversion information needed to create the ontology (columnForPrimaryClass, columnsToCreateClassesFrom, fileIn, fileOut, namespace, format).
We created more than 50 ontologies and generated more than 250,000 statements (or triples). These ontologies allowed domain experts to create 800 relations, from which 2200 more relations among different vocabularies were inferred, at the MMI workshop "Advancing Domain Vocabularies" held in Boulder in August 2005.
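
The conversion of a tabular vocabulary into statements can be sketched as follows. This is not the VOC2OWL implementation: the function names are hypothetical, the configuration keys `termColumn`, `parentColumn` and `primaryClass` are invented for the example, and the interpretation given here to `namespace` and `treatAsHierarchy` (names borrowed from the attribute list above) is an assumption. The namespace URL and the vocabulary rows are placeholders.

```python
import csv
import io


def to_id(namespace, label):
    """Build an identifier by joining the words of a label with underscores."""
    return namespace + "_".join(label.split())


def vocabulary_to_triples(csv_text, config):
    """Turn CSV rows into (subject, predicate, object) triples.

    Hierarchical vocabularies yield classes linked by rdfs:subClassOf;
    flat vocabularies yield individuals of a single class.
    """
    ns = config["namespace"]
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row[config["termColumn"]].strip()  # hypothetical key
        subject = to_id(ns, label)
        triples.append((subject, "rdfs:label", label))
        if config.get("treatAsHierarchy"):
            triples.append((subject, "rdf:type", "owl:Class"))
            parent = (row.get(config["parentColumn"]) or "").strip()
            if parent:
                triples.append((subject, "rdfs:subClassOf", to_id(ns, parent)))
        else:
            cls = to_id(ns, config["primaryClass"])  # hypothetical key
            triples.append((subject, "rdf:type", cls))
    return triples


# Placeholder vocabulary and configuration, for illustration only.
HIERARCHY_CSV = "term,parent\nsea water,\nsurface water,sea water\n"
HIERARCHY_CONFIG = {
    "namespace": "http://example.org/voc#",
    "termColumn": "term",
    "parentColumn": "parent",
    "treatAsHierarchy": True,
}

if __name__ == "__main__":
    for triple in vocabulary_to_triples(HIERARCHY_CSV, HIERARCHY_CONFIG):
        print(triple)
```

Driving the conversion from a configuration mapping rather than hard-coded column names is what lets one routine serve both flat and hierarchical vocabularies.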

IN41B-08

Semantic Modeling for Goal-driven Distributed Data Assimilation and Collaboration

* Bose, P (prasanta.bose@lmco.com) , Lockheed Martin Advanced Technology Center, O/ADBS, B252 3251 Hanover St, Palo Alto, CA 94304 United States
Hurlburt, N (hurlburt@lmssal.com) , Lockheed Martin Advanced Technology Center, O/ADBS, B252 3251 Hanover St, Palo Alto, CA 94304 United States

Future NASA missions involving constellations of spacecraft (e.g. Leonardo, MAGCON) offer a unique opportunity for better coordinated scientific observations and for deeper understanding of the earth, sun, and their connections as a set of interacting systems. Such missions also present several unique challenges to make the data system work. These include the need to integrate and assimilate data spanning multiple sensors and missions and the need to fuse and present this data to researchers in a seamless, timely and efficient manner. Advances in semantic modeling of services (including web services) and resources via semantic markup languages, and exploitation of underlying ontologies, provide a promising direction for naming services and developing interfaces, tools and middleware infrastructures for goal-driven science operations. In our Collaborative Sun-Earth Connector (COSEC) project, we have been developing a core ontology for space science operations and have developed a middleware infrastructure that exploits such ontologies in addressing the above challenges. This talk will present our current insights and progress in developing such a system.