Earth and Space Science Informatics [IN]

IN23D
 MC:3018  Tuesday  1340h

Frontiers in Advanced Information Systems Technology


Presiding:  G Prescott, NASA Earth Science Technology Office; M Albjerg, NASA Earth Science Technology Office

IN23D-01

OASIS - Optimized Autonomous Space - In-situ Sensorweb

* Song, W songwz@wsu.edu, Washington State University, 14204 NE Salmon Creek Ave, Vancouver, WA 98686,
LaHusen, R rlahusen@usgs.gov, Cascades Volcano Observatory, U.S. Geological Survey, Vancouver, WA 98686,
Kedar, S Sharon.kedar@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,
Shirazi, B shirazi@wsu.edu, Washington State University, 14204 NE Salmon Creek Ave, Vancouver, WA 98686,
Chien, S steve.chien@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,
Doubleday, J jdoubled@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,
Webb, F frank.webb@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,
Tran, D danny.tran@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,
Davies, A Ashley.davis@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Dr, Pasadena, CA 91109,

An interagency team of computer scientists and earth and space scientists is collaborating to develop a sensor web system optimized for rapid deployment at active volcanoes. The primary goals of this Optimized Autonomous Space In-situ Sensorweb (OASIS) are to: 1) integrate complementary space and in-situ (ground-based) elements into an interactive, autonomous sensorweb; 2) advance sensorweb power and communication resource management technology; and 3) enable scalability for seamless infusion of future space and in-situ assets into the sensorweb. This three-year project started with a rigorous multi-disciplinary interchange that resulted in a system requirements document intended to guide the design of OASIS and future networks and to achieve the project's stated goals. Based on those guidelines, we have developed fully self-contained in-situ nodes that integrate GPS, seismic, infrasonic and ash detection sensors. The nodes in the wireless sensor network are linked to the ground control center through a highly optimized mesh network for remote geophysical monitoring. OASIS also features autonomous bidirectional interaction between ground nodes and instruments on the EO-1 space platform through alarming capabilities at the command and control center. We have successfully completed a trial deployment with 5 nodes in the crater of Mount St. Helens, Washington, and demonstrated that sensor web technology provides unprecedented fine-scale, real-time, continuous data acquisition and interaction for the earth science community. We are now optimizing component performance and increasing ease of user interaction for the final demonstration by the end of 2009.

http://sensorweb.vancouver.wsu.edu

IN23D-02

A Smart Sensor Web for Ocean Observation: Integrated Acoustics, Satellite Networking, and Predictive Modeling

* Arabshahi, P payman@apl.washington.edu, Applied Physics Laboratory, University of Washington, 1013 NE 40th Street, Seattle, WA 98105, United States
Chao, Y Yi.Chao@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
Chien, S steve.chien@jpl.nasa.gov, Jet Propulsion Laboratory, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
Gray, A aagray@u.washington.edu, Applied Physics Laboratory, University of Washington, 1013 NE 40th Street, Seattle, WA 98105, United States
Howe, B M bhowe@hawaii.edu, Ocean and Resources Engineering, University of Hawaii, 2540 Dole Street, Holmes Hall 402, Honolulu, HI 96822, United States
Roy, S sroy@u.washington.edu, Electrical Engineering, University of Washington, Paul Allen Center AE100R, Seattle, WA 98195, United States

In many areas of Earth science, including climate change research, there is a need for near real-time integration of data from heterogeneous and spatially distributed sensors, in particular in-situ and space-based sensors. The data integration provided by a smart sensor web enables numerous improvements, namely 1) adaptive sampling for more efficient use of expensive space-based sensing assets, 2) higher fidelity information gathering from data sources through integration of complementary data sets, and 3) improved sensor calibration. The specific purpose of the smart sensor web development presented here is to provide for adaptive sampling and calibration of space-based data via in-situ data. Our ocean-observing smart sensor web is composed of both mobile and fixed underwater in-situ ocean sensing assets and Earth Observing System (EOS) satellite sensors providing larger-scale sensing. An acoustic communications network forms a critical link in the web between the in-situ and space-based sensors and facilitates adaptive sampling and calibration. After an overview of primary design challenges, we report on the development of various elements of the smart sensor web. These include (a) a cable-connected mooring system with a profiler under real-time control with inductive battery charging; (b) a glider with integrated acoustic communications and broadband receiving capability; (c) satellite sensor elements; (d) an integrated acoustic navigation and communication network; and (e) a predictive model via the Regional Ocean Modeling System (ROMS). Results from field experiments, including an upcoming one in Monterey Bay (October 2008) using live data from NASA's EO-1 mission in a semi-closed-loop system, together with ocean models from ROMS, are described. Plans for future adaptive sampling demonstrations using the smart sensor web are also presented.

IN23D-03

An Overview of the Data Products and Technologies Provided by the Global Hydrology Resource Center

* Hardin, D dhardin@itsc.uah.edu, The University of Alabama in Huntsville, 301 Sparkman Drive, Huntsville, AL 35899, United States
Conover, H hconover@itsc.uah.edu, The University of Alabama in Huntsville, 301 Sparkman Drive, Huntsville, AL 35899, United States
Graves, S sgraves@itsc.uah.edu, The University of Alabama in Huntsville, 301 Sparkman Drive, Huntsville, AL 35899, United States
Goodman, M michael.goodman@nasa.gov, NASA Marshall Space Flight Center, 320 Sparkman Drive, Huntsville, AL 35805, United States

The Global Hydrology Resource Center (GHRC) is one of twelve data centers that make up the Distributed Active Archive Centers (DAAC) Alliance. The GHRC collects and distributes climate research quality data and associated products from satellite, aircraft and in-situ instruments, primarily in the fields of lightning detection, microwave imaging, and convective moisture. In addition, GHRC researchers, working with atmospheric scientists, have developed robust advanced information systems applications that enable the use of NASA and other data by scientists and the broader user community. The primary data of the GHRC are lightning data. Raw instrument data from the Lightning Imaging Sensor (LIS) and its precursor, the Optical Transient Detector (OTD), along with derived products, validation data, and ancillary in-situ lightning data (such as those from the National Lightning Detection Network) make up the suite of lightning data sets. Lightning data are the primary holdings in part because the LIS science computing facility is co-located with the GHRC and the LIS team utilizes GHRC services to acquire, process, and archive new and updated lightning datasets and products for their research. In this role, the GHRC serves the global lightning research community and is responsible for the sole archive of lightning data from NASA's LIS and OTD instruments. The GHRC has contributed to numerous NASA field campaigns in various roles dating back to the mid-1990s. During the series of Convection and Moisture Experiments (CAMEX) beginning with CAMEX-3 in 1998, the GHRC provided mission support data to the science teams during the experiment, then archived and distributed the experiment data post-mission. In 2001, during the CAMEX-4 mission, field experiment operations were revolutionized when project and mission scientists used the GHRC-developed on-line collaboration system for mission planning and execution, and to perform post-experiment analysis.
Using web-based forms, flight and science reports were filed from the field and automatically archived by the GHRC. After the campaign, instrument data were ingested and archived along with the real-time reports, creating a complete history of the mission. The GHRC user community consists of a broad range of researchers, decision makers, national and international educators, state and local government, commercial entities and the general public. Like the other NASA data centers, the GHRC distributes data to users worldwide and provides assistance through the User Services Office. Data from the GHRC are being used in numerous scientific studies, and have been cited in hundreds of publications. By making NASA data available to this broad community, the GHRC helps NASA reach its goal to benefit society.

http://ghrc.nsstc.nasa.gov/

IN23D-04

Semantics of data and service registration to advance interdisciplinary information and data access

* Fox, P P pfox@ucar.edu, HAO/ESSL/NCAR, PO Box 3000, Boulder, CO 80307, United States
McGuinness, D L dlm@cs.rpi.edu, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, United States
McGuinness, D L dlm@cs.rpi.edu, McGuinness Associates, 4 Shaker Bay Road, Latham, NY 12110, United States
Raskin, R raskin@jpl.nasa.gov, JPL/NASA, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
Sinha, A K pitt.lab@vedu, Virginia Tech, 4044 Derring Hall (0420), Blacksburg, VA 24061, United States

In developing an application of semantic web methods and technologies to address the integration of heterogeneous and interdisciplinary earth-science datasets, we have developed methodologies for creating rich semantic descriptions (ontologies) of the application domains. Where possible, we have leveraged and extended existing ontology frameworks such as SWEET. As a result of this semantic approach, we have also utilized ontological descriptions of key enabling elements of the application, such as the registration of datasets with ontologies at several levels of granularity. This has enabled the location and usage of the data across disciplines. We are also realizing the need to develop similar semantic registration of web service data holdings as well as those provided with community and/or standard markup languages (e.g. GeoSciML). This level of semantic enablement, extending beyond domain terms and relations, significantly enhances our ability to provide a coherent semantic data framework for data and information systems. Much of this work is on the frontier of technology development and we will present the current and near-future capabilities we are developing. This work arises from the Semantically-Enabled Science Data Integration (SESDI) project, which is a NASA/ESTO/ACCESS-funded project involving the High Altitude Observatory at the National Center for Atmospheric Research (NCAR), McGuinness Associates Consulting, NASA/JPL and Virginia Polytechnic Institute and State University.

IN23D-05

Automatic, Real-Time Algorithms for Anomaly Detection in High Resolution Satellite Imagery

* Srivastava, A N ashok.n.srivastava@nasa.gov, NASA Ames Research Center, Mail Stop 269-4, Moffett Field, CA 94035, United States
Nemani, R R ramakrishna.r.nemani@nasa.gov, NASA Ames Research Center, Mail Stop 242-4, Moffett Field, CA 94035, United States
Votava, P Petr.Votava-1@nasa.gov, CSU Monterey Bay, 100 Campus Center, Seaside, CA 93955,
Votava, P Petr.Votava-1@nasa.gov, NASA Ames Research Center, Mail Stop 242-4, Moffett Field, CA 94035, United States

Earth observing satellites are generating data at an unprecedented rate, surpassing almost all other data intensive applications. However, most of the data that arrives from the satellites is not analyzed directly. Rather, multiple scientific teams analyze only a small fraction of the total data available in the data stream. Although there are many reasons for this situation, one paramount concern is developing algorithms and methods that can analyze the vast, high dimensional, streaming satellite images. This paper describes a new set of methods that are among the fastest available algorithms for real-time anomaly detection. These algorithms were built to maximize accuracy and speed for a variety of applications in fields outside of the earth sciences. However, our studies indicate that with appropriate modifications, these algorithms can be extremely valuable for identifying anomalies rapidly using only modest computational power. We review two algorithms that are used as benchmarks in the field, Orca and One-Class Support Vector Machines, and discuss the anomalies that are discovered in MODIS data taken over the Central California region. We are especially interested in automatic identification of disturbances within the ecosystems (e.g., wildfires, droughts, floods, insect/pest damage, wind damage, logging). We show the scalability of the algorithms and demonstrate that with appropriately adapted technology, the dream of real-time analysis can be made a reality.

http://dashlink.arc.nasa.gov

IN23D-06

Using Medium Resolution Earth Observation Data to Monitor Sensitive Industrial Activities

* Verstraete, M M Michel.Verstraete@jrc.it, Institute for Environment and Sustainability (IES), EC DG-JRC, 2749 Via Enrico Fermi, Ispra (VA), 21020, Italy
Hunt, L A Linda.A.Hunt@nasa.gov, Science Systems and Applications, Inc., One Enterprise Parkway, Suite 200, Hampton, VA 23666-5845, United States
Gonçalves, J Joao.Goncalves@jrc.it, Institute for the Protection and Security of the Citizen (IPSC), EC DG-JRC, 2749 Via Enrico Fermi, Ispra (VA), 21020, Italy

Space-borne Earth Observation (EO) techniques have been used for decades to monitor climate and environmental processes, but the application of these tools to quantitatively characterize sensitive industrial complexes and especially to monitor safety- or security-related sites is still in its infancy. Photo-interpretation of very high spatial resolution imagery (spatial sampling of 1 to 10 m) for specific, pre-defined sites by specially-trained operators remains the main (or only) approach so far. This paper shows that medium resolution EO sensors (spatial sampling of the order of 250 m) and modern physically-based techniques of data analysis may prove complementary to these traditional techniques, as the radiometric, spatial, temporal, spectral and directional signatures of targets of interest can help characterize their nature, properties and structure, and differentiate them from ambient background. The re-processing of existing archives will be shown to be useful for the documentation of events, and possibly processes, that have remained hitherto unknown. The preliminary results that will be presented concern the construction timeline of an alleged (undeclared) nuclear facility. Comparing the performance of the MISR and MODIS sensors, which implement quite different technologies, leads to suggestions concerning the specifications of future EO sensors.

IN23D-07

Scalable Scientific Data Mining in Distributed, Peer-to-Peer Environments

* Borne, K D kborne@gmu.edu, George Mason University, 4400 University Drive, MS 6A2, Fairfax, VA 22030,
Kargupta, H hillol@cs.umbc.edu, UMBC, 1000 Hilltop Circle, Baltimore, MD 21250,
Das, K kamalika_das@yahoo.com, UMBC, 1000 Hilltop Circle, Baltimore, MD 21250,
Griffin, W griffin5@cs.umbc.edu, UMBC, 1000 Hilltop Circle, Baltimore, MD 21250,
Giannella, C cgiannel@cs.loyola.edu, Loyola College, 4501 N. Charles Street, Baltimore, MD 21210,

Data-intensive science and knowledge discovery involving very large sky surveys are playing increasingly important roles in today's astronomy research. This Discovery Informatics scientific approach is evolving as a core research paradigm in all science disciplines. In particular, Astroinformatics is developing as the formalization of data-intensive astronomy for research and education. Nearly completed projects (such as the Sloan Digital Sky Survey SDSS, the 2-Micron All-Sky Survey 2MASS, and the GALEX All-Sky Survey) and future projects (such as the WISE All-Sky Survey, Pan-STARRS, and the Large Synoptic Survey Telescope LSST) are destined to produce enormous catalogs of astronomical sources. These collections are naturally distributed and heterogeneous, in addition to being terascale, petascale, and beyond. It is this virtual collection of terabyte and (eventually) petabyte catalogs that will enable remarkable new scientific discoveries through the integration and cross-correlation of data across multiple survey dimensions (time, wavelength, and sky coverage). However, this will be difficult to achieve without a computational backbone that includes support for queries and data mining across distributed virtual tables of de-centralized, joined, and integrated sky survey catalogs. Moreover, use of local data management systems such as MyDB, MySpace in AstroGrid, and Grid Bricks for storing and managing users' local data is becoming increasingly popular. This is opening up the possibility of constructing Peer-to-Peer (P2P) networks for data sharing and mining. We will report on our research in these areas. We are exploring the possibility of using distributed and P2P data mining technology for exploratory astronomy from data integrated and cross-correlated across these multiple sky surveys.
We will report on new scientific results, including new explorations of the classical fundamental plane problem, in which multiple dimensions of galaxy parameter space can be reduced to a hyperplane in lower dimensions. Because the attributes that define the fundamental plane span two data repositories (SDSS and 2MASS) instead of one, we focus on cross-matching them through the National Virtual Observatory (NVO), and we then apply distributed data mining algorithms to analyze these data distributed over a large number of compute nodes. Distributed data mining techniques will not require scientists to download massive chunks of data for scientific discovery and will thus enable them to use distributed database queries across distributed virtual tables of de-centralized, joined and integrated sky survey catalogs. This will make the existing client-server-based astronomy data services richer by providing the power of distributed and P2P data mining technology.

IN23D-08

Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps

Folk, M mfolk@hdfgroup.org, The HDF Group, 1901 S. First St, Suite C-2, Urbana, IL 61820, United States
* Duerr, R rduerr@nsidc.org, NSIDC/CIRES University of Colorado, 449 UCB, Boulder, CO 80309-0449, United States

A preponderance of data from NASA's Earth Observing System (EOS) are archived in the HDF Version 4 (HDF4) format. The long-term preservation of these data is critical for climate and other scientific studies going many decades into the future. The format's rich structure, platform independence, and Application Programming Interfaces (API) and libraries make HDF4 very effective for working with the large and complex collection of EOS data products. Unfortunately, these features are achieved by employing a complex internal byte layout of HDF4 files, so future readability of HDF4 data depends on the preservation of the software that can interpret that layout. Having a way to access HDF4 data independent of a library could improve its viability as an archive format, and consequently give confidence that HDF4 data will remain readily accessible even if the HDF4 API and library are gone. To address the need to simplify long-term access to EOS data stored in HDF4, a collaborative study between The HDF Group and NASA's Earth Science Data Centers investigated a new approach to accessing data in HDF4 files based on the creation of independent maps that describe the data in HDF4 files, and tools that can use these maps to recover data from those files. With this approach, relatively simple programs could extract the data from an HDF4 file, bypassing the need for the HDF4 library. This report will describe the HDF4 mapping study, which included an assessment of the range of HDF4-formatted data held by NASA, development of a prototype HDF4 layout mapping language and format, and development of prototype tools to create layout maps and to read HDF4 data using layout maps. The report will also describe future plans to put the layout map approach into practice, including plans to get feedback from the community on the approach, the specification, and other matters.