SF31A-0701 0800h
Web Services in Earth Science Data Systems: Realities of Brokering, Chaining and Federating Services
The key to the next generation of science will be unlocked through synergy between resources held by many differing individuals and organizations. These resources include missions, information, data, models, algorithms and related operations/services. Key infrastructure technology can enable this synergy, allowing scientists to discover and utilize these resources in innovative and efficient ways. Currently, the process to invoke data operations without the use of service brokering capabilities can be laborious. Scientists must first find data of interest, locate appropriate service(s) (e.g. subsetting), then download, manage and execute the service on their own server before they can start their actual job of science analysis or research. The vision of the future is that a scientist sits in front of a laptop computer thinking about a science problem. Scenario: A scientist in Illinois wants to study the re-vegetation process over recently active volcanoes and needs satellite images that 1) are mostly cloud-free, 2) cover the green, red, and near-infrared portions of the spectrum, and 3) fall over specific, discrete regions of the Northern Hemisphere. The multi-instrument data need to be co-registered, re-projected and delivered in NetCDF format. In this scenario, the user interacts with information about the data and services to identify the data resources needed to analyze the science question. Of the petabytes of data and hundreds of data services available, the scientist is able to rapidly get a comprehensive, seamless view tailored to her research and analyze the issue. It is not necessary for the scientist to know that behind the scenes exists an enterprise of seamless, distributed data and services that finds the data, streams it to a series of chained services in physically disparate locations to apply data transformation algorithms that meet the user's specifications, and ultimately delivers the data to the user's laptop. 
However, the scientist will have access to the complete provenance of the newly generated product. The user receives only the data that she needs in a form that is optimized for her study in rapid turnaround. Today, technology is emerging to begin to meet this vision. Web Services technology is a key part of this evolving solution. There are still challenges to face. This presentation will discuss Web Services technology and its application to achieve this vision as well as the realities of the challenges and issues we still face in making the vision a reality.
http://eos.nasa.gov/echo
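The chained-service scenario described in this abstract can be illustrated with a minimal Python sketch. It assumes each remote service can be modeled as a local function from one product description to another; the service names (`subset_bands`, `reproject`, `to_netcdf`) and the `Product` class are hypothetical stand-ins, not part of any system named in the abstract. The point is only that a broker can apply services in order while recording provenance.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Product:
    """A data product plus the ordered list of services applied to it."""
    payload: dict
    provenance: List[str] = field(default_factory=list)


def chain(product: Product, services: List[Callable[[dict], dict]]) -> Product:
    """Apply each service in order, recording its name as provenance."""
    for service in services:
        product = Product(service(product.payload),
                          product.provenance + [service.__name__])
    return product


# Hypothetical stand-ins for remote services (illustrative names only).
def subset_bands(p: dict) -> dict:
    return {**p, "bands": ["green", "red", "nir"]}


def reproject(p: dict) -> dict:
    return {**p, "projection": "geographic"}


def to_netcdf(p: dict) -> dict:
    return {**p, "format": "NetCDF"}


result = chain(Product({"granule": "example"}),
               [subset_bands, reproject, to_netcdf])
print(result.provenance)
```

In a real deployment each function would be a call to a remote service, and the provenance record would carry service versions and parameters as well as names.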
SF31A-0702 0800h
EARTH SYSTEM ATLAS: A Platform for Access to Peer-Reviewed Information about process and change in the Earth System
A great deal of time, effort and resources have been expended on global change research to date, but dissemination and visualization of the key pertinent data sets has been problematic. To address this, we are constructing an Earth System Atlas which will serve as a single compendium describing the state of the art in our understanding of the Earth system and how it has responded to and is likely to respond to natural and anthropogenic perturbations. The Atlas is an interactive web-based system of databases and data manipulation tools and so is much more than a collection of pre-made maps posted on the web. It represents a tool for assembling, manipulating, and displaying specific data as selected and customized by the user. Maps are created "on the fly" according to user-specified instructions. The information contained in the Atlas represents the growing body of data assembled by the broader Earth system research community, and can be displayed in the form of maps and time series of the various relevant parameters that drive and are driven by changes in the Earth system at various time scales. This will serve to provide existing data to the community, but also will help to highlight data gaps that may hinder our understanding of critical components of the Earth system. This new approach to handling Earth system data is unique in several ways. First and foremost, data must be peer-reviewed. Further, it is designed to draw on the expertise and products of extensive international research networks rather than on a limited number of projects or institutions. It provides explanations targeted to the user's needs, and the display of maps and time series can be customized by the user. 
In general, the Atlas is designed to provide the research community with a new opportunity for data observation and manipulation, enabling new scientific discoveries in the coming years. An initial prototype of the Atlas has been developed and can be manipulated in real time.
http://earthsystematlas.sr.unh.edu
SF31A-0703 0800h
A Syntactic and Semantic Metadata Solution for Intelligent Applications in Earth Science
Different Earth science data collections are archived and distributed with different levels of metadata. Some files are metadata rich and thus allow applications to read this data while other files are metadata deficient and require the development of data specific reader modules for applications. This metadata heterogeneity leads to interoperability problems for scientists trying to use different types of data files with their applications. The Earth Science Markup Language (ESML) is designed as an elegant solution to this problem. ESML is an interchange technology that enables data interoperability by providing the additional metadata needed to allow applications to read the data file without requiring the development of new reader modules. Scientists can write external metadata files using the ESML Schema to describe the structure of their data files. Applications can then utilize the ESML Library to parse this ESML Description File and decode the data format. Software developers can now build data format independent scientific applications utilizing the ESML Library. Furthermore, the ESML Description File allows the addition of semantic metadata defined in different domain ontologies. Thus, the ESML Description File provides a complete syntactic and semantic metadata description of the data and links to the ontologies to provide a context for the semantic terms. This metadata description allows the development of intelligent applications that can now not only read the data but also understand and "use" the data. An agent based scientific data processing framework is one such intelligent application and will be presented in this paper.
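The idea of an external description file driving a generic reader can be sketched in Python as follows. This is not the ESML Schema or the ESML Library API: the `description` dictionary, its keys, and the `read_records` function are invented for illustration, and the sketch assumes fixed-length binary records only.

```python
import struct

# Hypothetical external description of a binary record layout.
# This is NOT the ESML schema; the keys are invented for illustration.
description = {
    "byte_order": "<",  # little-endian
    "fields": [
        {"name": "station_id", "type": "i"},
        {"name": "temperature", "type": "f", "units": "K"},
        {"name": "pressure", "type": "f", "units": "hPa"},
    ],
}


def read_records(raw: bytes, desc: dict) -> list:
    """Decode fixed-length records using only the external description."""
    fmt = desc["byte_order"] + "".join(f["type"] for f in desc["fields"])
    names = [f["name"] for f in desc["fields"]]
    return [dict(zip(names, values))
            for values in struct.iter_unpack(fmt, raw)]


# Two sample records written in the layout the description declares.
raw = struct.pack("<iff", 7, 288.5, 1013.25) + struct.pack("<iff", 8, 290.0, 1009.0)
records = read_records(raw, description)
print(records[0]["temperature"])
```

The reader contains no format-specific logic; supporting a different file layout means writing a new description, not a new reader module.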
SF31A-0704 0800h
Enabling Semantic Interoperability for Earth System Science
Data interoperability across heterogeneous systems can be hampered by differences in terminology, particularly when multiple scientific communities are involved. To reconcile differences in semantics, a common semantic framework was created as a collection of ontologies. Such a shared understanding of concepts enables ontology-aware software tools to understand the meaning of terms in documents and web pages. The ontologies were created as part of the Semantic Web for Earth and Environmental Terminology (SWEET) prototype. The ontologies provide a representation of Earth system science knowledge and associated data, organized in a scalable structure, building on the keywords developed by the NASA Global Change Master Directory (GCMD). An integrated search tool consults the ontologies to enable searches without an exact term match. The ontologies can be used within other applications (such as Earth Science Markup Language descriptors) and future semantic web services in Earth system science.
http://sweet.jpl.nasa.gov
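Searching without an exact term match can be illustrated with a small Python sketch. The toy `ontology` below is invented and does not reproduce SWEET content; it assumes the only relation needed is "has narrower or equivalent term", and the `expand` and `search` functions are hypothetical.

```python
# Toy ontology: term -> narrower or equivalent terms (invented, not SWEET content).
ontology = {
    "precipitation": {"rain", "snow", "hail"},
    "rain": {"drizzle"},
}


def expand(term: str, onto: dict) -> set:
    """Return the term plus every term reachable through narrower links."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(onto.get(current, ()))
    return seen


def search(query: str, documents: dict, onto: dict) -> list:
    """Return ids of documents whose keywords match any expanded term."""
    terms = expand(query, onto)
    return [doc_id for doc_id, keywords in documents.items()
            if terms & set(keywords)]


documents = {"ds1": ["drizzle", "radar"], "ds2": ["sea ice"], "ds3": ["snow"]}
hits = search("precipitation", documents, ontology)
print(hits)
```

A query for "precipitation" finds data sets tagged only with "drizzle" or "snow", which a plain keyword match would miss.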
SF31A-0705 0800h
Bathymetry, Multibeam, and Coastal Relief: Data on Demand
The National Oceanic and Atmospheric Administration's (NOAA) National Geophysical Data Center (NGDC) in Boulder, Colorado, is distributing increasing quantities of data via the Internet. With evolving technology and broader bandwidth, it is possible to deliver large volumes (up to 2 GB) of data directly over the Internet and to provide server-side-generated products and services with more user options. In 2004, NGDC (www.ngdc.noaa.gov) initiated a web-based multibeam bathymetric data system, building on the geospatial and relational database technologies in use at the Center. The system has an ESRI ArcIMS interface, which handles the geospatial character of the data and provides a standardized GIS interface and tool suite. All of this rests on an underlying Oracle database containing an inventory and metadata. It also uses software developed at NOAA's Pacific Marine Environmental Laboratory, which scripts the National Science Foundation-supported software packages MBSystem (for managing multibeam data) and Generic Mapping Tools (GMT) (for mapping and display). As a whole, the MultiBeam Bathymetric Data Base (MBBDB) (Virden, et al., "Multibeam Bathymetric Data at NOAA/NGDC," OTO '04) enables the web user to browse, discover, review, select, map, and download multibeam data directly, without human intervention by NGDC. A similar development, the NGDC Coastal Relief Model (CRM), http://www.ngdc.noaa.gov/mgg/coastal/coastal.html, is available for download at a variety of sampling intervals. This data set is built on multibeam and conventional offshore hydrography, resampled into a 3 arc-second lat-lon grid, which is matched to a 3 arc-second version of the USGS National Elevation Dataset (NED). The resulting fusion gives a continuous elevation surface model from the seafloor, across the coastline and onto the land. 
These data provide the foundation for a multitude of environmental studies and models, such as tsunami or storm surge propagation, run-up, and inundation. The Internet now affords NGDC the opportunity to deliver large volumes of data and products to an increasingly broad array of users, far beyond the traditional scientific disciplines we served a decade ago.
http://map.ngdc.noaa.gov/website/mgg/multibeam/viewer.htm
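To give a feel for the 3 arc-second lat-lon grid mentioned in this abstract, the following Python sketch computes the approximate ground size of a cell and the row and column of the cell containing a point. It assumes a spherical Earth of radius 6371 km and a grid indexed from its south-west corner; both are simplifying assumptions for illustration and say nothing about how the CRM itself is stored.

```python
import math

CELL_DEG = 3.0 / 3600.0  # 3 arc-seconds expressed in degrees


def cell_index(lat: float, lon: float, south: float, west: float) -> tuple:
    """Row and column of the cell containing (lat, lon), counted from the
    south-west corner of the grid (an assumed convention)."""
    row = math.floor((lat - south) / CELL_DEG)
    col = math.floor((lon - west) / CELL_DEG)
    return row, col


def cell_size_m(lat: float, earth_radius_m: float = 6371000.0) -> tuple:
    """Approximate cell height and width in metres on a spherical Earth."""
    height = math.radians(CELL_DEG) * earth_radius_m
    width = height * math.cos(math.radians(lat))
    return height, width


height, width = cell_size_m(60.0)
print(round(height, 1), round(width, 1))
print(cell_index(40.00125, -70.9979, 40.0, -71.0))
```

A 3 arc-second cell is roughly 93 m tall everywhere, while its east-west width shrinks with the cosine of latitude.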
SF31A-0706 0800h
An Integrated Data Management System for Marine Geoscience Research
The National Science Foundation is currently supporting dedicated databases for the Ridge 2000, MARGINS, and U.S. Antarctic Programs. We are developing an integrated Marine Geoscience Data Management System (MG-DMS; www.marine-geo.org) which supports the full range of data types for all of these programs. Construction of a single system allows us to consolidate our hardware, software, and system administration infrastructure; work more efficiently; and focus greater resources on developing a unified metadata schema, controlled vocabularies, and interoperability with other databases. We have developed a Web-based client which offers forms-based search and download capability, and a Java™ application (GeoMapApp; www.geomapapp.org) which offers map-based exploration of multiple data sets and the capability to create custom grids and images. The MG-DMS supports data from a wide variety of disciplines (biological, geological, and physical/chemical oceanographic); types (both physical samples and sensor data); spatial and temporal resolutions; and processing grades (from raw field data through derived products). Metadata records and controlled vocabularies are maintained locally in a central catalog, while the data files themselves are referenced as URLs and may reside in any partner repository. Our hierarchical metadata schema consists of Entries (typically a cruise, flight, or traverse); Dives (deployments of a daughter platform); Lines (survey transects); Stations (discrete survey locations, typically where physical samples are collected); Parameters (data types); and Arbitrary Digital Objects (data files). We are also developing a Lightweight Directory Access Protocol (LDAP)-based authentication system for proprietary data access and user profile management. 
We are pursuing data interoperability with partner repositories including the Ocean Floor Petrology Database (PetDB) at LDEO, Seismic Processed Data Center (SDC) at UTIG, Ocean Drilling Program Database (Janus) at TAMU, National Deep Submergence Facility (NDSF) at WHOI, Geological Data Center (GDC) at SIO, and National Geophysical Data Center (NGDC). Levels of interoperability range from URL referencing of remote data files (basic) to exchange of XML metadata records (intermediate) to Web Feature and Coverage Services (advanced).
http://www.marine-geo.org
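The hierarchical metadata schema listed in this abstract (Entries, Dives, Lines, Stations, Parameters, Arbitrary Digital Objects) might be modeled along the following lines. This is a sketch under assumptions: the class attributes, the way the levels nest, and the example URLs are hypothetical and are not taken from the MG-DMS schema itself. It only illustrates a central catalog that holds metadata while data files are referenced by URL.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DigitalObject:
    url: str        # the file itself may reside in any partner repository
    parameter: str  # data type, e.g. "temperature"


@dataclass
class Dive:
    name: str
    objects: List[DigitalObject] = field(default_factory=list)


@dataclass
class Line:
    name: str
    objects: List[DigitalObject] = field(default_factory=list)


@dataclass
class Station:
    name: str
    objects: List[DigitalObject] = field(default_factory=list)


@dataclass
class Entry:
    """Top level: typically a cruise, flight, or traverse."""
    name: str
    dives: List[Dive] = field(default_factory=list)
    lines: List[Line] = field(default_factory=list)
    stations: List[Station] = field(default_factory=list)


def urls_for(entry: Entry, parameter: str) -> List[str]:
    """Collect URLs of all digital objects under an entry for one parameter."""
    groups = entry.dives + entry.lines + entry.stations
    return [obj.url for group in groups for obj in group.objects
            if obj.parameter == parameter]


cruise = Entry(
    "example-cruise",
    dives=[Dive("D1", [DigitalObject("http://example.org/d1_temp.nc", "temperature")])],
    stations=[Station("S1", [DigitalObject("http://example.org/s1_rock.xml", "petrology")])],
)
print(urls_for(cruise, "temperature"))
```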
SF31A-0707 0800h
Real-Time Access to Altimetry and Operational Oceanography Products via OPeNDAP/LAS Technologies: the Example of Aviso, Mercator and Mersea Projects
The Products and Services (P&S) department in the Space Oceanography Division at CLS is in charge of distributing and promoting altimetry and operational oceanography data. P&S is thus involved in the Aviso satellite altimetry project, in the Mercator operational ocean forecasting system, and in the European Godae/Mersea ocean portal. Aiming at standardisation and at a common vision and management of all these ocean data, these projects led to the implementation of several OPeNDAP/LAS Internet servers. OPeNDAP allows users to extract, via client software (such as IDL, Matlab or Ferret), only the data they are interested in, sparing them the download of complete files. With OPeNDAP, the user can select a geographic area, a time period, an oceanic variable, and an output format. LAS is an OPeNDAP data access web server whose distinguishing feature is that it unifies, in a single view, access to multiple types of data from distributed data sources. The LAS can make requests to different remote OPeNDAP servers, which makes it possible to compare or compute statistics over several different data types. Aviso is the CNES/CLS service that has distributed altimetry products since 1993. The Aviso LAS distributes several Ssalto/Duacs altimetry products such as delayed and near-real-time mean sea level anomaly, absolute dynamic topography, absolute geostrophic velocities, gridded significant wave height and gridded wind speed modulus. Mercator-Ocean is a French operational oceanography centre which distributes its products by several means, among them LAS/OPeNDAP servers, as part of Mercator's contribution to Mersea Strand-1. A 3D ocean description (temperature, salinity, currents and other oceanic variables) of the North Atlantic and Mediterranean is available in real time and updated weekly. The ability of LAS to make requests to several remote data centres with the same OPeNDAP configuration is particularly well suited to the needs of Mersea Strand-1. 
This European project (June 2003 to June 2004), sponsored by the European Commission, was the first experience of an integrated operational oceanography project. The objective was to assess several existing operational in situ and satellite monitoring and numerical forecasting systems, in preparation for the future development (Mersea Integrated Project, 2004-2008) of an integrated system able to deliver information products (physical, chemical, biological) operationally to end users in several domains related to environment, security and safety. Five ocean forecasting models, assimilating data from operational in situ or satellite data centres, have been intercompared. The main difficulty of this LAS implementation lay in defining the ocean model metrics and adopting a common file format, which required the model teams to produce the same datasets in the same format (NetCDF, COARDS/CF convention). Note that this was a pioneering approach and that it has been adopted in the Godae standards (see F. Blanc's paper in this session). Continuing the implementation of these web technologies with a more user-oriented focus, future plans include the implementation of a Map Server, an open-source GIS server which will communicate with the OPeNDAP server. The Map Server will be able to manipulate raster and vector multidisciplinary remote data simultaneously. The aim is to build a complete web-based oceanic data distribution service. The projects in which we are involved allow us to progress towards that goal.
http://www.cls.fr/html/oceano/general/applications/welcome_en.html
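The server-side subsetting that this abstract attributes to OPeNDAP can be illustrated by building a request URL with a constraint expression. The sketch below assumes a regular one-degree grid, a dimension order of time, latitude, longitude, and the DAP2 hyperslab form `[start:stop]` with an `.ascii` response suffix; the server address, file name, and variable name `sla` are hypothetical.

```python
def index_range(lo: float, hi: float, origin: float, step: float) -> tuple:
    """Inclusive index range on a regular axis covering [lo, hi]."""
    start = int((lo - origin) // step)
    stop = int((hi - origin) // step)
    return start, stop


def dap_url(base: str, variable: str, *ranges: tuple) -> str:
    """Build a request URL with one [start:stop] hyperslab per dimension."""
    hyperslab = "".join(f"[{a}:{b}]" for a, b in ranges)
    return f"{base}.ascii?{variable}{hyperslab}"


# Assumed grid: latitude from -90 and longitude from 0, both in 1-degree steps.
lat_rng = index_range(30.0, 45.0, origin=-90.0, step=1.0)
lon_rng = index_range(300.0, 350.0, origin=0.0, step=1.0)

# Hypothetical server and variable; first time step only.
url = dap_url("http://example.org/opendap/sla.nc", "sla", (0, 0), lat_rng, lon_rng)
print(url)
```

Only the requested region, time step and variable cross the network, which is the benefit the abstract describes.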
SF31A-0708 0800h
The "Virtual Puget Sound:" a Cyberinfrastructure (and Social Process) for Analysis of the Multi-Scaled Biophysics and Human Dimensions in a Mesoscale Coupled Land/Atmosphere/Marine System
Earth system science is being challenged by the intellectual and societal requirement to quantify the spatial patterns and temporal dynamics of changes in the atmosphere, landscape, and seascape, including human resources management. There are multiple issues in how to do this. The first is establishing the multi-disciplinary basis of how to systematically organize the required geophysical elements, from the very slow geological processes forming the basic template to the very fast-moving, event-driven processes brought on by an individual rainstorm. The second is how to mobilize, access, see, and interact with the very disparate sources of information required. The third problem, perhaps the most difficult, is how to get the disparate disciplinary and management experts to constructively interact. These requirements drove the process for establishing the PRISM "Virtual Puget Sound." The basic construct is recognizing the inherent time and space attributes of the landscape, and then constructing an informatics environment that will allow the respective elements to be brought together in a collaboratory. Central to the enterprise is the use of an XML-enabled DataStream to mobilize data from archives to models to visualizations. Outcomes are addressing such regional issues as daily stream flow, seasonal water supply and demand, low oxygen in Hood Canal, and sewage treatment plant siting. This model is being extended, as an Earth System Module, elsewhere in the world, from the Amazon to the Mekong.
SF31A-0709 0800h
Autonomous Underwater Vehicle Data Management and Metadata Interoperability for Coastal Ocean Studies
Data from over 1000 km of Autonomous Underwater Vehicle (AUV) surveys of Monterey Bay have been collected and cataloged in an ocean observatory data management system. The Monterey Bay Aquarium Research Institute's AUV is equipped with a suite of instruments that includes a conductivity, temperature, depth (CTD) instrument, transmissometers, a fluorometer, a nitrate sensor, and an inertial navigation system. Data are logged on the vehicle and, upon completion of a survey, XML descriptions of the data are submitted to the Shore Side Data System (SSDS). Instrument data are then processed on shore to apply calibrations and produce scientifically useful data products. The SSDS employs a data model that tracks data from the instrument that created it through all the consuming processes that generate derived products. SSDS employs OPeNDAP and netCDF to provide data set interoperability at the data level. The core of SSDS is the metadata that catalogs these data sets and their relation to all other relevant data. The metadata is managed in a relational database and governed by an Enterprise JavaBeans (EJB) server application. Cross-platform Java applications have been written to manage and visualize these data. A Java Swing application, the Hierarchical Ocean Observatory Visualization and Editing System (HOOVES), has been developed to provide visualization of data set pedigree and data set variables. Because the SSDS data model is generalized according to "Data Producers" and "Data Containers," many different types of data can be represented in SSDS, allowing for interoperability at a metadata level. Comparisons of appropriate data sets, whether they are from an autonomous underwater vehicle or from a fixed mooring, are easily made using SSDS. The authors will present the SSDS data model and show examples of how the model helps organize data set metadata, allowing for data discovery and interoperability. 
With improved discovery and interoperability the system is helping us understand processes such as benthic-pelagic coupling and factors that control phytoplankton productivity in coastal waters.
http://www.mbari.org/ssds/
SF31A-0710 0800h
Biospheric Monitoring and Forecasting
This research combines biospheric models with remotely sensed data and new computer science techniques to develop an operational global monitoring system based on intelligent data management and analysis. The prototype system, called Ecocast, has capabilities for rapid access, analysis, and utilization of large, heterogeneous data sets. The primary goal of this research is to develop, implement, and apply an adaptable architecture for automated conversion of large amounts of data from multiple sources into usable products. Ecocast is currently being used to produce nowcasts and forecasts of biospheric conditions from local to global scales. By bringing together domain experts from the computer science and Earth science communities, this project is leveraging diverse knowledge and resources to build the next generation of biospheric monitoring and forecasting systems. By the end of 2004, Ecocast will be producing 30 different daily ecosystem nowcasts and forecasts for use by scientists, educators, and decision makers.
http://ecocast.arc.nasa.gov
SF31A-0711 0800h
Emissions From the Terrestrial Biosphere
The terrestrial biosphere plays a critical role in the functioning of the earth system. Vegetation emits significant amounts of volatile organic compounds (VOC) and aerosols to the atmosphere through several pathways that include physiological and biochemical processes and disturbances such as wildfire and herbivory. Biogenic VOC emissions can affect chemical processes that determine air quality and control the lifetimes of longer lived chemical species. Direct aerosol emissions from vegetation and wildfires and secondary aerosols formed by biogenic VOC can impact public health, change cloud properties, and control climate processes. Biogenic emissions play a critical role in many atmospheric and biogeochemical processes. Therefore, to realistically simulate the earth system, including air quality and climate, reasonable estimates of biogenic emissions must be included in those simulations. This paper presents an overview of biogenic emissions from undisturbed vegetation and from wildfire. Models that simulate these emissions have been developed to create inputs for regional and global chemical transport models and for climate models. Despite the success in biogenic emission model development, technical challenges for such modeling still exist. Biogenic emission models use a variety of input information, including satellite data, field observations, and output from other models (e.g. NCEP, MM5, WRF). These inputs have a variety of spatial and temporal resolutions and are available in many different formats. Several of the challenges encountered when modeling biogenic emissions will be addressed, including difficulties in applying different input datasets due to format, size, and spatial resolution and limitations in software that hinder the processing of emission estimates.
SF31A-0712 0800h
The GFDL Data Portal: a Doorway to Sharing Model Outputs
In response to increasing community interest in the output of models run at the Geophysical Fluid Dynamics Laboratory (GFDL), we have embarked upon the design of a data portal for sharing numerical model output. Effective sharing of model output requires software standards and tools, and the use of those tools to build data portals. Among the tools that the GFDL data portal will utilize is the Live Access Server (LAS). LAS is a visualization and analysis tool (http://ferret.pmel.noaa.gov/LAS) which allows: presentation of a multitude of variables in an orderly way by easily defining and customizing arbitrary hierarchies; location of datasets and variables of interest from among thousands offered without navigating the complete dataset hierarchy; and easy intercomparison of model results. The OPeNDAP framework (http://www.opendap.org) will also be a key component in the GFDL data portal. This framework gives local applications access to remote data, regardless of the remote storage format, and will allow for intercomparison of model results between groups of users. GFDL is a partner in the NASA-funded Earth System Modeling Framework (ESMF) collaboration (http://www.esmf.ucar.edu). The ESMF is an effort to greatly improve coordination in the advancement of climate models through the development of a community framework to couple model components. Working with the Global Organization for Earth System Science Portals (GO-ESSP, http://go-essp.gfdl.noaa.gov), the goal is to develop a software infrastructure using agreed-upon standards to provide distributed access to observed and simulated data from climate and weather communities. In this presentation we will discuss the GFDL data portal, the web site(s) through which many model outputs from GFDL are accessible, and its underlying standards and tools.
SF31A-0713 0800h
Describing Climate Numerical Models
The complexity of climate computer models has created an enormous challenge when trying to understand and compare resulting datasets and information. Efforts such as CF exist to describe the model data, but there has been no organized effort to uniformly describe numerical climate model components. The Centre for Global Atmospheric Modelling, in collaboration with the British Atmospheric Data Centre, Climateprediction.net and the Global Organization for Earth System Science Portals, is developing an XML-based framework to describe climate numerical models. The numerical model schema describes the complete suite of possibilities for the components of Earth System models. The model experiment schema describes the unique settings, the instantaneous state of the model, chosen from all available possibilities of the model components used to create that experiment. The creation of a standardized framework to describe numerical model components, and the experiments run using those components, will ultimately provide for a vast variety of methods which will allow a user to better understand and use the models and the data derived from them: searchable content, compare-and-contrast tools, browsing, display, queries, cataloging, and personalized interfaces to replace notebooks.
http://ugamp.nerc.ac.uk/bouton/model_metadata/index.php
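The relationship between a model schema (all possibilities) and an experiment schema (the settings actually chosen) can be sketched in Python with the standard-library XML parser. The element and attribute names below (`model`, `component`, `option`, `choice`, `experiment`, `setting`) are invented for illustration and are not the schema described in this abstract.

```python
import xml.etree.ElementTree as ET

# Hypothetical model description: every option a component may take.
MODEL_XML = """
<model name="ExampleESM">
  <component name="atmosphere">
    <option name="convection_scheme">
      <choice>mass_flux</choice>
      <choice>adjustment</choice>
    </option>
  </component>
</model>
"""

# Hypothetical experiment description: the settings chosen for one run.
EXPERIMENT_XML = """
<experiment model="ExampleESM">
  <setting component="atmosphere" option="convection_scheme">mass_flux</setting>
</experiment>
"""


def allowed_choices(model_root: ET.Element) -> dict:
    """Map (component, option) to the set of permitted values."""
    allowed = {}
    for comp in model_root.findall("component"):
        for opt in comp.findall("option"):
            key = (comp.get("name"), opt.get("name"))
            allowed[key] = {c.text for c in opt.findall("choice")}
    return allowed


def invalid_settings(model_xml: str, experiment_xml: str) -> list:
    """Return experiment settings that the model description does not permit."""
    allowed = allowed_choices(ET.fromstring(model_xml))
    bad = []
    for setting in ET.fromstring(experiment_xml).findall("setting"):
        key = (setting.get("component"), setting.get("option"))
        if setting.text not in allowed.get(key, set()):
            bad.append(key)
    return bad


print(invalid_settings(MODEL_XML, EXPERIMENT_XML))
```

With both documents machine-readable, tools for search, comparison and cataloging can be built on top of the same descriptions.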
SF31A-0714 0800h
Flexible Environments for Grand-Challenge Simulation in Climate Science
Current climate models are monolithic codes, generally in Fortran, aimed at high-performance simulation of the modern climate. Though they adequately serve their designated purpose, they present major barriers to application in other problems. Tailoring them to paleoclimate or planetary simulations, for instance, takes months of work. Theoretical studies, where one may want to remove selected processes or break feedback loops, are similarly hindered. Further, current climate models are of little value in education, since the implementation of textbook concepts and equations in the code is obscured by technical detail. The Climate Systems Center at the University of Chicago seeks to overcome these limitations by bringing modern object-oriented design into the business of climate modeling. Our ultimate goal is to produce an end-to-end modeling environment capable of configuring anything from a simple single-column radiative-convective model to a full 3-D coupled climate model using a uniform, flexible interface. Technically, the modeling environment is implemented as a Python-based software component toolkit: key number-crunching procedures are implemented as discrete, compiled-language components 'glued' together and co-ordinated by Python, combining the high performance of compiled languages and the flexibility and extensibility of Python. We are incrementally working towards this final objective following a series of distinct, complementary lines. 
We will present an overview of these activities, including PyOM, a Python-based finite-difference ocean model allowing run-time selection of different Arakawa grids and physical parameterizations; CliMT, an atmospheric modeling toolkit providing a library of 'legacy' radiative, convective and dynamical modules which can be knitted into dynamical models, and PyCCSM, a version of NCAR's Community Climate System Model in which the coupler and run-control architecture are re-implemented in Python, augmenting its flexibility and adaptability.
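The component-toolkit idea, with processes that can be added or removed at run time, can be sketched as follows. This is not the CliMT, PyOM or PyCCSM interface: the `Component` class, the two example processes, and the `step` driver are hypothetical, and the components are written in pure Python where the abstract describes compiled-language kernels.

```python
class Component:
    """Assumed common interface: return the tendency dT/dt for a state."""

    def tendency(self, state: dict) -> float:
        raise NotImplementedError


class NewtonianCooling(Component):
    """Relax temperature towards an equilibrium value over time scale tau."""

    def __init__(self, t_eq: float, tau: float):
        self.t_eq = t_eq
        self.tau = tau

    def tendency(self, state: dict) -> float:
        return -(state["T"] - self.t_eq) / self.tau


class ConstantHeating(Component):
    """Apply a fixed heating rate."""

    def __init__(self, rate: float):
        self.rate = rate

    def tendency(self, state: dict) -> float:
        return self.rate


def step(state: dict, components: list, dt: float) -> dict:
    """Forward-Euler step using the sum of all component tendencies."""
    total = sum(c.tendency(state) for c in components)
    return {"T": state["T"] + dt * total}


cooling_only = step({"T": 300.0}, [NewtonianCooling(250.0, 10.0)], dt=1.0)
with_heating = step({"T": 300.0},
                    [NewtonianCooling(250.0, 10.0), ConstantHeating(2.0)],
                    dt=1.0)
print(cooling_only["T"], with_heating["T"])
```

Removing a process or breaking a feedback loop amounts to editing the list passed to `step`, which is the kind of flexibility the abstract argues monolithic codes lack.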
SF31A-0715 0800h
Cyberinfrastructure for Atmospheric Discovery
Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development/adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high-performance computing environments including the TeraGrid. Portal, grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions that are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. 
LEAD is addressing the fundamental information technology (IT) research challenges needed to create an integrated, scalable framework for identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, and visualizing a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems (WOORDS), which allows the use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand, Grid-enabled systems that can a) change configuration rapidly and automatically in response to weather; b) continually be steered by new data; c) respond to decision-driven inputs from users; d) initiate other processes automatically; and e) steer remote observing technologies to optimize data collection for the problem at hand. Although LEAD efforts are primarily directed at mesoscale meteorology, the IT services being developed have general applicability to other geoscience and environmental science. Integration of traditional and new data sources is a crucial component in LEAD for data analysis and assimilation, for integration (ensemble mining) of data from sets of simulations, and for comparing results to observational data. As part of the integration effort, LEAD is creating a myLEAD metadata catalog service: a personal metacatalog that extends the Globus MCS system and is built on top of the OGSA-DAI system developed at the National e-Science Center in Edinburgh, Scotland.
http://www.ncsa.uiuc.edu/Expeditions/MEAD
SF31A-0716 0800h
GENESIS SciFlo: Enabling Multi-Instrument Atmospheric Science Using Grid Workflows
The General Earth Science Investigation Suite (GENESIS) project is a NASA-sponsored partnership between the Jet Propulsion Laboratory, academia, and NASA data centers to develop a new suite of web services tools to facilitate multi-sensor investigations in Earth System Science. The goal of GENESIS is to enable large-scale, multi-instrument atmospheric science using combined datasets from the AIRS, MODIS, MISR, and GPS sensors. Investigations will include cross-comparison of spaceborne climate sensors, cloud spectral analysis, study of upper troposphere-stratosphere water transport, study of the aerosol indirect cloud effect, and global climate model validation. The challenges are to bring together very large datasets, reformat and understand the individual instrument retrievals, co-register or re-grid the retrieved physical parameters, perform computationally-intensive data fusion and data mining operations, and accumulate complex statistics over months to years of data. To meet these challenges, we are developing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data access, subsetting, registration, mining, fusion, compression, and advanced statistical analysis. SciFlo is a system for Scientific Knowledge Creation on the Grid using a Semantically-Enabled Dataflow Execution Environment. SciFlo leverages Simple Object Access Protocol (SOAP) Web Services and the Grid Computing standards (Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable web services and executable operators into a distributed computing flow (operator tree). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. 
The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. Once an analysis has been specified for a chunk or day of data, it can be easily repeated with different control parameters or over months of data. We will discuss the design issues and solutions used in the implementation of SciFlo, including XML dataflow documents, heavy use of XML datatyping & semantic web concepts, parallel dataflow execution engines, data access simply by naming, and catalog lookup of operator bundles. To illustrate the SciFlo concepts, an example dataflow will be demonstrated in which atmospheric temperature and water vapor profiles from the AIRS, GPS, and MODIS instruments are retrieved using SOAP (data access) services, co-registered, and visually & statistically compared on demand.
http://genesis.jpl.nasa.gov
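The operator-tree idea in the abstract above can be illustrated with a minimal sketch. This is not SciFlo's engine or its XML dataflow format: the `Operator` class, the operator names, and the sample temperature values are all hypothetical, and local Python callables stand in for the SOAP services and Grid resources the abstract describes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Operator:
    """One node of a dataflow: a callable plus the upstream nodes feeding it."""
    name: str
    func: Callable[..., Any]
    inputs: List["Operator"] = field(default_factory=list)

    def run(self) -> Any:
        # Evaluate upstream operators first, then apply this operator.
        args = [op.run() for op in self.inputs]
        return self.func(*args)


# Hypothetical data-access leaves returning temperatures (K) on shared levels.
airs = Operator("airs_access", lambda: [250.0, 240.0, 230.0])
gps = Operator("gps_access", lambda: [251.0, 238.0, 230.5])

# Stand-in for co-registration: pair the values level by level.
coregister = Operator("coregister", lambda a, b: list(zip(a, b)), [airs, gps])

# Statistical comparison: mean AIRS-minus-GPS difference over all levels.
mean_diff = Operator(
    "mean_difference",
    lambda pairs: sum(a - b for a, b in pairs) / len(pairs),
    [coregister],
)

result = mean_diff.run()
```

Because each node only knows its inputs, the same tree can be re-run with different leaves (another day of data, another instrument) without touching the downstream operators, which is the reuse property the abstract emphasizes.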
SF31A-0717 0800h
A Sample Data Publication: Interactive Access, Analysis and Display of Remotely Stored Datasets From Hurricane Charley
This paper is an example of what we call data interactive publications. With a properly configured workstation, readers can click on "hotspots" in the document that launch an interactive analysis tool called the Unidata Integrated Data Viewer (IDV). The IDV will enable readers to access, analyze and display datasets on remote servers as well as documents describing them. Beyond the parameters and datasets initially configured into the paper, the analysis tool will have access to all the other dataset parameters as well as to a host of other datasets on remote servers. These data interactive publications are built on top of several data delivery, access, discovery, and visualization tools developed by Unidata and its partner organizations. For purposes of illustrating this integrative technology, we will use data from the event of Hurricane Charley over Florida from August 13-15, 2004. This event illustrates how components of this process fit together. The Local Data Manager (LDM), Open-source Project for a Network Data Access Protocol (OPeNDAP) and Abstract Data Distribution Environment (ADDE) services, Thematic Realtime Environmental Distributed Data Service (THREDDS) cataloging services, and the IDV are highlighted in this example of a publication with embedded pointers for accessing and interacting with remote datasets. An important objective of this paper is to illustrate how these integrated technologies foster the creation of documents that allow the reader to learn the scientific concepts by direct interaction with illustrative datasets, and help build a framework for integrated Earth System science.
http://my.unidata.ucar.edu/content/projects/THREDDS/DataPublications/index.html
SF31A-0718 0800h
A comparison of three alternatives for generation of continuous rainfall data at point locations
The availability of high-resolution rainfall data is becoming an important issue in the design and management of small (mostly urban) water resources systems. Such rainfall sequences serve as inputs for a range of applications that include continuous flow simulation, rain water tank design, and the evaluation of alternate policies for assessment of environmental impacts in small catchment areas. This study compares three alternatives for generating sequences of high-resolution point rainfall: (a) a nonparametric approach that re-uses fractions of continuous rainfall, expressed relative to their respective daily totals, as the basis for disaggregating generated daily values to a continuous time scale; (b) a parametric alternative that generates continuous rainfall under the assumption of an alternating renewal process, and disaggregates it to a finer time step using a dimensionless mass curve; and (c) a cascade-based parametric procedure that uses the self-scaling rationale to disaggregate daily rainfall to a finer time scale. These alternatives are compared based on their ability to generate extremes at various scales of aggregation, as well as on some of the basic statistical attributes of interest from the standpoint of using these sequences for continuous flow simulation. Comparisons are based on the use of 88 and 129 years of continuous rainfall data from Sydney and Melbourne, Australia, respectively.
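Alternative (a) can be sketched roughly as follows. This is a simplified illustration, not the study's procedure: it assumes the donor day is simply the observed wet day whose total is closest to the generated daily value, whereas the actual nonparametric scheme may sample donors differently. The function name and the sample depths are hypothetical.

```python
def disaggregate_by_fragments(daily_total, observed_days):
    """Split a generated daily total using the sub-daily pattern of an observed day.

    observed_days: list of observed days, each a list of sub-daily depths (mm).
    Assumption: the donor is the observed wet day with the closest daily total.
    """
    wet_days = [day for day in observed_days if sum(day) > 0]
    if not wet_days:
        raise ValueError("need at least one observed wet day")

    # Choose the donor day whose total best matches the generated value.
    donor = min(wet_days, key=lambda day: abs(sum(day) - daily_total))
    donor_total = sum(donor)

    # Fragments are the donor's sub-daily fractions of its own daily total;
    # rescaling them preserves the generated daily total.
    return [daily_total * depth / donor_total for depth in donor]


observed = [[0.0, 2.0, 6.0, 2.0], [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
sub_daily = disaggregate_by_fragments(5.0, observed)
```

Here the generated 5 mm day borrows the pattern of the observed 2 mm day, so the sub-daily values sum back to 5 mm by construction.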
SF31A-0719 0800h
An Integrated Hydrologic Monitoring Network
Ecological studies depend on the ability to monitor an environment, collect data at appropriate spatial and temporal scales, and analyze those data from the diverse viewpoints of many relevant disciplines. Historically, environmental studies have been conducted by small teams of researchers, usually collecting data by hand at some set but low frequency, and organizing it according to ad hoc, project-specific goals. Recent years have seen dramatic advancement in the ability to gather environmental data remotely and therefore at much higher frequency. We are working to create a dynamic and integrated network of environmental sensors in natural environments to acquire real-time data and create tools for visualization appropriate for different audiences to promote scientific exploration. Instrumentation includes an array of water quality and water level sondes and probes distributed throughout three Central Indiana counties. Instrument platforms currently include five river monitoring platforms utilizing YSI water quality and level probes; a lake buoy array that includes three YSI sonde packages monitoring physical, chemical and biological parameters; and over fifteen YSI and Solinst groundwater probes recording both level and water quality. Many sites are providing real-time data and several additional sites are scheduled to be online in the coming months. Visualization of these real-time data from remote sensors distributed throughout Central Indiana poses numerous challenges. The benefits of successfully integrating remotely deployed environmental sensors in a post-9/11 world are obvious. 
We are working to bridge both the extremes associated with the frequency of data collection and the lack of data coordination by creating techniques for data networking and retrieval, and data management, analysis, and visualization capabilities that operate across a range of computing platforms to make these data immediately accessible and useful to a range of interested parties across multiple disciplines. We are working to integrate multiple data streams into a coherent database and create applications that allow users to view data from multiple instruments at different sites. Creating visualizations of real-time, dynamic data from the everyday world and delivering them via web applications as well as through innovative display spaces will be a key outcome of this program. On-line tools for QA/QC, data queries, graphing, and sensitivity analysis are under development. Our goal is to use the instrumented sites to create analysis and presentation applications to foster a community of learners interested in understanding these ecosystems and the larger environmental issues that they represent. This broad-based community will include environmental researchers, university faculty in lecture halls, math and science teachers, university and K-12 students, civic leaders, and educators at informal learning centers.
http://www.cees.iupui.edu
SF31A-0720 0800h
A Dynamic Metadata Community Profile for CUAHSI
Common metadata standards typically lack domain-specific elements, have limited extensibility, and do not always resolve semantic heterogeneities that can occur in annotations. To facilitate the use and extension of metadata specifications, a methodology called Dynamic Community Profiles (DCP) is presented. The methodology allows element definitions to be overwritten and core elements to be specified as metadata tree paths. DCP uses the Web Ontology Language (OWL), the Resource Description Framework (RDF), and XML syntax to formalize specifications and to create controlled vocabularies in ontologies, which enhances interoperability. This methodology was employed to create a metadata profile for the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI). The profile was created by extending the ISO 19115:2003 geographic metadata standard and restricting the permissible values of some elements. The values used as controlled vocabularies were inferred from hydrologic keywords found in the Global Change Master Directory (GCMD) and from measurement units found in the Hydrologic Handbook. In addition, a core metadata set for CUAHSI was formally expressed as tree paths, containing the ISO core set plus additional elements. Finally, a tool was developed to test the extension and to allow creation of metadata instances in RDF/XML that conform to the profile. The tool can also export the core elements to other schema formats such as Metadata Template Files (MTF).
http://loki.cae.drexel.edu:8080/web/how/me/metadatacuahsi.html
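The two profile mechanisms described above, core elements expressed as tree paths and elements restricted to controlled vocabularies, can be sketched as a small validation routine. This is only an illustration under stated assumptions: the abstract's tool works with OWL and RDF/XML, whereas this sketch uses a nested dictionary, and the element paths, vocabulary terms, and function names are hypothetical rather than taken from the CUAHSI profile or ISO 19115.

```python
def resolve_path(record, path):
    """Follow a '/'-separated tree path through nested dicts; None if absent."""
    node = record
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_record(record, core_paths, vocabularies):
    """Return (missing core paths, paths whose value is outside its vocabulary).

    Assumption: restricted elements hold plain string values.
    """
    missing = [p for p in core_paths if resolve_path(record, p) in (None, "")]
    invalid = []
    for path, allowed in vocabularies.items():
        value = resolve_path(record, path)
        if value is not None and value not in allowed:
            invalid.append(path)
    return missing, invalid


# Hypothetical metadata instance and profile definition.
record = {
    "identificationInfo": {
        "citation": {"title": "Streamflow at gauge A"},
        "keyword": "Runoff",
    },
    "units": "furlongs",
}
core_paths = [
    "identificationInfo/citation/title",
    "identificationInfo/abstract",
    "units",
]
vocabularies = {
    "units": {"m3/s", "mm"},
    "identificationInfo/keyword": {"Runoff", "Precipitation"},
}

missing, invalid = check_record(record, core_paths, vocabularies)
```

Keeping the core set as a list of paths means a community can extend or tighten the profile by editing data rather than code, which appears to be the intent of the "dynamic" profile.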
SF31A-0721 0800h
Interface Design For Subsurface Characterization Based on Borehole Data Using ArcGIS and Geo Flow 3D
A groundwater management system is used to simulate and predict the movement of fluid beneath the Earth's surface. To simulate fluid flow, such a system requires input data such as porosity and hydraulic conductivity, and these inputs should reflect real field conditions. A GIS database can supply this kind of information to groundwater modeling software. Subsurface characterization therefore plays a very important role in groundwater management, because it provides realistic input data. The main objective of this study is to develop a GIS application for the groundwater management system with which a user can retrieve borehole information from a GIS database and extract a 3D grid data model in order to characterize the subsurface using Geo Flow 3D. Many kinds of field data can provide information about the subsurface. Among them, borehole data were chosen as the main source of input data because they offer much more direct information than the others. Each borehole record consists of at least two sections. Each section has its own properties, such as borehole elevation, borehole diameter, and soil type. An interface was designed to extract borehole information from the GIS database. A user can search for a borehole by address or borehole ID, or select it directly on the map. Once a borehole is selected, the user can view general information such as the recording date and the borehole elevation. The user can also see each section drawn in the interface and retrieve a section's information by clicking on it. 
A second interface was designed for determining the dimensions of the 3D grid data model, because the user needs to know the horizontal and vertical distribution of the borehole data in order to choose them. Once the dimensions are decided, the user can build the 3D grid data model, which can then be used for subsurface characterization with Geo Flow 3D.
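One plausible way to derive grid dimensions from the horizontal and vertical distribution of borehole data is sketched below. It does not use the ArcGIS or Geo Flow 3D APIs, and the abstract does not state how the dimensions are chosen: the record layout (`x`, `y`, `elevation`, per-section `thickness`), the cell sizes, and the function name are assumptions made for illustration.

```python
import math


def grid_dimensions(boreholes, cell_xy, cell_z):
    """Return (nx, ny, nz) cells needed to cover all boreholes.

    boreholes: list of dicts with 'x', 'y', 'elevation' (top of hole) and
    'sections', each section having a 'thickness' (hypothetical layout).
    """
    xs = [b["x"] for b in boreholes]
    ys = [b["y"] for b in boreholes]
    tops = [b["elevation"] for b in boreholes]
    # Bottom of each hole = top elevation minus total logged thickness.
    bottoms = [
        b["elevation"] - sum(s["thickness"] for s in b["sections"])
        for b in boreholes
    ]

    # At least one cell per axis, even when boreholes share a coordinate.
    nx = max(1, math.ceil((max(xs) - min(xs)) / cell_xy))
    ny = max(1, math.ceil((max(ys) - min(ys)) / cell_xy))
    nz = max(1, math.ceil((max(tops) - min(bottoms)) / cell_z))
    return nx, ny, nz


boreholes = [
    {"x": 0, "y": 0, "elevation": 100,
     "sections": [{"thickness": 5, "soil": "sand"},
                  {"thickness": 10, "soil": "clay"}]},
    {"x": 100, "y": 50, "elevation": 98,
     "sections": [{"thickness": 8, "soil": "silt"},
                  {"thickness": 4, "soil": "sand"}]},
]
dims = grid_dimensions(boreholes, cell_xy=25, cell_z=5)
```

With these sample values the boreholes span 100 by 50 horizontally and elevations from 85 to 100, so 25-unit horizontal cells and 5-unit vertical cells give a 4 x 2 x 3 grid.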