IN31B-1134
Implementation of CUAHSI-HIS Community Project Components in a Local Observatory
The deployment of the eleven WATERS Network local observatories using CUAHSI-HIS project products
showed that water observations data collected by academic investigators could be stored, published on the
Internet, federated with water observations data published by water agencies, and searched using a concept
framework that connects with variables in each individual data source. For many within the water resources
community, the CUAHSI-HIS community project represents a new opportunity to approach the management,
publication, and analysis of their data systematically - i.e., moving from collections of
ASCII text or spreadsheet files to relational data models.
This research describes the initial efforts carried out by a University of Iowa research group during the
component implementation of a hydrologic community project in a local CI-based digital watershed (DW). The
goal was to test what types of data query the DW can handle and see how it performs in use cases where
data streams are coupled with models for continuous forecasting. This paper also discusses the general
context for the DW development and summarizes the lessons learned by the group during this initial
developmental stage. Given the uniform and scalable nature of the community project components, it is
expected that the workflows presented herein are transferable to other users and other watersheds.
http://his08.iihr.uiowa.edu/uicc
IN31B-1135
CBEO:N, Chesapeake Bay Environmental Observatory as a Cyberinfrastructure Node
Chesapeake Bay Environmental Observatory (CBEO) is an NSF-supported project focused on studying hypoxia in Chesapeake Bay using advanced cyberinfrastructure (CI) technologies. The project is organized around four concurrent and interacting activities: 1) CBEO:S provides science and management context for the use of CI technologies, focusing on hypoxia and its non-linear dynamics as affected by management and climate; 2) CBEO:T constructs a locally-accessible CBEO test bed prototype centered on spatio-temporal interpolation and advanced querying of model runs; 3) CBEO:N incorporates the test bed CI into national environmental observation networks; and 4) CBEO:E develops education and outreach components of the project that translate observational science for public consumption. CBEO:N activities, which are the focus of this paper, are four-fold:
* constructing an online project portal to enable researchers to publish, discover, query, visualize and integrate project-related datasets of different types. The portal is based on the technologies developed within the GEON (the Geosciences Network) project, and has established the CBEO project data server as part of the GEON network of servers;
* developing a CBEO node within the WATERS network, taking advantage of the CUAHSI Hydrologic Information System (HIS) Server technology that supports online publication of observation data as web services, and ontology-assisted data discovery;
* developing new data structures and metadata in order to describe water quality observational data, and model run output, obtained for the Chesapeake Bay area, using data structures adopted and modified from the Observations Data Model of CUAHSI HIS;
* prototyping CBEO tools that can be re-used through the portal, in particular implementing a portal version of R-based spatial interpolation tools.
The paper describes recent accomplishments in these four development areas, and demonstrates how CI approaches transform research and data sharing in environmental observing systems.
IN31B-1136
Dashboard Visualization for Diverse User Communities
As environmental research begins to intersect further with public policy, a diverse community of both technical and non-technical users is becoming engaged in the process of scientific analysis. These new communities of users, broadly defined as stakeholders, necessitate scientific visualizations consisting of simplified key indicators of environmental status, with the ability to delve into the indicators more deeply if desired. In order to indicate environmental status, a component of change should be integrated, suggesting automatically updating indicators - essentially a real-time visualization. Another key component is that the information be available at-a-glance, with minimal interaction between the visualization and the stakeholder. Lastly, these visualizations need to be readily accessible to stakeholders with diverse levels of software expertise. A new dashboard visualization is introduced which aims to fulfill these requirements of this newly broadened research community. This dashboard consists of four distinct views which show real-time and historical data for an entire environmental system, coupled with methods for filtering the information for extreme values or particular locations. The dashboard accepts input based on the Really Simple Syndication (RSS) standard and standard text files. This input is generated utilizing a custom library for analysis and querying of the Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI) Web services. The input generation components are automated through the use of Cyberintegrator, developed at the National Center for Supercomputing Applications (NCSA). The efficacy of this visualization is demonstrated for the WATERS Network testbed in Corpus Christi Bay, Texas, an environmental system which experiences seasonal hypoxia.
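As a rough illustration of the RSS-based input and extreme-value filtering described above, the following Python sketch parses a small feed and keeps only the readings below a threshold. The feed layout (a numeric reading in each item's description element), the station names, and the 2.0 mg/L threshold are assumptions made for this example; they are not details of the dashboard itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: each <item> carries a station name in <title>
# and a dissolved-oxygen reading (mg/L) in <description>.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example sensor feed</title>
    <item><title>Station A</title><description>6.8</description></item>
    <item><title>Station B</title><description>1.4</description></item>
    <item><title>Station C</title><description>1.9</description></item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (station, value) pairs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        station = item.findtext("title")
        value = float(item.findtext("description"))
        items.append((station, value))
    return items


def filter_low(items, threshold):
    """Keep readings strictly below the threshold (an extreme-value filter)."""
    return [(station, value) for station, value in items if value < threshold]


low_oxygen = filter_low(read_items(SAMPLE_FEED), threshold=2.0)
print(low_oxygen)
```

Keeping the parsing and the filtering in separate functions mirrors the separation between input generation and display that the abstract describes.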
IN31B-1137
Data Stream Technologies for the Semantic Web
In environmental observing systems, as well as in other engineering communities, semantic-web technologies offer a promising choice as a self-describing data exchange mechanism. Nevertheless, these technologies still face many challenges and limitations, among them the management of time-related data. This limitation becomes more prominent in environmental sciences, where time series are ubiquitous. We partially address this limitation through the development of a system that bridges semantic-web technologies and high-performance data stream systems. The system acts as data management middleware that allows transparent use of multiple stream managers, such as Ring Buffer Network Bus DataTurbine, to transmit and store data. It makes the data efficiently accessible to semantic-web-aware systems by describing the data using a middle-level OWL ontology capable of describing arbitrary streams, storing metadata in standard RDF stores and indexing it for efficient processing of time-related queries. To improve performance, the system can handle independently set levels of data granularity and segmentation for the three processes of transmission, storage, and time-based indexing and access. This technology has already been employed to enable temporal transformations of near-real-time NEXRAD Level II data at a digital watershed developed at NCSA.
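One way to picture time-based indexing over stream segments is a sorted list of segment start times that is searched by bisection. The Python sketch below is only an illustration under stated assumptions: the class name, the fixed segment length, and the segment identifiers are hypothetical and do not describe the actual system or the DataTurbine API.

```python
from bisect import bisect_left, bisect_right


class TimeIndex:
    """Sorted index from segment start time (seconds) to a segment id."""

    def __init__(self, segment_seconds):
        # Segment length used for indexing; it can differ from the
        # granularity used for transmission or storage.
        self.segment_seconds = segment_seconds
        self._starts = []  # sorted segment start times
        self._ids = []     # segment ids, parallel to _starts

    def add_segment(self, start, segment_id):
        """Insert a segment while keeping the index sorted by start time."""
        pos = bisect_left(self._starts, start)
        self._starts.insert(pos, start)
        self._ids.insert(pos, segment_id)

    def query(self, t0, t1):
        """Return ids of segments that overlap the closed interval [t0, t1]."""
        # A segment [s, s + length) overlaps when s > t0 - length and s <= t1.
        lo = bisect_right(self._starts, t0 - self.segment_seconds)
        hi = bisect_right(self._starts, t1)
        return self._ids[lo:hi]


index = TimeIndex(segment_seconds=600)
for start in (1800, 0, 1200, 600):  # arrival order need not be sorted
    index.add_segment(start, f"seg-{start}")

print(index.query(650, 1250))
```

Because lookups use bisection, a time-range query costs a logarithmic search plus the size of the result, regardless of how many segments have been stored.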
IN31B-1138
Semantic Document Library: A Virtual Research Environment for Documents, Data and Workflows Sharing
The Semantic Document Library (SDL) was driven by use cases from the environmental observatory communities and is designed to provide conventional document repository features of uploading, downloading, editing and versioning of documents, as well as value-adding features of tagging, querying, sharing, annotating, ranking, provenance, social networking and geo-spatial mapping services. It allows users to organize a catalogue of watershed observation data, model output and workflows, as well as publications and documents related to the same watershed study, through the tagging capability. Users can tag all relevant materials using the same watershed name and find all of them easily later using this tag. The underpinning semantic content repository can store materials from other cyberenvironments such as workflow or simulation tools, and SDL provides an effective interface to query and organize materials from various sources. Advanced features of the SDL allow users to visualize the provenance of the materials, such as the source and how the output data is derived. Other novel features include visualizing all geo-referenced materials on a geospatial map. SDL, as a component of a cyberenvironment portal (the NCSA Cybercollaboratory), has the goal of efficient management of information and relationships between published artifacts (validated models, vetted data, workflows, annotations, best practices, reviews and papers) produced from raw research artifacts (data, notes, plans etc.) through agents (people, sensors etc.). The scientific potential of artifacts is realized through mechanisms of sharing, reuse and collaboration, empowering scientists to spread their knowledge and protocols and to benefit from the knowledge of others. SDL successfully implements web 2.0 technologies and design patterns along with a semantic content management approach that enables the use of multiple ontologies and dynamic evolution (e.g. folksonomies) of terminology.
Scientific documents involved with many interconnected entities (artifacts or agents) are represented as RDF triples using the semantic content repository middleware Tupelo in one or many data/metadata RDF stores. Queries to the RDF enable discovery of relations among data, processes and people, digging out valuable aspects and making recommendations to users, such as what tools are typically used to answer certain kinds of questions or with certain types of dataset. This innovative concept brings out coherent information about entities from four different perspectives: the social context (Who: human relations and interactions), the causal context (Why: provenance and history), the geo-spatial context (Where: location or spatially referenced information) and the conceptual context (What: domain-specific relations, ontologies etc.).
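The idea of tagging materials and following provenance links through triples can be sketched in a few lines of plain Python. This is not the Tupelo API or a real RDF store; the identifiers, predicate names, and the watershed tag below are all made up for illustration.

```python
# Each statement is a (subject, predicate, object) triple.
# Every identifier here is hypothetical.
triples = {
    ("dataset:gauge-raw", "taggedWith", "Example Creek"),
    ("model:runoff-v1", "taggedWith", "Example Creek"),
    ("dataset:runoff-out", "taggedWith", "Example Creek"),
    ("dataset:runoff-out", "derivedFrom", "dataset:gauge-raw"),
    ("dataset:runoff-out", "generatedBy", "model:runoff-v1"),
    ("model:runoff-v1", "createdBy", "person:alice"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern, sorted; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "What": everything tagged with the same watershed name.
tagged = [t[0] for t in match(triples, p="taggedWith", o="Example Creek")]

# "Why": provenance of one output dataset.
sources = [t[2] for t in match(triples, s="dataset:runoff-out", p="derivedFrom")]

print(tagged)
print(sources)
```

The same pattern-matching function serves both the tag lookup and the provenance query, which is the practical appeal of storing heterogeneous relations in one triple structure.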
IN31B-1139
Acquisition Of Rainfall Dataset And The Application For The Automatic Harvester In The Chesapeake Bay Region
The objective of this study is the preparation and indexing of rainfall data products for ingestion into the Chesapeake Bay Environmental Observatory (CBEO) node of the CUAHSI/WATERS network. The rainfall products, which are derived from the WSR-88D NEXRAD network, come from two sources: the NOAA/NWS Advanced Hydrologic Prediction Service, which provides the Multi-sensor Precipitation Estimate (MPE) data generated by the Regional River Forecast Centers, and the Hydro-NEXRAD rainfall data generated as a service by the University of Iowa. The former is collected on a 4 km x 4 km grid (HRAP) with a daily average temporal resolution and the latter on a 1-minute x 1-minute (of a degree) grid with hourly values. We have generated a cut-out for the Chesapeake Bay Basin that contains about 9,300 nodes (sites) for the MPE data and about 300,000 nodes (sites) for the Hydro-NEXRAD product. Automated harvesting services have been implemented for both data products. The MPE data is harvested from its download site using ArcGIS, which is also used to extract the data for the Chesapeake Bay watershed before a script loads the data into the ODM. The Hydro-NEXRAD data is downloaded from a web-based system at the University of Iowa which permits downloads for large-scale watersheds organized by Hydrologic Unit Codes (HUC). The resulting ASCII output is then automatically parsed and the information stored alongside the MPE data. Storing the two data products side by side allows a comparison between them that addresses the accuracy and agreement of the methods used to arrive at rainfall data, as both use the raw reflectivity data from the WSR-88D system.
IN31B-1140
Virtual Sensors in a Web 2.0 Digital Watershed
The lack of rainfall data in many watersheds is one of the major barriers to modeling and studying many environmental and hydrological processes and to supporting decision making. There are simply not enough rain gages on the ground. To overcome this data scarcity issue, a Web 2.0 digital watershed has been developed at NCSA (National Center for Supercomputing Applications), where users can point and click on a web-based Google Maps interface and create new precipitation virtual sensors at any location within the coverage region of a NEXRAD station. A set of scientific workflows is implemented to perform spatial, temporal and thematic transformations of the near-real-time NEXRAD Level II data. Such workflows can be triggered by the users' actions and generate either rainfall rate or rainfall accumulation streaming data at a user-specified time interval. We will discuss some underlying components of this digital watershed, which consists of semantic content management middleware, a semantically enhanced streaming data toolkit, virtual sensor management functionality, and a RESTful (REpresentational State Transfer) web service that can trigger the workflow execution. Such a loosely coupled architecture presents a generic framework for constructing a Web 2.0 style digital watershed. An implementation of this architecture at the Upper Illinois River Basin will be presented. We will also discuss the implications of the virtual sensor concept for the broad environmental observatory community and how such a concept will help us move towards a participatory digital watershed.
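To make the step from radar reflectivity to rainfall rate and accumulation concrete, the Python sketch below uses a power-law Z-R relation. The abstract does not say which relation the workflows apply; the Marshall-Palmer coefficients (a = 200, b = 1.6), the function names, and the sample values are assumptions chosen only for illustration.

```python
import math


def rain_rate_mm_per_h(dbz, a=200.0, b=1.6):
    """Convert reflectivity in dBZ to rain rate (mm/h) via Z = a * R**b.

    The default a and b are the Marshall-Palmer values, used here as
    an assumption; an operational workflow may use other coefficients.
    """
    z = 10.0 ** (dbz / 10.0)       # dBZ -> linear reflectivity factor
    return (z / a) ** (1.0 / b)    # invert the power law for R


def accumulation_mm(dbz_series, step_minutes):
    """Sum rain rate over equally spaced scans to get depth in mm."""
    hours_per_step = step_minutes / 60.0
    return sum(rain_rate_mm_per_h(dbz) * hours_per_step for dbz in dbz_series)


# Reflectivity that corresponds to exactly 1 mm/h under the assumed relation.
dbz_for_1_mm_per_h = 10.0 * math.log10(200.0)

# Twelve hypothetical 5-minute scans at that reflectivity span one hour.
one_hour_total = accumulation_mm([dbz_for_1_mm_per_h] * 12, step_minutes=5)
print(round(one_hour_total, 6))
```

Separating the rate conversion from the accumulation step matches the abstract's distinction between rainfall rate and rainfall accumulation streams at a user-specified interval.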
IN31B-1141
A coupling framework as a virtual hydrological laboratory for water, sediment and nutrients modeling at the catchment scale
A watershed is a complex system with some degree of organization. Interactions and feedbacks among its sub-systems can give rise to emergent patterns at specific temporal and spatial scales. Exploring the hidden processes and emergent patterns requires detailed measurements and observations, which are usually inadequate or even unavailable for most real basins. A virtual hydrological laboratory coupling various mass cycles, including water, sediment, nutrients and carbon, is necessary for scientific hydrology as well as for practical purposes. It should be applicable directly at the watershed scale and flexible enough that associated closure relations can be changed easily to account for site conditions (Wood et al., 2007). In most existing coupling frameworks, the coupled processes are not solved simultaneously, and the corresponding closure relationships are fixed and cannot be easily revised. A new framework is proposed based on the representative elementary watershed (REW) approach (Reggiani et al., 1998; Tian et al., 2006) and the THREW model (Tian et al., 2008). It is applied to the Upper Sangamon River Basin, Illinois, which has undergone intensive human interference due to agriculture. Diagnostic analysis is carried out with the model to explore the influences of agricultural practices on hydrological processes. Crop transpiration and tile drainage are found to play important roles in the runoff processes. The scaling behavior of water, sediment, and nutrient export at the watershed scale is also analyzed, and the interactions of different mass cycles on the resulting scaling exponents are discussed. Keywords: diagnostic, virtual hydrological laboratory, feedback
IN31B-1142
Low-energy, low-budget sensor web enablement of an amateur weather station
Sensor Web Enablement (OGC SWE) has developed into a powerful concept with many potential
applications in environmental monitoring and in other fields. This has spurred development of software
applications for Sensor Observation Services (SOS), while the development of client applications still lags
behind. Furthermore, the deployment of sensors in the field often places tight constraints on energy and
bandwidth available for data capture and transmission.
As a "proof of concept" we equipped an amateur weather station with low-budget, standard components to
read the data from its base station and feed it into a sensor observation service using its standard web-
service interface. We chose the weather station as an example because of its simple measured phenomena
and its low data volume. As sensor observation service we chose the open source software package offered
by the 52North consortium.
Power consumption can be problematic when deploying a sensor platform in the field. Instead of a common
PC we used a Network Storage Link Unit (NSLU2) with a Linux operating system, a configuration also known
as "Debian SLUG". The power consumption of a "SLUG" is of the order of 2 to 5 W, compared to 40 W in a
small PC. The "SLUG" provides one Ethernet and two USB ports, one used by its external USB hard drive.
This modular setup is open to modifications, for example the addition of a GSM modem for data transmission
over a cellular telephone network.
The simple setup, low price, low power consumption, and the low technological entry-level allow many
potential uses of a "SLUG" in environmental sensor networks in research, education and citizen science. The
use of mature sensor observation service software allows easy integration of monitoring networks with
other web services.
http://hdl.handle.net/10273/GFZ.SWE.1000
IN31B-1143
Optimizing Streamgage Network through Spatial Evolutionary Algorithms toward Digital Watershed Development
The streamflow gage network is at the base of the infrastructure of any digital watershed that is supposed to provide reliable, dynamic, online streamflow and water quality information for watershed management. The task of streamgage network optimization is to decide the location of gages and the concomitant link to the reliability of water management decisions. The gage network is modeled as a large-scale spatial system, which brings challenges because large-scale spatial problems are computationally expensive for traditional optimization methods. However, spatial patterns (information) provide opportunities for upgrading regular optimization algorithms for spatial problems. The Spatial Evolutionary Algorithm (SEA) is designed to incorporate the spatial knowledge of the system into an evolutionary algorithm. A hierarchical tree structure is designed to encode the spatial datasets in the gage network and shows increased computational efficiency for large-scale spatial problems. Adopting the tree structure, we design and implement the crossover and mutation operators, which form the unique elements of the newly developed SEA. Specifically, splitting and merging procedures based on sensitivity introduce exploitation into the mutation operation and help to speed up the optimal search; crossover swaps the nodes within the same area between individuals, maintains locally good information and passes it on to future generations. This study applies the SEA to a case study watershed, the Salt Creek watershed. The streamgage network refinement is defined as a multi-objective optimization problem that maximizes the effectiveness of the network extension and minimizes the cost. A consistent framework is developed to integrate the model calibration and streamflow gage network optimization. We employ Cross-Entropy (CE) as a measure of incremental information gain and maximize CE to search for the optimal network refinement.
The results from the case study watershed demonstrate the effectiveness and computational efficiency of the SEA. Expansion of the method for sensor network design is also discussed.
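The abstract does not define its CE measure. One common reading of cross-entropy as an information gain is the relative entropy (Kullback-Leibler divergence) between a distribution updated with a candidate gage and a prior distribution. The Python sketch below follows that reading as an assumption; the site names and probability values are invented, and the selection rule is a simplification of the multi-objective search described above.

```python
import math


def relative_entropy(posterior, prior):
    """Sum of p * ln(p / q) over outcomes, in nats.

    Terms with p == 0 contribute nothing; prior entries must be positive
    wherever the posterior is positive.
    """
    if len(posterior) != len(prior):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p, q in zip(posterior, prior):
        if p > 0.0:
            total += p * math.log(p / q)
    return total


def best_candidate(prior, candidates):
    """Pick the candidate whose posterior departs most from the prior."""
    return max(candidates, key=lambda name: relative_entropy(candidates[name], prior))


# Hypothetical prior over four flow classes and posteriors for three
# candidate gage sites (all numbers are made up).
prior = [0.25, 0.25, 0.25, 0.25]
candidates = {
    "site-1": [0.25, 0.25, 0.25, 0.25],  # adds no information
    "site-2": [0.40, 0.30, 0.20, 0.10],
    "site-3": [0.70, 0.10, 0.10, 0.10],
}

print(best_candidate(prior, candidates))
```

Under this reading, a candidate that leaves the distribution unchanged scores zero, so maximizing the measure favors gage locations that change what is known about the system the most.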
IN31B-1144
A data model for environmental scientists
Environmental science encompasses a wide range of disciplines, from water chemistry to microbiology, ecology and atmospheric sciences. Studies often require working across disciplines that differ in their ways of describing and storing data, so that it is not possible to devise a monolithic one-size-fits-all data solution. Based on our experiences with the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Observations Data Model and the Berkeley Water Center FLUXNET carbon-climate work, and by examining standards like EPA's Water Quality Exchange (WQX), we have developed a flexible data model that allows extensions without the need to alter the schema, so that scientists can define custom metadata elements to describe their data, including observations, analysis methods, sensors and geographical features. The data model supports various types of observations, including fixed point and moving sensors, bottled samples, rasters from remote sensors and models, and categorical descriptions (e.g. taxonomy), employing user-defined types when necessary. It leverages the ADO.NET Entity Framework to provide the semantic data models for differing disciplines, while maintaining a common schema below the entity layer. This abstraction layer simplifies data retrieval and manipulation by hiding the logic and complexity of the relational schema from users, and thus allows programmers and scientists to deal directly with objects such as observations, sensors, watersheds, river reaches, channel cross-sections, laboratory analysis methods and samples, as opposed to table joins, columns and rows.
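The idea of custom metadata that extends a record without a schema change can be illustrated with a small object model. The sketch below uses Python dataclasses rather than the ADO.NET Entity Framework named in the abstract, and every class, field, and value in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """A single observed value with a fixed core and open-ended metadata."""

    variable: str
    value: float
    units: str
    # Free-form descriptors: adding a new key needs no change to the class,
    # which stands in for "extension without altering the schema".
    metadata: dict = field(default_factory=dict)


def with_metadata_key(observations, key):
    """Return the observations that carry a given custom metadata element."""
    return [obs for obs in observations if key in obs.metadata]


# A fixed-point sensor reading and a bottled sample share one structure
# but describe themselves with different custom elements.
sensor_reading = Observation(
    "water temperature", 14.2, "degC",
    {"sensor_id": "T-01", "depth_m": 0.5},
)
bottle_sample = Observation(
    "nitrate", 0.8, "mg/L",
    {"bottle_id": "B-117", "lab_method": "ion chromatography"},
)

samples = with_metadata_key([sensor_reading, bottle_sample], "bottle_id")
print([obs.variable for obs in samples])
```

Callers work with `Observation` objects and their attributes directly, which is the kind of object-level access the abstraction layer is meant to offer in place of table joins.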
IN31B-1145
MAEviz: Seismic Risk Assessment Environment - bridging the gap between research and practice
In the field of hazard risk assessment, a new generation of tools is needed to allow researchers and
practicing engineers the ability to leverage investments in new methodologies and software infrastructure
while enabling customization to local conditions. MAEviz represents such a next generation of seismic risk
assessment environment, based on the Mid-America Earthquake (MAE) Center research and designed to be
extended, customized, and evolved to meet the needs of specific organizations and regions.
It is built upon an extensible Open Services Gateway Initiative (OSGi) based GIS application platform and
leverages distributed content management, workflow, and virtual-organization based design concepts.
MAEviz has been developed as a collaboration between the Mid-America Earthquake (MAE) Center
community and the National Center for Supercomputing Applications (NCSA) and is an implementation of the
MAE Center's Consequence-based Risk Management (CRM) research methodology.
MAEviz is open source and provides a modern GIS application interface with sophisticated visualization and
reporting capabilities. It also incorporates mechanisms to integrate distributed data sources, provides
approximately 50 reusable analyses, and has the ability to save and share scenarios to coordinate work in
distributed teams. As an Eclipse Rich Client Platform (RCP) application, MAEviz is composed of multiple
plugins and clearly defined extension points that leverage numerous open source libraries such as Geotools,
iText, kTable, JFreeChart and the Visualization Toolkit (VTK) as well as middleware components developed
at NCSA. This architecture enables MAEviz to rapidly be extended with new scientific analyses and allows
reuse of the base GIS environment capabilities.
MAEviz helps bridge the gap between researchers, practitioners and policy-makers by integrating the latest
research findings, the most accurate data, and state-of-the-art methodologies in an extensible open source
platform.
http://maeviz.ncsa.uuc.edu