IN12A-01 INVITED
Evolution of Data Policies and Practices Within NASA's Heliophysics Program During Solar Cycle 23
During the last solar cycle, the data environment sponsored by NASA's Heliophysics program underwent a significant transition. Data sets were opened up, placed on-line, and made available to the research community at large. The causes of this paradigm shift were many: the IT revolution, new programmatic standards, shifts in attitudes about the roles of the principal investigators (PIs), etc. There were many intended and unintended consequences resulting from this paradigm shift. For example, by opening up their data sets, the PIs unintentionally created instantaneous virtual peer groups (the data users) that reviewed data quality on a regular basis. Feedback gained from the broad spectrum of users ultimately improved the data quality and was appreciated by the PIs. During this same period, there was a shift in the way the Heliophysics community analyzed its data: from single-instrument, point analyses of observed phenomena to a more systematic approach analyzing classes of phenomena by multi-series, multi-instrument investigations. This transition was enabled by the opening of the data sets and, in turn, the analytic transition placed more demand on the data providers to make access to the data sets more efficient and transparent - a positive feedback loop. This paper will trace the evolution of the Heliophysics data policies and practices and present the results of a survey to ascertain the benefits and problems of undertaking the transition.
IN12A-02 INVITED
Data Sharing and Science: Legal, Normative, and Social Issues
The volume of scientific data and the interconnectedness of the systems under study make integration of
data a necessity. For example, life scientists must integrate data from across biology and chemistry to
comprehend disease and discover cures, and climate change scientists must integrate data from wildly
diverse disciplines to understand our current state and predict the impact of new policies.
The technical challenge of such integration is significant, although emerging technologies appear to be
helping. But the forest of terms and conditions around data makes integration difficult to legally perform in
many cases. One approach might be to develop and recommend a single license: any data with this license
can be integrated with any other data under this license. But this approach, which implicitly builds on
intellectual property rights and the ideas of licensing as understood in software and culture, is difficult to
scale for scientific uses. There are too many databases under too many terms already, and it is unlikely that
any one license or suite of licenses will have the correct mix of terms to gain critical mass and allow massive-
scale machine integration of data.
This talk will instead lay out principles for open access data and a protocol for implementing those principles,
as well as describe various international efforts to make data and databases legally and technically
interoperable.
http://sciencecommons.org
IN12A-03
Analyzing Architecture Patterns for Enterprise-level, multi-mission Data Systems
There are a number of agencies and institutions that, as part of their business processes, store and manage large amounts of science-related data resources. These organizations are challenged with collecting, processing, archiving, providing access to, and distributing that data. Their Data Systems are tuned to meet their organization's responsibilities and work within the policies and governance mechanisms of those organizations. While serving different kinds of data, targeting different user communities, and supporting different business processes, these systems all face similar problems. This paper presents an approach for the analysis of the underlying architectures of these Data Systems. This approach is based on the development of a common way to describe the drivers, influences and characteristics of enterprise-level data systems. With the establishment of a common language for describing this pattern, we propose a survey in which information describing these myriad solutions can be captured and analyzed. With this analysis, the community can identify common trends in enterprise-level architectures for science Data Systems, identify the strengths and weaknesses of alternatives, and establish a platform for the analysis of alternative architectures for future Data Systems.
IN12A-04 INVITED
Using Laser Scanning and Open Source GIS to Analyze Trends and Impacts of Topographic Change
Airborne, on-ground and laboratory laser scanning technologies are stimulating the development of new methods and tools for terrain analysis by providing unprecedented spatial and temporal detail of landscape surface evolution. For example, the most recent 2007 and 2008 airborne lidar surveys of the North Carolina coast provide data with a point density one order of magnitude higher than most of the previous surveys and allow us to further extend our study of coastal terrain dynamics both in terms of time period and spatial detail. We demonstrate the capabilities of open source GRASS GIS for integrating lidar point cloud data from different sources, computing time series of digital elevation models (DEMs), and analyzing them to identify vulnerable coastal locations. In addition to traditional approaches, we take advantage of the GRASS GIS support for 3D raster data and present a novel approach to the analysis of terrain based on volume representation. We use the 1996-2008 time series of high resolution elevation data to derive a volume with time used as the third dimension and elevation as the voxel value. Isosurfaces are then used to represent the evolution of a given elevation contour, for example a shoreline or an elevation contour close to dune ridges, providing an alternative to representation by hard-to-read overlapping 2D contours. The second application combines a real-world high resolution DEM with a flexible physical 3D model at 1:1200 scale, an indoor laser scanner and projectors into a tangible geospatial modeling system. The flexible scale model can be manually modified to create various land management scenarios by adding ponds, dams, or buildings, or by changing the roughness of the surface. The link to open source GRASS GIS allows us to run simulations on the modified landscapes and project the results over the laboratory model with the aim of studying the conditions that can elucidate poorly understood aspects of landscape dynamics.
We illustrate the system's application for environmental design and its impact on watershed runoff distribution using several new scenarios created by modifications of the flexible model surface.
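The space-time volume idea described above can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the GRASS GIS workflow used by the authors: the function names (`build_volume`, `contour_crossing`, `contour_evolution`) and the two tiny cross-shore profiles are hypothetical, and a real analysis would use 3D rasters and isosurfaces rather than a single profile row.

```python
# Hypothetical sketch: stack a time series of DEMs into a volume whose third
# dimension is time, then trace where one elevation contour sits per survey.

def build_volume(dems):
    """Stack 2D elevation grids (one per survey) as volume[t][row][col]."""
    rows, cols = len(dems[0]), len(dems[0][0])
    for dem in dems:
        if len(dem) != rows or any(len(r) != cols for r in dem):
            raise ValueError("all DEMs must share the same grid")
    return [[list(row) for row in dem] for dem in dems]


def contour_crossing(profile, level):
    """Fractional column index where the profile first crosses `level`, else None."""
    for i in range(len(profile) - 1):
        a, b = profile[i], profile[i + 1]
        if a == level:
            return float(i)
        if (a - level) * (b - level) < 0:
            # Linear interpolation between the two neighbouring cells.
            return i + (level - a) / (b - a)
    if profile and profile[-1] == level:
        return float(len(profile) - 1)
    return None


def contour_evolution(volume, row, level):
    """Position of the `level` contour along one grid row, for every survey."""
    return [contour_crossing(layer[row], level) for layer in volume]


# Two made-up single-row surveys (elevation in metres, sea on the left).
survey_a = [[-1.0, 0.0, 1.0, 2.0]]
survey_b = [[-2.0, -1.0, 0.0, 1.0]]
volume = build_volume([survey_a, survey_b])
print(contour_evolution(volume, row=0, level=0.5))  # [1.5, 2.5]
```

In this toy case the 0.5 m contour moves one cell landward between the two surveys, which is the kind of change an isosurface through the volume would show continuously in time.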
IN12A-05 INVITED
Lightweight Collaboration on Data: Wikis, Semantics, and the Web
Despite 30 years of work in computer science, it has been almost impossible to build large science knowledge bases that can be widely authored, usefully reused and extended, and able to answer moderately sophisticated questions. Tools that try to do this have typically been extremely specialized, training-intensive, and ultimately unscalable. This talk will focus on a new wiki-oriented approach to collaboration on technical data, based on fusing standard Wikipedia technology with the flexible data models and query languages of the Semantic Web. These "semantic wikis" offer an intriguing combination of scalable consensus-oriented collaboration, the ability to flexibly combine text and structured data, proven usability across different organizational models, and web-scale reuse.
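To make the combination of text pages and structured data more concrete, the following minimal sketch stores facts as subject-predicate-object triples, the data model used by the Semantic Web, and answers a two-step question over them. It is an assumption-laden illustration in plain Python: the `match` and `satellites_producing` helpers and the predicate names are hypothetical, and an actual semantic wiki would expose such facts through its own annotation syntax and query language.

```python
# Hypothetical sketch: facts a wiki page might carry as structured annotations,
# stored as (subject, predicate, object) triples.
TRIPLES = [
    ("Terra", "carries", "MODIS"),
    ("MODIS", "produces", "NDVI"),
    ("MODIS", "produces", "EVI"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def satellites_producing(triples, product):
    """Join two patterns: which platforms carry an instrument producing `product`?"""
    instruments = {s for s, _, _ in match(triples, p="produces", o=product)}
    return sorted(
        s for s, _, o in match(triples, p="carries") if o in instruments
    )


print(satellites_producing(TRIPLES, "NDVI"))  # ['Terra']
```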
IN12A-06
Estimating Root Zone Soil Moisture at Distant Sites Using MODIS NDVI and EVI in a Semi-Arid Region
This study investigated the potential of the Normalized Difference Vegetation Index (NDVI) and the Enhanced Vegetation Index (EVI) to estimate in situ measured root zone soil moisture at the VI pixel, referred to as the "native" site in this study, and at increasingly distant sites within the same climatic setting. Soil moisture data were obtained from Soil Climate Analysis Network sites in the Texas-New Mexico border area, and NDVI and EVI products from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on the Terra satellite. Results show that same-depth soil moisture values are highly correlated (R = 0.53 to 0.85) at distant sites as far as 150 km from the native site, and NDVI and EVI are highly correlated at each site (R = 0.95 to 0.98). Raw time series have higher mean correlations than deseasonalized time series at every depth. Deseasonalized time series produce consistent results, and NDVI and EVI both have a significant correlation with soil moisture at distant sites (R = 0.35 to 0.73). R reaches maximum values with a 5 to 10 day time lag. R is slightly higher for NDVI than for EVI. R decreases with distance from the native site. Regression analysis was also conducted using deseasonalized NDVI time series and deseasonalized soil moisture with a 5 day time lag. The model predicted soil moisture at all depths (adjusted R squared = 0.44 to 0.59). Overall, deseasonalized NDVI produces consistent results and shows that NDVI can estimate root zone soil moisture at distant sites in the study area.
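The deseasonalizing and lagged-correlation steps mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how the seasonal cycle was removed or which series leads, so subtracting the mean of each position in the cycle and treating soil moisture as the leading series are both assumptions, and all function names and the synthetic series are hypothetical.

```python
from math import sqrt


def deseasonalize(series, period):
    """Subtract the mean of each position in the seasonal cycle (assumed method)."""
    if len(series) < period:
        raise ValueError("series must cover at least one full period")
    sums, counts = [0.0] * period, [0] * period
    for i, value in enumerate(series):
        sums[i % period] += value
        counts[i % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [value - means[i % period] for i, value in enumerate(series)]


def pearson(x, y):
    """Pearson correlation coefficient R of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_correlation(leading, lagging, lag):
    """Correlate leading[t] with lagging[t + lag] for a non-negative lag."""
    return pearson(leading[: len(leading) - lag], lagging[lag:])


def best_lag(leading, lagging, max_lag):
    """Return (lag, R) for the lag in 0..max_lag with the highest R."""
    scores = [(lagged_correlation(leading, lagging, k), k) for k in range(max_lag + 1)]
    r, lag = max(scores)
    return lag, r


# Synthetic check: the second series is the first delayed by two steps.
soil = [float((i * 7) % 11) for i in range(40)]
ndvi = [0.0, 0.0] + soil[:-2]
print(best_lag(soil, ndvi, max_lag=5)[0])  # 2
```

With real data, both series would be deseasonalized first and the lag would be expressed in days on a common time step.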
IN12A-07
PALM: a Parallel Dynamic Coupler
In order to efficiently represent complex systems, numerical modeling has to rely on many physical models at
a time: an ocean model coupled with an atmospheric model is at the basis of climate modeling. The continuity
of the solution is guaranteed only if these models can constantly exchange information. PALM is a coupler
that allows the concurrent execution and intercommunication of programs that were not specifically
designed for it.
With PALM, the dynamic coupling approach is introduced: a coupled component can be launched at any moment
during the simulation and releases its computing resources upon termination.
To exploit the available computing resources as fully as possible, the PALM coupler handles two levels of
parallelism. The first level concerns the components themselves. While managing the resources, PALM
allocates the number of processes that each coupled component requires. These models can be
parallel programs based on domain decomposition with MPI or applications multithreaded with OpenMP. The
second level of parallelism is a task parallelism: one can define a coupling algorithm allowing two or more
programs to be executed in parallel.
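The second level of parallelism, two components running side by side and exchanging data at each step, can be illustrated with a toy example. The sketch below does not use PALM or its API: Python threads and queues stand in for separately launched programs and the coupler's communications, and the "ocean" and "atmosphere" functions, their state variables, and the exchange coefficient are all hypothetical.

```python
import queue
import threading


def ocean(steps, to_atm, from_atm, log):
    """Toy ocean: sends its temperature, receives a heat flux, updates itself."""
    sst = 15.0
    for _ in range(steps):
        to_atm.put(sst)
        flux = from_atm.get()  # blocks until the atmosphere has answered
        sst += 0.1 * flux
        log.append(sst)


def atmosphere(steps, to_ocean, from_ocean, log):
    """Toy atmosphere: receives ocean temperature, returns the flux it computes."""
    air = 10.0
    for _ in range(steps):
        sst = from_ocean.get()
        flux = air - sst  # toy flux into the ocean, proportional to the contrast
        to_ocean.put(flux)
        air -= 0.1 * flux
        log.append(air)


def run_coupled(steps):
    """Run both components concurrently and return their state histories."""
    ocean_to_atm, atm_to_ocean = queue.Queue(), queue.Queue()
    sst_log, air_log = [], []
    workers = [
        threading.Thread(target=ocean, args=(steps, ocean_to_atm, atm_to_ocean, sst_log)),
        threading.Thread(target=atmosphere, args=(steps, atm_to_ocean, ocean_to_atm, air_log)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sst_log, air_log


sst_log, air_log = run_coupled(steps=10)
print(round(sst_log[-1] + air_log[-1], 6))  # 25.0, the toy exchange conserves the total
```

Each component only knows its own send and receive points, which is the property that lets independently written codes be coupled; in a real coupler the components would be MPI or OpenMP programs rather than threads in one interpreter.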
PALM applications are implemented via a graphical user interface (GUI) called PrePALM. In this GUI, the
programmer first defines the coupling algorithm and then describes the actual communications between the
models.
PALM offers very high flexibility for testing different coupling techniques and for reaching the best load
balance on a high-performance computer. The transformation of computationally independent code is almost
straightforward. The other qualities of PALM are its easy set-up, its flexibility, its performance, the simple
updates and evolutions of the coupled application and the many side services and functions that it offers.
http://www.cerfacs.fr/globc/PALM_WEB/
IN12A-08
LISIRD: Where to go for Solar Irradiance Data
LASP, the Laboratory for Atmospheric and Space Physics, has been providing web access to solar irradiance
measurements, reference spectra, composites and model data covering the solar spectrum from 0.1 to 2400
nm through LISIRD, the LASP Interactive Solar IRradiance Datacenter.
No single instrument can measure the solar spectral irradiance from X-rays to the IR, but the ensemble of
LASP instruments can. LISIRD uses a single interface to provide easy, logical access to a variety of mission
data, merged in time and wavelength. Daily space weather measurements are available, including total solar
irradiance (TSI), Lyman Alpha (121 nm), Magnesium II Index (280 nm), He II (30.4 nm), Fe XVI (33.5 nm), and
the FUV continuum (145 to 165 nm).
LISIRD has recently added the Whole Heliosphere Interval (WHI) Solar Irradiance time series,
which provides quiet-Sun reference spectra for the period of April 10-16, 2008. LISIRD also recently
added a composite solar spectral irradiance product over the range of 120 to 400 nm for the time period from
November 8, 1978 to August 1, 2005. This product, created by Mathew Deland at SSAI, merges data from
six different satellites into a single SSI product. We are also currently adding a time series for daily solar
spectral irradiance from 1950 to 2006, created by Judith Lean of the Naval Research Lab. This product
adjusts observed irradiance for a given wavelength with parameters that represent known sources of
variability at that wavelength.
LISIRD remains committed to improving data access in a variety of ways. We are planning and developing a
means for the broader community of scientists to easily determine data availability for a particular date range
without having to know mission or instrument details. Improved data subsetting will allow users to request
only the time range or spectra that they need, making data management generally easier.
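The two planned capabilities, checking availability for a date range and subsetting by time and wavelength, amount to simple filters over a catalog. The sketch below is purely illustrative and is not LISIRD's interface: the `Entry` type, the function names, and the catalog entries (dataset names and wavelengths taken from the list above, dates chosen arbitrarily) are hypothetical.

```python
from datetime import date
from typing import NamedTuple


class Entry(NamedTuple):
    """One hypothetical catalog record: which dataset has data on which day."""
    dataset: str
    day: date
    wavelength_nm: float


# Made-up catalog entries for illustration only.
CATALOG = [
    Entry("Lyman Alpha", date(2008, 4, 10), 121.0),
    Entry("Lyman Alpha", date(2008, 4, 20), 121.0),
    Entry("Magnesium II Index", date(2008, 4, 12), 280.0),
    Entry("He II", date(2005, 8, 1), 30.4),
]


def available_datasets(catalog, start, end):
    """Names of datasets with at least one entry in [start, end], no mission details needed."""
    return sorted({e.dataset for e in catalog if start <= e.day <= end})


def subset(catalog, start, end, wl_min, wl_max):
    """Entries inside both the date range and the wavelength range (inclusive)."""
    return [
        e for e in catalog
        if start <= e.day <= end and wl_min <= e.wavelength_nm <= wl_max
    ]


whi_start, whi_end = date(2008, 4, 10), date(2008, 4, 16)
print(available_datasets(CATALOG, whi_start, whi_end))
print(subset(CATALOG, whi_start, whi_end, wl_min=100.0, wl_max=200.0))
```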
We expect to continue to enhance our data offerings. Future vision for LISIRD also includes integration of
improved data visualization and analysis tools. We welcome contributions from solar science community
members who wish to share data and tools they have developed. We also expect to integrate LISIRD with
the Virtual Solar Observatory (VSO) and other relevant Virtual Observatories (VOs) for a more integrated
and complete user experience.
We are actively seeking input and feedback to improve LISIRD from interested users of this data. Towards
this end we have provided a survey at our website and to AGU attendees. Those who use LISIRD and
provide feedback will have the opportunity to help steer LISIRD development. Let us know what you would
like to see and we will try to make it happen!
http://lasp.colorado.edu/LISIRD/