Biogeosciences [B]

B33A

MC:Hall D Wednesday 1340h

Data Assimilation in Biogeochemical Models Posters

B33A-0386

Estimation of daily global solar radiation in Vietnamese Mekong Delta area: A combinational application of statistical downscaling method and Bayesian inference

Long-term daily global solar radiation (GSR) data of consistent quality for the 20th century are needed as a baseline to assess the impact of climate change on paddy rice production in the Vietnamese Mekong Delta (MKD: 104.5-107.5°E, 8.2-11.2°N). However, although sunshine duration data are available, access to GSR data is quite poor in the MKD. This study estimated daily GSR in the MKD for 30 years (1978-2007) by applying a statistical downscaling method (SDM). Estimates of GSR were obtained from four different sources: (1) combined equations using corrected reanalysis data of daily maximum/minimum temperature, relative humidity, sea level pressure, and precipitable water; (2) a correction equation using reanalysis data of downward shortwave radiation; (3) an empirical equation using observed sunshine duration; and (4) short-term observations at one site. Three reanalysis data sets, NCEP-R1, ERA-40, and JRA-25, were used. Observed meteorological data, which contain many missing values, were obtained from 11 stations of the Vietnamese Meteorological Agency for 28 years and from five stations of the Global Summary of the Day for 30 years. One year of observed GSR data was obtained from our station. Because the data used for the analysis contain many missing values, we used Bayesian inference, which can optimize multiple parameters in a non-linear, hierarchical model. The Bayesian inference provided posterior distributions of 306 parameter values relating to the combined equations, the empirical equation, and the correction equation. Preliminary results show that the empirical equation and the correction equation underestimated the amplitude of daily fluctuations in modeled GSR. The combination of SDM and Bayesian inference has the potential to estimate long-term daily GSR of consistent quality even in areas where observed data are quite limited.

B33A-0387

Estimating light use efficiency by linking flux tower data and fraction of absorbed PAR at chlorophyll level (FAPAR_{chl}) derived from daily MODIS observations

We used daily MODIS imagery obtained over 2001-2005 to analyze the seasonal and interannual
photosynthetic light use efficiency (LUE) of the Southern Old Aspen (SOA) flux tower site located near the
southern limit of the boreal forest in Saskatchewan, Canada. This forest stand extends for at least 3 km in all
directions from the flux tower. The MODIS daily reflectance products have resolution of 500 m at nadir and
> 500 m at off-nadir. To obtain the spectral characteristics of a standardized land area to compare with
tower measurements, we scaled up the nominal 500 m MODIS products to an area of 2.5 km × 2.5 km
(5×5 MODIS 500 m grid cells). We then used the 5×5 scaled-up MODIS products in a coupled
canopy-leaf radiative transfer model, PROSAIL-2, to estimate the fraction of photosynthetically active
radiation (PAR) absorbed by the photosynthetically active part of the canopy dominated by chlorophyll
(FAPAR_{chl}) versus that absorbed by the whole canopy (FAPAR_{canopy}). From the tower
measurements, we determined 90-minute averages for APAR and LUE for the physiologically active foliage
(APAR_{chl}, LUE_{chl}) and for the entire canopy (APAR_{canopy}, LUE_{canopy}). The flux
tower measurements of GEP were strongly related to the MODIS-derived estimates of APAR_{chl} (r^{2} =
0.78) but weakly related to APAR_{canopy} (r^{2} = 0.33). Gross LUE (slope of GEP:APAR) between 2001
and 2005 for LUE_{chl} was 0.0241 μmol C μmol^{-1} PPFD whereas LUE_{canopy} was
36% lower. Inter-annual variability in growing season (DOY 152-259) LUE_{chl} (μmol C
μmol^{-1} PPFD) ranged from 0.0225 in 2003 to 0.0310 in 2004. The five year time series of growing season
LUE_{chl} corresponded well with both the seasonal phase and amplitude of LUE from the tower
measurements. We conclude that LUE_{chl} derived from MODIS observations could provide a useful input
to land surface models for improved estimates of ecosystem carbon dynamics.
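
As a rough illustration of "gross LUE as the slope of GEP:APAR", the sketch below fits a least-squares line through the origin. The function name `lue_slope`, the zero-intercept choice, and the sample numbers are assumptions made for illustration; they are not the authors' data or their regression procedure. The last line only reproduces the arithmetic implied by the reported "36% lower" canopy value.

```python
def lue_slope(apar, gep):
    """Least-squares slope of GEP against APAR for a line through the origin."""
    num = sum(a * g for a, g in zip(apar, gep))
    den = sum(a * a for a in apar)
    return num / den

# Hypothetical paired averages (umol m-2 s-1); NOT measurements from the SOA site.
apar_chl = [200.0, 400.0, 600.0, 800.0]
gep = [5.0, 10.0, 15.0, 20.0]

lue_chl_example = lue_slope(apar_chl, gep)

# Arithmetic from the abstract: LUE_canopy is 36% lower than LUE_chl = 0.0241.
lue_canopy_reported = 0.0241 * (1.0 - 0.36)
```

Forcing the line through the origin treats GEP as zero when no light is absorbed; a fit with a free intercept would be an equally plausible reading of "slope".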

B33A-0388

What factor generates greater uncertainty in predicting carbon flux for North America: climate characterization or model choice?

Numerous efforts have begun to characterize a variety of sources of uncertainty in carbon flux estimates from both forward-modeling and inverse-modeling approaches. One source of uncertainty is structural, created by the variety of approaches taken to select and characterize the most important biogeochemical processes. To begin to explore this structural uncertainty, we have used an ensemble of well-known models including CASA (Potter et al. (1993), version 2003.04.29), LPJ (Sitch et al. (2003), version 3.1.1-0.9.02), and BGC (White et al. (2000), version 5.0) with a consistent set of inputs for the period 1982-2006 for North America. Initially, the ensemble was run using input climate data interpolated from maximum, minimum and dew-point temperatures, precipitation, vapor pressure deficit, and incident daily solar radiation at stations from the National Climatic Data Center's (NCDC) Global Summary of the Day, incorporating on average about 1900 stations. Adding NCDC's Cooperative Summary of the Day data, available over the United States only, yielded a combined data set of approximately 9000 stations that was then used for the ensemble runs. The combined data set produced a significantly wetter surface than the sparser set, resulting in noticeably larger gross primary production (GPP) estimates by models in the ensemble. Mexico and Canada remain significantly undersampled. Uncertainty due to the choice of a relatively sparse or dense station network was smaller than the structural uncertainty due to model choice.

B33A-0389

Constraint on oceanic hydrothermal 3He from inverse modeling

Natural 3He originating from mantle degassing is a widely used tracer of deep ocean circulation. The enriched 3He signal is injected into the ocean along ridge axes by hydrothermal processes and is then transported by the deep circulation. This tracer was extensively measured during the GEOSECS and WOCE programs, and the measurements now offer satisfactory global ocean coverage. It has also been used to evaluate the circulation of different global ocean models during the OCMIP2 project (Ocean Carbon Model Intercomparison Project). However, its usefulness is limited by knowledge of its source function. In model simulations, the 3He source is set proportional to ridge spreading rates, with a global injection rate of 1000 mol/yr. We have attempted to constrain this source function using inverse modeling. We used the NEMO ocean general circulation model to run basis source functions for different source regions. A compilation of 3He data (WOCE and GEOSECS) was assembled and used to constrain the intensity of the source for each region. The inverse model uses a Bayesian approach that was previously developed for atmospheric carbon studies. We then investigated the two hypotheses that are prescribed to build the source function: the global intensity, and the proportionality to ridge spreading rates.
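
The idea of scaling regional basis functions to fit tracer data can be sketched as a linear Gaussian Bayesian inversion. Everything below is a toy under stated assumptions: two source regions, uncorrelated Gaussian prior and observation errors, and a made-up response matrix `H`. It is not the NEMO-based system of the abstract, and the function name is hypothetical.

```python
def bayesian_source_inversion(H, y, x_prior, prior_var, obs_var):
    """Posterior mean and covariance of two regional source strengths.

    H[i][j] is the modelled tracer response at observation i to a unit
    source in region j (one "basis function" run per region). Prior and
    observation errors are assumed Gaussian and uncorrelated.
    """
    # Posterior precision A = P^-1 + H^T R^-1 H and right-hand side
    # b = P^-1 x_prior + H^T R^-1 y, accumulated term by term.
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for j in range(2):
        A[j][j] += 1.0 / prior_var[j]
        b[j] += x_prior[j] / prior_var[j]
    for row, obs, var in zip(H, y, obs_var):
        for j in range(2):
            b[j] += row[j] * obs / var
            for k in range(2):
                A[j][k] += row[j] * row[k] / var
    # Invert the 2x2 precision matrix to get the posterior covariance.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    cov = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    mean = [cov[0][0] * b[0] + cov[0][1] * b[1],
            cov[1][0] * b[0] + cov[1][1] * b[1]]
    return mean, cov

# Hypothetical setup: the "true" sources of 600 and 400 sum to 1000.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [600.0, 400.0, 1000.0]
mean, cov = bayesian_source_inversion(
    H, y, x_prior=[500.0, 500.0], prior_var=[1.0e4, 1.0e4], obs_var=[1.0, 1.0, 1.0])
```

With a loose prior and precise observations the posterior mean moves almost all the way to the data-implied values, and the posterior variance falls far below the prior variance.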

B33A-0390

Assimilating Satellite Reflectance Data Into an Ecosystem Model to Constrain Estimates of Terrestrial Carbon Flux

Earth Observation (EO) data provides a unique constraint to models of terrestrial vegetation growth. Such models are important tools for analysing biogeochemical cycles. Data Assimilation (DA) is a viable mechanism by which EO data may be used to constrain vegetation model behaviour. Assimilating high-level EO products, such as Leaf Area Index, is an attractive option. There generally exists a linear relationship between the derived variable and the state vector of the model. This simplifies the implementation of the DA scheme and reduces computational overheads. However, high-level EO products often contain assumptions contrary to those in vegetation models and so this approach is unsatisfactory on a philosophical level. Furthermore, on a practical level it is often difficult to quantify error in high-level products; this is critical for well functioning DA schemes. This paper presents the alternative approach of assimilating so-called low-level EO products: in this case MODIS surface reflectance (MOD09). This is demonstrated with a simple ecosystem model (DALEC) and used to quantify the landscape scale carbon budget for a ponderosa pine site in Oregon. The assumptions made in the production of MOD09 from core satellite measurements do not generally conflict with those in vegetation models. Errors are also easier to quantify than for high-level products. To facilitate assimilation however, an observation operator is required to translate from the vegetation model to the surface reflectance. Bayesian calibration of DALEC coupled with an observation operator is undertaken using eddy-covariance data from a site on the Oregon transect, MOD09 reflectance, and field data. Using MOD09 data alone the model is re-calibrated for other nearby flux tower sites. In this fashion the utility of the low-level EO data to improve modelled estimates of the carbon budget is demonstrated. By extension it is possible to infer how well the EO data improves the carbon modelling at the landscape level.

B33A-0391 INVITED

Applications of Geostatistics to Data Assimilation in Biogeochemical Models

The field of geostatistics offers a rich set of tools for analyzing parameters that display spatial and/or
temporal autocorrelation. Historically, these methods have been used primarily for interpolating sparse
measurements of in situ data. More recently, however, methods based on a geostatistical framework have
been used in an increasing number of areas of earth science. This presentation will discuss a number of recent
developments in geostatistics relevant to data assimilation in biogeochemical models. The overall goal of the
presentation is to emphasize the need to explicitly account for spatial and temporal covariance in sampled
data, and the need to translate available data between relevant spatial and temporal scales. The emphasis
will be on presenting a common framework that can be used to develop problem-specific approaches. The
presented examples will include (i) the identification of environmental parameters controlling observed
variability in eddy covariance flux measurements, (ii) downscaling and upscaling observed spatial variability
across spatial scales, (iii) geostatistical inverse modeling for constraining carbon fluxes at fine spatial
resolutions, and (iv) merging of flux data and atmospheric concentration measurements for constraining
parameters in biospheric models.

http://www.umich.edu/~amichala/

B33A-0392

Global Monthly CO2 Flux Inversion Based on Results of Terrestrial Ecosystem Modeling

Most of our understanding of the sources and sinks of atmospheric CO2 has come from inverse studies of atmospheric CO2 concentration measurements. However, the number of currently available observation stations and our limited ability to simulate the diurnal evolution of the planetary boundary layer over continental regions restrict the number of regions that can be reliably inverted globally, especially over continental areas. To overcome these restrictions, a nested inverse modeling system based on the Bayesian principle was developed for estimating carbon fluxes of 30 regions in North America and 20 regions for the rest of the globe. Inverse modeling was conducted in monthly steps using 5 years of CO2 concentration measurements (2000 – 2005) with the following two models: (a) An atmospheric transport model (TM5) is used to generate the transport matrix, in which the diurnal variation of atmospheric CO2 concentration is considered in order to enhance the use of afternoon-hour average CO2 concentration measurements over the continental sites. (b) A process-based terrestrial ecosystem model (BEPS) is used to produce hourly carbon fluxes as the background of our inversion, which could minimize the limitation due to our inability to solve the inverse problem at high resolution. We will present our recent results achieved through a combination of bottom-up modeling with BEPS and top-down modeling based on TM5 driven by offline meteorological fields generated by the European Centre for Medium-Range Weather Forecasts (ECMWF).

B33A-0393

The OptIC Data Assimilation Intercomparison: A Statistical Critique

The development of improved terrestrial carbon models has assumed great importance because of concerns
about significant climate-to-carbon feedback processes. The complexity of the interactions leads to
considerable difficulties in the process of model calibration. The OptIC intercomparison explored some
aspects of model calibration, using an idealised terrestrial carbon model. Participants were invited to estimate
model parameters in various cases defined by specified time series of the model state, with various forms of
added noise. The study identified the crucial importance of the choice of cost function. The present analysis
revisits the OptIC study, by considering it as an exercise in statistical estimation. This treats the observations
as random variables. Consequently parameter estimates, â, based on observations will also be
random variables whose distribution is known as the 'sampling distribution'. Key questions for any specific
case are:

Are departures from â/a_true = 1 an indication of bias or sampling error?

Under what circumstances are uncertainty estimates (of Var[â]) reliable?

We consider cases where the estimate is
obtained by minimising a cost function, Θ_{X}. Assuming that we know the true form of ℓ, the log
likelihood, there are three different characterisations of uncertainty that should be distinguished:

(i) The uncertainty from maximum-likelihood estimates, corresponding (either exactly or asymptotically) to the
Cramér-Rao bound. In a realistic calibration situation, we won't be able to determine this because the 'true'
form of the likelihood is unknown.

(ii) The actual uncertainty associated with using a particular cost function.
If the true noise distribution is known, this can be calculated in simple cases and determined from simulations
in more complicated cases.

(iii) The 'formal uncertainty' based on assuming (usually incorrectly) that
Θ_{X} is the true likelihood.

In the first stage of the analysis, the distinctions are illustrated by analytic
solutions, for the simple case of estimating a constant. The second stage of the study addresses the actual
OptIC model, using various noise models (thus defining ℓ) and various cost functions Θ. The
respective sampling distributions (i) and (ii) above are determined by Monte Carlo simulation. The formal
uncertainties are obtained from standard estimators. The final stage uses the experience from stage 2, to
consider how to deal with the fact that the actual likelihood is unknown in real-world problems. Some of the main
possibilities are:

*estimate distribution parameters (e.g. variances)*: this involves augmenting the
parameter vector **a** - the serious limitation is that some knowledge/assumption about the likelihood is
still required;

*bypass the estimation of nuisance parameters*: this is a variant on the previous case and
is subject to the same limitations;

*compromise likelihoods*: average over a set of candidate estimators;

*robust estimators*: modify the maximum likelihood estimator so that it is less vulnerable to relaxation of
the distributional assumptions.

Finally the *bootstrap* formalism applies a set of pseudo-realisations that
are used in an analysis that parallels the Monte Carlo estimation of the sampling distribution described in
stage 2.
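
The "estimating a constant" case lends itself to a small Monte Carlo sketch of a sampling distribution under two cost functions. The choices here are assumptions for illustration, not the OptIC protocol: noise is drawn from a Laplace distribution, the sum-of-squares cost is minimised by the sample mean, and the sum-of-absolute-deviations cost (the maximum-likelihood choice for Laplace noise) is minimised by the sample median. All function names are hypothetical.

```python
import random
import statistics

def laplace_noise(rng, scale=1.0):
    """Laplace noise, generated as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sampling_distribution(estimator, a_true=1.0, n_obs=50, n_trials=2000, seed=0):
    """Repeat the estimation on fresh synthetic data and collect the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        obs = [a_true + laplace_noise(rng) for _ in range(n_obs)]
        estimates.append(estimator(obs))
    return estimates

# Least-squares cost -> sample mean; absolute-deviation cost -> sample median.
ls_estimates = sampling_distribution(statistics.mean)
ml_estimates = sampling_distribution(statistics.median)

var_ls = statistics.pvariance(ls_estimates)   # actual uncertainty, quadratic cost
var_ml = statistics.pvariance(ml_estimates)   # actual uncertainty, likelihood-matched cost
bias_ls = statistics.mean(ls_estimates) - 1.0
```

Both estimators are centred on the true value, but the spread of the least-squares estimates is larger, which is the kind of gap between characterisations (i) and (ii) that the abstract distinguishes.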

B33A-0394

Surface CO2 Flux in Weekly Time Resolution Over the Globe Inferred From CONTRAIL Data set

Concentrations of CO2 observed on passenger aircraft are ready for data assimilation in biogeochemical models. Five automatic measurement systems, called Continuous CO2 Measuring Equipment (CME), are installed on Boeing 747 and 777 aircraft and measure CO2 every 10 seconds during ascent and descent and every 1 minute during level flight (Machida et al., doi:10.1175/2008JTECHA1082.1). The measurement program, named Comprehensive Observation Network for Trace gases by Airliner (CONTRAIL), was tested in 2006 and has been in full operation since November 2006. In this presentation, we show preliminary results of an inverse calculation to estimate weekly sources and sinks of CO2 in 2007 for 64 surface regions of the globe. Because of a limitation of our solver, about 30,000 observations worldwide, taken between 3 km and 11 km altitude in 2007, were selected from the full data set. A global atmospheric transport model driven by an ECMWF meteorological data set was used to derive a gain matrix that represents the concentration response at a sampling point to a continuous one-week release of CO2 from an individual region. Fluxes for 56 weeks starting from 5 December 2006 were estimated. The root mean squared error between CONTRAIL observations and concentrations simulated using the weekly fluxes was 1.6 ppm, a 12 percent improvement over concentrations simulated using monthly fluxes estimated from another data set.

B33A-0395

Optimizing ORCHIDEE Land Surface Model Using Fluxnet Data and Satellite FAPAR Product: What can we Learn?

Terrestrial ecosystem models (TEM) are key components of climate and global carbon cycle models. Despite shared common structures, there is a lack of sound knowledge of the response functions controlling C/water flows and of the spatial and temporal variations in parameters, which yields divergent modelled responses of ecosystems to environmental changes. The recent availability of data on terrestrial activity should help to identify poorly represented or missing processes, and provide confidence intervals on parameter estimates and forecasts. Using a state-of-the-art mechanistic vegetation model (ORCHIDEE) that can be run at local or global scales, we investigated at the Laboratoire des Sciences du Climat et de l'Environnement (LSCE) the benefit of several types of data for improving the model simulations. ORCHIDEE is based on the concept of plant functional types and computes the energy, carbon, and water balances on a half-hourly time step. We will review recent findings based on the assimilation of Fluxnet data (carbon, sensible and latent heat fluxes) and remotely sensed fraction of Absorbed Photosynthetically Active Radiation (fAPAR) at several sites to optimize model parameters. After a brief description of the optimization technique (a variational system), we will detail some results linked to: 1) the overall model improvement at a few European forest sites and whether the assimilation of Fluxnet data for 'normal condition' years significantly improves the simulation for other, extreme years; 2) the benefit of tropical Fluxnet data (i.e., the Santarem site) for improving the model's hydrological stress functions for photosynthesis and soil respiration; 3) the complementarity between fAPAR derived from medium (MERIS) or high (SPOT) spatial resolution sensors and Fluxnet measurements for improving ORCHIDEE's prognostic phenology.
Uncertainty in the estimated parameters (returned by the inverse procedure) will be evaluated as a function of the different types of data that are assimilated and further used to estimate uncertainty in modeled carbon fluxes at local scale.

B33A-0396

Uncertainties in the Net Ecosystem Exchange of Europe and North America

Here we present a thorough upscaling of carbon balance estimates from eddy covariance flux towers to Europe
and North America with an estimate of uncertainties by means of model data integration techniques. Model
parameter regionalization approaches aim to spatially discriminate ecosystem properties, embodying the
concept that different parameters control different processes hence requiring different extrapolation
strategies. In this perspective, the consideration of a multivariate space for model parameter extrapolation
strategies should rely on spatially distributed variables, supporting the identification of upscaling regions.
This target can be partly achieved by the use of variables derived from remote sensing as model drivers.
These act as weights for the flux variability in the upscaling exercise, by adding information about the spatial
structure in the land surface exchanges. In this perspective, the quantification of the FLUXNET
representativeness and heterogeneity is fundamental to assess the upscaling potential of both model
parameters and observed processes. These issues can be better addressed for geographical regions such
as Europe or North America where FLUXNET, albeit confined to individual sites, is already gaining pseudo-
spatial characteristics.

We integrated eddy covariance measurements, partitioned into primary productivity and ecosystem
respiration into the parameterization of a primary productivity empirical light-use efficiency model combined
with a semi-empirical respiration model. We stratified the measurement sites per ecosystem type and climate
classification. For the integration we adopted a Markov Chain Monte Carlo approach, which permitted us to
estimate a posteriori joint probability functions of model parameters. These were used for extrapolating
uncertainties of the regional carbon budgets for Europe and North America. To do this, the Markov
chains of model parameters from each site/year optimization were sub-sampled in such a way as to maintain
equal representation of sites and to have an adequate number of parameter sets. With these parameterizations,
the models were run with a daily time step at 0.25 degree from 2001 to 2004, driven by the ECMWF
climatology and the Fraction of Absorbed Photosynthetically Active Radiation from the MODerate resolution
Imaging Spectroradiometer (MODIS). Each estimate of primary productivity and ecosystem respiration was
used for estimating uncertainties as interquartile difference. Our mean estimate generally captures the
expected spatial and seasonal patterns of primary productivity and ecosystem respiration as determined from
measurements and the literature. The uncertainties were on average ±29% of the mean value for
primary productivity and ±38% for ecosystem respiration.
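
One possible reading of "uncertainties as interquartile difference" is sketched below: take the flux estimates produced by the sub-sampled parameter sets, compute the interquartile range, and express half of it relative to the mean. The function name, the use of half the range for the ± figure, and the sample values are assumptions; the abstract does not state how the percentages were derived.

```python
import statistics

def interquartile_uncertainty(estimates):
    """Return (mean, interquartile range, half-range as a percentage of the mean)."""
    q1, _, q3 = statistics.quantiles(estimates, n=4)
    mean = statistics.fmean(estimates)
    iqr = q3 - q1
    return mean, iqr, 100.0 * 0.5 * iqr / mean

# Hypothetical flux totals from nine parameter sets (arbitrary units).
mean, iqr, pct = interquartile_uncertainty([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

`statistics.quantiles` uses its default "exclusive" method here; other quantile conventions would give slightly different ranges for small ensembles.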

B33A-0397

Optimizing Large Scale Carbon Fluxes for North America

We combine the SiB3 biosphere model with the RAMS mesoscale meteorology model and associated Lagrangian particle dispersion model (LPDM) and use CO2 observations from an 8-tower network in 2004 to correct a priori ecosystem respiration (ER) and gross primary productivity (GPP) fluxes for a domain consisting of most of North America. Results are presented as weekly corrections to ER and GPP for 2004. A sink is recovered from the inversion but is smaller than expected due to the limited constraint imposed by the sampling footprint of the 8-tower observing network. The sensitivities of the inversion to independently derived boundary conditions, different fossil fuel sources, and various parameters in the inversion are analyzed and discussed.

B33A-0398

Reflex Project: Using Model-Data Fusion to Characterize Confidence in Analyses and Forecasts of Terrestrial C Dynamics

The Regional Flux Estimation Experiment, REFLEX, is a model-data fusion inter-comparison project, aimed at
comparing the strengths and weaknesses of various model-data fusion techniques for estimating carbon
model parameters and predicting carbon fluxes and states. The key question addressed here is: what are the
confidence intervals on (a) model parameters calibrated from eddy covariance (EC) and leaf area index
(LAI) data and (b) on model analyses and predictions of net ecosystem C exchange (NEE) and carbon
stocks? The experiment has an explicit focus on how different algorithms and protocols quantify the
confidence intervals on parameter estimates and model forecasts, given the same model and data. Nine
participants contributed results using Metropolis algorithms, Kalman filters and a genetic algorithm. Both
observed daily NEE data from FluxNet sites and synthetic NEE data, generated by a model, were used to
estimate the parameters and states of a simple C dynamics model. The results of the analyses supported the
hypothesis that parameters linked to fast-response processes that mostly determine net ecosystem
exchange of CO2 (NEE) were well constrained and well characterised. Parameters associated with turnover
of wood and allocation to roots, only indirectly related to NEE, were poorly characterised. There was only
weak agreement on estimations of uncertainty on NEE and its components, photosynthesis and ecosystem
respiration, with some algorithms successfully locating the true values of these fluxes from synthetic
experiments within relatively narrow 90% confidence intervals. This exercise has demonstrated that a range
of techniques exist that can generate useful estimates of parameter probability density functions for C models
from eddy covariance time series data. When these parameter PDFs are propagated to generate estimates
of annual C fluxes there was a wide variation in size of the 90% confidence intervals. However, some
algorithms were able to make effective estimates of annual fluxes within relatively small CIs. In making
predictions of C fluxes most algorithms did not increase the size of confidence intervals relative to analyses.
How CIs grow over forecast periods of multiple years needs to be better understood. Data on slow, large C
pools should be included in assimilation experiments, even with large confidence intervals, because such
data constrain the parameters poorly served by eddy covariance data.
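
To make the step from a Metropolis chain to a 90% interval concrete, here is a minimal random-walk Metropolis sketch for a single parameter. The setup is a toy: a one-pool decay model, synthetic observations, a flat prior, and a known noise level are all assumptions for illustration. It is not the REFLEX model or any participant's algorithm.

```python
import math
import random

def metropolis(log_post, start, step, n_iter, seed=1):
    """Random-walk Metropolis sampler for a single parameter."""
    rng = random.Random(seed)
    x, lp = start, log_post(start)
    chain = []
    for _ in range(n_iter):
        cand = x + rng.gauss(0.0, step)
        lp_cand = log_post(cand)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            x, lp = cand, lp_cand
        chain.append(x)
    return chain

# Synthetic observations of a carbon pool decaying at a known turnover rate.
K_TRUE, C0, SIGMA = 0.1, 100.0, 2.0
data_rng = random.Random(42)
times = list(range(20))
obs = [C0 * math.exp(-K_TRUE * t) + data_rng.gauss(0.0, SIGMA) for t in times]

def log_post(k):
    """Gaussian log-likelihood with a flat prior on 0 < k < 1."""
    if not 0.0 < k < 1.0:
        return -math.inf
    sse = sum((y - C0 * math.exp(-k * t)) ** 2 for t, y in zip(times, obs))
    return -0.5 * sse / SIGMA ** 2

chain = metropolis(log_post, start=0.12, step=0.002, n_iter=20000)
kept = sorted(chain[5000:])                      # discard burn-in, then sort
ci90 = (kept[int(0.05 * len(kept))], kept[int(0.95 * len(kept))])
k_mean = sum(kept) / len(kept)
```

The interval is read directly from the 5th and 95th percentiles of the retained chain; propagating each retained sample through the model would give the corresponding interval on a predicted flux.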

http://www.carbonfusion.org

B33A-0399

Estimating Terrestrial Wood Biomass from Observed Concentrations of Atmospheric Carbon Dioxide

We estimate terrestrial disequilibrium state and wood biomass from observed concentrations of atmospheric CO2 using the CarbonTracker system coupled to the SiBCASA biophysical model. Starting with a priori estimates of carbon flux from the land, ocean, and fossil fuels, CarbonTracker estimates net carbon sources and sinks from 2000 to 2007 that are optimally consistent with observed CO2 concentrations. The a priori terrestrial Net Primary Productivity (NPP) and heterotrophic respiration (Rh) from SiBCASA assume steady state conditions for initial biomass, implying mature ecosystems with no disturbances where growth balances decay and the long-term, net carbon flux is zero. In reality, harvest, fires, and other disturbances reduce available biomass for decay, thus reducing Rh and resulting in a long-term carbon sink. The disequilibrium state is the ratio of Rh estimated from CarbonTracker to the steady state Rh from SiBCASA. Wood is the largest carbon pool in forest ecosystems and the dominant source of dead organic matter to the soil and litter pools. With much faster turnover times, the soil and litter pools reach equilibrium relative to the wood pool long before the wood pool itself reaches equilibrium. We take advantage of this quasi-steady state to estimate the size of the wood pool that will produce an Rh that corresponds to the net carbon sink from CarbonTracker. We then compare this estimated wood biomass to regional maps of observed above ground wood biomass from the US Forest Inventory Analysis.

B33A-0400 INVITED

What is the Resolution of the North American CO2 Observing Network?

The CarbonTracker modeling system was designed in part to evaluate the predictions of prognostic, or
forward, carbon models. In order for inversions to provide useful constraints to guide the development of
forward models, robust confidence intervals on their inverse flux estimates will be required. Unfortunately,
assessing the uncertainty on an inverse flux estimate is quite difficult, and new approaches are needed. To
meet this goal, we present here results of an observational system simulation experiment (OSSE) meant to
assess the ability of CO2 measurement networks to detect flux anomalies in North America during the growing season.
We evaluate the power of both the current observational network and a planned larger network comprising
12 tall towers and 24 weekly aircraft profiles in North America.

http://carbontracker.noaa.gov

B33A-0401

An Improved State-Parameter Estimation of Forest Carbon Dynamics in a Boreal Forest Ecosystem of Interior Alaska Using Data Assimilation

The Smoothed Ensemble Kalman Filter (SEnKF) was used to sequentially assimilate eddy covariance measurements into the Erosion Deposition Carbon Model (EDCM) to simultaneously estimate model state variables and parameters for evaluating EDCM performance in simulating carbon dynamics in a boreal forest ecosystem. The study site, dominated by black spruce, is located near Delta Junction (63°54' N, 145°40' W) in interior Alaska. Field measurements and estimates included daily net ecosystem exchange (NEE), gross primary production (GPP), ecosystem respiration (Re), and net primary production (NPP). To detect possible temporal variability of key parameters of EDCM and assess the uncertainty caused by the parameters, we conducted two simulation experiments: (1) simultaneously estimating state variables and nine parameters (e.g., the maximum ecosystem-specific net production rate and maximum decomposition rates in eight surface and soil carbon pools), and (2) estimating state variables only, with perturbation error on the parameters. In the parameter sampling process, we used an improved smoothing kernel technique to limit over-dispersion and control filter divergence caused by the narrowing of parameter variance. Both experiments showed that SEnKF effectively reduced modeling errors of NEE, NPP, and respiration. However, the simultaneous estimation of state variables and parameters was superior to estimating state variables alone in terms of error statistics because it tracked the temporal variability of the parameters. Furthermore, the smoothing kernel technique effectively reduced the influence of prescribed parameter variance on the identifiability of the parameters.
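
Joint state-parameter estimation with an ensemble Kalman filter can be sketched by letting every ensemble member carry both a state and a parameter. The example below is a toy under stated assumptions: a one-pool carbon model stands in for EDCM, observations are synthetic, and the parameter step uses a standard shrink-and-jitter kernel smoothing, which may differ from the "improved" technique mentioned in the abstract. All names are hypothetical.

```python
import random
import statistics

def enkf_state_parameter(obs, obs_var, carbon_input, n_ens=100, h=0.1, seed=0):
    """Estimate a pool size C and its turnover rate k from noisy observations of C."""
    rng = random.Random(seed)
    C = [rng.gauss(50.0, 5.0) for _ in range(n_ens)]    # prior state ensemble
    k = [rng.gauss(0.08, 0.02) for _ in range(n_ens)]   # prior parameter ensemble
    shrink = (1.0 - h * h) ** 0.5
    for y in obs:
        # Kernel smoothing: shrink parameters towards their mean, then add
        # jitter sized so that the ensemble variance is preserved.
        k_mean, k_sd = statistics.fmean(k), statistics.pstdev(k)
        k = [shrink * ki + (1.0 - shrink) * k_mean + rng.gauss(0.0, h * k_sd)
             for ki in k]
        # Forecast: advance each member with its own parameter value.
        C = [ci + carbon_input - ki * ci for ci, ki in zip(C, k)]
        # Analysis: gains from ensemble (co)variances with the observed state.
        C_mean, k_mean = statistics.fmean(C), statistics.fmean(k)
        var_C = sum((ci - C_mean) ** 2 for ci in C) / (n_ens - 1)
        cov_kC = sum((ki - k_mean) * (ci - C_mean)
                     for ki, ci in zip(k, C)) / (n_ens - 1)
        gain_C = var_C / (var_C + obs_var)
        gain_k = cov_kC / (var_C + obs_var)
        new_C, new_k = [], []
        for ci, ki in zip(C, k):
            # Perturbed-observation update for state and parameter together.
            innov = y + rng.gauss(0.0, obs_var ** 0.5) - ci
            new_C.append(ci + gain_C * innov)
            new_k.append(ki + gain_k * innov)
        C, k = new_C, new_k
    return statistics.fmean(C), statistics.fmean(k)

# Synthetic truth: the pool relaxes from 50 towards carbon_input / true_k = 100.
true_k, carbon_input, obs_var = 0.05, 5.0, 1.0
data_rng = random.Random(7)
c, obs = 50.0, []
for _ in range(100):
    c = c + carbon_input - true_k * c
    obs.append(c + data_rng.gauss(0.0, obs_var ** 0.5))

c_est, k_est = enkf_state_parameter(obs, obs_var, carbon_input)
```

The parameter is never observed directly; it is updated only through its ensemble covariance with the observed state, which is why parameters weakly coupled to the observations remain poorly constrained.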