收藏切换
A two-stage inflation method in parameter estimation to compensate for constant parameter evolution in Community Earth System Model
收藏切换
PDF
Zheqi Shen1, 2, 3, Youmin Tang1, 2, 3, 4, *
Acta Oceanologica Sinica | 2022, 41(2) : 91 - 102
Less
收藏切换
Acta Oceanologica Sinica | 2022, 41(2): 91-102
Ocean Data Assimilation
A two-stage inflation method in parameter estimation to compensate for constant parameter evolution in Community Earth System Model
Full
Zheqi Shen1, 2, 3, Youmin Tang1, 2, 3, 4, *
Affiliations
  • 1 Key Laboratory of Marine Hazards Forecasting of Ministry of Natural Resources, Hohai University, Nanjing 210098, China
  • 2 College of Oceanography, Hohai University, Nanjing 210098, China
  • 3 Southern Marine Science and Engineering Laboratory (Zhuhai), Zhuhai 519080, China
  • 4 Environmental Science and Engineering, University of Northern British Columbia, Prince George V2N 4Z9, Canada
Published: 2022-02-25 doi: 10.1007/s13131-021-1856-5
Outline
收藏切换

Parameter estimation is defined as the process to adjust or optimize the model parameter using observations. A long-term problem in ensemble-based parameter estimation methods is that the parameters are assumed to be constant during model integration. This assumption will cause underestimation of parameter ensemble spread, such that the parameter ensemble tends to collapse before an optimal solution is found. In this work, a two-stage inflation method is developed for parameter estimation, which can address the collapse of parameter ensemble due to the constant evolution of parameters. In the first stage, adaptive inflation is applied to the augmented states, in which the global scalar parameter is transformed to fields with spatial dependence. In the second stage, extra multiplicative inflation is used to inflate the scalar parameter ensemble to compensate for constant parameter evolution, where the inflation factor is determined according to the spread growth ratio of model states. The observation system simulation experiment with Community Earth System Model (CESM) shows that the second stage of the inflation scheme plays a crucial role in successful parameter estimation. With proper multiplicative inflation factors, the parameter estimation can effectively reduce the parameter biases, providing more accurate analyses.

parameter estimation  /  data assimilation  /  inflation  /  CESM  /  EnKF
Zheqi Shen, Youmin Tang. A two-stage inflation method in parameter estimation to compensate for constant parameter evolution in Community Earth System Model[J]. Acta Oceanologica Sinica, 2022 , 41 (2) : 91 -102 . DOI: 10.1007/s13131-021-1856-5
In oceanic and atmospheric models, physical processes and dynamical motions with spatial scales smaller than the model resolution cannot be fully resolved by the model equations. Contributions from them are usually represented through subgrid-scale schemes or parametrizations in numerical models. The parametrizations formulate the unsolved small-scale effects as a function of the resolved model variables, which can be derived from either the physical laws or the empirical relations. It inevitably involves a certain number of parameters, whose values are intrinsically uncertain. The uncertainty in some parameters may have great impacts on the model performance (Tong and Xue, 2008; Furue et al., 2015).
Parameter estimation (PE) is defined as the process to adjust or optimize the model parameter using observations (Zhang et al., 2020). PE has shown great potential to reduce model biases and improve the model simulation and prediction (Wu et al., 2016; Zhang et al., 2016). Most parameter estimation methods are developed based on the data assimilation methods, which are initially designed for generating estimates of model states (Evensen et al., 1998). Particularly, the ensemble-based methods, such as ensemble Kalman filters (Evensen, 2003) and particle filters (Liu and West, 2001), have been successfully applied for PE (Ruiz et al., 2013). In these ensemble-based methods, the model state can be augmented with a set of poorly known parameters ( Banks, 1992; Anderson, 2001). Updating the augmented state with observations can therefore estimate the model state and parameter simultaneously ( Kivman, 2003; Annan and Hargreaves, 2004; Annan et al., 2005). In practice, PE is more difficult than state estimation (SE) because the connection between parameters and observations is indirect, often nonlinear. It is even more challenging if one estimates the parameters of a general circulation model (GCM).
In GCMs, some parameters arise from the simplification of the underlying physical processes, which may take a globally uniform value. It is found that assimilating numerous data to estimate one single scalar parameter would over-adjust its value, bring unexpected errors. Also, it is impractical to use observational information that has strong geographic dependence to update a global parameter. Aksoy et al. (2006) propose a method that first transforms a single-value parameter into a two-dimensional field and then updates the field spatially in the analysis step. A spatial average (SA) of the entire spatially varying parameter field is then used to recover the globally uniform parameter value for model integration. Besides, the geographic-dependent parameter optimization (GPO) is also developed to allow model parameters to vary geographically after each analysis in particular models (Wu et al., 2013). Liu et al. (2014a) proposed the adaptive spatial average (ASA) method for the recovery of the spatially uniform parameter value. It shows that ASA could be more effective than either SA or GPO for parameter estimation in a coupled GCM (Liu et al., 2014b).
One other issue of PE in GCMs is that parameters are usually assumed to be constant during the model integration so that one parameter only changes its value in the data assimilation step. As a result, the parameter ensemble spread will decrease more rapidly than the state ensemble spread, and the parameter ensemble tends to collapse before an optimal parameter value is found (Santitissadeekorn and Jones, 2015). Gordon et al. (1993) suggest an approach by adding small random disturbances to state members between time steps to deal with the sample degeneracy/attrition for state variables. Later on, this idea has been extrapolated to PE by adding small perturbations to parameters. Liu and West (2001) point out that the artificial parameter evolution by adding artificial perturbations would make the posteriors too diffusive relative to the theoretical posteriors for the fixed parameters. So, they proposed a new artificial model for the parameter by kernel smoothing. These methods are designed for particle filters, and they have been shown effective with moderate dimensional models. However, due to the severe filter degeneracy problem in particle filters, it seems impractical to use these methods to perform joint state/parameter estimation with GCMs.
Adding random perturbations between model steps in some sense can be regarded as additive inflation. For the PE with ensemble Kalman filter, Aksoy et al. (2006) propose an approach called the conditional covariance inflation (CCI) to apply multiplicative inflation to the parameters. Liu et al. (2014a, b) also use the CCI for parameter inflation for PE in coupled GCMs (CGCMs). The CCI is designed to inflate the parameter ensemble spread back to a predefined threshold value when it is smaller than the threshold. However, their experiments do not use covariance inflation for state variables, which might lead to the degeneracy of the state ensemble. Besides, the CCI will not take effect at the first few parameter estimation steps if the parameter spread is larger than the threshold. This will bring incorrect parameter increments in the early stages, and cause the parameter estimation less effective. Besides, Koyama and Watanabe (2010) apply multiplicative inflation to both state and parameter ensembles, but with different inflation factors. Their method is applied to the PE with an ideal Lorenz’ 96 model, and the inflation factors are tuned empirically. However, more details need to be developed to make it available with realistic applications.
In this work, a two-stage inflation method is developed for PE, which can deal with the collapse of parameter ensemble due to the constant evolution of parameters in a CGCM, i.e., the National Center for Atmospheric Research Community Earth System Model (NCAR-CESM). Specifically, the adaptive inflation (Anderson, 2009) is applied to the augmented state, in which the parameters are transformed to fields and attached to the state vectors; and then multiplicative inflation is additionally used to inflate the spatially averaged parameters before they are used for model integration. The inflation factor in the second stage is determined according to the spread growth ratio (SGR) of model states during the model integration, with which the inflation can remedy the underestimation of parameter ensemble spread due to constant parameter evolution.
This paper is organized as follows. The methods are described in Section 2. The model and data are introduced in Section 3. Section 4 presents the results of numerical experiments. Summary and discussion are finally drawn in Section 5.
In this work, the ensemble adjustment Kalman filter (EAKF; Anderson, 2001) is used to perform state and parameter estimations. The adaptive spatial average (ASA) method by Liu et al. (2014a) is also used to transform a scalar parameter to a two-dimensional field in the analysis step and turn it back before model integration. The details of EAKF and ASA are introduced as follows.
If $ x $ denotes the state vector, $ {y}^{o} $ denotes a single scalar observation with variance $ ({{\sigma }^{o})}^{2} $ and $ h $ is the forward operator; the EAKF algorithm first applies the operator $ h $ to each ensemble member of the state, producing the ensemble of prior estimates for the observation, namely,
$ {{y}}\,_{{n}}^{{p}}=h\left({{x}}\,_{{n}}^{{p}}\right),{n}=1,2,\cdots ,{N} , $
here, superscript $ {p} $ indicates prior and subscript $ {n} $ is the index of the ensemble member. And the prior mean $ \overline{{y}\,^{p}} $ and prior variance $ {{(\sigma }_{y}^{p})}^{2} $ can be computed directly from the ensemble members of $ {y}\,_{n}^{p} $.
EAKF update the prior variance $ {{(\sigma }_{y}^{p})}^{2} $ by the observation variance $ ({{\sigma }^{o})}^{2} $ to give the posterior variance and mean value (indicated by superscript $ u $) as
$ {\left({\sigma }_{y}^{u}\right)}^{2}=\left[{\left({\sigma }_{y}^{p}\right)}^{-2}+\left({{\sigma }^{o}}\right)^{-2}\right]^{-1} , $
$ {\overline{{y}^{u}}}={{(\sigma }_{y}^{u})}^{2}\left[\frac{\overline{{y}^{p}}}{{{(\sigma }_{y}^{p})}^{2}}+\frac{{y}^{o}}{({{\sigma }^{o})}^{2}} \right].$
The posterior value in observational space can be computed for each member by shifting the mean and linearly contracting the members to make the sample variance exactly $ {{(\sigma }_{y}^{u})}^{2} $, i.e.,
$ {y}_{n}^{u}=\left(\frac{{\sigma }_{y}^{u}}{{\sigma }_{y}^{p}}\right)({y}_{n}^{p}-{\overline{{y}^{p}}})+{\overline{{y}^{u}}},\;n=1,2,\cdots ,N . $
The increment of each observation can update different model variables at different locations by regression using the following equation,
$ {x}_{m,n}^{u}={x}_{m,n}^{p}+\rho \frac{{\Sigma }_{{x}_{m},y}}{{\left({\sigma }_{y}^{p}\right)}^{2}}({y}_{n}^{u}-{y}_{n}^{p})={x}_{m,n}^{p}+\rho {r}_{{x}_{m},y}\frac{{\sigma }_{{x}_{m}}}{{\sigma }_{y}^{p}}({y}_{n}^{u}-{y}_{n}^{p}) , $
where $ {x}_{m,n}^{p} $ denotes the prior value of a variable at a particular location indicated by the subscript $ m $ for the $ n $th member. $ {\Sigma }_{{x}_{m},y}={r}_{{x}_{m},y}{\sigma }_{{x}_{m}}^{p}{\sigma }_{y}^{p} $ is the covariance of ${{x}}_{{m}}$ and $ {y}^{p} $, and $ {r}_{{x}_{m},y} $ is the correlation coefficient. $ \rho $ is the localization factor (Gaspari and Cohn, 1999) which restricts the computation adjacent to the observation while removing the spurious long-distant relations due to insufficient ensemble size.
To estimate a global scalar parameter member $ {\eta }_{n} $, one should add a spatial dependence to the parameter. The prior value of the parameter field at each location is assumed to be uniform, i.e., $ {\theta }_{m',n}={\eta }_{n} $, where $ m' $ is the index of a parameter in the augmented state. And the observations are assimilated with the same covariance localization strategy as
$ {\theta }_{m',n}^{u}={\theta }_{m',n}^{p}+\rho {r}_{{\theta }_{m'},y}\frac{{\sigma }_{{\theta }_{m'}}^{p}}{{\sigma }_{y}^{p}}({y}_{n}^{u}-{y}_{n}^{p}) . $
Therefore, the posterior values of the parameter are spatial varying. For simplicity, it is assumed that the parameter is extended to a two-dimensional variable in the sea surface in the analysis step.
To show the concept of ASA, a simple scenario in which $\rho = 1$ is considered. Using Eqs (4) and (6) , it can be derived that the updated variance for the parameter at the location indicated by m is
$ ({\sigma }_{{\theta }_{m'}}^{u}{)}^{2}=({\sigma }_{{\theta }_{m'}}^{p}{)}^{2}+{r}_{{\theta }_{m'},y}^{2}{\left({\sigma }_{{\theta }_{m'}}^{p}\right)}^{2}\left[{\left(\frac{{\sigma }_{y}^{u}}{{\sigma }_{y}^{p}}\right)}^{2}-1\right] . $
If substitute Eq. (2) into Eq. (7), it can be written as
$ {r}_{{\theta }_{m'},y}^{2}\frac{{\left({\sigma }_{y}^{p}\right)}^{2}}{{\left({\sigma }_{y}^{p}\right)}^{2}+({{\sigma }^{o})}^{2}}=1-\left(\frac{{\sigma }_{{\theta }_{m'}}^{u}}{{\sigma }_{{\theta }_{m'}}^{p}}\right)^{2}. $
Denote $ h=\dfrac{{\sigma }_{{\theta }_{m'}}^{u}}{{\sigma }_{{\theta }_{m'}}^{p}} $, Eq. (8) strongly indicates that $ h $ can be an index to select the good values from a posterior field. Since $ {\sigma }_{{\theta }_{m'}}^{u} $ always smaller than $ {\sigma }_{{\theta }_{m'}}^{p} $, a smaller $ h $ makes both sides of Eq. (8) closer to 1, representing that either the correlation of observation and parameter is stronger or the ratio of prior variance to the total variance is larger at location $ m' $. It can be sure that those locations correspond to large $ h $ are good values that have the most significant observation impacts, thus averaging them can accelerate PE. Liu et al. (2014a) use a threshold of $ h $ for spatial average. The spatial varying posterior parameter field is averaged over the regions in which $ h $ are larger than the threshold. They also make the threshold adaptive to ensure a considerable amount of good values are selected and averaged.
Covariance inflation is an ad-hoc method to prevent the ensemble spread from narrowing due to model errors. In this work, the time-varying adaptive inflation method developed by Anderson (2009) is applied to the augmented state ensemble. Inflation is applied to the prior ensemble just before forward operators are computed in Eq. (1) as
$ {x}_{m,n}^{inf}=\sqrt{{\text{λ}}_{m}}({x}_{m,n}-\overline{{x}_{m}})+\overline{{x}_{m}} , $
$ {\theta }_{m',n}^{inf}=\sqrt{{\text{λ} }_{m'}}({\theta }_{m',n}-\overline{{\theta }_{m'}})+\overline{{\theta }_{m'}} .$
The inflation factors $\text{λ}$ vary in different locations and with different variables/parameters, they also evolve adaptively with time. For different locations or variables, the correlation between any two elements of the inflation is assumed to be the same as the prior sample correlation between the corresponding elements of the model state vector. The inflation is also updated according to the observation by applying Bayes theorem. The details of the method can be found in Anderson (2009). However, since the two-dimensional parameter fields ${\theta }_{m',n}$ are averaged to a scalar $ {\eta }_{n} $ before model integration, the inflation factors ${\text{λ} }_{m'}$ for parameters are not updated from the previous step, they essentially do not evolve with time.
As mentioned, the state ensemble spread will grow during the model integration, but the parameter ensemble spread remains unchanged due to the constant parameter evolution model, i.e., $\mathrm{\eta }\left({{t}}+1\right)=\mathrm{\eta }\left({{t}}\right)$. It is not a surprise that $ {\sigma }_{{\theta }_{m'}} $ will be relatively under-inflated comparing to $ {\sigma }_{{x}_{m}} $ if only Eqs (9) and (10) are used for covariance inflation. In that situation, the uncertainty of the parameter is under-estimated, such that the observational increments represented by $ ({y}_{n}^{u}-{y}_{n}^{p}) $ can not effectively correct the parameters. To deal with that, extra inflation is used for the scalar global parameters $ \eta $ as follows,
$ {\eta }_{n}=\overline{\eta }+\alpha ({\eta }_{n}-\overline{\eta }),n=1,2,\cdots ,N , $
where $ \overline{\eta } $ is the ensemble mean of the parameter. The inflation factor $ \mathrm{\alpha } $ will be computed and compared in the next section.
With Eqs (10) and (11), a two-stage inflation method is applied to the parameter ensemble to increase the ensemble spread, whereas only one stage inflation is used for the state ensemble. Eq. (11) is applied to the global scalar parameters, which are not involved in the PE analysis. This extra inflation is employed to compensate for constant parameter evolution, which is crucially important for GCMs.
The NCAR-CESM (version 1.2.1) is used in this study. This model is a fully coupled earth system model, whose atmospheric and oceanic components are the Community Atmosphere Model (CAM) and the Parallel Ocean Program version 2 (POP2; Smith et al., 2010) respectively. The resolution of the atmosphere component is horizontally ${0.9}^{\circ } \times {1.25}^{\circ }$, with 26 vertical levels for the CAM model. The ocean component is integrated on a nominal resolution of $ {1}^{\circ } $. Enhanced meridional resolution of $ {0.5}^{\circ } $ is employed in the equatorial region, and the ocean model is run with 60 vertical levels. The Data Assimilation Research Testbed (DART, Karspeck et al., 2018) is employed to assimilate ocean observations into the CESM model, in which the adaptive inflation method (Anderson, 2009) is available for model states and augmented parameters.
It is well acknowledged that one of the largest uncertainties in ocean general circulation models (OGCMs) is related to the parameterization of vertical mixing processes; poorly specified schemes can lead to a large bias in SST simulations (Zhu and Zhang, 2018). The K-Profile Parameterization (KPP), (Large et al., 1994) is used for vertical mixing parameterizations in the POP2 model. The background diffusivity, representing the integrated effects of diapycnal mixing processes in the ocean interior, is assigned with a constant value magnitude of ${O}\left({10}^{-5}{\mathrm{m}}^{2}/{\mathrm{s}}\right)$ in this model. Also, the diapycnal diffusivity is reduced near the equator, with a magnitude of ${O}\left({10}^{-6}{\mathrm{m}}^{2}/{\mathrm{s}}\right)$ (Gregg, 2003). And the Banda Sea diffusivity is modified with a magnitude of ${O}\left({10}^{-4}{\mathrm{m}}^{2}/{\mathrm{s}}\right)$ (Ffield and Gordon, 1992).
In this work, the ensemble data assimilation methods are used to estimate the background vertical diffusivity coefficient (call bckgrnd_vdc hereafter) in CESM (Smith et al., 2010), the temperature and salinity profiles are assimilated to reduce the errors and uncertainties in both model states and parameters.
To verify the performance of state and parameter estimation methods, the observing system simulation experiment (OSSE) is conducted in this study. In OSSE, the reference states (which are regard as the true states) are derived from the “truth run”, in which the reference initial conditions and parameters are used. Observations are selected from the true states with a prescribed observation network. The parameter estimation experiment is conducted with biased bckgrnd_vdc, and the synthetic ocean temperature (T) and salinity (S) observations are assimilated to estimate the parameter. The procedure of the OSSE can also be seen from the flow-diagram in Fig. 1, where the reference bckgrnd_vdc = 0.16 $\mathrm{c}{\mathrm{m}}^{2}/{\mathrm{s}}$, and the initial guess of it is 0.10 $\mathrm{c}{\mathrm{m}}^{2}/{\mathrm{s}}$ for both control run and state estimation experiment.
To better simulate the reality, the TS-profile locations and error variances are adopted from the EN4 quality controlled ocean data (https://www.metoffice.gov.uk/hadobs/en4/download-en4-2-1.html) over the years 2010 and 2011. Since the assimilation time window is 10 days, all temperature and salinity data over the period are assimilated at the same time. To prevent the over-fitting due to excessive observations, those profile data are typically merged into $ {1}^{\circ }\times {1}^{\circ } $ grids with at most 42 vertical levels (Good et al., 2013) in assimilation, in which the mean value and variance of the data in each grid block are regarded as its observational value and error variance respectively. In this experiment, the synthetic observations are created using the locations and error variances of the EN4 profiles. As an example, Fig. 2 shows the locations of the period January 1−10, 2010, which vary at each assimilation step.
To perform the SE and PE, an ensemble size of 20 is used in all of the experiments. The initial conditions for the experiment is generated from a 150-years simulation using the built-in initial condition of the coupled model, which could ensure that the initial conditions were adequately spun-up. The pseudo-random fields (Evensen, 2003) with 20 layers and with a magnitude of $ {{10}^{-2}}$°C are added to the temperature of each ensemble member. And then the model is integrated for a spinup period of 1 year to ensure the ensemble spread develop sufficiently.
One of the 20 members is selected as the initial condition for the truth run. If these ensemble members are used for data assimilation experiments, the uncertainty of initial states would be relatively small, such that the SE can soon reach the quasi-equilibrium, a situation that the uncertainty of model states is sufficiently constrained by observations. As Zhang et al. (2012) suggests, only when the SE reaches the quasi-equilibrium, the covariances between parameters and model states can be signal dominant, and thereby the parameter estimation could be activated to effectively correct the parameters. So this data assimilation enhancive parameter correction (DAEPC) strategy is adopted in the experiment to start with a pure SE and activate PE when the assimilated states reach quasi-equilibrium.
To show the parameter bckgrnd_vdc do have an impact on the model simulation. The parameter sensitivity experiment is conducted at the first stage. The models are run with the same initial conditions for 1 year, but the parameters are slightly perturbed from the initial guess 0.10 with 20 random values with a standard deviation of 0.08. As an example, the SST and SSS spreads of the 1st, 4th, 7th, and 10th month are shown in Fig. 3. The perturbation of parameters leads to a significant growth of the spread for both temperature and salinity. The most significant growth of SST spread appears in the northern Pacific, whereas the most significant increase of SSS spread appears in the western tropical Pacific. Deeper layers’ temperature and salinity also have similar patterns of spreads (not shown), which implies that assimilating the TS profiles can potentially correct the parameter bias.
The data assimilation is performed every 10 days, and the duration is two years. The temperature and salinity observations are created from the EN4 profile data of the years 2010 and 2011. The horizontal localization strategy is applied with a cut-off radius of about 1 200 km distance. Meanwhile, the vertical localization is also applied with a cut-off distance of approximately 300 m. The DAEPC method (Zhang et al., 2012) is adopted to perform pure state estimation for 6 months and activate the parameter estimation at the beginning of the 7th month. For both state and parameter estimation, the spatial-varying adaptive inflation proposed by Anderson (2009) is used, such that the inflation value for each model variable at each model grid is different. The root-mean-squared-errors (RMSE) of prior and posterior innovations as $ |{y}^{o}-H(x\left)\right| $ are shown in Fig. 4 for different levels. It is shown that the errors decrease rapidly at the first few data assimilation steps for both temperature and salinity at each level. After about one or two months, the error reduction from data assimilation tends to a saturated value such that each analysis step reduces a similar amount of errors. This is a clear indication that the state estimation reaches the quasi-equilibrium state and parameter estimation could be activated. It is also noteworthy that the prior and posterior RMSEs are very close to each other at a depth of 1 000 m. This is because the model errors are small and the uncertainties of observations are large at depth.
Comparing to the control run without data assimilation, the state estimation significantly reduces the errors of each model state. Figure 5 has shown the RMSE of SST, SSS, and SSH from the control run and state estimation of the first 6 months. It can be seen that the errors of not only the temperature and salinity but also SSH are reduced by the multivariate data assimilation with EAKF.
For parameter estimation, a spatial uniform parameter field is augmented to the state vector, and updated by incorporating the same observations. The posterior parameter field is spatial-varying, such that the averaged parameter of a specified region is used to integrate the model for the next cycle. The so-called adaptive spatial average method is employed to select the location with good values in which the ratio of posterior to prior standard deviation $ h $ exceeds a threshold value. In this experiment, the threshold value is adaptive to ensure that at least 10 000 model grids out of about 80 000 ocean grids are used to compute the average. In this experiment, the adaptive spatial averaging is performed as follows:
i) Compute $h=\sigma _{\theta _{{\rm m}'}}^{{u}}/\sigma _{\theta'_{\rm{m}}}^{{p}}$ for each $\mathrm{m}'$.
ii) Set the initial threshold value $ {h}_{0}=0.68 $.
iii) Count the number of grid points which satisfies $ h > {h}_{0} $.
iv) If the number of grid points are greater than 10 000 or $ {h}_{0}=0.98 $, exit the procedure; otherwise, add $ {h}_{0} $ by 0.05.
v) Repeat steps iii) and iv) until the procedure ends.
Figure 6 shows an example of the spatial averaging scheme for the first parameter estimation step. Figure 6a displays the ${h}$ values for each model grid, computed from the ensemble members. Since h measures the proportions of posterior error standard deviations to prior error standard deviations of the parameter, its value is smaller in areas where observational errors are more sensitive to the parameters. The solid lines indicates some threshold values (e.g., $ {h}_{0}=0.73 $, 0.83). It can be seen that if $ {h}_{0}=0.83 $, a large area with more than 10 000 model grids are included, such that the adaptive procedure could be stopped. Figures 6b–d show the two-dimensional posterior parameter fields $ \theta $ for the first 3 ensemble members, respectively. The parameters in those areas enclosed by solid lines are averaged to compute $ \mathrm{\eta } $, whose ensemble is then used in the model integration. Although the members have different posterior patterns, they are averaged at the same region in the same parameter estimation step.
We first use one-stage inflation to perform PE, i.e., only Eqs (9) and (10) are used to inflate the state and parameter ensemble, respectively. The inflation factors are spatial-varying. The first three rows of Fig. 7 show the inflation factors for temperature and salinity at different depths after the first analysis step with parameter estimation. The inflation factors at most locations are slightly larger than 1. However, in particular regions, such as the tropical Pacific and the Indian Ocean, the inflation factors have values larger than 2. If we compare that with Fig. 6a, it can be found that those regions with large inflation factors are mostly with large observational impacts. Moreover, the inflation factors are smaller for deeper temperature and salinity variables, because the assimilation is insignificant in-depth, as Fig. 4 reveals.
The inflation factors for the two-dimensional parameter field are also computed during the assimilation. Figure 7 also shows the inflation factors for the parameter field, it is shown that the distribution of them is similar to that of SST or SSS. However, although the inflation for the parameter field is updated by the adaptive inflation algorithm in the analysis step, it is not used in the model integration due to the ASA method.
Figure 8a shows the PE results using one-stage inflation. The black dashed line is the initial guess of the bckgrnd_vdc, which is also the mean value of the initial parameter ensemble. The red solid line represents the mean value of the parameter ensemble, and the red dotted lines represent the mean value plus/minus one standard deviation of the parameter ensemble. The PE experiment uses 20 perturbed parameters (with standard deviation 0.08) to integrate the model, conduct a spinup process for pure state estimation for 6 months, therefore the parameter ensemble remains unchanged for the first 6 months. As soon as the parameter estimation is activated in the 7th month, the variance of the parameter ensemble decreases rapidly. The CCI technique (Aksoy et al., 2006) is applied with a standard deviation limit set at 1/5 of the initial spread 0.08. It can be seen from Fig. 8b that the parameter ensemble spread shrinks to the standard deviation limit 0.016 with only 4 PE steps. And the parameter error stops decreasing after that. In this case, it is shown that the parameter fails to converge to the true value, and the error of the parameter ensemble mean is over 30%. The main reason for this failure is the under-estimation of the parameter ensemble spread due to the lack of spread growth during the model integration. It can be expected that a relatively small spread would weaken the impact of the observational increment, making insufficient correction to the parameter ensembles, as the red solid line in Fig. 8a indicates.
To compensate for the parameter evolution model, a two-stage inflation method by Eqs (10) and (11) is used to inflate parameter ensemble, while the one-stage inflation by Eq. (9) is used to inflate model state ensembles.
The inflation factor $ \mathrm{\alpha } $ in Eq. (11) should be determined before activating the parameter estimation. Considering the extra parameter inflation is employed to compensate for constant parameter evolution, the value of $ \mathrm{\alpha } $ should be given according to the state ensemble spread growth ratio.
Figure 9 measures the spread growth ratio (call SGR hereafter) of temperature and salinity variables at each model integration. The duration of model integration is 10 days; therefore the global mean spreads of temperature and salinity are increased by different amounts at different depths. Take temperature at 5 m depth as an example, the global mean spread grows from about 0.15 to 0.25 in the first cycle, and the corresponding SGR is about 1.6 (Fig. 9a). However, the spread itself and the SGR are smaller at deeper layers. Figure 9g shows the SGR with different depths, it becomes very close to 1 when the depth is larger than 500 m. Comparing to temperature, the SGR of salinity is relatively smaller. For instance, Fig. 9h shows that the SGR for salinity barely exceeds the value 1.3.
According to Fig. 9, to inflate the parameter ensemble such that its spread will be comparable to those of state variables, the value of $ \mathrm{\alpha } $ should be neither too large nor too small. The median value of the SGR of temperature and salinity can be used for the extra parameter inflation factor. To simplify the discussion, $ \mathrm{\alpha } $ is kept constant until it reaches the standard deviation limit to turn on CCI. Four different $ \mathrm{\alpha } $ values as 1.1 (median value of salinity SGR), 1.3 (median value of temperature SGR), 1.2 (mean of both median values), and 1.5 (the maximum SGR) are used and compared.
Figure 10 shows the PE results with $ \mathrm{\alpha }=1.1,\;1.2,\;1.3 $ and $ 1.5 $ for the extra parameter inflation respectively. For the first three cases, it takes about 6, 8 and 12 data assimilation steps respectively for the parameter spread to shrink to the standard deviation limit. Take $ \mathrm{\alpha }=1.1 $ for example, even though the parameter spread maintains its value after it reaches the threshold value 0.016, the parameter estimation can still correct the parameter bias effectively, such that the error decreases until it converges to the true value in the 7th month of the 2nd year. This is different from the case $ \mathrm{\alpha }=1.0 $ in Fig. 8a, in which the parameter ensemble spread can not match the state ensemble spread. The cases $ \mathrm{\alpha }=1.2 $ and 1.3 show similar results. By checking the ensemble members, it is found that the ensembles in cases with $ \mathrm{\alpha }=1.1,\;1.2,\;1.3$ are more reliable than the case with $ \mathrm{\alpha }=1.0 $ when the ensemble spread shrinks to its threshold value, therefore the filter can further correct the parameters.
Figure 10d shows the consequence of over-inflating in PE. If the extra parameter inflation factor is too large, e.g., $ \mathrm{\alpha }=1.5 $, it will inflate the parameter ensemble too much, such that the spread has no chance to approach the std limit of CCI. The uncertainty of the parameter is over-estimated, such that the observational increments bring excessive adjustments to the parameter. As shown by the figure, the parameter reduces its bias at the first stage, but it does not converge to true value even though more data assimilation cycles are involved.
It concludes that either $ \mathrm{\alpha }=1.1,\mathrm{ }1.2 $ or $ \mathrm{\alpha }=1.3 $ provides successful PE that corrects the biased bckgrnd_vdc parameter. The pure SE experiment is also conducted, in which each member uses the same biased parameter bckgrnd_vdc = 0.10 cm2/s and the same initial ensemble with PE experiment is used. Therefore one can compare the quality of analyses from SE and PE with two-stage inflation ( $ \mathrm{\alpha }=1.1 $). As indicated by Fig. 10a, the parameter approaches the true value in the last 6 months of the second year. So the RMSEs of temperature and salinity over a specified region for those months are compared. As Fig. 3 shown, the most sensitive regions for temperature and salinity to the parameter are different. Therefore, the temperature RMSEs are computed between the equator and 60°N, while the salinity RMSEs are computed between 30°S and 30°N.
Figure 11 shows the temperature and salinity RMSE for each month using the data assimilation results from PE and SE, respectively. The monthly mean temperature and salinity are subtracted from the reference states and averaged in each layer. In each month, the PE result seems to have smaller errors than the SE result for both temperature and salinity, the advantage is more obvious in the mixed layer. With more details, Fig. 12 shows the absolute errors of the SST in the 9th month and SSS in the 7th month respectively. In the Northwest Pacific, in which SST is sensitive to the uncertainty of bckgrnd_vdc, the SST errors are significantly reduced with adjusted background vertical diffusivity coefficient. For salinity, similar error reductions can be observed in the equatorial western Pacific. It can be concluded that PE can reduce the errors introduced by the parameter bias effectively and provide analyses with high quality.
The PE problem is extremely complicated since it is nonlinear even if the dynamical model itself is linear, and it is in many cases difficult to find a proper solution. It is more challenging when dealing with CGCMs. The state augmentation technique, in which the state vectors are augmented with the poorly known parameters, provides a general framework to handle the parameter estimation problem using data assimilation methods. To apply the coupled PE in CGCMs, several issues are reported in previous studies, and many new techniques are developed in recent years (Zhang et al., 2020). For example, Zhang et al. (2012) introduced the DAEPC to enhance the quality of state-parameter covariance. Liu et al. (2014a, b) developed the ASA to update the single-value parameter with localized observations. Askoy et al. (2006) use the CCI to prevent parameter variance narrowing.
This paper focuses on the constant evolution model for parameters during model integration. It is one of the most widely used assumptions in PE that parameter remains unchanged during the state model integration. This assumption is introduced in the early stage of PE using the data assimilation method ( Evensen et al., 1998; Kivman, 2003; Annan and Hargreaves, 2004; Annan et al., 2005), and works well when the model is simple. As the state model becomes more and more complicated, this assumption brings several issues. In many cases, the parameter ensemble spread decreases with time and goes to zero eventually. To avoid that, Aksoy et al. (2006) propose the CCI in which the parameter ensemble spread is not allowed to be smaller than a prescribed threshold. However, the result in Fig. 8 shows that CCI is not able to avoid the divergence of PE in many circumstances. The lack of spread growth in parameter ensembles will make them under-inflated relative to the model state ensembles, and the underestimation of parameters uncertainty will weaken the impact of observations on correcting parameters.
Koyama and Watanabe (2010) use multiplicative inflation in PE with simple models, with an inflation factor for the parameter ensemble different from that used for the state variables. The current work extends their idea, to make it feasible for PE with CGCMs. Since the global scalar parameters are transformed to fields in the analysis step and turn back into scalar before model integration, a two-stage inflation method for parameters is designed. The first stage used Eq. (10) to inflate the parameter field, while the second stage used Eq. (11) to inflate the spatial-averaged scalar parameter. The second stage is extra inflation to compensate for constant parameter evolution during model integration, so its inflation factor can be determined according to the spread growth ratio of the observed variables, as Fig. 9 shown.
The PE results in Fig. 10 show that the two-stage inflation method with proper inflation factors can help the parameter to converge to its optimal value. Moreover, Figs 11 and 12 have compared the PE and SE results, revealing that the bias due to uncertain parameters is significantly reduced by PE, and the quality of PE analyses is better.
However, this paper use OSSE to verify the two-stage inflation method, and only a single parameter estimation is performed. The situation will be more complicated when more parameters are estimated simultaneously. The KPP parameterization in CESM also involves more parameters than the background vertical diffusion coefficient (Jochum, 2009), more attention should be paid to them. Moreover, it is even more challenging to estimate CGCM parameters using real observations (Zhao et al., 2019). As Zhang et al. (2020) reviewed, coupled PE (CPE) is still in the research stage, and the application of CPE in CGCMs with real observations to improve coupled model reanalysis and prediction still has a long way to go. This goal will be the focus of our future work.
  • The National Key Research and Development Program under contract No. 2017YFA0604202; the Fundamental Research Funds for the Central Universities under contract No. B210201022; the National Natural Science Foundation of China under contract Nos 42176003, 41690124, 41806032 and 41806038.
Aksoy A, Zhang Fuqing, Nielsen-Gammon J W. 2006. Ensemble-based simultaneous state and parameter estimation with MM5. Geophysical Research Letters, 33(12): L12801, doi: 10.1029/2006GL026186
Anderson J L. 2001. An ensemble adjustment Kalman filter for data assimilation. Monthly Weather Review, 129(12): 2884–2903, doi: 10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2
Anderson J L. 2009. Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus A, 61(1): 72–83, doi: 10.1111/j.1600-0870.2008.00361.x
Annan J D, Hargreaves J C. 2004. Efficient parameter estimation for a highly chaotic system. Tellus A, 56(5): 520–526, doi: 10.3402/tellusa.v56i5.14438
Annan J D, Hargreaves J C, Edwards N R, et al. 2005. Parameter estimation in an intermediate complexity Earth system model using an ensemble Kalman filter. Ocean Modelling, 8(1–2): 135–154
Banks H T. 1992. Computational issues in parameter estimation and feedback control problems for partial differential equation systems. Physica D: Nonlinear Phenomena, 60(1–4): 226–238.
Evensen G. 2003. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics, 53(4): 343–367, doi: 10.1007/s10236-003-0036-9
Evensen G, Dee D P, Schröter J. 1998. Parameter estimation in dynamical models. In: Chassignet E P, Verron J, eds. Ocean Modeling and Parameterization. Dordrech, the Netherlands: Springer, 373–398
Ffield A, Gordon A L. 1992. Vertical mixing in the Indonesian thermocline. Journal of Physical Oceanography, 22(2): 184–195, doi: 10.1175/1520-0485(1992)022<0184:VMITIT>2.0.CO;2
Furue R, Jia Yanli, McCreary J P, et al. 2015. Impacts of regional mixing on the temperature structure of the equatorial Pacific Ocean. Part 1: vertically uniform vertical diffusion. Ocean Modelling, 91: 91–111, doi: 10.1016/j.ocemod.2014.10.002
Gaspari G, Cohn S E. 1999. Construction of correlation functions in two and three dimensions. Quarterly Journal of the Royal Meteorological Society, 125(554): 723–757, doi: 10.1002/qj.49712555417
Good S A, Martin M J, Rayner N A. 2013. EN4: quality controlled ocean temperature and salinity profiles and monthly objective analyses with uncertainty estimates. Journal of Geophysical Research: Oceans, 118(12): 6704–6716, doi: 10.1002/2013JC009067
Gordon N J, Salmond D J, Smith A F M. 1993. Novel approach to non- linear/non-Gaussian Bayesian state estimation. IEE Proceedings F: Radar and Signal Processing, 140(2): 107–113, doi: 10.1049/ip-f-2.1993.0015
Gregg M C, Sanford T B, Winkel D P. 2003. Reduced mixing from the breaking of internal waves in equatorial waters. Nature, 422(6931): 513–515, doi: 10.1038/nature01507
Jochum M. 2009. Impact of latitudinal variations in vertical diffusivity on climate simulations. Journal of Geophysical Research: Oceans, 114(C1): C01010
Karspeck A R, Danabasoglu G, Anderson J, et al. 2018. A global coupled ensemble data assimilation system using the Community Earth System Model and the Data Assimilation Research Testbed. Quarterly Journal of the Royal Meteorological Society, 144(717): 2404–2430, doi: 10.1002/qj.3308
Kivman G A. 2003. Sequential parameter estimation for stochastic systems. Nonlinear Processes in Geophysics, 10(3): 253–259, doi: 10.5194/npg-10-253-2003
Koyama H, Watanabe M. 2010. Reducing Forecast Errors Due to Model Imperfections Using Ensemble Kalman Filtering. Monthly Weather Review, 138(8): 3316–3332, doi: 10.1175/2010MWR3067.1
Large W G, McWilliams J C, Doney S C. 1994. Oceanic vertical mixing: a review and a model with a nonlocal boundary layer parameterization. Reviews of Geophysics, 32(4): 363–403, doi: 10.1029/94RG01872
Liu Yun, Liu Zhengyu, Zhang Shaoqing, et al. 2014a. Ensemble-based parameter estimation in a coupled GCM using the adaptive spatial average method. Journal of Climate, 27(11): 4002–4014, doi: 10.1175/JCLI-D-13-00091.1
Liu Yun, Liu Zhengyu, Zhang Shaoqing, et al. 2014b. Ensemble-based parameter estimation in a coupled general circulation model. Journal of Climate, 27(18): 7151–7162, doi: 10.1175/JCLI-D-13-00406.1
Liu J, West M. 2001. Combined parameter and state estimation in simulation-based filtering. In: Sequential Monte Carlo Methods in Practice. New York, NY, USA: Springer, 197–223
Ruiz J J, Pulido M, Miyoshi T. 2013. Estimating model parameters with ensemble-based data assimilation: a review. Journal of the Meteorological Society of Japan, 91(2): 79–99, doi: 10.2151/jmsj.2013-201
Santitissadeekorn N, Jones C. 2015. Two-stage filtering for joint state-parameter estimation. Monthly Weather Review, 143(6): 2028–2042, doi: 10.1175/MWR-D-14-00176.1
Smith R, Jones P W, Briegleb B, et al. 2010. The Parallel Ocean Program (POP) reference manual, ocean component of the community climate system model (CCSM). Technical Report LAUR-10-01853. Boulder, CO, USA: National Centre for Atmosphere Research
Tong Mingjing, Xue Ming. 2008. Simultaneous estimation of microphysical parameters and atmospheric state with simulated radar data and ensemble square root Kalman filter. Part I: sensitivity analysis and parameter identifiability. Monthly Weather Review, 136(5): 1630–1648, doi: 10.1175/2007MWR2070.1
Wu Xinrong, Han Guijun, Zhang Shaoqing, et al. 2016. A study of the impact of parameter optimization on ENSO predictability with an intermediate coupled model. Climate Dynamics, 46(3–4): 711–727
Wu Xinrong, Zhang Shaoqing, Liu Zhengyu, et al. 2013. A study of impact of the geographic dependence of observing system on parameter estimation with an intermediate coupled model. Climate Dynamics, 40(7): 1789–1798
Zhang Shaoqing, Liu Zhengyu, Rosati A, et al. 2012. A study of enhancive parameter correction with coupled data assimilation for climate estimation and prediction using a simple coupled model. Tellus A, 64(1): 10963, doi: 10.3402/tellusa.v64i0.10963
Zhang Shaoqing, Liu Zhengyu, Zhang Xuefeng, et al. 2020. Coupled data assimilation and parameter estimation in coupled ocean–atmosphere models: a review. Climate Dynamics, 54(11–12): 5127–5144
Zhang Xuefeng, Zhang Shaoqing, Liu Zhengyu, et al. 2016. Correction of biased climate simulated by biased physics through parameter estimation in an intermediate coupled model. Climate Dynamics, 47(5–6): 1899–1912
Zhao Yuchu, Liu Zhengyu, Zheng Fei, et al. 2019. Parameter optimization for real-world ENSO forecast in an intermediate coupled model. Monthly Weather Review, 147(5): 1429–1445, doi: 10.1175/MWR-D-18-0199.1
Zhu Yuchao, Zhang Ronghua. 2018. An Argo-derived background diffusivity parameterization for improved ocean simulations in the Tropical Pacific. Geophysical Research Letters, 45(3): 1509–1517, doi: 10.1002/2017GL076269
Year 2022 volume 41 Issue 2
PDF
47
26
Cite this Article
BibTeX
Article Info
doi: 10.1007/s13131-021-1856-5
  • Receive Date:2021-02-10
  • Online Date:2025-11-20
  • Published:2022-02-25
Article Data
Affiliations
History
  • Received:2021-02-10
  • Accepted:2021-05-20
Funding
The National Key Research and Development Program under contract No. 2017YFA0604202; the Fundamental Research Funds for the Central Universities under contract No. B210201022; the National Natural Science Foundation of China under contract Nos 42176003, 41690124, 41806032 and 41806038.
Affiliations
    1 Key Laboratory of Marine Hazards Forecasting of Ministry of Natural Resources, Hohai University, Nanjing 210098, China
    2 College of Oceanography, Hohai University, Nanjing 210098, China
    3 Southern Marine Science and Engineering Laboratory (Zhuhai), Zhuhai 519080, China
    4 Environmental Science and Engineering, University of Northern British Columbia, Prince George V2N 4Z9, Canada

Corresponding:

* ytang@sio.org.cn
References
Share
https://castjournals.cast.org.cn/joweb/aos/EN/10.1007/s13131-021-1856-5
Share to
QR

Scan QR to access full text

Cite this article
BibTeX
Citations
表12种不同金属材料的力学参数

Family
属数
Number of
genus
种数
Number of
species
占总种数比例
Percentage of
total species (%)

Genus
种数
Number of
species
占总种数比例
Percentage of total
species (%)
鹅膏菌科Amanitaceae 2 11 5.26 鹅膏菌属 Amanita 10 4.78
小菇科 Mycenaceae 2 12 5.74 丝盖伞属 Inocybe 5 2.39
多孔菌科 Polyporaceae 8 14 6.70 蜡蘑属 Laccaria 5 2.39
红菇科 Russulaceae 3 23 11.00 小皮伞属 Marasmius 6 2.87
小菇属 Mycena 11 5.26
光柄菇属 Pluteus 5 2.39
红菇属 Russula 17 8.13
栓菌属 Trametes 5 2.39
关闭全屏
  • BibTeX
  • EndNote
  • RefWorks
  • TxT