An ensemble learning method to retrieve sea ice roughness from Sentinel-1 SAR images
Pengyi Chen1, Zhongbiao Chen1, 2, *, Runxia Sun1, Yijun He1
Acta Oceanologica Sinica | 2024, 43(5) : 78 - 90
Physical Oceanography, Marine Meteorology and Marine Physics
Affiliations
  • 1 School of Marine Sciences, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • 2 East Sea Information Center of State Oceanic Administration, Shanghai 200136, China
Published: 2024-05-25 doi: 10.1007/s13131-023-2248-9

Sea ice surface roughness (SIR) affects the energy transfer between the atmosphere and the ocean, and it is also an important indicator of sea ice characteristics. To obtain small-scale SIR with high spatial resolution, a novel method is proposed to retrieve SIR from Sentinel-1 synthetic aperture radar (SAR) images using an ensemble learning method. Firstly, the two-dimensional continuous wavelet transform is applied to obtain the spatial information of sea ice, including the scale and direction of ice patterns. Secondly, a model is developed using Adaboost Regression to establish a relationship among SIR, radar backscatter and the spatial information of sea ice. The proposed method is validated by comparing the SIR retrieved from SAR images with the measurements obtained by the Airborne Topographic Mapper (ATM) in the summer Beaufort Sea. The coefficient of determination, mean absolute error, root-mean-square error and mean absolute percentage error of the testing data are 0.91, 1.71 cm, 2.82 cm, and 36.37%, respectively, which are reasonable. Moreover, K-fold cross-validation and learning curves are analyzed, which also demonstrate the method’s applicability in retrieving SIR from SAR images.

2-D Cauchy continuous wavelet transform (CWT)  /  Adaboost Regression  /  sea ice  /  sea ice surface roughness
Pengyi Chen, Zhongbiao Chen, Runxia Sun, Yijun He. An ensemble learning method to retrieve sea ice roughness from Sentinel-1 SAR images[J]. Acta Oceanologica Sinica, 2024, 43(5): 78-90. DOI: 10.1007/s13131-023-2248-9
Sea ice roughness (SIR) affects the drag coefficient of the ice cover, which influences the energy transfer between the ocean and the atmosphere. Therefore, it plays a crucial role in the Arctic climate system (Prasad et al., 2021). Numerical ocean model simulations have shown that the heterogeneity in SIR significantly impacts the spatial distribution and trends of ocean surface stress during the last few decades (Martin et al., 2016). So understanding the small-scale changes in SIR will be beneficial for modeling ocean-ice processes.
Airborne lidar is often used to map the elevation of sea ice cover, and then the SIR can be estimated. The Airborne Topographic Mapper (ATM) has provided highly accurate ice surface elevations and has been used in numerous studies of sea ice (Studinger, 2014). Beckers et al. (2015) presented the SIR measurements in the Arctic regions using an airborne laser scanner and a single-beam laser altimeter. They found that in order to achieve a similar statistical distribution of SIR between the scanner and altimeter, the measurement surveys had to be longer than 5 km north of Svalbard and longer than 15 km in Fram Strait. Therefore, the survey of lasers must be carefully designed. Besides, the spatial coverage of laser is sparse and the flight of the aircraft is seriously affected by the weather (Nolin and Mar, 2018).
Because of the severe weather conditions in the Arctic, conducting in-situ and airborne measurements of SIR is challenging. In recent years, satellite remote sensing technologies have been used to map sea ice characteristics. SIR can serve as a reflection of other sea ice characteristics, such as sea ice types (Landy et al., 2020), sea ice thickness and sea ice albedo (Grenfell and Perovich, 1984). It is also helpful for improving the retrieval of sea ice thickness (Landy et al., 2020).
Radiometers onboard satellites have mostly been used to estimate SIR. Hong (2010) proposed a novel approach to estimate the small-scale roughness of sea ice from the Advanced Microwave Scanning Radiometer for EOS (AMSR-E) daily observations. The small-scale surface roughness retrieved from AMSR-E 6.9 GHz, ranging from 0.25 cm to 0.5 cm, showed reasonable agreement with the known observations of sea ice in the Antarctic and Arctic regions, which ranged from 0.2 cm to 0.6 cm. Gupta and Barber (2015) estimated the sub-pixel (<5.4 km) SIR by using the 89 GHz AMSR-E brightness temperature (Tb) and the SIR acquired from a helicopter-based laser system. Nolin and Mar (2018) used the Multi-angle Imaging SpectroRadiometer (MISR) to characterize the sea ice surface, and the SIR was retrieved by an empirical model that correlated the SIR derived from the ATM with the reflectance values from three MISR cameras (Ca, Cf, and An). Mosadegh and Nolin (2020) developed two separate backpropagation neural network (BPNN) models to estimate the SIR for summer and winter, by using the red spectral band from all nine MISR cameras and the ATM data. Mosadegh and Nolin (2022) further used MISR and ATM data to develop a K-nearest neighbor (KNN) regression model for estimating the SIR. The KNN model was also shown to outperform the simple linear regression (SLR) model and the polynomial linear regression (PLR) model. However, due to the relatively low resolutions of radiometers, the SIR can only be estimated over large areas (e.g., 25 km by 25 km for AMSR-E and 275 m by 275 m for MISR).
Synthetic aperture radar (SAR) has also been used to retrieve SIR. Carlström et al. (1994) fitted an empirical relationship between the SIR and the backscatter and incidence angle of European Remote Sensing-1 (ERS-1) SAR. However, there was a deviation of approximately 50% between the model estimate and the measured root-mean-square height of the ice surface. Wen et al. (2011) constructed an analytical surface backscattering model to estimate the SIR from Radarsat-2 SAR data. In their model, the SAR backscatter and incidence angle were linked to the SIR. Cafarella et al. (2019) studied the relationship between the winter first-year SIR and backscatter measurements from C-band Radarsat-2 and the L-band Advanced Land Observing Satellite-2 (ALOS-2) Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) in the Canadian Arctic. They found that C-band HH-polarization backscatter, L-band HH-polarization backscatter, and L-band VV-polarization backscatter were all strongly correlated with SIR at shallow incidence angles. Additionally, a linear regression model was developed to estimate SIR based on SAR backscatter. Segal et al. (2020) studied the SIR in the Canadian Arctic Archipelago by using both Sentinel-1 SAR and MISR data. They developed linear regression models to estimate the SIR using two variables: the normalized difference angular index (NDAI) from MISR and the backscatter from SAR.
Therefore, the models developed in previous studies are mainly based on the statistical relationship between SIR and the parameters measured directly by different sensors, such as the reflectance of radiometers (Nolin and Mar, 2018), the backscatter of SAR (Segal et al., 2020) and the incidence angles of the sensors. However, they do not take into account the spatial distribution of sea ice, which can be obtained from high-resolution SAR images. Besides, microwave backscattering models of sea ice (Carlström, 1997; Liu et al., 2016) have shown that the backscatter coefficient predicted by nonlinear scattering models agrees well with C-band SAR data for incidence angles between 25° and 50° and small-scale ice roughness. So nonlinear models may be more suitable for complex ice conditions in Arctic summer, when ice and melt ponds are mixed.
Ensemble learning methods have gained significant attention in the last decade due to their ability to handle nonlinear problems (Gu and Angelov, 2022). Ensemble learning methods include Bagging and Boosting (Zhu et al., 2020), both of which reduce prediction errors in regression by combining multiple predictors (Drucker, 1997). Predictors can be selected from various basic regression models, such as the linear model, Classification and Regression Trees (CART) (Efendi et al., 2020), and the support vector machine (SVM) (Yan and Huang, 2019; Li et al., 2021). Additionally, the predictors can be a combination of different models. Bagging methods, such as Random Forest (RF), are widely used in sea ice classification (Jiang et al., 2022; Marbouti et al., 2018) and sea ice drift monitoring (Palerme and Müller, 2021). However, Bagging methods often require more memory space than Boosting. Moreover, in Bagging the weak learners are trained in parallel, so a weak learner cannot use the information obtained from previously trained weak learners (Hsieh, 2023; Shanmugasundar et al., 2021). Therefore, Boosting methods are better suited for complex nonlinear problems (Xiao et al., 2019).
In this study, a new method is proposed for retrieving small-scale SIR from Sentinel-1 SAR images. In Section 2, the data and preprocessing methods are illustrated. In Section 3, the method for retrieving SIR from SAR is presented, including two-dimensional continuous wavelet transform (CWT) and the Adaboost Regression model. In Section 4, the proposed method is evaluated using ATM data, and the role of two-dimensional (2-D) CWT and the applicability of the proposed method are discussed in Section 5. Finally, the conclusions are presented in Section 6.
This study focuses on the western Beaufort Sea, north of Alaska, which is the boundary between first-year ice (FYI) and multi-year ice (MYI). The spatial coverage of the study area is 73°−77°N, 158°−162°W (Fig. 1). The surface of the ice floe contained hummocks and wide melt ponds (Kim et al., 2020). In June and July, MYI concentrations rose while FYI concentrations declined significantly due to sea ice melt as temperatures increased (Babb et al., 2019). Because much of the ice did not survive the summer melt season, thinner and younger ice covers resulted from the melt season. Therefore, the FYI and MYI were mixed in this area, making it difficult to measure or predict the SIR.
In this study, the Extra Wide (EW) Ground Range Detected Medium Resolution (GRDM) Sentinel-1 SAR images in HH-polarization and HV-polarization are used to retrieve SIR. These images have a swath width of over 400 km and a spatial resolution of 40 m (Torres et al., 2012). To evaluate the proposed method, two EW GRDM images acquired on July 13, 2016 were selected. These images cover the study area, and their related information is listed in Table 1.
As an example, an HH-polarization SAR image of the study area is shown in Fig. 2. To investigate the radar backscatter from sea ice ($ {\sigma _{{\text{SI}}}} $) and open water (${\sigma _{{\text{OW}}}}$), three areas with different incidence angles were chosen from Fig. 2, as depicted in Fig. 3 (Areas 1–3 in Fig. 2). The incidence angles of these three areas are also marked in Fig. 2. Under small incidence angles (Fig. 3a), the radar backscatter of sea ice (i.e., the black patches) is smaller than that of open water (i.e., the bright background). Under middle incidence angles (Fig. 3b), it is difficult to distinguish sea ice from open water because $ {\sigma _{{\text{SI}}}} $ is close to ${\sigma _{{\text{OW}}}}$; under large incidence angles (Fig. 3c), bright sea ice can be easily discriminated from the dark open water (i.e., $ {\sigma _{{\text{SI}}}} $ is larger than ${\sigma _{{\text{OW}}}}$). The reason for this phenomenon is that the roughness of the sea surface at small or medium wind speeds is less than that of the sea ice, which enhances the specular reflection of the smooth sea surface when the incidence angle is small (Jackson and Apel, 2004). Accordingly, the relationship among radar backscatter, incidence angle, and the SIR in summer is nonlinear. Therefore, traditional regression models, such as the linear regression model, may not be suitable for this problem. Besides, Fig. 3 also shows that the spatial patterns of sea ice are distinct, including the scales and directions of thaw holes, cracks, and leads.
The airborne lidar observations collected during the Operation IceBridge (OIB) campaign have been widely used in previous studies. To study the statistical properties of the sea ice surface, the sea ice elevation measured by the OIB ATM is selected. The ATM L2 Icessn Elevation, Slope, and Roughness V002 product collected on July 13, 2016 and July 14, 2016 (Studinger, 2014) was used in this work. It provides the height of each footprint relative to the World Geodetic System 1984 (WGS-84) ellipsoid, which needs to be converted to orthometric height using the Earth Gravitational Model-5 (EGM2008-5) model in Matlab.
The root-mean-square (rms) height is widely used to represent SIR in previous work (Segal et al., 2020) as a statistical indicator of the sea ice surface. The rms height $s_q$ is the population standard deviation of the heights of all the footprints, and it is calculated as follows:
$ s_q=\sqrt{\frac{1}{N} \sum_{i\;=\;1}^{N}\left(z_{i}-\overline{Z}\right)^{2}}, $
where $ \overline Z=\displaystyle\frac{1}{N} \sum_{i\;=\;1}^{N} z_{i} $ is the average height of all the footprints, N is the number of footprints, and $ z_{i} $ is the height of the i-th footprint.
It can be regarded as the population standard deviation of the sea ice surface height. However, only a limited number of footprints on the sea ice surface can be obtained. If the population formula is applied to the footprint heights, the SIR will be underestimated, whereas the sample variance, computed with the N − 1 divisor, is an unbiased estimate of the population variance. Therefore, the sample standard deviation of the footprint heights is used to estimate the SIR, which is calculated as follows:
$ \sigma=\sqrt{\frac{1}{N-1} \sum_{i\;=\;1}^{N}\left(z_{i}-\overline{Z}\right)^{2}}. $
To capture the small-scale changes of SIR, the standard deviation of the nearest 100 points is used here.
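The windowed estimate can be sketched as follows. This is a minimal illustration, assuming the footprints are ordered along track and that the "nearest 100 points" can be approximated by a sliding window of consecutive footprints; the function name `rolling_roughness` and the toy heights are hypothetical and are not taken from the ATM product.

```python
import statistics

def rolling_roughness(heights, window=100):
    """Sample standard deviation (N - 1 divisor) of each window of footprints.

    `heights` is a sequence of orthometric heights ordered along track.
    Returns one roughness value per full window position.
    """
    if window < 2:
        raise ValueError("window must contain at least two footprints")
    return [
        statistics.stdev(heights[i:i + window])  # stdev uses the N - 1 divisor
        for i in range(len(heights) - window + 1)
    ]

# Toy example with a window of 4 footprints (heights in metres).
demo = rolling_roughness([0.0, 0.2, 0.0, 0.2, 0.0], window=4)
```

With the default `window=100`, each output value corresponds to the equation for $ \sigma $ evaluated over 100 footprints.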
As an example, Fig. 4a shows the ATM orthometric heights that were acquired at 19:54:29 on July 13, 2016. The spatial coverage of the area is 75.149°−75.600°N, 159.733°−160.746°W. Figure 4b shows that the majority of the SIR in the area is less than 0.6 m.
To evaluate the SIR retrieved from SAR images using the ATM data, the SAR images and ATM data are matched as follows.
(1) For each SIR measured by ATM, find the nearest points measured by SAR where the distance between the two measurements is less than 40 m.
(2) If there are multiple roughness measurements obtained by ATM for one radar backscatter measured by SAR, average these roughness values.
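The two matching steps can be illustrated with the sketch below. It assumes that both datasets have already been projected to a common metric coordinate system and uses a brute-force nearest-neighbour search for clarity; the function name `match_atm_to_sar` and the coordinates are hypothetical.

```python
import math
from collections import defaultdict

def match_atm_to_sar(atm_points, sar_points, max_dist=40.0):
    """Pair each ATM roughness value with its nearest SAR pixel within max_dist.

    atm_points: list of (x, y, roughness) in projected metres.
    sar_points: list of (x, y) pixel centres in the same projection.
    Returns {sar_index: mean roughness of the ATM values matched to it}.
    """
    matched = defaultdict(list)
    for ax, ay, roughness in atm_points:
        # Brute-force nearest neighbour; a spatial index would be needed at scale.
        dists = [math.hypot(ax - sx, ay - sy) for sx, sy in sar_points]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        if dists[nearest] < max_dist:          # Step (1): distance below 40 m
            matched[nearest].append(roughness)
    # Step (2): average multiple ATM values that fall on the same SAR pixel.
    return {idx: sum(vals) / len(vals) for idx, vals in matched.items()}

sar = [(0.0, 0.0), (100.0, 0.0)]
atm = [(5.0, 0.0, 0.10), (-5.0, 3.0, 0.20), (60.0, 0.0, 0.30), (95.0, 0.0, 0.40)]
pairs = match_atm_to_sar(atm, sar)
```

In this toy case the third ATM point lies exactly 40 m from its nearest pixel and is therefore discarded, while the first two are averaged onto the same pixel.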
To retrieve small-scale SIR from SAR images, the SAR image is first preprocessed. Then, a 2-D CWT is applied to extract the spatial information of the sea ice. Finally, an Adaboost Regression model is used to retrieve SIR from the spatial information obtained from the SAR images.
The SAR images are first preprocessed using the Sentinel Application Platform (SNAP) software package developed by the European Space Agency (Filipponi, 2019). The following shows the five steps applied.
(1) Apply orbit file.
(2) Remove thermal noise.
(3) Calibrate the product.
(4) Apply speckle filtering.
(5) Convert the gray-value backscatter to dB.
The orbit state information in the metadata file is not very accurate. So the operation of applying the orbit file is performed to automatically update the Sentinel-1 satellite orbit state information of the metadata file (.xml). This operation only updates the metadata file (.xml), while the gray value backscatters remain unchanged.
Due to the active imaging of SAR, the influence of the heat caused by the antenna and other devices of SAR cannot be ignored. Therefore, the thermal noise is removed by using SNAP.
After the thermal noise is removed, the gray-value backscatter of the SAR images is calibrated into the normalized radar cross section (NRCS), $ \sigma_{0} $.
To suppress the speckle noise in SAR images, the Refined Lee speckle filter is applied with a window size of 7 by 7 pixels to the NRCS (Lee et al., 1994). The Refined Lee filter averages the image while preserving edges, so the patterns of sea ice will not be affected.
The original backscatter coefficients are represented as 8-bit gray values, and they are converted into decibels for data storage and analysis.
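The last preprocessing step is the usual logarithmic conversion, $ \sigma_{0}[{\mathrm{dB}}]=10\log_{10}\sigma_{0} $. The sketch below assumes linear-scale calibrated values as input; the function name and the small `floor` guard against zero-valued pixels are assumptions added for illustration, since SNAP performs this step internally.

```python
import math

def to_decibel(sigma0_linear, floor=1e-10):
    """Convert a linear-scale backscatter coefficient to decibels.

    `floor` is an assumed guard against log10(0) for empty pixels.
    """
    return 10.0 * math.log10(max(sigma0_linear, floor))

sigma0_db = [to_decibel(v) for v in (1.0, 0.1, 0.01)]
```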
Due to the high spatial resolution of SAR images, the spatial patterns of sea ice are distinct, such as the scales and directions of thaw holes, cracks, and leads, particularly during the Arctic summer. To obtain more information about sea ice, a 2-D CWT is applied to SAR images.
2-D CWT is widely used for the detection, extraction, or classification of various features in SAR images. It can not only measure the scale of a signal, but also the directionality of the signal (Antoine et al., 1999). 2-D Cauchy wavelet, which is strictly supported in a narrow convex cone in spatial frequency space, has shown great performance in analyzing directional features (Antoine et al., 1999). Therefore, 2-D Cauchy wavelets are utilized in this study.
For a 2-D spatial image $ s(\vec{x}) $, its 2-D wavelet transform can be expressed as follows (Antoine and Murenzi, 1996; Daubechies, 1992):
$ \begin{split}S(a, \phi, \vec{b}) =&\frac{a}{\sqrt{c_{{\text{ψ}}}}} \int_{R^{2}} {\text{ψ}}^{*}\left({\boldsymbol{r}}_{-{\text{ϕ}}}\left(\frac{\vec{x}-\vec{b}}{a}\right)\right) s(\vec{x})\; {\mathrm{d}}^{2} \vec{x} \\=& \frac{a}{\sqrt{c_{{\text{ψ}}}}} \int_{R^{2}} {\mathrm{exp}}({{\mathrm{i}}{\vec{b}} \vec{\omega}} )\hat{\;{\text{ψ}}}^{*}\left(a r_{-\phi}(\vec{\omega})\right) \hat{s}(\vec{\omega}) \;{\mathrm{d}}^{2} \vec{\omega},\end{split} $
where $c_{{\text{ψ}}} $ is the admissibility constant of the complex-valued mother wavelet $ {{\text{ψ}}}$, which must be finite, $R^2$ is the two-dimensional Euclidean space over the real numbers, ${\boldsymbol{r}}_{-{\text{ϕ}}} $ is the rotation matrix for the angle $-\phi $, $\vec x $ is a position vector in the 2-D image space, i.e., the 2-D coordinate of a pixel on the SAR image, and $s(\vec{x})$ is the 2-D signal (the input SAR image) defined on $R^2$. $\hat {s}( \vec{\omega}) $ is the Fourier transform (FT) of $s(\vec {x}) $, ${\mathrm{i}}=\sqrt{-1} $ is the imaginary unit, $ {a} $ is the scale parameter, $ \vec{b} $ is the translation parameter, $ \vec{\omega} $ is the spatial frequency and $ {\text{ψ}}^{*} $ is the complex conjugate of the wavelet function $ {\text{ψ}}$.
The rotation matrix ${\boldsymbol{r}}_{-{\text{ϕ}}}$, which rotates the wavelet in spatial coordinates by the rotation angle $ \phi $, is usually defined as
$ {\boldsymbol{r}}_{-{\text{ϕ}}}=\left(\begin{array}{cc}\cos \phi & \sin \phi \\-\sin \phi\;\;\;\; & \cos \phi\end{array}\right),\quad 0 \leqslant \phi < 2 {\text π} . $
The Fourier transform (FT) of $ s(\vec{x}) $, $ \hat{s}(\vec{\omega}) $, can be expressed as
$ \hat {s}(\vec{\omega})=\frac{1}{2\pi}\int_{R^2}s({\vec{x}})\;{\mathrm{exp}}({-{\mathrm{i}}\vec{\omega}\vec{x}})\;{\mathrm{d}}^2\vec{x}, $
where $ \vec{\omega} $ is the spatial frequency, and $\hat{{\text{ψ}}}_{a,\phi,\vec{b}}(\vec{\omega}) $, the FT of $ {\text{ψ}}_{a, \phi, \vec{b}}{(\vec{x})} $, is defined as
$ \hat{{\text{ψ}}}_{a,\phi,\vec{b}}(\vec{\omega})=a\;{\mathrm{exp}}({-{\mathrm{i}}\vec{b}\vec{\omega}})\;\hat{{\text{ψ}}}\;(a{\boldsymbol{r}}_{-{\text{ϕ}}}(\vec{\omega})). $
The complex-valued mother wavelet $ {\text{ψ}} $ must satisfy the admissibility condition as follows:
$ c_{{\text{ψ}}}=(2\pi)^2\int_{R^2}\frac{|\hat{{\text{ψ}}}(\vec{\omega})|^2}{|\vec{\omega}|^2}{\mathrm{d}}^2\vec{\omega}<\infty .$
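The frequency-domain form of the transform lends itself to an FFT implementation. The sketch below is one possible discretization, assuming periodic boundaries and omitting the normalization constants ($ c_{{\text{ψ}}} $ and the $ 2\pi $ factors); the function names and the stand-in wavelet `toy_wavelet_hat` are hypothetical and are used only to check that a zero-mean wavelet gives no response on a constant image.

```python
import numpy as np

def cwt2d_fft(image, wavelet_hat, scale, angle):
    """2-D CWT coefficients at one scale and rotation angle, via the FFT.

    wavelet_hat(wx, wy) returns the Fourier transform of the mother wavelet.
    Normalisation constants (c_psi, 2*pi factors) are omitted in this sketch.
    """
    ny, nx = image.shape
    # Angular spatial frequencies on the FFT grid (radians per pixel).
    wx = 2 * np.pi * np.fft.fftfreq(nx)
    wy = 2 * np.pi * np.fft.fftfreq(ny)
    WX, WY = np.meshgrid(wx, wy)
    # Apply r_{-phi} to the frequency vector, then dilate by the scale a.
    c, s = np.cos(angle), np.sin(angle)
    rx = scale * (c * WX + s * WY)
    ry = scale * (-s * WX + c * WY)
    s_hat = np.fft.fft2(image)
    return scale * np.fft.ifft2(np.conj(wavelet_hat(rx, ry)) * s_hat)

def toy_wavelet_hat(wx, wy):
    # Hypothetical stand-in wavelet with zero mean (vanishes at the origin).
    return (wx**2 + wy**2) * np.exp(-(wx**2 + wy**2) / 2)

flat = cwt2d_fft(np.ones((16, 16)), toy_wavelet_hat, scale=2.0, angle=0.3)
```

Calling the function once per scale and angle and stacking the results gives the full set of coefficients $ S(a, \phi, \vec{b}) $ on the pixel grid.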
In this study, the Fourier transform of 2-D Cauchy wavelet is used (Antoine et al., 1999) as follows:
$ \begin{split}&\hat{{\text{ψ}}}(\omega_x,\omega_y)=[\omega_ x{\cdot}\sin \alpha+\omega_y{\cdot}\cos\alpha]^L\times\\&\qquad[-\omega_x{\cdot}\sin\alpha+\omega_y{\cdot}\cos\alpha]^M\cdot {\mathrm{exp}}\left[{-A\frac{(\omega_x)^2+(\omega_y)^2}{2}}\right],\end{split} $
where $ \omega_{x} $ is the frequency in the x direction, $ \omega_{y} $ is the frequency in the y direction, $ A $ is the dilation parameter of the wavelet, and $ \alpha $ is half of the opening angle of the cone on which the 2-D Cauchy wavelet is supported. L and M give the number of vanishing moments of $ \hat{{\text{ψ}}}$ on the edges of the cone, and thus control the regularity of the wavelet.
The ranges of the parameters are $ A \in[0, \infty) $, $ \alpha \in\left[0, \dfrac{\pi}{2}\right) $, $ L, M \in [0, \infty) $, and $\omega_x {\cdot}\tan \alpha \geqslant \left|\omega_{y}\right|$. The values of these parameters used in this study are given in Table 2.
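For illustration, the wavelet expression and its cone constraint can be evaluated on a frequency grid as sketched below. The default values of `alpha`, `L`, `M` and `A` are placeholders chosen for the example and are not the Table 2 values; integer `L` and `M` are assumed so that powers of negative factors remain real.

```python
import numpy as np

def cauchy_wavelet_hat(wx, wy, alpha=np.pi / 6, L=4, M=4, A=1.0):
    """Fourier transform of the 2-D Cauchy wavelet, following the expression above.

    The default alpha, L, M and A are placeholders, not the Table 2 values.
    """
    wx = np.asarray(wx, dtype=float)
    wy = np.asarray(wy, dtype=float)
    inside = wx * np.tan(alpha) >= np.abs(wy)   # supporting cone
    poly = ((wx * np.sin(alpha) + wy * np.cos(alpha)) ** L
            * (-wx * np.sin(alpha) + wy * np.cos(alpha)) ** M)
    gauss = np.exp(-A * (wx**2 + wy**2) / 2)
    return np.where(inside, poly * gauss, 0.0)  # zero outside the cone

on_axis = cauchy_wavelet_hat(1.0, 0.0)    # inside the cone
behind = cauchy_wavelet_hat(-1.0, 0.0)    # outside the cone
```

Because the wavelet vanishes outside the cone, a narrower `alpha` gives a more selective response to the orientation of ice patterns.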
The modulus and phase of the complex wavelet coefficient $ S(a, \phi, \vec{b}) $ are expressed as
$ |S(a, \phi, \vec{b})|=\sqrt{[{{\mathrm{Im}}}(S(a, \phi, \vec{b}))]^{2}+[{{\mathrm{Re}}}(S(a, \phi, \vec{b}))]^{2}} ,$
and
$ \varphi(a, \phi, \vec{b})={\mathrm{arctan}} \left[\frac{{{\mathrm{Im}}}(S(a, \phi, \vec{b}))}{{{\mathrm{Re}}}(S(a, \phi, \vec{b}))}\right] ,$
where Re(·) and Im(·) are the real and imaginary parts of a complex number.
In this study, the range of the scale parameter is selected to be from 1 to 32, and the range of the rotation angle is selected to be from 0 to $ 2 \pi $.
After applying 2-D CWT on HH-polarization SAR images, the wavelet coefficients of each point are derived. The peak scale and peak direction of the point can then be obtained by selecting the maximum wavelet spectrum coefficient. In the sea ice SAR image, the peak direction indicates the main direction of the ice pattern, and the peak scale will be large if the size of the ice pattern is small, while it will be small if the size of the ice pattern is large.
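The selection of the peak scale and peak direction amounts to an argmax over the modulus of the coefficients at each pixel. A minimal sketch, assuming the coefficients have been stacked into an array of shape (number of scales, number of angles, rows, columns); the function name and the toy values are hypothetical.

```python
import numpy as np

def peak_scale_and_angle(coeffs, scales, angles):
    """Pick, per pixel, the scale and angle with the largest wavelet modulus.

    coeffs: complex array of shape (n_scales, n_angles, ny, nx).
    Returns two (ny, nx) arrays: peak scale and peak angle.
    """
    modulus = np.abs(coeffs)
    n_scales, n_angles, ny, nx = modulus.shape
    flat_idx = modulus.reshape(n_scales * n_angles, ny, nx).argmax(axis=0)
    scale_idx, angle_idx = np.unravel_index(flat_idx, (n_scales, n_angles))
    return np.asarray(scales)[scale_idx], np.asarray(angles)[angle_idx]

scales = np.array([1, 2, 4])
angles = np.array([0.0, np.pi / 2])
coeffs = np.zeros((3, 2, 2, 2), dtype=complex)
coeffs[2, 1, 0, 0] = 3 + 4j      # strongest response at scale 4, angle pi/2
coeffs[0, 0, 1, 1] = 1j          # strongest response at scale 1, angle 0
peak_scale, peak_angle = peak_scale_and_angle(coeffs, scales, angles)
```

The two output maps have the same size as the SAR image and can be used directly as per-pixel features.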
Because the relationship between SIR and the NRCS measured by SAR is nonlinear, the Adaptive Boosting Regression (Adaboost Regression, ABR) is used to retrieve SIR from SAR images. The ABR model is applied as follows.
Firstly, the information measured by SAR and the spatial information derived from SAR images are input into the ABR model. These inputs include the incidence angle $ \theta $, the elevation angle $\theta_{s}$ and the HH-polarization NRCS $ \sigma_{0} $ measured by SAR, as well as the peak scale and peak angle of each point derived from the 2-D Cauchy CWT.
To train the ABR model, the SIR derived from ATM data is used as the output.
Unlike the parallel ensemble technique of Bagging methods, ABR is a sequential machine learning ensemble technique used to combine several weak learners based on their weights in order to create a strong learner (Shanmugasundar et al., 2021).
In the ABR process, T iterations are used to train the weak learners and obtain a strong regressor (Pedregosa et al., 2011). In each iteration, a weighted training dataset is generated and used to train a new weak learner. After each training round, the weights are redistributed among the samples: the poorly predicted samples are given more weight, which increases their contribution to the loss function and thereby reduces the bias towards false predictions.
To train and test the ABR model, the specific steps are as follows:
(1) Divide the dataset in Section 3.3.1 into a training set and a testing set. In this study, the dataset is randomly split into a training set and a testing set in a ratio of 70%:30%.
(2) Initialize the weights of all training samples to $ \dfrac{1}{m} $ (where $ m $ is the size of the training set).
(3) Train the weak learner $ h_t $ with the training set ($ t $ refers to the index of the current iteration).
(4) Calculate the error rate of the weak learner $ h_t$.
(5) Update the weight of each sample.
(6) Repeat Steps (3) to (5) until $ t $ reaches $ T $.
(7) Get the final strong learner.
(8) Apply the strong learner to the testing set.
The details of the algorithm flow are shown in Algorithm 1.
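As an illustration of Steps (4) and (5), the sketch below assumes the AdaBoost.R2 reweighting rule (Drucker, 1997) with a linear loss, in which the absolute errors are scaled by their maximum. The function name and the toy numbers are hypothetical, and the authors' implementation may differ in detail.

```python
def adaboost_r2_update(weights, y_true, y_pred):
    """One AdaBoost.R2-style reweighting step with a linear loss.

    `weights` must sum to one. Returns (new_weights, beta); beta < 1 means
    the weak learner is useful, and log(1 / beta) is its ensemble weight.
    """
    abs_err = [abs(p - t) for p, t in zip(y_pred, y_true)]
    max_err = max(abs_err)
    if max_err == 0:
        return list(weights), 0.0          # perfect learner, nothing to reweight
    loss = [e / max_err for e in abs_err]  # linear loss scaled to [0, 1]
    avg_loss = sum(w * l for w, l in zip(weights, loss))
    if avg_loss >= 0.5:
        raise ValueError("weak learner is too inaccurate; stop boosting")
    beta = avg_loss / (1.0 - avg_loss)
    # Well-predicted samples (small loss) are shrunk the most.
    new_w = [w * beta ** (1.0 - l) for w, l in zip(weights, loss)]
    total = sum(new_w)
    return [w / total for w in new_w], beta

m = 4
w0 = [1.0 / m] * m                          # Step (2): uniform initial weights
w1, beta = adaboost_r2_update(w0, y_true=[1.0, 2.0, 3.0, 4.0],
                              y_pred=[1.0, 2.0, 3.0, 6.0])
```

In this toy case only the fourth sample is mispredicted, so after normalization it carries half of the total weight in the next iteration.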
In this work, 100 Decision Tree regressors are selected as the base estimators, indicating that the number of iterations T is set to 100. The maximum depth of the decision tree is 10. The loss function is chosen as follows:
$ L\left(y_{i}, \vec{x}_{i}\right)=\left|\hat{y}_{i}\left(\vec{x}_{i}\right)-y_{i}\right|, $
where $ \vec{x}_{i} $ is a vector containing all features of the i-th input sample, $ y_{i} $ is the true value (i.e., the SIR measured by ATM), and $ \hat{y}_{i}\left(\vec{x}_{i}\right) $ is the SIR predicted from $ \vec{x}_{i} $ by the weak decision tree regressor $ h_{t} $.
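A configuration with these settings can be sketched with scikit-learn (Pedregosa et al., 2011). The data below are synthetic stand-ins generated for the example, not SAR or ATM measurements, and the random seeds are arbitrary assumptions; scikit-learn's `"linear"` loss is the absolute error scaled by its maximum over the training samples.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: five columns play the role of incidence angle,
# elevation angle, NRCS, peak scale and peak angle. Not real measurements.
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 5))
y = np.sin(3 * X[:, 0]) + X[:, 2] ** 2 + 0.05 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)          # 70%:30% random split

model = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=10),           # weak learner
    n_estimators=100,                              # T = 100 iterations
    loss="linear",                                 # scaled absolute-error loss
    random_state=0,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```

The weak learner is passed positionally because the keyword for it was renamed between scikit-learn releases.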
To assess the accuracy of the proposed method, widely used evaluation metrics are adopted, as listed in Table 3.
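The four metrics reported in this study can be computed as in the sketch below, which assumes their standard textbook definitions (coefficient of determination, mean absolute error, root-mean-square error, and mean absolute percentage error); the function name and the toy values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and MAPE (in percent) for paired sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    resid = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in resid)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(r) for r in resid) / n,
        "RMSE": math.sqrt(ss_res / n),
        # MAPE is undefined when a true value is zero.
        "MAPE": 100.0 * sum(abs(r / t) for r, t in zip(resid, y_true)) / n,
    }

scores = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.0, 5.0, 6.0, 7.0])
```

Because MAPE divides by the true value, it is inflated by samples with very small roughness, which should be kept in mind when comparing it with MAE and RMSE.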
To further evaluate the performance of the ABR model and mitigate the bias caused by data selection in the splitting of the training set and testing set, K-Fold cross-validation is conducted.
K-Fold cross-validation splits the dataset into k consecutive folds. Each fold is used once as the validation set, while the other k – 1 folds are used as the training set. K-Fold cross-validation is an effective way to evaluate the model’s generalization ability.
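The splitting scheme can be sketched as follows, assuming the samples are shuffled once before being cut into k consecutive folds; the function name and the seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=5, shuffle=True, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(k_fold_indices(12, k=5))
```

Concatenating the validation predictions of all folds yields one prediction per sample, each made by a model that did not see that sample during training.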
In summary, the flowchart illustrating the proposed method for retrieving SIR from SAR images is presented in Fig. 5.
In this section, the proposed method is first used to retrieve SIR from SAR images using different inputs. The method is then evaluated by comparing it with the SIR measured by ATM.
Figure 6 shows the relationship among SIR, the NRCS $ \sigma_{0} $ and incidence angle $ \theta $. Because the correlation between elevation angle $ \theta_{s} $ and incidence angle $ \theta $ is large, the elevation angle is not shown. It is evident that the relationship between SIR and variables $ \sigma_{0} $ and $ \theta $ is nonlinear during the summer in the study area. This finding aligns with the analysis presented in Section 2.2, indicating that linear models are not suitable for estimating SIR in this context.
By using the incidence angle, elevation angle and NRCS measured by SAR as inputs to the proposed method, the SIR is estimated. These estimates are then compared with those measured by ATM, as shown in Table 4 and Fig. 6. Table 4 shows that although the performance metrics of the testing data are relatively reasonable, MAE, RMSE, and MAPE values are significantly larger than those of the training data, and R2 values are significantly smaller than those of the training data.
Figure 7 compares the SIR measured by ATM with the SIR estimated by ABR. The validation set is obtained through K-Fold cross-validation (Section 3.3.4) by combining the folds together. Most of the estimated values follow the 1:1 line well in each dataset. Additionally, the performance metrics of the training set are slightly better than those of the other datasets. The values of R2, MAE, RMSE, and MAPE are 0.74, 2.93 cm, 4.85 cm, and 61.98% for the testing dataset, respectively. Therefore, the Adaboost Regression model can yield reasonable SIR estimates.
To further evaluate the method, K-Fold cross-validation (where k = 5) is applied to the training set. Considering that ATM products are collected from various locations, we shuffle the dataset before splitting it to ensure that samples from different regions are randomly selected.
Table 5 displays the performance metrics for each fold and the corresponding mean value for each metric. The performance of each fold is almost the same as the average performance, and it also approaches the performance of the validation set in Fig. 7. Therefore, the performance of Adaboost Regression is largely insensitive to how the data are partitioned.
However, there are some problems associated with solely relying on information from individual pixels of a SAR image. Firstly, differences between the performances of training set and those of other datasets are large, which is often considered to be a feature of overfitting. Secondly, the errors of the samples with larger SIR are usually large, which means poor generalization performance.
Considering the presence of numerous thaw holes, melt ponds, cracks, and leads in the SAR image of summer sea ice, the 2-D CWT is applied to obtain the spatial information about the sea ice. This includes the peak scale and peak angle of each point on the SAR images, which are then used to derive the SIR. Comparisons of the estimated SIR with that measured by ATM are shown in Table 6 and Fig. 8.
Comparing Table 6 with Table 4, it becomes evident that all the performance metrics derived using 2-D CWT are significantly superior to those derived without it. The performance of the testing set is also close to that of the training set.
Figure 8 compares the SIR measured by ATM with those estimated by ABR using 2-D CWT. The validation set is also obtained through K-Fold cross-validation by combining the folds together. The estimated values follow the 1:1 line across the various datasets, particularly in the training set. More importantly, the performance differences are small across different datasets. Table 6 and Fig. 8 also show that the R2 values exceed 0.90, while the MAE, RMSE, and MAPE values are below 1.82 cm, 3.02 cm, and 37.54%, respectively, for all datasets. Therefore, the peak scales and peak angles of sea ice derived from the 2-D Cauchy CWT greatly improve the performance metrics for all datasets. Thus, the 2-D Cauchy CWT is a valuable method for extracting spatial information of sea ice from SAR images.
To further evaluate the method, K-Fold cross-validation (where k = 5) is also carried out. Table 7 shows that the differences in R2 among the folds are smaller than 0.03. The average performance metrics also approach the performance metrics of the validation set in Fig. 8. The variance of all these metrics is significantly less than that derived without 2-D Cauchy CWT. Therefore, the generalization ability of Adaboost Regression improves after conducting 2-D CWT in the retrieval of SIR, and the risk of overfitting decreases as well.
To investigate the effectiveness of 2-D CWT on sea ice SAR images, Area 4 shown in Fig. 2 is chosen as a representative sample, as depicted in Fig. 9a.
After applying the 2-D Cauchy CWT to the SAR image in Fig. 9a, the peak scale and peak angle of each pixel are obtained, as shown in Figs 9c and 9d. At the edge of the dark patch in the upper part of Fig. 9a, the magnitude of the NRCS changes most sharply (as shown by the largest value of the gradient in Fig. 9b), and the peak scales are smallest (Fig. 9c); in the lower part of the image in Fig. 9a, there is no significant change in NRCS, so the peak scales are larger (Fig. 9c), which coincides well with the spatial patterns of sea ice. The peak angles are only roughly similar to the direction of the gradient image, because the directional resolution of the 2-D CWT is set to $\pi/8 $ to reduce the computation time in this study; the performance may be improved by increasing the directional resolution.
For each iteration in the method (Section 3.3.2), in order to improve the prediction accuracy for these samples, the weight of the falsely predicted samples will be increased. However, this approach may lead to overfitting if the data contains a significant amount of noise. In this subsection, the observation learning curve (Mohr and van Rijn, 2022) is used to evaluate the problem.
By using the K-Fold cross validation method, the training data and testing data with different sizes are firstly obtained from the dataset (Section 3.3.2), where k = 5. The coefficient of determination (R2) is selected as the score. The learning curves of the present method with different sizes of the dataset are shown in Fig. 10, and it can be observed as follows.
(1) As the number of samples increases, the scores of the training data change little, while the scores of the testing data increase rapidly. Eventually, both converge to their maximum performance when the number of samples reaches its maximum. This suggests that the predictive performance would not be significantly improved by using more data, and that the model generalizes well to unseen data.
(2) The difference between the training and validation performance reaches a minimum at the converged limit point. The current model is therefore a good fit, avoiding both overfitting and underfitting.
(3) The green shaded area of the curves indicates that the standard deviation generally becomes smaller as the number of input samples increases, suggesting that the method is reliable.
In addition, when comparing Fig. 10a with Fig. 10b, it can be observed that the scores of the training data with 2-D CWT increase more rapidly than those without 2-D CWT. Furthermore, the standard deviations of the training data with 2-D CWT are smaller than those without 2-D CWT. Therefore, the method with 2-D CWT is more applicable than the method without it, and the spatial information of sea ice derived by the 2-D Cauchy CWT is necessary for the retrieval of SIR from SAR images.
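A learning curve of this kind can be computed with `learning_curve` from scikit-learn, as in the sketch below. The synthetic data, the model settings and the five training-set sizes are assumptions chosen for illustration and do not reproduce Fig. 10.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import KFold, learning_curve
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real samples (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=300)

# Illustrative settings, not those of the paper.
model = AdaBoostRegressor(DecisionTreeRegressor(max_depth=5),
                          n_estimators=30, random_state=0)

# R2 on training and held-out folds for growing training-set sizes (k = 5).
sizes, train_scores, test_scores = learning_curve(
    model, X, y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="r2",
)

train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)
test_std = test_scores.std(axis=1)   # corresponds to the shaded band of the curve

for n, tr, te, sd in zip(sizes, train_mean, test_mean, test_std):
    print(f"n={n:3d}  train R2={tr:.3f}  test R2={te:.3f} +/- {sd:.3f}")
```

The gap between `train_mean` and `test_mean` at the largest size, together with `test_std`, corresponds to observations (2) and (3) above.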
The central Arctic is far from the study area. In this region, summer sea ice consists primarily of multi-year ice, which is distinct from the mixture of FYI and MYI in the study area. Therefore, this region is highly suitable for testing the generalization ability of the proposed method.
Three Sentinel-1 SAR images acquired in July 2017 (Table 8) cover the region (spatial coverage: 83°−87°N, 86°−90°W) in the central Arctic, and the corresponding ATM data are also collected.
Because ATM data are scarce in summer, there are insufficient data for model training, and it is challenging for the model to generalize across different regions. Moreover, the sample points north of 84°N are spatially quite distant from the other sample points (Fig. 11). Therefore, only the samples located north of 84°N are used for independent testing, while the other samples in the central Arctic and the samples in the study area are used to train the model.
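The spatial hold-out described here amounts to a latitude mask. The sketch below uses made-up latitudes; only the 84°N threshold comes from the text.

```python
import numpy as np

# Hypothetical sample latitudes in degrees north (illustrative values only).
lat = np.array([72.5, 73.1, 83.2, 83.8, 84.3, 85.0, 86.4])

# Samples north of 84°N are held out for independent testing;
# all remaining samples are used for training.
test_mask = lat > 84.0
train_idx = np.flatnonzero(~test_mask)
test_idx = np.flatnonzero(test_mask)

print("train:", train_idx, "test:", test_idx)
```

Splitting by location rather than at random keeps spatially neighbouring, and therefore correlated, samples out of the test set.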
Due to the increase in the amount of training data, deeper decision trees are needed to improve the model’s accuracy. Therefore, the maximum depth of the decision tree is increased to 15, while the number of decision trees remains unchanged.
Table 9 and Fig. 12 show that the values of R2, MAE, RMSE and MAPE are 0.95, 0.89 cm, 1.46 cm, and 5.71% for the testing dataset, respectively. In Fig. 12, it can be clearly seen that the estimated values follow the 1:1 line well for the independent test set. This performance is even better than the performance on the training set. This indicates that the proposed method generalizes well when other data from the same region are included in the training.
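For reference, the four scores can be computed from paired measured and estimated values as in the following sketch. These are the standard definitions; the function name and the toy numbers are hypothetical and unrelated to Table 9.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """R2, MAE, RMSE and MAPE (%); y_true must not contain zeros for MAPE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
    }

# Toy roughness values in cm (illustrative only).
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
print(metrics)
```

Because MAPE divides by the measured value, it is sensitive to samples with very small roughness, whereas MAE and RMSE are expressed directly in centimetres.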
In this study, a new method is proposed to estimate small-scale ice roughness (40 m by 40 m) from SAR images. In the summer Arctic, there are thaw holes, cracks, melt ponds, and leads in sea ice SAR images. The relationship between NRCS and SIR is nonlinear, which complicates the retrieval of SIR from SAR images in the summer Arctic. Therefore, the 2-D CWT and AdaBoost Regression model are used to derive the SIR from SAR images. This method is evaluated by using the SIR measured by ATM, which demonstrates satisfactory results for the summer sea ice in the Beaufort Sea. The main findings of the study are as follows:
(1) The relationship between SIR and the radar backscatter of SAR is nonlinear, and the Adaboost Regression model based on ensemble learning is suitable for solving this problem.
(2) The 2-D Cauchy CWT is useful for deriving spatial information about sea ice from SAR images, including the scales and directions of thaw holes, cracks, and leads, which are important for retrieving SIR from SAR images in the Arctic summer.
In the future, the proposed method will be evaluated using more data from different time periods and locations. The applicability to various types of sea ice (FYI, MYI, deformed FYI (DFYI) etc.) will also be assessed. Furthermore, the ensemble learning method shows great potential for retrieving other sea ice characteristics with high spatial resolution and accuracy.
  • The National Key Research and Development Program of China under contract No. 2021YFC2803301; the National Natural Science Foundation of China under contract No. 41977302; the National Natural Science Youth Foundation of China under contract No. 41506199; the Natural Science Youth Foundation of Jiangsu Province under contract No. BK20150905; the Science and Technology Project of China Huaneng Group Co., Ltd. under contract No. HNKJ20-H66.
Antoine J P, Murenzi R. 1996. Two-dimensional directional wavelets and the scale-angle representation. Signal Processing, 52(3): 259–281, doi: 10.1016/0165-1684(96)00065-5
Antoine J P, Murenzi R, Vandergheynst P. 1999. Directional wavelets revisited: Cauchy wavelets and symmetry detection in patterns. Applied and Computational Harmonic Analysis, 6(3): 314–345, doi: 10.1006/acha.1998.0255
Babb D G, Landy J C, Barber D G, et al. 2019. Winter sea ice export from the Beaufort Sea as a preconditioning mechanism for enhanced summer melt: A case study of 2016. Journal of Geophysical Research: Oceans, 124(9): 6575–6600, doi: 10.1029/2019JC015053
Beckers J F, Renner A H H, Spreen G, et al. 2015. Sea-ice surface roughness estimates from airborne laser scanner and laser altimeter observations in Fram Strait and north of Svalbard. Annals of Glaciology, 56(69): 235–244, doi: 10.3189/2015AoG69A717
Cafarella S M, Scharien R, Geldsetzer T, et al. 2019. Estimation of level and deformed first-year sea ice surface roughness in the Canadian Arctic archipelago from C- and L- band synthetic aperture radar. Canadian Journal of Remote Sensing, 45(3/4): 457–475, doi: 10.1080/07038992.2019.1647102
Carlström A. 1997. A microwave backscattering model for deformed first-year sea ice and comparisons with SAR data. IEEE Transactions on Geoscience and Remote Sensing, 35(2): 378–391, doi: 10.1109/36.563277
Carlström A, Ulander L M H, Hakansson B. 1994. Model for estimating surface roughness of level and ridged sea ice using ERS-1 SAR. In: 1994 IEEE International Geoscience and Remote Sensing Symposium. Pasadena, CA, USA: IEEE, 168–170
Daubechies I. 1992. Ten Lectures on Wavelets. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics
Drucker H. 1997. Improving regressors using boosting techniques. In: Proceedings of the 14th International Conference on Machine Learning. Nashville: Morgan Kaufmann Publishers Inc., 107–115
Efendi A, Fitriani R, Naufal H I, et al. 2020. Ensemble Adaboost in classification and regression trees to overcome class imbalance in credit status of bank customers. Journal of Theoretical and Applied Information Technology, 98(17): 3428–3437
Filipponi F. 2019. Sentinel-1 GRD preprocessing workflow. Proceedings, 18(1): 11
Grenfell T C, Perovich D K. 1984. Spectral albedos of sea ice and incident solar irradiance in the southern Beaufort Sea. Journal of Geophysical Research: Oceans, 89(C3): 3573–3580, doi: 10.1029/JC089iC03p03573
Gu Xiaowei, Angelov P P. 2022. Multiclass fuzzily weighted adaptive-boosting-based self-organizing fuzzy inference ensemble systems for classification. IEEE Transactions on Fuzzy Systems, 30(9): 3722–3735, doi: 10.1109/TFUZZ.2021.3126116
Gupta M, Barber D G. 2015. Sub-pixel evaluation of sea ice roughness using AMSR-E data. International Journal of Remote Sensing, 36(3): 749–763, doi: 10.1080/01431161.2014.1001081
Hong S. 2010. Detection of small-scale roughness and refractive index of sea ice in passive satellite microwave remote sensing. Remote Sensing of Environment, 114(5): 1136–1140, doi: 10.1016/j.rse.2009.12.015
Hsieh W W. 2023. Decision trees, random forests and boosting. In: Introduction to Environmental Data Science. Cambridge: Cambridge University Press, 473–493
Jackson C R, Apel J R. 2004. Synthetic Aperture Radar Marine User’s Manual. Washington, DC, USA: National Oceanic and Atmospheric Administration, 377–379
Jiang Mingzhe, Clausi D A, Xu Linlin. 2022. Sea-ice mapping of RADARSAT-2 imagery by integrating spatial contexture with textural features. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15: 7964–7977, doi: 10.1109/JSTARS.2022.3205849
Kim S H, Kim H C, Hyun C U, et al. 2020. Evolution of backscattering coefficients of drifting multi-year sea ice during end of melting and onset of freeze-up in the western Beaufort Sea. Remote Sensing, 12(9): 1378, doi: 10.3390/rs12091378
Landy J C, Petty A A, Tsamados M, et al. 2020. Sea ice roughness overlooked as a key source of uncertainty in CryoSat-2 ice freeboard retrievals. Journal of Geophysical Research: Oceans, 125(5): e2019JC015820, doi: 10.1029/2019JC015820
Lee J S, Jurkevich L, Dewaele P, et al. 1994. Speckle filtering of synthetic aperture radar images: A review. Remote Sensing Reviews, 8(4): 313–340, doi: 10.1080/02757259409532206
Li Xiaoming, Sun Yan, Zhang Qiang. 2021. Extraction of sea ice cover by Sentinel-1 SAR based on support vector machine with unsupervised generation of training data. IEEE Transactions on Geoscience and Remote Sensing, 59(4): 3040–3053, doi: 10.1109/TGRS.2020.3007789
Liu Mengjie, Dai Yongshou, Zhang Jie, et al. 2016. The microwave scattering characteristics of sea ice in the Bohai Sea. Acta Oceanologica Sinica, 35(5): 89–98, doi: 10.1007/s13131-016-0861-6
Marbouti M, Antropov O, Eriksson P, et al. 2018. Automated sea ice classification over the Baltic Sea using multiparametric features of Tandem-X InSAR images. In: 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 7328–7331
Martin T, Tsamados M, Schroeder D, et al. 2016. The impact of variable sea ice roughness on changes in Arctic Ocean surface stress: A model study. Journal of Geophysical Research: Oceans, 121(3): 1931–1952, doi: 10.1002/2015JC011186
Mohr F, van Rijn J N. 2022. Learning curves for decision making in supervised machine learning—A survey. arXiv: 2201.12150
Mosadegh E, Nolin A W. 2020. Estimating Arctic sea ice surface roughness by using back propagation neural network. In: AGU Fall Meeting 2020. San Francisco, CA, USA: AGU, C014–0005
Mosadegh E, Nolin A W. 2022. A new data processing system for generating sea ice surface roughness products from the multi-angle imaging spectroradiometer (MISR) imagery. Remote Sensing, 14(19): 4979, doi: 10.3390/rs14194979
Nolin A W, Mar E. 2018. Arctic sea ice surface roughness estimated from multi-angular reflectance satellite imagery. Remote Sensing, 11(1): 50, doi: 10.3390/rs11010050
Palerme C, Müller M. 2021. Calibration of sea ice drift forecasts using random forest algorithms. The Cryosphere, 15(8): 3989–4004, doi: 10.5194/tc-15-3989-2021
Pedregosa F, Varoquaux G, Gramfort A, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12: 2825–2830
Prasad S, Haynes R D, Zakharov I, et al. 2021. Estimation of sea ice parameters using an assimilated sea ice model with a variable drag formulation. Ocean Modelling, 158: 101739, doi: 10.1016/j.ocemod.2020.101739
Segal R A, Scharien R K, Cafarella S, et al. 2020. Characterizing winter landfast sea-ice surface roughness in the Canadian Arctic archipelago using Sentinel-1 synthetic aperture radar and the multi-angle imaging spectroradiometer. Annals of Glaciology, 61(83): 284–298, doi: 10.1017/aog.2020.48
Shanmugasundar G, Vanitha M, Čep R, et al. 2021. A comparative study of linear, Random Forest and AdaBoost Regressions for modeling non-traditional machining. Processes, 9(11): 2015, doi: 10.3390/pr9112015
Studinger M. 2014. IceBridge ATM l2 Icessn elevation, slope, and roughness, version 2. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://nsidc.org/data/ILATM2/versions/2 [2023-06-01]
Torres R, Snoeij P, Geudtner D, et al. 2012. GMES Sentinel-1 mission. Remote Sensing of Environment, 120: 9–24, doi: 10.1016/j.rse.2011.05.028
Tschudi M, Meier W N, Stewart J S, et al. 2019. EASE-grid sea ice age, version 4. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://nsidc.org/data/NSIDC-0611/versions/4 [2023-06-01]
Wen Xiaoyang, Xue Cunjin, Dong Qing. 2011. The Arctic sea ice surface roughness estimation and application. In: Proceedings of the 21st International Offshore and Polar Engineering Conference. Maui, HI, USA: ISOPE, 958–961
Xiao Changjiang, Chen Nengcheng, Hu Chuli, et al. 2019. Short and mid-term sea surface temperature prediction using time-series satellite data and LSTM-AdaBoost combination approach. Remote Sensing of Environment, 233: 111358, doi: 10.1016/j.rse.2019.111358
Yan Qingyun, Huang Weimin. 2019. Detecting sea ice from TechDemoSat-1 data using Support Vector Machines with feature selection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(5): 1409–1416, doi: 10.1109/JSTARS.2019.2907008
Zhu Zonghai, Wang Zhe, Li Dongdong, et al. 2020. Geometric structural ensemble learning for imbalanced problems. IEEE Transactions on Cybernetics, 50(4): 1617–1629, doi: 10.1109/TCYB.2018.2877663
Article Info
  • Year 2024, Volume 43, Issue 5
  • doi: 10.1007/s13131-023-2248-9
  • Received: 2023-06-14
  • Accepted: 2023-08-29
  • Published: 2024-05-25
  • Online: 2025-11-18