With advances in image processing and artificial intelligence, deep learning-based algorithms have become increasingly important for image target detection and recognition. In the aerospace domain, object detection in satellite remote sensing imagery faces persistent challenges, including cluttered imaging backgrounds, numerous minuscule targets, and wide dynamic imaging ranges. In recent years, convolutional neural network-based approaches have made significant progress in satellite remote sensing object detection, particularly in fine-grained target recognition. These advances play crucial roles in domains such as military reconnaissance, post-disaster reconstruction, and resource exploration. Because satellite-based remote sensing images cover large areas, contain small and dense targets, and have complex backgrounds, large and complex neural networks have been used to represent image features for target detection. Although such networks offer reasonable detection capability, they are difficult to deploy in space-based remote sensing tasks because of strict real-time requirements and limited computing resources. To address these issues, this study proposes a lightweight space-based remote sensing image target detection algorithm that integrates multiple attention mechanisms across the spatial and channel dimensions. The remote sensing image processing and target detection algorithms are deployed on a remote sensing edge intelligent computing platform, achieving efficient and accurate target recognition and analysis for remote sensing images. This approach provides a solution for fast in-orbit target detection and real-time tracking of detected targets.
Built on YOLOv11n, a model from version 11 of the You Only Look Once (YOLO) family, the proposed algorithm integrates the channel prior convolutional attention (CPCA) mechanism, which combines channel and spatial attention. CPCA first uses channel attention to generate a channel attention map. This map is multiplied element-wise with the input feature map to produce a channel-weighted feature map, which is then fed into a depthwise convolution module to generate a spatial attention feature map. The CPCA mechanism can thus allocate attention weights dynamically across the channel and spatial dimensions, enriching the network’s target features and enhancing its feature extraction capability. The algorithm also employs a 2D convolutional layer based on partial convolution (Pconv), which convolves only a subset of the input channels and thereby exploits the redundancy among channel feature maps. This design avoids the excess parameters that adding attention modules typically introduces. Consequently, the improved model has 0.48 M fewer parameters (approximately 18.53%) than the original YOLOv11n, which partially addresses the challenge of deploying network models on embedded devices. To keep the dimensions of the two Pconv branches consistent, max pooling is applied to the nonconvolved channels, downsizing their feature maps to half the original size. Pointwise convolution is used to fully exploit the representational capacity of the channel-wise features, so the design reduces the computational load without significantly degrading the model’s feature extraction capability.
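To make the parameter argument concrete, the following is a minimal sketch that counts convolution weights for a standard layer and for a Pconv-style block. It is not the authors' implementation. The block layout (a k x k convolution on a subset of channels, parameter-free max pooling on the rest, then a 1 x 1 pointwise convolution), the function names, the 0.25 channel ratio, and the 64-to-128 layer size are all assumptions chosen for illustration; bias and normalization parameters are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias and normalization ignored)."""
    return k * k * c_in * c_out


def pconv_block_params(c_in: int, c_out: int, k: int, ratio: float = 0.25) -> int:
    """Weight count of a hypothetical Pconv-style block.

    Assumed layout: a k x k convolution over the first `ratio` share of the
    input channels, the remaining channels passed through parameter-free max
    pooling, then a 1 x 1 pointwise convolution over all channels.
    """
    c_part = int(c_in * ratio)           # channels that are actually convolved
    partial = k * k * c_part * c_part    # k x k convolution on the subset only
    pointwise = c_in * c_out             # 1 x 1 convolution mixes all channels
    return partial + pointwise


# Illustrative layer size; these numbers do not come from the paper.
c_in, c_out, k = 64, 128, 3
standard = conv_params(c_in, c_out, k)
lite = pconv_block_params(c_in, c_out, k)
print(f"standard conv:     {standard} weights")
print(f"Pconv-style block: {lite} weights ({lite / standard:.1%} of standard)")
```

Under these assumptions the saving comes from the spatial kernel touching only a quarter of the channels, while the pointwise convolution still mixes information across all of them. The model-level reduction reported in the text (18.53%) is smaller because only some layers are replaced and the CPCA module adds parameters of its own.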
In validation on the DIOR dataset, the proposed algorithm was compared with various YOLO algorithms for object detection. The experimental results show that the real-time detection transformer (RTDETR) has the largest parameter count at 9.42 M, YOLOv11n has 2.59 M parameters, and YOLOv11n_CBAM has 2.74 M. By contrast, the proposed model contains only 2.11 M parameters, or 81.47% of the original YOLOv11n count. Compared with the original YOLOv11n algorithm, the proposed method also achieves a mean improvement of 1.9% in accuracy and 1.2% in recall. On the neural network processing unit (NPU), the inference latency of YOLOv11n is 19.6 ms, whereas that of the proposed algorithm is only 14.8 ms, a reduction of 4.8 ms, or 24.49%, relative to the original model. Additionally, the NPU-deployed YOLOv11n model attains an accuracy of 0.799 and a recall of 0.642, whereas the proposed algorithm achieves an accuracy of 0.819 and a recall of 0.652. These results indicate that migrating and deploying the model on the NPU does not degrade its accuracy. Compared with adding only the CPCA module, the proposed algorithm shows a slight accuracy decrease of 0.10% but reduces the parameter count by 0.66 M. Compared with incorporating only the Pconv module, it shows a marginal parameter increase of 0.08 M but improves accuracy by 1.7%.
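The percentages quoted above follow directly from the reported parameter counts and latencies. The short check below recomputes them; the variable names are illustrative, and the only inputs are figures stated in the text.

```python
# Parameter counts in millions, as reported in the text.
params_m = {
    "RTDETR": 9.42,
    "YOLOv11n": 2.59,
    "YOLOv11n_CBAM": 2.74,
    "proposed": 2.11,
}

baseline = params_m["YOLOv11n"]
proposed = params_m["proposed"]

param_saving_m = baseline - proposed                    # absolute saving in M
param_share_pct = 100 * proposed / baseline             # proposed as a share of baseline
param_reduction_pct = 100 * param_saving_m / baseline   # relative reduction

# NPU inference latency in milliseconds, as reported in the text.
latency_baseline_ms = 19.6
latency_proposed_ms = 14.8
latency_saving_ms = latency_baseline_ms - latency_proposed_ms
latency_reduction_pct = 100 * latency_saving_ms / latency_baseline_ms

print(f"parameter saving:    {param_saving_m:.2f} M")
print(f"share of baseline:   {param_share_pct:.2f}%")
print(f"parameter reduction: {param_reduction_pct:.2f}%")
print(f"latency saving:      {latency_saving_ms:.1f} ms")
print(f"latency reduction:   {latency_reduction_pct:.2f}%")
```

Note that the 24.49% figure is a reduction in per-image latency; the equivalent throughput ratio is 19.6 / 14.8, roughly 1.32 times the baseline.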
Targeting the detection of minute objects in space-based remote sensing, this study builds on the YOLOv11n model to propose a lightweight object detection algorithm that integrates multiple attention mechanisms across the spatial and channel dimensions together with contextual information. This approach improves detection accuracy while reducing the number of model parameters. By refining the attention mechanism in YOLOv11n, we introduce an improved architecture incorporating the CPCA module. This architecture enables comprehensive feature extraction for minute objects across the spatial and channel dimensions, mitigating missed detections and false alarms in spaceborne imagery. The conventional 2D convolutional layers in YOLO are replaced with Pconv-based designs, circumventing the parameter inflation typically caused by attention modules and yielding an 18.53% parameter reduction and a lighter model. Finally, NPU-optimized deployment enhances the model’s hardware compatibility. Compared with the original YOLOv11n, the proposed algorithm reduces inference time by 4.8 ms while maintaining detection accuracy, meeting real-time monitoring requirements. The solution is resource efficient for space-based engineering deployment under constrained computing resources and memory, providing technical support for onboard implementation in spaceborne remote sensing systems.
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |