Archive
2025 Volume 55 Issue 11  Published: 2025-11-05
  • Signal and Information Processing
  • Ming LIU , Yongjun FANG , Han WU , Qiankun LI , Dongdong LI , Zhaoyang ZHANG
    doi: 10.3969/j.issn.1003-3106.2025.11.001

    In traffic surveillance systems, radar-camera devices are used to collaboratively perceive and monitor the roadside environment. Due to the principles of perspective imaging, the greater the distance to a target, the smaller its corresponding pixel area in the image. Furthermore, the bounding boxes generated by visual detection exhibit significant jitter. If calibration errors or visual occlusion exist, or if the detection boxes jitter, a significant error is introduced when the target's position is mapped from the image coordinate system to the radar coordinate system, which degrades tracking accuracy. Collaborative target sensing and tracking with multiple sensors further increases the difficulty. To address these challenges, a multi-sensor, multi-target collaborative perception and tracking method is proposed, leveraging a two-stage matching strategy and an adaptive Kalman filter. The method improves association precision by adding a secondary matching stage in the Perspective View (PV) plane after targets in consecutive frames have been associated in the Bird's Eye View (BEV) plane. This effectively solves the problem of low tracking accuracy for distant targets caused by significant mapping errors. Based on the relationship model between image points and range-position jitter, an adaptive multi-sensor multi-target tracking method is proposed. By using the relationship model to update the parameters of the Kalman filter, and adaptively selecting the appropriate observation matrix and measurement covariance matrix according to the target sensor data source, the position and velocity parameters of the target are estimated. This effectively improves the real-time prediction accuracy of the target's spatial position and velocity, and further enhances the accuracy of target association in the BEV plane. Experimental results show that the proposed method improves the Multiple Object Tracking Accuracy (MOTA) index by 16.3% compared with the method that omits the two-stage matching strategy and uses only the ordinary Kalman filter, significantly improving the accuracy of target perception and tracking in traffic scenes using integrated millimeter-wave radar and vision systems.
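
    The idea of choosing the observation matrix and measurement covariance by sensor source can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional constant-velocity state, a radar that reports position and velocity, a camera that reports position only, and a hypothetical linear jitter model (`camera_position_variance`) standing in for the paper's relationship model. All numeric noise values are placeholders.

```python
def predict(x, P, dt, q):
    """Constant-velocity prediction for state x = [position, velocity]."""
    pos, vel = x
    (p00, p01), (p10, p11) = P
    x_pred = [pos + dt * vel, vel]
    # P_pred = F P F^T + q*I with F = [[1, dt], [0, 1]]
    P_pred = [
        [p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
        [p10 + dt * p11, p11 + q],
    ]
    return x_pred, P_pred


def scalar_update(x, P, h, z, r):
    """Kalman update for one scalar measurement z = h . x + noise of variance r."""
    Ph = [P[0][0] * h[0] + P[0][1] * h[1], P[1][0] * h[0] + P[1][1] * h[1]]
    hP = [h[0] * P[0][0] + h[1] * P[1][0], h[0] * P[0][1] + h[1] * P[1][1]]
    s = h[0] * Ph[0] + h[1] * Ph[1] + r   # innovation variance
    k = [Ph[0] / s, Ph[1] / s]            # Kalman gain
    y = z - (h[0] * x[0] + h[1] * x[1])   # innovation
    x_new = [x[0] + k[0] * y, x[1] + k[1] * y]
    P_new = [[P[i][j] - k[i] * hP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new


def camera_position_variance(range_m, jitter_per_metre=0.02):
    """Hypothetical jitter model: position std-dev grows linearly with range."""
    return (jitter_per_metre * range_m) ** 2


def update(x, P, source, measurement):
    """Pick observation rows and noise variances according to the data source."""
    if source == "radar":
        pos, vel = measurement
        rows = [([1.0, 0.0], pos, 0.25), ([0.0, 1.0], vel, 0.04)]  # placeholder variances
    elif source == "camera":
        (pos,) = measurement
        rows = [([1.0, 0.0], pos, camera_position_variance(max(x[0], 1.0)))]
    else:
        raise ValueError(f"unknown source: {source}")
    # With uncorrelated noise, processing the rows one by one equals a joint update.
    for h, z, r in rows:
        x, P = scalar_update(x, P, h, z, r)
    return x, P
```

    In this sketch a distant camera detection receives a large measurement variance, so the filter leans on its prediction, while a radar measurement at the same range pulls the estimate much more strongly.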

  • Signal and Information Processing
  • Mingfang LI
    doi: 10.3969/j.issn.1003-3106.2025.11.002

    Target detection in autonomous driving scenarios faces challenges such as complex environmental interference, multi-scale target distribution and target occlusion, and existing algorithms are still deficient in feature fusion capability, detail characterization accuracy and localization regression performance. To this end, an improved YOLOv8 detection algorithm, DMP-YOLO, is proposed. The original neck structure is optimized using a Multi-Branch Auxiliary Feature Pyramid Network (MAFPN) to enhance the multi-scale feature fusion capability in complex traffic scenarios; a C2f_DEConv module is proposed for the backbone network, replacing the standard convolution with Detail-Enhanced Convolution (DEConv) to significantly improve the detail capturing ability for small-scale vehicles and occluded targets through high-frequency feature preservation and local texture enhancement; the Powerful Intersection over Union version 2 (PIoUv2) loss function is introduced as the bounding-box loss, improving the regression accuracy of the target bounding box through dynamic scale-sensitive factors and geometric constraints. Experiments on the KITTI dataset demonstrate that DMP-YOLO achieves significant improvements across all key performance metrics, with mAP@0.5 reaching 89.0% (a 2.6% improvement over the baseline YOLOv8) and a 2.9% improvement in mAP@0.5:0.95, providing an effective solution for high-precision real-time detection in autonomous driving scenarios.
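
    For readers unfamiliar with the metric names, the sketch below shows plain Intersection over Union (IoU) and the 0.5 threshold behind mAP@0.5. It is the standard IoU, not the PIoUv2 loss from the paper; the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a hit at mAP@0.5 when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

    mAP@0.5:0.95 repeats the same matching at thresholds from 0.5 to 0.95 in steps of 0.05 and averages the resulting precision values.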

  • Signal and Information Processing
  • Shuo CHANG , Shun XU , Meiying WEI
    doi: 10.3969/j.issn.1003-3106.2025.11.003

    Modulation recognition is a critical task in wireless communications. Although deep learning methods have achieved remarkable progress in this field, they still face the challenge of insufficient generalization ability in complex non-cooperative environments, particularly when confronted with varying channel conditions, which can obscure the subtle discriminative features between structurally similar modulation schemes (e.g., 16QAM and 64QAM) and thus degrade recognition performance. To address this challenge, an unsupervised adversarial domain adaptation method named Feature Alignment and Discrimination Domain Adaptation (FADDA) is proposed. The core of FADDA is the introduction of a contrastive learning-based feature alignment loss on top of adversarial training. Adversarial training is responsible for learning domain-invariant features to adapt to channel variations, while the feature alignment loss enhances the model's ability to distinguish between easily confused modulation types by explicitly reinforcing the compactness of intra-class features and the separability of inter-class features. Experimental results show that, without target-domain labels, this method significantly improves the model's cross-channel modulation recognition performance and demonstrates strong generalization ability.

  • Signal and Information Processing
  • Dan BO , Kai WANG , Yunsheng LIU , Shubin WANG
    doi: 10.3969/j.issn.1003-3106.2025.11.004

    To solve the problem that Automatic Modulation Recognition (AMR) is limited by small-sample data and insufficient fusion of time-frequency multimodal information in practical applications, which in turn leads to low recognition accuracy, the limitations of existing technologies in the AMR field are analyzed and a cross-modal self-supervised learning framework integrating a diffusion model and a contrastive learning mechanism is proposed. By introducing the diffusion model, the framework leverages its generative capability to achieve high-quality data synthesis and augmentation of communication signals, effectively alleviating the constraints of small-sample data on model training. Meanwhile, combined with the cross-modal contrastive learning mechanism, it constructs an inter-modal association learning module to fully explore and utilize the inherent correlations and complementary information between different time-frequency modal representations, thus solving the problem of insufficient multimodal information fusion. Finally, based on the above design, a Diffusion-Contrastive Hybrid Network (DCHN) model is established. Experimental results show that the recognition accuracy of this model on the RML2016.10a dataset is significantly higher than that of other network models, indicating that it possesses excellent recognition capability.

  • Signal and Information Processing
  • Peng PENG , Cifa CHEN , Shang ZHANG
    doi: 10.3969/j.issn.1003-3106.2025.11.005

    A ship infrared image target detection algorithm based on YOLO11n, named AGT-YOLO, is proposed to address the issues of low model accuracy and recall rate, difficulties in identifying small targets, and multi-scale recognition challenges under complex sea conditions. By introducing an improved GhostHGNetv2 network, the background discrimination capability is enhanced; the designed ASFP2 optimized neck network improves detection capabilities for low-resolution images and very small targets; the proposed Task Adaptive Alignment Detection Head (TAADH) replaces the original detection head, enhancing localization and classification performance; meanwhile, the AFGCAttention mechanism is integrated to improve global information processing capability and the model's generalization ability. Experimental results show that compared to the baseline model YOLO11n, AGT-YOLO achieves a 4.4% increase in recall rate and a 3.1% increase in mean average precision at IoU=0.5 (mAP@50), demonstrating strong multi-scale recognition capability and robustness in complex environments.

  • Signal and Information Processing
  • Zijian ZHOU , Qiang LIU
    doi: 10.3969/j.issn.1003-3106.2025.11.006

    An algorithm for weed recognition in beet fields based on an improved YOLOv11 model is proposed to address the problems of low efficiency, low accuracy, and missed detection of small targets in complex real-world scenarios. First, the PoolFormer module and AKConv module are introduced into the backbone network to enhance the model's ability to capture global semantic information, improving detection accuracy for low-resolution images and small objects. The AKConv module improves the feature extraction ability of the model for beets and weeds with irregular growth patterns by dynamically adjusting the convolution kernel parameters and shapes, while the PoolFormer module can effectively segment the edge features of beets and weeds that cover each other. Second, the High-level Screening Feature Pyramid Network (HS-FPN) module is added to the head network to enhance the efficiency of multi-scale fusion and improve the feature extraction efficiency and speed for beets and weeds during the seedling stage. Experiments show that the improved YOLOv11 model achieves increases of 6.9%, 7.8%, 7.9%, and 7.8% in precision, recall, mAP@0.5 and mAP@0.5:0.95, respectively, compared to the original model. The results show that this algorithm achieves a significant improvement in weed recognition in beet fields, providing a more feasible solution for detecting weeds in beet fields in complex scenarios.

  • TT&C, Remote Sensing and Navigation & Positioning
  • Zhicheng LYU , Shasha GAO , Yue ZHANG
    doi: 10.3969/j.issn.1003-3106.2025.11.007

    The BeiDou Satellite Navigation System (BDS) navigation receiver provides Positioning, Navigation, and Timing (PNT) and message communication functions, and has been widely used in various industries. Against the background of the smooth transition from the BDS-2 regional system to the BDS-3 global system, the influence of this transition on the service performance of the navigation receiver, and the corresponding countermeasures, are studied. The differences between BDS-2 and BDS-3 in signal type, signal system, constellation scale and service performance are compared, and the specific manifestations and state change trends of the BDS smooth transition are expounded. The impact of the smooth transition on the service performance of the navigation receiver is analyzed in detail, including navigation and positioning, message communication and anti-suppression-jamming ability. The simulation results show that the RDSS message communication service of the BDS-2 receiver (PRN01~37) can still be used normally during the smooth transition period. With the progressive retirement of BDS-2 satellites, the number of satellites available in space will gradually decrease from 33 to 18, the average number of satellites visible worldwide will decrease from 11.62 to 6.31, the average Geometric Dilution of Precision (GDOP) value will increase from 2.00 to 3.15, and the continuous availability of services will decrease from 93% to 46.46%, which will affect the positioning accuracy and service range. When the power of BDS-3 satellites is enhanced, the navigation receiver can obtain a 7~15 dB improvement in anti-suppression-jamming ability. According to the different application scenarios of the navigation receiver, corresponding countermeasures are given to weaken or eliminate the impact, so that the navigation receiver can continuously provide reliable services for users during the smooth transition of the BDS. The research results can provide a reference for the design, development and application of BDS navigation receivers.

  • TT&C, Remote Sensing and Navigation & Positioning
  • Fang WANG , Zhetao ZHANG
    doi: 10.3969/j.issn.1003-3106.2025.11.008

    The BeiDou-3 Global Navigation Satellite System (BDS-3) currently provides data on six frequencies, which offers more choices for Multi-frequency Carrier Ambiguity Resolution (MCAR). Focusing on BDS-3, the basic method of the Geometry and Ionosphere Free (GIF) model in MCAR is comprehensively studied, including the application of the three-frequency GIF model in ambiguity resolution. Based on the theory of three-frequency linear combinations, the basic mathematical model for three frequencies is given. The optimal frequency combination for ambiguity resolution using the GIF model is discussed across the 20 possible combinations of any three of the six frequencies. The optimal linear combination for each frequency combination is also systematically discussed. In addition, the high-quality linear combinations for single-epoch ambiguity resolution using the GIF model are analyzed. The experiment is carried out using real BDS-3 six-frequency data. Through theoretical analysis and practical demonstration, the results show that when using the GIF model for ambiguity resolution, the optimal frequency combination is (B1C, B3I, B2a). In this method, if the sum of the coefficients of the two fixed Extra Wide Lane/Wide Lane (EWL/WL) combinations equals zero, the standard deviation of the ambiguity for the third Narrow Lane (NL) combination is theoretically dependent solely on the frequency characteristics. However, due to the influence of unmodeled errors, the actual results may deviate from theoretical expectations. The GIF model effectively eliminates ionospheric delay effects and avoids Geometry Base (GB) errors, demonstrating significant advantages. Ambiguity resolution based on the GIF model shows strong potential, particularly in ionosphere-active environments and medium-to-long baseline scenarios.
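
    The count of 20 frequency triples and the notion of a wide-lane combination can be checked with a short sketch. The signal labels and nominal centre frequencies below are the publicly documented BDS-3 values as best recalled here, and the coefficient set `(0, 1, -1)` is only an illustrative extra-wide-lane example, not a combination taken from the paper.

```python
from itertools import combinations

SPEED_OF_LIGHT = 299_792_458.0  # m/s

# Nominal BDS-3 carrier frequencies in MHz (labels are illustrative).
BDS3_MHZ = {
    "B1I": 1561.098,
    "B1C": 1575.42,
    "B3I": 1268.52,
    "B2b": 1207.14,
    "B2a+b": 1191.795,
    "B2a": 1176.45,
}

# Every way of choosing three of the six frequencies: C(6, 3) = 20.
TRIPLES = list(combinations(BDS3_MHZ, 3))


def combined_wavelength(names, coeffs):
    """Wavelength (m) of the carrier combination sum(coeff_i * phase_i)."""
    freq_hz = sum(c * BDS3_MHZ[n] for n, c in zip(names, coeffs)) * 1e6
    if freq_hz <= 0:
        raise ValueError("combination has non-positive frequency")
    return SPEED_OF_LIGHT / freq_hz


# Example: B3I minus B2a on the (B1C, B3I, B2a) triple, roughly a 3.26 m wavelength.
ewl = combined_wavelength(("B1C", "B3I", "B2a"), (0, 1, -1))
```

    Long combined wavelengths such as this one are what make EWL/WL ambiguities comparatively easy to fix before the narrow-lane step.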

  • TT&C, Remote Sensing and Navigation & Positioning
  • Siyu YUAN , Guoqin KANG , Xueqiang ZHENG , Qiangqiang ZHOU
    doi: 10.3969/j.issn.1003-3106.2025.11.009

    With the development of technologies such as artificial intelligence, multi-agent systems (e.g., unmanned aerial vehicle swarms) have been increasingly applied in practical combat operations. The Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, designed to solve coordination problems among multiple agents in cooperative environments, has become one of the mainstream algorithms in the multi-agent field owing to its Actor-Critic framework. To address the problems in multi-agent collaborative tasks during command and decision-making, including ambiguous role division and slow convergence of the algorithm's policy caused by information overload, an improved MADDPG algorithm incorporating a Dynamic Role Attention (DRA) mechanism, namely DRA-MADDPG, is proposed. This algorithm embeds a DRA module into the Actor-Critic framework, and optimizes the division of labor and collaboration by dynamically adjusting the attention weights of each agent towards peers with different roles. Specifically, the role set (reconnaissance, assault, command) and phase division (exploration→execution→encirclement) for command tasks are defined, and on this basis, a role coordination matrix and phase adjustment coefficients are constructed. A DRA module is designed in the Critic network to calculate weights and filter key information by leveraging role relevance and task phases. Additionally, the Actor network is improved to generate targeted actions by integrating role responsibilities. Simulation experiments show that compared with MADDPG, the Area Under the Curve (AUC) of the cumulative training reward of DRA-MADDPG increases by 2.4%, and the task completion time decreases by 19.3%. Furthermore, comparative analysis of training reward curves reveals that DRA-MADDPG exhibits better learning efficiency in short-term training. This demonstrates that the method is suitable for complex command and decision-making scenarios and provides a relatively efficient solution for multi-agent coordination.

  • Electromagnetic Field and Microwave
  • Kang LIU , Yingshu WANG , Binyang YAN , Guanghui ZHANG , Ping WANG , Zhihong YE
    doi: 10.3969/j.issn.1003-3106.2025.11.010

    To meet the requirements of half/full duplex communication systems, a dual-frequency omnidirectional dual-circularly-polarized Multiple Input Multiple Output (MIMO) antenna based on magnetic-electric dipoles is proposed. It consists of four identical antenna elements symmetrically arranged to form a 2×2 MIMO array, and each antenna element is composed of a circular planar waveguide, four folded electric dipoles, and eight parasitic electric dipoles. The slits on the circular planar waveguide and short-circuit cylinders are employed to form four magnetic dipoles on the opening side of the circular planar waveguide, and their radiated E-field is orthogonal to the E-field of the electric dipole. When a 90° phase difference between the magnetic dipole and the electric dipole is realized due to their spatial distance, a left-handed or right-handed circularly polarized wave with 360° coverage can be produced. The results demonstrate that each antenna element can radiate a right-handed circularly polarized wave in the low band (2.42~2.47 GHz) and a left-handed circularly polarized wave in the high band (5.76~5.85 GHz), its Axial Ratio (AR) is less than 3 dB, and the gain fluctuation is less than 2.1 dB and 7.3 dB, respectively. Moreover, the isolation is lower than -30 dB in the low band and -50 dB in the high band, respectively, owing to the symmetrical distribution of adjacent elements.

  • Electromagnetic Field and Microwave
  • Cui WANG , Dan WU , Kui WANG
    doi: 10.3969/j.issn.1003-3106.2025.11.011

    To address the issues of insufficient low-frequency coverage and oversized antennas in radio monitoring systems, a miniaturized ultra-wideband receiving antenna based on an improved dipole structure is proposed. The design employs meandering techniques to bend the dipole arms (dimensions: 38.6 mm×134.1 mm×0.8 mm), integrating symmetrical parasitic elements and slotted structures to optimize current distribution and extend bandwidth. CST simulations and measurements demonstrate that the antenna achieves S11<-6 dB across the 0.7~0.96 GHz and 1.3~5.3 GHz bands, covering standards such as GSM, DCS-1800, WLAN, WiMAX, and 5G NR n77/n78/n79. The radiation efficiency reaches 96.42% at 2.2 GHz and 95.63% at 4.4 GHz, with 85.4%±6.8% average efficiency in the 5G Sub-6 GHz bands. The maximum gain of (4.23±0.54) dBi (3.3~5.0 GHz) surpasses conventional dipoles by 1.8 dBi. This structural innovation resolves the trade-off between low-frequency coverage and miniaturization, enabling multi-standard communication monitoring.

  • Electromagnetic Field and Microwave
  • Zhongwei LI , Runpeng DONG
    doi: 10.3969/j.issn.1003-3106.2025.11.012

    A dual-band dual-circularly-polarized symmetric array antenna component is proposed to address the issue of separate receiving and transmitting antennas in traditional TT&C applications. A symmetrical array antenna is adopted to achieve dual-band operation, a 3 dB bridge is used to realize dual circularly polarized radiation, and a duplexer is stacked below the antenna to achieve high isolation between the receiving and transmitting frequency bands. Following simulation analysis of this design concept, an antenna prototype is manufactured and tested. The results show that the designed antenna can operate in frequency bands covering 1.75~1.85 GHz and 2.2~2.4 GHz, and the axial ratio within these bands is less than 3 dB. The suppression between the receiving channel and the transmitting channel exceeds 70 dB. This provides a feasible design scheme for a shared transmit/receive antenna unit in practical engineering.

  • Electromagnetic Field and Microwave
  • Yingdong WANG
    doi: 10.3969/j.issn.1003-3106.2025.11.013

    A wideband radial power combiner based on Ridge Gap Waveguide (RGW) is designed for high-power combining applications across a wide millimeter-wave frequency band. The combiner consists of two metallic plates, with the center of the lower plate fed by a coaxial line. The energy is reflected by multiple metallic conical structures positioned above the plate and directed into the radial transmission lines, achieving equal power division. The radial transmission lines employ ridge gap waveguides, which are built by placing pin-type Electromagnetic Band-Gap (EBG) structures on the metal bottom plate adjacent to the ridge, effectively reducing coupling among adjacent channels without the need to weld the upper and lower plates. Simulation results show that the combiner operates within the frequency range from 14.7 to 37.5 GHz, with a reflection coefficient of less than -15 dB and a transmission coefficient of approximately -6.1 dB. The measurement results show good agreement with the simulation results.

  • Engineering & Application
  • Tingyu YUAN , Kai LIU , Biaoliang GUAN , Wen YE , Yacui ZHAO , Chaoyang ZHAO , Jinqiao WANG
    doi: 10.3969/j.issn.1003-3106.2025.11.014

    Vision-Language-Action (VLA) models are a core technology for achieving general embodied artificial intelligence, aiming to integrate visual perception, language understanding, and action decision-making within a unified end-to-end framework. The current research status and development trajectory of VLA models are systematically reviewed. The theoretical origins of VLA models are traced, and the paradigm shift from modular designs to unified architectures is clarified. Along the evolutionary path of VLA, representative works such as SpatialVLA, TLA, and GR00T N1 are presented with a focus on multimodal fusion and cognitive hierarchies. A detailed taxonomy of VLA models is constructed along two key dimensions: macro architecture and system hierarchy. Key technologies and design principles are analyzed in depth, ranging from pioneering works such as RT-1, to models introducing large-scale knowledge transfer such as RT-2, OpenVLA, and ECOT, and further to cutting-edge dual-system architectures such as Helix, OpenHelix, DexVLA, and DexGraspVLA. Mainstream simulation environments, core datasets, and benchmarks supporting VLA research are systematically collected and reviewed. The application status and prospects of VLA models in robotic manipulation, autonomous navigation, and industrial automation are explored. Core challenges in current VLA research are analyzed, including generalization and data efficiency, long-horizon task planning, and real-time responsiveness. Future research directions are discussed, including integration with world models and enhancement of data efficiency.

  • Engineering & Application
  • Mingrong LI
    doi: 10.3969/j.issn.1003-3106.2025.11.015

    The rising incidence of skin lesions has made early screening for skin cancer increasingly critical. However, existing methods for skin lesion image segmentation often suffer from limitations in channel-wise information modeling, structural adaptability, and feature fusion, which can lead to inaccurate boundary delineation and insufficient utilization of crucial contextual information. To address these issues, a skin lesions image segmentation method based on attention mechanism and wavelet transform, termed AW-SkinNet, is proposed. The proposed approach employs a dual-branch collaborative attention module to extract spatial and channel-dependent features, integrates wavelet transform to enhance frequency-domain representations, and incorporates lightweight attention-guided sub-pixel upsampling to improve detail restoration and contextual understanding. Experimental results on the ISIC-2017 and ISIC-2018 skin lesion segmentation datasets demonstrate that the proposed method achieves higher segmentation accuracy compared with existing approaches for skin lesion image segmentation.

  • Engineering & Application
  • Wenjuan YAN , Zhijun GUAN , Yingyi TONG , Su LIU
    doi: 10.3969/j.issn.1003-3106.2025.11.016

    To solve the problems of tracking and positioning airborne early warning aircraft and other long-range targets, a high-precision passive positioning strategy based on the collaborative networking of UAVs and passive radar is proposed. By analyzing the impact of the layout of multi-station passive radar on positioning accuracy, a layout strategy that can improve positioning accuracy and reduce layout costs is designed. The Time Difference of Arrival (TDOA) positioning method is applied in the study, and on this basis, the Geometric Dilution of Precision (GDOP) for multiple stations is analyzed to evaluate the impact of the layout form on positioning accuracy. A collaborative strategy is proposed that, when the initial direction of the target is unknown, first uses a star-shaped layout for initial positioning and then switches to an inverted triangular layout for high-precision secondary positioning. The optimal secondary stations are selected using the “virtual structure method” and the flight trajectory of the UAV is optimized using an improved Particle Swarm Optimization (PSO) algorithm to achieve a high-precision layout. Simulation results show that this strategy can significantly improve positioning accuracy. Compared with traditional passive radar systems, positioning error is significantly reduced, and system response speed is faster. The research results have certain application value in practice.
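
    The TDOA measurement that underlies the layout analysis can be sketched in a few lines. This is only an illustration of the measurement model, not the paper's positioning or optimization method; the interpretation of a "star-shaped layout" as one central station with outer stations evenly spaced on a circle, and the helper names `star_layout` and `tdoa`, are assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def star_layout(centre, radius, n_outer=3):
    """Assumed star layout: a reference station at the centre, others on a circle."""
    cx, cy = centre
    outer = [
        (cx + radius * math.cos(2 * math.pi * i / n_outer),
         cy + radius * math.sin(2 * math.pi * i / n_outer))
        for i in range(n_outer)
    ]
    return [centre] + outer


def tdoa(target, stations):
    """Arrival-time differences (s) of each station relative to stations[0]."""
    dists = [math.dist(target, s) for s in stations]
    return [(d - dists[0]) / SPEED_OF_LIGHT for d in dists[1:]]
```

    Each TDOA value constrains the target to one hyperbola; how sharply those hyperbolas intersect depends on the station geometry, which is what the GDOP analysis quantifies.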

  • Engineering & Application
  • Rui DAI , Hongxin ZHANG
    doi: 10.3969/j.issn.1003-3106.2025.11.017

    The application value of Brain-Computer Interface (BCI) and human-machine integration technology in the fields of UAV control and countermeasure equipment operation is explored, the problems faced by these technologies are analyzed, and targeted solutions are proposed to promote their rational application in the development of the low-altitude economy and security protection. By analyzing the role of BCI technology in improving UAV control efficiency and enhancing the accuracy of countermeasure equipment operation, and combining the new application scenarios that BCI technology shapes for the low-altitude economy and the new form of low-altitude security protection it creates, the problems faced by these technologies are sorted out, such as inherent defects of human-machine integration technology, information security risks, and their impacts on low-altitude security, and corresponding countermeasures are further put forward. BCI technology plays a significant role in the fields of UAV control and countermeasure equipment operation: it can improve UAV control efficiency and enhance the accuracy of countermeasure equipment operation. Based on BCI and human-machine integration technology, new application scenarios for the low-altitude economy have been shaped, and a new form of low-altitude security protection has been created.

  • Engineering & Application
  • Yanqiao CHEN , Qiuyang ZHANG , Xiaolong ZHANG , Jianyong YANG , Xinghua CHAI , Yang SU
    doi: 10.3969/j.issn.1003-3106.2025.11.018

    To solve the problem of stable information transmission for unmanned systems in complex electromagnetic environments, an information elastic adaptation method for unmanned systems based on communication quality assessment is proposed. Signal strength, bit error rate, and signal-to-noise ratio at the communication link layer are selected as characteristic parameters for communication quality assessment. A Long Short-Term Memory (LSTM) network serves as the signal prediction model to estimate future signal parameters, while a Support Vector Regression (SVR) model is employed to evaluate real-time communication quality. The communication quality level is obtained from the evaluation model, and content of the corresponding level is transmitted according to that quality level. Simulations using real-world data and field tests demonstrate that the proposed method ensures reliable information transmission for unmanned systems in highly dynamic and contested electromagnetic environments.

  • Engineering & Application
  • Ding CHEN , Zhiyang CHEN , Jiahong XU , Yong CHEN , Cheng JU
    doi: 10.3969/j.issn.1003-3106.2025.11.019

    To address the communication technology requirements for online monitoring and digital operation & maintenance of high-voltage transmission lines, as well as the shortcomings of existing communication methods such as optical fiber and 4G/5G in terms of adaptability to complex environments, coverage integrity, and cost control, the characteristics and requirements of current communication methods for high-voltage transmission lines are analyzed. Relying on a National Key Research and Development Plan project, a solution for a highly reliable broadband ultra-multi-hop wireless ad hoc network communication system is studied and proposed, which overcomes the technical issue of a sharp decline in quality of service after multi-hop wireless transmission. A secure broadband ultra-multi-hop wireless ad hoc network communication system is constructed, realizing long-distance broadband service transmission with Quality of Service (QoS) assurance. The constructed system is simulated and tested through the OMNeT++ simulation platform, 9-node outdoor field tests, and on-site operation on the 220 kV Binxing First Line of State Grid Tianjin. The simulation and test results show that the system can achieve 50-hop broadband wireless data transmission with an end-to-end traffic of no less than 2 Mb/s. Compared with traditional technical routes, this system features stronger technical adaptability and lower operation and maintenance costs. It enhances the digital operation and maintenance level of power grids and provides a reliable solution for the construction of communication networks in new-type power systems.

  • Engineering & Application
  • Qian MA , Gang WANG , Shutong LIU , Jinyong CHEN
    doi: 10.3969/j.issn.1003-3106.2025.11.020

    To address the inadequacy of multimodal data fusion and the complexity of dynamic constraint optimization in satellite mission requirement decision-making, an intelligent decision model is designed to enhance automation and accuracy. The proposed Retrieval-Augmented Generation (RAG)-based optimization model for satellite mission planning comprises: ① an input layer receiving multimodal data such as user requirement texts and geospatial coordinates; ② a processing layer integrating a Transformer-architecture Large Language Model (LLM) with vector databases to enable semantic retrieval and knowledge augmentation; ③ a constraint verification module in the output layer generating feasible solutions; ④ a feedback layer dynamically updating the knowledge base. Experimental validation demonstrates 90% decision accuracy, with absolute accuracy improvements of 20% and 9.8% over conventional Rule-Based Expert Systems (RBES) and Machine Learning Models (MLM), respectively. The model significantly enhances adaptability in satellite mission decision-making, enables efficient resource allocation under dynamic constraints, and exhibits substantial engineering applicability.
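
    The semantic retrieval step of a RAG pipeline can be sketched as a nearest-neighbour search over embedding vectors. This is a generic illustration, not the model from the paper: the in-memory `store`, the toy three-dimensional vectors, and the document identifiers are all hypothetical, and a real system would use an embedding model and a vector database instead.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def retrieve(query_vec, store, k=2):
    """Return the ids of the k stored entries most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical knowledge-base entries as (id, embedding) pairs.
store = [
    ("orbit_rules", [1.0, 0.0, 0.0]),
    ("weather_notes", [0.0, 1.0, 0.0]),
    ("payload_limits", [0.9, 0.1, 0.0]),
]
top = retrieve([1.0, 0.0, 0.0], store)
```

    The retrieved entries would then be placed in the LLM prompt as context before a constraint check is applied to the generated plan.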