To address the strong nonlinearity, high uncertainty, and rapidly time-varying parameters that characterize the reentry phase of high-speed vehicles, this study proposes an end-to-end intelligent attitude control method based on an improved Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, in line with the demands of intelligent spacecraft development. To overcome the training instability and convergence difficulties of TD3-based attitude control learning, two key innovations are introduced. First, a hybrid reward mechanism that combines continuous tracking-error penalties with sparse task-completion rewards is designed within the Markov Decision Process framework to jointly guide agent convergence. Second, prior-knowledge constraints derived from modern control theory are incorporated into the training process through a behavior cloning-based optimization strategy for the Actor network, which balances imitation of expert experience against cumulative reward maximization. Simulation results show that the proposed method accurately tracks the three-channel attitude commands under 14 combinations of parameter deviations.
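The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the tolerance `tol`, the weights `w_track` and `bc_weight`, the bonus value, and the exact form of both formulas are assumptions chosen for illustration. The reward adds a dense penalty on the three-channel attitude error to a sparse bonus granted once every channel is within tolerance, and the Actor loss adds a behavior cloning term to the usual negative Q-value objective.

```python
import math


def hybrid_reward(att_error, tol=0.5, w_track=1.0, bonus=10.0):
    """Dense tracking penalty plus a sparse task-completion bonus.

    att_error: per-channel attitude tracking errors (hypothetical units).
    All default values are illustrative assumptions, not values from the study.
    """
    # Continuous penalty: negative Euclidean norm of the tracking error.
    dense = -w_track * math.sqrt(sum(e * e for e in att_error))
    # Sparse bonus: granted only when every channel is inside the tolerance.
    sparse = bonus if all(abs(e) < tol for e in att_error) else 0.0
    return dense + sparse


def bc_actor_loss(q_values, actor_actions, expert_actions, bc_weight=1.0):
    """Batch mean of -Q(s, pi(s)) + bc_weight * ||pi(s) - a_expert||^2.

    q_values: critic estimates for the Actor's actions, one per sample.
    actor_actions / expert_actions: lists of action vectors, one per sample.
    The expert actions are assumed to come from a conventional controller.
    """
    n = len(q_values)
    # Reinforcement learning term: maximize Q by minimizing its negative.
    rl_term = -sum(q_values) / n
    # Imitation term: squared distance between Actor and expert actions.
    bc_term = sum(
        sum((a - b) ** 2 for a, b in zip(pi, ex))
        for pi, ex in zip(actor_actions, expert_actions)
    ) / n
    return rl_term + bc_weight * bc_term
```

In this sketch, raising `bc_weight` pulls the Actor toward the expert controller, while lowering it lets the critic's value estimate dominate. How the study itself weights or schedules the two terms is not stated in the abstract.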
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |