To address shortcomings of Deep Reinforcement Learning (DRL) control algorithms for adaptive cruise control, namely insufficient adaptability to the environment and poor model transfer and generalization, this paper proposes a control algorithm based on Soft Actor-Critic (SAC), a stochastic off-policy method built on the maximum entropy principle. SAC networks are built to fit the action-value function and the policy function, and an automatically adjusted temperature coefficient is used to improve the agent's exploration of the environment. To deal with sparse rewards, the reward function is designed using reward shaping. In addition, a new experience replay mechanism is proposed to improve sample utilization. The proposed control algorithm is simulated and tested in different scenarios and compared with Deep Deterministic Policy Gradient (DDPG). The results show that the proposed algorithm achieves better model generalization and better transfer to real vehicles.
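The automatic temperature adjustment mentioned above can be illustrated with a minimal sketch of the standard SAC temperature update, which minimizes J(α) = -α (E[log π(a|s)] + H_target) by gradient descent on log α. This is not the paper's implementation: the function name `update_log_alpha`, the learning rate, and the target entropy of -1 (the common heuristic of minus the action dimension, here assuming a single acceleration command) are all assumptions made for illustration.

```python
def update_log_alpha(log_alpha, log_probs, target_entropy, lr=3e-4):
    """One gradient-descent step on the temperature, parameterized as log(alpha).

    Uses the common surrogate loss -log_alpha * (E[log pi] + target_entropy),
    where log_probs are log pi(a|s) values for actions sampled from the policy.
    """
    mean_log_prob = sum(log_probs) / len(log_probs)
    # Gradient of the surrogate loss with respect to log_alpha.
    grad = -(mean_log_prob + target_entropy)
    return log_alpha - lr * grad


# Assumption: one-dimensional action (acceleration), so target entropy = -1.
target_entropy = -1.0

# Policy entropy (about -2.0) is below the target: the temperature rises,
# which weights the entropy bonus more heavily and encourages exploration.
raised = update_log_alpha(0.0, [1.5, 2.0, 2.5], target_entropy)

# Policy entropy (about 0.0) is above the target: the temperature falls.
lowered = update_log_alpha(0.0, [-0.5, 0.0, 0.5], target_entropy)
```

In a full SAC agent this step would run once per training update alongside the critic and actor updates, with `log_probs` taken from the current minibatch.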
| 科 Family | 属数 Number of genera | 种数 Number of species | 占总种数比例 Percentage of total species (%) | 属 Genus | 种数 Number of species | 占总种数比例 Percentage of total species (%) |
|---|---|---|---|---|---|---|
| 鹅膏菌科 Amanitaceae | 2 | 11 | 5.26 | 鹅膏菌属 Amanita | 10 | 4.78 |
| 小菇科 Mycenaceae | 2 | 12 | 5.74 | 丝盖伞属 Inocybe | 5 | 2.39 |
| 多孔菌科 Polyporaceae | 8 | 14 | 6.70 | 蜡蘑属 Laccaria | 5 | 2.39 |
| 红菇科 Russulaceae | 3 | 23 | 11.00 | 小皮伞属 Marasmius | 6 | 2.87 |
| | | | | 小菇属 Mycena | 11 | 5.26 |
| | | | | 光柄菇属 Pluteus | 5 | 2.39 |
| | | | | 红菇属 Russula | 17 | 8.13 |
| | | | | 栓菌属 Trametes | 5 | 2.39 |