Autonomous vehicles with self-evolution capabilities are expected to improve their performance through learning algorithms and to adapt automatically to the external environment. However, because real traffic environments are effectively infinite, complex, and variable, quantitative indicators of scenario difficulty are needed so that targeted scenarios can be generated, evolution can proceed gradually, and the algorithm can quickly approach its performance limit. Therefore, this paper proposes a data-driven method for the quantitative representation of scenario difficulty. Specifically, the concept of an environment agent is proposed, and a reinforcement learning method combined with mechanism knowledge is constructed for policy search to obtain an agent with adversarial behavior. The model parameters of the environment agent at different stages of training are extracted to construct a policy group, yielding agents with different adversarial intensities, which are used to generate data for scenarios of different difficulty in a simulation environment. Finally, a data-driven quantitative representation model of scenario difficulty is constructed, which outputs the environment agent policy for a given difficulty. Experimental results show the effectiveness of the proposed method. The result analysis shows that the proposed algorithm can generate reasonable, interpretable scenarios with high discrimination and can provide a quantifiable difficulty representation without any expert-designed logic rules. Compared with rule-based discrete scenario difficulty representation methods, the proposed algorithm achieves continuous difficulty representation. The video link is https://www.youtube.com/watch?v=GceGdqAm9Ys.
- This paper proposes the concept of an environment agent, combines mechanism knowledge with the RL method to achieve efficient policy search, and obtains agent policies with adversarial behaviors.
- This paper proposes a data generation method for scenarios of varying difficulty. It uses policy groups built from model parameters at different stages of training to provide information on the quantitative dimension of scenario difficulty.
- This paper proposes a data-driven quantitative representation model of scenario difficulty and shows through result analysis that it can generate highly distinguishable scenarios with reasonable and quantifiable difficulty representations.
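The policy-group idea in the contributions above can be illustrated with a minimal sketch. It is not the paper's implementation: the `PolicyGroup` class, the `train_and_collect` loop, and the single `aggressiveness` parameter are hypothetical stand-ins, and the linear mapping from a difficulty value to a snapshot index is an assumption made only for illustration (the paper learns this mapping from data).

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PolicyGroup:
    """Snapshots of environment-agent parameters taken during training."""
    steps: List[int] = field(default_factory=list)
    params: List[Dict[str, float]] = field(default_factory=list)

    def add_snapshot(self, step: int, model_params: Dict[str, float]) -> None:
        # Deep-copy so later training updates do not mutate the stored snapshot.
        self.steps.append(step)
        self.params.append(copy.deepcopy(model_params))

    def select(self, difficulty: float) -> Dict[str, float]:
        """Return the snapshot for a difficulty in [0, 1].

        ASSUMPTION: later snapshots are more adversarial, so difficulty is
        mapped linearly onto snapshot order. The paper instead uses a
        data-driven model for this step.
        """
        if not self.params:
            raise ValueError("policy group is empty")
        if not 0.0 <= difficulty <= 1.0:
            raise ValueError("difficulty must be in [0, 1]")
        index = round(difficulty * (len(self.params) - 1))
        return self.params[index]


def train_and_collect(total_steps: int, snapshot_every: int) -> PolicyGroup:
    """Toy loop standing in for the RL policy search (hypothetical)."""
    group = PolicyGroup()
    model_params = {"aggressiveness": 0.0}  # hypothetical single parameter
    for step in range(1, total_steps + 1):
        # Placeholder update; a real trainer would apply a gradient step here.
        model_params["aggressiveness"] += 1.0 / total_steps
        if step % snapshot_every == 0:
            group.add_snapshot(step, model_params)
    return group


if __name__ == "__main__":
    group = train_and_collect(total_steps=100, snapshot_every=25)
    print(group.steps)
    print(group.select(0.0), group.select(1.0))
```

Storing deep copies keeps each stage of training available as a separate agent, so the simulator can load a weaker or stronger adversary without retraining.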
1. Overtaking phase: the environment agent accelerates to reach the front of the ego vehicle so that it has an opportunity to influence the ego vehicle. At this point, state has the highest contribution.
2. Cut-in phase: the environment agent, now in front of the ego vehicle, generates as much adversarial behavior as possible by controlling its longitudinal movement. At this point, states and have the highest contribution.
3. Maintenance phase: the environment agent is in confrontation with the ego vehicle. If there is any lateral offset between the environment agent and the ego vehicle, the environment agent must block the ego vehicle and continue to produce adversarial behaviors. At this stage, state has the highest contribution.
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |