Recent Advances in Interactive Driving of Autonomous Vehicles: Comprehensive Review of Approaches

Yanwen Yang¹, Natnael M. Negash¹, James Yang¹

Automotive Innovation | 2025, 8(2) : 304 - 334

Affiliations

¹ Texas Tech University Department of Mechanical Engineering Lubbock TX 79409 USA

doi: 10.1007/s42154-024-00332-w

Abstract

Interactive autonomous driving is an evolving research domain that demands an autonomous vehicle (AV) to exhibit adaptability to new environments, cognizance of surrounding traffic conditions, and proficient decision-making ability in complex human-dominated scenarios to guarantee safe navigation and promote social compatibility. This paper reviews the diverse methodologies utilized in interactive driving for AVs. Various techniques are investigated for their unique contributions and capabilities in developing AV systems, such as long short-term memory (LSTM), transformer, artificial potential field (APF), game theory, reinforcement learning (RL)/deep reinforcement learning (DRL), and partially observable Markov decision processes (POMDP), among others. Recent advancements based on these methodologies are summarized to elucidate their application rationale in interactive driving scenarios. The strengths and challenges inherent to each approach within the context of interactive driving are further assessed. Additionally, the resolution of these challenges is explored through integrating different methods. The resulting comparative analysis offers crucial perspectives for advancing autonomous driving technologies. This review exclusively focuses on the interactions between AVs and human-driven vehicles (HDVs).

Key words

Interactive driving / Autonomous vehicle / Inter-vehicle interactions / Trajectories prediction / Decision-making / Behavior planning

Cite this Article

Yanwen Yang, Natnael M. Negash, James Yang. Recent Advances in Interactive Driving of Autonomous Vehicles: Comprehensive Review of Approaches[J]. Automotive Innovation, 2025, 8(2): 304-334. DOI: 10.1007/s42154-024-00332-w

Abbreviations

ABT	Adaptive belief tree
AI	Artificial intelligence
APF	Artificial potential field
API	Application programming interface
AUC-ROC	Area under the receiver operating characteristic curve
AVs	Autonomous vehicles
BDD-OIA	Berkeley deep drive object induced actions
Bi-LSTM	Bidirectional LSTM
BLEU	Bilingual evaluation understudy
CC-LSTM	Clustering convolution-LSTM
CGAN	Conditional GAN
C-IDM	Cooperative IDM
CRFs	Conditional random fields
DDPG	Deep deterministic policy gradient
DDPO	Deep deterministic policy optimization
DDQN	Double deep Q-network
DESPOT	Determinized sparse partially observable tree
DPG	Deterministic policy gradient
DRL	Deep reinforcement learning
DS	Driving score
EOT	Eye-on-traffic
EPSILON	Efficient planning system for AVs in highly interactive environments
FLV	Front left vehicle
FRV	Front right vehicle
FV	Front vehicle
GANs	Generative adversarial networks
GMM-HMM	Gaussian mixture model-hidden Markov model
GNN	Graph neural network
GNSS	Global navigation satellite system
HD	High definition
HDVs	Human-driven vehicles
IDM	Intelligent driver model
IS	Infraction score
KDE-NLL	Kernel density estimate-based negative log likelihood
LCF	Left lane-changing feasibility
LK	Lane-keeping
LLC	Left lane-changing
LSTM	Long short-term memory
LSTM-FIS	LSTM-fuzzy inference system
MAE	Mean absolute error
mAP	Mean average precision
maxACC	Maximum accuracy
MDP	Markov decision process
MDI-POMDP	Multi-modal driving intention POMDP
meanNLL	Average negative log-likelihood
ME-GAN	Map-enhanced GAN
minADE	Minimum average displacement error
minFDE	Minimum final displacement error
minSADE	Minimum self-attention distance error
minSFDE	Minimum self-feature distance error
MODIA	Multiple online decision-components with interacting actions
MPC	Model predictive control
MR	Miss rate
MSE	Mean square error
MVE	Maximum velocity error
OOPOMDP	Object-oriented POMDP
POMCP	Partially observable Monte-Carlo planning
POMDP	Partially observable Markov decision processes
PRDQN	Deep Q-network with prioritized replay
RC	Route completion
RCF	Right lane-changing feasibility
RDE	Relative displacement error
RL	Reinforcement learning
RLC	Right lane-changing
RLV	Rear left vehicle
RNN	Recurrent neural network
RRV	Rear right vehicle
RSS	Responsibility sensitive safety
RV	Rear vehicle
SAC	Soft actor-critic
SARSA	State–action–reward–state–action
SCR	Scene collision rate
TD3	Twin delayed DDPG
TP-EGT	Trajectory prediction network with an Enhanced graph transformer
TraCI	Traffic control interface
V2V	Vehicle-to-vehicle

1 Introduction

Driving induces stress due to the cognitive demands of operating a vehicle across various travel scenarios. These scenarios include navigating congested traffic, negotiating multi-modal networks necessitating complex coordination with adjacent vehicles, and contending with various driving behaviors. Conversely, autonomous vehicles (AVs) offer a potential solution to mitigate this stress by assuming most driving responsibilities, alleviating the cognitive burden on users, and ensuring a more comfortable travel experience [1, 2]. Moreover, the advent of AVs promises a transformative shift in the transportation landscape. These autonomous driving systems are expected to be deployed on public roads to enhance traffic flow and bolster road safety [3-5]. However, a long transitional period is anticipated before all human-driven vehicles (HDVs) can be replaced by AVs [6, 7]. As the emphasis on safety in autonomous driving grows, there is a heightened need for AVs to collaborate effectively with human drivers in their surroundings. This collaboration is vital to harnessing the advantages offered by autonomous driving technologies during the transitional phase. With the rapid advancement of artificial intelligence (AI) and communication/network technology, vehicles are expected to possess various capabilities, including adaptability to unexplored environments, awareness of surrounding traffic conditions, and adept decision-making in complex multi-agent scenarios to ensure safe navigation [8-13]. Equipped with cutting-edge sensors [14], machine learning algorithms [15, 16], and high-speed processing capabilities [17], AVs stand on the brink of transforming mobility, providing safer, more efficient, and more convenient transportation solutions [18-20]. However, the realization of this transformative potential hinges not only on technological progress but also on the crucial ability of AVs to interact with their dynamic environments seamlessly and safely.
This integration is pivotal for successfully incorporating AVs into road traffic [21]. Road transport involves complex interactions among road users competing for space and priority [22]. Modeling these interactions is vital for enhancing the efficiency and safety of autonomous driving, as they are prevalent in various traffic scenarios [23-26]. Particularly, greater emphasis is placed on HDVs, as their uncertain future positions pose significant safety risks for AVs. To mitigate the risk of collisions with these vehicles, AVs with interactive driving capabilities require enhanced environment perception [27]. This entails detecting relevant surroundings in the current environment, selecting optimal maneuvers through the decision-making module, and transmitting them to the control module [28-31].

Interactive driving behavior or capability is defined as the ability to handle and avert the severe consequences of driving conflicts with HDVs over shared space by predicting motion and behavior ahead of time, as shown in Fig. 1. In this situation, both a silver vehicle and a red vehicle aim to occupy the same spatial region simultaneously in the near future. In Fig. 1(a), neither vehicle considers the other’s behavior, so both proceed and collide. In Fig. 1(b), the red vehicle reduces its speed to allow the silver vehicle to change lanes. In Fig. 1(c), the silver vehicle opts to return to its previous lane to avoid a collision. The scenarios in Fig. 1(b) and (c) show the interactive driving capabilities of vehicles. The interactions between AVs and HDVs are defined as a scenario wherein the behavior of at least two road users may stem from the prospect that both intend to occupy the same spatial region simultaneously in the near future [32]. These interactions involve multifaceted communication and actions, necessitating comprehensive understanding and adaptation by the AV system. The interaction process of AVs in real traffic scenes exhibits distinct characteristics. Firstly, the actions of AVs are influenced by the behavior of surrounding vehicles, and, reciprocally, surrounding vehicles are affected by the actions of AVs. The positional definition pertaining to surrounding vehicles is shown in Fig. 2. Secondly, vehicles engage in cooperative maneuvers to prevent collisions and in competitive behaviors stemming from diverse driving strategies, resulting in complex and dynamic interactions [33]. In addition, AVs lack insight into the driving policies of human drivers, which must be inferred by observing HDVs’ actions [34]. Human drivers typically observe several vehicles simultaneously to make informed decisions [35].
Inspired by the nuanced anticipation observed in human drivers as they navigate traffic, researchers have embarked on extensive studies to investigate the driving behavior of AVs. These inquiries center on AVs collecting driving data from nearby vehicles, empowering them to choose maneuvers that mitigate risks and enhance operational efficiencies [36].

Detecting the real-time changes of dynamic objects involves determining the object’s current position and forecasting its future position, considering its beliefs and preferences [37, 38]. Predicting vehicle trajectories in dynamic and highly interactive environments is crucial for advancing autonomous driving capabilities [39]. AVs utilize a variety of sensors to perceive their surroundings and make intelligent decisions in real time, thereby adjusting their motion accordingly [40-42]. Enhancing automated systems with the capability to predict the movements of nearby vehicles, including lane-keeping and lane-changing behaviors, is essential. This advancement significantly improves driving performance regarding safety, comfort, and environmental sustainability [43-45]. In autonomous driving, the precision of predicting surrounding vehicles’ trajectories significantly impacts the subsequent planning of AVs [46]. However, accurately forecasting a vehicle’s future trajectory based solely on past states, such as direction and speed, proves to be challenging. This complexity arises from the road conditions and the influence of surrounding vehicles on a vehicle’s behavior. In a real-world traffic environment, each agent has its own time-varying dynamic traits and interacts differently with others depending on the road structures and different surrounding vehicles. Therefore, similar inputs may produce varying prediction outcomes [47-48]. Furthermore, diverse planning strategies adopted by AVs significantly influence the trajectories of surrounding vehicles, leading to notable variations in future trajectory predictions. The accuracy of trajectory prediction in the AV’s planning phase can be improved by accounting for the interactions between surrounding vehicles and the AV’s planning. 
This interactive approach to planning and prediction enhances the utility of trajectory prediction results for planning purposes, thereby bolstering the safety of autonomous driving tasks [49].

Behavior planning poses a significant challenge, as vehicles must adhere to traffic regulations while averting collisions with other vehicles [50-53]. It encompasses generating the desired trajectory and velocity from the initial position to the destination and entails the decision-making process for selecting the target maneuver [54, 55]. Traditional trajectory planning typically involves optimizing a designated payoff function. The payoff function is defined as a mathematical expression representing the reward given to a single player at the end of the game. However, each vehicle’s payoff function is contingent upon its own actions and those of all other vehicles in the preceding stage. This leads to exponentially increasing computational complexity as the number of vehicles increases. Unfortunately, current studies overlook the shared information between interactions across varied traffic scenarios, hindering the reuse of acquired interaction data for diverse interaction-related tasks [23]. In addition, adjusting the payoff function for various scenarios contributes to repetitive and time-consuming model optimization. Ensuring driving safety, comfort, and traffic efficiency requires interacting with other vehicles by predicting their driving intentions [56].
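The exponential growth in complexity noted above can be made concrete with a simple counting argument. This is an illustrative assumption, not a formulation taken from the cited works: if each of $N$ vehicles chooses from the same finite maneuver set $\mathcal{A}$, every vehicle's payoff must in principle be evaluated over all joint action profiles, whose number is

$ \left|\mathcal{A}\right|^{N} $

With three maneuvers per vehicle (for example, left lane-changing, right lane-changing, and lane-keeping), 2 vehicles give 9 joint profiles, 5 vehicles give 243, and 10 vehicles give 59049.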

Driving intention denotes the driver’s behavioral preferences preceding actual driving actions, with significant implications for vehicle movement [57, 58]. The motion planning algorithm for AVs necessitates human-like behavior, relying on inferring the intentions of surrounding traffic participants and maintaining situational awareness of the driving environment [59]. Therefore, under complex traffic conditions, establishing safety constraints for trajectory planning, informed by real-time trajectory predictions of nearby vehicles, emerges as pivotal for ensuring the safety and efficiency of AVs.

Despite significant advancements in autonomous driving, several challenges in the field require resolution [60]. Modeling interactive behavior enhances the prediction of human drivers’ intentions and motions. In previous literature reviews, Schwarting et al. [61] reviewed game-theoretic, probabilistic, POMDP, and learning-based approaches to outline emerging trends and challenges within perception, planning, and decision-making for AVs. Wang et al. [62] introduced a variety of approaches for modeling and learning social interaction between AVs and HDVs, including rational utility-based, deep neural networks-based, graph-based, social fields and forces, and computational cognitive approaches. Malik et al. [63] and Abdallaoui [64] categorized decision-making approaches for AVs, while Reda et al. [65] and Song et al. [66] summarized the motion planning methods for autonomous driving. However, recent research that contributes to the advancement of interactive driving has not been included in the previous reviews. Although Wang et al. [62] provided the underlying ideas and principles of modeling interactions, they omitted discussions related to the improvements and applications of interactive driving processes. This paper comprehensively reviews trajectory prediction, decision-making, and behavior planning in interactive driving to address the research gap. The key contributions are outlined as follows:

• This paper summarizes recent advances in long short-term memory (LSTM), transformer, artificial potential field (APF), game theory, reinforcement learning (RL), partially observable Markov decision processes (POMDP), and other approaches in modeling the interaction between AVs and HDVs. These summaries explain the rationale behind applying these approaches in interactive driving.

• This paper evaluates the strengths and challenges associated with each approach employed in interactive driving. Additionally, we explore how to resolve these challenges by integrating different methods, thereby providing valuable insights for future research directions in autonomous driving technology.

The remainder of this paper is organized as follows. Section 2 details the search methods employed to identify relevant papers concerning interactive driving, focusing on interactions between AVs and HDVs. Section 3 discusses the advances of various approaches. Section 4 presents a comparative summary of the advantages and disadvantages of each approach in interactive driving. Several directions for future research are then outlined, and potential enhancements through method integration are explored. Finally, conclusions are drawn in Sect. 6. Figure 3 shows the technology roadmap.

2 Paper Search Methods

This literature review aims to give an overview of the existing body of research on interactive driving systems, focusing on their approaches, inter-vehicle interactions, and specific traffic scenarios encountered. The search spanned publications up to January 2024, aiming to encompass a broad spectrum of relevant literature.

A comprehensive search was undertaken across several electronic databases using the combinations of search terms shown in Table 1. Inclusion criteria were set for studies published in English that conducted primary research on interactive driving systems, specifically their impact on AVs in the context of surrounding vehicle interactions. Exclusions were made for studies focusing exclusively on the interaction between AVs and pedestrians or the environment, such as road constraints or static obstacles. Some studies were further excluded after a detailed assessment of their publication year and relevance.

A total of 240 studies were initially identified for the comprehensive review, of which 173 were ultimately selected for inclusion in the final analysis. Emphasis was placed on research classified within the domains of LSTM, transformer, APF, game theory, RL/DRL, and POMDP approaches. The screening flow diagram and the literature keyword co-occurrence network [67] are shown in Figs. 4 and 5, respectively.

3 Approaches

3.1 Long Short-Term Memory

The advent of the LSTM networks has markedly influenced the development of AV technologies, particularly in the realm of interactive driving [68-70]. Interacting effectively with the surrounding environment is important in autonomous driving, given that AVs navigate dynamic traffic scenarios alongside HDVs [71]. Human drivers consistently observe and respond to the movements of nearby vehicles. To replicate this nuanced driving behavior, AVs are designed to mimic human-like maneuvers based on predictions of surrounding vehicles’ trajectories and intentions. Moreover, autonomous driving systems classified as Level 3 or higher assume responsibility for driving tasks, necessitating advanced systems capable of managing temporal information for interactive AV driving. The LSTM networks are crucial for understanding the way in which AVs process sequential data over time, as shown in Fig. 6. Incorporating the LSTM into the literature review highlights the significance of processing temporal data in making driving decisions that anticipate and adapt to future states.

An LSTM unit consists of a cell and three gates, including the forget gate, the input gate, and the output gate. The cell within the LSTM network retains values across arbitrary time intervals, with three gates governing the input and output of information. The equations of these three gates are defined by Eqs. (1)-(3). The forget gate determines the information to be discarded from the previous state by evaluating the relevance of the prior state in comparison to the current input. The input gate plays a pivotal role in determining the specific components of new information to be stored in the current state. The output gate governs the selection of information to be output from the current state, assigning a value between 0 and 1 to each piece of information based on considerations of both the previous and current states [72].

(1)

$ {f}_{t} = \sigma \left({{\mathbf{W}}_{\mathrm{f}} \cdot \left\lbrack {{h}_{t -1},{x}_{t}}\right\rbrack + {b}_{\mathrm{f}}}\right) $

(2)

$ {i}_{t} = \sigma \left({{\mathbf{W}}_{\mathrm{i}} \cdot \left\lbrack {{h}_{t -1},{x}_{t}}\right\rbrack + {b}_{\mathrm{i}}}\right) $

(3)

$ {o}_{t} = \sigma \left({{\mathbf{W}}_{\mathrm{o}} \cdot \left\lbrack {{h}_{t -1},{x}_{t}}\right\rbrack + {b}_{\mathrm{o}}}\right) $

where ${\mathbf{W}}_{\mathrm{f}}$, ${\mathbf{W}}_{\mathrm{i}}$, and ${\mathbf{W}}_{\mathrm{o}}$ are the weight matrices for the three gates, ${h}_{t -1}$ is the hidden state at the previous time step, ${x}_{t}$ is the current input, and ${b}_{\mathrm{f}}$, ${b}_{\mathrm{i}}$, and ${b}_{\mathrm{o}}$ are the bias terms for the three gates. $\sigma \left(\cdot \right)$ is the sigmoid activation function, which maps each gate value into the interval (0, 1).
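As a minimal sketch of Eqs. (1)-(3), the following dependency-free Python computes the three gate activations for a single time step. It assumes that $\sigma$ is the logistic sigmoid. The function names (`sigmoid`, `gate`, `lstm_gates`), the toy dimensions, and all numerical values are hypothetical and chosen only for illustration; they do not come from any of the reviewed works.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, assumed here as the gate activation sigma."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(W, b, h_prev, x_t):
    """Compute sigma(W . [h_prev, x_t] + b) for one gate."""
    concat = h_prev + x_t  # list concatenation forms [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b_k)
        for row, b_k in zip(W, b)
    ]


def lstm_gates(params, h_prev, x_t):
    """Return (f_t, i_t, o_t) following Eqs. (1)-(3)."""
    f_t = gate(params["W_f"], params["b_f"], h_prev, x_t)
    i_t = gate(params["W_i"], params["b_i"], h_prev, x_t)
    o_t = gate(params["W_o"], params["b_o"], h_prev, x_t)
    return f_t, i_t, o_t


# Hypothetical toy sizes: hidden size 2, input size 1, so each W is 2 x 3.
params = {
    "W_f": [[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]],
    "b_f": [0.0, 0.1],
    "W_i": [[0.2, 0.2, 0.2], [-0.1, 0.4, 0.0]],
    "b_i": [0.0, 0.0],
    "W_o": [[0.0, 0.0, 0.0], [1.0, -1.0, 0.5]],
    "b_o": [0.0, -0.2],
}
h_prev = [0.1, -0.3]  # previous hidden state h_{t-1}
x_t = [0.8]           # current input, e.g. a normalized speed reading
f_t, i_t, o_t = lstm_gates(params, h_prev, x_t)
```

Each gate returns one value per hidden unit, strictly between 0 and 1, which is what allows the gates to act as soft switches on the information flowing through the cell.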

The LSTM networks were introduced to address the vanishing gradient problem that arises when recurrent networks are trained over long sequences. By selectively outputting relevant information from the current state, the LSTM network effectively preserves valuable long-term dependencies, enabling accurate predictions across both current and future timesteps. This literature review explores the pivotal role of LSTM networks in enhancing the interactive driving capabilities of AVs, focusing on trajectory prediction and driver intention recognition.

3.1.1 Trajectory Prediction

A core application of the LSTM networks in autonomous driving is trajectory prediction, which is crucial for safe navigation and strategic planning [73,74]. Li et al. [75] captured both the proximity and periodicity of the historical trajectories of moving objects with an LSTM-based fuzzy logic approach to enhance prediction precision. Zhong et al. [76] analyzed surrounding vehicles and their interactions with neighboring vehicles at a deeper LSTM layer, facilitating the exchange of hidden states using clustering convolution-LSTM (CC-LSTM). Relying on precise trajectory predictions, AVs can exhibit appropriate motion in the subsequent time step. Furthermore, the motion of the AV influences the behavior of surrounding vehicles, and the LSTM incorporates new information to enhance the trajectory prediction of these surrounding vehicles.

Accurate trajectory prediction for multiple vehicles within complex social interaction environments is essential for ensuring the safety of AVs and enhancing the quality of their planning and control mechanisms. Hou et al. [77] utilized the structural transformer within a two-layer LSTM encoder-decoder architecture to capture interactions among multiple surrounding vehicles across the temporal and spatial dimensions simultaneously. Qiao et al. [78] proposed a method that integrates insights from the embedding and self-attention layers and decodes them to generate predicted trajectories, using the LSTM layer to process the sequential dependencies and the social dynamics between vehicles. Yang et al. [79] introduced a trajectory prediction network with an enhanced graph transformer (TP-EGT), using an LSTM-based encoder-decoder architecture to capture the social interactions among agents.

Multi-modal driver behavior contributes to the inaccuracy of trajectory predictions, as drivers may make diverse decisions under identical traffic scenarios. Additionally, interactions between vehicles typically influence their driving behavior [80]. To address this issue through an understanding of interaction, a social pooling mechanism [81-83] is introduced to leverage recurrent neural networks to extract features related to vehicle interactions, resulting in improved long-term trajectory predictions. Moreover, achieving human-like motion planning for AVs is attainable through trajectory prediction incorporating interaction information. Li et al. [84] utilized the structural-LSTM network to enable the prediction of multiple vehicle trajectories and a comprehensive exploration of interactions between the host vehicle and surrounding vehicles, acknowledging the impact of the host vehicle on its environment. For real-time trajectory predictions, fast and precise decision-making is essential. However, the sequential processing inherent in the LSTM networks can introduce delays, making them less suitable for applications that demand low-latency responses. Zhan et al. [47] achieved superior trajectory prediction performance, especially in long-term prediction tasks in dynamic traffic scenarios. Unlike conventional models dependent on instantaneous acceleration and velocity, often challenging to acquire in real-world scenarios, the proposed approach reinforces social interactions using a novel inter-vehicle interaction model grounded in a spatiotemporal graph structure, substantially enhancing the utilization of information embedded in historical trajectories by the social LSTM encoder.

3.1.2 Driver Intention Recognition

Predicting driver intentions is another area where LSTM networks excel. By analyzing sequential data such as vehicle speed, direction, and lane changes, LSTMs can infer potential maneuvers, enhancing the decision-making process in AV systems. The key driving intentions can be broadly grouped into three categories: left lane-changing, right lane-changing, and lane-keeping. The consideration of interactive driving intentions is pivotal in trajectory planning for vehicle motion. Recognizing vehicle motion as maneuvers accommodates the multi-modal nature of future motion. Therefore, intention prediction must encompass not only the vehicle’s own state, including heading angle and acceleration, but also its interactions with neighboring vehicles, such as its proximity to surrounding vehicles [85]. Existing intention prediction methods primarily comprise the Bayesian formulation [86,87], the hidden Markov model [88,89], the Monte Carlo simulation [90,91], and the Kalman filter [92-94].

During the lane-changing process, the trajectory of the AV is invariably influenced by the presence of vehicles in the target lane. The primary interaction with the AV stems from the intention of the target vehicle, shaping the lane-change opportunity. Examples using the Bi-LSTM, LSTM-FIS, and LSTM-RNN can be found in Refs. [46,54,95], respectively. These proposed algorithms adeptly address surrounding targets by incorporating interaction with the AV. Notably, they yield more accurate results than conventional algorithms.

The expert systems mentioned above usually overlook the potential spatiotemporal risks arising from the coordinated longitudinal and lateral motion intentions of surrounding vehicles, such as abrupt right-side braking. Regarding vehicle intention identification, the aforementioned models fail to meet the safety requirements for predicting outcomes in hazardous scenarios. Zhao et al. [96] integrated the LSTM with conditional random fields (CRFs) for accurate predictions of both longitudinal and lateral motion intentions of the surrounding vehicles, thereby enhancing the readiness of AVs for the subsequent timestep motion through comprehensive interaction information. Diverging from previous research that primarily categorized driving intentions without accounting for variations in execution timing, a hierarchical LSTM-based vehicle trajectory prediction method [97] is introduced to address driving intentions and the timing of their execution simultaneously. Additionally, the model integrates information on interactions with surrounding vehicles, adding a further network layer of refinement to the accuracy of trajectory predictions. The results indicate that incorporating interaction information from surrounding vehicles significantly enhances the accuracy of trajectory prediction.

In summary, LSTM networks are recognized as a fundamental technology in developing interactive driving systems for AVs. Their ability to process and predict sequential data makes them invaluable for trajectory prediction and driver intention recognition.

3.2 Transformer

A transformer is a deep learning architecture based on the multi-head attention mechanism [98]. It provides a robust framework for modeling and understanding the complex interactions between multiple vehicles on the road. Unlike the LSTM, the transformer network is constructed entirely on the multi-head attention mechanism and consists of stacked layers. This architecture is particularly effective for tasks requiring the comprehension of long-range dependencies and can process sequences more efficiently due to its parallel nature [99].

The attention calculation for all tokens can be expressed as a single large matrix operation using the softmax function, as shown in Eq. (4). This formulation is advantageous for training, because optimized matrix operations can be computed rapidly.

(4)

$\operatorname{Attention}(\boldsymbol{Q}, \boldsymbol{K}, \boldsymbol{V})=\operatorname{softmax}\left(\frac{\boldsymbol{Q} \boldsymbol{K}^{\mathrm{T}}}{\sqrt{d_{\mathrm{k}}}}\right) \boldsymbol{V}$

where $\boldsymbol{Q}$, $\boldsymbol{K}$, and $\boldsymbol{V}$ represent the query matrix, the key matrix, and the value matrix, respectively, and ${d}_{\mathrm{k}}$ is the dimension of the key vectors.
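Eq. (4) can be sketched in a few lines of dependency-free Python. This is an illustrative sketch rather than an implementation from any reviewed work: matrices are represented as lists of rows, the helper names (`matmul`, `transpose`, `softmax_rows`, `attention`) are hypothetical, and the toy inputs are chosen so that the result is easy to check by hand.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]


def softmax_rows(S):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    out = []
    for row in S:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights, as in Eq. (4)."""
    d_k = len(K[0])  # dimension of the key vectors
    scores = matmul(Q, transpose(K))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = softmax_rows(scaled)
    return matmul(weights, V), weights


# Toy example: two queries attend over two identical keys,
# so each query weights both values equally.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
output, weights = attention(Q, K, V)
```

Because both keys are identical in this toy case, every attention weight equals 0.5 and each output row is the average of the rows of `V`. With distinct keys, the weights concentrate on the keys most similar to each query.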

Furthermore, the multi-head attention module calculates correlations at any position in the sequence, as shown in Eqs. (5) and (6). By using multiple heads, this module can integrate features focused on different aspects of the data. This approach allows the multi-head attention module to efficiently capture global information within the sequence, thereby improving its ability to handle long sequence data. Therefore, it allows the model to recognize information across multiple dimensions of the input.

(5)

$\operatorname{MultiheadedAttention}(\boldsymbol{Q}, \boldsymbol{K}, \boldsymbol{V})=\operatorname{Concat}_{i \in[\#\text{heads}]}\left(\text{head}_{i}\right) \boldsymbol{W}^{O}$

(6)

$\text{head}_{i}=\operatorname{Attention}\left(\boldsymbol{X} \boldsymbol{W}_{i}^{Q}, \boldsymbol{X} \boldsymbol{W}_{i}^{K}, \boldsymbol{X} \boldsymbol{W}_{i}^{V}\right)$

where $\#\text{heads}$ represents the number of heads and $\text{head}_{i}$ denotes the self-attention of the $i$-th head. The matrix $\boldsymbol{X}$ is the concatenation of word embeddings. The matrices $\boldsymbol{W}_{i}^{Q}$, $\boldsymbol{W}_{i}^{K}$, and $\boldsymbol{W}_{i}^{V}$ are projection matrices specific to each attention $\text{head}_{i}$, while $\boldsymbol{W}^{O}$ is the final projection matrix for the whole multi-head attention mechanism.
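The structure of Eqs. (5) and (6) can be sketched as follows in dependency-free Python. The sketch is illustrative only: the helper names (`matmul`, `attention`, `multi_head_attention`) are hypothetical, each head is passed as a `(W_Q, W_K, W_V)` tuple, and the toy projection matrices are chosen so that the output can be verified by hand. In an interactive driving setting, the rows of `X` would typically be per-vehicle or per-time-step feature vectors rather than word embeddings; that mapping is an assumption here.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def attention(Q, K, V):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    weights = []
    for row in matmul(Q, K_T):
        scaled = [s / math.sqrt(d_k) for s in row]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return matmul(weights, V)


def multi_head_attention(X, heads, W_O):
    """Eqs. (5)-(6): run each head on projections of X, concatenate, project."""
    head_outputs = [
        attention(matmul(X, W_Q), matmul(X, W_K), matmul(X, W_V))
        for W_Q, W_K, W_V in heads
    ]
    # Concatenate the head outputs along the feature axis, token by token.
    concat = [sum((h[t] for h in head_outputs), []) for t in range(len(X))]
    return matmul(concat, W_O)


# Toy example: 3 tokens with 2 features; 2 heads, each of dimension 1.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
zero = [[0.0], [0.0]]  # zero query/key projections give uniform attention
heads = [
    (zero, zero, [[1.0], [0.0]]),  # head 1 reads the first feature
    (zero, zero, [[0.0], [1.0]]),  # head 2 reads the second feature
]
W_O = [[1.0, 0.0], [0.0, 1.0]]     # identity output projection
Y = multi_head_attention(X, heads, W_O)
```

With uniform attention and an identity output projection, every row of `Y` is approximately `[3.0, 4.0]`, the per-feature mean over the three tokens. The example shows how each head can specialize in a different aspect of the input before the results are merged by $\boldsymbol{W}^{O}$.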

3.2.1 Surrounding Vehicles’ State Predictions

The application of transformer models in predicting the states of surrounding vehicles, including trajectory prediction, intention prediction, motion prediction, and behavior prediction, has gained significant attention in recent years. Several studies have successfully implemented transformer architectures to enhance these prediction tasks [100-104]. Building on the success of transformer models, researchers have explored ways to improve their efficiency and performance. However, transformers may suffer from accumulative error issues and cannot perform inference in parallel due to the autoregressive decoding procedure [105,106]. To address this, Chen et al. [107] proposed an intention-aware, non-autoregressive decoder query generation module that produces the decoder queries based on recognized intentions in a single step, thereby improving inference speed. In addition to improving efficiency, the application of transformers has extended to multi-agent scenarios. Driving behaviors are influenced not only by the driver’s intentions but also by the complex interactions among multiple surrounding vehicles. Facing the challenge of achieving real-time multi-agent prediction without sacrificing performance or incurring high computational costs, recent advancements in prediction tasks now focus on predicting the motions and trajectories of multiple agents simultaneously rather than those of a single object [108-111]. This shift highlights the growing importance of sophisticated models capable of handling complex and dynamic traffic scenarios.

Moreover, driving behavior demonstrates inherent randomness and multimodality due to the unpredictability of drivers’ intentions. Effective prediction tasks require transformer models to manage multi-modal data interactions, which involves two key challenges. First, the input must integrate multiple modes, including past states of surrounding vehicles and map information. Second, the output is highly multimodal and variable, indicating that a driver can follow one of many potential intentions or motions. Therefore, an ideal prediction method should represent various possible future state distributions to account for these different intentions. Several studies have explored intention-aware interactive transformers for multimodal prediction tasks [112-116]. These models offer richer information than single-mode predictions, enabling AVs to perform better risk assessments. Additionally, Xu et al. [117] introduced transformer designs and training in the multimodal contexts, offering a helpful and detailed overview. These developments allow for a more comprehensive understanding of agent behavior, leading to more accurate and reliable predictions in mixed traffic environments [118-121].
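Multimodal predictions of this kind are commonly scored with the minimum average displacement error (minADE) and minimum final displacement error (minFDE) listed among the abbreviations, which compare the best of several candidate trajectories against the observed one. The following dependency-free Python is a minimal sketch under the usual definitions of these metrics; the function names and the toy trajectories are hypothetical and are not taken from any reviewed work.

```python
import math


def displacement_errors(pred, truth):
    """Per-step Euclidean distances between a predicted and an observed trajectory."""
    return [math.dist(p, g) for p, g in zip(pred, truth)]


def min_ade_fde(modes, truth):
    """Return (minADE, minFDE) over K candidate trajectories (modes)."""
    ades, fdes = [], []
    for pred in modes:
        errs = displacement_errors(pred, truth)
        ades.append(sum(errs) / len(errs))  # average error along the horizon
        fdes.append(errs[-1])               # error at the final time step
    return min(ades), min(fdes)


# Toy example: the observed vehicle keeps its lane along y = 0.
truth = [(0, 0), (1, 0), (2, 0)]
modes = [
    [(0, 0), (1, 1), (2, 2)],  # hypothetical lane-change mode
    [(0, 0), (1, 0), (2, 1)],  # hypothetical late-drift mode
]
min_ade, min_fde = min_ade_fde(modes, truth)
```

Here the second mode is closest to the observation, giving a minADE of 1/3 and a minFDE of 1.0. Taking the minimum over modes rewards a predictor for covering the intention that actually occurred, rather than penalizing it for also proposing plausible alternatives.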

3.2.2 Decision-Making and Path Planning

The transformer model has significant potential to perform global attention and effectively replicate the learning processes of human drivers. It generates coherent and legally sound sentences to describe driving scenarios, facilitating the generation of appropriate driving decisions in AVs [122]. Liang et al. [123] divided the decision-making framework into two main levels. The upper level controls the vehicle’s lane selection and qualitative speed management by deciding whether to accelerate or decelerate. The lower level handles precise control of driving speed through the magnitude of acceleration or deceleration and the AV’s direction. Utilizing the transformer model, the modules within these levels perceive and adapt to complex and varying road conditions, thereby making high-quality decisions comprehensively.

Nevertheless, most existing methods treat planning and prediction as independent processes, ignoring their interrelation and the dynamic changes in traffic scenarios. To address this challenge, Fu et al. [124] introduced InteractionNet, which leverages transformers to share global contextual reasoning among all traffic participants. This approach captures interactions and interconnects planning and prediction for joint optimization. Additionally, InteractionNet utilizes another transformer to enhance the model’s focus on regions containing critical or unseen vehicles. Huang et al. [125] proposed a transformer encoder to effectively model the relationships between scene elements and introduced a novel hierarchical transformer decoder structure. At each level of decoding, the decoder uses the predictions from the previous level and the shared environmental context to iteratively refine the interaction process. Moreover, they proposed a learning process that adjusts an agent’s behavior at the current level based on other agents’ behaviors from the previous level. This approach ensures a more dynamic and responsive modeling of interactions within the environment.

In summary, transformers provide a powerful tool for enhancing the interactive driving capabilities of AVs. They offer sophisticated mechanisms for processing sequential data, predicting the states of surrounding vehicles, modeling interactions, and making informed decisions and path planning in complex driving environments. The applications of transformers are summarized in Table 2.

3.3 Artificial Potential Field

Khatib [135] first introduced the APF algorithm, conceptualizing the spatial movement of an object as the force-driven motion of a particle. Within the APF framework, the attractive potential field, denoted as ${U}_{\text{att}}$, exerts influence across the global environment, generating a stronger attractive force the farther an object is from its target point. To guide the object along the path from the starting location to the destination, the attractive potential field ${U}_{\text{att}}$ is formulated as shown in Eq. (7).

(7) $ {U}_{\text{att}} = \frac{1}{2}{k}_{\text{att}}{d}_{\text{goal}}^{2} $

where ${k}_{\text{att }}$ represents the attractive constant and ${d}_{\text{goal }}$ is the distance between the controlled vehicle and its destination. The attractive force ${F}_{\text{att }}$ applied to the controlled object is calculated as shown in Eq. (8).

(8) $ {F}_{\text{att}} = -\nabla {U}_{\text{att}} $

The repulsive potential field, denoted as ${U}_{\text{rep }}$, operates within a confined range, as in Eq. (9). When an object enters this range, the APF algorithm generates a repulsive force that prevents the object from nearing the obstacle, thereby creating a collision-free motion trajectory, detailed in Eq. (10).

(9) $ {U}_{\text{rep}} = \left\{ \begin{matrix} \frac{1}{2}{k}_{\text{rep}}{\left(\frac{1}{d} -\frac{1}{{d}_{0}}\right) }^{2}, & d \leq {d}_{0} \\ 0, & d > {d}_{0} \end{matrix}\right. $

where ${k}_{\text{rep}}$ represents the coefficient of the repulsive potential field, $d$ is the distance between the controlled vehicle and the obstacle, and ${d}_{0}$ indicates the operational range of the repulsive potential field ${U}_{\text{rep}}$. The repulsive force ${F}_{\text{rep}}$ is given as:

(10) $ {F}_{\text{rep }} = -\nabla {U}_{\text{rep }} $

In the repulsive potential field, its effective radius defines the extent of the repulsive force’s influence. The controlled object is drawn toward its destination by the attractive force $\left({F}_{\text{att }}\right)$. When the distance between the object and an obstacle is less than ${d}_{0}$, the repulsive force activates to prevent a collision. In conclusion, given its similarity to human drivers’ obstacle avoidance behaviors, the APF algorithm serves as an optimal strategy for planning safe local paths. However, the traditional algorithm requires adaptations to effectively apply in traffic scenarios. The improvement of modified APF models is detailed in Table 3.
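Eqs. (7)-(10) can be sketched in a few lines of Python for a point vehicle in the plane. The gradients are evaluated analytically. The function names, default gains, and example positions below are illustrative assumptions, and returning a zero force at $d = 0$ (where Eq. (9) is undefined) is a simplification made only for this sketch.

```python
import math

def attractive_force(pos, goal, k_att=1.0):
    """F_att = -grad(U_att) with U_att = 0.5 * k_att * d_goal**2 (Eqs. 7-8)."""
    return (-k_att * (pos[0] - goal[0]), -k_att * (pos[1] - goal[1]))

def repulsive_force(pos, obstacle, k_rep=1.0, d0=5.0):
    """F_rep = -grad(U_rep) (Eqs. 9-10); zero outside the range d0."""
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d > d0 or d == 0.0:
        # d == 0 is undefined in Eq. (9); treated as zero here (assumption).
        return (0.0, 0.0)
    # -dU/dd = k_rep * (1/d - 1/d0) / d**2, pointing away from the obstacle.
    magnitude = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
    return (magnitude * dx / d, magnitude * dy / d)

def resultant_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=5.0):
    """Sum of the attractive force and all repulsive forces."""
    fx, fy = attractive_force(pos, goal, k_att)
    for obs in obstacles:
        rx, ry = repulsive_force(pos, obs, k_rep, d0)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)

# Toy example: goal 10 m ahead, one obstacle 2 m ahead on the same line.
f_att = attractive_force((0.0, 0.0), (10.0, 0.0))
f_rep = repulsive_force((0.0, 0.0), (2.0, 0.0))
f_sum = resultant_force((0.0, 0.0), (10.0, 0.0), [(2.0, 0.0)])
```

With equal gains the attractive term dominates in this configuration, which illustrates why the coefficients and the range ${d}_{0}$ usually need tuning or adaptation in traffic scenarios.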

The application of APF methods in AV navigation has garnered significant attention as a strategy to enable interactive driving in complex environments [57]. The APF model constitutes a mathematical representation of a virtual energy landscape within the vehicles’ operational environment [140]. It characterizes surrounding vehicles as obstacles and describes them with a repulsive field, as illustrated in Fig. 7. The red vehicle represents the AV under control, while the silver vehicle is characterized as an obstacle. Considering the repulsive forces generated by both the road ${F}_{\text{road}}$ and the obstacle ${F}_{\mathrm{{obs}}}$, alongside the attractive force stemming from the goal ${F}_{\text{goal}}$, a resultant force ${F}_{\text{sum}}$ is produced to navigate the AV without collision. This section reviews the use of APF in the context of AVs, concentrating on obstacle avoidance and path planning to improve vehicle interaction dynamics.

3.3.1 Obstacle Avoidance

Central to APF’s application in autonomous driving is its ability to enable vehicles to avoid obstacles dynamically. The principle behind traditional APF involves creating virtual forces that repel the vehicle from obstacles and attract it towards the goal, ensuring safe navigation [141]. Unlike the traditional approach, which considers only the vehicle’s center of mass as a single point, Cao et al. [142] explored the concept of circle decomposition to represent a more accurate distance cost between the AV and the obstacle vehicle. Due to the different dynamics and safety constraints associated with longitudinal and lateral movements on the road, it is common practice to consider the longitudinal and lateral potential fields separately. This separation addresses the challenges and dynamics unique to each movement type, optimizing the vehicle’s navigation and safety across various driving scenarios. Wu et al. [143] developed longitudinal and lateral APFs to clarify the modeling of interactions with multiple dynamic surrounding vehicles. This simplifies the computation and real-time updating of paths for obstacle avoidance. However, traditional APF algorithms assign a fixed value to the repulsive potential field range, ignoring the relative motion between vehicles. To address this issue, the potential field radius [144] is defined based on the safety distance between vehicles to ensure safer navigation. This enables the AV to adjust its safety zone dynamically in accordance with its own speed and the surrounding vehicles’ speeds. By accurately adjusting the potential field radius according to safety distances, the AV can avoid excessively conservative or aggressive maneuvers, promoting smoother and more secure interactive driving.
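The idea of a speed-dependent potential field radius can be illustrated with a minimal sketch. The formula below (a standstill gap, a time-headway term, and a braking-distance term on the closing speed) is a common textbook-style safety distance chosen here as an assumption; it is not the specific definition used in [144], and all parameter values are hypothetical.

```python
def dynamic_radius(v_ego, v_obs, d_min=2.0, headway=1.0, a_max=6.0):
    """Illustrative influence range d0 (m) that grows with speed.

    v_ego, v_obs: longitudinal speeds of the AV and the obstacle (m/s).
    d_min: standstill gap (m); headway: time headway (s);
    a_max: assumed maximum deceleration (m/s^2).
    All three defaults are hypothetical values.
    """
    closing_speed = max(0.0, v_ego - v_obs)  # only approaching counts
    return d_min + headway * v_ego + closing_speed ** 2 / (2.0 * a_max)

# Same ego speed, but the second obstacle is slower, so d0 is larger.
d0_same_speed = dynamic_radius(20.0, 20.0)
d0_closing = dynamic_radius(20.0, 14.0)
```

A value computed this way could replace the fixed ${d}_{0}$ in Eq. (9), so that the repulsive field activates earlier when the AV closes in on a slower vehicle.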

Moreover, to enhance the flexibility and efficacy of the APF method in diverse and dynamic driving environments, an adaptive force coefficient [145] is introduced to achieve predictive obstacle avoidance by automatically adjusting its forces in accordance with the feasible terminal state at each moment. The force coefficient determines the strength of both the repulsive forces from obstacles and the attractive force towards the goal, significantly influencing how the vehicle reacts to each. Incorporating interaction information into the adaptive force coefficient allows the vehicle to modify the repulsive force dynamically as necessary. This ensures safe navigation around moving obstacles without overly aggressive maneuvers.

3.3.2 Path Planning

APF is also instrumental in path planning for AVs, providing a framework for generating collision-free trajectories. Its simplicity and computational efficiency make it well-suited for real-time applications that demand swift decision-making. However, the circular repulsion field utilized in traditional APF models falls short of accurately reflecting the complexities of vehicle obstacle avoidance as experienced by human drivers. This limitation often results in trajectories that lack steering smoothness, adversely affecting ride comfort. Sun et al. [136] provided a foundation for safe navigation with a static obstacle. The designed repulsive potential field combines road boundary and obstacle repulsive fields. Building on this, Wang et al. [137] refined the method by accounting for multiple surrounding vehicles and introducing variable coefficients based on identified driver characteristics, thereby enhancing the model’s adaptability and effectiveness in complex driving scenarios.

Despite the progress in motion planning for AVs using the APF method, existing approaches often overlook the dynamics of road participants. Traditional APF models, while effective in static environments, fall short in dynamic traffic situations due to a lack of consideration for vehicle dynamics, environmental factors, and road regulations. To address these challenges, potential fields around dynamic obstacles are created to assess their impact on the AV [138], thereby ensuring the maintenance of safe distances from surrounding vehicles. The relative speed between the AV and HDVs is also incorporated into the potential field calculations, allowing the AV to modify its trajectory in real time for safe navigation around multiple moving obstacles. To predict path changes of surrounding vehicles more accurately and facilitate timely adjustments of the AV’s own trajectory, rotational factors [139] are introduced to align the potential field with the vehicle’s heading angle, enhancing the model’s responsiveness to lane-changing maneuvers by surrounding vehicles. This adjustment makes the potential field more informative, allowing the host vehicle to decelerate in advance and create a smoother trajectory. Furthermore, complex maneuvers, such as merging onto highways or navigating roundabouts, require complex trajectory planning. The ability to rotate potential fields equips the AV with the flexibility to adapt to these challenges, ensuring safe and efficient interactions with vehicles executing various maneuvers.
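The effect of aligning a potential field with a surrounding vehicle’s heading can be sketched as follows. The elongated Gaussian-shaped field and the parameters `sigma_long` and `sigma_lat` are assumptions chosen for illustration; the cited work [139] may use a different field shape and rotation formulation.

```python
import math

def rotated_potential(ego, obs, heading, k_rep=1.0,
                      sigma_long=10.0, sigma_lat=2.0):
    """Repulsive potential of an obstacle vehicle, stretched along its heading.

    ego, obs: (x, y) positions in a road-fixed frame (m).
    heading: obstacle heading angle (rad), 0 = along the x-axis.
    sigma_long, sigma_lat: hypothetical extents of the field (m).
    """
    dx, dy = ego[0] - obs[0], ego[1] - obs[1]
    # Rotate the offset into the obstacle's body frame.
    lon = math.cos(heading) * dx + math.sin(heading) * dy
    lat = -math.sin(heading) * dx + math.cos(heading) * dy
    return k_rep * math.exp(-0.5 * ((lon / sigma_long) ** 2
                                    + (lat / sigma_lat) ** 2))

# The ego sits 3.5 m to the side of the obstacle (one lane over).
lane_keeping = rotated_potential((0.0, 3.5), (0.0, 0.0), heading=0.0)
# Same geometry, but the obstacle now steers 20 degrees towards the ego lane.
lane_changing = rotated_potential((0.0, 3.5), (0.0, 0.0),
                                  heading=math.radians(20.0))
```

In this toy setting the potential felt by the ego rises as soon as the neighbour turns towards its lane, before the lateral gap itself has changed, which is the mechanism that lets the host vehicle decelerate in advance.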

In summary, the APF method presents a viable strategy for managing the complexities of interactive driving in AVs. Through obstacle avoidance and path planning, the APF contributes to the management of dynamic interactions between vehicles and fosters the advancement of safer and more efficient AV systems.

3.4 Game Theory

Game theory offers a strategic framework for analyzing interactions among rational agents, making it particularly relevant for modeling the decision-making processes in interactive driving scenarios involving AVs and HDVs [146, 147]. The interdependence inherent in these interactions compels each player to base its decisions on the potential decisions of others [148]. The main aim of this approach is to comprehend and forecast the socially aware decision-making processes of individuals or entities in scenarios characterized by conflict or cooperation [149], as shown in Fig. 8. Initially, the AV evaluates the necessity of a lane-changing maneuver based on its lane-changing intention. Should lane-changing conditions not be met, the AV opts for lane-keeping. Conversely, the AV determines the lane-change direction if a lane change is warranted. The payoffs for the two involved vehicles (the AV and the Rear Left Vehicle (RLV) for left lane-changing; the AV and the Rear Right Vehicle (RRV) for right lane-changing) are calculated across various strategic scenarios, incorporating information such as velocity, acceleration, and steering angle. This analysis identifies the optimal strategy for either left or right lane changing. Finally, both vehicles proceed in accordance with the strategies they have adopted. Moreover, game theory applications can be classified into three types based on the level of information available: perfect and complete information, imperfect and complete information, and incomplete information.

3.4.1 Decision-Making with Perfect and Complete Information

Games of perfect and complete information assume that all players have access to all historical moves made by all players. In the realm of autonomous driving, this corresponds to scenarios where the AV has full knowledge of the environment, including the exact positions, velocities, and intended maneuvers of all surrounding vehicles. In a leader-follower game, the follower cannot promptly respond to the leader’s immediate decision due to reaction delays. Therefore, the follower bases its decision solely on the current state of the vehicles, aiming to maximize its payoff. The leader vehicle is assumed to possess awareness of the surrounding vehicles’ behavior and applies a conservative maximin strategy to secure its rewards. This understanding enables the leader to predict the follower’s decision, allowing for an optimal response based on the interaction information. The interactive behavior modeling of a leader-follower game for uncontrolled intersections and merging can be found in Refs. [150, 151], respectively.

The Stackelberg game framework establishes a hierarchical decision-making model for AVs in mixed-traffic environments [152]. Within this framework, the AV assumes the role of the leader, tasked with formulating its strategy or trajectory guided by the payoff function, considering the expected responses of the HDVs. Concurrently, the surrounding vehicles function as followers, reacting to the leader’s initiatives in alignment with their own objectives and constraints. Specifically, the AV, designated as the leader, strategically decides whether to change lanes or maintain its current lane so as to minimize a cost that accounts for driving safety, ride comfort, travel efficiency, passage risk, and acceleration. The leader’s ensuing actions significantly shape the surrounding vehicles’ conduct [153]. The applications of the Stackelberg game can be found in Refs. [154-156]. These references illustrate the Stackelberg game’s ability to navigate social interactions with other traffic participants, enabling the AV to formulate judicious and safe decisions and plans. This underscores the feasibility and effectiveness of employing the Stackelberg game in AV decision-making processes.
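The leader-follower logic of a Stackelberg game can be sketched for a small finite game. The payoff matrices below are invented for illustration (larger is better for each player) and do not come from Refs. [152-156]; a cost-minimizing formulation would simply negate them.

```python
def stackelberg(leader_payoff, follower_payoff):
    """Solve a finite two-player Stackelberg game by enumeration.

    payoff[i][j] is the payoff when the leader plays i and the follower
    plays j. For each leader action the follower's best response is
    found first; the leader then picks the action whose anticipated
    outcome is best for itself. Returns (leader_action, follower_action).
    """
    best = None
    for i, row in enumerate(follower_payoff):
        j = max(range(len(row)), key=row.__getitem__)  # follower best response
        if best is None or leader_payoff[i][j] > leader_payoff[best[0]][best[1]]:
            best = (i, j)
    return best

# Hypothetical lane-change game.
# Leader (AV):    0 = keep lane, 1 = change lane
# Follower (HDV): 0 = yield,     1 = hold speed
av_payoff = [[2, 2],
             [5, -10]]
hdv_payoff = [[1, 3],
              [2, -8]]
solution = stackelberg(av_payoff, hdv_payoff)
```

In this toy game the AV anticipates that the HDV prefers yielding to a near-collision once the lane change is initiated, so it commits to changing lanes.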

In the context of subgame perfect Nash Equilibrium, each player, whether an AV or an HDV, makes optimal decisions at every stage of interaction [9]. These decisions necessitate consideration of the strategies chosen by others and potential future actions [157]. This method applies to a tree-like extensive-form game structure with sequential moves and information sets. Therefore, the decisions are optimal both for the specific stage and for the entire sequence of interactions. This gaming framework holds significant importance in modeling and analyzing AV behavior, especially in complex traffic scenarios where decisions are interrelated and subject to the influence of others’ actions. For example, Yu et al. [158] introduced a lane-changing decision module in the mixed environment of AVs and HDVs. When an AV initiates a lane change, the rear left vehicle in the targeted lane faces decisions, encompassing options to accelerate, decelerate, or maintain its current speed [159]. Simultaneously, the rear right vehicle in the adjacent lane must choose between overtaking or maintaining its position, guided by the principles of Nash Equilibrium. This implies that, at each decision point, autonomous and human drivers select optimal actions in response to others’ maneuvers, establishing a foundation for consistency and optimality throughout the entire sequence of interactions. In summary, while extensive-form games furnish a structure for delineating sequential decision-making, subgame perfect Nash Equilibrium enhances this framework by guaranteeing strategic optimality at each game stage.

3.4.2 Decision-Making with Imperfect and Complete Information

Although the aforementioned models effectively depict certain aspects of vehicle interactions, they fail to capture the complexities of simultaneous decision-making by players unaware of others’ choices. Imperfect and complete information games acknowledge that while the history of movements is known, there is uncertainty regarding some aspects of the current state. In interactive driving, this might involve scenarios where the AV can observe past actions of other vehicles but cannot precisely determine their driving styles or intentions. In this context, the AV evaluates its own payoff as well as neighboring vehicles’ potential actions and payoffs [160]. Normal-form games leverage Nash Equilibrium as a fundamental concept for imperfect and complete information games, allowing AVs and HDVs to make decisions simultaneously [161]. Lopez et al. [149] used the concept of “times-to-collision” to establish the dynamic relationship between AVs and their surrounding vehicles. The behavioral estimation of the AV hinges on the availability of times-to-collision concerning the rear vehicle, facilitating the execution of a lane-changing maneuver at a higher desired speed. Subsequently, the surrounding vehicles select an action based on Nash Equilibrium solutions in a normal-form game framework. Moreover, considerations extend to incorporating probability distributions governing the driving styles [162,163] and actions [22] of surrounding vehicles in the interactive process. This accommodation of imperfect information enhances the practical applicability of game theory in mixed-traffic environments.
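For a simultaneous-move (normal-form) game, pure-strategy Nash Equilibria can be found by checking every action pair for profitable unilateral deviations. The sketch below uses made-up payoffs for an AV and a rear vehicle; it is not the payoff design of any cited study, and it does not search for mixed-strategy equilibria.

```python
def pure_nash(payoff_a, payoff_b):
    """List all pure-strategy Nash Equilibria of a two-player bimatrix game.

    payoff_a[i][j], payoff_b[i][j]: payoffs of players A and B when
    A plays i and B plays j. A pair (i, j) is an equilibrium if neither
    player can gain by deviating alone.
    """
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            a_is_best = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(rows))
            b_is_best = all(payoff_b[i][j] >= payoff_b[i][k] for k in range(cols))
            if a_is_best and b_is_best:
                equilibria.append((i, j))
    return equilibria

# Hypothetical payoffs.
# AV:           0 = keep lane, 1 = change lane
# Rear vehicle: 0 = yield,     1 = accelerate
payoff_av = [[1, 1],
             [4, -10]]
payoff_rear = [[2, 3],
               [1, -10]]
equilibria = pure_nash(payoff_av, payoff_rear)
```

This toy game has two equilibria, (keep lane, accelerate) and (change lane, yield), which illustrates why additional information such as driving-style probabilities is often used to select among them.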

3.4.3 Decision-Making with Incomplete Information

Incomplete information games are characterized by unknown elements about other players, including their available actions, payoffs, or strategies. This situation resembles driving environments where the AV lacks comprehensive information about its surrounding vehicles. The AV must therefore anticipate their behavior, which is achieved by observing driving performance or recognizing driving styles to predict the actions of other vehicles and their responses to the AV’s maneuvers. The Bayesian game emerges as a strategic decision-making model tailored to situations with incomplete information among players. The essential components of this game include a set of players, action sets, type sets, payoff functions, and a probability distribution over all possible type profiles. Diverging from conventional game theory models where players possess complete information, Bayesian games overcome this limitation by integrating uncertainty and probabilistic reasoning into the decision-making process. Within this framework, a player assesses payoffs as expected values derived from a probability distribution encompassing all potential player types. Zhang et al. [164] classified the aggressiveness of surrounding vehicles into three types from the perspective of the AV: equally aggressive, less aggressive, and more aggressive. The probability distributions of these three categories are observable and subject to updates. The AV and its surrounding vehicles each aim to maximize their expected payoff, considering their beliefs and the strategies employed by other players, ultimately converging to a Bayesian Nash Equilibrium. Therefore, Bayesian games blend Bayesian probabilistic reasoning with game theory, serving as a framework to model and simulate interactive behaviors. This integration empowers AVs to adapt dynamically and make well-informed decisions in evolving and uncertain traffic scenarios.
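The two ingredients described above, a belief over driver types that is updated from observations and an expected-payoff comparison under that belief, can be sketched as follows. The three type labels follow the aggressiveness classes mentioned in the text, but the likelihoods and payoffs are invented, and the sketch assumes each type’s response is already folded into the AV’s payoff table rather than solving for a full Bayesian Nash Equilibrium.

```python
def update_belief(prior, likelihood):
    """Bayes' rule over driver types: posterior is proportional to
    prior[type] * P(observation | type)."""
    unnormalized = {t: prior[t] * likelihood[t] for t in prior}
    total = sum(unnormalized.values())
    return {t: v / total for t, v in unnormalized.items()}

def best_action(belief, payoff):
    """AV action with the highest expected payoff; payoff[action][type]."""
    def expected(action):
        return sum(belief[t] * payoff[action][t] for t in belief)
    return max(payoff, key=expected)

prior = {"less": 1 / 3, "equal": 1 / 3, "more": 1 / 3}
# Hypothetical P(rear vehicle accelerates hard | type).
likelihood = {"less": 0.1, "equal": 0.3, "more": 0.6}
# Hypothetical AV payoffs for each action against each driver type.
payoff = {
    "change": {"less": 5.0, "equal": 3.0, "more": -6.0},
    "keep": {"less": 0.0, "equal": 0.0, "more": 0.0},
}

before = best_action(prior, payoff)
posterior = update_belief(prior, likelihood)
after = best_action(posterior, payoff)
```

Under the uniform prior the lane change has a positive expected payoff, but after observing a hard acceleration the belief shifts towards the more aggressive type and keeping the lane becomes preferable.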

In scenarios devoid of vehicle-to-vehicle communication or specific coordination, the level-$k$ game proves applicable to incomplete information games. Level-$k$ game modeling facilitates a realistic simulation of interactions between AVs and HDVs by considering agents’ different levels of strategic thinking. This approach provides insights into the decision-making processes of both autonomous and human drivers, aiding researchers and engineers in comprehending how varying cognitive levels impact driving behaviors [165]. Additionally, AVs can adjust their decision-making strategies based on the perceived cognitive levels of human drivers, enhancing their ability to navigate mixed traffic scenarios adeptly [166]. Examples of level-$k$ games can be found in Refs. [167-169].
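Level-$k$ reasoning can be sketched recursively: a level-0 agent follows a fixed, non-strategic rule, and a level-$k$ agent best-responds to an opponent assumed to reason at level $k-1$. The one-shot matrix game, the choice of "go" as the level-0 action, and the payoffs below are illustrative assumptions rather than the models of Refs. [165-169].

```python
def best_response(payoff, opponent_action):
    """Own action maximizing payoff[own][opponent] for a known opponent action."""
    return max(range(len(payoff)), key=lambda a: payoff[a][opponent_action])

def level_k_action(k, own_payoff, opp_payoff, level0_action=0):
    """Action chosen by a level-k agent in a two-player matrix game.

    Both payoff tables are indexed as payoff[own action][other's action].
    A level-0 agent always plays level0_action (an assumed naive rule).
    """
    if k == 0:
        return level0_action
    opponent = level_k_action(k - 1, opp_payoff, own_payoff, level0_action)
    return best_response(own_payoff, opponent)

# Hypothetical symmetric intersection game: 0 = go, 1 = yield.
payoff = [[-10, 3],   # go:    crash if the other goes, gain if it yields
          [0, 1]]     # yield: safe either way
actions = [level_k_action(k, payoff, payoff) for k in range(3)]
```

In this toy game a level-1 driver yields to the naive level-0 driver, while a level-2 driver proceeds because it expects a level-1 opponent to yield, so an AV that estimates a human driver’s level can anticipate which of these behaviors to expect.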

In summary, applying game theory to interactive driving in AVs offers a structured way to analyze and predict the outcomes of multi-agent interactions under various information conditions. By categorizing research according to the type of information available, this review highlights the breadth of game-theoretical approaches to solving the complex challenges of autonomous navigation.

3.5 Reinforcement Learning

RL has been identified as a pivotal methodology in advancing the interactive driving capabilities of AVs. This approach enables AVs to learn optimal behaviors through trial-and-error interactions with their environment, improving safety, efficiency, and adaptability in complex driving scenarios [15]. The agent determines its actions based on its current state and obtains feedback through rewards, as illustrated in Fig. 9. The policy $\pi$ gives the probability of executing action a in state s at the current time step t, as shown in Eq. (11). The action-value function ${Q}_{\pi }\left({s, a}\right)$ is defined in Eq. (12) as the expected discounted return obtained by taking initial action a (e.g., position, acceleration, or steering angle) in state s (such as relative velocity or position) and following policy $\pi$ thereafter. The objective is to learn an optimal policy ${\pi }^{ * }$, as detailed in Eq. (14), which maximizes cumulative rewards over time, thereby enabling safe and efficient navigation [171].

(11) $ \pi \left({a, s}\right) = \Pr \left({{a}_{t} = a \mid {s}_{t} = s}\right) $

(12) $ {Q}_{\pi }\left({s, a}\right) = \mathbb{E}\left({R \mid s, a,\pi }\right) $

where R stands for the discounted return, defined in Eq. (13) as the cumulative sum of future discounted rewards. At the current time step t, the AV chooses an action ${a}_{t}$ from the set of available actions according to the current state ${s}_{t}$ and the reward ${r}_{t}$ received; $t + 1$ denotes the next time step.

(13) $ R = \mathop{\sum }\limits_{{t = 0}}^{\infty }{\gamma }^{t}{r}_{t + 1} = {r}_{1} + \gamma {r}_{2} + {\gamma }^{2}{r}_{3} + \ldots $

where $\gamma$ is the discount factor and ${r}_{t + 1}$ is the reward obtained for transitioning from ${s}_{t}$ to ${s}_{t + 1}$.

(14) $ {\pi }^{ * } = \mathop{\operatorname{argmax}}\limits_{\pi }{Q}_{\pi }\left({s, a}\right) $
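The quantities in Eqs. (11)-(14) can be illustrated with a minimal tabular sketch: the discounted return of Eq. (13) truncated to a finite episode, one step of standard Q-learning as an example of estimating action values, and the greedy policy derived from them. Q-learning is used here only as a familiar illustration and is not claimed to be the method of any cited work; the state and action labels are hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """R = sum_t gamma**t * r_{t+1} over a finite list of rewards (Eq. 13)."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def q_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the dictionary q[(state, action)]."""
    target = r + gamma * max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)

def greedy_action(q, s, actions):
    """Action with the largest estimated value in state s."""
    return max(actions, key=lambda a: q.get((s, a), 0.0))

# Hypothetical discrete driving abstraction.
actions = ["brake", "accelerate"]
q_table = {}
# Braking while close to the leader earned reward 1 and led to a safe gap.
q_update(q_table, "close", "brake", 1.0, "safe", actions)
choice = greedy_action(q_table, "close", actions)
episode_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

DRL methods replace the dictionary with a neural network so that continuous, high-dimensional states can be handled, but the return and the greedy choice over action values play the same roles.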

Deep reinforcement learning (DRL) integrates RL with deep learning techniques, aiding scalability in handling high-dimensional states and action spaces [172, 173]. Leveraging deep neural networks, DRL processes intricate sensory data from the vehicle’s sensors, enhancing comprehension of the driving environment [174, 175]. This capability allows the DRL agent to address complex decision-making scenarios by interpreting raw sensory inputs, consequently enhancing maneuvering performance through dynamic and unpredictable road conditions [176]. RL and DRL have been applied across various driving tasks, demonstrating AVs’ capability to learn and adjust to changing surroundings. Through environmental interaction, RL/DRL agents can learn optimal strategies for speed control [177], acceleration [178,179], and steering angle [180], which are critical components for ensuring safe and efficient navigation. This literature review explores the state of the art of RL and DRL applications in interactive driving, focusing on car-following and lane-changing traffic scenarios.

3.5.1 Path Planning for Car-Following

Car-following models are crucial for understanding and predicting the behavior of an AV as it follows another vehicle. The RL agent, responsible for controlling the following vehicle, continually observes environmental states and takes actions to maximize long-term rewards. The reward is designed to maintain an optimal distance from the front vehicle, avoiding collisions and minimizing unnecessary speed changes. The agent regularly adjusts its policy based on environmental feedback through iterative refinement, enhancing its proficiency in car-following maneuvers over successive iterations.
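A reward of the kind described above can be sketched as a function of the gap to the front vehicle, the ego speed, and the commanded acceleration. The time-headway target, the collision penalty, and the weights are hypothetical choices made for illustration; the cited studies define their own reward terms.

```python
def car_following_reward(gap, ego_speed, accel,
                         desired_headway=1.5, collision_gap=0.5):
    """Illustrative car-following reward (higher is better).

    gap: bumper-to-bumper distance to the front vehicle (m).
    ego_speed: speed of the following vehicle (m/s).
    accel: commanded acceleration (m/s^2).
    desired_headway (s), collision_gap (m), and the 0.1 comfort weight
    are assumed values.
    """
    if gap <= collision_gap:
        return -100.0                       # large penalty for a collision
    desired_gap = desired_headway * ego_speed
    gap_error = abs(gap - desired_gap) / max(desired_gap, 1.0)
    comfort_penalty = 0.1 * accel ** 2      # discourage abrupt speed changes
    return -gap_error - comfort_penalty

r_ideal = car_following_reward(gap=15.0, ego_speed=10.0, accel=0.0)
r_harsh = car_following_reward(gap=15.0, ego_speed=10.0, accel=2.0)
r_crash = car_following_reward(gap=0.2, ego_speed=10.0, accel=0.0)
```

The agent receives the highest reward when it holds the desired gap with no acceleration, a lower reward for the same gap reached with harsh control, and a strongly negative reward on collision.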

Advanced RL algorithms, including DRL, are adept at navigating the high-dimensional state spaces characteristic of driving scenarios. These algorithms enable vehicles to adjust their behavior dynamically in response to complex traffic conditions and the unpredictable maneuvers of leading vehicles, thereby facilitating efficient and safe car-following behaviors. Wang et al. [181] developed a soft actor-critic (SAC) algorithm to select appropriate acceleration levels to control velocity for interactive driving. The highD dataset serves as the basis for training and testing the algorithm. Throughout the training phase, the algorithm progressively learns to prevent collisions between the leading and following vehicles, ultimately achieving convergence in its decision-making process. Other applications using a deep deterministic policy gradient (DDPG) model for interactive decision-making can be found in Refs. [182-184].

3.5.2 Path Planning for Lane-Changing

Lane-changing is a complex maneuver that involves evaluating multiple factors, including vehicles in the current lane and those in adjacent lanes [185-187]. RL enables AVs to adeptly navigate lane-changing complexities through direct interaction with the environment, including multiple lanes, surrounding vehicles with varying speeds and positions, and diverse road conditions. Through this process, the agents refine their behavior to ensure safety, efficiency, and adherence to traffic regulations while adapting to the dynamic and often unpredictable nature of road traffic [188, 189]. DRL facilitates interactive driving in lane-changing scenarios by harnessing the combined capabilities of deep learning and RL. This approach optimizes acceleration and steering commands in real-time traffic, leading to smoother and more efficient driving behaviors [190]. DRL applications in lane-changing can be found in Refs. [191-193]. These DRL solutions enhance the safety and efficiency of AVs by making informed, real-time control decisions based on the driving state and the behavior of surrounding vehicles.

In summary, RL and DRL present a robust framework for developing autonomous driving systems that learn and adapt to various traffic conditions, as summarized in Table 4. Through continuous interaction and feedback, these systems progressively refine their decision-making strategies, improving their proficiency in executing tasks. The existing surveys [195-198] highlight the transformative potential of RL and DRL in the domain of autonomous driving, offering flexible, adaptive, and intelligent approaches to overcoming the complexities of real-world traffic navigation.

3.6 POMDP

POMDPs have been increasingly applied to tackle complex path planning challenges in autonomous driving, particularly in scenarios characterized by uncertainty and partial observability [199, 200]. A POMDP models the relationship between the AV (agent) and its environment. Formally, a POMDP is represented by a 7-tuple $(S, A, T, R,\Omega, O,\gamma )$, where $S$, $A$, and $\Omega$ denote the sets of AV states, AV actions, and observations of surrounding vehicles, respectively. Additionally, T represents the conditional transition probabilities between states of the AV, and O represents the conditional observation probabilities regarding the surrounding vehicles. R is the reward function, and $\gamma$ is a discount factor that influences the valuation of future rewards.

Extensive research has addressed challenges in decision-making and planning for AVs within unsignalized traffic scenarios using POMDPs [201]. Due to the challenges associated with directly measuring certain state information, such as the routes of surrounding vehicles, a probabilistic distribution known as the belief state b is introduced. The belief state captures the probability of being in a particular state. The selection of an optimal action ${a}^{ * }$, based on the current belief state at the current time step t, enables the maximization of the cumulative expected reward, as shown in Eq. (15).

(15) $ {a}^{ * } = \mathop{\operatorname{argmax}}\limits_{{a \in A}}\mathbb{E}\left\lbrack {\mathop{\sum }\limits_{{t = 0}}^{\infty }{\gamma }^{t}R\left({{s}_{t},{a}_{t} \mid {b}_{t}}\right) }\right\rbrack $

Upon executing an action and acquiring new observations, the AV updates its belief state using Bayes’ rule for the next time step $t + 1$, as shown in Eq. (16).

(16) $ {b}_{t + 1}\left({s}_{t + 1}\right) = \eta\, O\left({{o}_{t} \mid {s}_{t + 1},{a}_{t}}\right) \mathop{\sum }\limits_{{{s}_{t} \in S}}T\left({{s}_{t + 1} \mid {s}_{t},{a}_{t}}\right) {b}_{t}\left({s}_{t}\right) $

where $\eta$ is a normalizing constant. The overall decision-making process under uncertainty for AVs using POMDP is outlined in Fig. 10. The online part includes environmental understanding to determine road constraints and a POMDP solver to complete the task. The offline part involves extracting trajectories from datasets to construct a predictive model for the movement of surrounding vehicles. A POMDP solver is used to obtain the optimal action for AVs, integrating insights from both the offline and online components.
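The belief update of Eq. (16) can be written directly for a small discrete problem. In the sketch below, the hidden state is taken to be a surrounding vehicle’s intention, which is an assumption made for illustration, and the transition and observation tables are invented. The dictionary layouts `T[(s, a)][s_next]` and `O[(s_next, a)][o]` are likewise hypothetical conventions.

```python
def belief_update(belief, action, observation, T, O):
    """Discrete Bayes filter implementing Eq. (16).

    belief: dict state -> probability b_t(s).
    T[(s, a)][s_next]: transition probability T(s_next | s, a).
    O[(s_next, a)][o]: observation probability O(o | s_next, a).
    Raises ZeroDivisionError if the observation has zero probability
    under the current belief.
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        predicted = sum(T[(s, action)].get(s_next, 0.0) * belief[s]
                        for s in states)
        unnormalized[s_next] = O[(s_next, action)].get(observation, 0.0) * predicted
    eta = 1.0 / sum(unnormalized.values())  # normalizing constant
    return {s: eta * v for s, v in unnormalized.items()}

# Hypothetical merging example: the other driver's intention is fixed
# but unknown, and the AV observes whether that vehicle is slowing.
T = {("yield", "wait"): {"yield": 1.0},
     ("go", "wait"): {"go": 1.0}}
O = {("yield", "wait"): {"slowing": 0.8, "steady": 0.2},
     ("go", "wait"): {"slowing": 0.2, "steady": 0.8}}
b0 = {"yield": 0.5, "go": 0.5}
b1 = belief_update(b0, "wait", "slowing", T, O)
b2 = belief_update(b1, "wait", "slowing", T, O)
```

Each consistent observation sharpens the belief towards the yielding intention, and a POMDP solver would select actions such as merging or waiting based on this evolving distribution.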

The applications of this approach span four different traffic scenarios: lane-changing, merging, T-junctions, and intersections. These scenarios, characterized by increased conflict points compared to signalized traffic scenarios, contribute to more complex interactions in dynamic driving settings, thereby intensifying the complexity of the planning problem [16]. By accounting for uncertainty and partial observability characteristics, POMDPs enable AVs to make informed decisions, improving their ability to interact with other road users and navigate complex traffic environments [202].

3.6.1 Path Planning for Lane-Changing with Uncertainty

Lane-changing is a dynamic scenario where the AV must consider the behavior of multiple vehicles, often in dense traffic conditions. The challenge lies in the uncertainty about other vehicles’ future actions and the need to make real-time decisions. The POMDP offers a valuable framework for capturing uncertainty in the lane-changing scenario [203, 204]. Li et al. [205] proposed a Multi-modal Driving Intention POMDP (MDI-POMDP) to estimate the multi-modal distribution of surrounding vehicles’ driving intentions, introducing a multi-modal collision safety potential to comprehensively assess collision risks in a multi-lane scenario. However, most of these methodologies are validated solely through simulations or well-annotated datasets, raising concerns about their applicability in real-world AVs. The complexity escalates when planning for an actual AV with closed-loop execution. Onboard planning necessitates navigating an imperfect world and naturally engaging with other traffic participants. Ding et al. [206] introduced an efficient interaction-aware planning system named EPSILON for AVs, formulating behavior planning using the POMDP. Accordingly, the AV accelerates and executes an active merge into the nearby lane for automatic lane-changing, enhancing travel efficiency.

3.6.2 Path Planning for Other Scenarios with Uncertainty

Merging, unsignalized T-junctions, and intersections present significant challenges for AVs due to the complex dynamics of interacting vehicles and the need to make decisions based on incomplete information. In merging scenarios, the AV observes the following vehicle in the left lane either decelerating to facilitate a merge or maintaining its current speed while disregarding the AV. Based on the inferred intent of this following vehicle, the AV needs to decide when to merge. Some applications can be found in Refs. [207, 208], enabling the planning of merging maneuvers based on drivers’ latent behaviors.

In unsignalized T-junction scenarios, Wray et al. [209] introduced an innovative approach to address limited visibility challenges in AVs engaged in interactive driving, drawing on the concept of virtual vehicles. Initially, the AV halts at the stop line, then slowly advances into the intersection, mindful of its restricted visibility. A momentary deceleration occurs while the POMDP ascertains the absence of actual vehicles. With increased confidence that there are no other vehicles, the AV resumes forward movement, successfully navigating through the T-junction. This strategic sequence ensures a safe and effective maneuver in scenarios with limited visibility. Moreover, applications of the POMDP that ensure safe and effective maneuvers when the intentions of the surrounding vehicles are unknown can be found in Refs. [210, 211].
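The way confidence grows as the AV creeps forward can be illustrated with a binary Bayesian update on whether a vehicle is hidden in the occluded region. This is a minimal sketch and not the formulation of Ref. [209]: the function names, the prior, the detection probability `p_detect`, the false-alarm probability `p_false_alarm`, and the clearance threshold are all assumed values chosen for illustration.

```python
def update_occupancy_belief(belief, detected, p_detect=0.7, p_false_alarm=0.05):
    """Bayes update of P(a vehicle is hidden in the occluded region).

    p_detect: assumed probability of detecting a vehicle that is present.
    p_false_alarm: assumed probability of a detection when none is present.
    """
    if detected:
        like_present, like_absent = p_detect, p_false_alarm
    else:
        like_present, like_absent = 1.0 - p_detect, 1.0 - p_false_alarm
    numerator = like_present * belief
    evidence = numerator + like_absent * (1.0 - belief)  # normalizing term
    return numerator / evidence


def steps_until_clear(prior=0.5, threshold=0.05, max_steps=50):
    """Count consecutive 'no detection' observations needed before proceeding."""
    belief = prior
    for step in range(1, max_steps + 1):
        belief = update_occupancy_belief(belief, detected=False)
        if belief < threshold:
            return step, belief
    return max_steps, belief


if __name__ == "__main__":
    steps, belief = steps_until_clear()
    print(f"proceed after {steps} clear observations, belief = {belief:.3f}")
```

Under these assumed sensor parameters, each clear observation lowers the occupancy belief, and the AV resumes motion only once the belief falls below the threshold. A full POMDP planner would additionally weigh the cost of waiting against the collision risk.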

Left-turn planning at unsignalized intersections poses a prevalent and perilous challenge for AVs, particularly when oncoming vehicles fail to signal their turning intentions [212]. Amidst the inherent uncertainty, employing the POMDP model is a promising approach to enhance efficiency at intersections. The oncoming vehicle typically exhibits one of three distinct intentions: proceeding straight, executing a right turn, or initiating a left turn [213]. Xia et al. [214] used a belief tracker to deduce the intention by analyzing observed vehicle motion. The AV then selects the optimal action from three alternatives: acceleration, deceleration, or maintaining the current speed. It transitions into the right-turn lane to ensure a secure distance from surrounding vehicles. Examples of establishing collision-free trajectories in intersection scenarios can be found in Refs. [215-217]. Round intersections pose increased challenges due to the heightened difficulty in perceiving other vehicles within and near the intersection. Li et al. [218] reported an object-oriented POMDP (OOPOMDP) algorithm focusing on the decision-making processes of AVs when navigating around intersections. In each decision cycle, AVs acquire the vehicle’s observation history, implementing the policy-prediction method to distinguish the policy concerning the last segment of the vehicle’s trajectory. The authors augment each vehicle state with the potential policy, facilitating its utilization in state transitions during the Monte-Carlo Tree Search. Therefore, AVs can effectively navigate around intersections in multi-vehicle scenarios, avoiding collisions.
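The interplay between a belief tracker over the three intentions and the three-way action choice can be sketched as follows. This is an illustrative simplification rather than the method of Ref. [214]: the yaw-rate observation model, its sign convention, the numbers in `MEAN_YAW_RATE`, and the `COST` table are hypothetical. The action is chosen by a one-step expected cost, whereas a POMDP solver would plan over a multi-step horizon.

```python
import math

INTENTIONS = ("straight", "right_turn", "left_turn")
ACTIONS = ("accelerate", "maintain", "decelerate")

# Hypothetical mean yaw rate (rad/s) of the oncoming vehicle per intention.
MEAN_YAW_RATE = {"straight": 0.0, "right_turn": -0.3, "left_turn": 0.3}

# Hypothetical cost of each AV action given the oncoming vehicle's intention.
COST = {
    "accelerate": {"straight": 10.0, "right_turn": 1.0, "left_turn": 4.0},
    "maintain": {"straight": 5.0, "right_turn": 1.5, "left_turn": 3.0},
    "decelerate": {"straight": 2.0, "right_turn": 3.0, "left_turn": 2.0},
}


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief, observed_yaw_rate, std=0.1):
    """Bayes update of the intention belief from one yaw-rate observation."""
    unnormalized = {
        i: gaussian_pdf(observed_yaw_rate, MEAN_YAW_RATE[i], std) * belief[i]
        for i in INTENTIONS
    }
    total = sum(unnormalized.values())
    return {i: value / total for i, value in unnormalized.items()}


def best_action(belief):
    """Pick the action with the lowest expected cost under the belief."""
    expected = {
        a: sum(belief[i] * COST[a][i] for i in INTENTIONS) for a in ACTIONS
    }
    return min(expected, key=expected.get)


if __name__ == "__main__":
    belief = {i: 1.0 / len(INTENTIONS) for i in INTENTIONS}
    print(best_action(belief))            # cautious while intention is unknown
    belief = update_belief(belief, -0.3)  # motion consistent with a right turn
    print(belief, best_action(belief))
```

With a uniform belief the assumed costs favor decelerating. Once the observed motion concentrates the belief on one intention, the preferred action shifts accordingly.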

In summary, POMDPs provide a powerful tool for enhancing the decision-making capabilities of AVs in complex and uncertain traffic scenarios, as summarized in Table 5. POMDPs offer a robust framework for modeling the decision-making process under uncertainty, enabling AVs to make well-informed decisions when information about their surroundings or the intentions of other road users is incomplete [219].

3.7 Other Approaches

The field of interactive driving for AVs involves a wide range of techniques, each contributing unique insights and capabilities to the development of AV systems. Beyond the methodologies discussed above, several other approaches have shown promise in enhancing the interactive driving capabilities of AVs. This section explores these diverse methodologies and their applications in the context of interactive driving.

3.7.1 Model Predictive Control (MPC)

The MPC creates a predictive model of a system’s future behavior and uses optimization techniques to plan a control strategy that follows a desired trajectory while adhering to system constraints. The MPC generally begins with an accurate model of the vehicle’s dynamics. Based on this accurate vehicle dynamic model, it predicts the future state of the vehicle over a specified prediction horizon. This horizon spans several time steps into the future and predicts where the vehicle will be and how it will behave under current and potential control inputs. The core of MPC is an optimization algorithm that determines the optimal control inputs, such as steering angle and acceleration, to minimize a cost function. In addition, the MPC must consider the behavior of other vehicles in scenarios such as dense urban traffic or highways with other drivers. It uses sensors and data inputs to predict the movements of surrounding vehicles and adjusts its control strategy to avoid collisions, often considering multiple potential future scenarios. Finally, the output of the MPC is fed into the vehicle’s control systems, enabling the actuation of steering and acceleration accordingly. Therefore, its application in AVs is essential for precise maneuvering and trajectory planning, which is crucial for navigating complex traffic scenarios. Some applications can be found in Refs. [224-227] to enhance comprehension of the dynamic interactions between an AV and its surrounding vehicles.
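The receding-horizon idea described above can be sketched for a one-dimensional car-following case. This is a toy illustration under stated assumptions, not a method from Refs. [224-227]: the ego vehicle is a point mass, the lead vehicle is predicted with constant velocity, and the optimizer is a brute-force search over a small set of accelerations. Practical MPC would use a proper vehicle dynamics model and a numerical solver. All constants and function names are hypothetical.

```python
from itertools import product

DT = 0.5                         # time step in s (assumed)
HORIZON = 4                      # prediction steps (assumed)
ACCELS = (-3.0, -1.5, 0.0, 1.5)  # candidate accelerations in m/s^2 (assumed)
MIN_GAP = 10.0                   # required gap to the lead vehicle in m (assumed)


def rollout(pos, vel, accels):
    """Predict ego (position, speed) pairs with a point-mass model."""
    states = []
    for a in accels:
        next_vel = max(0.0, vel + a * DT)
        pos += 0.5 * (vel + next_vel) * DT
        vel = next_vel
        states.append((pos, vel))
    return states


def mpc_step(ego_pos, ego_vel, lead_pos, lead_vel, target_vel):
    """Return the first acceleration of the cheapest feasible sequence."""
    best_cost, best_seq = float("inf"), None
    for seq in product(ACCELS, repeat=HORIZON):
        cost, feasible = 0.0, True
        states = rollout(ego_pos, ego_vel, seq)
        for k, ((pos, vel), a) in enumerate(zip(states, seq), start=1):
            lead_pred = lead_pos + lead_vel * DT * k  # constant-velocity prediction
            if lead_pred - pos < MIN_GAP:             # gap constraint violated
                feasible = False
                break
            # Cost: speed-tracking error plus a small control-effort penalty.
            cost += (vel - target_vel) ** 2 + 0.1 * a ** 2
        if feasible and cost < best_cost:
            best_cost, best_seq = cost, seq
    if best_seq is None:
        return min(ACCELS)  # no feasible plan: fall back to hardest braking
    return best_seq[0]


if __name__ == "__main__":
    print(mpc_step(0.0, 10.0, 1000.0, 15.0, 15.0))  # free road
    print(mpc_step(0.0, 15.0, 18.0, 10.0, 15.0))    # slower vehicle ahead
```

Only the first input of the best sequence is applied. At the next time step the optimization is repeated from the newly measured state, which is what lets the controller react to changes in the surrounding vehicles' behavior.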

3.7.2 Generative Adversarial Networks (GANs)

GANs extend the capabilities of autonomous and interactive driving systems through enhanced data augmentation, improved perception, and the simulation and prediction of complex scenarios. However, the practical implementation must carefully address the challenges related to training stability and computational demands. For example, Guo et al. [39] introduced a map-enhanced GAN (ME-GAN) that incorporates interaction information to enhance the accuracy of vehicle trajectory predictions. Ma and Qu [18] utilized a conditional GAN (CGAN) to refine the action estimation produced by an encoder-decoder framework to improve the stability and efficiency of mixed traffic flows.

3.7.3 Graph Neural Network (GNN)

GNNs offer a powerful framework for modeling the relational data inherent in traffic systems [228]. Specifically, each road user can be represented as a node within a graph, with their historical dynamics (such as position, velocity, and heading angle) processed by an RNN encoder to serve as node features. The interactions between these users are characterized as edge attributes within the graph [229]. Integrating attention mechanisms with GNNs facilitates the dynamic weighting of importance among the neighbors of each node, which is critical for AVs to prioritize relevant agents within their surroundings. This prioritization improves the accuracy of interaction predictions and decision-making in traffic scenarios [230].
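The attention-based weighting of neighbors can be illustrated with a single aggregation step for the ego node. This is a minimal sketch assuming scaled dot-product scores and hand-written feature vectors. In the cited works, the node features would come from a trained encoder and the scores from learned projections, neither of which is modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(ego_feature, neighbor_features):
    """One attention-weighted aggregation step for the ego node.

    Scores are scaled dot products between the ego feature and each
    neighbor feature (an assumed, unlearned scoring function).
    """
    dim = len(ego_feature)
    scores = [
        sum(q * k for q, k in zip(ego_feature, nb)) / math.sqrt(dim)
        for nb in neighbor_features
    ]
    weights = softmax(scores)
    aggregated = [
        sum(w * nb[d] for w, nb in zip(weights, neighbor_features))
        for d in range(dim)
    ]
    return weights, aggregated


if __name__ == "__main__":
    ego = [1.0, 0.0]
    neighbors = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
    weights, message = attention_aggregate(ego, neighbors)
    print(weights, message)
```

Neighbors whose features align more closely with the ego feature receive larger weights, so they dominate the aggregated message passed to the ego node.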

3.7.4 Gaussian Mixture Model-Hidden Markov Model (GMM-HMM)

The GMM-HMM approach combines the strengths of Gaussian Mixture Models and Hidden Markov Models to provide a probabilistic framework for assessing risks and making safe driving decisions. GMMs can model the distribution of various driving behaviors as a mixture of multiple Gaussian distributions, effectively capturing the diversity in driving styles. The HMM then leverages these distributions to predict a driver’s sequential actions based on temporal dependencies. This predictive capability allows the AVs to anticipate and respond to human driver behaviors in surrounding vehicles. This combination involves analyzing vehicles’ spatial and temporal relationships to predict their future positions and actions. For instance, if a leading vehicle is predicted to brake, the model can assess the likelihood of a following vehicle braking in response and prepare the AV for such scenarios [231]. Therefore, the GMM-HMM provides a robust method for modeling complex behaviors and interactions, offering insights into understanding both the distribution of driving behaviors and their temporal progression.
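A compact way to see how the two models fit together is a forward filter in which each hidden driving state emits observations through its own Gaussian mixture. The sketch below is illustrative only: the two states, the transition probabilities, and the mixture parameters over longitudinal acceleration are assumed values, not parameters taken from Ref. [231], and in practice they would be fitted from driving data.

```python
import math

STATES = ("cruise", "brake")

# Assumed state transition probabilities P(next | current).
TRANSITION = {
    "cruise": {"cruise": 0.9, "brake": 0.1},
    "brake": {"cruise": 0.3, "brake": 0.7},
}

# Assumed emission mixtures over acceleration (m/s^2): (weight, mean, std).
EMISSION_GMM = {
    "cruise": [(0.7, 0.0, 0.3), (0.3, 0.8, 0.4)],
    "brake": [(0.6, -1.5, 0.5), (0.4, -3.5, 0.8)],
}


def gmm_pdf(x, components):
    """Density of a one-dimensional Gaussian mixture at x."""
    density = 0.0
    for weight, mean, std in components:
        z = (x - mean) / std
        density += weight * math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return density


def predict_next(belief):
    """Propagate the state belief one step through the transition model."""
    return {s: sum(belief[p] * TRANSITION[p][s] for p in STATES) for s in STATES}


def forward_filter(observations, prior):
    """HMM forward algorithm with GMM emissions; returns the final belief."""
    belief = dict(prior)
    for obs in observations:
        predicted = predict_next(belief)
        unnormalized = {s: predicted[s] * gmm_pdf(obs, EMISSION_GMM[s]) for s in STATES}
        total = sum(unnormalized.values())
        belief = {s: value / total for s, value in unnormalized.items()}
    return belief


if __name__ == "__main__":
    prior = {"cruise": 0.5, "brake": 0.5}
    belief = forward_filter([-1.4, -2.0, -1.8], prior)
    print(belief, predict_next(belief))
```

The filtered belief summarizes what the vehicle is currently doing, and `predict_next` gives the probability that it will be braking at the next step, which is the quantity an AV could use to prepare its response.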

Exploring various approaches, including the MPC, GANs, GNNs, and GMM-HMM, enriches the toolkit available for enhancing interactive driving in AVs. Each methodology offers distinct advantages in processing and decision-making capabilities, underscoring the multidisciplinary nature of advancements in autonomous driving technologies.

4 Comparison and Future Works

4.1 Comparative Analysis of Approaches

This literature review covers many approaches to offer a comprehensive overview of the state of the art in interactive driving for AVs. The LSTM is particularly effective in scenarios necessitating the comprehension and prediction of temporal event sequences, such as predicting the movements of surrounding vehicles. Transformers bring substantial benefits in handling sequential data and modeling complex interactions. Nevertheless, their computational complexity, extensive data requirements, and difficulties in real-time application present significant drawbacks. The APF provides straightforward solutions for navigation and obstacle avoidance, but its efficacy diminishes in complex and dynamic environments. Game theory excels in situations requiring strategic decision-making among multiple agents, though its real-time application in driving contexts poses implementation challenges. RL and DRL demonstrate robust capabilities in learning and optimizing driving strategies through environmental interactions. However, these capabilities are hindered by the need for extensive training and substantial computational resources. Despite its high computational demands, the POMDP is adept at dealing with uncertainty and facilitating informed decision-making within partially observable environments. The MPC provides significant benefits with its predictive capabilities and adaptability, though it also encounters challenges related to computational demands and dependency on accurate models. GANs hold transformative potential for autonomous driving by enhancing data availability and simulation capabilities; however, they pose significant challenges due to training complexity and data reliability. GNNs are effective tools for modeling complex and interconnected data in autonomous driving but are constrained by computational demands. The GMM-HMM offers advantages in modeling and predicting complex and temporal behaviors in interactive driving scenarios, but it requires careful model tuning to address associated challenges. Table 6 summarizes the pros and cons associated with each method.

4.2 Future Work Discussion

Table 6 illustrates that each approach encounters significant challenges. These challenges include, but are not limited to, balancing computational efficiency with predictive performance in real-time applications and modeling consistent and reliable interactions to ensure high accuracy and performance across various driving conditions. The following summary outlines potential directions for future research.

4.2.1 Real-Time Processing Efficiency

Communication delays in AVs may lead to overoptimistic performance assessments. This issue underscores the importance of real-time processing efficiency for AVs to operate safely and effectively. The main challenge lies in minimizing latency while managing the extensive and diverse input information required for accurate trajectory prediction, such as road design and vehicle type. Advanced methods must rapidly process large volumes of data from various sources, integrating parallel planning with parallel vision [232] and parallel control [233] to ensure safe and efficient autonomous driving in dynamic traffic scenarios. Future research should aim to optimize algorithms to mitigate adverse factors, such as latency and data loss, and to improve inference efficiency for practical deployment [234]. This could be achieved through edge computing and distributed processing.

4.2.2 Accuracy and Performance Enhancement

Accuracy and performance are critical aspects of modeling interactions between surrounding vehicles and AVs, ensuring that relationships remain consistent and reliable over time [235, 236]. Achieving these goals involves overcoming challenges related to multimodal multi-agent prediction, large dataset requirements, loss function design, and extensive real-world testing.

Due to the inherent uncertainty in driver behaviors, mathematical models incorporating driver dynamics may not accurately represent actual behaviors. Integrating advanced knowledge from physics or traffic studies, such as detailed driver characteristics [237] and naturalistic driver trajectory databases, can enhance interpretability for multimodal driver behaviors. Moreover, it is essential to extend the current framework to include multi-agent prediction, allowing for the prediction of multiple potential future behaviors for a group of surrounding vehicles [133].

To address the aforementioned complexities and further improve AV performance, incorporating diverse datasets that capture a wide range of driving scenarios and behaviors is essential for enhancing the generalization capability of algorithms in predicting human drivers’ intentions and motions. This approach will also promote the generation of more human-like, secure, and interpretable decision-making behaviors for AV testing across different conditions [238]. Virtual simulation testing is increasingly vital for assessing the feasibility of AV driving strategies. Mainstream methods prioritize testing efficiency by extracting critical scenarios from naturalistic driving datasets. However, these methods often rely on fixed assumptions when defining criticalities within testing tasks, resulting in scenarios that fail to challenge AVs using different strategies [239, 240]. Additionally, there is an imbalance in the driving maneuver classes, and the system lacks adaptive weights to compensate for each unbalanced loss. Enhancing loss function design can significantly improve model performance by better aligning training objectives with real-world outcomes. Future research should involve real-world experiments to evaluate the proposed approaches in practical settings.
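As a simple illustration of compensating for imbalanced maneuver classes in the loss, the sketch below computes inverse-frequency class weights and a class-weighted cross-entropy. It is a static baseline under assumed class names and data, not an adaptive weighting scheme and not a method proposed in the reviewed works.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(probabilities, labels, weights):
    """Class-weighted mean negative log-likelihood.

    probabilities: list of dicts mapping class name -> predicted probability.
    labels: list of true class names, aligned with probabilities.
    """
    total = sum(
        -weights[y] * math.log(max(p[y], 1e-12))
        for p, y in zip(probabilities, labels)
    )
    return total / sum(weights[y] for y in labels)


if __name__ == "__main__":
    # Hypothetical maneuver labels: lane keeping dominates the data.
    labels = ["keep"] * 8 + ["left", "right"]
    weights = inverse_frequency_weights(labels)
    uniform = [{"keep": 1 / 3, "left": 1 / 3, "right": 1 / 3}] * len(labels)
    print(weights)
    print(weighted_cross_entropy(uniform, labels, weights))
```

With these weights, a misclassified lane change contributes more to the loss than a misclassified lane-keeping sample, which counteracts the tendency of a model to favor the majority class.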

5 Conclusions

This paper conducts a comprehensive review of various interactive driving approaches and their applications for AVs. A comparative analysis discusses the advantages and disadvantages of each approach in addressing the complexities inherent in autonomous driving. In addition, a comprehensive strategy is imperative to overcome the challenges associated with interactive driving in AVs. The insights garnered from this survey can steer future research endeavors toward developing more robust, efficient, and safe autonomous driving systems by combining these interactive driving techniques and delving into inter-vehicle interaction concepts.

Appendix A: Nomenclatures of APF

Symbol	Description	Symbol	Description
${k}_{\text{road }}$	Repulsive gain coefficient of the road boundaries	${d}_{\text{road }}$	The shortest distance between the center mass of the vehicle and the boundary of the lane
${k}_{\text{obs }}$	Repulsive gain coefficient of the obstacle	${d}_{\mathrm{w}}$	Width of the lane
${k}_{\text{goal }}$	Attractive gain coefficient of the goal	${d}_{\text{lane }, i}$	Distance between the AGV and the $i$ -th lane line
${k}_{\mathrm{v}}$	Gain coefficient of the velocity potential field	${d}_{\mathrm{s}}$	Safety distance
$x$	Horizontal coordinate of the controlled AV	$y$	Vertical coordinate of the controlled AV
${x}_{\text{obs }}$	Horizontal coordinate of the obstacle	${y}_{\text{obs }}$	Vertical coordinate of the obstacle
$m$	Gravitational field factor	${y}_{\text{goal }}$	Vertical coordinate value of the goal position
$v$	Current speed of the controlled AV	${v}_{\text{obs }}$	Velocity of the dynamic obstacle
$A$	Horizontal acting distances of the repulsive potential field of an obstacle	$B$	Vertical acting distances of the repulsive potential field of an obstacle
${A}_{\text{lane }}$	Gain coefficient of the road potential field	${\sigma }_{r}$	Convergence coefficient of the road potential field
${b}_{1}$	Coefficient of attraction field	${b}_{2}$	Coefficient of attraction field
${A}_{x}$	Horizontal size coefficients	${A}_{y}$	Vertical size coefficients
${R}_{x}$	$x\cos \varphi -y\sin \varphi$	${R}_{y}$	$x\sin \varphi + y\cos \varphi$
$\varphi$	Yaw of the vehicle

Acknowledgements This work was partially supported by the Texas Tech University Graduate School Fellowship.

Declarations

Conflict of interest On behalf of all the authors, the corresponding author states that there is no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Magana, V.C., et al.: Beside and behind the wheel: factors that influence driving stress and driving behavior. Sustainability, 13 (9): 4775 (2021)

Khattak, Z.H., Lin, Z.H.: Quantifying automated vehicle benefits in reducing driving stress: a simulation experiment approach. Front. Future Transp., (2023). https://doi.org/10.3389/ffutr.2023.1196629

Kim, K.D., Kumar, P.R.: An MPC-based approach to provable system-wide safety and liveness of autonomous ground traffic. IEEE Trans. Autom. Control, 59 (12): 3341-3356 (2014)

Nascimento, A.M., et al.: The role of virtual reality in autonomous vehicles' safety. Paper presented at the 2nd IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), San Diego, CA (2019)

Lee, et al.: Driving safety evaluation of mixed car-following situations by autonomous and manual vehicles at urban interrupted road facilities. Transp. Res. Rec., (2024). https://doi.org/10.1177/03611981231222237

Aoki, Rajkumar: Safe intersection management with cooperative perception for mixed traffic of human-driven and autonomous vehicles. IEEE Open J. Veh. Technol., 3: 251-265 (2022)

Guo, X.Y., Zhang, Jia, A.F.: Study on mixed traffic of autonomous vehicles and human-driven vehicles with different cyber interaction approaches. Veh. Commun., (2023). https://doi.org/10.1016/j.vehcom.2022.100550

Li, Q.Y., et al.: Metadrive: composing diverse driving scenarios for generalizable reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell., 45 (3): 3461-3475 (2023)

Dai, et al.: Towards a systematic computational framework for modeling multi-agent decision-making at micro level for smart vehicles in a smart world. Robot. Auton. Syst., (2021). https://doi.org/10.1016/j.robot.2021.103859

Li, et al.: Modeling mixed traffic flows of human-driving vehicles and connected and autonomous vehicles considering human drivers' cognitive characteristics and driving behavior interaction. Phys. A Stat. Mech. Appl., (2023). https://doi.org/10.1016/j.physa.2022.128368

Wang, et al.: Towards the next level of vehicle automation through cooperative driving: a roadmap from planning and control perspective. IEEE Trans. Intell. Veh., 9 (3): 4335-4347 (2024)

Negash, N.M., Yang: Driver behavior modeling towards autonomous vehicles: comprehensive review. IEEE Access, (2023). https://doi.org/10.1109/ACCESS.2023.3249144

Lisowski: A synthesis of algorithms determining a safe trajectory in a group of autonomous vehicles using a sequential game and neural network. Electronics, 12 (5): 1236 (2023)

Kim, et al.: Automated complex urban driving based on enhanced environment representation with gps/map, radar, lidar and vision. Paper presented at the 8th IFAC Symposium on Advances in Automotive Control (AAC), Norrkoping, Sweden (2016)

You, C.X., et al.: Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Robot. Auton. Syst., 114: 1-18 (2019)

Zhang, E.T., Zhang, R.X., Masoud: Predictive trajectory planning for autonomous vehicles at intersections using reinforcement learning. Transp. Res. Part C Emerg. Technol., (2023). https://doi.org/10.1016/j.trc.2023.104063

Singh, Srivastava: Multi-scale graph-transformer network for trajectory prediction of the autonomous vehicles. Intel. Serv. Robot., 15 (3): 307-320 (2022)

Ma, L.J., Qu, S.R.: Application of conditional generative adversarial network to multi-step car-following modeling. Front. Neurorobot., (2023). https://doi.org/10.3389/fnbot.2023.1148892

Pirani, et al.: Stable interaction of autonomous vehicle platoons with human-driven vehicles. Paper presented at the American Control Conference (ACC), Atlanta, GA (2022)

Yang, Li, Zhou, Y.P.: A path planning method for autonomous vehicles based on risk assessment. World Electr. Veh. J., 13 (12): 234

Mora, Wu, X.Y., Panori: Mind the gap: developments in autonomous driving research and the sustainability challenge. J. Clean. Prod., (2020). https://doi.org/10.1016/j.jclepro.2020.124087

Bitar, Watling, Romano: How can autonomous road vehicles coexist with human-driven vehicles? An evolutionary-game-theoretic perspective. Paper presented at the 8th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS), Electr Network (2022)

Ye, L.Y., et al.: GSAN: Graph self-attention network for learning spatial-temporal interaction representation in autonomous driving. IEEE Internet Things J., 9 (12): 9190-9204 (2022)

Li, et al.: Safe, efficient and socially-compatible decision of automated vehicles: a case study of unsignalized intersection driving. Autom. Innov., 6 (2): 281-296 (2023)

Qi, H.S.: Capacity adjustment of lane number for mixed autonomous vehicles flow considering stochastic lateral interactions. J. Transp. Eng. Part A Syst., 150 (2): 04023134 (2024)

Hu, et al.: CACC simulation platform designed for urban scenes. IEEE Trans. Intell. Veh., 8 (4): 2857-2874 (2023)

Abbasi, Rahmani, A.M.: Artificial intelligence and software modeling approaches in autonomous vehicles for safety management: a systematic review. Information, 14 (10): 555 (2023)

Laghmara, et al.: Obstacle avoidance, path planning and control for autonomous vehicles. Paper presented at the 30th IEEE Intelligent Vehicles Symposium (IV), Paris, France (2019)

Ma, Y.L., et al.: Real-time risk assessment model for multi-vehicle interaction of connected and autonomous vehicles in weaving area based on risk potential field. Phys. A Stat. Mech. Appl., (2023). https://doi.org/10.1016/j.physa.2023.128725

Mozaffari, et al.: Deep learning-based vehicle behavior prediction for autonomous driving applications: A review. IEEE Trans. Intell. Transp. Syst., 23 (1): 33-47 (2022)

Song, R.T., Li: Surrounding vehicles' lane change maneuver prediction and detection for intelligent vehicles: A comprehensive review. IEEE Trans. Intell. Transp. Syst., 23 (7): 6046-6062 (2022)

Markkula, et al.: Defining interactions: A conceptual framework for understanding interactive behaviour in human and automated road traffic. Theor. Issues Ergon. Sci., 21 (6): 728-752 (2020)

Wang, Cao, Hu: A trajectory planning method of automatic lane change based on dynamic safety domain. Automotive Innovation, 6 (3): 466-480 (2023)

Yuan, M.F., Shan, J.J., Mi: Deep reinforcement learning based game-theoretic decision-making for autonomous vehicles. IEEE Robotics and Automation Letters, 7 (2): 818-825 (2022)

Dong, et al.: Graph-based planning-informed trajectory prediction for autonomous driving. Paper presented at the 2022 6th CAA International Conference on Vehicular Control and Intelligence (CVCI) (2022)

Tran, A.T., et al.: A model predictive control-based lane merging strategy for autonomous vehicles. Paper presented at the 30th IEEE Intelligent Vehicles Symposium (IV), Paris, France (2019)

Dafoe, et al.: Cooperative AI: Machines must learn to find common ground. Nature, 593 (7857): 33-36 (2021)

Li, L.H., et al.: Vehicle interaction behavior prediction with self-attention. Sensors, 22 (2): 429 (2022)

Guo, H.Y., et al.: Map-enhanced generative adversarial trajectory prediction method for automated vehicles. Inf. Sci., 622: 1033-1049 (2023)

Cheng, J.J., et al.: A behavior decision method for autonomous vehicles in an urban scene. Paper presented at the 17th International Conference on Wireless Algorithms, Systems, and Applications (WASA), Dalian, China (2022)

Bachmann, et al.: Responsible integration of autonomous vehicles in an autocentric society. Paper presented at the IEEE Global Conference on Artificial Intelligence and Internet of Things (GCAIoT), Electr Network (2022)

Wang, et al.: An information-centric in-network caching scheme for 5G-enabled internet of connected vehicles. IEEE Trans. Mob. Comput., 22 (6): 3137-3150 (2023)

Liu, Y.K., et al.: Vision-cloud data fusion for ADAS: A lane change prediction case study. IEEE Trans. Intell. Veh., 7 (2): 210-220 (2022)

Biparva, et al.: Video action recognition for lane-change classification and prediction of surrounding vehicles. IEEE Trans. Intell. Veh., 7 (3): 569-578 (2022)

Hou, et al.: Interactive trajectory prediction of surrounding road users for autonomous driving using structural-LSTM network. IEEE Trans. Intell. Transp. Syst., 21 (11): 4615-4625 (2020)

Wang, W.D., et al.: An intelligent lane-changing behavior prediction and decision-making strategy for an autonomous vehicle. IEEE Trans. Ind. Electron., 69 (3): 2927-2937 (2022)

Zhan, T.Z., et al.: VRR-Net: learning vehicle-road relationships for vehicle trajectory prediction on highways. Mathematics, 11 (6): 1293 (2023)

Lv, et al.: Trajectory prediction with correction mechanism for connected and autonomous vehicles. Electronics, 11 (14): 2149 (2022)

Zhou, et al.: Interaction-aware moving target model predictive control for autonomous vehicles motion planning. Paper presented at the European Control Conference (ECC), London, England (2022)

Huang, Z.Y., et al.: Conditional predictive behavior planning with inverse reinforcement learning for human-like autonomous driving. IEEE Trans. Intell. Transp. Syst., 24 (7): 7244-7258 (2023)

Kim, et al.: Reinforcement learning for autonomous vehicle using MPC in highway situation. Paper presented at the International Conference on Electronics, Information, and Communication (ICEIC), Jeju, South Korea (2022)

Hu, et al.: Vehicles swarm intelligence: Cooperation in both longitudinal and lateral dimensions. IEEE Trans. Intell. Veh., (2024). https://doi.org/10.1109/TIV.2024.3412130

Hu, et al.: A simulation platform for truck platooning evaluation in an interactive traffic environment. IEEE Trans. Intell. Transp. Syst., (2024). https://doi.org/10.1109/tits.2024.3388161

Jeong: Predictive lane change decision making using bidirectional long shot-term memory for autonomous driving on highways. IEEE Access, 9: 144985-144998 (2021)

Wang, J.R., et al.: Event-triggered MPC for collision avoidance of autonomous vehicles considering trajectory tracking performance. Paper presented at the 7th IEEE International Conference on Advanced Robotics and Mechatronics, Guilin, China (2022)

Sun, Y.B., et al.: Inverse reinforcement learning based: Segmented lane-change trajectory planning with consideration of interactive driving intention. IEEE Trans. Veh. Technol., 71 (11): 11395-11407 (2022)

Wang, J.X., et al.: Path planning on large curvature roads using driver-vehicle-road system based on the kinematic vehicle model. IEEE Trans. Veh. Technol., 71 (1): 311-325 (2022)

Luan, Z.K., et al.: A comprehensive lateral motion prediction method of surrounding vehicles integrating driver intention prediction and vehicle behavior recognition. Proc. Inst. Mech. Eng. Part D J. Autom. Eng., 237 (1): 61-74 (2023)

Veres, S.M., et al.: Autonomous vehicle control systems - a review of decision making. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng., 225 (12): 155-195 (2011)

Gao, H.B., et al.: Car-following method based on inverse reinforcement learning for autonomous vehicle decision-making. Int. J. Adv. Robot. Syst., 15 (6): 1729881418817162 (2018)

Schwarting, Alonso-Mora, Rus: Planning and decision-making for autonomous vehicles. Ann. Rev. Control Robot. Auton. Syst., 1: 187-210 (2018)

Wang, et al.: Social interactions for autonomous driving: a review and perspectives. Found. Trends Robot., 10 (3-4): 198-376 (2022)

Malik, et al.: How do autonomous vehicles decide? Sensors, 23 (1): 317 (2023)

Abdallaoui, et al.: Advancing autonomous vehicle control systems: an in-depth overview of decision-making and manoeuvre execution state of the art. J. Eng., 2023 (11): e12333 (2023)

Reda, et al.: Path planning algorithms in the autonomous driving system: a comprehensive review. Robot. Auton. Syst., 174: 104630 (2024)

Song, et al.: A review of the motion planning and control methods for automated vehicles. Sensors, 23 (13): 6140 (2023)

Wang, H.R., et al.: A pathway forward: the evolution of intelligent vehicles research on IEEE T-IV. IEEE Trans. Intell. Veh., 7 (4): 918-928 (2022)

Morooka, F.E., et al.: Deep learning and autonomous vehicles: strategic themes, applications, and research agenda using SciMAT and content-centric analysis, a systematic review. Mach. Learn. Knowl. Extr., 5 (3): 763-781 (2023)

Huang, et al.: A review of deep learning-based vehicle motion prediction for autonomous driving. Sustainability, 15 (20): 14716 (2023)

Rizk, Chaibet, Kribèche: Model-based control and model-free control techniques for autonomous vehicles: a technical survey. Appl. Sci., 13 (11): 6700 (2023)

Zhang, Z.L., Ohya: Movement control with vehicle-to-vehicle communication by using end-to-end deep learning for autonomous driving. Paper presented at the 10th International Conference on Pattern Recognition Applications and Methods (ICPRAM), Electr Network (2021)

Xing, Lv, Cao, D.P.: Personalized vehicle trajectory prediction based on joint time-series modeling for connected vehicles. IEEE Trans. Veh. Technol., 69 (2): 1341-1352 (2020)

X.L., et al.: Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. Part C Emerg. Technol., 54: 187-197 (2015)

Altche, et al.: An LSTM network for highway trajectory prediction. Paper presented at the 20th IEEE International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan (2017)

M.X., et al.: Predicting future locations of moving objects with deep fuzzy-LSTM networks. Transp. A Transp. Sci., 16 (1): 119-136 (2020)

Zhong, et al.: Autonomous vehicle trajectory combined prediction model based on C-LSTM. Paper presented at the 2021 International Conference on Fuzzy Theory and Its Applications (iFUZZY), Taitung, China (2021)

Hou

, et

al.

: Structural transformer improves speed-accuracy trade-off in interactive trajectory prediction of multiple surrounding vehicles. IEEE Trans. Intell. Transp. Syst. 23 (12): 24778-24790 (2022)

Qiao S.Y., et al.: An enhanced vehicle trajectory prediction model leveraging LSTM and social-attention mechanisms. IEEE Access 12: 1718-1726 (2024)

Yang, et al.: A multi-task learning network with a collision-aware graph transformer for traffic-agents trajectory prediction. IEEE Trans. Intell. Transp. Syst. (2024). https://doi.org/10.1109/tits.2023.3345296

Wang, et al.: Risk assessment and mitigation in local path planning for autonomous vehicles with LSTM based predictive model. IEEE Trans. Autom. Sci. Eng. 19 (4): 2738-2749 (2022)

Deo, et al.: Multi-modal trajectory prediction of surrounding vehicles with maneuver based LSTMs. Paper presented at the IEEE Intelligent Vehicles Symposium (IV). Changshu, China (2018)

Alahi, et al.: Social LSTM: human trajectory prediction in crowded spaces. Paper presented at the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA (2016)

Deo, et al.: Convolutional social pooling for vehicle trajectory prediction. Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT (2018)

et al.: Human-like motion planning of autonomous vehicle based on probabilistic trajectory prediction. Appl. Soft Comput. (2022). https://doi.org/10.1016/j.asoc.2022.108499

et al.: Learning vehicle surrounding-aware lane-changing behavior from observed trajectories. Paper presented at the IEEE Intelligent Vehicles Symposium (IV). Changshu, China (2018)

Wang J.H., Zhang, Lu G.Q.: A Bayesian inference based adaptive lane change prediction model. Transp. Res. Part C Emerg. Technol. (2021). https://doi.org/10.1016/j.trc.2021.103363

Zhu J.P., et al.: A novel hybrid method based on deep learning for an integrated navigation system during DVL signal failure. Electronics 11 (19): 2980 (2022)

Ren Y.Y., et al.: A method for predicting diverse lane-changing trajectories of surrounding vehicles based on early detection of lane change. IEEE Access 10: 17451-17472 (2022)

Yin, Chen, Yue: Extracting overtaking segments by unsupervised clustering and predicting nonmotorized vehicle's trajectory. J. Adv. Transp. (2022). https://doi.org/10.1155/2022/1410296

Shangguan, et al.: Analyzing the collision probability of autonomous vehicle at crossroad. Paper presented at the 9th IEEE Data Driven Control and Learning Systems Conference (DDCLS). Liuzhou, China (2020)

Lu K.L., Sun C.Y.: Deep learning aided state estimation for guarded semi-Markov switching systems with soft constraints. IEEE Trans. Signal Process. 71: 3100-3116 (2023)

Jin X.B., et al.: Parameter-free state estimation based on Kalman filter with attention learning for GPS tracking in autonomous driving system. Sensors 23 (20): 8650 (2023)

Liu Y.H., et al.: GPS/INS integrated navigation with LSTM neural network. Paper presented at the 4th International Conference on Intelligent Autonomous Systems (ICOIAS). Wuhan, China (2021)

Xiong H.Y., et al.: Steering actuator fault diagnosis for autonomous vehicle with an adaptive denoising residual network. IEEE Trans. Instrum. Meas. 71: 1-13 (2022)

Jeong: Interactive lane keeping system for autonomous vehicles using LSTM-RNN considering driving environments. Sensors 22 (24): 9889 (2022)

Zhao S.Y., et al.: Collision-free emergency planning and control methods for CAVs considering intentions of surrounding vehicles. ISA Trans. 136: 535-547 (2023)

Min, et al.: A hierarchical LSTM-based vehicle trajectory prediction method considering interaction information. Autom. Innov. 7: 71-81 (2024)

Vaswani, et al.: Attention is all you need. Paper presented at the 31st Annual Conference on Neural Information Processing Systems (NIPS). Long Beach, CA (2017)

Chen X.B., et al.: Intention-aware vehicle trajectory prediction based on spatial-temporal dynamic attention network for internet of vehicles. IEEE Trans. Intell. Transp. Syst. 23 (10): 19471-19483 (2022)

[100] Messaoud, et al.: Attention based vehicle trajectory prediction. IEEE Trans. Intell. Veh. 6 (1): 175-185 (2021)

[101] Chitta, et al.: Transfuser: imitation with transformer-based sensor fusion for autonomous driving. IEEE Trans. Pattern Anal. Mach. Intell. 45 (11): 12878-12895 (2023)

[102] Geng M.S., et al.: Dynamic-learning spatial-temporal transformer network for vehicular trajectory prediction at urban intersections. Transp. Res. Part C Emerg. Technol. (2023). https://doi.org/10.1016/j.trc.2023.104330

[103] Geng M.S., et al.: A physics-informed transformer model for vehicle trajectory prediction on highways. Transp. Res. Part C Emerg. Technol. (2023). https://doi.org/10.1016/j.trc.2023.104272

[104] Y.F., Wang, Peeta: Leveraging transformer model to predict vehicle trajectories in congested urban traffic. Transp. Res. Rec. 2677 (2): 898-909 (2023)

[105] Chen, et al.: NAST: Non-autoregressive spatial-temporal transformer for time series forecasting. arXiv preprint arXiv (2021)

[106] Hasan, Huang: MALS-Net: a multi-head attention-based LSTM sequence-to-sequence network for socio-temporal interaction modelling and trajectory prediction. Sensors 23 (1): 530 (2023)

[107] Chen, et al.: Vehicle trajectory prediction based on intention-aware non-autoregressive transformer with multi-attention learning for internet of vehicles. IEEE Trans. Instrum. Meas. 71: 1-12 (2022)

[108] Zhou Z.K., et al.: HiVT: Hierarchical vector transformer for multi-agent motion prediction. Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, LA (2022)

[109] Geng, et al.: Adaptive and simultaneous trajectory prediction for heterogeneous agents via transferable hierarchical transformer network. IEEE Trans. Intell. Transp. Syst. 24 (10): 11479-11492 (2023)

[110] Vishnu, et al.: Improving multi-agent trajectory prediction using traffic states on interactive driving scenarios. IEEE Robot. Autom. Lett. 8 (5): 2708-2715 (2023)

[111] Y.N., Gilles, Stanciulescu, Moutarde: TSGN: temporal scene graph neural networks with projected vectorized representation for multi-agent motion prediction. Paper presented at the 34th IEEE Intelligent Vehicles Symposium (IV). Anchorage, AK (2023)

[112] Chen: Intention-aware transformer with adaptive social and temporal learning for vehicle trajectory prediction. Paper presented at the 26th International Conference on Pattern Recognition (ICPR)/8th International Workshop on Image Mining - Theory and Applications (IMTA). Montreal, Canada (2022)

[113] Jiang, Liu, Dong, Xu: Intention-aware interactive transformer for real-time vehicle trajectory prediction in dense traffic. Transp. Res. Rec. 2677 (3): 946-960 (2023)

[114] Gao, et al.: Dual transformer based prediction for lane change intentions and trajectories in mixed traffic environment. IEEE Trans. Intell. Transp. Syst. 24 (6): 6203-6216 (2023)

[115] Zhang, et al.: Trajectory prediction for autonomous driving using spatial-temporal graph attention transformer. IEEE Trans. Intell. Transp. Syst. 23 (11): 22343-22353 (2022)

[116] Huang Z.Y., et al.: Learning interaction-aware motion prediction model for decision-making in autonomous driving. Paper presented at the IEEE 26th International Conference on Intelligent Transportation Systems (ITSC). Bilbao, Spain (2023)

[117] Zhu, Clifton D.A.: Multimodal learning with transformers: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45 (10): 12113-12132 (2023)

[118] Zhao Z.Q., Duan Y.P., Tao X.M.: Path-based multimodal trajectories prediction. Paper presented at the IEEE 96th Vehicular Technology Conference (VTC-Fall). London, UK (2022)

[119] Liu Y.C., et al.: Multimodal motion prediction with stacked transformers. Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Electr Network (2021)

[120] Mozaffari, et al.: Multimodal manoeuvre and trajectory prediction for automated driving on highways using transformer networks. IEEE Robot. Autom. Lett. 8 (10): 6123-6130 (2023)

[121] Wang, et al.: Safety-balanced driving-style aware trajectory planning in intersection scenarios with uncertain environment. IEEE Trans. Intell. Veh. 8 (4): 2888-2898 (2023)

[122] Dong, et al.: Why did the AI make that decision? Towards an explainable artificial intelligence (XAI) for autonomous driving systems. Transp. Res. Part C Emerg. Technol. 156: 104358 (2023)

[123] Liang H.B., et al.: A hierarchical imitation learning-based decision framework for autonomous driving. Paper presented at the 32nd ACM International Conference on Information and Knowledge Management (CIKM). Birmingham, England (2023)

[124] J.W., et al.: InteractionNet: joint planning and prediction for autonomous driving with transformers. Paper presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Detroit, MI (2023)

[125] Huang Z.Y., et al.: Gameformer: game-theoretic modeling and learning of transformer-based interactive prediction and planning for autonomous driving. Paper presented at the IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France (2023)

[126] Wang, et al.: Lane transformer: a high-efficiency trajectory prediction model. IEEE Open J. Intell. Transp. Syst. 4: 2-13 (2023)

[127] Gomez-Huelamo C., et al.: Efficient context-aware graph transformer for vehicle motion prediction. Paper presented at the IEEE 26th International Conference on Intelligent Transportation Systems (ITSC). Bilbao, Spain (2023)

[128] Li: Multi-future transformer: learning diverse interaction modes for behaviour prediction in autonomous driving. IET Intel. Transp. Syst. 16 (9): 1249-1267 (2022)

[129] Liu Y.Y., et al.: Proactive longitudinal control to preclude disruptive lane changes of human-driven vehicles in mixed-flow traffic. Control Eng. Pract. (2023). https://doi.org/10.1016/j.conengprac.2023.105522

[130] Sharma, Sahoo N.C., Puhan N.B.: Kernelized convolutional transformer network based driver behavior estimation for conflict resolution at unsignalized roundabout. ISA Trans. 133: 13-28 (2023)

[131] Papineni, et al.: BLEU: a method for automatic evaluation of machine translation. Paper presented at the 40th Annual Meeting of the Association for Computational Linguistics. University of Pennsylvania, Philadelphia, PA (2002)

[132] et al.: Explainable object-induced action decision for autonomous vehicles. Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA (2020)

[133] et al.: Holistic transformer: a joint neural network for trajectory prediction and decision-making of autonomous vehicles. Pattern Recogn. 141: 109592 (2023)

[134] Ettinger, et al.: Large scale interactive motion forecasting for autonomous driving: the Waymo Open Motion Dataset. Paper presented at the 18th IEEE/CVF International Conference on Computer Vision (ICCV). Electr Network (2021)

[135] Khatib: Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 5 (1): 90-98 (1986)

[136] Sun Q.Y., et al.: Human-like obstacle avoidance trajectory planning and tracking model for autonomous vehicles that considers the driver's operation characteristics. Sensors 20 (17): 4821 (2020)

[137] Wang S.B., et al.: Autonomous vehicle path planning based on driver characteristics identification and improved artificial potential field. Actuators 11 (2): 52 (2022)

[138] Feng J.K., et al.: Path planning and trajectory tracking for autonomous obstacle avoidance in automated guided vehicles at automated terminals. Axioms 13 (1): 27 (2024)

[139] Jin X.J., et al.: An efficient trajectory planning approach for autonomous ground vehicles using improved artificial potential field. Symmetry-Basel 16 (1): 106 (2024)

[140] X.P., et al.: Intelligent vehicle path planning based on improved artificial potential field algorithm. Paper presented at the International Conference on High Performance Big Data and Intelligent Systems (HPBD&IS). Shenzhen, China (2019)

[141] Szczepanski, Tarczewski, Erwinski: Energy efficient local path planning algorithm based on predictive artificial potential field. IEEE Access 10: 39729-39742 (2022)

[142] Cao M.C., et al.: An integrated MPC approach for FWIA autonomous ground vehicles with emergency collision avoidance. Paper presented at the 21st IEEE International Conference on Intelligent Transportation Systems (ITSC). Maui, HI (2018)

[143] Gao, Li K.Q.: Humanlike decision and motion planning for expressway lane changing based on artificial potential field. IEEE Access 10: 4359-4373 (2022)

[144] Qian Y.B., Sun H.T., Feng: Obstacle avoidance method of autonomous vehicle based on fusion improved A*APF algorithm. Bull. Pol. Acad. Sci. Tech. Sci. 71 (2): e144624 (2023)

[145] Yang H.J., et al.: EMPC with adaptive APF of obstacle avoidance and trajectory tracking for autonomous electric vehicles. ISA Trans. 135: 438-448 (2023)

[146] Schwarting, et al.: Social behavior for autonomous vehicles. Proc. Natl. Acad. Sci. USA 116 (50): 24972-24978 (2019)

[147] Crosara, et al.: Game theoretic analysis of overtaking maneuvers for autonomous vehicles with moving obstacles. Paper presented at the International Balkan Conference on Communications and Networking (BalkanCom). Istanbul, Turkey (2023)

[148] Hang, et al.: Human-like decision making for autonomous driving: a noncooperative game theoretic approach. IEEE Trans. Intell. Transp. Syst. 22 (4): 2076-2087 (2021)

[149] Lopez V.G., et al.: Game-theoretic lane-changing decision making and payoff learning for autonomous vehicles. IEEE Trans. Veh. Technol. 71 (4): 3609-3620 (2022)

[150] et al.: Game-theoretic modeling of multi-vehicle interactions at uncontrolled intersections. IEEE Trans. Intell. Transp. Syst. 23 (2): 1428-1442 (2022)

[151] Liu K.W., et al.: Interaction-aware trajectory prediction and planning for autonomous vehicles in forced merge scenarios. IEEE Trans. Intell. Transp. Syst. 24 (1): 474-488 (2023)

[152] Nie P.Y., Wang, Cui: Players acting as leaders in turn improve cooperation. R. Soc. Open Sci. 6 (7): 190251 (2019)

[153] Al-Azzawi R.S., Simaan M.A.: On the selection of leader in Stackelberg games with parameter uncertainty. Int. J. Syst. Sci. 52 (1): 86-94 (2021)

[154] Hang, et al.: An integrated framework of decision making and motion planning for autonomous vehicles considering social behaviors. IEEE Trans. Veh. Technol. 69 (12): 14458-14469 (2020)

[155] Wei, et al.: Game theoretic merging behavior control for autonomous vehicle at highway on-ramp. IEEE Trans. Intell. Transp. Syst. 23 (11): 21127-21136 (2022)

[156] Yuan, et al.: Decision-making and planning methods for autonomous vehicles based on multistate estimations and game theory. Adv. Intell. Syst. 5 (11): 2300177 (2023)

[157] Wang X.Y., et al.: Driver's lane selection model based on multi-player dynamic game. Adv. Mech. Eng. 11 (1): 1687814018819903 (2019)

[158] Y.W., et al.: A dynamic lane-changing decision and trajectory planning model of autonomous vehicles under mixed autonomous vehicle and human-driven vehicle environment. Phys. A Stat. Mech. Appl. (2023). https://doi.org/10.1016/j.physa.2022.128361

[159] Ali, et al.: CLACD: A complete lane-changing decision modeling framework for the connected and traditional environments. Transp. Res. Part C Emerg. Technol. (2021). https://doi.org/10.1016/j.trc.2021.103162

[160] Liu M.S., et al.: A three-level game-theoretic decision-making framework for autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 23 (11): 20298-20308 (2022)

[161] Shoham, Leyton-Brown: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. London: Cambridge University Press (2009)

[162] H.T., Tseng H.E., Langari: A human-like game theory-based controller for automatic lane changing. Transp. Res. Part C Emerg. Technol. 88: 140-158 (2018)

[163] Shu K.Q., et al.: Human inspired autonomous intersection handling using game theory. IEEE Trans. Intell. Transp. Syst. 24 (10): 11360-11371 (2023)

[164] Zhang Y.R., et al.: Human-like interactive behavior generation for autonomous vehicles: a Bayesian game-theoretic approach with Turing test. Adv. Intell. Syst. 4 (5): 2100211 (2022)

[165] Tian, et al.: Game-theoretic modeling of traffic in unsignalized intersection network for autonomous vehicle control verification and validation. IEEE Trans. Intell. Transp. Syst. 23 (3): 2211-2226 (2022)

[166] Li, Orsag, Han: Hierarchical and game-theoretic decision-making for connected and automated vehicles in overtaking scenarios. Transp. Res. Part C Emerg. Technol. 150: 104109 (2023)

[167] Sankar G.S., Han: Adaptive robust game-theoretic decision making strategy for autonomous vehicles in highway. IEEE Trans. Veh. Technol. 69 (12): 14484-14493 (2020)

[168] Camerer C.F., Ho T.H., Chong J.K.: A cognitive hierarchy model of games. Quart. J. Econ. 119 (3): 861-898 (2004)

[169] Garzon, et al.: Game theoretic decision making for autonomous vehicles' merge manoeuvre in high traffic scenarios. Paper presented at the IEEE Intelligent Transportation Systems Conference (IEEE-ITSC). Auckland, New Zealand (2019)

[170] Zhu, et al.: A model to manage the lane-changing conflict for automated vehicles based on game theory. Sustainability 15 (4): 3063 (2023)

[171] Wegener, et al.: Automated eco-driving in urban scenarios using deep reinforcement learning. Transp. Res. Part C Emerg. Technol. 126: 102967 (2021)

[172] Tammewar, et al.: Improving the performance of autonomous driving through deep reinforcement learning. Sustainability 15 (18): 13799 (2023)

[173] Tiong, et al.: Autonomous vehicle driving path control with deep reinforcement learning. Paper presented at the IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC). Electr Network (2023)

[174] Perez-Gil, et al.: Deep reinforcement learning based control for autonomous vehicles in CARLA. Multimedia Tools Appl. 81 (3): 3553-3576 (2022)

[175] Shi H.T., et al.: A deep reinforcement learning based distributed control strategy for connected automated vehicles in mixed traffic platoon. Transp. Res. Part C Emerg. Technol. 148: 104109 (2023)

[176] Xie, et al.: Modeling human-like longitudinal driver model for intelligent vehicles based on reinforcement learning. Proc. Inst. Mech. Eng. Part D J. Autom. Eng. 235 (8): 2226-2241 (2021)

[177] Zhang, et al.: Human-like autonomous vehicle speed control by deep reinforcement learning with double Q-learning. Paper presented at the IEEE Intelligent Vehicles Symposium (IV). Changshu, China (2018)

[178] Chen, et al.: Autonomous driving using safe reinforcement learning by incorporating a regret-based human lane-changing decision model. Paper presented at the American Control Conference (ACC). Denver, CO, USA (2023)

[179] Shi H.T., et al.: Connected automated vehicle cooperative control with a deep reinforcement learning approach in a mixed traffic environment. Transp. Res. Part C Emerg. Technol. 133: 103421 (2021)

[180] Tian Y.T., et al.: Learning to drive like human beings: a method based on deep reinforcement learning. IEEE Trans. Intell. Transp. Syst. 23 (7): 6357-6367 (2022)

[181] Wang, et al.: Velocity control in car-following behavior with autonomous vehicles using reinforcement learning. Accid. Anal. Prev. 174: 106729 (2022)

[182] Zhu M.X., Wang X.S., Wang Y.H.: Human-like autonomous car-following model with deep reinforcement learning. Transp. Res. Part C Emerg. Technol. 97: 348-368 (2018)

[183] Yang X.X., et al.: Improved deep reinforcement learning for car-following decision-making. Phys. A Stat. Mech. Appl. 624: 128912 (2023)

[184] Zhou J.H., et al.: A cooperative car-following control model combining deep optical flow estimation and deep reinforcement learning for hybrid electric vehicles. Proc. Inst. Mech. Eng. Part D J. Autom. Eng. (2023). https://doi.org/10.1177/09544070231181667

[185] Zhou, et al.: Multi-agent reinforcement learning for cooperative lane changing of connected and autonomous vehicles in mixed traffic. Auton. Intell. Syst. 2 (1): 5 (2022)

[186] Li: A review of vehicle lane change research. Phys. A Stat. Mech. Appl. 626: 129060 (2023)

[187] Wang, et al.: Make space to change lane: a cooperative adaptive cruise control lane change controller. Transp. Res. Part C Emerg. Technol. 143: 103847 (2022)

[188] Y.J., Zhang X.H., Sun: Automated vehicle's behavior decision making using deep reinforcement learning and high-fidelity simulation environment. Transp. Res. Part C Emerg. Technol. 107: 155-170 (2019)

[189] Wang, et al.: A deep reinforcement learning-based approach for autonomous lane-changing velocity control in mixed flow of vehicle group level. Expert Syst. Appl. 238: 122158 (2024)

[190] Makantasis, Kontorinaki, Nikolos: Deep reinforcement-learning-based driving policy for autonomous road vehicles. IET Intel. Transp. Syst. 14 (1): 13-24 (2020)

[191] Kang L.W., et al.: A reinforcement learning based decision-making system with aggressive driving behavior consideration for autonomous vehicles. Paper presented at the 18th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON). Electr Network (2021)

[192] G.F., et al.: Decision making of autonomous vehicles in lane change scenarios: deep reinforcement learning approaches with risk awareness. Transp. Res. Part C Emerg. Technol. 134: 103452 (2022)

[193] K.X., et al.: A safe and efficient lane change decision-making strategy of autonomous driving based on deep reinforcement learning. Mathematics 10 (9): 1551 (2022)

[194] Chen B.M., et al.: Adversarial evaluation of autonomous vehicles in lane-change scenarios. IEEE Trans. Intell. Transp. Syst. 23 (8): 10333-10342 (2022)

[195] Irshayyid, Chen, Xiong: A review on reinforcement learning-based highway autonomous vehicle control. Green Energy Intell. Transp. 3: 100156 (2024)

[196] Kiran B.R., et al.: Deep reinforcement learning for autonomous driving: a survey. IEEE Trans. Intell. Transp. Syst. 23 (6): 4909-4926 (2022)

[197] Crosato, et al.: Social interaction-aware dynamical models and decision-making for autonomous vehicles. Adv. Intell. Syst. 6: 2300575 (2023)

[198] Ben Elallid, et al.: A comprehensive survey on the application of deep and reinforcement learning approaches in autonomous driving. J. King Saud Univ. Comput. Inf. Sci. 34 (9): 7366-7390 (2022)

[199] Spaan M.T.: Reinforcement Learning: State-of-the-Art. Berlin: Springer (2012)

[200] Xiang X.C., Foo: Recent advances in deep reinforcement learning applications for solving partially observable Markov decision processes (POMDP) problems: part 1-fundamentals and applications in games, robotics and natural language processing. Mach. Learn. Knowl. Extr. 3 (3): 554-581 (2021)

[201] X.H., et al.: Real-time trajectory planning for autonomous urban driving: framework, algorithms, and verifications. IEEE ASME Trans. Mechatron. 21 (2): 740-753 (2016)

[202] Sunberg, Kochenderfer M.J.: Improving automated driving through POMDP planning with human internal states. IEEE Trans. Intell. Transp. Syst. 23 (11): 20073-20083 (2022)

[203] Tang, et al.: Integrated decision making and planning framework for autonomous vehicle considering uncertain prediction of surrounding vehicles. Paper presented at the IEEE 25th International Conference on Intelligent Transportation Systems (ITSC). Macau, China (2022)

[204] Pouya, Madni A.M.: Expandable-partially observable Markov decision-process framework for modeling and analysis of autonomous vehicle behavior. IEEE Syst. J. 15 (3): 3714-3725 (2021)

[205] Zhao W.Z., Wang C.Y.: POMDP motion planning algorithm based on multi-modal driving intention. IEEE Trans. Intell. Veh. 8 (2): 1777-1786 (2023)

[206] Ding W.C., et al.: Epsilon: an efficient planning system for automated vehicles in highly interactive environments. IEEE Trans. Rob. 38 (2): 1118-1138 (2022)

[207] Kruse L.A., et al.: Uncertainty-aware online merge planning with learned driver behavior. Paper presented at the IEEE 25th International Conference on Intelligent Transportation Systems (ITSC). Macau, China (2022)

[208] Zhu, et al.: Functional testing scenario library generation framework for connected and automated vehicles. IEEE Trans. Intell. Transp. Syst. 24 (9): 9712-9724 (2023)

[209] Wray K.H., et al.: POMDPs for safe visibility reasoning in autonomous vehicles. Paper presented at the 2021 IEEE International Conference on Intelligence and Safety for Robotics (ISR) (2021)

[210] Lin, et al.: Decision making through occluded intersections for autonomous driving. Paper presented at the IEEE Intelligent Transportation Systems Conference (IEEE-ITSC). Auckland, New Zealand (2019)

[211] Elter, et al.: Interaction-aware prediction of occupancy regions based on a POMDP framework. Paper presented at the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) (2022)

[212] et al.: A review of research on traffic conflicts based on intelligent vehicles. IEEE Access 8: 24471-24483 (2020)

[213] Tran D.Q., Bae S.H.: Improved responsibility-sensitive safety algorithm through a partially observable Markov decision process framework for automated driving behavior at non-signalized intersection. Int. J. Autom. Technol. 22 (2): 301-314 (2021)

[214] Xia C.M., Xing M.L., He S.H.: Interactive planning for autonomous driving in intersection scenarios without traffic signs. IEEE Trans. Intell. Transp. Syst. 23 (12): 24818-24828 (2022)

[215] Shu K.Q., et al.: Autonomous driving at intersections: a critical-turning-point approach for left turns. Paper presented at the 23rd IEEE International Conference on Intelligent Transportation Systems (ITSC). Electr Network (2020)

[216] Shu K.Q., et al.: Autonomous driving at intersections: a behavior-oriented critical-turning-point approach for decision making. IEEE ASME Trans. Mechatron. 27 (1): 234-244 (2022)

[217] Sun, Leng J.H., Lu: Interactive left-turning of autonomous vehicles at uncontrolled intersections. IEEE Trans. Autom. Sci. Eng. (2022). https://doi.org/10.1109/tase.2022.3227964

[218] X.C., Guvenc, Aksun-Guvenc: Autonomous vehicle decision-making with policy prediction for handling a round intersection. Electronics 12 (22): 4670 (2023)

[219] Smallwood R.D., Sondik E.J.: The optimal control of partially observable Markov processes over a finite horizon. Oper. Res. 21 (5): 1071-1088 (1973)

[220] Kurniawati, Yadav: An online POMDP solver for uncertainty planning in dynamic environment. In: Robotics Research: The 16th International Symposium ISRR (2016)

[221] Wray K.H., Witwicki S.J., Zilberstein: Online decision-making for scalable autonomous systems. Paper presented at the 26th International Joint Conference on Artificial Intelligence (IJCAI). Melbourne, Australia (2017)

[222] et al.: DESPOT: online POMDP planning with regularization. J. Artif. Intell. Res. 58: 231-266 (2017)

[223] Silver, Veness: Monte-Carlo planning in large POMDPs. Adv. Neural Inf. Process. Syst. 23 (2010)

[224] Ammour, Orjuela, Basset: A MPC combined decision making and trajectory planning for autonomous vehicle collision avoidance. IEEE Trans. Intell. Transp. Syst. 23 (12): 24805-24817 (2022)

[225] H.R., et al.: Distributed MPC for multi-vehicle cooperative control considering the surrounding vehicle personality. IEEE Trans. Intell. Transp. Syst. (2023). https://doi.org/10.1109/tits.2023.3253878

[226] et al.: Real-time optimal trajectory planning for autonomous driving with collision avoidance using convex optimization. Autom. Innov. 6 (3): 481-491 (2023)

[227] Farkas, et al.: MPC control strategy for autonomous vehicles driving in roundabouts. Paper presented at the 30th Mediterranean Conference on Control and Automation (MED). Athens, Greece (2022)

[228] Liu, et al.: Graph reinforcement learning-based decision-making technology for connected and autonomous vehicles: framework, review, and future trends. Sensors 23 (19): 8229 (2023)

[229] Chu W.B., et al.: Motion planning for autonomous driving with real traffic data validation. Chin. J. Mech. Eng. 37 (1): 6 (2024)

[230] Zeng, et al.: Spatio-temporal-attention-based vehicle trajectory prediction considering multi-vehicle interaction in mixed traffic flow. Appl. Sci. 14 (1): 161 (2024)

[231] Yan, et al.: Interaction-awareness based intention inference of lag vehicle in lane changing decision-making process for autonomous driving. Paper presented at the IEEE 6th International Conference on Industrial Cyber-Physical Systems (ICPS). Wuhan, China (2023)

[232] Wang, et al.: Parallel vision for long-tail regularization: Initial results from IVFC autonomous driving testing. IEEE Trans. Intell. Veh. 7 (2): 286-299 (2022)

[233] Wei, Wang: Parallel control for optimal tracking via adaptive dynamic programming. IEEE/CAA J. Autom. Sin. 7 (6): 1662-1674 (2020)

[234] Yin, et al.: V2VFormer++: multi-modal vehicle-to-vehicle cooperative perception via global-local transformer. IEEE Trans. Intell. Transp. Syst. 25 (2): 2153-2166 (2023)

[235] et al.: Safety-aware human-lead vehicle platooning by proactively reacting to uncertain human behaving. arXiv preprint (2024)

[236] Zhang, et al.: No more road bullying: an integrated behavioral and motion planner with proactive right-of-way acquisition capability. Transp. Res. Part C Emerg. Technol. 156: 104363 (2023)

[237] Khakzar, et al.: Driver influence on vehicle trajectory prediction. Accid. Anal. Prev. 157: 106165 (2021)

[238] et al.: Decision-making model for dynamic scenario vehicles in autonomous driving simulations. Appl. Sci. 13 (14): 8515 (2023)

[239] et al.: Autonomous vehicles testing considering utility-based operable tasks. Tsinghua Sci. Technol. 28 (5): 965-975 (2023)

[240] Tang, et al.: Driving environment uncertainty-aware motion planning for autonomous vehicles. Chin. J. Mech. Eng. 35 (1): 120 (2022)


Received: 2024-05-11
Accepted: 2024-09-18
Online: 2025-07-21

Corresponding author: James Yang (james.yang@ttu.edu)

Table 1 Database and search terms

Search strategy	Details
Databases and search engines	IEEE Xplore, Scopus, SAGE Research Methods Datasets, MDPI, and Wiley Online Library
Search terms	Key concepts and their synonyms (interaction-aware [Title/Abstract/Keywords] OR behavior planning [Title/Abstract/Keywords]…)
	Various methodologies (game theory [Title/Abstract/Keywords] OR POMDP planning [Title/Abstract/Keywords]…)
	Different traffic scenarios (car-following [Title/Abstract/Keywords] OR lane-changing planning [Title/Abstract/Keywords]…)

Table 2 Applications of transformer models in autonomous driving

References	Main objective	Input	Output	Evaluation metrics	Scenario	Dataset
[17]	Trajectory prediction	Initial set of randomized trajectory data	Long-term predicted trajectory	Average displacement error (ADE), final displacement error (FDE)	Complex road scenario	Argoverse, Apolloscape, Lyft
[102]	Trajectory prediction	Historical trajectory and position	Predicted trajectories	ADE, FDE	Intersection	inD
[103]	Trajectory prediction	Historical trajectory, instant speed	Predicted historical trajectory	ADE, FDE	Highway	Ubiquitous Traffic Eyes, NGSIM
[104]	Trajectory prediction	Location, velocity, acceleration	Predicted longitudinal and lateral velocities	RMSE, mean absolute error (MAE), maximum velocity error (MVE)	Lane-changing	pNEUMA
[107]	Trajectory prediction	Historical trajectories	Predicted intention and position of target vehicle	Root mean squared error (RMSE)	Lane-changing	HighD, NGSIM
[109]	Trajectory prediction	Historical trajectories, speed, masks	Learned parameters	ADE, FDE, relative displacement error (RDE)	T-junction, intersection	inD
[110]	Trajectory prediction	Historical trajectories	Predicted trajectory	minADE, minFDE, kernel density estimate-based negative log likelihood (KDE-NLL)	Merging, roundabout, intersection	EyeonTraffic (EOT), INTERACTION
[112]	Trajectory prediction	Historical trajectories	Predicted intention and position of target vehicle	RMSE	Lane-changing	HighD, NGSIM
[113]	Trajectory prediction	Historical trajectories	Predicted intention and position of target vehicle	Accuracy, area under the receiver operating characteristic curve (AUC-ROC)	Lane-changing	NGSIM
[115]	Trajectory prediction	Historical observations and spatial-temporal interactions	Predicted coordinates	ADE, FDE, inference time	Intersection	nuScenes, Argoverse, Lyft
[118]	Trajectory prediction	Historical trajectories	Predicted trajectory	minADE, minFDE, miss rate (MR)	T-junction, intersection	Argoverse
[126]	Trajectory prediction	Past trajectory	Predicted trajectory	minADE, minFDE, MR, inference time	T-junction, intersection	Argoverse
[108]	Motion prediction	Relative position, map information	Distribution of future trajectories	minADE, minFDE, MR	Intersection, reversing loop	Argoverse
[111]	Motion prediction	Position, displacement, speed, heading	Probability of possible future trajectories of all agents	minADE, minFDE, MR	T-junction, intersection	Argoverse
[127]	Motion prediction	Plausible high-definition (HD) map, past multi-agent trajectories	Multimodal prediction and confidences	ADE, FDE	T-junction, intersection	Argoverse
[128]	Behavior prediction	Vehicle position, map information	Motion, scene score, agent score	minADE, minFDE, MR, brier-minFDE, minimum self-attention distance error (minSADE), minimum self-feature distance error (minSFDE), scene collision rate (SCR)	T-junction, intersection	Argoverse
[129]	Intention prediction	Spacings, speeds, and accelerations of surrounding vehicles	Predicted acceleration and occurrences of disruptive lane-change conditions	Mean square error (MSE)	Car-following, lane-changing	NGSIM
[130]	Driver behavior estimation	Relative position, speed, heading	Predicted driver behavior	Accuracy, $\chi^{2}$, p-value	Unsignalized roundabout	Real-world dataset in Australia
[114]	Intentions and trajectories prediction	Distance and relative speed	Intention probability, lateral trajectory	Precision, recall, F1-score, RMSE	Lane-changing	HighD, NGSIM
[120]	Maneuver and trajectory prediction	Track history, position of lane markings	Multiple maneuver likelihoods, predicted trajectory	minRMSE, average negative log-likelihood (meanNLL), collision rate, off-road rate, $\mathrm{div}_{\mathrm{K}}$, maximum accuracy (maxACC)	Lane-changing	HighD, NGSIM, exiD
[122]	Decision-making	Image and language	Textual descriptions for driving actions	Bilingual evaluation understudy (BLEU) score [131]	Complex road scenario	Berkeley deep drive object induced actions (BDD-OIA) [132]
[121]	Path planning	Historical state, map information	Optimal executable trajectory	Collision, L2, off-road events, interventions, ADE, FDE	Intersection	Lyft
[133]	Trajectory prediction, behavioral decisions	Scene and agent information	Predicted multimodal trajectories and their probability	Precision, recall, F1-score	T-junction, intersection	Argoverse
[124]	Trajectory prediction, path planning	Map-view feature, the detection, high-level behavior, sparse global navigation satellite system (GNSS) target	Predicted trajectories, planned trajectory	Driving Score (DS), route completion (RC), infraction score (IS)	T-junction	Autopilot in CARLA
[125]	Trajectory prediction, path planning	Historical state, map information	Predicted trajectory	minADE, minFDE, MR, mean average precision (mAP)	T-junction, intersection	Waymo [134], nuPlan
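
The displacement metrics that recur in Table 2 can be made concrete with a short sketch. The definitions below follow common usage: ADE is the mean Euclidean error over the prediction horizon, FDE is the error at the final step, minADE and minFDE take the best of several candidate trajectories, and MR is the fraction of cases whose minFDE exceeds a threshold. Individual benchmarks differ in the details (for example, in how the best candidate is selected), and the function names and the 2.0 m threshold are assumptions made for illustration.

```python
import math

def displacement_errors(pred, gt):
    """Per-step Euclidean distance between predicted and true (x, y) points."""
    if len(pred) != len(gt):
        raise ValueError("trajectories must have equal length")
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]

def ade(pred, gt):
    """Average displacement error over the whole horizon."""
    errors = displacement_errors(pred, gt)
    return sum(errors) / len(errors)

def fde(pred, gt):
    """Final displacement error at the last time step."""
    return displacement_errors(pred, gt)[-1]

def min_ade(candidates, gt):
    """Best ADE among several candidate trajectories."""
    return min(ade(c, gt) for c in candidates)

def min_fde(candidates, gt):
    """Best FDE among several candidate trajectories."""
    return min(fde(c, gt) for c in candidates)

def miss_rate(min_fdes, threshold=2.0):
    """Fraction of cases whose minFDE exceeds the threshold (assumed 2.0 m)."""
    return sum(1 for e in min_fdes if e > threshold) / len(min_fdes)

# Toy example with a three-step ground truth and two candidates.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidate_a = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # constant 1 m lateral offset
candidate_b = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]  # exact until the last step
```

In this toy case both candidates have an ADE of 1.0 m, but their FDEs are 1.0 m and 3.0 m, which is why the two metrics are usually reported together.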

Table 3 Improvement of modified APF model

References	Component of artificial potential field	Formula
[136]	Road boundary repulsion field	$U_{\text{road}} = \frac{1}{2} k_{\text{road}} \left(\frac{1}{d_{\text{road}} - \frac{d_{\mathrm{W}}}{2}}\right)^{2}$
	Obstacle repulsive potential field	$U_{\mathrm{obs}} = \begin{cases} k_{\mathrm{obs}} \exp\left(-\frac{1}{2}\left(\left(\frac{x - x_{\mathrm{obs}}}{A}\right)^{2} + \left(\frac{y - y_{\mathrm{obs}}}{B}\right)^{2}\right)\right), & d \leq d_{0} \\ 0, & d > d_{0} \end{cases}$
[137]	Goal attractive field	$U_{\text{goal}} = \frac{1}{2} k_{\text{goal}} d^{m}$
	Surrounding obstacle’s potential field	$U_{\mathrm{obs}} = \begin{cases} \frac{1}{2} k_{\mathrm{obs}} \left(\frac{1}{d} - \frac{1}{d_{0}}\right)^{2}, & d \leq d_{0} \\ 0, & d > d_{0} \end{cases}$
[138]	Lane line potential field	$U_{\text{lane}} = \sum_{i} A_{\text{lane}} \exp\left(-\frac{d_{\text{lane},i}}{2\sigma_{\mathrm{r}}^{2}}\right)$
	AV velocity potential field	$U_{\mathrm{v}} = k_{\mathrm{v}} \left(v - v_{\text{obs}}\right)$
	Surrounding obstacle’s potential field	$U_{\mathrm{obs}} = \begin{cases} \frac{1}{2} k_{\mathrm{obs}} \left(\frac{1}{d} - \frac{1}{d_{0}}\right)^{2}, & d \leq d_{0} \\ 0, & d > d_{0} \end{cases}$
[139]	Goal attractive field	$U_{\text{goal}} = b_{1} x - \frac{d_{\mathrm{w}}}{\pi} b_{2} \cos\left(\frac{\pi}{d_{\mathrm{w}}} \left(y - y_{\text{goal}}\right)\right)$
	Road boundary repulsion field	$U_{\text{road}} = \begin{cases} k_{\text{road}} d_{\mathrm{w}}^{4}, & y > d_{\mathrm{w}} \\ k_{\text{road}} \left(y - \frac{d_{\mathrm{w}}}{2}\right)^{4}, & d_{\mathrm{w}} > y > \frac{d_{\mathrm{w}}}{2} \\ k_{\text{road}} \left(\frac{d_{\mathrm{w}}}{2} - y\right)^{2}, & y < \frac{d_{\mathrm{w}}}{2} \end{cases}$
	Obstacle potential field	$U_{\mathrm{obs}} = \begin{cases} \cos\left(\arctan\left(\frac{y - y_{\mathrm{obs}}}{x - x_{\mathrm{obs}}}\right)\right) \times e^{A_{x} R_{x}^{2} + A_{y} R_{y}^{2}}, & d \leq d_{\mathrm{s}} \\ 0, & d > d_{\mathrm{s}} \end{cases}$
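
As a minimal sketch of how such fields are evaluated, the code below implements the goal attractive field and the surrounding obstacle's potential field listed for [137] in Table 3, sums them, and estimates the descent direction numerically. The gain values, the influence range `d0`, the exponent `m`, and all function names are assumptions chosen for illustration, not parameters taken from the cited work.

```python
import math

def goal_potential(d, k_goal=1.0, m=2):
    """Attractive field: U_goal = 0.5 * k_goal * d**m, d = distance to goal."""
    return 0.5 * k_goal * d ** m

def obstacle_potential(d, k_obs=1.0, d0=5.0):
    """Repulsive field: 0.5 * k_obs * (1/d - 1/d0)**2 for d <= d0, else 0."""
    if d <= 0.0:
        raise ValueError("distance to the obstacle must be positive")
    if d > d0:
        return 0.0
    return 0.5 * k_obs * (1.0 / d - 1.0 / d0) ** 2

def total_potential(pos, goal, obstacles, k_goal=1.0, k_obs=1.0, d0=5.0, m=2):
    """Sum of the goal field and the fields of all obstacles at position pos."""
    u = goal_potential(math.dist(pos, goal), k_goal, m)
    for obs in obstacles:
        u += obstacle_potential(math.dist(pos, obs), k_obs, d0)
    return u

def descent_direction(pos, goal, obstacles, h=1e-5, **params):
    """Negative gradient of the total potential, by central differences."""
    x, y = pos
    du_dx = (total_potential((x + h, y), goal, obstacles, **params)
             - total_potential((x - h, y), goal, obstacles, **params)) / (2 * h)
    du_dy = (total_potential((x, y + h), goal, obstacles, **params)
             - total_potential((x, y - h), goal, obstacles, **params)) / (2 * h)
    return (-du_dx, -du_dy)
```

With no obstacles, `descent_direction((0.0, 0.0), (10.0, 0.0), [])` points straight at the goal. When the repulsive and attractive gradients cancel, the returned direction vanishes, which corresponds to the local-minimum issue listed for APF in Table 6.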

Table 4 DRL applications in car-following and lane-changing

References	Scenario	State	Action	Reward	DRL Algorithm
[182]	Car-following	Velocity of RV, inter-vehicle spacing, and the relative velocity between AV and surrounding vehicles (FV and RV)	Longitudinal acceleration	Spacing and velocity	DDPG
[188]	Car-following	Current velocity and acceleration of the AV, the spacing, and velocity difference to FV	Longitudinal acceleration	Collision, comfort, reverse, low velocity	DDPG
[178]	Lane-changing	Lateral position, longitudinal velocity, steering angle, and throttle value of AV	Lateral velocity and longitudinal acceleration	Collision, stable velocity, lane-lefting, headway	DDQN
[179]	Car-following	The velocity of RV, inter-vehicle spacing, relative velocity between a lead and following vehicle, and the deviation from equilibrium spacing	Longitudinal acceleration	String stability, safety time gap, velocity limit	DPPO
[171]	Car-following	Velocity and acceleration of AV, velocity and distance of the leading vehicle, traffic light	Longitudinal acceleration	Velocity, acceleration, traffic light, monitoring interventions	TD3
[194]	Lane-changing	Lateral position and yaw angle of AV, velocity, inter-vehicle spacing	Longitudinal acceleration	Collision, traffic law, goal	DDPG
[192]	Lane-changing	Spacing, yaw angle, yaw rate	Steering angle, throttle	Traffic rule, human driving habits, road boundaries, collision, risk assessment	PRDQN
[181]	Car-following	Relative velocity and spacing between AV and surrounding vehicles (FV and RV)	Longitudinal acceleration	Collision, spacing, velocity	SAC
[175]	Car-following	Velocity, spacing	Longitudinal acceleration	Efficiency, driving comfort	DPPO
[189]	Lane-changing	Relative velocity and spacing	Acceleration, steering angle	Lateral position and velocity	SAC
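
To illustrate how the state, action, and reward columns of Table 4 fit together, the following sketch defines a car-following state and a scalar reward that loosely combines collision, spacing, and comfort terms. It does not reproduce the reward of any cited work: the desired time gap, the weights, the collision penalty, and the names `CarFollowingState` and `reward` are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class CarFollowingState:
    ego_velocity: float       # m/s, velocity of the AV
    spacing: float            # m, gap to the front vehicle (FV)
    relative_velocity: float  # m/s, FV velocity minus AV velocity (observed by a policy)

def reward(state, acceleration, desired_time_gap=1.5,
           w_gap=1.0, w_comfort=0.1, collision_penalty=100.0):
    """Hypothetical reward for one step of longitudinal control.

    The action is the longitudinal acceleration (m/s^2), as in Table 4.
    """
    # Collision term: a non-positive gap is treated as a crash.
    if state.spacing <= 0.0:
        return -collision_penalty
    # Spacing term: penalize deviation from a constant-time-gap spacing.
    desired_spacing = desired_time_gap * state.ego_velocity
    gap_error = (state.spacing - desired_spacing) / max(desired_spacing, 1.0)
    # Comfort term: penalize large accelerations.
    return -w_gap * gap_error ** 2 - w_comfort * acceleration ** 2
```

Under these assumed weights, an AV travelling at 20 m/s with a 30 m gap and zero acceleration receives the maximum reward of zero, whereas halving the gap or accelerating harder lowers it. The sensitivity of learned policies to such weighting choices is the reward-structure issue noted for RL/DRL in Table 6.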

Table 5 POMDP applications in car-following and lane-changing

References	Scenario	State (AV)	Action	Observation (surrounding vehicles)	Reward	POMDP online solver
[210]	T-junction	Position, velocity, route	Acceleration	Position, velocity, and steering angle	Collision, goal, velocity	API and TraCI in SUMO
[215,216]	Intersection	Traveled distance on the planned trajectory, velocity, and route	Acceleration, left turn signal	Position, velocity	Collision, goal, velocity, reverse, target path	Adaptive belief tree (ABT) [220]
[213]	Intersection	Position, velocity, yaw angle, yaw rate	Acceleration	Position, velocity, yaw angle, yaw rate, relative velocity and distance	Safe, goal	Responsibility sensitive safety (RSS) algorithm
[209]	T-junction	Position, relative spacing	Stop, edge, go	Position, relative spacing	Goal	Multiple online decision components with interacting actions (MODIA) [221]
[206]	Lane-changing	Position, velocity, acceleration, heading, and steering angle	Throttle/brake and steering	Position, velocity of other vehicles	Efficiency, safety, and navigation	EPSILON
[207]	Merging	Velocity, latent cooperation level	Longitudinal jerk value	Position	Velocity, safety, comfort	Cooperative intelligent driver model (C-IDM)
[216]	Intersection	Traveled distance on the planned trajectory, velocity, route	Acceleration, left turn signal	Route, velocity, and traveled distance	Collision, goal, velocity, reverse, target path	ABT
[217]	Intersection	Position, velocity	Acceleration	Position, velocity, intention	Collision, velocity, target path	Determinized sparse partially observable tree (DESPOT) [222]
[203]	Lane-changing	Longitudinal travel distance, velocity, acceleration. Lateral deviation and velocity	Acceleration	Position, heading	Collision, a safe distance	DESPOT
[214]	Intersection	Position, velocity	Acceleration	Position, velocity, direction	Collision, goal, velocity, acceleration	POMCP [223]
[218]	Round intersection	Position, velocity, heading angle	Acceleration, angular velocity	Position, velocity, heading angle	Collision, velocity, acceleration, target path	POMCP
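
The solvers in Table 5 all maintain a belief over quantities that the AV cannot observe directly, such as the intention of a surrounding vehicle. A minimal sketch of the underlying discrete Bayesian belief update, b'(s') ∝ O(o | s') Σ_s T(s' | s) b(s), is given below. The two intentions ("yield" and "go") and all probabilities are hypothetical values chosen for illustration; none of them comes from the cited works.

```python
def belief_update(belief, transition, observation_likelihood):
    """One step of a discrete Bayes filter over hidden states.

    belief: dict mapping state -> prior probability b(s)
    transition: dict mapping s -> dict mapping s' -> T(s' | s)
    observation_likelihood: dict mapping s' -> O(o | s') for the received o
    """
    # Prediction step: propagate the belief through the transition model.
    predicted = {
        s_next: sum(transition[s][s_next] * p for s, p in belief.items())
        for s_next in belief
    }
    # Correction step: weight by the likelihood of the received observation.
    unnormalized = {s: observation_likelihood[s] * predicted[s] for s in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {s: v / total for s, v in unnormalized.items()}

# Hypothetical example: is the other vehicle going to yield or go?
prior = {"yield": 0.5, "go": 0.5}
transition = {
    "yield": {"yield": 0.9, "go": 0.1},
    "go": {"yield": 0.1, "go": 0.9},
}
# Assumed likelihood of observing "decelerating" under each intention.
likelihood_decelerating = {"yield": 0.8, "go": 0.2}
posterior = belief_update(prior, transition, likelihood_decelerating)
```

After observing a deceleration, the belief shifts toward the yielding intention. Online solvers such as POMCP and DESPOT approximate this update by sampling when the state space is too large to enumerate.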

Table 6 Comparative analysis of approaches in interactive driving for AVs

Method	Advantages	Disadvantages
LSTM	Capture complex temporal relationships in traffic data	Incur high computational cost
	Predict the behavior of surrounding vehicles from various sequential data types	Require large data
	Improve decision-making over time	Require difficult hyperparameter tuning
Transformer	Handle sequential data with efficient processing and global attention	Consume high computational resources
	Model and predict interactions accurately	Require large datasets and quality data
	Make informed decisions about path planning, lane changes, and speed adjustments	Limit real-time predictions due to autoregressive decoding
APF	Implement easily and simply	Experience local minima issues
	Enhance efficiency for real-time obstacle avoidance	Struggle with dynamic obstacles
	Respond quickly to dynamic changes	Lack long-term predictive capability for planning
Game theory	Provide strategic interaction and decision-making	Rely on unrealistic assumptions
	Model competitive and cooperative behaviors properly	Limit real-world applicability due to simplifications for tractability
	Optimize strategies and conflict resolution	Fail to account for unpredictable human behavior
RL/DRL	Learn optimal policies through trial and error	Consume time for many interactions
	Make decisions for multiple agents in changing traffic conditions effectively	Encounter stability and convergence issues
	Adapt to unseen environments from training experiences	Exhibit sensitivity to the reward structure
	Handle high-dimensional state and action spaces (DRL)	
POMDP	Make decisions under uncertainty and partial observation	Require detailed models of the environment
	Model the interaction dynamics and predict the behavior of other vehicles	Consume significant computational resources for optimization
	Offer a robust approach to planning under uncertainty	Limit policy optimality with solution approximations
MPC	Predict future states and optimize control actions accordingly	Exhibit high computational complexity
	Handle multiple objectives and constraints simultaneously	Depend on accurate models of the vehicle and environment
GANs	Generate realistic training data, enhancing model robustness	Involve a difficult and potentially unstable training process
	Enhance images and augment data	
	Simulate complex driving scenarios for testing	Require high computational resources
GNNs	Model complex relationships represented as graphs effectively	Require a significant amount of data to train effectively, especially with large graphs
	Manage dynamic traffic effectively	Struggle to adapt to highly dynamic environments
GMM-HMM	Model complex behaviors and temporal sequences	Require significant computational resources for real-time applications
	Adapt to various driving conditions	Exhibit sensitivity to initial parameter settings and model configurations
	Predict behavior effectively	Struggle with complex environments

Fig. 1 Demonstration of interactive behavior

Fig. 2 Positional definition for surrounding vehicles

Fig. 3 Technology roadmap for interactive driving

Fig. 4 Literature search processes

Fig. 5 Five-year (2020-2024) keywords co-occurrence network for “Autonomous vehicles”

Fig. 6 Trajectory prediction [72] and driver intention recognition [46] using the LSTM

Fig. 7 The application of the APF method for an AV

Fig. 8 A framework of the decision-making for AV using game theory [1528, 170]

Fig. 9 Interaction between agent and environment using RL/DRL

Fig. 10 Path planning for AVs based on the POMDP
