Decision‐Making and Planning Methods for Autonomous Vehicles Based on Multistate Estimations and Game Theory

A core issue inherent to decision‐making and path‐planning tasks is managing the uncertainties in the motion of dynamic obstacles. Therefore, this article proposes a new decision‐making and path‐planning framework, based on game theory, that considers the multistate future actions of surrounding vehicles. First, multistate future actions of neighboring vehicles, whose driving styles vary, are estimated and fed into a decision‐making module for risk assessment. Then, based on the Stackelberg game theory, the ego vehicle and the rear object vehicle are modeled as two players in the game, and their optimal decisions are obtained. In addition, the path‐planning model incorporates a potential‐field model that utilizes several potential functions to explain the varied styles and physical limitations of the surrounding vehicles. Finally, the trajectory of the ego vehicle is obtained through model predictive control that is based on the outputs of the decision‐making and constructed potential‐field models. The results of simulation experiments that used designed scenarios demonstrate that the proposed method effectively manages various social interactions and generates safe and appropriate trajectories for autonomous vehicles. In addition, the simulation results demonstrate that considering multistate trajectories caused the decision‐making and path‐planning modules to be appropriate for unpredictable environmental conditions.

Rule-based methods perform poorly in situations where the rules are inapplicable and therefore have weak generalization capabilities.[10] Although the effectiveness of machine learning-based strategies has been well validated,[11] these methods have two major limitations. First, machine learning-based methods require an enormous amount of training data, which can be difficult and time-consuming to acquire. Second, existing machine learning-based approaches are not interpretable; this causes difficulty in predicting their performance for hypothetical scenarios.
To simultaneously achieve better generalization and interpretation capabilities, game theory methods have been increasingly adopted for the decision-making and path-planning tasks of autonomous vehicles, especially for those operating in highly interactive and hazardous scenarios.[12,13] Nan presented a method to increase average speed while maintaining driving stability on roundabouts.[14] Peng presented a paradigm based on game theory to address merging-zone safety issues.[15] Game theory is frequently employed because it provides excellent descriptions of the complex interactions between dynamic agents.[16] It has also been verified that game theory methods produce decision-making behaviors that are more similar to those of human drivers than those generated by traditional rule-based approaches.[17] Therefore, the Stackelberg game theory was adopted during this study as the core of the decision-making module. In addition, because of the development of V2X (vehicle-to-everything) technologies,[18] it can be assumed that the vehicles addressed in this article were connected vehicles, meaning that they had access to accurate motion information about other nearby vehicles.
A major challenge to the decision-making and path-planning tasks of the ego vehicle is the uncertain motion of surrounding dynamic obstacles.[21,22] Some scholars accounted for the future trajectories of the surrounding vehicles, but they often only predicted the trajectory with the highest probability.[23,24] This simplification has two drawbacks. First, the ego vehicle would exhibit poor adaptability to uncertain circumstances in which the driver's intentions altered rather frequently and in evidently nonunique ways. Second, as the complexity of the environment increased, the exploration space available to the ego vehicle would be limited when only the trajectory with the highest probability was considered. Thus, accounting for multiple future surrounding-vehicle trajectories is required to address the uncertainties in the local scenario and to enable the ego vehicle to perform safe and feasible decision-making.
However, a heavy computational burden would be created for the downstream decision-making and path-planning tasks because the number of potential future trajectories would substantially increase as more neighboring vehicles became involved. Considering multistate trajectories while meeting real-time requirements is therefore a challenge.
To address these issues, this article proposes a new framework for decision-making and path-planning based on the Stackelberg game theory that considers the multistate future trajectories of surrounding vehicles. The framework first estimates the multistate future trajectories of the surrounding vehicles, whose driving styles vary, and feeds them into a decision-making module for risk assessment. Then, using the Stackelberg game theory, the ego vehicle and the rear object vehicle are modeled as two players in the game, and their optimal decisions are obtained. In addition, the path-planning model incorporates a potential-field model, which makes use of several potential functions to explain the varied styles and physical limitations of the surrounding vehicles. Finally, the trajectory of the ego vehicle is obtained through model predictive control (MPC) based on the outputs of the decision-making and potential-field models. The proposed module's schematic diagram is presented in Figure 1.
In the proposed integrated approach, the following assumptions are made: 1) The vehicles addressed in this article are connected vehicles, meaning they have access to motion information about other vehicles in the scene via vehicle-to-vehicle or vehicle-to-infrastructure communications; and 2) Because the distance between the obstacle vehicles and the ego vehicle is small, only the acceleration and deceleration characteristics of the obstacle vehicles are taken into account.
This study produced three primary contributions: 1) To provide a decision-making and path-planning module better suited to unpredictable environmental conditions, a decision-making module that accounts for multistate motion estimations and driving styles in the risk assessment was constructed; 2) To explain the varied styles and physical limitations of the surrounding vehicles and to meet real-time requirements, a potential-field model was incorporated into the path-planning model; and 3) To verify the feasibility and effectiveness of the proposed method, typical scenarios were created and simulated. In addition, the simulations proved that multistate estimations can enable downstream decision-making and path-planning that are better suited to complicated scenarios and environmental uncertainties.
The rest of this article is organized as follows: Section 2 introduces the development of the decision-making module based on multistate estimation and Stackelberg game theory. Section 3 introduces the path-planning module based on the potential-field model. Simulation results and corresponding discussions are included in Section 4 to show the workings of the proposed method. Finally, conclusions are offered in Section 5.

Development of Decision-Making Module Based on Multistate Estimation and Stackelberg Game Theory
The decision-making process must account for both the situation of the ego vehicle and how it interacts with other surrounding vehicles.[25] The future dynamic changes in the traffic environment can be predicted by analyzing the motion of the surrounding vehicles, which is a crucial consideration for the decision-making and path-planning modules. Unlike other methods, the risk-assessment method employed during this study accounts for driving behavior, multistate estimations, and the intentions of the drivers in the surrounding vehicles.

Multistate Estimation Module
The decision-making module faces challenges in real-traffic situations because of the complexity of the relationships between vehicles and the ambiguity of those vehicles' potential future actions. All the potential driving paths, and the probability associated with each, are determined through multistate estimation. An ultimate planned path is then produced by integrating the risk-assessment module with game theory. The goal of using multistate estimations in this study was to forecast the future trajectory of an agent for the subsequent T seconds. The prediction model was trained using information gathered during this study, and this article presents the well-trained model.
As the future is ambiguous, the model was designed to produce K distinct predictions of the future trajectory that could be compared with the actual trajectory. The basic design in this study was based on MotionCNN.[26] The original model was an image-based regression model; during this study, it was adapted to a trajectory-dataset-based model. The input of the model was the history trajectories of the ego vehicle and the surrounding obstacles. The data were then passed to the multimodal trajectory prediction (MTP) heads, where each generator is composed of independent fully connected (FC) layers.
The dataset was collected from an in-vehicle test, in which a vehicle moved along a fixed path (Figure 2). Data for different vehicle decision results were obtained within the same scenario by translating the time axis; thus, a multimodal dataset was produced. For the dataset, approximately 25 h of driving data were collected at our operations campus. The complete dataset includes approximately 400 000 Lidar scans and 50 000 object bounding boxes.
The module consists of a CNN backbone pretrained on ImageNet with one fully connected layer attached on top. The model predicts K trajectories along with their corresponding confidence values, $c_1, \dots, c_K$, which are normalized using the softmax operator such that $\sum_k c_k = 1$. The model uses the mean squared error (MSE) for the trajectory regression loss. The negative logarithmic probability of the anticipated mixture of Gaussians is computed for the ground-truth trajectory, where the means match the predicted trajectories and the identity matrix, I, represents the covariance, as shown in Equation (1).
In Equation (1), $N(\cdot;\, \mu, \Sigma)$ is the probability density function of the multivariate Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$. The loss can be further decomposed into a product of one-dimensional Gaussian functions; the resultant expression is merely the logarithm of a sum of exponents, as shown in Equation (2). The performance of the MotionCNN on the multimodal dataset was measured by the average RMSE over a prediction horizon of 3 s. The test RMSE of the network on the multimodal dataset was 0.74. Figure 3 presents a scene for multimodal prediction.
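The mixture loss described above can be sketched in plain NumPy. This is a minimal, hypothetical reconstruction (the function name and tensor shapes are illustrative), assuming unit covariance as stated in the text and using log-sum-exp for numerical stability:

```python
import numpy as np

def multimodal_nll(pred_trajs, logits, gt_traj):
    """Negative log-likelihood of a ground-truth trajectory under a
    mixture of unit-covariance Gaussians centred on K predicted modes
    (illustrative sketch of the loss in Equations (1)-(2)).

    pred_trajs: (K, T, 2) predicted x/y points for K modes
    logits:     (K,) raw confidence scores, softmax-normalised to c_k
    gt_traj:    (T, 2) ground-truth trajectory
    """
    # softmax so that the confidences c_k sum to 1
    c = np.exp(logits - logits.max())
    c = c / c.sum()
    # with Sigma = I the per-mode log-density reduces to -0.5 * squared error
    sq_err = ((pred_trajs - gt_traj[None]) ** 2).sum(axis=(1, 2))  # shape (K,)
    # logarithm of a sum of exponents, stabilised by subtracting the max
    log_terms = np.log(c) - 0.5 * sq_err
    m = log_terms.max()
    return -(m + np.log(np.exp(log_terms - m).sum()))
```

When the confident mode matches the ground truth, the loss is small; assigning high confidence to a wrong mode increases it, which is the "higher-risk strategy" discussed below.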
The proposed loss function does not explicitly penalize the model for generating extremely similar trajectories. Because all the probability mass can be combined into one mode, the model could adopt a higher-risk strategy that generates higher loss values in the event of a wrong prediction; however, mode collapse was not observed empirically. As a result, optimizing the proposed loss function produces sufficient multimodality.
The diagram in Figure 4 illustrates the multistate estimation architecture.

Stackelberg Game Theory
A driver has two options when a preceding vehicle travels more slowly than his or her own vehicle: he or she can either slow down and follow the preceding vehicle or accelerate and change lanes.The driver will evaluate the risks involved before deciding if a lane change is safe.The rear object vehicle also has two options: to accelerate or not to accelerate.If the driver is aggressive, he or she may decide to accelerate to maintain a smooth driving pattern.The driver of the ego vehicle must then decide whether to pursue a lane change or to decelerate and stay in the current lane.The decision-making module must therefore account for the driving styles of the drivers in rear object vehicles as well.
The relationships between the ego vehicle and the rear object vehicles can be described as a classic Stackelberg game problem, as was previously mentioned. Stackelberg game theory is therefore discussed later in this article to explain how autonomous vehicles make decisions. Four elements were considered during the modeling process: the cost functions, the object vehicle in front of the ego vehicle, the rear object vehicle, and the ego vehicle.
Figure 5 presents a scenario for autonomous vehicles, in which object vehicle 1 is in front of the ego vehicle and is moving more slowly than the ego vehicle. The ego vehicle must decide whether to follow object vehicle 1 or move to the left lane; object vehicle 2 must simultaneously decide whether to accelerate. The ego vehicle and object vehicle 2 can be modeled as two players in a Stackelberg game, in which the ego vehicle acts as the leader and object vehicle 2 acts as the follower. Both the leader and the follower work to decrease their costs throughout the game. A cost function based on multistate estimations was therefore constructed that considers driving styles as well as intentions.

Ego Vehicle Cost Function
To assess the decision-making costs of the vehicles, four factors (driving safety, riding comfort, travel efficiency, and passable risk) were considered during this study.
The longitudinal and lateral risks to the ego vehicle comprise the driving-safety costs, which are defined by Equation (3):

$C_{ds}^{ev} = \left| |\alpha| - 1 \right| C_{ds\text{-}log}^{ev} + |\alpha|\, C_{ds\text{-}lat}^{ev}$  (3)

In Equation (3), the longitudinal and lateral safety costs are denoted by $C_{ds\text{-}log}^{ev}$ and $C_{ds\text{-}lat}^{ev}$, respectively. $\alpha$ denotes the ego vehicle's action, $\alpha \in \{-1, 0, 1\}$, corresponding to changing to the left lane, keeping the current lane, and changing to the right lane, respectively.
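Equation (3) simply switches between the two cost components according to the chosen action, which can be sketched directly (function and argument names are illustrative):

```python
def driving_safety_cost(alpha, c_lon, c_lat):
    """Blend longitudinal and lateral safety costs per Equation (3):
    C_ds = ||alpha| - 1| * C_lon + |alpha| * C_lat,
    with alpha in {-1, 0, 1} for left lane change, lane keep,
    and right lane change, respectively.
    """
    assert alpha in (-1, 0, 1)
    return abs(abs(alpha) - 1) * c_lon + abs(alpha) * c_lat
```

For lane keeping (alpha = 0) only the longitudinal term survives; for either lane change (alpha = ±1) only the lateral term does.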
The longitudinal costs of the ego vehicle account for the longitudinal gap and the velocity relative to the object vehicle in front of the ego vehicle, as expressed by Equation (4). In Equation (4), the longitudinal velocities of the front object vehicle and the ego vehicle are indicated by $v_{log\text{-}v}^{fv}$ and $v_{log\text{-}v}^{ev}$, respectively. The longitudinal positions of the front vehicle and the ego vehicle are $X_{log\text{-}d}^{fv}$ and $X_{log\text{-}d}^{ev}$. The weighting coefficients for the velocity gap and position gap are $w_{log\text{-}v}^{ev}$ and $w_{log\text{-}d}^{ev}$, respectively. $\sigma = 0.001$ is used in the calculations to avoid a zero denominator. $L_v$ is the vehicle length used as a safety margin.
The lateral costs of the ego vehicle can be expressed by Equation (6). In Equation (6), $X_{log\text{-}d}^{ov}$ indicates the longitudinal location of the rear object vehicle; $v_{log\text{-}v}^{ev}$ and $v_{log\text{-}v}^{ov}$ reflect the longitudinal velocities of the ego vehicle and the rear object vehicle, respectively. The weighting factors for the lateral velocity gap and position gap are $w_{lat\text{-}v}^{ev}$ and $w_{lat\text{-}d}^{ev}$, respectively. The comfort costs of the ego vehicle depend directly on both the lateral and longitudinal acceleration values. Earlier studies indicate that passengers feel very uneasy when the longitudinal acceleration exceeds 0.27 g or the lateral acceleration exceeds 0.2 g.[27][28][29] In this study, the comfort costs were therefore defined as shown in Equation (8). In Equation (8), $w_{ax}^{ev}$ and $w_{ay}^{ev}$ represent the weighting factors, and $a_x^{ev}$ and $a_y^{ev}$ are the longitudinal and lateral accelerations of the ego vehicle, respectively.
The longitudinal velocity of the ego vehicle affects its trip-efficiency costs, which also depend on the multistate estimations of the object vehicles in front of it. The lane-change behaviors of the object vehicles were not considered in this study because of their nearness to the ego vehicle; instead, only their acceleration and deceleration characteristics were considered. In the Stackelberg game, each player seeks to minimize its own cost; however, it is important to consider the multistate estimations of the front object vehicle. As a result, the efficiency costs can be expressed by Equation (10). In Equations (10)-(11), $confi_{risk}$ represents the possibility of a future collision between the front object vehicle and the ego vehicle. While the ego vehicle is moving faster than the object vehicle in front of it, a collision may occur between the two. First, the ego vehicle moves into the target lane in accordance with its current lane and motion characteristics ($\alpha$). Afterward, the various states of the object vehicles and the ego vehicle's target lane are evaluated. The crucial factor in a collision is the likelihood of an intersection of the object vehicles' future paths.
There is another situation, in which the intended speed of the ego vehicle equals $v_x^{max}$: this occurs when the distance gap between the ego vehicle and the front vehicle is greater than the safety threshold, which is given by Equation (12). In Equation (12), $v_x^{fv}$ indicates the speed of the vehicle in front of the ego vehicle in the same lane, $a_x$ symbolizes the maximum deceleration the vehicle may perform on the road, $\tau$ is the response time, and $\delta$ is the safety coefficient.
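A safety threshold built from the symbols named above can be sketched as follows. The exact form of Equation (12) is not recoverable from the text, so this response-time-plus-braking-gap construction is only a plausible assumption, and all parameter values are illustrative:

```python
def safety_threshold(v_ev, v_fv, a_max, tau, delta=1.2):
    """Hypothetical sketch of the safety gap in Equation (12): the ego
    vehicle covers v_ev * tau during the response time tau, then closes
    the speed difference to the front vehicle (v_fv) while braking at
    a_max; delta scales the overall margin.  Not the article's exact
    formula, just one reasonable construction from the named symbols.
    """
    braking_gap = (v_ev ** 2 - v_fv ** 2) / (2.0 * a_max)
    return delta * (v_ev * tau + max(braking_gap, 0.0))
```

The threshold grows with the speed surplus of the ego vehicle, so a faster ego vehicle needs a larger gap before it may hold $v_x^{max}$.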
Finally, the multistate estimations of the front object vehicle correspond to the passable costs of the ego vehicle. These passable costs are defined in Equations (13)-(15). In Equations (13)-(15), $X_{log\text{-}d}^{fv}$ represents the longitudinal position of the front object vehicle; $v_{log\text{-}v}^{ev}$ and $v_{log\text{-}v}^{fv}$ denote the longitudinal velocities of the ego vehicle and the front object vehicle, respectively; and $w_{lat\text{-}v}^{ev1}$ and $w_{lat\text{-}d}^{ev1}$ are the weighting factors for the lateral velocity gap and position gap, respectively.
The cost function of the ego vehicle, which is expressed in Equation (16), integrates the costs of the multistate estimations, riding comfort, and driving safety. In Equation (16), the weighting coefficients of driving safety, ride comfort, and travel efficiency for the ego vehicle are $\omega_{ds}^{ev}$, $\omega_{cs}^{ev}$, and $\omega_{es}^{ev}$, respectively.

Rear Object Vehicle Cost Function
The driving-safety costs of the rear object vehicle consist of both the longitudinal and lateral costs, as for the ego vehicle; these costs are depicted in Equation (17).
In Equation (17), the lateral and longitudinal safety costs are denoted by $C_{ds\text{-}lat}^{ov}$ and $C_{ds\text{-}log}^{ov}$, respectively. The rear object vehicle's action is denoted by $\alpha_1$.
The longitudinal costs for the rear object vehicle are defined in Equation (18). In Equations (18)-(19), the longitudinal velocities of the vehicle in front of the rear object vehicle and of the rear object vehicle itself are indicated by $v_{log\text{-}v}^{av}$ and $v_{log\text{-}v}^{ov}$, respectively. The longitudinal locations of the vehicle in front and the rear object vehicle are represented by $X_{log\text{-}d}^{av}$ and $X_{log\text{-}d}^{ov}$. The weighting coefficients for the velocity gap and position gap are $w_{log\text{-}v}^{ov}$ and $w_{log\text{-}d}^{ov}$, respectively. $\sigma = 0.001$ is used in the calculations to avoid a zero denominator. $L_v$ is the vehicle length used as a safety margin.
No lane-change behaviors for rear object vehicles were accounted for because, as was previously indicated, there is little space between these vehicles and the ego vehicle.As a result, only acceleration and deceleration actions were considered for rear object vehicles.Therefore, the lateral costs of a rear object vehicle are equal to those of the ego vehicle.
In Equation (20), $X_{log\text{-}d}^{ov}$ represents the longitudinal position of the rear object vehicle; $v_{log\text{-}v}^{ev}$ and $v_{log\text{-}v}^{ov}$ denote the longitudinal velocities of the ego vehicle and the rear object vehicle; and $w_{lat\text{-}v}^{ov}$ and $w_{lat\text{-}d}^{ov}$ are the weighting factors for the lateral velocity gap and position gap, respectively. Equation (22) expresses the comfort costs of a rear object vehicle. In Equation (22), $w_{ax}^{ov}$ is the weighting factor and $a_x^{ov}$ is the longitudinal acceleration of the rear object vehicle. The trip-efficiency costs of the rear object vehicle depend on both the longitudinal velocity of the vehicle in front of it and the multistate estimations for the preceding vehicles.
In Equations (24)-(25), $v_x^{max}$ is the maximum velocity in the environment. $Same\_line_1 = 1$ indicates that the vehicle in front and the rear object vehicle are currently in the same lane. $confi_{risk1}$ denotes the possibility of an upcoming collision between the rear object vehicle and the vehicle traveling in front of it.
The multistate estimations for the preceding vehicle correspond to the passable costs of a rear object vehicle. In this study, these passable costs were defined according to Equations (26)-(28). In Equations (26)-(28), $X_{log\text{-}d}^{av}$ represents the longitudinal position of the vehicle ahead; $v_{log\text{-}v}^{ov}$ and $v_{log\text{-}v}^{av}$ denote the longitudinal velocities of the rear object vehicle and the vehicle ahead; and $w_{lat\text{-}v}^{ov1}$ and $w_{lat\text{-}d}^{ov1}$ are the weighting factors for the lateral velocity gap and position gap, respectively.
Finally, the cost function for a rear object vehicle also considers multistate estimations, riding comfort, trip efficiency, and driving safety.
Multistate estimations and risk assessments were both considered when the decision-making module was established. In addition, driving styles impact how decisions are made and routes are planned; different styles result in different solutions. Three distinct driving styles were therefore defined, and their characteristics were incorporated into the decision-making module. The weighting factors, $\omega_{ds}^{ov}$, $\omega_{cs}^{ov}$, and $\omega_{es}^{ov}$, should differ based on the driving style exhibited by the obstacle vehicle and should reflect the distinctive properties of each style. Table 1 lists the weighting factors for the various styles.

Decision-Making Using Stackelberg Game Theory
The decision-making process for lane changing frequently uses game theory; in this approach, one object vehicle is considered first. The two-player game, which is referred to as a bi-level optimization problem, is then developed using the ego vehicle and the object vehicle as the players. The decision-making module can then be expressed as shown in Equations (30)-(31).
In Equations (30)-(31), $a_x^{ev}$ denotes the ideal longitudinal acceleration of the ego vehicle, $\alpha$ denotes the ideal lane-change motion of the ego vehicle, $\gamma_2(a_x^{ev}, \alpha)$ denotes the ideal choice of the object vehicle given the ego vehicle's decision, and $\phi_2$ denotes the object vehicle's action choices.
In addition, velocity and acceleration restrictions should be accounted for while creating constraints.
In Equation (32), $\Delta a_x^{max}$ represents the upper boundary of the acceleration change; $a_x^{min}$ and $a_x^{max}$ stand for the lower and upper bounds of the acceleration, respectively.
The ego vehicle has only one choice regarding whether to switch to another lane in two-lane scenarios, so only one object vehicle must be considered. The ego vehicle, however, has more possibilities in three-lane scenarios; in this type of situation, more than one object vehicle must be considered. For instance, when the ego vehicle is traveling in the middle lane, two object vehicles (one in the right lane and one in the left lane) should be considered. Therefore, the resulting Stackelberg game problem can be expressed by Equation (33). In Equation (33), lov and rov represent the left object vehicle and right object vehicle, respectively.
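With discrete action sets, the bi-level leader-follower problem of Equations (30)-(31) can be solved by enumeration. The sketch below is a minimal illustration (function names and the toy cost tables in the usage are hypothetical): for each leader action, the follower's best response is computed, and the leader then picks the action that is cheapest given that response.

```python
def stackelberg_decision(ego_actions, obj_actions, ego_cost, obj_cost):
    """Enumerative solve of the two-player Stackelberg game: the ego
    vehicle (leader) anticipates that the rear object vehicle (follower)
    will choose its own cost-minimising response to each leader action.

    ego_cost(a, b) and obj_cost(a, b) are user-supplied cost functions
    over a leader action a and a follower action b.
    Returns the (leader, follower) action pair.
    """
    best = None
    for a in ego_actions:
        # follower's best response to the leader action a
        b_star = min(obj_actions, key=lambda b: obj_cost(a, b))
        cost = ego_cost(a, b_star)
        if best is None or cost < best[0]:
            best = (cost, a, b_star)
    return best[1], best[2]
```

With continuous accelerations the inner minimisation would be a constrained optimisation rather than a `min` over a finite set, but the leader-anticipates-follower structure is the same.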

Motion Planning Based on Potential-Field Model
The local trajectory of an autonomous vehicle always changes with the dynamic traffic environment during its decision-making and path-planning processes. To generate a viable path, popular techniques, such as the rapidly exploring random trees (RRT) and probabilistic roadmap (PRM) methods, continually sample the state space using a uniform sampling distribution.[13] Finally, the collision trajectories are eliminated after linking each sample point to obtain a list of potential travel trajectories. However, human drivers are not always able to foresee all possible trajectories, particularly in situations involving high speeds. In addition to the benefits already described, choosing the best path at any given time improves computing efficiency and real-time performance. A framework was therefore developed during this study for autonomous vehicle planning and decision-making based on multistate estimations and game theory.
The potential-field method is a useful technique for simulating the dynamics of a vehicle and describing its interactions with the surroundings.[30] In this study, MPC and a potential-field model were used to determine the best route for an autonomous vehicle.

Potential-Field Model
First, an obstacle potential-field model was established, as shown in Equations (34)-(41). In Equations (34)-(41), the potential-field values for the object vehicle and the object vehicle in front of the ego vehicle in the coordinate system are denoted by $P^{ov}(X, Y)$ and $P^{av}(X, Y)$, respectively. The center of gravity of the object vehicle is at coordinates $(X^{ov}, Y^{ov})$, while that of the front vehicle is at $(X^{av}, Y^{av})$. $Base_P$ is the maximum potential field of the object vehicle and the front object vehicle. $\rho_X$ and $\rho_Y$ represent the convergence coefficients in the X- and Y-directions, respectively; the longitudinal velocities of the object vehicle and the front object vehicle are given by $v_x^{ov}$ and $v_x^{av}$, respectively. $\eta$ is a factor associated with the driving style, and $c$ is a shape coefficient. Figure 6 presents an example 3D map of the vehicle potential-field model derived from Equations (34)-(41).
As previously noted, object vehicles with different driving styles have distinct effects on the ego vehicle in the potential-field model. For instance, an aggressive object vehicle may have larger acceleration or deceleration values or may steer sharply; therefore, the range of its potential field should be greater, to produce earlier warnings. It is thus important to capture the vital qualities of the different driving styles when creating a potential-field model. Throughout this article, the factor $\eta$ was used to reflect the characteristics of the various driving styles, as shown in Table 2.
Figure 7 illustrates how the various styles perform in the potential-field model.
As demonstrated in Figure 7, the aggressive style has a wider risk potential range than the normal and cautious styles, meaning that the aggressive object vehicle is more likely to collide with the ego vehicle.
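The effect of the style factor can be sketched with a simple exponential hill centred on the obstacle. The article's exact functional form for Equations (34)-(41) is not recoverable from the text, so the shape and all parameter values below are illustrative assumptions; only the roles of $Base_P$, $\rho_X$, $\rho_Y$, and $\eta$ follow the definitions above:

```python
import math

def obstacle_potential(x, y, xv, yv, base_p=10.0, rho_x=15.0, rho_y=2.0, eta=1.0):
    """Hypothetical sketch of an obstacle potential field: an exponential
    hill centred on the obstacle at (xv, yv), peaking at base_p and
    decaying with the convergence coefficients rho_x / rho_y.  The
    driving-style factor eta widens the field for aggressive vehicles
    (larger eta -> slower decay -> wider risk range).
    """
    dx = (x - xv) / (eta * rho_x)
    dy = (y - yv) / (eta * rho_y)
    return base_p * math.exp(-(dx ** 2 + dy ** 2))
```

At a fixed distance from the obstacle, a larger eta (aggressive style) yields a higher potential than a smaller eta (cautious style), which reproduces the wider risk range shown in Figure 7.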

Lateral Motion Planning Model
Only lateral motion planning is necessary because the decision-making module produces the longitudinal motion. Establishing a vehicle model is necessary for the motion-planning module; the condensed kinematic model expressed in Equation (42) was used during this study. In Equation (42), $\varphi^{ev}$ represents the heading angle of the ego vehicle, and $a_x^{ev}$ and $a_y^{ev}$ are the longitudinal and lateral accelerations of the ego vehicle, respectively.

Path-Planning Module
During this study, planned paths were produced by combining MPC with the potential-field model. As a result, Equation (42) was discretized, as shown in Equation (43):

$f[x(t), u(t)] = \begin{bmatrix} a_x^{ev} \\ a_y^{ev}/v_x^{ev} \\ v_x^{ev}\cos\varphi^{ev} \\ v_x^{ev}\sin\varphi^{ev} \end{bmatrix}$

In Equation (44), the state vector and control vector of the MPC module are $x(t)$ and $u(t)$, respectively, with $x = [v_x^{ev}\ \varphi^{ev}\ X^{ev}\ Y^{ev}]^T$ and $u = a_y^{ev}$. Consequently, the state evolution of the ego vehicle can be described by Equation (45).
In Equation (45), $\Delta T$ is the sampling time, which in this article is chosen as 0.05 s.
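One step of the discretized model can be sketched as a forward-Euler update of the state vector $[v_x, \varphi, X, Y]$ (the Euler scheme and the function name are assumptions; the article does not state which discretization it uses):

```python
import math

def kinematic_step(state, a_x, a_y, dt=0.05):
    """Forward-Euler step of the kinematic model in Equations (42)-(45):
    state = [v_x, phi, X, Y]; a_y is the MPC control input and a_x comes
    from the decision-making module; dt matches the 0.05 s sampling time.
    """
    v_x, phi, X, Y = state
    return [
        v_x + a_x * dt,                 # longitudinal velocity
        phi + (a_y / v_x) * dt,         # heading angle, yaw rate = a_y / v_x
        X + v_x * math.cos(phi) * dt,   # longitudinal position
        Y + v_x * math.sin(phi) * dt,   # lateral position
    ]
```

Applying the step repeatedly rolls the state forward over the MPC prediction horizon.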
The output vector of the MPC is connected to the potential-field value in the proposed module. Equation (46) defines the output vector:

$y(k) = g[x(k), u(k)] = P(X^{ev}(k), Y^{ev}(k))$  (46)

The optimal solution for the ego vehicle can then be obtained by minimizing the cost function while using the MPC module to forecast the states of the ego vehicle over the scheduled steps. The MPC predictions can be expressed by Equation (47). In Equation (47), $N_P$ represents the prediction horizon of the MPC module, $N_c$ is the control horizon, and $N_P$ is larger than $N_c$. In this study, the potential-field value, the target-lane center, and the lateral acceleration are the three components that comprise the cost function of the MPC module. The target-lane center is determined from the lane-change decision of the decision-making module and the current location of the ego vehicle, as shown in Equation (48). In Equation (48), $W_r$ denotes the lane width, $Lane_{mark}$ is the current lane mark of the ego vehicle, and $L_r$ represents the road border on the right side.
The cost function of the MPC module can then be established, as shown in Equation (49).
In Equation (49), $Q_1$ and $Q_2$ represent the weighting matrices for the target lateral-position error and the potential-field output, respectively. The control-factor weighting matrix is denoted by $R$. The path-planning challenge then becomes a constrained optimization problem: the goal of the MPC module is to find the solution with the lowest cost-function value. Finally, the best lateral choice is sent to the decision-making module through the integrated framework.
In this study, an efficient projected fmincon method with multiple constraints was used to solve the MPC optimization problem.
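A reduced version of this constrained MPC solve can be sketched with SciPy's SLSQP solver standing in for the projected fmincon method. This is only an illustrative sketch: the function names, horizons, and weights are assumptions, and the potential-field term ($Q_2$) of the full cost in Equation (49) is omitted for brevity, leaving the lateral-position ($Q_1$) and control ($R$) terms:

```python
import math
import numpy as np
from scipy.optimize import minimize

def plan_lateral(state0, y_target, n_p=10, n_c=5, dt=0.05, q1=1.0, r=0.1, ay_max=2.0):
    """Minimal MPC sketch: optimise the first n_c lateral accelerations
    (the last one is held over the rest of the prediction horizon n_p),
    subject to |a_y| <= ay_max, by rolling the kinematic model forward
    and penalising lateral-position error (q1) and control effort (r).
    """
    def rollout(u):
        v_x, phi, X, Y = state0
        cost = 0.0
        for k in range(n_p):
            a_y = u[min(k, n_c - 1)]         # hold last control beyond n_c
            phi += (a_y / v_x) * dt           # kinematic model, Eq. (42)
            X += v_x * math.cos(phi) * dt
            Y += v_x * math.sin(phi) * dt
            cost += q1 * (Y - y_target) ** 2 + r * a_y ** 2
        return cost

    res = minimize(rollout, np.zeros(n_c), method="SLSQP",
                   bounds=[(-ay_max, ay_max)] * n_c)
    return res.x
```

Only the first optimised control would be applied before re-planning at the next sampling instant, in the usual receding-horizon fashion.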

Experiments and Discussion
Two typical scenarios were created to validate the efficacy of the proposed integrated architecture. Typical characteristics, such as lane changes, double-lane changes, and three-lane roads, were present in these scenarios. These scenarios covered a wide range of driving environments and demonstrated the viability and efficacy of the proposed strategy.
Scenario 1 contained two lanes and scenario 2 contained three lanes. In addition, two distinct actions were created for the front object vehicle in each scenario. The scenario complexity was greater for scenario 2 than for scenario 1, and the state complexity of the front object vehicle increased within each scenario, which allowed the efficacy of the proposed algorithms to be tested. Different vehicle driving styles were also considered in each scenario. Figure 8 presents the experiment scene.

Test Case 1
To demonstrate the feasibility and effectiveness of the proposed method, it was tested using scenario 1, which was a typical double-lane-change situation.As an example, case 1-1 demonstrated how the approach could adapt to varied driving styles.Case 1-2 demonstrated how the approach could effectively manage multiple states.
The ego vehicle (EV) and object vehicle 1 (OV1) were initially separated by a distance of 50 m. The initial longitudinal positions of object vehicle 2 (OV2) and object vehicle 3 (OV3) were 2 and 102 m, respectively. The initial velocity of OV2 was 12 m s⁻¹, while that of the EV was 20 m s⁻¹. Both OV1 and OV3 had initial speeds of 15 m s⁻¹. Case 1-1 was based on the assumption that OV3 would proceed at a steady speed while remaining in its current lane. Due to the close interactions between the EV and OV2, case 1-1 primarily tested the relationship between the EV and OV3. For this case, OV2 had a normal driving style and OV1 had a transient driving style. According to the test results, varied vehicle driving styles affect how the decision-making and planning modules perform.
A multistate comparison was also performed; the results are shown in Figures 11 and 12. In case 1-2, OV3 moved forward at a constant speed and had two possible motions. In both situations, OV2 drove normally; the line end denotes the moment when OV3 eventually decided to remain in its original lane, while the curve end denotes the moment when OV3 finally decided to move to the right lane.
According to the test results, when OV3 had a variety of intentions, various outcomes were generated by the decision-making and planning modules. Due to the intention risk, if OV3 decided to stay in its original lane, the EV would begin to change lanes slightly later than in the typical situation described by case 1-1, and OV3 would temporarily decelerate to maintain safety. If the EV ultimately decided to move to the left lane, OV3 would stay in its current lane and keep a safe distance from the EV by slowing slightly. These results show that the EV adopts appropriate methods, based on the object vehicle trajectory, to ensure its own safety.

Test Case 2
A more complex three-lane scenario, scenario 2, was considered to further illustrate the viability and efficacy of the suggested approach. Case 2-1 demonstrated how the technique can adapt its behavior to various driving styles, and case 2-2 demonstrated how the approach can effectively manage multiple states.
The initial longitudinal distance between the EV and OV1 was 50 m. The initial longitudinal positions of OV2 and OV3 were 2 and 4 m, respectively. OV1 had an initial velocity of 15 m s⁻¹, while that of the EV was 20 m s⁻¹. OV2 and OV3 had initial speeds of 12 and 15 m s⁻¹, respectively. In case 2-1, it was assumed that OV1 would continue to proceed forward at a steady speed. Figures 13 and 14 present the test results for case 2-1. According to the test results, varied vehicle driving styles affect how the decision-making and planning modules respond in complex situations.
A multistate comparison was also performed. In case 2-2, OV1 traveled forward at a constant speed and had two potential actions. Figures 15 and 16 provide details of the test results for this case. The line end indicates that OV1 ultimately decided to remain in its original lane, while the curve end indicates that OV1 finally decided to move to the left lane. In both circumstances, the driving styles of OV2 and OV3 were normal.
According to the test results, when OV1 had various objectives, various outcomes were produced by the decision-making and path-planning modules. Due to the intention risk, if OV1 decided to remain in its original lane, the EV would begin to change lanes slightly later than in a typical situation, such as that in case 2-1, and OV1 would accelerate before moving to the right lane. If the EV eventually decided to move to the left lane, it would likely accelerate slightly while maintaining a safe distance because its driving style was aggressive. In case 2-2, the EV would account for the multistate estimations of the vehicle in front of it and adopt appropriate methods based on the trajectory of that vehicle to ensure its own safety.

Test Case 3
In this study, a typical three-lane driving scenario (scenario 2) was created to compare the suggested method with simple prediction methods and to demonstrate the adaptability and practicality of the proposed system.
In scenario 2, the suggested strategy was contrasted with a traditional approach that considered only the maximum-likelihood trajectory of the object vehicle and with an approach that made no predictions. First, the front object vehicle traveled in its current lane at a consistent speed; however, it eventually moved to the left lane. Figure 17 provides detailed test results for this case.
The results show that the traditional path-planning method, which considers only the trajectory with the maximum probability, was susceptible to prediction errors and made poor decisions. Such situations can be avoided using the suggested strategy, which considers several object vehicle states while making decisions. The suggested path-planning and decision-making method produced driving behavior closer to that of a human driver. Combining the simulation method with two-lane and three-lane situations demonstrated the viability and efficacy of the suggested techniques.
The model code was deployed on the computing platform of a vehicle to evaluate the time cost of the MPC approach. The vehicle computing platform included two Xavier components and one MPC5748 component. One thousand loops were executed on the vehicle computing platform, and the time costs of the algorithm are displayed in Figure 18. The figure indicates that the primary calculations of the suggested approach were completed in less than 10 ms and that the maximum time cost was 18.48 ms; these results show that the approach satisfies the path-planning time-cost requirement and demonstrate the considerable potential of the suggested method.
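The per-loop timing test described above can be sketched as follows. This is a generic benchmark harness, not the code run on the vehicle platform; `solve_step` stands in for one iteration of the decision-making and path-planning pipeline.

```python
import time

def benchmark(solve_step, n_loops=1000):
    """Run one planner iteration n_loops times and return (max, mean)
    per-loop time in milliseconds, mirroring the 1000-loop test on the
    vehicle computing platform."""
    costs_ms = []
    for _ in range(n_loops):
        t0 = time.perf_counter()
        solve_step()  # one decision-making + path-planning cycle
        costs_ms.append((time.perf_counter() - t0) * 1000.0)
    return max(costs_ms), sum(costs_ms) / len(costs_ms)
```

The maximum is the figure of merit for real-time feasibility: a planner running at, say, a 50 ms cycle must keep the worst-case loop (18.48 ms in the reported test) well under that budget.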

Conclusions
This article proposes an integrated decision-making and path-planning framework for autonomous vehicles that focuses on the utilization of multistate estimations and driving styles. First, multistate future actions of neighboring vehicles with varying driving styles are estimated and fed into the decision-making module for risk assessment. Then, based on Stackelberg game theory, the ego vehicle and the rear object vehicle are modeled as two players in the game, and their optimal decisions are obtained. In addition, the path-planning model incorporates a potential-field model, which makes use of several potential functions to explain the varied driving styles and physical limitations of the surrounding vehicles. Finally, the trajectory of the ego vehicle is obtained through MPC based on the outputs of the decision-making and constructed potential-field models.
Testing was performed for two scenarios to evaluate the efficiency of the proposed method. The simulation results reveal that the integrated strategy effectively managed various social interactions and generated safe and appropriate decisions for autonomous vehicles. In addition, the simulation results demonstrate that multistate trajectory estimations make the decision-making and path-planning modules appropriate for unpredictable environmental conditions.
Future studies will focus on obtaining a more thorough understanding of the combination of trajectory prediction, decision-making, and path-planning, which is crucial for achieving safety and reliability in the local path-planning of autonomous vehicles, and will further explore contributions to decision-making and path-planning applications in this setting.

Figure 1. Schematic diagram of the proposed decision-making and path-planning module.
Figure 2. Testing vehicle and routing map.
Figure 5. One scene for the ego vehicle.
Figure 6. 3D map of the potential for vehicles.
Figure 7. Performance of different styles. a) Aggressive driving style. b) Normal driving style. c) Cautious driving style.
Figure 9. Testing results of the decision-making and planning module in case 1-1.
Figure 10. Testing results of the decision-making and planning module in case 1-1.
Figure 11. Testing results of the decision-making and planning module in case 1-2.
Figure 12. Testing results of the decision-making and planning module in case 1-2.
Figure 13. Testing results of the decision-making and planning module in case 2-1.
Figure 14. Testing results of the decision-making and planning module in case 2-1.
Figure 15. Testing results of the decision-making and planning module in case 2-2.
Figure 16. Testing results of the decision-making and planning module in case 2-2.
Table 1. Weighting coefficients of different styles.
Table 2. Factors for different styles.