A behavioral planning framework for autonomous driving

In this paper, we propose a novel planning framework that can greatly improve the level of intelligence and driving quality of autonomous vehicles. A reference planning layer first generates kinematically and dynamically feasible paths assuming no obstacles on the road, then a behavioral planning layer takes static and dynamic obstacles into account. Instead of directly commanding a desired trajectory, it searches for the best directives for the controller, such as lateral bias and distance keeping aggressiveness. It also considers the social cooperation between the autonomous vehicle and surrounding cars. Based on experimental results from both simulation and a real autonomous vehicle platform, the proposed behavioral planning architecture improves the driving quality considerably, with a 90.3% reduction of required computation time in representative scenarios.


I. INTRODUCTION
Autonomous driving has been a promising research area since the 1990s. With recent advances in computer and sensor engineering, partially autonomous vehicles may be produced and widely used for travelling in the near future with fully automated vehicles to follow.
There are over 1,200,000 deaths and 50,000,000 injuries each year, worldwide [11], [1]. Autonomous vehicles have great potential to avoid or reduce the severity of most of such traffic accidents which are mainly caused by human driver errors. They will also have the potential to enhance the overall performance of the current traffic system by efficiently increasing highway capacity and decreasing traffic congestion in cities [20]. Fully autonomous driving is able to spare people from the task of driving while commuting.
Since the 1980s, autonomous vehicle intelligence has increased from lane centering to actually driving on public roads with lane-changing capability [3], [12], [15], [7], [2], [19]. A few research platforms have shown great capability in driving on public roads. However, autonomous vehicle sensing, decision making and motion planning intelligence have not achieved the same level as that of good human drivers.
In the architecture of an autonomous vehicle, motion planning is a key component that affects the autonomous vehicles performance. The motion planners role is to generate a kinematically and dynamically feasible path and velocity profile for the robot to execute. Due to computational constraints, the motion planner usually simplifies the problem by assuming that the surrounding moving vehicles will keep constant speed [15], [8]. The inaccurate prediction and lack of understanding of human drivers intentions prevent autonomous vehicles from cooperating with human-driven vehicles as human drivers intuitively do with one another.
Most motion planners in autonomous vehicles also assume the robot will execute the planners command with no delay or error [21], [18]. Such assumptions cause the autonomous vehicle either to fail to execute the command, leading to potential danger, or overcompensate for the control error, potentially decreasing the comfort level.
II. RELATED WORK 1) Motion Planner: Motion planning for autonomous vehicles has been developed and substantially improved in recent decades [14]. Most fully autonomous concept vehicles have their own motion planning to directly command a desired trajectory, which includes both desired path and speed. In the Urban Challenge, CMUs Tartan Racing Teams motion planner considered kinematic and dynamic constraints of the vehicle, as well as all moving and static obstacles [15]. It generated 20 to 50 feasible trajectories from a given sample point with different offsets from the center of the road 10 to 30 meters away from the vehicle and then selected the best one. This mechanism proved to be an efficient way to avoid static obstacles. However, due to the short planning horizon, it was difficult for it to perform smooth lane changes or circumvent dynamic obstacles, especially at highway speeds.
To overcome the shortcomings of this planner, Mc-Naughton et al. proposed concatenating multiple layers of trajectories to produce a longer horizon and search more candidate trajectories [8]. Benefiting from the larger number of possible trajectories, this achieves better performance in obstacle avoidance and circumventing slowly moving obstacles. However, the planner still only chooses from a very limited number of candidate trajectories. Their endpoints have different lateral offsets, but the same heading as the road. In cases with sharp turns and in obstacle avoidance maneuvers, sometimes no trajectory provides smooth and human-like performance.
An alternative approach to sample-based planners [15], [8] is based on path optimization [21]. The trajectory is defined as a function of a few control points. It is initialized to be the center line of the road. Then the parameters of the control points are optimized based on multiple cost terms, including curvature, length of the trajectory, and lateral acceleration. However, this kind of optimization is vulnerable to local optima, which can cause very poor performance. It is also computationally quite intensive, with non-deterministic computation time. 2) Social Behavior: Although kinematic and dynamic constraints, smoothness of path, and obstacle avoidance are frequent concerns in motion planning, the autonomous vehicles ability to socially cooperate with surrounding moving vehicles is usually overlooked. In circumstances such as entrance ramps, lane changes, etc., the human driver will make decisions based on his/her understanding of what other human drivers are thinking or are likely to do. He can also adjust his speed to show other drivers his own intention. This kind of cooperation happens intuitively between human drivers. Not understanding this behavior can result in incorrect decision making and motion planning with undesirable outcome [18].
Wei et al. [17] proposed a Prediction-and Cost-function Based algorithm that can make the autonomous vehicle consider potential reactions of surrounding objects when making decisions. However, this approach has only been applied and tested in simulated freeway driving scenarios. Due to the computational complexity and potential explosion of the search space, it is difficult to introduce vehicle interaction and intention understanding into the planner with the current architecture of autonomous vehicles [8].
3) Hierarchical Autonomous Vehicle Architecture: To enable autonomous vehicles to finish long-term missions and reduce the workload of motion planning, a hierarchical architecture is widely used [13], [6], [10]. For each layer of the architecture, the input higher-level mission is decomposed into sub-missions and passed on to the next-lower level. Using this framework, many complicated problems becomes solvable. However, this layered architecture can cause some performance problems. Figure 1 (CMUs Boss from the Urban Challenge) shows such a framework [9]. The perception system analyzes realtime data input from laser scanner (Lidar), radar, GPS and other sensors. Mission planning optimizes the path to achieve different checkpoints considering the arrival time, distance, or different required maneuvers. The behavior executive system makes tactical driving decisions about such things as distance keeping, lane changing, and neighboring vehicle interactions. Motion planning generates the desired trajectory of the vehicle considering the dynamic parameters and output steering and throttle pedal commands.
One of the shortcomings of this framework is that the higher-level decision making module usually does not have enough detailed information, and the lower-level layer does not have authority to reevaluate the decision. For instance, when the autonomous vehicle wants to use an oncoming lane to circumvent a slow-moving obstacle, the behavior layer makes the decision. Then it outputs the desired circumvention window and speed that the car should use to perform this behavior without interfering with traffic in the opposing lane. However, the lower-level motion planner may find out it cannot drive at the desired speed because of an upcoming sharp turn, which is not considered in the behavior layers decision making, so it has to slow down to perform the circumvention. However, because of the oncoming car, failing to keep the desired speed will cause the car to miss the circumvention time and spatial window [15].
The other shortcoming is that in this architecture, most information is processed in the motion planner, including road geometry, vehicle dynamics, and surrounding moving objects and static obstacles. This makes the planner problem very complicated and computationally expensive. The planner usually needs to sacrifice performance to meet real-time constraints [20], [8].
4) Parallel Autonomous Vehicle Architecture: There are also many autonomous or semi-autonomous vehicles based on existing Advanced Driving Assist Systems (ADAS). For example, by integrating lane centering and Adaptive Cruise Control (ACC), a basic single lane freeway autonomous driving capability can be achieved [4], [16].
Compared with the hierarchical framework, modules in this system are relatively independent and work in parallel, as shown in Figure 2. There are dedicated sensors and actuation mechanisms for each of the controllers. For example, the lane centering module focuses on using the steering wheel to keep the car in the center of a lane, while the ACC controller controls the throttle and brake to make the car follow its leader. The benefits of this architecture could be: first, the controllers are running at a high frequency and are proven to be safe and stable; second, the controllers have achieved a high level of smoothness and performance; third, computational cost is low because there is no complicated motion planning being used. However, in some complicated cases needing cooperation, this framework may not perform well. For example, when the autonomous vehicle wants to merge into a neighboring lane with slower-moving traffic, it may not be able to properly slow down because the speed controller typically does not know the cars intended lateral movements. In contrast, some motion planner algorithms are able to slow down the vehicle and find the best opening to merge into. As discussed in Section II, there are shortcomings of both the current hierarchical and parallel robot decision making architectures. Due to real-time constraints, the motion planner cannot consider the effects of imperfect vehicle controllers or cooperation between cars. In this paper, we propose a novel behavioral planning framework that combines the strengths of the hierarchical and parallel architectures. It is based on the hierarchical architecture so that fully autonomous driving with high-level intelligence can be achieved. However, it also uses the independent controllers as in the parallel autonomous vehicle architecture to ensure basic performance and driving quality. The design greatly reduces the necessary search space for the motion planner without sacrificing any performance due to coarser granularity. The proposed framework is shown in Figure 3.

A. Mission Planning
The mission planning module takes charge of decomposing driving missions such as "go from point A to point B" into lane-level sub-missions such as which lane we should be in, whether we need to stop at an intersection, etc. It takes a human drivers desired destination and computes the shortest path to it from the robots current position. It outputs a set of future lane-level sub-missions which describe the desired lane and turn of the vehicle at each intersection. In addition, this module also controls the transition of goals. When the vehicle completes the current lane-level sub-mission, it will automatically send the next set of goals to the lower layers. When there are intersections with stop signs, traffic lights or yielding requirements, this module directly talks to the perception system to decide whether the car can proceed to the next sub-mission.

B. Traffic-free Reference Planning
The reference planning layer takes the lane-level submissions, and outputs a path and speed profile for the autonomous vehicle to drive [5]. In this layer, the planner assumes there is no traffic on the road. It uses non-linear optimization to find a smooth and human-like path and speed with consideration of the kinematics and dynamics of the autonomous vehicle, as well as the geometry of the road. For example, if the road is not straight, as shown in Figure  4, instead of driving exactly in the center of the road, a human driver will drive slightly offset from the center line to minimize the required steering maneuver, which is emulated in this module. In addition, traffic rules such as speed limits are applied in this layer.

C. Behavioral Planning
Since the road geometry has been considered in the previous layer, the behavior planner focuses on handling onroad traffic, including moving obstacles and static objects. It takes the traffic-free reference, moving obstacles and all road blockages as input. It outputs controller directives including lateral driving bias, the desired leading vehicle we should follow, the aggressiveness of distance keeping, and maximum speed. In this module, a Prediction-and Cost-function Based (PCB) algorithm is implemented [17]. There are three primary steps: candidate strategy generation, prediction, and cost function-based evaluation, as shown in Figure 5.
In the PCB algorithm, the world is abstracted as shown in Equation 1. V S host is the state vector for the host vehicle and surrounding vehicles, with station (longitudinal distance along the reference path), velocity, acceleration and lateral offset information. V S other is the state vector for each surrounding vehicle. Compared with V S host , it has additional information about the intention and probability of intention. S road is the state vector of the reference path with maximum speed information at each of M stations. S P CB is the state vector for the behavioral planner. (1)

1) Candidate Strategy Generation:
In the candidate decision generation step, a certain number of candidate directives for the controllers are generated. Figure 6 shows the candidate strategies for longitudinal (h des ) and lateral movement (l des ) of the car, respectively, where l des is the desired lateral offset, and h des is the desired time headway, which is related to the distance the host vehicle wants to keep from its leader. By pairing an l des profile with a h des profile, a candidate strategy is created. The following steps will convert each candidate strategy profile into trajectories for the behavioral planner to evaluate.
2) Prediction Engine: All the potential controller directives are sent to the prediction engine, which forwardsimulates the controllers to generate a trajectory for the car to drive. There are two controllers that are forward-simulated in the behavioral planning, the Adaptive Cruise Controller (ACC) and lateral controller. In the prediction engine, each candidate strategy is forward-simulated based on the ACC controller and lateral controller models to get the trajectory of the host vehicle. At every time step, all vehicles behaviors including their reaction to the host vehicles trajectory are also predicted.
a) Forward-Simulate ACC Controller: The ACC controller takes charge of the longitudinal speed control of the vehicle. The state of the ACC controller S ACC is described in Equation 2, in which h des comes from the candidate strategy generated in the first step of the PCB algorithm.
The desired distance can be computed using Equation 3, in which d min is the minimum distance when both the leading and following cars are static.
Other parameters in the ACC controller state can be calculated from V S host , V S other and S road using the following mechanism.
.v end if end for An LQR controller with gain scheduling is developed to generate the acceleration command for the host vehicle. The desired velocity is generated by adding the acceleration command to the last commanded velocity, as shown in Equation 4.
The kinematics of the host vehicle are simplified and modeled as a first-order system with pure time delay. Therefore, the discretized model for simulated speed of the host vehicle is computed using Equation 5.
The delay and parameter of the first-order system are identified from data collected using a real autonomous vehicle, which will be discussed in Section IV-A. Figure 7 shows an example of the forward-simulated ACC controller, which also uses the maximum speed information from S road to constrain the vehicle's maximum speed.
Using this model, for every time step, we can get a prediction of the host vehicle's speed given a certain candidate strategy.
b) Forward-Simulate Vehicle Lateral Controller: The autonomous vehicle lateral controller's rule is to minimize the error between current lateral offset l host and desired lateral offset l desired , which is generated in the candidate strategy generation step. The system's response can be simulated using Equation 6. ∆l = max(−0.5, min(0.5, (l host − l desired ))) l host (t) = l host (t − ∆t lat ) + k lat ∆l∆t lat (6) Figure 8 shows an example of the forward-simulated ACC controller, which also uses the maximum speed information from S road to constrain the vehicle's maximum speed. Although the input candidate strategy is discretized and not smooth, the forward-simulation creates a practical path the car will execute. c) Predict Surrounding Vehicles' Reaction: For every step of the simulation, all the surrounding moving obstacles are also predicted based on the assumptions that they will tend to keep their current speed and be able to perform distance keeping to their leading vehicle. The prediction from S P CB (t) to S P CB (t + ∆t) is described as below: for moving cars V S other i do if (found the closest leader) then Simulate distance keeper Generate desired speed command else (no leading vehicle) Forward-simulate ACC controller Forward-simulate lateral controller By iteratively running the prediction, a series of simulated scenarios of S P CB including V S host and V S other covering the prediction horizon t predict can be generated, as shown in Figure 9.
3) Cost function-based Evaluation: Exhaustive search is used to iterate through all candidate strategies and run .
The cost for each strategy is the sum of the cost of the series of predicted scenarios, as shown in Equation 8.
The detailed descriptions of these cost terms are given in [17], [18].
One of the benefits of using the PCB architecture is that the surrounding vehicles' reactions to the host vehicle's movement are considered. The other benefit is that the vehicle controllers' reactions (including the lateral controller and distance keeping controller) and time delays are simulated. The planner can therefore take these into account in the cost function when it searches for the best strategy.

D. Vehicle Controller Layer
The last layer in the framework consists of multiple vehicle controllers running in the embedded controllers in the vehicle, including an adaptive cruise controller and vehicle lateral controller, as shown in Figure 10. Both of these controllers are modeled and forward-simulated in the behavioral planner layer.
The lateral controller takes the lateral offset output from the behavioral planner as a directive. It can therefore make the car drive along the road with desired lateral offset to perform lane changes or avoid static obstacles.
The adaptive cruise controller (ACC) uses control laws to compute a desired speed command based on the current state of the vehicle, leading vehicle information and control preferences. Then this velocity command is converted to throttle and brake pedal actuation commands by a speed tracking controller. Based on the control preferences h des given by the upper-layer behavioral planner, it can perform aggressive distance keeping that keeps a closer distance to its leader or more conservative behavior that stays further away from its leader and applies milder acceleration.
IV. TESTING PLATFORM To support future development of autonomous vehicle technology, including a more consumer-friendly appearance, human-like smooth vehicle control and intelligence, and advanced sensor fusion based on automotive-grade sensors, a  Figure 11. This autonomous driving research vehicle is a modified 2011 Cadillac SRX. The vehicle has been extensively tested on both a closed test field and public roads. The behavioral planning architecture proposed in this paper is implemented and tested on this platform in both simulation-mode and on the vehicle.

A. On Road Validation
Real vehicle testing has been done to verify the accuracy of the forward simulation in the behavioral planner, as shown in Figure 12. It can be seen that the simulated ACC model matches the actual velocity of the vehicle closely. The system identification of the vehicle's longitudinal and lateral responses to the behavioral planner's directive is shown in Table I.

V. EXPERIMENTAL RESULTS
The framework has been implemented in the autonomous vehicle control software TROCS [9], which also works as a real-time autonomous vehicle simulator. The behavioral planner while planning at a higher level of abstraction, was more computationally efficient in pruning the search space than the previous motion planners [21], [8] that evenly sampled the spatio-temporal space. We compare the behavioral planners  Comparison between the best speed profile found by the behavioral planner and the saptio-temporal sample-based planner performance with a sample-based planner in the spatiotemporal space [21], [17] in two testing cases: path planning to avoid an obstacle on a curve and distance keeping.
1) Path planning: The sample-based planners path candidates usually are described by a polynomial. The heading and curvature for the end points of the trajectory need to be fixed [21] to constrain the polynomial. In cases wherein the autonomous vehicle drives on a curve and needs to perform evasive maneuvers, the constraints of the sample point limited the flexibility of candidate paths, thereby considerably affecting its performance. Instead of applying these arbitrary constraints, the behavioral planner uses lateral offset to direct the controller to avoid obstacles smoothly, as shown in Figure  13.
2) Velocity planning: Figure 14 shows the sample space for the normal planner applying searches on spatio-temporal grids, which covers a wide variety of possible speed profiles for the car to execute. In this scheme, the profiles, though piecewise linear, are non-smooth connections between discretized velocity sample points. Figure 15a shows the search space for the behavioral planner, which samples different candidate time headways. They have enough diversity, but more importantly, they all perform distance keeping to the leading vehicle, which is not true of all of the profiles in Figure 15a. Table II compares the behavioral and spatio-temporal sample-based planners performances in searching for the best speed profile by computing the cost terms described in Section III-C.3 in 1,000 randomly generated traffic scenarios.
It can be seen that even with many fewer samples, the behavioral planners driving quality (the cost of the   best sample) outperforms the spatio-temporal sample-based planners by 2.9%. The overall computational cost is also dramatically reduced, by over 90%. That is because the proposed behavioral planner does not need to search through the huge number of candidate profiles that the other planner does.

VI. SUMMARY
In this paper, we propose a novel behavioral planning framework for fully autonomous vehicles. Using the framework, an autonomous vehicle can achieve functionalities such as lane keeping and lane changing; distance keeping; handling traffic lights, stop signs and yield signs; and avoiding obstacles. The key component of this framework is the behavioral planner, which uses the PCB algorithm to coordinate the ACC controller and lateral controller of the vehicle to perform high-quality distance keeping, lane changing and obstacle avoidance behaviors. For path planning, the behavioral planner does not need to use polynomial paths, as do spatio-temporal sample-based planners. Therefore, it generates much smoother paths in some complicated cases. In speed profile planning, a statistical test shows that the behavioral planner reduces computing time by 90.3% while achieving 5.1% higher quality with respect to smoothness, safety, and fuel consumption costs.
The framework has been fully implemented on an autonomous vehicle platform. Preliminary road testing shows the accuracy of the behavioral planners forward simulation of the autonomous vehicles movement. However, more intensive road testing and statistical analysis need to be performed to fully verify fully the frameworks implementation and performance .