Receding Horizon Control Using Graph Search for Multi-Agent Trajectory Planning

It is hard to find the global optimum of general nonlinear and nonconvex optimization problems in a reasonable time. This article presents a method to transfer the receding horizon control approach, in which nonlinear, nonconvex optimization problems are considered, into graph-search problems. Specifically, systems with symmetries are considered in order to transfer the system dynamics into a finite-state automaton. In contrast to traditional graph-search approaches, where the search continues until the goal vertex is found, the transfer of a receding horizon control approach to graph-search problems presented in this article allows solving them in real time. We prove that the solutions are recursively feasible by restricting the graph search to end in accepting states of the underlying finite-state automaton. The approach is applied to trajectory planning for multiple networked and autonomous vehicles. We evaluate its effectiveness in simulation and in experiments in the Cyber-Physical Mobility Lab, an open-source platform for networked and autonomous vehicles. We show real-time-capable trajectory planning with collision avoidance in experiments with two vehicles on off-the-shelf hardware and MATLAB code.


I. INTRODUCTION
SOLVING nonlinear and nonconvex optimization problems under temporal constraints is hard, if not intractable. There exist several approaches to find good solution candidates for a nonconvex, nonlinear optimization problem. These approaches, too numerous to review in detail, have in common that they simplify the original problem at some point within the optimization algorithm as a compromise between global optimality and computational efficiency [4]. In the following, we focus on approaches that simplify the original optimization problem into a discretized version, i.e., the decision variable is restricted to a finite set of values.
One field of application that deals with nonlinear and nonconvex optimization is trajectory planning for mobile robots, which has been a topic of research for several decades. While the planning problem for a single robot with linear dynamics in a static environment is considered solved, open questions remain when planning trajectories for multiple robots with nonlinear dynamics that interact in changing environments [5], [6]. The problem with oversimplifying the model is that the trajectory-following controller on the real system might not be able to follow the plan generated by the trajectory planner. In the worst case, even though the plan is feasible, the modeling error might drive the system into an undesired state, such as a collision. If trajectories are planned based on a detailed model, the robotic system is more likely to be able to follow the resulting plan. Using detailed models comes at the cost of increased computation time. An abstraction into a discretized system representation, as proposed in this work, can circumvent this problem.
The next paragraphs present this article's contribution, the state of the art, and this article's structure.

A. Contribution
This article presents the combination of a heuristic search on a finite state automaton (FSA) with a receding horizon framework known from model predictive control (MPC), resulting in a method we refer to as receding horizon graph search (RHGS). We prove that our method fulfills recursive feasibility, a desired property in receding horizon control (RHC). We evaluate the method by applying it to the problem of trajectory planning for multiple car-like robots. The trajectories are planned using a high-fidelity dynamic model. The analytically determined steady-state conditions and the transitions between them are translated into a motion primitive automaton (MPA). In contrast to state-of-the-art methods that use graph search to find a path between a start point and an endpoint, we plan trajectories that follow a reference trajectory while considering the system dynamics and satisfying input and collision constraints. The effectiveness of the approach is shown in experiments in the Cyber-Physical Mobility Lab (CPM Lab) [7].

B. Related Work
In the literature, most methods for solving the trajectory planning problem can be categorized into either methods based on continuous optimization problem formulations, such as MPC, or methods based on discretized problem formulations, such as the A* search algorithm or the rapidly exploring random tree (RRT) approach [5]. In the following, we first present works that use an optimization problem formulation based on motion primitives and then examine works that use a graph-based formulation.
The idea of state and/or control quantization has been pursued in control theory research for the last few decades (e.g., [8]). One method of quantization is selecting a finite number of control and state trajectories, also known as motion primitives. Each motion primitive determines a specific behavior of the system's dynamics. Multiple motion primitives with flexible durations can be concatenated into a longer state trajectory of a system [9]. Such systems are classified as mixed logical dynamical (MLD) systems [10].
The sequence and duration of motion primitives that take an MLD system from the starting point to the goal point can be found by solving a hybrid optimization problem. One method to solve a hybrid optimization problem is mixed integer programming (MIP), which combines continuous and discrete variables in one optimization problem. An approach for collision avoidance of networked vehicles, where collision constraints are modeled with binary decision variables, can be found in [11]. While MIP approaches can guarantee optimal solutions, the required computation time is high. The reader can consult [12] for a general review of trajectory planning in an MIP framework. The approach of modeling the system as an MLD system and solving a hybrid optimization problem has been successfully applied to an aeronautic vehicle, where the state space is typically larger than for ground vehicles [9]. In [13], the approach was applied to autonomous ground vehicles with detailed system models for agile maneuvers. In [14], multiple vehicles are coordinated based on motion primitives. Here, the vehicles' goal is to keep a rigid formation. A hierarchy is introduced with a layer of motion coordination primitives on top of motion primitives. These coordination primitives determine the respective motion primitive for each participating vehicle in the formation. In [15], a hierarchical trajectory planner for multiple vehicles is presented. A high-level planner determines a sequence of motion primitives for participating vehicles. The vehicles avoid collisions with one another by concatenating the motion primitives resulting from a hybrid optimization. In [16], multiple vehicles are controlled in centralized and distributed manners using motion primitives. That article presents two formulations of the optimal control problem (OCP): one using a convex approximation and quadratic programming (QP) and one using MIP.
Early work in the area of graph-based trajectory planning for autonomous vehicles does not consider vehicle dynamics [5]. In [17], an A * search algorithm for trajectory planning using motion primitives is applied. The approach includes obstacle avoidance in three dimensions for unmanned aeronautic vehicles.
In trajectory planning tasks performed with graph search, path validations and edge evaluations are computationally demanding [5]. The number of edge evaluations can be reduced by employing a lazy approach. In lazy approaches, the actual cost of an edge is approximated by an easy-to-compute lower bound on the cost. The exact cost, which is computationally expensive, is only determined when the vertex connected with the edge is considered promising. Examples that employ a lazy approach are lazy weighted A* (LWA*) [18], lazy shortest path (LazySP) [19], and lazy receding horizon A* (LRA*) [20]. The name of the algorithm in [20] originates from a receding horizon that separates edges whose true cost is evaluated from edges whose cost is only lazily estimated. This is not to be confused with the receding horizon on the tree depth as proposed in this work. Both approaches can be combined to further decrease the computational effort.
A variant of A* termed Hybrid A* reduces the discretization error that occurs when associating expanded vertices with cell centers of the world grid, as A* does. Instead of associating vertices with the cell centers, continuous state values are associated with the cells alongside their corresponding cost. This means that vertices are discarded if others with better cost estimates are found, so suboptimal solutions can be returned. In [21], Hybrid A* is used for path planning for one vehicle in static, unknown environments and validated in the 2007 DARPA Urban Challenge. Hybrid A* is also used in [22] for trajectory planning for a single vehicle. Trajectories are planned with a simplified vehicle dynamics model for longitudinal and lateral motion. The planning horizon is limited in both time and distance traveled. Simulation results are given for a straight road segment.
Limiting the depth of a graph search algorithm for real-time capability was also explored in [23]. In that work, the trajectory for one aerial vehicle is optimized in a receding horizon fashion. The vehicle moves in a static environment and uses a precomputed cost function to evaluate graph vertices during the search. The approach is evaluated in simulation. In [24], the limited available planning time is split into two parts. In the first part, an algorithm looks for an optimal trajectory from the current pose to the goal pose. If the algorithm does not terminate during the available time, a receding horizon planner is activated as a backup solution to plan long enough for the vehicle to operate safely in the immediate future. To avoid getting stuck in suboptimal local solutions, the cost-to-go function is based on a reduced trajectory planning problem with static obstacles and a low-fidelity maneuver automaton. In [25], multiple resolutions are introduced to the state lattice on which the graph search operates. Trajectory planning gives coarser results at a greater distance from the vehicle but can be accelerated compared to trajectory planning in state lattices with uniform resolution. The algorithm in [26] performs an anytime, iterative two-stage planning process. First, a sequence of motion primitives is selected with a graph search algorithm to find a trajectory that ends in a goal region. Second, the primitive durations are optimized to find a trajectory that ends closer to a goal point.
In [27], a Monte Carlo tree search on quantized control inputs generates cooperative trajectories. The approach is evaluated in simulation and multiple traffic scenarios. In [28] and [29], trajectory planning based on graph search was extended to multiple vehicles. These works do not generate a motion graph for the system but use discretized control inputs as their motion primitives without smooth transitions. The approach is evaluated in simulation on specific traffic scenarios, such as an intersection or two merging lanes.

C. Article Overview
This article is organized as follows. In Section II, we review properties of systems with symmetries and introduce primitives and an FSA based on primitives. In Section III, we propose RHGS based on an FSA and show that it fulfills the recursive feasibility property. In Section IV, the approach is applied to trajectory planning for car-like robots. The section presents the kinematic single-track vehicle model, the MPA, and the trajectory planning framework. In Section V, the approach is evaluated in simulation and experiments.

II. MODELING WITH A FINITE STATE AUTOMATON
Consider the time-invariant nonlinear dynamical control system S described by the ordinary differential equation (ODE)

ẋ(t) = f(x(t), u(t)), x(0) = x₀ (1)

with the state vector x(t) ∈ M, a manifold M that locally looks like Rⁿ, the control input u(t) ∈ Rᵐ for time t ≥ 0, the initial condition x₀, and a vector field f. Let the function ϕ_u : M × R → M represent the state flow and consider a control law u : [0, T] → Rᵐ for a maximal interval of existence t ≤ T. We assume the function f of system (1) to be continuous and locally Lipschitz w.r.t. the state x(t). Then, the existence and uniqueness of the solution

x(t) = ϕ_u(x₀, t) (2)

are guaranteed for t ∈ [0, T]. Note that the flow ϕ_u is not known analytically for nonlinear systems and general choices of u(t) and x₀, so it needs to be approximated numerically. This might make model-based optimization and trajectory planning computationally challenging. However, the computational load can be reduced by exploiting the invariance of the system S w.r.t. a class of transformations on the state.

A. Trajectories for Systems With Symmetry
Symmetries in dynamical systems are important structures that can be exploited for trajectory planning. By introducing symmetry via Lie group actions, we can formally express symmetry as the invariance of the dynamics w.r.t. these group operations [30]. Many mechanical systems exhibit this property, for example, mobile robots, cars, or helicopters [9], [31]. For readers unfamiliar with Lie theory, we recommend [32] and [33].
For the formal definition of symmetry, let G be a Lie group with an identity element. Let a left action of the group G on the state manifold M of the system S described by (1) be a smooth map Φ : G × M → M. We use the expression Φ_g(x) as a shorthand for the operation Φ(g, x) with g ∈ G.
Definition 1 (Symmetry Group): The triple (G, M, Φ) is a symmetry group for the system S if the property

Φ_g(ϕ_u(x₀, t)) = ϕ_u(Φ_g(x₀), t) ∀ g ∈ G, t ∈ [0, T] (3)

holds. If (3) holds, we say that the system S is invariant w.r.t. the group action Φ.
For mechanical systems, in general, the symmetry group is a subgroup of SE(n), which is homeomorphic to Rⁿ × SO(n), i.e., it allows rotations and translations. The special orthogonal group SO(n) can be represented by the set {R ∈ Rⁿˣⁿ : RᵀR = I, det R = 1} such that we can represent SE(n) by a pair (R, x̄), with R ∈ SO(n) and x̄ ∈ Rⁿ. Then, the group action is described by x ↦ Rx + x̄ [31].
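For the planar case n = 2, which is the one relevant for the car-like robots in Section IV, this representation can be sketched in a few lines. The following Python sketch (function names are ours, chosen for illustration) builds an SO(2) rotation, checks the defining properties RᵀR = I and det R = 1, and applies the action x ↦ Rx + x̄:

```python
import math

def rotation(theta):
    """2x2 rotation matrix as a tuple of rows, an element of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def apply_se2(R, t, p):
    """Group action of SE(2) on a point p: p -> R p + t."""
    return (R[0][0]*p[0] + R[0][1]*p[1] + t[0],
            R[1][0]*p[0] + R[1][1]*p[1] + t[1])

def is_special_orthogonal(R, tol=1e-12):
    """Check R^T R = I and det R = 1."""
    rtr = [[sum(R[k][i]*R[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    det = R[0][0]*R[1][1] - R[0][1]*R[1][0]
    ortho = all(abs(rtr[i][j] - (1.0 if i == j else 0.0)) < tol
                for i in range(2) for j in range(2))
    return ortho and abs(det - 1.0) < tol
```

For example, rotating the point (1, 0) by π/2 and translating by (1, 1) yields the point (1, 2).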
To model the system S with an FSA, a particular class of trajectories is fundamental: the steady-state motions, otherwise known as relative equilibria.
Definition 2 (Relative Equilibrium): Let (G, M, Φ) be a symmetry group in the sense of Definition 1, g be the Lie algebra of G with exponential map exp : g → G, and consider a constant control input ū. A flow x(t) = ϕ_ū(x₀, t) is a relative equilibrium of the system (1) if there exists a Lie algebra element ξ ∈ g such that

x(t) = Φ_{exp(ξt)}(x₀) ∀ t ∈ [0, T]. (4)

The relative equilibria constitute the FSA states. In particular, we consider the dynamical system to remain in the FSA states with null duration, i.e., when reaching one relative equilibrium, the system instantaneously has to proceed with a smooth transition to another one. For that, we consider controlled trajectories that start and end on a relative equilibrium, which we denominate transitions.
Definition 3 (Transition): A transition is a finite-time controlled trajectory π : t ∈ [0, T] ↦ (x(t), u(t)) that constitutes a link between two relative equilibria, and it is characterized by the following: 1) a time duration T; 2) a control signal u : [0, T] → Rᵐ; and 3) x(0) and x(T), each belonging to a relative equilibrium characterized by (4). Under these conditions, transitions can be modeled arbitrarily as customized trajectories. Alternatively, they can be computed with optimal control methods, e.g., minimizing the intrinsic energy of the system [34]. Note that, as a special case, we can also define transitions that move along a relative equilibrium. To this aim, assume a relative equilibrium given by ū and ξ and a fixed duration T. Then, by definition, x(t) = Φ_{exp(ξt)}(x₀) with u(t) = ū is a trajectory on [0, T]. These transitions constitute self-loops on states in our FSA.
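As an illustration of Definitions 2 and 3, consider a unicycle, a simpler system than the vehicle model of Section IV but with the same SE(2) symmetry; this is our own minimal example, not taken from the article. Under constant inputs (v, ω), the closed-form flow is a relative equilibrium: flowing from any pose for time t equals composing that pose with the group element obtained by flowing from the identity pose, matching (4):

```python
import math

def unicycle_flow(pose, v, w, t):
    """Closed-form flow of a unicycle (x, y, theta) under constant inputs v, w."""
    x, y, th = pose
    if abs(w) < 1e-12:                       # straight-line limit
        return (x + v*t*math.cos(th), y + v*t*math.sin(th), th)
    return (x + v/w*(math.sin(th + w*t) - math.sin(th)),
            y - v/w*(math.cos(th + w*t) - math.cos(th)),
            th + w*t)

def compose(g, h):
    """SE(2) composition: apply the pose h in the frame of g."""
    x, y, th = g
    hx, hy, hth = h
    return (x + hx*math.cos(th) - hy*math.sin(th),
            y + hx*math.sin(th) + hy*math.cos(th),
            th + hth)
```

Writing g(t) for the flow from the identity pose, the relative-equilibrium property reads unicycle_flow(p, v, ω, t) = compose(p, g(t)), and the self-loop concatenation satisfies g(t₁ + t₂) = compose(g(t₁), g(t₂)).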
Invariance w.r.t. a group action leads us to define equivalence of trajectories, a fundamental concept to allow the concatenation of finite flows. Intuitively, two trajectories are equivalent if they can be exactly superimposed with time translation and the action of a symmetry group.
Fig. 1. Example of two equivalent trajectories. Both can be represented by the same motion primitive.
Fig. 2. Concatenation of trajectories π₁ and π₂ in R², written as π₁π₂, by the action g_c and time translation for t ∈ [T₁, T₁ + T₂].

Definition 4 (Equivalence of Trajectories): Given two trajectories π₁ : t ∈ [0, T₁] ↦ (x₁(t), u₁(t)) and π₂ : t ∈ [0, T₂] ↦ (x₂(t), u₂(t)), they are equivalent if T₁ = T₂ and there exists a group element g ∈ G such that x₂(t) = Φ_g(x₁(t)) and u₂(t) = u₁(t) for all t ∈ [0, T₁].

Definition 5 (Motion Primitives): A class of equivalent trajectories is called a motion primitive.
Primitives can be transitions or relative equilibria. Fig. 1 illustrates an example motion primitive. Note that, if the primitive in Fig. 1 is a transition, then its start and end points lie on relative equilibria. As shown in Definition 4, a trajectory can easily be shifted via the symmetry action Φ_g. Together with the boundary conditions of transitions in Definition 3, this allows various concatenations of transitions. Denote by P(S, G) the set of all primitives π for a system S with a symmetry group (G, M, Φ). Given two primitives π₁ and π₂ whose boundary relative equilibria match up to the group action, we can concatenate π₁ and π₂, written as π₁π₂. If π₁, π₂ ∈ P(S, G) can be concatenated to form π₁π₂, then π₁π₂ ∈ P(S, G); the proof is given in [9]. Fig. 2 illustrates the concatenation of two primitives π₁ and π₂, resulting in the trajectory π₁π₂ by the action g_c and the time translation of π₂ by T₁. By selecting relative equilibria and computing transitions between them, we can build an FSA.
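The concatenation π₁π₂ can be sketched as follows (Python; the sampled-trajectory representation and function names are our own illustrative assumptions). Each primitive is stored once, starting at the identity pose; π₂ is shifted by the group element g_c given by the end pose of π₁ and translated in time by T₁:

```python
import math

def se2_apply(g, pose):
    """Shift a pose by the group element g (both given as (x, y, theta))."""
    gx, gy, gth = g
    x, y, th = pose
    return (gx + x*math.cos(gth) - y*math.sin(gth),
            gy + x*math.sin(gth) + y*math.cos(gth),
            gth + th)

def concatenate(pi1, pi2):
    """Concatenate two primitives given as lists of (t, pose) samples.

    Both primitives are assumed to start at the identity pose at t = 0;
    pi2 is shifted by the end pose of pi1 (the group element g_c) and
    translated in time by the duration of pi1."""
    g_c = pi1[-1][1]
    t_end = pi1[-1][0]
    shifted = [(t_end + t, se2_apply(g_c, p)) for t, p in pi2[1:]]
    return pi1 + shifted
```

For instance, concatenating a one-second straight primitive ending at (1, 0, 0) with a one-second turn primitive ending at (1, 0, π/2) yields a trajectory whose final sample is the pose (2, 0, π/2) at time 2.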

B. Finite State Automaton
Given the needed mathematical tools, now, we can define our FSA, which will be further used in the receding horizon graph search method in Section III.
Definition 6 (Finite State Automaton): An FSA is a five-tuple (Q, S, γ, q₀, Q_f) composed of the following.
1) Q is a finite set of relative equilibria.
2) S is a finite set of transitions.
3) γ : Q × S → Q is the update function defining the switch between two relative equilibria after the execution of a transition.
4) q₀ ∈ Q is the initial relative equilibrium.
5) Q_f ⊆ Q is the set of final relative equilibria.
The FSAs considered in this work are deterministic, i.e., given q ∈ Q and π ∈ S, the result of the update function γ(q, π) can be computed deterministically.
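A minimal encoding of Definition 6 as a Python data structure (an illustrative sketch with names of our choosing; the article's implementation is in MATLAB), with the deterministic update function stored as a dictionary:

```python
from dataclasses import dataclass

@dataclass
class FSA:
    """Deterministic finite state automaton (Q, S, gamma, q0, Qf) of Definition 6."""
    states: set       # Q: finite set of relative equilibria
    transitions: set  # S: finite set of transitions (motion primitives)
    gamma: dict       # update function: (q, pi) -> q'
    q0: object        # initial relative equilibrium
    accepting: set    # Qf: final relative equilibria

    def successors(self, q):
        """All (transition, next state) pairs available in state q."""
        return [(pi, qn) for (qs, pi), qn in self.gamma.items() if qs == q]

# Hypothetical two-state example: standing still ("stop") is accepting.
fsa = FSA(
    states={"stop", "drive"},
    transitions={"wait", "accelerate", "keep", "brake"},
    gamma={("stop", "wait"): "stop", ("stop", "accelerate"): "drive",
           ("drive", "keep"): "drive", ("drive", "brake"): "stop"},
    q0="stop",
    accepting={"stop"},
)
```

Determinism holds by construction: each key (q, π) maps to exactly one successor state.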

C. Discrete System
For a set of transitions of the invariant system (1), the update function γ defines a discretization of the state space. In the specific case when all transitions share the same duration T_s, we obtain the discrete-time nonlinear time-invariant system

x(k + 1) = f_d(x(k), u(k)) (5)

with k ∈ N, the vector field f_d : M × Rᵐ → M, the state vector x ∈ M, and the input vector u ∈ Rᵐ. The product k · T_s represents the corresponding continuous time. The input u_k(t) for t ∈ [0, T_s] is given by the applied primitive π(k). Note that the input signal u : [0, T_s] → Rᵐ of each transition is precomputed offline.

III. RECEDING HORIZON CONTROL USING GRAPH SEARCH
This section presents the combination of graph search with an RHC concept. Particularly, the recursive feasibility property known from RHC is transferred to the graph search algorithm.

A. Problem Statement
Consider the problem of controlling the discrete-time nonlinear time-invariant system (5), subject to the constraints

x(k) ∈ X, u(k) ∈ U ∀ k ∈ N (6)

where X ⊆ Rⁿ and U ⊆ Rᵐ. Assume that a full measurement or estimate of the state x(k) is available at the current time k. We define the cost function

J(x(k), U_{k→k+N|k}) = Σ_{i=0}^{N} ( p(x_{k+i|k}) + q(u_{k+i|k}) ) (7)

where N is the horizon length of the optimal control problem (OCP), and p and q are penalty functions for the state and the input.
The corresponding OCP is formulated as

min_{U_{k→k+N|k}} J(x(k), U_{k→k+N|k}) (8a)
s.t. x_{k+i+1|k} = f_d(x_{k+i|k}, u_{k+i|k}), i = 0, …, N − 1 (8b)
u_{k+i|k} ∈ U, i = 0, …, N (8c)
x_{k+i|k} ∈ X, i = 0, …, N (8d)
x_{k+N|k} ∈ X_f. (8e)

Here, U_{k→k+N|k} is the vector of stacked control inputs (u_{k|k}, u_{k+1|k}, …, u_{k+N|k}).
The OCP is solved repeatedly after a timestep of duration T_s with updated values for the states and constraints, which establishes the RHC. We choose the timestep duration equal to the motion primitive duration, T_s = T, such that exactly one motion primitive is executed between consecutive values of x and u. In general, the timestep duration could be any multiple of the motion primitive duration, T_s = ℓ · T with ℓ ∈ N.

B. Recursive Feasibility
A set C ⊆ X is a control invariant set for the system (5) subject to the constraints (6) if

x(k) ∈ C ⟹ ∃ u(k) ∈ U such that f_d(x(k), u(k)) ∈ C, ∀ k ∈ N.

Lemma 1: If X_f is a control invariant set of the system (5) subject to the constraints of (8) with N > 1, then the terminal constraint (8e) ensures recursive feasibility of the RHC.
Proof: The proof is given in [35]. In a representation of the system as an FSA, the transitions determine the control input and the change of the state variables of the system (5). The available transitions and initial state variables depend on the state. We can reformulate the condition of control invariant sets for FSAs as follows.
Definition 7 (Control Invariant Set for an FSA): A set C ⊆ X is a control invariant set for the system (5) represented by an FSA if, for every x(k) ∈ C, there exists a transition π ∈ S applicable in the corresponding relative equilibrium q such that the successor state satisfies f_d(x(k), u_π) ∈ C, where u_π is the control input of the transition π. Note that the relative equilibrium q follows from the state vector x.
Theorem 1: RHGS achieves recursive feasibility if the generated sequence of transitions ends in an automaton state and a system state that together form a control invariant set.
Proof: It follows directly from Lemma 1 with Definition 7 of control invariant sets for FSAs.
An especially simple case arises if the generated sequence of transitions ends in an automaton state from which there exists a transition that keeps the state vector x(k) at an equilibrium of the system. In an equilibrium of the system, it holds that x(k + 1) = f_d(x(k), u(k)) = x(k).

C. Graph Search Loop
This subsection presents the RHGS approach to solve the RHC problem (8). Mapping the RHC problem to graph search based on an FSA, we have the following correspondences. The motion primitive duration T is set equal to the duration of a timestep T s , so exactly one motion primitive is executed between consecutive states and control inputs at time k + i and k + i + 1. The FSA incorporates the constraints on system dynamics (8b), control inputs (8c), and the part of the state vector that is invariant w.r.t. group actions [see (8d) and (8e)]. The graph search incorporates the part of the state vector that is affected by group actions [see (8d) and (8e)], and the cost function (8a). In the OCP (8), the input u(k + i ) is kept constant during every timestep. In the transitions of the FSA, the control inputs can change over the duration of a primitive, as explained in Section II-C.
Generally, the FSA represents the discrete system model and, therefore, the discrete system trajectories that can be followed when performing a graph search. In order to explore the search space, a search tree T is constructed. A vertex τ in the tree consists of the tuple (q, x, i, J), whose elements are the corresponding relative equilibrium q ∈ Q of the automaton, the location x ∈ X in the search space, the level i, and the value J of the cost function, respectively. Note that a level i in the tree directly corresponds to the timestep i in the OCP (8). The relative equilibrium q of a tree vertex τ together with the update function γ determines the possible successors.
Algorithm 1 shows the main steps of an informed search algorithm loop. An informed search is characterized by an estimate of the remaining cost-to-go (CTG), which supports more efficient searches. The algorithm ensures optimality of the returned path with an admissible, underestimating cost estimate, similar to A*. In contrast to A*, the algorithm does not keep track of a closed list. Due to the limited planning horizon, the algorithm still does not get stuck in endless loops. The following paragraphs explain the algorithm.
The algorithm initializes the open list with the root vertex τ 0 (Line 1). The root vertex contains the current state x(k) of the system at the start of each iteration, which determines the relative equilibrium of the FSA and the initial location in the search space. We assume that evaluating edges to a vertex can be separated into calculating the cost function (8a) value and checking for constraints (8d). Furthermore, we assume that checking for constraints is computationally more demanding. The constraints are checked only on vertices selected for further inspection (Line 6). If the vertex does not satisfy all constraints, the algorithm skips to the next vertex in the open list.
If the vertex satisfies all constraints, the algorithm checks if it has found the optimum. The condition that must be met is that the lowest-cost vertex that fulfills the constraint (8e) is found at the planning horizon N (Line 9). This results in a limited planning horizon, which reduces computational effort. The algorithm consequently expands the root node only up to a tree depth N, and the path to the lowest-cost vertex at depth N is returned (Line 10).
If the selected vertex is not optimal, the algorithm generates successors based on the relative equilibrium q, the update function γ, and the state x of the parent vertex (Line 12). When vertices are expanded, the possible transitions and, therefore, successors are limited such that an equilibrium state is reachable within the horizon N. The distance between two vertices in a graph is the number of edges in the shortest connecting path. Let i_τs be the distance from the root vertex of the search tree to the successors of the expanded vertex and d_FSA : Q × Q → N be the distance between two states in the FSA. Then, only vertices are returned as successors for which it holds that

i_τs + d_FSA(q_τs, q_τeq) ≤ N

where q_τs is the state of the successor vertex and q_τeq is the equilibrium state closest to the successor vertex. This ensures recursive feasibility. For each successor vertex, a lazy cost estimate is computed. In informed search algorithms, the cost of a vertex consists of two parts: the cost-to-come (CTC) g : τ ∈ T → R, which is the cost of the path to the current vertex, and the CTG, estimated by the heuristic h : τ ∈ T → R. The cost chosen here determines the functional that is minimized by the graph search, which corresponds to (8a). In applications such as robotic trajectory planning, computing the CTC, also known as evaluating the edges leading to a vertex, includes checking for collision constraints, which has a high computational cost [5]. The number of costly edge evaluations can be reduced if a lazy estimate of the cost is computed here and only substituted with the exact cost when necessary [18], [19], [20].
Adding the successor vertices to the open list wraps up the graph search loop (Line 15).
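The loop described above can be sketched as follows (Python; the automaton, costs, and function names are our own illustrative assumptions, and the lazy split between cost estimate and exact cost is reduced to a single deferred feasibility check). The search repeatedly expands the lowest-cost vertex, prunes successors that cannot reach an accepting (equilibrium) state within the horizon N, and returns the primitive sequence of the lowest-cost vertex at depth N:

```python
import heapq
from itertools import count

def d_fsa(gamma, q, accepting):
    """BFS distance (number of transitions) from q to the nearest accepting state."""
    frontier, seen, d = [q], {q}, 0
    while frontier:
        if any(s in accepting for s in frontier):
            return d
        nxt = {qn for s in frontier for (qs, pi), qn in gamma.items() if qs == s}
        frontier = list(nxt - seen)
        seen |= nxt
        d += 1
    return float("inf")

def rhgs(q0, x0, gamma, accepting, step, stage_cost, heuristic, feasible, N):
    """Receding horizon graph search: best-first search on the FSA to depth N.

    With a zero (or admissible) heuristic, the first vertex popped at depth N
    has minimal cost among all feasible depth-N vertices."""
    tie = count()  # tie-breaker so the heap never compares non-numeric entries
    open_list = [(heuristic(x0, 0), next(tie), q0, x0, 0, 0.0, [])]
    while open_list:
        _, _, q, x, i, g, path = heapq.heappop(open_list)
        if not feasible(x):       # expensive constraint check, deferred lazily
            continue
        if i == N:                # lowest-cost feasible vertex at the horizon
            return path
        for (qs, pi), qn in gamma.items():
            if qs != q:
                continue
            # recursive feasibility: an equilibrium must be reachable within N
            if i + 1 + d_fsa(gamma, qn, accepting) > N:
                continue
            xn = step(x, pi)
            gn = g + stage_cost(xn, pi)
            heapq.heappush(open_list, (gn + heuristic(xn, i + 1), next(tie),
                                       qn, xn, i + 1, gn, path + [pi]))
    return None                   # no feasible primitive sequence exists

# Illustrative two-state automaton: "stop" is the accepting equilibrium state.
gamma = {("stop", "wait"): "stop", ("stop", "accelerate"): "drive",
         ("drive", "keep"): "drive", ("drive", "brake"): "stop"}
speed = {"wait": 0.0, "accelerate": 0.5, "keep": 1.0, "brake": 0.5}
stage = {"wait": 1.0, "accelerate": 0.5, "keep": 0.0, "brake": 0.5}
plan = rhgs("stop", 0.0, gamma, {"stop"},
            step=lambda x, pi: x + speed[pi],
            stage_cost=lambda xn, pi: stage[pi],
            heuristic=lambda x, i: 0.0,
            feasible=lambda x: True, N=3)
```

For this toy automaton, the returned plan is the cheapest length-3 primitive sequence whose final automaton state is accepting.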
Example 1: We illustrate the RHGS with an example. Fig. 3(a) shows an example FSA consisting of two states. State 2 is the current state of the system, and State 1 is an accepting state, i.e., an equilibrium of the underlying dynamical system. The states are unrolled over time to show that possible transitions change over the horizon. There exist transitions between the states and transitions that loop back to a state. Fig. 3(b) shows an exemplary search tree resulting from a receding horizon graph search with horizon length N = 3. Vertices are numbered consecutively with their index, and a subscript indicates the corresponding state of the FSA.

IV. TRAJECTORY PLANNING FOR CAR-LIKE ROBOTS
This section applies the general approach of recursively feasible RHGS on systems with symmetry to networked trajectory planning for car-like robots, which we also call vehicles in the following. In Section IV-A, the system model is shown to exhibit the symmetry property. In Section IV-B, the FSA for multiple car-like robots, which we call MPA, is introduced. In Section IV-C, the RHGS algorithm is adjusted to the trajectory planning task.
Fig. 3. (a) Example FSA. (b) Search tree; a vertex's automaton state is indicated by the number in superscript.

Algorithm 1 Receding Horizon Graph Search
For this section, let I and 0 denote, respectively, the identity matrix and the null matrix (or vector) with the appropriate dimension.

A. Kinematic Single-Track Model
The nonlinear kinematic single-track model according to [36] is more precise than the linear point-mass model (see [37]) and is suitable for small velocities. Fig. 4 shows a general overview of the vehicle model's variables and their relations. Following [38], we assume that no slip occurs on the front and rear wheels and that no forces act on the vehicle. The resulting equations are

ẋ = v cos(ψ + β)
ẏ = v sin(ψ + β)
ψ̇ = (v/L) cos(β) tan(δ) (12)
δ̇ = u_δ
v̇ = u_v

with

β = arctan((r/L) tan(δ))

where x ∈ R and y ∈ R describe the position of the center of gravity (CG), ψ ∈ [0, 2π) is the orientation, β ∈ [0, 2π) is the side slip angle, δ ∈ [0, 2π) and u_δ ∈ R are the steering angle and its derivative, respectively, v ∈ R and u_v ∈ R are the velocity and the acceleration, respectively, L is the wheelbase length, and r is the length from the rear axle to the CG. The inputs are chosen to be the change in the steering angle u_δ and the acceleration u_v. For the sake of simplicity, the input effects are assumed to influence the dynamics instantly, without delay.
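A numerical sketch of this model in Python (the equations follow the standard kinematic single-track formulation; the parameter values for L and r are placeholders of our choosing, not the article's). The flow ϕ_u is approximated with a classical Runge-Kutta step:

```python
import math

def single_track_ode(state, u, L=0.15, r=0.075):
    """Kinematic single-track model: state (x, y, psi, delta, v), input (u_delta, u_v)."""
    x, y, psi, delta, v = state
    u_delta, u_v = u
    beta = math.atan(r * math.tan(delta) / L)  # side slip angle at the CG
    return (v * math.cos(psi + beta),
            v * math.sin(psi + beta),
            v * math.cos(beta) * math.tan(delta) / L,
            u_delta,
            u_v)

def rk4_step(f, state, u, dt):
    """One classical Runge-Kutta step; approximates the flow over dt."""
    k1 = f(state, u)
    k2 = f(tuple(s + dt/2*k for s, k in zip(state, k1)), u)
    k3 = f(tuple(s + dt/2*k for s, k in zip(state, k2)), u)
    k4 = f(tuple(s + dt*k for s, k in zip(state, k3)), u)
    return tuple(s + dt/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))
```

With zero steering and zero acceleration, the model reduces to straight-line motion at constant speed, which is a useful sanity check of an implementation.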
The nonlinear dynamical control system S defined in (12) is compactly written as (1) with the state vector x = (x, y, ψ, δ, v)ᵀ, the control input u = (u_δ, u_v)ᵀ, and the vector field f in terms of (12). In the following, we refer to the pose of the system, described by the first three components of the state vector, as p = (x, y, ψ)ᵀ. The results can be easily extended to the complete state vector x, as the other components are decoupled from the first three and are not affected by the following transformations. For this system, the symmetry group G is a subgroup of SE(3), i.e., the class of combined rotations about a vertical axis and translations. It can be nicely represented by matrix multiplication when switching to homogeneous coordinates:

g(p̄) = [ R(ψ̄) (x̄, ȳ)ᵀ ; 0 1 ]

where R(ψ̄) = [ cos ψ̄ −sin ψ̄ ; sin ψ̄ cos ψ̄ ] is a rotation by an angle ψ̄ and (x̄, ȳ)ᵀ is a translation. Henceforth, in this article, consider the notation g := g(p̄).
Proposition 1: The group action Φ_g for the kinematic single-track model (12) acts on the pose p by the rotation and translation g(p̄), while leaving the remaining state components δ and v unchanged.
Proof: The relation (3) can be proven in terms of the equivariance of the vector field f. The vector field f is equivariant w.r.t. the symmetry action Φ if [31]

f(Φ_g(x), u) = dΦ_g(f(x, u)).

Note that d/dt (ψ̄ + ψ) = ψ̇. Then, from f(x, u) given by (12), the translational velocities (ẋ, ẏ) are rotated by R(ψ̄) while ψ̇, δ̇, and v̇ remain unchanged, which satisfies the equivariance condition and proves the invariance of the system.
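The invariance property (3) can also be checked numerically (a self-contained Python sketch with parameter values of our own choosing): simulating the single-track model from x₀ and then applying the group action gives the same state as first applying the group action and then simulating, since the action is affine in the pose and the vector field is equivariant:

```python
import math

def f(state, u, L=0.15, r=0.075):
    """Kinematic single-track vector field; state (x, y, psi, delta, v)."""
    x, y, psi, delta, v = state
    beta = math.atan(r * math.tan(delta) / L)
    return (v*math.cos(psi + beta), v*math.sin(psi + beta),
            v*math.cos(beta)*math.tan(delta)/L, u[0], u[1])

def simulate(state, u, dt, n):
    """RK4 rollout of the model under a constant input for n steps."""
    for _ in range(n):
        k1 = f(state, u)
        k2 = f(tuple(s + dt/2*k for s, k in zip(state, k1)), u)
        k3 = f(tuple(s + dt/2*k for s, k in zip(state, k2)), u)
        k4 = f(tuple(s + dt*k for s, k in zip(state, k3)), u)
        state = tuple(s + dt/6*(a + 2*b + 2*c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def group_action(g, state):
    """Phi_g: rotate and translate the pose, leave delta and v unchanged."""
    gx, gy, gpsi = g
    x, y, psi, delta, v = state
    return (gx + x*math.cos(gpsi) - y*math.sin(gpsi),
            gy + x*math.sin(gpsi) + y*math.cos(gpsi),
            gpsi + psi, delta, v)
```

Equality holds up to numerical integration round-off, because the Runge-Kutta update commutes with any affine group action under which the vector field is equivariant.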

B. Motion Primitive Automaton
We apply the concept of modeling the system dynamics as an FSA following the approach presented in Section II to car-like robots. We refer to this specific automaton as an MPA.
An MPA consists of two main types of components, the relative equilibria q ∈ Q and the motion primitives π ∈ S with duration T s . The motion primitives can be precomputed by solving an optimization problem. Our objective function is the squared change of the input. The constraints are: 1) each motion primitive must start and end on a relative equilibrium and 2) the system dynamics (12) and the limits for states and inputs must be respected.
If vehicles plan their trajectories cooperatively, they need a representation of the dynamical system consisting of all cooperating vehicles. We obtain the MPA of all cooperating vehicles by synchronous side-by-side composition [39]. Suppose that two MPAs A and B are given by the five-tuples A = (Q_A, S_A, γ_A, q_{0,A}, Q_{f,A}) and B = (Q_B, S_B, γ_B, q_{0,B}, Q_{f,B}). The synchronous side-by-side composition C is given by

C = (Q_A × Q_B, S_A × S_B, γ_C, (q_{0,A}, q_{0,B}), Q_{f,A} × Q_{f,B})

and the update function is given by

γ_C((q_A, q_B), (s_A, s_B)) = (γ_A(q_A, s_A), γ_B(q_B, s_B))

∀ q_A ∈ Q_A, q_B ∈ Q_B, s_A ∈ S_A, and s_B ∈ S_B. Fig. 5 illustrates an example of a synchronous side-by-side composition of two MPAs. During trajectory planning, we exploit the symmetry of the system and shift the precomputed motion primitives to the current state of the system. Trajectory planning includes collision checks when a transition is taken. Since the underlying trajectory carries information about time, the transitions that connect relative equilibria need to share the same duration T_s. Then, the times for collision checks between the primitives of two vehicles match.
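The composition can be sketched as follows (Python; the dictionary-based MPA encoding is our own illustrative assumption). Every composed transition moves both vehicles simultaneously, one primitive each per timestep:

```python
def compose_mpas(A, B):
    """Synchronous side-by-side composition of two MPAs.

    Each MPA is a dict with keys 'Q', 'S', 'gamma', 'q0', 'Qf', where
    'gamma' maps (q, s) -> q'."""
    gamma = {((qa, qb), (sa, sb)): (qa2, qb2)
             for (qa, sa), qa2 in A["gamma"].items()
             for (qb, sb), qb2 in B["gamma"].items()}
    return {
        "Q": {(qa, qb) for qa in A["Q"] for qb in B["Q"]},
        "S": {(sa, sb) for sa in A["S"] for sb in B["S"]},
        "gamma": gamma,
        "q0": (A["q0"], B["q0"]),
        "Qf": {(qa, qb) for qa in A["Qf"] for qb in B["Qf"]},
    }

# Hypothetical single-vehicle MPA with two relative equilibria.
A = {"Q": {"stop", "drive"},
     "S": {"wait", "accelerate", "keep", "brake"},
     "gamma": {("stop", "wait"): "stop", ("stop", "accelerate"): "drive",
               ("drive", "keep"): "drive", ("drive", "brake"): "stop"},
     "q0": "stop", "Qf": {"stop"}}
C = compose_mpas(A, A)  # two identical vehicles planned jointly
```

The composed state and transition sets grow as the Cartesian product, which illustrates why centralized planning for many vehicles becomes expensive.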

C. Trajectory Planning
The goal of trajectory planning in this article is the search for a trajectory that: 1) keeps a vehicle as close as possible to a reference trajectory; 2) while avoiding obstacles; and 3) under consideration of the vehicle dynamics [e.g., (12)]. The first subgoal is addressed by minimizing a suitable cost function, the second subgoal is considered during edge evaluation, and the last subgoal is achieved by the construction of the MPA. This section presents the specific implementation of the RHGS, as shown in Line 1 in Section III.
During graph search, a search tree is constructed from a root vertex τ 0 . The information contained in each vertex τ of the search tree T is a tuple (q, p, i, J ), whose elements are the corresponding state of the automaton, the vehicle pose, the distance to the root vertex, and the cost function, respectively.
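The vertex tuple (q, p, i, J) and the lowest-cost-first expansion order can be sketched with a priority queue. This is an illustrative Python sketch, not the authors' MATLAB implementation; the field values are made up.

```python
import heapq

# Vertex as a tuple ordered by cost J: (J, depth i, automaton state q, pose p).
open_list = []
heapq.heappush(open_list, (0.0, 0, "q0", (0.0, 0.0, 0.0)))   # root vertex tau_0
heapq.heappush(open_list, (2.5, 1, "q1", (0.10, 0.00, 0.0)))
heapq.heappush(open_list, (1.0, 1, "q2", (0.05, 0.02, 0.1)))

# The vertex with the lowest cost is expanded first:
J, i, q, p = heapq.heappop(open_list)
assert (J, q) == (0.0, "q0")
```

Ordering the tuple with the cost first lets the heap comparison fall back to depth and state only on ties, a common trick for best-first searches.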
At the beginning of the control loop, the vehicle's states are measured and predicted using the vehicle model until the time when the current control input takes effect. The speed v and the steering angle δ determine the state q. The root vertex τ 0 of the search tree is initialized with this state and the pose p.
During the graph search iterations, the vertex with the lowest cost is expanded. The cost of a vertex consists of the CTC and the CTG. The cost function to be minimized in the given trajectory planning problem is
J = Σ_{i=1}^{N} (x_{k+i|k} − x_ref,k+i|k)^T Q (x_{k+i|k} − x_ref,k+i|k) (31)
with n_veh as the number of vehicles, x ∈ R^{5 n_veh} as the state vector, Q ∈ R^{5 n_veh × 5 n_veh} as a positive semidefinite, block-diagonal matrix with n_veh times repeated entries
q_q = [ I_2 0 ; 0 0 ] (32)
and x_ref,k+i|k ∈ R^{5 n_veh} as the reference trajectory sampled at the timesteps k + i in the horizon. The trajectory planner generates the reference trajectory from a reference path and a reference speed as follows. First, the planner determines the projection of the vehicle's position onto the reference path, i.e., the point on the path closest to the vehicle's position. Second, from this projection point, the planner samples reference points along the reference path using linear interpolation, spaced by the distance traveled at the reference speed during one timestep. These sampled points correspond to the times k + i in the horizon and form the reference trajectory. In urban environments, the reference path could be retrieved from the lane that a vehicle should follow, and the reference speed could be retrieved from the speed limit.
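The structure of the cost (31)-(32), where only the position entries of each vehicle's five-dimensional state are weighted, can be checked with a small sketch. The state layout (x, y, ψ, v, δ) per vehicle and the sample values are assumptions for illustration.

```python
# Illustrative evaluation of (31) with Q = blkdiag(q_q, ..., q_q) and
# q_q = blkdiag(I_2, 0): only the first two (position) entries of every
# 5-dimensional vehicle state incur cost.
def objective(xs, xs_ref):
    J = 0.0
    for x, x_ref in zip(xs, xs_ref):          # sum over the horizon
        for j, (a, b) in enumerate(zip(x, x_ref)):
            if j % 5 < 2:                     # position entries x, y of each vehicle
                J += (a - b) ** 2
    return J

# One vehicle, horizon of two steps: deviations in psi, v, delta are free of cost.
xs     = [[1.0, 0.0, 0.3, 0.7, 0.1], [2.0, 1.0, 0.0, 0.7, 0.0]]
xs_ref = [[1.0, 0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0, 0.0]]
assert objective(xs, xs_ref) == 1.0           # only the y-error at step 2 counts
```

Weighting only positions keeps the planner focused on path tracking while the MPA already guarantees dynamically feasible speed and steering profiles.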
Trajectory planning for ground vehicles using graph search belongs to a class of graph-search problems with expensive-to-evaluate edges [5]. Computationally most demanding are collision checks, either between two vehicles or between a vehicle and an obstacle. Therefore, we employ a lazy approach as inspired by Dellin and Srinivasa [19].
The lazy edge evaluation corresponds to an evaluation of the cost function (31) without collision checks. In RHGS, the cost at a vertex comprises two parts. The first part is the CTC. Given a vertex τ at depth i_τ, the CTC is the sum of the squared Euclidean distances from the positions along the path to the root to the reference positions at the times k + i, i = 1, . . . , i_τ. The second part of the cost is the CTG, which must underestimate the remaining cost from a vertex in the search tree. In our case of trajectory planning with a reference, the CTG is computed from the distances of every future position to the corresponding reference position, assuming that the vehicle would move toward the reference position with maximum velocity.

The exact edge evaluation requires collision checks on the generated trajectories with both other vehicles and obstacles. A collision between two vehicles occurs if the areas occupied by each vehicle during a simultaneously executed motion primitive overlap. In order to check this condition, vehicles are approximated with rectangles of length L and width W. Furthermore, the occupied area during a motion primitive is approximated using a polygon, as shown in Fig. 6. On the left, the occupied space during a primitive of straight motion is depicted, where the polygonal approximation does not introduce an error. On the right, the occupied space during a primitive of circular motion is depicted. The five-sided polygon overapproximates the actual space occupied but simplifies checking the occupied areas of multiple vehicles for overlaps. Intervehicle collisions and obstacle collisions can be identified efficiently using the separating axis theorem [4]. Both vehicles and obstacles are approximated using convex polygons, so a separating axis exists iff an axis through one of the polygon sides represents a separating axis. Consequently, we iterate over all sides of the polygons of the vehicles or obstacles in question until we find a separating axis.
An axis separates two polygons if the projection of polygon corner points onto a line perpendicular to the axis results in the projected corner points of one polygon being on the left and the others being on the right of the axis.
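The separating axis test described above can be sketched compactly: for every polygon side, project both polygons onto the side's normal and check whether the projection intervals are disjoint. This is a generic illustrative implementation, not the authors' code; the example rectangles are made up.

```python
# Separating axis theorem for convex polygons given as CCW corner-point lists.
def project(poly, axis):
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def overlap(poly_a, poly_b):
    """True iff the convex polygons overlap: it suffices to test the axes
    normal to every side of either polygon."""
    for poly in (poly_a, poly_b):
        n = len(poly)
        for i in range(n):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
            axis = (-(y2 - y1), x2 - x1)        # normal to the side
            lo_a, hi_a = project(poly_a, axis)
            lo_b, hi_b = project(poly_b, axis)
            if hi_a < lo_b or hi_b < lo_a:      # disjoint projections
                return False                     # -> separating axis found
    return True

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
far    = [(2, 2), (3, 2), (3, 3), (2, 3)]
near   = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
assert not overlap(square, far)
assert overlap(square, near)
```

Since both vehicle footprints and obstacle shapes are convex polygons, this test is exact for the approximations used, and it returns early as soon as one separating axis is found.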
As explained in Section III, the proposed method returns the path to the cheapest vertex at the desired depth N.
Fig. 6. Occupied space in green during a primitive of straight motion (left) and circular motion (right) with polygonal approximation in orange. The vehicle geometry is approximated by a rectangle.
Denote the set of successors of a state q_i with Q_i and the cardinality of the set of states |Q| with N_q.

V. EVALUATION
This section details the simulation and experiment setup with which we validate our approach. The section first highlights the importance of the recursive feasibility property in simulation. Then, the effect of the finite planning horizon is analyzed. Finally, we run the algorithm in an experiment in the CPM Lab [7]. It is a 1:18 model-scale, open-source platform for networked and autonomous vehicles. A video of the simulations and experiments is available online [1].
Our software runs in MATLAB R2022a on a PC with an AMD Ryzen 5 3600 4.2-GHz hexacore CPU and 16-GB RAM on Ubuntu 18.04. It is available online [2], [3].
We assume that our trajectory planner has knowledge about the future trajectories of obstacles for its prediction horizon. All vehicles in the evaluation are controlled by one centralized controller. Table I lists the parameters used in the evaluation.

A. Comparison With a Standard Graph Search
As a baseline, a standard graph search (SGS) plans the trajectory from start to finish, which corresponds to an infinite horizon. In practice, it plans with a shrinking horizon of N_exp − k, where N_exp is the number of timesteps that the experiment is running. The planners compute the trajectory with the MPA shown in Fig. 7, which consists of four states with two speeds and three steering angles.
We test the trajectory planner in a scenario where it needs to find a trajectory that keeps the vehicle close to the reference trajectory, which is a line from left to right. The planner needs to avoid obstacles that travel from the top downward and from the bottom upward with a speed of v obs = 1 m/s. Fig. 8 shows the resulting traveled and planned trajectories at four key timesteps. Both planners find a trajectory from left to right, which avoids the dynamic obstacles. With the limited horizon, the RHGS planner needs to stop in front of the moving obstacles before it can find a way past them. Using its infinite horizon, the SGS can avoid the obstacles faster than the RHGS and can, therefore, stay closer to the reference trajectory.
We compare the quality of the traveled trajectory from our RHGS planner with the SGS planner using the objective function value (31). For this, we insert the actual state trajectory x(k) and the reference trajectory x_k|0 for k = 0, . . . , N_exp in (31). The objective function value of our RHGS is approximately five times higher than that of the SGS. Fig. 9 compares the computation time on a logarithmic x-axis. The maximum computation time of our RHGS is around 175 times lower than that of the SGS, while the median computation time is around 3.5 times lower. The maximum computation time of 3 s of the SGS is 7.5 times our sampling time T_s, which renders this baseline method unsuitable for real-time trajectory planning even for a single vehicle. In contrast, our method is at worst still 23 times faster than our sampling time.
Fig. 8. Comparison of planned trajectories from RHGS (red) and SGS (blue). The planned trajectories avoid moving obstacles to follow a straight reference trajectory from left to right. Obstacles on the left move upward, while obstacles on the right move downward.
Fig. 9. Comparison of the maximum and median computation time of RHGS and SGS for the scenario shown in Fig. 8.

B. Recursive Feasibility
As Section V-A demonstrated, transferring the idea of RHC to graph search improves computation time. In RHC approaches, feasibility at the initial time does not imply feasibility for all future times [35]. It is desirable in RHC to guarantee feasibility for all future times. We refer to this property as recursive feasibility. We achieve this property with our design of the MPA and demonstrate its importance in RHGS in simulation.
The simulations use two simple MPAs with only three speeds. The MPAs and their transitions over a horizon of N = 2 are shown in Fig. 10(a) and (b). The figures showcase the effect of restricting the transitions such that the automaton ends in an equilibrium state at the end of the horizon. Note that, with the restriction in Fig. 10(b), the state with the highest speed v = 1.5 m/s cannot be reached.
Both MPAs are evaluated in the scenario shown in Fig. 11(a). The scenario is chosen such that it illustrates the possible unsafe behavior of Fig. 10(a), even though a full stop can technically be reached during the horizon. In the worst case, the vehicle has a high speed very close to the obstacle. This is the case if the planner chooses the state sequence "3, 2" at time k, which results in the vehicle being right in front of the obstacle at the prediction horizon. Then, at time k + 1, a remaining distance d_o to the obstacle is left. In order to avoid the collision, the vehicle needs to stop before traveling the distance d_o. Since the input u_v corresponds to the acceleration and is constant per timestep, according to the model (12), the distance traveled until timestep i of the horizon is
d(k + i) = d(k) + Σ_{j=0}^{i−1} ( v(k + j) T_s + (1/2) u_v T_s² ). (38)
According to the MPA in Fig. 10(a), the minimum acceleration that reaches the speed v(k + N) = 0 from the maximum velocity v_max over the horizon of N steps is
u_v = −v_max / (N T_s). (39)
By inserting (39) into (38), the distance traveled until stop becomes
d(k + N) − d(k) = (1/2) v_max N T_s. (40)
For collision avoidance, the distance traveled over the horizon d(k + N) − d(k) must be smaller than the remaining distance d_o
(1/2) v_max N T_s < d_o. (41)
Fig. 11(b) shows the simulated trajectory of the distance to the obstacle in x-direction for both MPAs. Without enforcing the final state to be in equilibrium, the planner maneuvers the vehicle in such a way that it cannot find a feasible solution at t = 1.2 s. On the other hand, when the final state must be an equilibrium, the vehicle is able to stop in front of the obstacle.
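A quick numerical check of the stopping distance under piecewise-constant deceleration confirms the closed-form value (1/2) v_max N T_s. The parameter values below are illustrative, not the scenario's.

```python
# Braking from v_max to 0 with constant input u_v = -v_max/(N*Ts),
# applied piecewise-constant per timestep of length Ts.
v_max, N, Ts = 1.0, 4, 0.4
u_v = -v_max / (N * Ts)                 # minimum deceleration reaching v = 0

d, v = 0.0, v_max
for _ in range(N):
    d += v * Ts + 0.5 * u_v * Ts ** 2   # distance covered during one timestep
    v += u_v * Ts                       # speed update

assert abs(v) < 1e-12                   # full stop at the end of the horizon
assert abs(d - 0.5 * v_max * N * Ts) < 1e-12   # matches the closed form
```

The recursive feasibility condition then simply requires this stopping distance to be smaller than the remaining distance d_o to the obstacle.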
In this specific MPA, the planner would stay feasible if the maximum shortest path length of all states q to the equilibrium was d FSA (q, q eq ) = N − 1, but this behavior cannot be generalized to any FSA.

C. Receding Horizon
This section analyzes the implications of implementing a receding horizon to a graph search algorithm based on simulations. The focus lies on the behavior of the computation time depending on the scenario.
The following simulations use the MPA shown in Fig. 7. The three upper states specify the speed v = 0.75 m/s and allow transitions between three steering angles. The lower state is the equilibrium state of the system.
The structure of the scenario is shown in Fig. 11(a). A vehicle's reference trajectory is obstructed by an obstacle, so the planner avoids the obstacle by steering around it. In the following simulations, the initial distance between the vehicle and the obstacle varies from 1 to 3 m, and the horizon N varies from 5 to 30.
The number of expanded vertices during the search gives an indication of the necessary computational burden. This number of expanded vertices is shown in Fig. 12 depending on the horizon length N and the initial distance between the vehicle and the obstacle. As presented in Section IV-C, the worst case computational effort grows exponentially with the horizon N, and the best case computational effort grows linearly with the horizon N. The tendency of increasing computational effort with horizon length is observable in Fig. 12. There are some cases where the number of expanded vertices decreases with increasing N. In these cases, the planner finds a trajectory, which is clearly better than other alternatives judged by the heuristic function. This is the case if the planned trajectory lies on the reference trajectory. If the vehicle moves at a slight angle to the reference trajectory, many options are explored in order to find a better solution. The computational effort also increases with increasing the initial distance between the vehicle and the obstacle. When the obstacle is encountered, the value of the cost function increases. Therefore, vertices before the obstacle are expanded. The larger the distance to the obstacle when it appears on the horizon, the larger the number of vertices that need to be considered.

D. Multi-Agent Experiment
This section presents the results of trajectory planning for multiple model-scale vehicles in simulation and experiment in the CPM Lab.
The simulations and experiments are based on an MPA that is the synchronous composition of two of the MPA shown in Fig. 7. Two vehicles start opposite each other and follow reference trajectories to the other's position. This provokes a collision in the center between the vehicles, which the planner avoids. Fig. 13 shows the simulation and experiment results. The vehicles successfully avoid the collision in a cooperative fashion by solving the RHC problem for the system consisting of both vehicles. The maximum computation time is 45 ms and occurs when the collision is avoided. The median of all 14 control loop iterations of the scenario is 2 ms. The maximum number of expanded vertices during one control loop is 2928. This number, of course, strongly depends on the MPA and, therefore, also on the number of vehicles, the scenario, and the planning horizon N, as seen in Section V-C. Fig. 14 illustrates the rapidly growing complexity with the number of vehicles by the computation time. This supports our analysis in Section IV-C. If a single vehicle plans a trajectory, the maximum computation time is only 1 ms. On the other hand, when three vehicles are positioned in the corners of an equilateral triangle and plan their trajectory toward the opposite side, the maximum computation time is around 8.5 s, while the median is at around 9 ms.

VI. CONCLUSION
This article presented a trajectory planning approach using motion primitives for multiple vehicles that avoid collisions. By introducing a receding horizon to a graph search algorithm, we reduced the computational load while keeping the important property of recursive feasibility. Using motion primitives can also reduce computational load compared to approaches that consider the complete action space. In contrast to many graph-based approaches, we extended the strategy to reach a single goal by planning a trajectory to follow a reference trajectory. The approach can plan collision-free trajectories for multiple vehicles simultaneously. Experimental results for trajectory planning for two vehicles and numerical results for trajectory planning for three vehicles were presented.
Abstracting the system dynamics to a motion graph yields the advantage that detailed nonlinear models can be used without significantly increasing the computation time. This article investigated a limited set of motion primitives. A finer-grained quantization of motion primitives could achieve lower objective values in our graph search but would also increase the computation time. The search algorithm could be developed further in future work. State-of-the-art strategies, such as an incremental search using D* Lite [40] or a Hybrid A* variant, which keeps track of a closed list, can be combined with the presented receding horizon graph search. The presented work is real-time capable for systems of up to two vehicles and a crude motion primitive automaton. Of course, it can be accelerated by increasing the efficiency of the implementation with a higher-performance programming language, such as C++. However, the source of the high computational load is the centralized problem formulation. In future work, the authors plan to investigate the distribution of the presented control algorithm [41].
ACKNOWLEDGMENT
A demonstration video of this work is available online [1]. The code is available at GitHub [2] and ready-to-run at Code Ocean [3].