On the Actuator Dynamics of Dynamic Control Allocation for a Small Fixed-Wing UAV With Direct Lift Control

A novel dynamic control allocation method is proposed for a small fixed-wing unmanned aerial vehicle (UAV) whose flaps can be actuated as fast as the other control surfaces, offering an extra way of changing the lift directly. The actuator dynamics of this kind of UAV, which may be sluggish compared with the UAV dynamics, should also be considered in the control design. To this end, a hierarchical control allocation architecture is developed. A disturbance-observer-based high-level tracking controller is first designed to accommodate the lagging effect of the actuators and to compensate for the adverse effect of external disturbances. Then, a dynamic control allocator based on a receding-horizon performance index is developed, which forces the actuator state in the low level to follow the optimized reference. Compared with the conventional control allocation method, which assumes ideal actuators with infinite bandwidths, the proposed method achieves higher tracking accuracy of the UAV and better energy efficiency. Stability analysis and high-fidelity simulations both demonstrate the effectiveness of the proposed method, which can be deployed on different fixed-wing UAVs with flaps to achieve better performance.

RECENT decades have witnessed rapid growth in both the development and application of small unmanned aerial vehicles (UAVs) in different domains, ranging from surveillance and payload delivery to environmental monitoring and agricultural mapping. Among different UAV configurations, such as helicopters and multicopters, conventional fixed-wing UAVs still hold their position in many critical applications, because their simpler structure and better aerodynamic efficiency offer longer endurance and higher airspeed. Many small fixed-wing UAVs feature low-cost and lightweight designs to facilitate rapid deployment in various applications. However, these attributes also mean that their limited power and low inertia may make them susceptible to external wind gusts, leading to degraded flight performance. To alleviate these effects, some control-oriented techniques [1]-[4] have been investigated to compensate for wind disturbances. This brief approaches the problem from a different perspective, specifically by exploiting the flight mechanism of conventional fixed-wing UAVs equipped with dedicated flaps. Flaps can be used to change the lift and drag, which is often referred to as direct lift control (DLC) in the aviation literature [5]. Unlike their counterparts on full-size aircraft, the flaps on small UAVs are usually actuated by the same electric servos used for other control surfaces. This means that the flaps can be used as ordinary control surfaces in the same way as elevators, which, in turn, provides control redundancy and extra control authority to deal with external disturbances. Therefore, to fully explore the potential of DLC on small fixed-wing UAVs, a control allocation solution is required to distribute the desired total control effort among a redundant set of actuators, i.e., the elevator, motor thrust, and flap.
Various control allocation methods, e.g., pseudoinverse, daisy chaining, and direct allocation, have been developed and extensively investigated in the literature [6]-[9], some of which have a particular focus on reconfigurable flight control design [10]-[12]. Apart from the basic control allocation functions, many advanced algorithms have considered additional factors, e.g., actuator energy saving and actuator safety, using techniques such as linear/nonlinear-constrained quadratic programming [13], [14], additional dynamic augmentation [15], and input matrix factorization [16]. However, those control allocation methods may not be readily applicable to the problem investigated in this work due to the presence of slow actuator dynamics. It is very common in control allocation to assume ideal actuators with infinite bandwidths, ignoring their dynamics. This assumption is justifiable when the dynamics of the overactuated system, e.g., a cargo ship, full-size aircraft, or submarine, are indeed much slower than those of its actuators. By contrast, the actuator dynamics of small fixed-wing UAVs cannot be ignored in high-performance flight control due to the agile dynamics of a small UAV.
Several pioneering works on control allocation with the consideration of actuator dynamics can be found in the literature. In [9, Ch. 7], the existence of actuator dynamics is regarded as one of the most significant obstacles in control allocation, because the input matrix of the cascaded system cannot be factorized into two matrices with lower dimensions, suggesting that the conventional control allocation methods cannot be applied directly. To the best of our knowledge, the first viable solution was proposed in [17] by introducing a penalty on the allocation rate. This approach is also referred to as dynamic control allocation, which indicates that an additional transient process is introduced for control allocation. Hence, it is easier for actuators to catch up with the time-varying control signals. Following this dynamic approach, some important progress has been made in both theory [18] and applications [19]. However, it is worth noting that this kind of dynamic allocation method only mitigates the adverse effect of actuator dynamics, because perfect allocation is only available for constant control signals (see [17, Th. 3] for details). On the other hand, a practical approach to compensate for the actuator dynamics using measurements of the actuator states is proposed in [20]. However, since it is an open-loop compensation, only a very fast sampling rate can guarantee the effectiveness and stability of the closed-loop system, which places high requirements on the actuator sensors. In [21], the model predictive control (MPC) method is used for control allocation, which considers a second-order actuator model with constraints on position and velocity. In [22], an adaptive control allocation method is proposed for stabilization of a much more general case where the actuator model is not only nonlinear but also subject to parametric uncertainties, which increases the complexity for practical tracking applications.
Meanwhile, along the line of linear full-information output regulation for overactuated systems, a rigorous theoretical framework has been built up in [23]-[27]. Specifically, based on [24], overactuated plants with parametric uncertainties are considered in [25], while, in [26] and [27], MPC methods are used to specify desired steady states with constraints. These methods have the potential to deal with the problem considered in this brief through several extensions and by embedding the actuator dynamics into the considered system.
To tackle the challenges of DLC-based control for small fixed-wing UAVs, namely, disturbance rejection, actuator dynamics, and control allocation, a novel integrated control design with dynamic control allocation is proposed in this brief. The considered actuators have linear dynamics without constraints, but their dynamic states are unmeasurable. Following the structure of conventional control allocation [13], a hierarchical framework is adopted to gain the advantages of modular design. The proposed framework consists of two parts. In the high level, a linear control design with a compensator is adopted for precise output tracking. Owing to the lack of sufficient sensors for the actuators and external disturbances, a disturbance observer [28] is combined with the tracking controller to achieve output tracking and disturbance rejection. However, embedding the actuator dynamics into the high-level tracking design introduces additional zero dynamics into the system. Therefore, the corresponding low-level allocator needs to be designed not only to reduce the extra energy consumption caused by control inputs but also to stabilize the internal dynamics [29], [30]. In the low level, inspired by the generalized predictive control method [31], a receding-horizon performance index is adopted to represent the total cost. Unlike the conventional performance indexes in control allocation (e.g., $J(t) = \|W u(t)\|_{1,2,\infty}$), which are related only to the current control input, the receding-horizon one contains future information and, hence, is dynamic. By using the Taylor expansion for prediction, the optimal desired states of the actuators are obtained explicitly; they depend directly on the disturbance estimates in the high level and the reference commands of the UAV. Subsequently, a virtual allocator is designed to force the actuator states to the optimal ones.
Notation: For any sufficiently smooth function $f(t)$, $f^{(i)}(t)$ denotes the $i$th-order derivative of $f(t)$ with respect to $t$. For any state $x$, $\hat{x}$ and $x_r$ denote its estimate and reference, respectively. For any matrix $A \in \mathbb{R}^{m \times n}$, $A_{(i,j)} \in \mathbb{R}$ denotes the element in the $i$th row and $j$th column of $A$, and $A_{(i,:)} \in \mathbb{R}^{1 \times n}$ denotes the $i$th row of $A$. $0_{i \times j}$ denotes an $i \times j$ zero matrix, and $1_{k \times k}$ denotes a $k \times k$ identity matrix.

II. PROBLEM FORMULATION
In this section, the linearized longitudinal model of a small fixed-wing UAV with the assistance of DLC, together with the models of the actuators, i.e., the elevator, motor throttle, and flap, is briefly introduced. Readers can refer to [2], [32] for an illustration of the longitudinal dynamics of the UAV.

A. UAV Dynamics
The linearized longitudinal model for the considered small fixed-wing UAV is given in (1) [33, Ch. 5.5.3], where $u$, $w$, $q$, $\theta$, and $h$ are the forward body velocity, vertical body velocity, pitch rate, pitch angle, and height, respectively; $\delta_e$, $\delta_t$, and $\delta_f$ are the control inputs generated by the elevator, motor throttle, and flap, respectively; $X$, $Z$, and $M$ are the force and moment coefficients with their associated subscripts; $g$ is the acceleration of gravity; any state denoted further with $*$ represents the state at the linearization point of the model; and $d_u$, $d_w$, $d_q$, and $d_h$ are the lumped effects of external disturbances, which directly affect the system states $u$, $w$, $q$, and $h$, respectively. Since most external disturbances in flight control systems, e.g., wind gusts, normally present much slower dynamics than the UAV itself [2], the disturbance vector $d$ is assumed to be an unknown constant vector.

B. Actuator Dynamics
The elevator and the flap are directly driven by electric servos and are much faster than the motor thrust dynamics. Therefore, first-order systems are used to represent the actuator dynamics of the elevator and the flap, while a second-order one is used for the motor thrust. The dynamics of the three actuators are written compactly as (2), where $u_e$, $u_t$, and $u_f$ are the control input signals to the elevator, motor throttle, and flap, respectively; $\tau_e$ and $\tau_f$ are the time constants of the elevator and the flap, respectively; and $\upsilon_{nt}$ and $\zeta_t$ are the oscillation frequency and damping ratio of the thrust dynamics, respectively. Together with the UAV dynamics, Assumption 1 is adopted, which can be easily verified for different UAV systems. Substituting (2) into (1) yields the actuator-dynamics-based UAV system (3).
The objective of this work is to design a control input $u_a$ such that the output of the UAV system $y_p$ asymptotically tracks any given sufficiently smooth and bounded reference $y_{pr} \triangleq [u_r\ h_r]^T$ under unknown external disturbances while reducing the total cost incurred by the actuators. Before the detailed design, the overall control scheme is shown in Fig. 1. Notably, the designed control input $u_a$ is only the input command to the actuator, which is a virtual signal without much physical significance. Hence, the output of the actuator $y_a$ (or $u_p$), rather than its input $u_a$, will be optimized in the subsequent allocation design.
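The actuator structure described above (first-order lags for the elevator and flap, a second-order system for the thrust) can be sketched as a small state-space model. This is a minimal sketch only: the unit DC gains, the state ordering, the function names, and every numerical value are assumptions made for illustration and are not taken from the identified parameters in Appendix A.

```python
import numpy as np

def actuator_state_space(tau_e, tau_f, omega_nt, zeta_t):
    """Assumed actuator model with state [delta_e, delta_t, d(delta_t)/dt, delta_f].

    omega_nt plays the role of the thrust oscillation frequency (written
    upsilon_nt in the text); unit DC gain is an assumption of this sketch.
    """
    A = np.array([
        [-1.0 / tau_e, 0.0, 0.0, 0.0],                     # elevator: first-order lag
        [0.0, 0.0, 1.0, 0.0],                              # thrust: position state
        [0.0, -omega_nt**2, -2.0 * zeta_t * omega_nt, 0.0],  # thrust: rate state
        [0.0, 0.0, 0.0, -1.0 / tau_f],                     # flap: first-order lag
    ])
    B = np.array([
        [1.0 / tau_e, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [0.0, omega_nt**2, 0.0],
        [0.0, 0.0, 1.0 / tau_f],
    ])
    C = np.array([
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])
    return A, B, C

def simulate_step(A, B, C, u, t_end=5.0, dt=1e-3):
    """Forward-Euler response to a constant command u, starting from rest."""
    x = np.zeros(A.shape[0])
    for _ in range(int(round(t_end / dt))):
        x = x + dt * (A @ x + B @ u)
    return C @ x

# Hypothetical values chosen only to exercise the model.
A, B, C = actuator_state_space(tau_e=0.05, tau_f=0.05, omega_nt=5.0, zeta_t=0.8)
y_final = simulate_step(A, B, C, u=np.array([0.1, 0.5, -0.2]))
```

In this sketch the thrust output has relative degree two with respect to its command, since the corresponding entry of `C @ B` is zero while that of `C @ A @ B` is not, which is the kind of lag the allocation design has to account for.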

III. ACTUATOR/DISTURBANCE-OBSERVER-BASED TRACKING
In this section, as shown in Fig. 2, the high-level tracking controller, with a compensator for the lag effect of the internal actuators and the adverse effect of the external disturbances, is designed for the augmented plant (3).

A. Actuator-Dynamics-Based Disturbance Observer Design
The designed observer should be able to estimate the actuator state $x_a$ as well as the external disturbance $d$. With actuator dynamics (2), one arrives at $\dot{x}_{ad} = A_{ad} x_{ad} + B_{ad} u_a$, $y_{ad} = C_{ad} x_{ad}$ (4), in which $(0_{4 \times 4}, B_{pd})$ is observable. By Assumption 1, with the Popov-Belevitch-Hautus observability criterion [34], one gets that $(A_{ad}, C_{ad})$ is also observable. Following the approach in [35], by defining an internal state $z_{ad}$, one gets the actuator/disturbance observer (5), where $L$ is the observer gain.

B. Tracking Controller Design via Dynamic Inversion
Based upon the parameters in Appendix A, the relative degree of each output is available as $C_1 B_u = 0$, $C_1 A B_u \neq 0$; $C_2 B_u = 0$, $C_2 A B_u = 0$, and $C_2 A^2 B_u \neq 0$, where $C_1 \triangleq C_{(1,:)}$ and $C_2 \triangleq C_{(2,:)}$. The relative degrees of $u$ and $h$ are therefore 2 and 3, respectively. Since the sum of the relative degrees is strictly less than the system dimension, zero dynamics exist in (3). Following the approach in [36, Ch. 5.1], define a new state $D_X$ as in (6), where $D_Y \triangleq [u\ u^{(1)}\ h\ h^{(1)}\ h^{(2)}]^T$, $D_Z$ is the state of the zero dynamics, and $T_M$ and $T_N$ are the transformation matrices.
To endow this coordinate transformation with more physical significance, let $D_Z \triangleq x_a$. The transformation matrices can then be fixed accordingly. It is worth noting that $\mathrm{rank}(T_M) = 9$, which implies that mapping (6) is invertible between $D_X$ and $x$. Taking derivatives of $D_X$ in (6) along (3) gives (7), which in turn yields (8). Using the dynamic inversion approach [37] gives the tracking controller (9), which acts on the tracking errors $D_{Yr(1:2,1)} - D_{Y(1:2,1)}$ and $D_{Yr(3:5,1)} - D_{Y(3:5,1)}$ together with the term $v_{hr}$.
Remark 1: There are two deficiencies in using the conventional control allocation method to allocate the designed controller (9). On the one hand, the targeted control input $u_a$ is a virtual signal without much physical significance. On the other hand, based upon the system parameters in Appendix A, we get $B_{tk} = [0\ \ 0\ \ {-0.18};\ 0.59\ \ 0\ \ 2.13]$. The second column of $B_{tk}$ is $0_{2 \times 1}$, so the second row of its pseudoinverse is $0_{1 \times 2}$, which implies that $u_t^* = u^*_{a(2,1)} = 0$ holds for any control signal $v_{tk}$ generated from the high level. Therefore, the conventional method loses the freedom to stabilize the zero dynamics.
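The observation in Remark 1 can be checked numerically with the matrix $B_{tk}$ quoted in the text. The sketch below computes the Moore-Penrose pseudoinverse with NumPy and confirms that the row associated with the thrust command vanishes; the test command `v_tk` is an arbitrary, hypothetical value.

```python
import numpy as np

# B_tk as quoted in Remark 1 (rows: the two controlled channels; columns: u_e, u_t, u_f).
B_tk = np.array([[0.0, 0.0, -0.18],
                 [0.59, 0.0, 2.13]])

# Moore-Penrose pseudoinverse, shape (3, 2).
B_pinv = np.linalg.pinv(B_tk)

# Hypothetical high-level command used only for illustration.
v_tk = np.array([0.3, -1.2])

# Static (pseudoinverse) allocation of the command.
u_a_static = B_pinv @ v_tk

# The thrust entry stays at zero whatever v_tk is, while the command is still met.
thrust_command = u_a_static[1]
residual = B_tk @ u_a_static - v_tk
```

Because the thrust column of `B_tk` is zero, the pseudoinverse never uses the thrust channel, so a static allocator of this form leaves that degree of freedom idle.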

IV. ACTUATOR-DYNAMICS-BASED DYNAMIC CONTROL ALLOCATION
In this section, the dynamic allocator in the low level, which considers the actuator dynamics, is designed as shown in Fig. 3, followed by the stability analysis of the closed-loop system.

A. Preliminaries
Since $B_{tk} \in \mathbb{R}^{2 \times 3}$ with $\mathrm{rank}(B_{tk}) = 2$, it has a null space of dimension $3 - 2 = 1$, in which the control input $u_a$ can be perturbed without affecting $\dot{D}_Y$. For convenience, we choose $P$ as $[0\ 1\ 0]^T$, which belongs to $\mathrm{Ker}(B_{tk})$, and the allocation input is thus constructed as $u_t = P^T u_a$. Define $Q$ and $v$ accordingly. It is worth noting that $\mathrm{rank}(Q) = 3$ and $u_a = Q^{-1} v$. Next, rewriting the actuator dynamics (2) in light of (8) yields (10). The following assumption, which is satisfied in general, is adopted to facilitate the derivations.
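The null-space argument can be illustrated numerically. In the sketch below, the stacking $Q = [B_{tk};\ P^T]$ and $v = [v_{tk};\ u_t]$ is an assumption made for illustration, chosen because it is consistent with $\mathrm{rank}(Q) = 3$ and $u_a = Q^{-1} v$; the numerical commands are hypothetical.

```python
import numpy as np

B_tk = np.array([[0.0, 0.0, -0.18],
                 [0.59, 0.0, 2.13]])
P = np.array([[0.0], [1.0], [0.0]])

# Null space of B_tk from the SVD: right singular vectors beyond the rank.
_, s, Vt = np.linalg.svd(B_tk)
rank = int(np.sum(s > 1e-12))
null_basis = Vt[rank:].T              # shape (3, 1), spans Ker(B_tk)

# Assumed stacking of the high-level map and the allocation direction.
Q = np.vstack([B_tk, P.T])

def allocate(v_tk, u_t):
    """Recover u_a from the high-level command v_tk and the allocation input u_t."""
    v = np.concatenate([v_tk, [u_t]])
    return np.linalg.solve(Q, v)

# Two hypothetical allocations that share v_tk but differ in u_t.
v_tk = np.array([0.3, -1.2])
u_a_1 = allocate(v_tk, 0.0)
u_a_2 = allocate(v_tk, 0.4)
```

Changing `u_t` moves `u_a` only along `P`, so `B_tk @ u_a` stays equal to `v_tk`. This is the freedom that the dynamic allocator exploits.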
Assumption 2:
In what follows, the proposed dynamic allocation method is introduced, which is inspired by the generalized predictive control method [31]. However, unlike the generalized predictive control method, the desired states of the actuators, rather than the states in (10), are optimized here. Note that, in the output regulation framework for overactuated systems [23]-[27], the desired states are also optimized based upon different cost functions.

B. Generator and Estimator of Optimal Desired States
The desired states of the considered actuators should satisfy the following system: $\dot{x}_{ar} = A_v x_{ar} + B_{va} u_{tr} + R_r + B_{vd} d$, $y_{ar} = C_a x_{ar}$. (11) If the tracking objective has been achieved, $R_r$ is also available as a priori knowledge, which is directly related to the reference commands, i.e., $R_r = B_a Q^{-1}[u_r^{(2)} - A_{x(2,1:\cdot)} \cdots]$. Due to the linearization points (or trims) on the control inputs, a cost function is defined for each actuator over a horizon, where $T > 0$ is the predictive period. The total cost of the actuators is defined as in (12), where $y_a^* \triangleq [\delta_e^*\ \delta_t^*\ \delta_f^*]^T$; $\rho_e > 0$, $\rho_t > 0$, and $\rho_f > 0$ are the weights on the elevator, motor throttle, and flap, respectively; and $W \triangleq \mathrm{diag}(\rho_e, \rho_t, \rho_f)$. Inspired by Chen et al. [31], the following finite Taylor series is used to predict $y_{ar}(t + \tau)$ approximately: $\bar{y}_{ar}(t + \tau) = y_{ar}(t) + \tau y_{ar}^{(1)}(t) + \cdots + \frac{\tau^n}{n!} y_{ar}^{(n)}(t)$, $n \in \mathbb{N}^+$, $\tau \in [0, T]$ (13) where $\bar{y}_{ar}(t + \tau)$ is the prediction of $y_{ar}(t + \tau)$. Before optimization, notably, $C_a B_{va} = 0_{3 \times 1}$ and $C_a A_v B_{va} \neq 0_{3 \times 1}$, so the allocation input first appears in the second derivative of $y_{ar}$, which implies that the Taylor expansion order should be chosen as $n \geq 2$, $n \in \mathbb{N}^+$. The case of $n = 2$ is simple, i.e., $y_{ar} = C_a x_{ar}$, $y_{ar}^{(1)} = C_a A_v x_{ar} + C_a R_r + C_a B_{vd} d$, and $y_{ar}^{(2)} = C_a A_v^2 x_{ar} + C_a A_v B_{va} u_{tr} + C_a A_v R_r + C_a R_r^{(1)} + C_a A_v B_{vd} d$. Therefore, the detailed analysis of this case is omitted here.
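The finite Taylor predictor in (13) can be sketched generically. The function below is an illustration rather than the allocator itself: it takes the derivatives of a signal at time $t$ and returns the order-$n$ prediction at $t + \tau$. The scalar sinusoid used to exercise it is a hypothetical test signal, not a quantity from the UAV model.

```python
import math
import numpy as np

def taylor_predict(derivs, tau):
    """Order-n Taylor prediction of y(t + tau).

    derivs[i] holds the i-th derivative of y at time t, for i = 0..n.
    Each entry may be a scalar or a vector (e.g., the three actuator outputs).
    """
    return sum((tau ** i) / math.factorial(i) * np.asarray(d, dtype=float)
               for i, d in enumerate(derivs))

# Hypothetical check on y(t) = sin(t) with n = 2.
t0, tau = 0.3, 0.1
derivs_at_t0 = [math.sin(t0), math.cos(t0), -math.sin(t0)]
prediction = taylor_predict(derivs_at_t0, tau)
error = abs(prediction - math.sin(t0 + tau))
```

For this test signal the prediction error is bounded by the Lagrange remainder $\tau^3/6$, which suggests why the predictive period $T$ should stay short relative to how fast the reference varies.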
With the assistance of (13) and (14), the cost function (12) can be predicted as in (15), where $T_0$ to $T_4$ are all collected in Appendix B, together with a term that is independent of the allocation input. Minimizing (15) gives (16), which is an explicit solution of the optimization problem (12). Note that the presence of $d$ in (14) and (16) makes the designed optimal desired states of the actuators unimplementable; hence, the estimates of these states should first be constructed based on observer (5), as in (17).

Remark 2:
The main computational burden of the dynamic control allocation method is to compute matrices T 0 to T 4 in (15). In principle, as long as the Taylor expansion order n is fixed, all computation can be done offline, which means that matrices T 0 to T 4 are explicit with respect to the predictive period T and weighting matrix W, making the real-time tuning available.
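The offline nature of this computation can be illustrated on a generic quadratic receding-horizon cost. The sketch below assumes a cost of the form $\int_0^T \bar{y}^T W \bar{y}\, d\tau$ with $\bar{y}$ given by an order-$n$ Taylor series; this is an assumed form chosen for illustration, and the matrix `G` below is not claimed to coincide with the matrices $T_0$ to $T_4$ of Appendix B. Under that assumption the horizon integrals reduce to closed-form weights that depend only on $T$ and $n$.

```python
import math
import numpy as np

def horizon_gram(T, n):
    """G[i, j] = integral over [0, T] of tau**i / i! * tau**j / j!."""
    G = np.empty((n + 1, n + 1))
    for i in range(n + 1):
        for j in range(n + 1):
            G[i, j] = T ** (i + j + 1) / (
                math.factorial(i) * math.factorial(j) * (i + j + 1))
    return G

def predicted_cost(derivs, W, T):
    """Closed-form value of the assumed cost for Taylor-predicted outputs.

    derivs has shape (n + 1, m): row i is the i-th derivative of the m outputs.
    """
    Y = np.asarray(derivs, dtype=float)
    G = horizon_gram(T, Y.shape[0] - 1)
    # Sum over i, j of G[i, j] * Y[i]^T W Y[j].
    return float(np.einsum("ij,ik,kl,jl->", G, Y, W, Y))

# Hypothetical weights, horizon, and derivative values.
W = np.diag([1.0, 2.0, 3.0])
derivs = [[0.1, 0.2, -0.3], [1.0, -0.5, 0.2], [0.3, 0.4, 0.5]]
J = predicted_cost(derivs, W, T=0.5)
```

Since `horizon_gram` depends only on `T` and `n`, changing the weighting matrix or the predictive period at run time only requires re-evaluating a small closed-form expression, not a numerical integration.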

Remark 3 (Modular Design):
Optimizing the desired actuator state $x_{ar}$, rather than directly optimizing the state $x_a$, is adopted here to pursue modular design. If $R_r$ in (16) is replaced with $R$ in (10), i.e., if the actuator states are optimized directly, a higher order observer has to be redesigned to estimate $R, R^{(1)}, \ldots, R^{(n-1)}$. This implies that the whole control structure would change whenever the Taylor expansion order $n$ in (13) is changed.

C. Dynamic Control Allocator Design
In this step, the control input $u_t$ is designed to force the real actuator state $x_a$ to the optimal one $x_{ar}$. Thus, we can design the allocator (18), where $K_t$ is the allocator gain. Based on the preceding tracking controller (9) and allocator (18), the physical controller $u_a$ is obtained as (19).

D. Performance Analysis
Theorem 1: Under Assumptions 1 and 2, consider the closed-loop system (1), (2), and (19). If all the control parameters are well tuned, i.e., for observer (5), $A_{ad} - L C_{ad}$ is Hurwitz; for controller (9) (17), $A_e - B_{ev} T_0^{-1} T_1^T$ is Hurwitz; and for allocator (18), $A_v - B_{va} K_t$ is Hurwitz, then the following three statements hold.
1) Uniform Boundedness: All the signals in the closed-loop system are uniformly bounded.
2) Asymptotic Tracking: The output of the UAV $y_p$ asymptotically tracks any given sufficiently smooth and bounded reference command $y_{pr}$, i.e., $\lim_{t \to +\infty} y_p = y_{pr}$.
3) Control Allocation: The state of the actuator $x_a$ asymptotically tracks the solution of the optimization problem $x_{ar}$, which is generated by (14) and (16), i.e., $\lim_{t \to +\infty} x_a = x_{ar}$.
Proof: Define the estimation errors as $e_{ad} \triangleq x_{ad} - \hat{x}_{ad}$ and $e_{er} \triangleq x_{er} - \hat{x}_{er}$. With the tracking errors defined accordingly, the resulting error system (20) is globally asymptotically stable. In addition, since $u_r$, $u_r^{(1)}$, $u_r^{(2)}$, and $d$ are all bounded, all the signals in the closed-loop system are uniformly bounded.
This completes the proof of Theorem 1.

Remark 4 (Hierarchy Parameter Tuning):
From the dynamics of the output tracking errors, $E_u$ and $E_h$, one can conclude that the tracking performance in the high level is decoupled from the allocation performance in the low level. Hence, the tuning process can also be divided into two steps. First, tune the parameters of the proposed actuator/disturbance observer (5) and tracking controller (9) to satisfy specific requirements on the output tracking. Second, tune the weighting matrix $W$ to penalize the specific actuator.
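The first tuning step requires an observer gain $L$ that makes the estimation error matrix Hurwitz. A minimal sketch of such a design is given below for a toy single-output system: a first-order actuator lag driven by an unknown constant disturbance. The toy model, the time constant, the pole locations, and the use of Ackermann's formula are all assumptions made for illustration; they are not the matrices $A_{ad}$ and $C_{ad}$ of the paper.

```python
import numpy as np

def is_hurwitz(M):
    """True if every eigenvalue of M has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(M).real < 0.0))

def ackermann_observer_gain(A, C, poles):
    """Observer gain for a single-output pair (A, C) via Ackermann's formula."""
    n = A.shape[0]
    # Observability matrix [C; CA; ...; CA^(n-1)].
    O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    # Desired characteristic polynomial evaluated at A.
    coeffs = np.poly(poles)  # highest power first
    phi = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
    e_n = np.zeros((n, 1))
    e_n[-1, 0] = 1.0
    return phi @ np.linalg.solve(O, e_n)

# Toy model (assumed): state [delta, d], with delta' = (-delta + d) / tau and d' = 0.
tau = 0.1
A = np.array([[-1.0 / tau, 1.0 / tau],
              [0.0, 0.0]])
C = np.array([[1.0, 0.0]])

# Hypothetical tracking poles at -4 and -5; observer poles placed five times faster.
observer_poles = [-20.0, -25.0]
L = ackermann_observer_gain(A, C, observer_poles)
error_matrix = A - L @ C
```

The open-loop matrix `A` is not Hurwitz here because the constant disturbance contributes an eigenvalue at zero, whereas `error_matrix` has its eigenvalues at the chosen observer poles.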

V. SIMULATION STUDY
In this section, the X-Plane flight simulation software is employed to verify the proposed dynamic control allocation method. X-Plane is popular in aeronautical engineering due to its high fidelity in both aircraft dynamics and atmospheric environment [38]. In particular, the simulations are conducted based on a geometrical UAV model built in X-Plane, which uses blade element theory to determine the aerodynamic performance. The considered UAV in X-Plane is shown in Fig. 4. To generate extra disturbances and uncertainties in the simulation, we have added 2 m/s wind gusts as well as small changes in the center of gravity and weight of the aircraft. Constraints are also imposed on the outputs of the actuators. To demonstrate the effectiveness of the proposed method, both the conventional static control allocation (CSCA) method [8] and the conventional dynamic control allocation (CDCA) method [21] are adopted as benchmarks. Physical constraints are considered in both the CSCA and CDCA methods, while actuator dynamics are only used in the design of the CDCA and the proposed methods. The high-level tracking controllers are also designed following the same approach. Meanwhile, for fair comparisons, the controller gains of the three methods are tuned to render similar offset errors when the disturbance estimates are not included in the controllers, as shown in the gray patches of the subsequent simulation results. The observer gains are then fixed by making the poles of the estimation error system five times those of the tracking error systems. It should be noted that although the CDCA method is able to deal with actuator dynamics, it relies on the direct measurement of the actuator states and, hence, is not applicable in practice.
In what follows, two case studies are carried out to show the improvement of the proposed method on robustness and disturbance rejection.

A. Robustness Improvement
Following the test approaches of control allocation methods considering actuator dynamics [20], [21], [39], we also consider a sine signal with time-increasing frequency as the velocity reference and a constant height reference, both depicted by the red dashed lines in Fig. 5(a). The output tracking results of the UAV, the outputs of the actuators under the different allocation methods, and the estimates of the total disturbances are shown in Fig. 5(a)-(c), respectively. It can be observed from the velocity tracking results in Fig. 5(a) that the disturbance observer improves the robustness of the proposed method against the unmodeled dynamics of both the aircraft and the atmospheric environment. Moreover, the height tracking precision of the proposed method is much higher than those of the CSCA and CDCA methods when the disturbance estimates are included. The main reason is that, without compensation for the lag effect of the actuators, the tracking error depends largely on the frequency of the reference, especially when the frequency is beyond the bandwidth. After nearly 70 s, the UAV under the CSCA method tends to become unstable. The essential reason is that the CSCA method is not able to actively stabilize the zero dynamics [30], and hence, its control parameters are very sensitive. Although the UAV under the CSCA method works well at the beginning, it inevitably becomes unstable when the unmodeled dynamics become more significant, e.g., when the reference changes more dramatically.
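A reference of this type, together with the frequency dependence of an uncompensated actuator lag, can be sketched as follows. The linear frequency sweep, the amplitude, the sweep range, and the time constant are hypothetical values chosen for illustration; only the 15 m/s trim airspeed comes from the operating point stated in Appendix A.

```python
import numpy as np

def chirp_reference(t, trim=15.0, amplitude=1.0, f0=0.01, f1=0.2, t_end=100.0):
    """Sine reference around the trim airspeed with linearly increasing frequency."""
    t = np.asarray(t, dtype=float)
    sweep_rate = (f1 - f0) / t_end                      # Hz per second
    phase = 2.0 * np.pi * (f0 * t + 0.5 * sweep_rate * t ** 2)
    return trim + amplitude * np.sin(phase)

def first_order_gain(freq_hz, tau):
    """Magnitude response of a unit-gain first-order lag 1 / (tau * s + 1)."""
    return 1.0 / np.sqrt(1.0 + (2.0 * np.pi * freq_hz * tau) ** 2)

# Reference sampled at the 50 Hz control rate mentioned in Appendix A.
t = np.arange(0.0, 100.0, 0.02)
u_ref = chirp_reference(t)

# Attenuation of a hypothetical lag (tau = 0.5 s) across the swept frequencies.
gains = first_order_gain(np.array([0.01, 0.1, 0.2]), tau=0.5)
```

As the sweep frequency rises past the lag bandwidth $1/(2\pi\tau)$, the gain falls below $1/\sqrt{2}$, which is one way to see why an uncompensated actuator lag produces tracking errors that grow with the reference frequency.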

B. Disturbance Rejection
In the second case study (a simulation video is available at https://youtu.be/zkTS4GtDSLE), in the presence of additional wind gusts, the UAV is controlled to track a descending profile with the airspeed reduced from 15 to 12 m/s and the height from 890 to 840 m, as shown by the red dashed lines in Fig. 6(a). Owing to the weakness of the CSCA method revealed in the first case study, only the CDCA method is used for comparison here. The output tracking results of the UAV, the outputs and estimates of the actuators, and the estimates of the total disturbances are shown in Fig. 6(a)-(c), respectively. It can be observed from these tracking results that the proposed method largely improves the tracking precision compared with the CDCA method, which is critical to UAV safety under wind conditions. Since the total disturbances are still unmeasurable even in the X-Plane simulation, the estimation performance of the disturbances can be verified indirectly by the tracking results in Fig. 6(a) and the estimation precision of the actuators in Fig. 6(b). From Fig. 6(c), the estimate of $d_u$ changes relatively fast but remains very small around zero, while the estimates of $d_w$, $d_q$, and $d_h$ are all slowly time-varying.

VI. CONCLUSION
In this brief, a new actuator-dynamics-based dynamic control allocation scheme is developed for flight control of a small fixed-wing UAV with DLC to compensate for unknown external disturbances. Compared with conventional UAVs, the considered small UAV presents faster dynamics, and hence, the actuator dynamics cannot be ignored in high-precision applications. By embedding the internal models of the actuators into both the tracking controller design in the high level and the dynamic allocator design in the low level, the proposed method achieves notable improvements in the tracking precision of the UAV and in energy efficiency, as demonstrated in the verification example using high-fidelity simulations. It is envisaged that the proposed control scheme can be used to improve the flight performance and extend the flight envelope of small fixed-wing UAVs with dedicated flaps, allowing them to be used in wider applications and unfavorable weather conditions. Future work will focus on extending the current framework to deal with actuator constraints.

APPENDIX

A. Parameters
The UAV model parameters are obtained through system identification around the operating point of level flight at a forward velocity of 15 m/s and a height of 890 m, whereas the actuator dynamics are established based on the ground test data of the servos used on the UAV at a sampling rate of 100 Hz. The identification process is detailed in [40, Ch. 4]. The sampling rate of the control program is 50 Hz. For the sake of completeness, the UAV parameters are listed in Table I.

B. Expressions
The expressions of T 0 to T 4 are presented as follows: