Stochastic MPC for Additive and Multiplicative Uncertainty Using Sample Approximations

We introduce an approach for model predictive control (MPC) of systems with additive and multiplicative stochastic uncertainty subject to chance constraints. Predicted states are bounded within a tube and the chance constraint is considered in a “one step ahead” manner, with robust constraints applied over the remainder of the horizon. The online optimization is formulated as a chance-constrained program that is solved approximately using sampling. We prove that if the optimization is initially feasible, it remains feasible and the closed-loop system is stable. Applying the chance-constraint only one step ahead allows us to state a confidence bound for satisfaction of the chance constraint in closed-loop. Finally, we demonstrate by example that the resulting controller is only mildly more conservative than scenario MPC approaches that have no feasibility guarantee.


James Fleming and Mark Cannon

Index Terms—Optimization, predictive control, probability distribution, process control, sampling methods, stochastic systems.

I. INTRODUCTION
Robust model predictive control (MPC) is a method of controlling uncertain systems in which predictions of the states are used to minimize a cost function online. Unlike linear feedback controllers, MPC allows inequality constraints to be handled systematically and, if suitable terminal constraints are included in the optimization, it can ensure that those constraints are satisfied for all uncertainty realizations. In general, propagating the effects of parametric or multiplicative uncertainty over a prediction horizon of length N involves computation that grows exponentially with N. This difficulty can be avoided through the introduction of a prediction tube [1], [2]. Recent papers have provided convenient parameterizations of such tubes for systems with parametric uncertainty [3], [4].
In the presence of stochastic uncertainty, it is often desirable to limit the probability of constraint violations. In this case, a robust approach that enforces constraints for all uncertainty realizations may be conservative. Instead, knowledge of the probability distribution of the uncertainty should be used to relax constraints. For a linear system with additive uncertainty, constraint tightening/relaxing parameters may be calculated to handle chance constraints [5], [6], and the resulting controller guarantees constraint satisfaction and stability in closed-loop. No similar result exists for stochastic uncertainty in system parameters, and the aim here is to introduce a suitable framework by using conditions for a polyhedron to be contained in a set defined by chance constraints. This results in a receding horizon control problem whose solution can be approximated using a sample-based method, while providing closed-loop feasibility, stability, and constraint satisfaction guarantees. In [7] and [8], a scenario approach is used for predictive control, where the probability that constraints are satisfied over some finite horizon is bounded. These scenario methods are very general, but they do not guarantee that the optimization problem remains feasible at subsequent times. In contrast, we consider "one step ahead" chance constraints (as in [5] and [6]) and use a tube to bound the predicted state, which guarantees recursive feasibility if the uncertainty has compact support. This extends the work of [3] by introducing additive uncertainty, less conservative terminal conditions, and a closed-loop confidence bound on constraint violation, and provides a stochastic counterpart to the robust controller of [4].

J. Fleming is with the School of Engineering, University of Southampton, Southampton SO17 1BJ, U.K. (e-mail: j.m.fleming@soton.ac.uk).

Digital Object Identifier 10.1109/TAC.2018.2887054
a) Notation: We use E_t[·] to denote conditional expectation given the state x(t), while Pr_t[·] denotes conditional probability given x(t). The convex hull of a set of points is written as Co{·}. To refer to the ith row of a matrix H, we write (H)_i. For convenience, when developing the controller, we consider the time t = 0 and denote predictions of the input and state u(t), x(t) at a future time t = k as u_k, x_k.

II. PROBLEM STATEMENT
We consider the control of a discrete-time system

x(t + 1) = A(t) x(t) + B(t) u(t) + w(t)    (1)

where the state x(t) is available at each t and u(t) is a control input. The values A(t), B(t), w(t) depend on a vector-valued stochastic process q(t) with elements q_i(t) ∈ R, i = 1, …, μ:

(A(t), B(t), w(t)) = (A_0, B_0, w_0) + Σ_{i=1}^{μ} q_i(t) (A^{i}, B^{i}, w^{i})    (2)

for some known A^{i}, B^{i}, w^{i}, i = 1, …, μ. To simplify notation we write A, B, w, q in place of A(t), B(t), w(t), q(t) wherever the time-dependence is clear. We briefly note that (2) allows A, B, and w to be independent, as q may be partitioned as q = [q_A q_B q_w]^T and, for example, B^{i} = w^{i} = 0 for i corresponding to elements of q_A.

Assumption 1: For any t_1 ≠ t_2, q(t_1) and q(t_2) are independent and identically distributed (i.i.d.).

Assumption 2: The distribution of q(t) has compact support: q(t) ∈ Co{q^(1), …, q^(ν)} for known vertices q^(1), …, q^(ν).
Remark 1: Assumption 1 limits the approach to systems in which the uncertainty affecting A(t) and B(t) is i.i.d., in common with existing methods [3], [8], [9]. Temporal correlations in w(t) can be handled by augmenting the system with a linear filter generating non-i.i.d. noise. Assumption 2 is required for recursive feasibility, and is used to bound the uncertainty robustly after the first prediction time.
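As an illustration, the uncertainty model (1)-(2) can be simulated directly. The matrices below are hypothetical placeholders (not the example system used later in the paper), and q(t) is taken i.i.d. uniform on [−1, 1]^μ, consistent with Assumptions 1 and 2:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance of model (2):
# (A(t), B(t), w(t)) = (A0, B0, w0) + sum_i q_i(t) (A^{i}, B^{i}, w^{i})
A0 = np.array([[1.0, 0.1], [0.0, 1.0]])
B0 = np.array([[0.0], [0.1]])
w0 = np.zeros(2)
A_dev = [np.array([[0.0, 0.02], [0.0, 0.0]])]   # A^{1}
B_dev = [np.array([[0.0], [0.02]])]             # B^{1}
w_dev = [np.array([0.0, 0.05])]                 # w^{1}

def step(x, u):
    """One step of x(t+1) = A(t) x(t) + B(t) u(t) + w(t), drawing a fresh
    i.i.d. q(t) uniform on [-1, 1]^mu (Assumptions 1 and 2)."""
    q = rng.uniform(-1.0, 1.0, size=len(A_dev))
    A = A0 + sum(qi * Ai for qi, Ai in zip(q, A_dev))
    B = B0 + sum(qi * Bi for qi, Bi in zip(q, B_dev))
    w = w0 + sum(qi * wi for qi, wi in zip(q, w_dev))
    return A @ x + B @ u + w

x = np.array([1.0, 0.0])
for t in range(5):
    x = step(x, np.array([0.0]))
```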
The control objective is to minimize the expected value of

J = E_0[ Σ_{k=0}^{∞} (x_k^T Q x_k + u_k^T R u_k) ]    (3)

where Q and R are positive definite matrices, while, for some probability level p ∈ (0, 1], satisfying the constraints

F x_k + G u_k ≤ 1 for all uncertainty realizations, k = 2, 3, …    (4)
Pr_t[ F x(t + 1) + G u(t + 1) ≤ 1 ] ≥ p.    (5)

Constraint (5) limits the probability of a constraint violation at the next sample time using the most recent information available, which is useful in applications such as wind turbine control [9]. At each time t, the controller minimizes (3), which is the expected future performance given present information, subject to (4) and (5). We write A^(j), B^(j), and w^(j) for the values of A, B, w in (2) corresponding to q = q^(j).

III. INPUT, STATE, AND TUBE PARAMETRIZATION
We parametrize the predicted control input as

u_k = K x_k + c_k    (6)

where K is chosen so that the dynamics ξ_{k+1} = (A + BK) ξ_k are mean-square stable [10], [11]. The uncertain system then generates the predicted state trajectories

x_{k+1} = Φ x_k + B c_k + w    (7)

where Φ = A + BK. Define Φ^(j) = A^(j) + B^(j) K, and let c_k = 0 for k ≥ N. The sequence c_0, …, c_{N−1} is set by the controller, with the integer N referred to as the prediction horizon.
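One way to verify the mean-square stability requirement on K (see [10], [11]) is via the spectral radius of the expected Kronecker lift of Φ. The sketch below uses hypothetical matrices and assumes q_i i.i.d. uniform on [−1, 1], so E[q_i q_j] = δ_ij/3:

```python
import numpy as np

# Mean-square stability test for xi_{k+1} = Phi(t) xi_k with
# Phi(t) = Phi0 + sum_i q_i(t) Phi_i and E[q_i q_j] = delta_ij / 3:
# the loop is mean-square stable iff rho(E[Phi kron Phi]) < 1.
Phi0 = np.array([[0.5, 0.1], [0.0, 0.4]])       # hypothetical A0 + B0 K
Phi_dev = [np.array([[0.05, 0.0], [0.0, 0.05]])]

M = np.kron(Phi0, Phi0)
for Pi in Phi_dev:
    M = M + np.kron(Pi, Pi) / 3.0               # E[q_i^2] = 1/3 term
rho = float(np.max(np.abs(np.linalg.eigvals(M))))
```

Here rho < 1 certifies that a fixed stabilizing K remains admissible under the multiplicative uncertainty.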
At some prediction time k, the set of possible future states x_k is a polyhedron with a number of vertices that grows exponentially with k. To avoid computing these sets, we bound the predictions in a state tube, which is a sequence of polyhedra of fixed complexity. Introducing parameters α_k, we define the state tube cross sections as the sets

T_k = {x : V x ≤ α_k}.    (8)

The choice of V is nontrivial and should be made by offline construction of an invariant set.

Assumption 3: For some g, the set {x : V x ≤ g} is robustly invariant for (1) under the control law u(t) = K x(t).
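The offline construction of V in Assumption 3 can be sketched with the standard invariant-set recursion. For simplicity, the version below treats nominal dynamics with no disturbance and hypothetical data; the robust variant used in the paper additionally tightens for w and the vertices Φ^(j):

```python
import numpy as np
from scipy.optimize import linprog

def maximal_invariant_set(Phi, F, max_iter=100, tol=1e-9):
    """Sketch of the recursion for the maximal invariant set of the nominal
    dynamics x+ = Phi x subject to F x <= 1.  Rows F Phi^k are appended
    until every new row is redundant, so that V x <= 1 implies V Phi x <= 1."""
    V = F.copy()
    Fk = F.copy()
    for _ in range(max_iter):
        Fk = Fk @ Phi                          # candidate rows F Phi^{k+1}
        new_rows = []
        for f in Fk:
            # redundancy test: maximize f^T x subject to V x <= 1
            res = linprog(-f, A_ub=V, b_ub=np.ones(V.shape[0]),
                          bounds=[(None, None)] * V.shape[1])
            if res.status != 0 or -res.fun > 1.0 + tol:
                new_rows.append(f)             # row still cuts the set
        if not new_rows:
            return V                           # converged: set is invariant
        V = np.vstack([V, np.array(new_rows)])
    raise RuntimeError("no convergence within max_iter")

Phi = np.array([[0.8, 0.5], [0.0, 0.7]])       # hypothetical stable closed loop
F = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
V = maximal_invariant_set(Phi, F)
```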
We apply constraints so that the T_k contain the predictions:

x_k ∈ T_k, k = 0, 1, …, N.    (9)

In particular, since (9) implies V x_k ≤ α_k, constraints (4) and (5) can be applied to the predictions by requiring that the cross sections satisfy

F x + G(K x + c_k) ≤ 1 for all x ∈ T_k

robustly, with the probabilistic version of this condition at the first prediction step. Using (6) and (7), and defining F̄_h = F_h + G_h K for each row h, these conditions can be rewritten as linear conditions on α_k and c_k.

IV. LINEAR AND PROBABILISTIC CONSTRAINTS
We extend a well-known condition for inclusion of polyhedra to the case where the containing set is defined by probabilistic constraints. The following lemma is from [12].
Lemma 1: Let P_1 = {x : F_1 x ≤ b_1} and P_2 = {x : F_2 x ≤ b_2}. Then, P_1 ⊆ P_2 if and only if there exists a matrix H ≥ 0 satisfying H F_1 = F_2 and H b_1 ≤ b_2.

The next result generalizes Lemma 1 to the case in which P_2 is defined by a linear chance constraint, which in general does not yield a polyhedral set.
Lemma 2: Let P_1 = {x : F_1 x ≤ b_1} and P_2 = {x : Pr[F_2 x ≤ b_2] ≥ p}, where the random matrix F_2 depends linearly on the elements of q, F_2 = Σ_{i=1}^{μ} q_i F_2^{i} with q_i ≥ 0. Then, P_1 ⊆ P_2 if and only if matrices H^{i} ≥ 0 exist satisfying H^{i} F_1 = F_2^{i} for i = 1, …, μ, and Pr[ Σ_{i=1}^{μ} q_i H^{i} b_1 ≤ b_2 ] ≥ p.
Proof: To prove sufficiency, suppose that H^{1}, …, H^{μ} satisfy the conditions of the lemma. Then, for any x* in P_1, we have F_1 x* ≤ b_1 and hence, since q_i ≥ 0 and H^{i} ≥ 0,

F_2 x* = Σ_{i=1}^{μ} q_i H^{i} F_1 x* ≤ Σ_{i=1}^{μ} q_i H^{i} b_1

so that Pr[F_2 x* ≤ b_2] ≥ p, i.e., x* ∈ P_2.

We prove necessity by construction. Assume P_1 ⊆ P_2 and define

μ_j = max_x { (F_2)_j x : F_1 x ≤ b_1 }    (15)

where (F_2)_j is the jth row of F_2 and μ_j is the jth element of μ. Strong duality holds for linear programs, so the dual of the linear program defining μ_j for any realization of F_2 implies

μ_j = min_h { h^T b_1 : F_1^T h = (F_2)_j^T, h ≥ 0 }.    (16)

For any i ∈ {1, …, μ}, let h^{i}_j denote the minimizing argument of the LP (16) corresponding to (F_2)_j = (F_2^{i})_j, and define H^{i} as the matrix with jth row equal to (h^{i}_j)^T. Then (16) implies that H^{i} F_1 = F_2^{i} and that μ_j equals the jth element of the vector Σ_{i=1}^{μ} q_i H^{i} b_1; therefore, (15) and P_1 ⊆ P_2 imply Pr[Σ_{i=1}^{μ} q_i H^{i} b_1 ≤ b_2] ≥ p.

We now apply Lemmas 1 and 2 to the inclusions (9), (13), and (14), which yields Propositions 3-5; analogously to Φ^(j), the matrices appearing in those inclusions are defined at each vertex q^(j) of the uncertainty set.
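The deterministic inclusion test of Lemma 1, using the row-wise LP dual (16), can be sketched as follows (hypothetical boxes as test data; scipy's `linprog` as the LP solver):

```python
import numpy as np
from scipy.optimize import linprog

def contains(F1, b1, F2, b2, tol=1e-8):
    """Check P1 = {x: F1 x <= b1} subset of P2 = {x: F2 x <= b2} via Lemma 1:
    for each row j of F2, solve  min h^T b1  s.t.  F1^T h = (F2)_j^T, h >= 0
    (the dual of max (F2)_j x over P1).  Containment holds iff every
    optimal value is at most the corresponding element of b2."""
    for j in range(F2.shape[0]):
        res = linprog(b1, A_eq=F1.T, b_eq=F2[j],
                      bounds=[(0, None)] * len(b1))
        if res.status != 0 or res.fun > b2[j] + tol:
            return False
    return True

# Unit box inside the box {x : |x_i| <= 2}, but not vice versa
F1 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b1 = np.ones(4)
F2 = F1.copy()
b2 = 2.0 * np.ones(4)
```

The stacked dual solutions form exactly the nonnegative matrix H of Lemma 1, which is how the "H" matrices of the next paragraphs are computed offline.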
To apply the set inclusions in the online MPC controller, a choice of "H" matrix is made offline, and the associated inequality constraint is included in the online optimization, which is then a sufficient condition for the associated set inclusion. To relax the online constraints, the rows of these matrices can be chosen as the solutions of the linear programs (17)-(19). When defined in this way, these matrices have several useful properties that simplify the online optimization.

Remark 2: If F̄_h appears as a block row of V, then the rows of the corresponding "H" matrix reduce to rows of a 0-1 selection matrix. This simplifies the constraints of Proposition 4, and occurs if V defines the maximal robustly invariant set for x_{k+1} = Φ x_k + w subject to F̄_h x_k ≤ 1. Proposition 4 is then necessary and sufficient for constraint satisfaction. For details of how to compute this set, see [12] or [13].
Remark 3: Each of the matrices defined by (17)-(19), including H^(j) and H_h, is sparse, with at most dim(x) nonzero entries in each row.
A terminal condition is required to ensure that the online optimization remains feasible in closed-loop. In [3], a norm-bound was used, but here we use the less conservative condition

H^(j) α_N + V w^(j) ≤ α_N, j = 1, …, ν.    (20)

This terminal constraint guarantees that states within the final tube cross section T_N remain within T_N, effectively making it a terminal set for the MPC controller.

V. QUADRATIC COST FUNCTION
Due to the additive uncertainty, it is not possible to use the cost (3) as the objective of the online MPC optimization directly, because its optimal value may not be finite. Following the method of [9], we instead define the predicted cost as

J(x_0, c) = E_0[ Σ_{k=0}^{∞} (x_k^T Q x_k + u_k^T R u_k − π) ]    (21)

where π is the steady-state expected value of the stage cost. Collecting the state and the free control moves in the augmented state ζ_k = [x_k^T c_k^T … c_{N−1}^T]^T, the predicted sequences of states and control inputs are generated by

ζ_{k+1} = Ψ ζ_k + w̄,   Ψ = [Φ  BE; 0  T],   w̄ = [w; 0]

where E and T are the matrices such that c_0 = E c and T c = [c_1^T c_2^T … c_{N−1}^T 0]^T. The cost (21) is then a quadratic function of ζ_0 whose coefficients P, v, and π solve a Lyapunov equation in expectation. Given the linear dependence of (Ψ, w̄) on q, it follows immediately that P is the solution of the Lyapunov equation

P = Q̄ + E[Ψ^T P Ψ]

and v is the solution of the linear equations

v = E[Ψ]^T v + E[Ψ^T P w̄]

where Q̄ is the weighting matrix such that x_k^T Q x_k + u_k^T R u_k = ζ_k^T Q̄ ζ_k, and Ψ^{i} and w̄^{i} are defined by replacing Φ, B, and w with Φ^{i}, B^{i}, and w^{i} in the definitions of Ψ and w̄.
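Because Ψ depends linearly on q, the expectation in the Lyapunov equation for P reduces to a finite sum, and P can be found by fixed-point iteration whenever the prediction dynamics are mean-square stable. A minimal sketch with hypothetical matrices, again assuming q_i i.i.d. with E[q_i^2] = 1/3 (uniform on [−1, 1]):

```python
import numpy as np

def stochastic_lyapunov(Psi0, Psi_dev, Qbar, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for P = Qbar + E[Psi^T P Psi] when
    Psi(t) = Psi0 + sum_i q_i(t) Psi_i with q_i zero mean, i.i.d.,
    E[q_i^2] = 1/3.  Converges under mean-square stability."""
    P = Qbar.copy()
    for _ in range(max_iter):
        Pn = Qbar + Psi0.T @ P @ Psi0
        for Pi in Psi_dev:
            Pn = Pn + (Pi.T @ P @ Pi) / 3.0    # variance contribution
        if float(np.max(np.abs(Pn - P))) < tol:
            return Pn
        P = Pn
    raise RuntimeError("iteration did not converge")

Psi0 = np.array([[0.6, 0.2], [0.0, 0.5]])      # hypothetical augmented dynamics
Psi_dev = [0.1 * np.eye(2)]
Qbar = np.eye(2)
P = stochastic_lyapunov(Psi0, Psi_dev, Qbar)
```

The linear equations for v can be solved in the same spirit, since they involve only E[Ψ] and E[Ψ^T P w̄].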

VI. CHANCE-CONSTRAINED MPC OPTIMIZATION
The constraints of Section IV can be used to construct a recursively feasible MPC law. The control applied at each time is u_0 = K x_0 + c_0, where c_0 is given by the solution of an optimization problem. This online optimization (Optimization 1) consists of minimizing the predicted cost defined in Section V subject to constraints deriving from Propositions 3-5, with the chance constraint applied only at time k = 0 and a corresponding robust constraint applied for k = 1, …, N. Note that at time k = 0, we may choose α_0 = V x_0.

VII. SAMPLE-BASED MPC ALGORITHM
This section considers implementing Optimization 1 using random samples q^[1], …, q^[n] drawn from the distribution of q at each time t. The chance constraint may be approximated by specifying that it should hold for at least n − r of the samples. For the rest of the prediction horizon, the constraint is applied robustly. To determine the r samples allowed to violate constraints, we use the following heuristic: if an approximate solution is available, then a good choice of samples to ignore are those with the greatest constraint function values.
This motivates an algorithm in which successively better approximations to the solution of Optimization 1 are generated by solving with a fixed set of samples, sorting those samples, and solving again with the most restrictive samples removed. To facilitate this, we introduce slack variables s^[l], l = 1, 2, …, n, leading to a quadratic program.
Optimization 2 minimizes the predicted cost subject to the initial and sampled constraints, the tube constraints for k = 1, 2, …, N − 1, the terminal constraints, and the slack variable conditions. The online algorithm involves solving Optimization 2 and refining the index set I; this procedure is repeated to improve the accuracy of the solution. In contrast to the approach of [8], sampled constraints are employed only for the first prediction time k = 1 rather than over the whole horizon. This may be conservative, but guarantees recursive feasibility.
Algorithm 1:
1) Draw n samples q^[1], …, q^[n] from the distribution of q.
2) Set I_l = I = {1, 2, …, n}.
3) Solve Optimization 2 using I.
4) Set m^[l] as the minimum element of s^[l] for each l.
5) Set I to contain the n − r indices l with greatest m^[l].
6) If I = I_l, stop. Otherwise set I_l = I and return to 3).
Because of the nonconvexity of Optimization 2, the stopping criterion is necessarily heuristic; in practice, Algorithm 1 converges in a few iterations, stopping when the index set I repeats.
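The solve-sort-discard structure of Algorithm 1 can be illustrated on a toy problem, with a small linear program standing in for Optimization 2 and hypothetical sampled half-planes for the constraints:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

# Toy sample-removal loop: maximize x1 + x2 subject to sampled constraints
# a_l^T x <= 1, where up to r of the n samples may be discarded.  Samples
# are ranked by their constraint value at the current solution, and the r
# most restrictive (largest values) are removed before re-solving.
n, r = 100, 5
a = 1.0 + 0.2 * rng.standard_normal((n, 2))    # hypothetical sampled rows

def solve(idx):
    res = linprog(c=[-1.0, -1.0], A_ub=a[idx], b_ub=np.ones(len(idx)),
                  bounds=[(0, 2), (0, 2)])
    return res.x

I_prev = None
I = list(range(n))                             # start with all samples
for _ in range(20):
    x = solve(I)
    order = np.argsort(a @ x)                  # least restrictive first
    I = sorted(order[: n - r].tolist())        # keep the n - r smallest
    if I == I_prev:
        break                                  # index set repeated: stop
    I_prev = I
```

As in Algorithm 1, the loop typically stabilizes after a handful of re-solves, because the discarded set stops changing once the solution settles.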

VIII. CLOSED-LOOP PROPERTIES
Due to the bounding of predictions at time k = N in the final tube cross-section T N , it is possible to give guarantees of feasibility and stability in closed-loop operation.
Theorem 7: Optimization 2 and Algorithm 1 define a recursively feasible MPC law. That is, if the optimization is feasible at time t, then it remains feasible at all subsequent times in closed-loop.
Proof: Let (c_0, …, c_{N−1}), (α_0, …, α_N) denote a feasible point at any given time t. We will show that (c_1, …, c_{N−1}, 0) and (α_1, …, α_N, α_N) are feasible at time t + 1 with I = {1, 2, …, n}. For the terminal constraints, we observe that at time t + 1 these are identical to the terminal constraints from time t, and therefore are satisfied. In a similar manner, the tube constraints are trivially satisfied for k = 1, …, N − 2 because they are identical to constraints used at time t. For k = N − 1, they are identical to the terminal conditions used at time t, noting that our chosen "c" is zero, and are also satisfied. The remaining constraints are the initial constraints, the sampled constraints, and the slack variable conditions. If I = {1, 2, …, n}, then |I| ≥ n − r is satisfied and we may eliminate the slack variables, leaving the initial and sampled constraints, where the sampled constraint applies for all l = 1, …, n. Because the initial state x_0 at time t + 1 necessarily lies within the tube cross section T_1 from time t, we have V x_0 ≤ α_1. Noting that all elements of the "H" matrices are nonnegative, comparing the first two constraints with the k = 1 tube constraints from time t leads immediately to the conclusion that they are satisfied, since H V x_0 ≤ H α_1 where H is either H^(j) or H_h. Finally, the sampled constraint is satisfied for all l because the corresponding constraint held robustly at time t for every q ∈ Co{q^(1), …, q^(ν)}, and each sample satisfies q^[l] ∈ Co{q^(1), …, q^(ν)} by Assumption 2.
Theorem 8: The system (1) under the control law u(t) = K x(t) + c_0, where c_0 is the first element of c defined by Optimization 2 and Algorithm 1, is stable in the sense that

lim_{τ→∞} (1/τ) Σ_{t=0}^{τ−1} E_0[L(t)] ≤ π    (25)

where L(t) = x(t)^T Q x(t) + u(t)^T R u(t), and x(t), u(t) are the state and input of the closed-loop system at time t.

Proof: Let V(x(t)) denote the predicted cost (21) evaluated at the solution of Optimization 2 at time t. Then the feasibility and the suboptimality of the shifted sequences Tc and (α_1, …, α_N, α_N) at time t + 1 imply

E_t[V(x(t + 1))] ≤ V(x(t)) − L(t) + π.

Taking expectations and summing each side of this inequality over t = 0, 1, …, τ − 1, we then obtain

E_0[V(x(τ))] − V(x(0)) ≤ − Σ_{t=0}^{τ−1} E_0[L(t)] + τ π.

However, by Proposition 6 and (23), V(x) is necessarily bounded from below for all feasible x, so that lim inf_{τ→∞} (1/τ) E_0[V(x(τ))] ≥ 0, and the bound given by (25) therefore follows when passing to the limit τ → ∞.
We now appeal to the "sampling and discarding" theory of random convex programs found in [14] and [15] to show that the chance constraint (5) holds with high confidence in closed-loop, in the sense that we are unlikely to draw samples q^[l] such that (5) does not hold. To state the result, let F_{n,s}(p) denote the binomial probability of s or fewer successes of probability p in n trials. We also use the notion of a support constraint, which is a sampled constraint for which the optimal cost decreases if it is removed [15, Definition 2.1].

Fig. 2. Closed-loop trajectories for robust MPC [4] and stochastic MPC (Algorithm 1) for 50 realizations of {q(0), q(1), …}.

TABLE I: ROBUST AND STOCHASTIC MPC WITH SAMPLE REMOVAL

IX. NUMERICAL EXAMPLE

The controller was implemented with prediction horizon N = 4, taking n = 250 samples of q at each time. The feedback gain K was chosen to be LQ-optimal for the pair (A_0, B_0), K = [1.31 0.97], and V was constructed by finding the maximal robustly invariant set respecting the constraint [−0.5 1] x ≤ 1, which is defined by 22 linear inequalities. For this example, dim(u) = 1 and Theorem 9 can be applied to give the relationship between n, r, and the confidence level. This is shown in Fig. 1 for p = 0.9. In this case, a confidence value of 99% mandated that r = 14 samples could violate the sampled constraint. Simulations were carried out in MATLAB, using Yalmip [16] and the QP solver Gurobi. The resulting state-space trajectories are shown in Fig. 2.
Table I compares the closed-loop properties of the robust and stochastic implementations for 500 realizations of model uncertainty, as well as comparing with the "scenario" approach of [8] when using the same heuristic for sample removal. The mean closed-loop cost is 15% lower in the stochastic case than in the robust case. The proportion of trajectories satisfying constraints (95.4%) implies conservatism in the stochastic MPC with respect to the probabilistic constraint, which requires [−0.5 1] x(t + 1) ≤ 1 to be satisfied for 90% of uncertainty realizations. However, this is expected from the confidence value of 99%, which requires 1 − r/n = 0.944 for n = 250. As more samples are used, 1 − r/n approaches 0.9 asymptotically. For a given n, Algorithm 1 appears slightly more conservative than the scenario MPC of [8] in terms of mean closed-loop cost, but unlike scenario MPC, Algorithm 1 guarantees feasibility in closed-loop.
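The trade-off between n, r, and the confidence level can be explored numerically using the binomial distribution F_{n,s}(p) defined in Section VIII. The sketch below uses a plain binomial tail condition as a stand-in; the exact relation of Theorem 9 additionally involves the number of support constraints, so the numbers produced here are only indicative:

```python
from math import comb

def binom_cdf(s, n, p):
    """F_{n,s}(p): probability of s or fewer successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(s + 1))

def removal_budget(n, p, beta):
    """Largest r with F_{n,r}(1 - p) <= beta: a simple way of choosing how
    many of n samples may be discarded so that, with confidence about
    1 - beta, the result is consistent with violation probability <= 1 - p.
    (Illustrative only; Theorem 9 gives the exact relation.)"""
    r = 0
    while binom_cdf(r + 1, n, 1.0 - p) <= beta:
        r += 1
    return r

r = removal_budget(250, 0.9, 0.01)
```

For n = 250, p = 0.9, and 99% confidence, this kind of calculation yields a removal budget far below the expected 25 violations out of 250 samples, which explains the conservatism observed in Table I (1 − r/n = 0.944 rather than 0.9).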
It is also of interest to compare with the case in which sampling is used but no samples are removed (r = 0). Applying the chance constraint with p = 0.9 and 99% confidence then requires n = 44 samples. Table II shows that Algorithm 1 gives a 12% improvement in closed-loop cost over robust MPC while requiring similar computation time. It is notable that the scenario MPC of [8] has a lower computation time than Algorithm 1 when n = 44, but a higher computation time when n = 250. The reason is that the number of constraints in the optimization is O(nN) for the scenario MPC of [8], but is O(n + ν(N − 1)) with ν > 44 in Optimization 2, as sampled constraints are only applied at the first prediction time.
Using robust MPC in this example leads to conservatism and a significant increase in closed-loop cost compared to the stochastic MPC algorithms. This is a result of having several independent sources of uncertainty: although each is uniformly distributed, the one-step-ahead probability density function resulting from them falls quickly to a small value as the distance from the mode increases. Fig. 3 shows the "normalized cost", given by dividing the mean stochastic MPC cost by the mean robust MPC cost, against the proportion of active samples. The improvement is significant even when 1 − r/n = 0.98.


TABLE II: ROBUST AND STOCHASTIC MPC WITHOUT SAMPLE REMOVAL