A Distributionally Robust Optimization Approach to Two-Sided Chance-Constrained Stochastic Model Predictive Control With Unknown Noise Distribution

In this work, we propose a distributionally robust stochastic model predictive control (DR-SMPC) algorithm to address the problem of discrete-time linear systems subject to multiple two-sided chance constraints and corrupted by additive noise. The prevalent mechanism for coping with two-sided chance constraints is the so-called risk allocation approach, which conservatively approximates each two-sided chance constraint with two one-sided chance constraints by applying Boole's inequality. In the proposed DR-SMPC framework, an exact second-order cone approach is adopted to reformulate the multiple two-sided chance constraints by considering only the first and second moments of the noise. With the proposed DR-SMPC algorithm, the worst-case probability of violating the safety constraints is guaranteed to stay within a prespecified maximum value. By flexibly adjusting this prespecified maximum probability, the feasible region of the initial state of the SMPC problem can be enlarged. The recursive feasibility and convergence of the proposed DR-SMPC algorithm are rigorously established by introducing a binary initialization strategy for the nominal state. A simulation study of a single-spring, double-mass system demonstrates the effectiveness of the proposed DR-SMPC algorithm.


Introduction
Conventional control methods (e.g., the linear quadratic regulator and dynamic programming) have been extensively investigated for discrete-time linear systems. However, these techniques cannot handle constraints on the system state or the control input, which become increasingly important with the intensifying pace toward an era of safe autonomy. To address this challenge, model predictive control (MPC) has emerged as a promising and efficient solution that solves a finite-horizon constrained optimal control problem at each sampling time to determine a finite sequence of control actions. MPC has attracted considerable attention from both industry and academia over the last couple of decades (Rawlings & Mayne, 2009).
In almost all practical applications, the behavior of the system is affected by various uncertainties, e.g., unknown parameters, external disturbances, and process noise. In the presence of uncertainties, the controller may fail to guarantee safe operation or to meet quality specifications. If the bound of the uncertainties can be quantified or is known a priori, deterministic robust model predictive control (RMPC) approaches can be used to address these otherwise intractable uncertainties (Mayne, Seron & Rakovic, 2005; Magni & Scattolini, 2013). It should be highlighted that in the context of RMPC, robust constraint satisfaction, recursive feasibility, and stability are all established in a conservative manner by solving a min-max optimization problem, that is, minimizing the cost while accounting for the maximum possible (worst-case) impact of the uncertainties. Furthermore, the inflation of the uncertainty quantification set through the system dynamics, safety constraints, optimization, and control loops may result in a very small feasible region of initial states, or even an infeasible optimization problem.
To mitigate these drawbacks, stochastic model predictive control (SMPC) offers a promising way to reduce the inherent conservativeness of RMPC by formulating chance constraints that allow the constraints to be violated with a prespecified maximum probability (PsMP) (Mesbah, 2016; Farina, Giulioni & Scattolini, 2016; Korda & Cigler, 2012). SMPC provides an adequate way to trade off closed-loop control performance against constraint satisfaction (Mesbah, Streif, Findeisen & Braatz, 2014). The value of chance-constraint formulations can be seen in many practical applications, e.g., concentration control in a chemical reactor (Farina, Giulioni & Scattolini, 2016) or comfort requirements in robot obstacle avoidance (Jha, Raman, Sadigh & Seshia, 2018). As a byproduct of chance constraints, the feasible set of the initial state can be enlarged under SMPC without changing the prediction horizon.
In recent years, substantial progress has been made on the SMPC problem. Broadly speaking, SMPC approaches can be classified into two categories (Farina, Giulioni & Scattolini, 2016). The first category comprises randomized approaches (Schildbach, Calafiore, Fagiano & Morari, 2012; Hewing & Zeilinger, 2020; Schildbach, Fagiano, Frei & Morari, 2014; Shang & You, 2019; Mark & Liu, 2020), which use samples/scenarios of the noise to approximate the SMPC problem. Probabilistic closed-loop guarantees can be established using scenario optimization tools (Hewing & Zeilinger, 2020). However, these approaches require a large number of samples and therefore considerable computation time. Furthermore, it is generally quite hard to rigorously establish their recursive feasibility and closed-loop stability (Mark & Liu, 2020).
The second category comprises analytical approximation approaches (Korda & Cigler, 2012; Vinod, Sivaramakrishnan & Oishi, 2019; Farina, Giulioni, Magni & Scattolini, 2013, 2015; Hewing, Wabersich & Zeilinger, 2020). In this setting, if prior distributional information about the noise is available, the inverse of the cumulative distribution function can be used to reformulate the chance constraints (Korda & Cigler, 2012; Vinod, Sivaramakrishnan & Oishi, 2019). However, the true probability distribution of the process noise is usually difficult to know in practice, which may result in infeasibility or undesirable closed-loop behavior under SMPC. A milestone in this area (Farina, Giulioni, Magni & Scattolini, 2013, 2015) is the use of Cantelli's inequality to approximate the chance constraints using only the mean and variance of the noise. Although the computational burden of SMPC is thereby greatly reduced, this approach can only deal with one-sided chance constraints. In practical applications, however, two-sided constraints are pervasive; e.g., the acceleration of a vehicle (Lorenzen, Dabbene, Tempo & Allgöwer, 2017) or the temperature change of a heated room in a building (Shang & You, 2019) can be either positive or negative. Consequently, it is of great practical significance to investigate the two-sided chance-constrained SMPC problem. One feasible way to deal with two-sided chance constraints is to develop risk allocation mechanisms, that is, to decompose each constraint into a set of one-sided constraints that conservatively approximate the original chance constraint.
Risk allocation mechanisms mainly follow two strategies. The first usually applies a uniform allocation to obtain a fixed violation probability for each one-sided constraint (Nemirovski & Shapiro, 2006). This simplifies the optimization problem; however, it may lead to significant conservatism in many situations because the risk is not allocated actively. To address this limitation, the second strategy treats the risk allocation as a decision variable of the optimization problem (Paulson, Buehler, Braatz & Mesbah, 2020; Blackmore & Ono, 2009). Either way, risk allocation mechanisms based on Boole's inequality essentially narrow the feasible set of states in the SMPC scheme.
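As a concrete illustration of this conservatism, the sketch below (with illustrative numbers, not values from this paper) checks a scalar Gaussian example in which the exact two-sided requirement holds, yet the uniform eps/2 split obtained via Boole's inequality rejects the point:

```python
import math

def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Constraint P(l <= X <= u) >= 1 - eps for X ~ N(mu, 1); the numbers are
# illustrative, not taken from the paper.
l, u, eps, mu = -2.0, 2.0, 0.10, 0.4

p_lower = phi(l - mu)          # P(X < l)
p_upper = 1.0 - phi(u - mu)    # P(X > u)

exact_ok = (p_lower + p_upper) <= eps                    # true two-sided requirement
uniform_ok = (p_lower <= eps / 2) and (p_upper <= eps / 2)  # uniform risk split

print(exact_ok, uniform_ok)  # True False: the split rejects a feasible point
```

Because the two tail events are disjoint, the union bound itself is tight here; the conservatism comes from fixing each one-sided budget at eps/2 regardless of where the mean sits.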
To overcome the conservativeness of existing SMPC approaches in handling two-sided chance constraints, a distributionally robust optimization (DRO) approach is proposed to solve the two-sided chance-constrained SMPC problem. Distributionally robust chance constraints (DRC2s) are directly connected to the chance constraints incorporated in the classical SMPC paradigms (Rahimian & Mehrotra, 2019). DRC2s assume that the actual noise distribution belongs to an ambiguity set containing all distributions with predefined characteristics (e.g., given first and second moments of the noise). Given these benefits, DRC2s are considered in this work, where a less conservative approximation of DRC2s is developed using a second-order cone (SOC) reformulation. Consequently, the proposed distributionally robust SMPC (DR-SMPC) problem can be reformulated as a convex optimization problem. Since assigning the measured state to the nominal state may render the optimization problem infeasible, a binary initialization strategy is further developed to determine the nominal state, guaranteeing recursive feasibility and convergence of the proposed method.
The differences between this paper and other related DRO works (e.g., Mark & Liu (2020); Zhang, Shen & Mathieu (2017); Li, Tan, Wu & Duan (2021)) are summarized as follows. In Mark & Liu (2020) and Li, Tan, Wu & Duan (2021), DRC2s are first introduced into the SMPC problem, but only one-sided chance constraints are reformulated into computable expressions, which cannot handle two-sided chance constraints. In Zhang, Shen & Mathieu (2017), an SOC programming reformulation of general two-sided DRC2s is proposed to solve an optimal power flow problem, which is a one-shot optimization problem rather than a receding-horizon optimization paradigm. In contrast, this paper proposes a new mechanism to address the two-sided distributionally robust chance-constrained SMPC problem. Furthermore, a specific form of terminal state constraint is imposed, and recursive feasibility and convergence are established. The main contributions are threefold, as summarized below.
1. The proposed approach delivers a tractable conic optimization solution to handle two-sided chance constraints by using an ambiguity set characterized by the first and second moments of the noise.
2. A less conservative approach based on SOC is developed to reformulate the DRC2s, providing a larger initial feasible set compared with existing risk allocation approaches.
3. Recursive feasibility and convergence are rigorously established by introducing a binary initialization strategy to determine the nominal state.
The rest of this paper is organized as follows. We state the two-sided chance-constrained SMPC problem in Section II. Section III shows that DR-SMPC-P1 can be converted into a convex SOC program. The recursive feasibility and convergence of the algorithm are proved in Section IV. Simulation results are reported in Section V to demonstrate the effectiveness of the proposed DR-SMPC algorithm. Finally, we conclude the paper in Section VI.
2 Problem Statement

Stochastic System
We consider a discrete-time linear system with additive stochastic noise

x_{k+1} = A x_k + B u_k + w_k, (1)

where w_k is an unknown stochastic noise with known mean µ = 0 and covariance matrix W ≻ 0. The known pair (A, B) is assumed to be stabilizable.
Given the state x_k at time step k, the predicted state is updated according to

x_{l+1|k} = A x_{l|k} + B u_{l|k} + w_{l|k}, x_{0|k} = x_k, (2)

where the subscript l|k (l ∈ N_0) denotes the prediction l steps ahead at time step k.
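The prediction recursion above can be sketched in a few lines; the 2-state pair (A, B) and the noise level below are assumptions chosen purely for illustration, not values from this paper:

```python
import random

# Illustrative 2-state system; A, B, and the noise level are assumptions.
A = [[1.0, 0.1],
     [0.0, 1.0]]
B = [0.005, 0.1]

def step(x, u, w):
    # x_{l+1|k} = A x_{l|k} + B u_{l|k} + w_{l|k}
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u + w[0],
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u + w[1]]

random.seed(0)
x = [1.0, 0.0]                     # x_{0|k} = x_k
for l in range(5):                 # propagate 5 steps ahead
    w = [random.gauss(0.0, 0.1) for _ in range(2)]  # zero-mean noise w_{l|k}
    x = step(x, -0.5 * x[1], w)    # any candidate input u_{l|k}
print([round(v, 3) for v in x])
```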

Two-sided DRC2s
The state and input are subject to the chance constraints

Pr(x_{l|k} ∈ X) ≥ 1 − p_x, (3)
Pr(u_{l|k} ∈ U) ≥ 1 − p_u, (4)

where X = {x : Hx ≤ h} and U = {u : Fu ≤ f} are sets containing the origin in their interiors, with constant matrices H ∈ R^{p×n_x}, F ∈ R^{q×n_u}, constant vectors h ∈ R^p, f ∈ R^q, and p_x, p_u ∈ (0, 1] the PsMPs with which the constraints x_{l|k} ∈ X and u_{l|k} ∈ U are allowed to be violated. Chance constraints are also known as value-at-risk (VaR) constraints, which in general define non-convex feasible sets in the SMPC optimization problem. In most SMPC approaches, the chance constraints are reformulated into deterministic ones under the common assumption that the probability distribution P of the stochastic noise w_k is known. However, this does not match real environments, where the true distribution P is unknown. To address this issue, a distributionally robust version of (3) and (4), as in Mark & Liu (2020) and Li, Tan, Wu & Duan (2021), is introduced with an ambiguity set P = {P : E_P[w_k] = 0, E_P[w_k w_k^T] = W}, where P ranges over the set of all probability distributions and E_P[•] denotes the expectation under the distribution P.
Different from existing SMPC approaches, the chance constraints (3) and (4) are investigated in the DRO framework.
The idea of DRO is to optimize against the 'worst-case' distribution among all possible distributions in P, as shown in (5) and (6).
In practical applications, most constraints are two-sided (Shang & You, 2019; Lorenzen, Dabbene, Tempo & Allgöwer, 2017). Therefore, it is of practical significance to study two-sided chance constraints in the context of SMPC. The two-sided DRC2s are defined as

inf_{P∈P} Pr(|a^T x_{l|k}| ≤ b) ≥ 1 − p_x, (8)
inf_{P∈P} Pr(|c^T u_{l|k}| ≤ d) ≥ 1 − p_u, (9)

with constant vectors a ∈ R^{n_x}, c ∈ R^{n_u} and constants b ∈ R, d ∈ R. Although the expressions (8)-(9) look more involved, we will show that they can be readily reformulated into deterministic constraints.
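The need for the worst-case formulation can be illustrated numerically: two noise distributions with identical mean and variance can satisfy the same two-sided constraint with different probabilities. The Monte Carlo sketch below uses illustrative numbers only (the vectors, bound, and noise level are assumptions, not values from the paper):

```python
import math
import random

random.seed(1)

def laplace(scale):
    # inverse-CDF sampling of a zero-mean Laplace variate
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

# Two-sided constraint |a^T x| <= b with x = x_nom + w; illustrative numbers.
a, x_nom, b, n = [1.0, -0.5], [0.2, 0.1], 1.0, 100_000
sigma = 0.4                           # common standard deviation
lap_scale = sigma / math.sqrt(2.0)    # Laplace with the same variance sigma^2

def satisfied(sampler):
    hits = 0
    for _ in range(n):
        w = [sampler(), sampler()]
        y = a[0] * (x_nom[0] + w[0]) + a[1] * (x_nom[1] + w[1])
        hits += (abs(y) <= b)
    return hits / n

p_gauss = satisfied(lambda: random.gauss(0.0, sigma))
p_lap = satisfied(lambda: laplace(lap_scale))
# Same mean and variance, different satisfaction probabilities: a DRO
# constraint must hold for the worst distribution in the ambiguity set.
print(round(p_gauss, 3), round(p_lap, 3))
```

The heavier-tailed Laplace noise violates the band more often, which is exactly the gap the inf over the ambiguity set guards against.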

Cost Function
The objective function is defined as a sum of quadratic stage costs plus a terminal cost,

J(x_k, u_k) = E_P[ Σ_{l=0}^{N−1} (x_{l|k}^T Q x_{l|k} + u_{l|k}^T R u_{l|k}) + x_{N|k}^T S x_{N|k} ],

where Q ∈ R^{n_x×n_x} and R ∈ R^{n_u×n_u} are two known positive definite weighting matrices, and the terminal weight matrix S ∈ R^{n_x×n_x} satisfies the following assumption.
Assumption 1 The terminal weight matrix S is chosen as the solution of the following Lyapunov equation

(A + BK)^T S (A + BK) − S + Q + K^T R K = 0,

where K is a feedback gain to be computed.
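For intuition, a Lyapunov equation of this form can be solved by fixed-point iteration whenever the closed loop is stable. The scalar sketch below uses illustrative numbers (not values from the paper) and checks the iterate against the closed form:

```python
# Scalar sketch of Assumption 1: with closed-loop a_cl = a + b_*k stable
# (|a_cl| < 1), iterate S <- a_cl*S*a_cl + q + k*r*k until convergence.
a, b_, k = 1.1, 1.0, -0.6     # open-loop gain, input gain, feedback gain
q, r = 1.0, 0.5
a_cl = a + b_ * k             # 0.5, stable

S = 0.0
for _ in range(200):
    S = a_cl * S * a_cl + q + k * r * k

# Closed form in the scalar case: S = (q + r*k^2) / (1 - a_cl^2)
S_exact = (q + r * k * k) / (1.0 - a_cl * a_cl)
print(abs(S - S_exact) < 1e-9)  # True
```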

Optimization Problem
Now we formally state the DR-SMPC optimization problem with two-sided DRC2s over a prediction horizon N, referred to as DR-SMPC-P1: minimize the cost (12) subject to the dynamics (13), the two-sided DRC2s (14)-(15), and the initialization

x_{0|k} = x_k. (16)

Unfortunately, it can be observed from (12)-(16) that DR-SMPC-P1 contains several sources of intractability: (i) the expectation in (12) is taken with respect to an unknown probability measure; (ii) the two-sided DRC2s (14)-(15) are generally intractable and nonconvex. In the next section, we develop a computational method that addresses these challenges by reformulating DR-SMPC-P1 into a conic optimization problem, which is computationally tractable.

DR-SMPC Algorithm
In this section, we reformulate DR-SMPC-P1 into a computationally tractable conic optimization problem. To this end, we first introduce a state feedback structure.

Feedback Structure
We denote the predicted nominal state and input by x̄_{l|k} and ū_{l|k}; the nominal dynamics model of (2) is expressed as

x̄_{l+1|k} = A x̄_{l|k} + B ū_{l|k}. (17)

To obtain a computable form of the cost function and of the two-sided DRC2s, a state feedback control law (Mayne, Seron & Rakovic, 2005; Hewing, Wabersich & Zeilinger, 2020) is designed as

u_{l|k} = K(x_{l|k} − x̄_{l|k}) + ū_{l|k}, (18)

where K is a selected feedback gain and ū_{l|k} replaces u_{l|k} as the new decision variable in DR-SMPC-P1.
The stochastic error between the real state x_k and the nominal state x̄_k at time k is denoted by ∆x_k = x_k − x̄_k. Based on (2), (17), and (18), the state error dynamics are

∆x_{l+1|k} = (A + BK)∆x_{l|k} + w_{l|k}.

At time step k = 0, a proper initialization x̄_0 = x_0 is used; recalling that the noise is zero-mean, the expected value of the stochastic error is E_P[∆x_k] = 0, and the predicted covariance matrix Σ_{l|k} is updated as

Σ_{l+1|k} = (A + BK)Σ_{l|k}(A + BK)^T + W,

with Σ_{0|k} = 0 under this initialization. The equivalent forms of the two-sided DRC2s are obtained by substituting x_{l|k} = x̄_{l|k} + ∆x_{l|k} and u_{l|k} = ū_{l|k} + K∆x_{l|k} into (14)-(15).
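The covariance recursion above can be sketched numerically; the scalar example below (with illustrative values, not values from the paper) shows its monotone convergence to the steady-state Lyapunov solution that is used later for the terminal set:

```python
# Sketch of the covariance recursion for the error dynamics
# dx_{l+1|k} = (A+BK) dx_{l|k} + w_{l|k}:
#   Sigma_{l+1|k} = (A+BK) Sigma_{l|k} (A+BK)^T + W, Sigma_{0|k} = 0.
# Scalar values are illustrative.
a_cl, W = 0.5, 0.07          # stable closed-loop gain, noise variance

sigma = 0.0                  # Sigma_{0|k} = 0 (exact initialization)
history = []
for _ in range(60):
    sigma = a_cl * sigma * a_cl + W
    history.append(sigma)

# The recursion increases monotonically toward the steady-state solution of
# the Lyapunov equation: sigma_bar = W / (1 - a_cl^2) in the scalar case.
sigma_bar = W / (1.0 - a_cl ** 2)
print(history[0] < history[-1], abs(history[-1] - sigma_bar) < 1e-9)  # True True
```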

Convex Formulation of DR-SMPC-P1
Based on the results above, we show that DR-SMPC-P1 can be represented as a convex conic program and hence is computationally tractable.
Theorem 1 DR-SMPC-P1 can be exactly reformulated as the following conic optimization problem, referred to as DR-SMPC (25), where ū_k = (ū_{0|k}, ū_{1|k}, ..., ū_{N−1|k}), y^{(1,l)}, y^{(2,l)}, λ^{(1,l)}, and λ^{(2,l)} are newly introduced decision variables and X̄_f is a terminal constraint set that guarantees recursive feasibility and convergence of the proposed approach. Note that DR-SMPC is now a conic optimization problem, which is computationally tractable and can be solved using standard software packages such as CVX (Boyd & Vandenberghe, 2004). The optimization is carried out over the nominal control sequence; the first element ū*_k of the optimal sequence ū*_k is applied through the feedback control law u*_k = K(x_k − x̄_k) + ū*_k. This is crucial for solving the two-sided chance-constrained control problem. However, determining the terminal set, as well as the feasibility set of nominal initial states, is rather difficult yet particularly important for establishing recursive feasibility and convergence of the proposed approach.

Determination of Terminal Constraint
It is well known that the terminal constraint is closely related to the stability of the nominal system. Thus, a terminal constraint x̄_{N|k} ∈ X̄_f is imposed, where X̄_f is a positively invariant set. To achieve the recursive feasibility of the proposed approach in Section IV, the terminal set is chosen with extra variables y and λ, and Σ is the steady-state solution of the Lyapunov equation (20), i.e.,

Σ = (A + BK)Σ(A + BK)^T + W.

To facilitate the proof of recursive feasibility, we explicitly state the necessary technical assumption.
It can be observed from (20) and (33) that Σ_{l|k} ⪯ Σ; therefore, X̄_f ⊆ X̄_{l|k} must hold. According to Assumption 2, there must exist ū_{N|k} = K x̄_{N|k} ∈ Ū_{N|k} such that a^T(A + BK)x̄_{N|k} ≤ a^T x̄_{N|k} ≤ y + λ. The chosen terminal set X̄_f thus satisfies the conditions (30) and (31).

Binary Initialization Strategy
The initial condition x̄_k is clearly critical to the performance index. At each step, the most recent information available on the real state should be used to reset the nominal state x̄_k; specifically, the "optimal" choice is to set x̄_k = x_k. However, since the possibility of unbounded noise cannot be ruled out, the choice x̄_k = x_k may render the optimization problem infeasible, and the basic property of recursive feasibility would be lost. Therefore, the following binary initialization strategy is defined to guarantee recursive feasibility.
Strategy 1 - Use the most recent measured state at time step k, i.e., x̄_k = x_k.
Strategy 2 - Use the one-step-ahead prediction from the past optimal solution, i.e., x̄_k = x̄_{k|k−1}.
The binary initialization strategy requires solving two optimization problems at each time step, i.e., DR-SMPC with Strategy 1 and DR-SMPC with Strategy 2. The selection rule is as follows: DR-SMPC with Strategy 1 is solved first; if it is infeasible, DR-SMPC with Strategy 2 is executed. If it is feasible, the optimal costs of DR-SMPC under Strategy 1 and Strategy 2 are compared, and if the optimal cost of Strategy 1 is higher, the Strategy 2 solution is used. Although the binary initialization strategy does not guarantee optimality, it preserves the convergence properties of the proposed method.
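The selection rule can be sketched as follows; `solve_dr_smpc` is a hypothetical stand-in for the conic solver (not part of the paper), stubbed here only to exercise the decision logic:

```python
# Sketch of the binary initialization rule. `solve_dr_smpc` is a stand-in
# for the conic solver; the stubbed feasibility/cost values are illustrative.
def solve_dr_smpc(init):
    # returns (feasible, optimal_cost); Strategy 2 is the shifted solution,
    # which the recursive-feasibility result guarantees to remain feasible.
    table = {"measured": (True, 4.2), "shifted": (True, 3.7)}
    return table[init]

def choose_initialization():
    feas1, cost1 = solve_dr_smpc("measured")   # Strategy 1: x_bar_k = x_k
    if not feas1:
        return "Strategy 2"                    # fall back to x_bar_{k|k-1}
    feas2, cost2 = solve_dr_smpc("shifted")    # Strategy 2: shifted solution
    if feas2 and cost2 < cost1:
        return "Strategy 2"                    # Strategy 1 cost is higher
    return "Strategy 1"

print(choose_initialization())  # with the stubbed costs: Strategy 2
```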

DR-SMPC Algorithm
Based on Theorem 1 and the binary initialization strategy for the nominal state, the implementation of the DR-SMPC algorithm is summarized in Algorithm 1 below.
Algorithm 1 DR-SMPC
Off-line: Compute the feedback gain K and the terminal weight S according to Assumption 1 and Assumption 2.
On-line:
while the control task is not finished do
  Solve DR-SMPC with Strategy 1;
  if DR-SMPC with Strategy 1 is infeasible then
    Solve DR-SMPC with Strategy 2;
  else
    Compare the optimal cost between DR-SMPC with Strategy 1 and DR-SMPC with Strategy 2;
    if the optimal cost of DR-SMPC with Strategy 1 is higher then
      Adopt the solution of DR-SMPC with Strategy 2;
    end if
  end if
  Apply the control input u_k = K(x_k − x̄_k) + ū*_k;
  Update k = k + 1;
end while

Recursive Feasibility and Convergence
As an important property of SMPC, recursive feasibility is a precondition for the convergence of the stochastic dynamic system and greatly impacts the implementation of the controller.

Recursive Feasibility
The following theorem (Theorem 2, stated in the appendix material) plays an important role in establishing the recursive feasibility of the proposed DR-SMPC algorithm.

Convergence
When the optimal cost of DR-SMPC with Strategy 2 is lower, DR-SMPC with Strategy 2 is used. Therefore, we state the main result concerning the convergence properties of the algorithm in the following theorem.
Theorem 3 Let J(x_k, ū*_k) be the optimal cost of DR-SMPC with Strategy 2 at time step k. PROOF See Appendix C.
According to Theorem 3 and arguments similar to those in Lorenzen, Dabbene, Tempo & Allgöwer (2017) and Li, Tan, Wu & Duan (2021), a standard argument shows that the state of the system is driven to a neighborhood of the steady-state condition. Then, the DR-SMPC algorithm is convergent for system (1) under the control law (18).

Simulation Example
In this section, two practical control systems under different types of noise are used to test the performance of the proposed DR-SMPC algorithm. Specifically, the noise follows a Gaussian distribution in a Buck-Boost DC-DC power converter. To further show the superiority of the proposed algorithm, we consider a two-mass spring system in which the noise follows a Laplace distribution in the second example. The proposed DR-SMPC algorithm is compared with the G-SMPC algorithm in Korda & Cigler (2012) and the P-SMPC algorithm in Paulson, Buehler, Braatz & Mesbah (2020).

A Buck-Boost DC-DC Power Converter
In this example, an ideal Buck-Boost circuit is considered (Lorenzen, Dabbene, Tempo & Allgöwer, 2017). It is well known that the inductor current exceeding its rated value with a small probability will not damage the device over its life cycle. Similarly, it is also practical to express the voltage constraint of the load resistance as a chance constraint.
The chance constraints on the state and input, as well as the prediction horizon, are chosen accordingly. The control objective of this example is to design an MPC control law that regulates the state to a neighborhood of the origin in the presence of the noise. In Fig. 1, we compare the feasible sets obtained with the proposed DR-SMPC algorithm, the G-SMPC algorithm, and the P-SMPC algorithm. The feasible set of the proposed DR-SMPC algorithm is 1.15 times the size of that of the P-SMPC algorithm; owing to its reformulation of the two-sided chance constraints, the P-SMPC algorithm is more conservative than the DR-SMPC algorithm. On the other hand, the feasible set of the G-SMPC algorithm is larger than those of the DR-SMPC and P-SMPC algorithms, since the G-SMPC algorithm exploits the full distribution information of the noise.

A two-mass spring system
A single-spring, double-mass system (Shang & You, 2019) is illustrated in Fig. 2. The two blocks with masses m_1 and m_2 are linked by a spring of stiffness k_s. The manipulated variable is the applied force u, and the output variables are the positions and velocities of the blocks: x_1, x_3 for block one and x_2, x_4 for block two. w_1 and w_2 are external uncertainties acting on m_1 and m_2, respectively. Applying Newton's law to the first and second blocks and defining the state x = [x_1, x_2, x_3, x_4] and noise w = [w_1, w_2], the discrete state space model is obtained by Euler's approximation with sampling time T_s, where the block masses are m_1 = 1 and m_2 = 1, the elastic constant is k_s = 1.25, and the sampling time is T_s = 0.1. At each step, w_k follows a Laplace distribution with zero mean and variance 0.07 I_2. The quadratic cost matrices Q = diag(1, 1, 4, 6) and R = 1 reflect the relative expense of actuation. The initial condition is chosen as x_0 = [0.5, 0.5, 0, 0]^T, the desired state is the origin, and the prediction horizon is N = 7. To keep the state steady, constraints on the state and input are imposed. The G-SMPC algorithm, the P-SMPC algorithm with fixed uniform risk allocation, and the proposed DR-SMPC algorithm are compared based on a Monte Carlo simulation of 1000 runs in Figs. 3-5. The constraint violation results are provided in Fig. 6: out of 1000 sample trajectories, the largest numbers of trajectories violating the constraints for the three algorithms are 462, 149, and 28, respectively. The constraint violation probability of the P-SMPC algorithm is 2.8%, because each two-sided chance constraint is approximately broken into two one-sided chance constraints, which leads to conservatism. By contrast, the violation probability of the G-SMPC algorithm exceeds the prescribed value of 20%, since the deviation of the assumed distribution from the true one leads to unwanted system behavior. Consequently, we conclude that the P-SMPC algorithm is more conservative than DR-SMPC in the reformulation of two-sided chance constraints, while for the G-SMPC algorithm the Gaussian noise assumption may not match the real scenario, leading to the opposite result.
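With the paper's parameters (m_1 = m_2 = 1, k_s = 1.25, T_s = 0.1), the Euler-discretized model can be constructed as below. The assumption that the force u acts on the first mass is ours, made for illustration, since the text does not specify which block is actuated:

```python
# Euler discretization of the two-mass spring model. State order is
# x = [x1, x2, x3, x4]: positions first, velocities second. The input
# assignment to the first mass is an assumption.
m1, m2, ks, Ts = 1.0, 1.0, 1.25, 0.1

# Continuous-time: x1' = x3, x2' = x4,
# m1*x3' = -ks*(x1 - x2) + u,  m2*x4' = ks*(x1 - x2)
Ac = [[0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0],
      [-ks / m1, ks / m1, 0.0, 0.0],
      [ks / m2, -ks / m2, 0.0, 0.0]]
Bc = [0.0, 0.0, 1.0 / m1, 0.0]

# Forward Euler: A = I + Ts*Ac, B = Ts*Bc
A = [[(1.0 if i == j else 0.0) + Ts * Ac[i][j] for j in range(4)]
     for i in range(4)]
B = [Ts * bi for bi in Bc]

print(A[2][0], B[2])  # spring coupling Ts*(-ks/m1) and input gain Ts/m1
```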

Conclusions
This paper advocates a DRO approach to SMPC under unbounded stochastic noise following an unknown distribution.
A conic representation of the two-sided DRC2s is obtained, which yields a tractable formulation of the SMPC problem, based on which recursive feasibility and convergence are established. Furthermore, numerical results show that the proposed algorithm performs much better than the P-SMPC algorithm with fixed uniform risk allocation and the G-SMPC algorithm.

Theorem 2
If DR-SMPC with Strategy 2 is feasible at time step k, then DR-SMPC with Strategy 2 remains feasible at every subsequent time step. PROOF See Appendix B.
The selected feedback gain is K = [−0.28, 0.49]. To show that the feasibility set of the proposed algorithm is enlarged, we define the feasibility set of the initial nominal state of an SMPC algorithm as F = {x̄_0 ∈ X : SMPC with initial state x̄_0 is feasible}.

Figure 1 :
Figure 1: Feasibility sets of the G-SMPC, DR-SMPC, and P-SMPC algorithms in Example 1.

Figure 6 :
Figure 6: Number of trajectories violating the constraints at each time step in Example 2.