Statistical Learning for Nonlinear Dynamical Systems with Applications to Aircraft-UAV Collisions

ABSTRACT This article investigates a physics-informed statistical approach capable of (i) learning nonlinear system dynamics by using data generated from a nonlinear system as well as the underlying governing physics, and (ii) predicting system dynamics with reasonable accuracy and a computational speed much faster than numerical methods. The proposed approach obtains the reduced-order model from the full-order governing equations. A function-to-function regression, based on multivariate Functional Principal Component Analysis, establishes the mapping between external forcing and system dynamics, while a multivariate Gaussian Process is used to capture the relationship between parameters and external forcing. In the application, the proposed approach is applied to predict aircraft nose skin deformation after Unmanned Aerial Vehicle (UAV) collisions at different impact attitudes (i.e., pitch, yaw and roll degrees). We show that the proposed physics-informed statistical model can achieve a 12% out-of-sample mean relative error, and is more than 103 times faster than Finite Element Analysis (FEA). Computer code and sample data are available on GitHub.


Introduction
Nonlinear dynamical systems are widely found in scientific and engineering applications.A nonlinear system is often described by Partial Differential Equations (PDE), but solving a highdimensional nonlinear PDE can be computationally expensive using numerical methods such as the Finite Element Analysis (FEA).An important question thus arises: given experimental or observational data generated from a nonlinear system under certain parameter settings, can a statistical model be constructed to learn and predict system dynamics under new parameter settings?Such a statistical approach needs to be interpretable following governing physics, reasonably accurate, significantly faster than direct numerical solutions, and systematically quantify the uncertainty associated with the predicted system dynamics.This is the main goal of this article.

Motivating Application and Objectives
The rapid growth of Unmanned Aerial Vehicle (UAV) within the National Airspace System has been identified as an emerging threat to manned aircraft by the Federal Aviation Administration (FAA) with more than 1.8M registered drones (Olivares 2017;FAA 2020).To ensure aviation safety of commercial flights, airborne aircraft-UAV collisions need be understood, assessed and mitigated during aircraft design and by new aviation regulations (FAA 2014;Joslin 2015;Olivares et al. 2017).It is prescribed in the Code of Federal Regulations 25.631 (bird strike) that an aircraft must continue safe flight and landing after impact with an 8-pound bird at the design cruising speed (CFR 2012).Considering the harder materials used for UAV, UAV-strike can cause potentially severer damage than bird-strike, posing a greater threat to the safe operation of commercial aircraft.
For aviation administration to design new UAV regulations and make new recommendations for airborne hazard severity thresholds to aircraft/UAV manufacturers, the current practice involves developing high-fidelity computer models, for example, Finite Element Analysis (FEA), to simulate collision processes under various impact conditions; for example, collisions at different impact attitudes (i.e., pitch degree, yaw degree and roll degree).Figure 1 shows the FEA-simulated deformation process of aircraft nose structure due to UAV collisions.In this figure, the UAV leaves permanent deformation on the aircraft nose structure.In our FEA experiment, it takes approximately 28 hr to generate such a collision process that lasts for only 8 milliseconds.Considering a potentially large number of collision scenarios and high computational cost, FEA can only be performed for a limited number of scenarios.Moreover, FEAbased approach requires repeatedly performing simulation runs to identify severe collision scenarios, and a massive amount of simulation data generated from this time-consuming process are not fully used and eventually discarded.
Motivated by the application above, this article investigates a physics-informed statistical learning approach capable of (i) learning nonlinear system dynamics by using the fundamental Figure 1.A motivating application-aircraft nose surface deformation due to the collision with a UAV (this nonlinear collision process lasts for only 8 milliseconds and is simulated by FEA that consumes 28 hr in our experiment).
physics laws and data generated from computer models at a limited number of experimental conditions, and (ii) predicting aircraft surface deformation for new collision scenarios (out-ofsample predictions) in a computationally efficient manner significantly faster than directly solving governing equations using numerical methods.The proposed statistical approach provides a viable complement to existing engineering practice for modeling nonlinear dynamical systems represented by aircraft-UAV collision, accelerating the penetration of statistics into domainknowledge-intensive engineering applications.

Literature Review
Physics-informed statistical learning has received much attention from the communities of statistics, engineering and applied mathematics.In the literature of statistics, for example, Lindgren and Rue (2011) established the explicit link between Gaussian Markov random fields and advection-diffusion processes described by stochastic Partial Differential Equations (sPDE).Sigrist, Kunsch, and Stahel (2015) proposed a Gaussian spatial-temporal process by solving an sPDE with spatially and temporally invariant advection-diffusion, and Liu, Yeo, and Lu (2021) extended the framework for spatially varying advection-diffusion.A summary of statistical spatio-temporal modeling with sPDE is provided in Cressie and Wikle (2011) and Krainski et al. (2019).In particular, statistical surrogate models play a critical role in physics-informed statistical modeling (Asher et al. 2015;Gramacy 2020).Surrogate models are typically constructed following two strategies: projectionbased and data-driven-based.For projection-based approaches, Mak et al. (2018) proposed the statistical model of large eddy simulations for design evaluation and physics extraction using the idea of Proper Orthogonal Decomposition (POD).For datadriven-based approaches, Gaussian Processes (GP) have been extensively used for constructing statistical surrogate models (Hung, Joseph, and Melkote 2015;Gu and Berger 2016;Deng et al. 2017;Gul et al. 2018;Kyzyurova, Berger, and Wolpert 2018;Zhang, Cole, and Gramacy 2021a;Sauer, Gramacy, and Higdon 2021;Zhang, Mak, and Dunson 2021b;Chen, Kang, and Lin 2021a;Chen and Tuo 2022).
Advances in physics-informed machine learning are also rapidly emerging.The goals often involve data-driven discovery of governing physics, or state/parameter/operator inference which are physically meaningful.Some recent results include (i) the Hidden Physics Model for learning PDE from noisy measurement data (Raissi and Karniadakis 2018;Raissi, Perdikaris, andKarniadakis 2017, 2018).Such an approach is related to statistical GP models and leverages the underlying physics by assigning GP priors to the latent solutions of nonlinear PDE.(ii) Physics-Informed Neural Networks (PINN)-another milestone of supervised learning that integrates laws of physics given by nonlinear PDE (Raissi, Perdikaris, and Karniadakis 2019).PINN involves simultaneously training two neural networks that respectively perform data-driven learning and discovery of physics in the form of PDE, and has been quickly adopted to generate a variety of new physics-informed machine learning approaches, such as the deep hidden physics models (Raissi 2018) (Qian et al. 2019;Swischuk et al. 2019a).Highdimensional physical processes or data are projected onto the leading principal components through POD to preserve system physics.Then, machine learning methods are employed to learn the mapping between physical parameters and the POD expansion coefficients.
In Section 2, we first provide a high-level overview of the proposed statistical modeling framework.Section 3 presents how the proposed framework is applied to learn and predict aircraft metal skin deformation processes governed by nonlinear structural dynamics equations.Section 4 presents comprehensive numerical investigations, comparison, validation and discussions on the proposed approach for aircraft-UAV collision severity assessment.Section 5 concludes the article and highlights some future research directions.

An Overview of the Proposed Framework
We consider a generic form of governing equations that characterize a wide spectrum of dynamical systems or processes, such as advection-diffusion, heat transfer, Navier-Stokes, Smoluchowski, Nernst-Planck, conservation laws, equation of motion, etc.Let ∈ R d and [0, T], respectively be the physical and time domains, and let s(x, t) = (s 1 (x, t), . . ., s d s (x, t)) T denote the d sdimensional vector state field where s j : × [0, T] → S j ⊂ R (Qian et al. 2019).Then, a dynamical system can be defined by a PDE: where g( s, t) = (g 1 ( s, t), . . ., g d s ( s, t)) T is a nonlinear function that maps the state field to its time derivative, and T is a time-dependent function that does not depend on the state of the system, for example, external forcing.Consider a finite difference discretization of the spatial domain, {x i ∈ } N i=1 , and let s(t) ∈ R Nd x be the state vector that collects the d s state variables from the N discretized locations, the PDE (1) leads to a system of Nd x ordinary differential equations: where g and f, respectively, discretize g and f .Note that, we include the parameter vector p into (2) so as to make the dependence between system dynamics and parameters explicit.Also note that, because g is nonlinear and the state vector s(t) often has a high dimension, high-fidelity numerical solutions of the PDE (2) can be expensive to obtain.
Problem Statement.The statistical model proposed in this article is focused on data arising from governing equations in the form of (2).Given either experimental, simulation or observational data generated from (2) under a set of parameter settings from the set P = {p 1 , p 2 , . . ., p N p }, we are interested in constructing a statistical model that captures the system dynamics while maintaining the interpretability of the statistical model based on (2).Using the statistical model constructed, one is able to efficiently and accurately predict the system dynamics (i.e., the evolution of system state vectors over time) under new parameter settings p / ∈ P. To achieve the objective above, this article proposes a physicsinformed statistical learning framework sketched in Figure 2. The framework consists of three connected steps.
(Step 1) Generating training data from the governing equation such as (2).In the numerical example presented in Section 4, the aircraft-UAV collision processes are simulated by FEA at selected impact conditions (i.e., a set of parameter settings).
(Step 2) Building the Reduced-Order Model (ROM) using the training data generated from Step 1. Very often, the dimension of the full-order physics is too high to be directly used to construct statistical models, and a ROM is thus needed to provide a lowrank representation of the full-order physics.In this article, we obtain a nonlinear ROM by projecting the full-order governing equation to a low-dimensional space spanned by POD bases obtained from the training data generated in Step 1.Let V be the matrix basis, the ROM takes the following form: where s r (t) = V T s(t) is the low-dimensional reduced-order state of the system.( Step 3) Constructing the statistical model that captures the relationship between parameters and the dynamics of system states.The proposed framework addresses two challenges.The first challenge is that the ROM (3) contains a nonlinear component V T g(Vs r (t), t; p) that cannot be further reduced, meaning that the nonlinear term cannot be evaluated without solving the original high-dimensional model.The existence of the nonlinear component in the ROM motivates us to consider a functionto-function regression that directly learns the mapping between the external force, V T f in (3), and the reduced-order states of the system, s r .This is achieved by multivariate Functional Principal Component Analysis (mFPCA) that projects both the reduced-order state and external force into low-dimensional spaces spanned by orthonormal eigenbases.Then, the mapping between the coefficients within the two low-dimensional spaces can be established by regression.The second challenge is to establish the relationship between parameters and external force (which is needed for prediction).The proposed framework addresses this challenging by a multivariate Gaussian Process Regression (mGPR) model that captures the relationship between parameters and external force, completing the construction of the statistical model.
Once the statistical model has been established, one is able to predict the system dynamics at new parameter settings (e.g., the prediction of aircraft surface deformation processes at new collision conditions not included in the training data).In particular, the mGPR model first predicts the external force for a given parameter setting, and the mFPCA then predicts the system state dynamics given the predicted external force.Uncertainty propagation can also be well quantified during this two-stage statistical prediction procedure.

Statistical Modeling of Aircraft-UAV Collision Processes
This section focuses on the physics-informed statistical modeling of aircraft-UAV collision processes using data generated from FEA, and presents how the model can in turn be used to predict the collision processes under new impact conditions.

The Nonlinear Structural Dynamics behind Aircraft-UAV Collisions
Airborne collisions present significant challenges in solid mechanics, including geometrical nonlinearity (e.g., deformation and rotation) and material nonlinearity (e.g., plasticity).Motivated by the conservation in mass, momentum and energy, the equation of motion governing the nonlinear structural dynamics behind aircraft-UAV collision is given by Wu and Gu (2012 where ρ is the mass density, u i , σ i and f i are the displacement, stress and force along direction i (i = 1, 2, 3), div(•) and represent divergence and the space domain, and u i (0), ui (0) and üi (0) are the initial conditions.Note that, the stress σ i (t) is a vector with three components, {σ ij } j=1,2,3 , respectively along the jth direction.Hence, the divergence of σ i (t), that is, div(σ i (t)), contains the derivatives along the three directions (Wu and Gu 2012).The boundary conditions are given by u i (t) = U i (t) on u and σ i • n = g i (t) on s , where U i (t) and g i (t) are respectively the prescribed displacement and traction on boundaries u and s , and n is the outward normal vector.
Let v i be a test function that satisfies the homogeneous displacement boundary conditions (i.e., the prescribed v i = 0 on u ).By multiplying v i to (4) and the boundary conditions, and then respectively integrating the products in the space domain and on s , we obtain The differential governing (5) can be numerically solved by the Finite Element Method (FEM).In particular, the FEM generates the mesh grid, and computes the displacement at each mesh entities known as "elements"; see Figure 3. Let N be the total number of nodes (depending on the number of elements of the discretized structure), the displacement along direction i at a given spatial location x and time t is interpolated by u , where u n i (t) is the computed displacement at the nth node and n (x) is the interpolation kernel function over the spatial domain.Similarly, the test function v i can be expressed by Substituting the u i (x, t) and v i (x, t) into (5), and for any m ∈ {1, 2, . . ., N} and any direction i ∈ {1, 2, 3}, we have where Let u(t) be a 3N × 1 vector that collects the displacement along all three directions at spatial location x 1 , x 2 , . . ., x N , we obtain the dynamical system that captures the temporal dynamics of ü(t): where M ∈ R 3N×3N is the mass matrix, and f int ∈ R 3N×1 and f ext ∈ R 3N×1 are respectively the internal and external forces: where f int i (x, t) and f ext i (x, t) are the internal and external force components at location x and time t along direction i for i = 1, 2, 3. Note that, the surface displacement process governed by (8) depends on the impact conditions.To make this parameterization explicit, let p be a set of parameters characterizing the impact condition, we rewrite (8) such that it falls into the generic form of (2): ( 9 )

Projection-Based Reduced-Order Model using Data from FEA
At any collision condition characterized by parameter p, we obtain the displacement vector u(t; p) and the force vector f ext (t; p) at time t for all N spatial locations along all three directions using FEA.Let P = {p 1 , p 2 , . . ., p N p } be a set of collision parameters (based on which the FEA is performed), and let T = {t 1 , t 2 , . . ., t N t } be a set of simulation time steps, we obtain the surface displacement snapshot data from FEA: as well as the external force snapshot data: Here, D u and D f are 3N × N t N p matrices.The force vector f ext (t; p) is the calculated nodal forces on the aircraft structure parts.Because the high-fidelity FEA only provides the joint forces (the superposition of contact forces and the reaction forces by surrounding rivets), we treat these nodal forces as the boundary force manifold; see Appendix A.
In addition, FEA also returns the mass matrix M (needed in the statistical model).Because the computation of the inverse of the mass matrix is extremely time-consuming at each FEA simulation step, we adopt the Jacobi iterative method that replaces the mass matrix M with a diagonal matrix M whose ith entry on the diagonal line is given by Mii = 3N j=1 M ij , i = 1, 2, . . ., 3N (Larson and Bengzon 2013).
Next, the two snapshot matrices, D u and D f , and the mass matrix M are used to obtain the reduced-order model.Because the full-order governing physics (9) often has a high dimension, we reduce the dimension of the full-order model ( 9) by projecting the high-dimensional vector u(t; p) to a low-dimensional space with a dimension K 3N as follows: where contains the orthogonal POD bases, and q(t; p) ∈ R K is a vector of the reduced-order states.In the example presented in Section 4, the dimension of q(t; p) is chosen as 30, while the dimension of u(t; p) is 40, 086.
The POD bases are found by the Singular Value Decomposition (SVD) of the transformed snapshot D = M 1 2 D u : where U ∈ R 3N×N t N p and Z ∈ R N t N p ×N t N p are both orthogonal matrices (U T U = I 3N and Z T Z = I N t N p ), and (Swischuk et al. 2019b).Closely related to the Karhunen-Loève expansion in stochastic process modeling, POD was introduced for turbulent flows by Lumley (1967) and  has been successfully used to compute the reduced bases in different application domains (Cressie and Wikle 2011;Benner and Willcox 2015;Mak et al. 2018;Qian et al. 2019).
Following the Eckart-Young-Mirsky theorem, the reducedorder bases consist of the first L left singular vectors of D, denoted by U * , and minimizes the projection error of the snapshots D onto all possible K-dimensional orthogonal bases in R 3N , that is, the best approximation of the column space of D among all K-dimensional subspace with K ≤ rank( D): where di is the ith column of D and The dimension K can be chosen by controlling the least squares error for some pre-specified e ∈ [0, 1] (Guo and Hesthaven 2018).
Finally, let V = M− 1 2 U * and substitute the projection (12) into the full-order governing (9), we obtain the nonlinear reduced-order model (note that, V T MV = I ∈ R K×K ): q(t; p) + V T f int (Vq(t), V q(t); t, p) = V T f ext (t; p). (15)

Statistical Learning based on the Reduced-Order Model
As discussed in Section 2, the reduced-order model ( 15) is nonlinear and cannot be explicitly computed without solving the original expensive FEA.This is because the evaluation of f int requires computing the original high-dimensional state u(t) using FEA (i.e., the computation scales with the original dimension 3N rather than K).In some cases when f int does have a special structure, it is possible to perform further reduction of f int .
For example, the nonlinear Burgers' equation can be linearized by the Cole-Hopf transformation, and Euler equations can be transformed into a quadratic structure which is fully preserved by the reduced-order model (Qian et al. 2019).In ( 15), however, f int does not fall into that category and can only be evaluated by expensive FEA.Hence, one seemingly straightforward way is to treat f int as a black box and use statistical learning approaches to learn the mapping f int , so as to avoid the expensive FEA.This is known as the hyper-reduction and often involves evaluating f int at a subset of the original state vector u(t) and computing all state variables in u(t) through interpolation.However, hyper-reduction does not work well in our application.To elaborate, in (15), both the reduced-order state q(t; p), its derivatives q(t; p) and q(t; p), and the external force V T f ext (t; p) can be obtained from FEA.Hence, the outputs of f int (Vq(t), V q(t); t, p) at each time step t can be calculated, enabling the statistical learning that approximates the function f int which takes q(t; p) and q(t; p) as its inputs (e.g., neural networks, multivariate tree-based methods, etc.).Once the input-output mapping of f int is learned, the (discretized) temporal evolution of the reduced-order state q(t) can be iteratively computed as follows: Here, the initial conditions are q(0) = q(0) = 0, f int is learned using the hyper-reduction approach described above, f ext is determined by the impact conditions, and the central difference method can be used to compute q(t) from q(t) and q(t).
However, our investigation shows that the approach ( 16) is unlikely to perform well.The main reason lies in the nonlinearity of the function f int .The errors associated with the predicted q(t) and q(t), at each time step, are quickly accumulated and magnified (the errors are inevitable and are mainly from the statistical modeling/approximation of f int ).As a result, the predicted trajectory of q(t), after a few iterations in ( 16), quickly deviates from its actual path obtained from FEA even if we are able to quantify the uncertainty associated with the output of the statistical model.It is also worth noting that, the approach outlined in (16) works well when f int is linear.To provide more insights, we present additional results on a classical example-the Thomas Young's double slit experiment-where the governing equation is a linear ODE; see Appendix B in the supplementary material.

Mapping between External Force and Reduced-Order
States.The chaotic nature of this nonlinear problem motivates us to directly establish the mapping between the reduced-order state q(t) and the (cumulative) external force F(t) ≡ t 0 s 0 V T f ext (u; p)duds.For any k = 1, . . ., K, let the kth component of q(t), q k (t), be a random function of time in L 2 (T ).In our application, q k (t) is square integrable and has a continuous mean function qk (t).Hence, the reduced-order state q(t) = (q 1 (t), q 2 (t), . . ., q K (t)) T is a vector in the K-dimensional vectors space of functions in L 2 (T ).Let q Z (t) = q(t) − q(t), and we represent q Z (t) using the multivariate Functional Principal Component Analysis (mFPCA) approach: where ξ q = (ξ q 1 , . . ., ξ q L ) T , ψ q (t) = (φ q 1 (t), . . ., φ q L (t)) T , φ q l (t) is an orthonormal eigenbasis such that φ q l (t), φ q r (t) = δ lr with δ lr being the Kronecker delta, and ξ The FPCA is based on the Karhunen-Loève decomposition of q Z (t).Similarly, let F Z (t) = F(t) − F(t), the mFPCA representation of F Z (t) is given by where ψ Hence, for given bases ψ q (t) and ψ F (t), the reduced-order state q(t) and the external force F(t) are completely determined by a set of coefficients ξ q and ξ F .To establish the mapping from ξ F to ξ q (i.e., from "force" to "deformation"), we consider a model: where = ( 1 , . . ., L ) T is a zero-mean noise with covariance and B = {b ml } m=1,...,M,l=1,...,L is a M × L matrix.The matrix B in (19) can be estimated from the data generated by FEA from all N p impact conditions.In particular, we explicitly let ξ q (p i ) be the coefficient obtained from q(t; p i ) under the impact condition p i for i = 1, 2, . . ., N p , and let ξ F (p i ) be the coefficient obtained from F(t; p i ) under the impact condition p i .Then, it follows from (19) that: where ξ q = (ξ q (p 1 ), ξ q (p 2 ), . . .T , and E = ( (p 1 ), . . ., (p N p )) T with (p i ) being a zero-mean column vector with covariance for i = 1, 2, . . ., N p .Note that, • There are two reasons why we model the relationship between cumulative force and deformation: (i) from the law of motion, instantaneous force corresponds to acceleration while cumulative force is associated with deformation; (ii) from the statistical learning and computation perspectives, cumulative force profiles are much smoother than that of instantaneous force and thus easier to be captured by a smaller number of mFPCA bases.In Appendix C of the supplementary materials, we show that the instantaneous force profiles require more mFPCA bases, increasing the dimension of the problem and computational time.
• It is important to see that the estimation of B does not depend on L and M, but on N p and M. Note that, ξ F is an N p ×M matrix (20).If N p ≥ M, ξ F has a full column rank of M, and the left inverse of ξ F , that is, ((ξ F ) T ξ F ) −1 (ξ F ) T , always exists such that B is uniquely determined by B = ((ξ F ) T ξ F ) −1 (ξ F ) T ξ q .If N p < M, the system is under-determined.In our numerical example, we have N p = 35 impact conditions and M is chosen to be 12, which satisfies the condition N p ≥ M.
• The model ( 19) is in fact equivalent to a popular functionto-function regression: where the bivariate coefficient function β(s, t) and noise ζ (t) are respectively represented by Chen et al. (2021b) proposed a novel function-on-function kriging model, and the key novelty is to use the spectral-distance (SpeD) to capture the correlation among input profiles.In our article, the multivariate function-to-function regression is used to capture the relationship between impact force and system states in the spectral domain.Under each collision condition, the input profiles consist of the temporal evolution of multiple (reduced-order) force coordinates in the space spanned by orthogonal POD bases, and these profiles are thus independent of each other.Also note that, in Striegel et al. (2022), input and output profiles are represented by the coordinates in the low-dimensional space spanned by POD bases, and the relationship between these coordinates are established through a linear regression model.In this article, we also represent the input force and system states by the coordinates in the low-dimensional space spanned by POD bases, but the coordinates of the force and system state evolve over time.Hence, functional PCA is required so that the time-varying coordinates of force and system states are represented by their PCA scores which do not change over time.
When making prediction of the reduced-order states q(t) at a new collision condition, one needs to first predict ξ F determined by that collision condition.Hence, a model is needed to capture where is a semidefinite diagonal covariance matrix, k = k(p i , p j ) + δ ij σ 2 with δ ij and σ 2 being the Kronecker delta and the variance of the additive Gaussian noise.Here, we use Exponential kernel for k(p i , p j ), that is, k(p i , p j ) = σ 2 0 exp(−(p i − p j ) A −1 (p i −p j )/2), where A is a diagonal matrix with unknown entries and the variance σ 2 0 is an unknown scalar.Based on the mGPR model, [g(p 1 ) T , g(p 2 ) T , . . ., g(p N p ) T ] T ∼ MN (0, K ⊗ ) with K being a N p × N p column covariance matrix whose (i, j)th element is given by k (p i , p j ).At a new collision condition p * , it follows from the well-established results that and we can obtain ξ q * from ( 19), ( * B+ .The hyper-parameters in K and ˆ can be obtained by minimizing the (negative) log-likelihood Once ξ q * has been obtained, it follows from the mFPCA (17) that Finally, based on the POD (12), the predicted original highdimensional state vector, that is, the surface deformation u * (t), is obtained: (26) Equations ( 23)∼( 26) not only map the collision parameter p * to the surface deformation û * (t), but also show how uncertainty propagates through each stage of this highly interpretable process, that is, from condition parameters, to external force, and to deformation.

Numerical Experiments, Results and Discussions
This section provides numerical investigations and discussions about the proposed physics-informed statistical learning for aircraft-UAV collision severity assessment.

FEA Set-Up, Experimental Design and Data Generation
FEA Set-Up.Essential details of the FEA simulations are first provided.We use FEA to generate the training and testing data  so that the statistical model can be established and validated (through cross-validation).
As shown in Figure 4(a), we consider the aircraft nose metal skin deformation process due to UAV collisions at different impact attitudes (i.e., pitch degree, yaw degree and roll degree).The length of each simulation time step is t = 0.02 millisecond in our FEA, and the impact velocity is fixed to 151 m • s −1 along the impact direction as shown in Figure 4.
In our experiment, the 3D FEA model is pre-processed in the CATIA and HyperMesh environments to extract the geometry information (Lu et al. 2020(Lu et al. , 2021)).The extracted geometries are used as the inputs to the simulation platform, PAM-CRASH, which generates the mesh, constructs the connections (rivets) between parts, sets boundary conditions, defines material properties, and prepares other computational settings including contact parameters, length of time steps, integration points, etc.In particular, In the aircraft nose finite element model, all structural parts are modeled with shell elements.The average size of mesh is 14 mm and the overall number of elements is 691,221.In particular, 13,033 elements are used for the aircraft nose metal skin and the dimension of u(t) in ( 9) is 40,086.Aluminium alloy 2524, 2024, 7075, 7050, 6061 are defined in the nose model and Johnson-Cook constitutive model is adopted to describe the elastic-plastic behavior of metal materials.A total number of 8,226 rivets that connect different structural parts is simulated in PAM-CRASH by the Plink model (Lu et al. 2020(Lu et al. , 2021)).The fixed boundary condition of the aircraft nose is shown in Figure 4(b).
In the UAV finite element model, 5044 solid and 8900 shell elements are both incorporated because some UAV parts like motors and battery cannot be modeled by shell elements.Computational material models are chosen to describe the behavior of different UAV parts; for example, the Li-Po battery cells are modeled by the type-II crushable foam model.
Experimental Design and Data Generation.In this numerical investigation, we consider 35 combinations of collision pitch, yaw and roll degrees for the UAV's flight attitude.The 35 experimental conditions are generated from the MaxPro space-filling design (Joseph, Gul, and Ba 2015), and the ranges for the three attitude degrees are all from −45 • to 45 • .The design is shown in Figure 5. Hence, a total number 35 high-fidelity FEA runs are performed to generate the data (for training and testing purposes).Each FEA run is performed on a computer configured with 16 GB RAM and Intel Xeon E-2124G CPU @ 3.41 GHz, and it takes ∼1680 min to simulate a collision process that lasts for only 8 milliseconds (strongly demonstrating the need for faster and interpretable statistical models).
For the 35 collision conditions, Figure 6 shows the deformation (in mm) of the aircraft nose metal skin 8 milliseconds after collisions.It is clearly seen that different UAV impact attitudes cause different deformation on the aircraft nose metal skin with different severity levels.The shape, depth and size of the dents on the aircraft nose metal skin change as the impact attitudes vary, justifying why collision severity predictions are needed within the design space of impact conditions.

Numerical Results
Using the data generated from our FEA experiments, Section 4.2.1 demonstrates the application of the proposed approach at one selected impact condition; Section 4.2.2 performs Leave-One-Out Cross-Validation (LOOCV) using the data generated from all 35 impact conditions, and provides further discussions on the proposal approach.

Investigation-I: Aircraft Nose Deformation at a Given
Impact Condition In Investigation-I, the FEA simulation data from 34 impact conditions are used to train the statistical model.After that, the statistical model is used to predict the aircraft nose skin deformation at the remaining impact condition (indicated by a triangle in Figure 5), and the predicted result is compared with the actual deformation simulated from FEA.
The reduced-order modeling (12) requires the rank of POD, K, to be determined.We perform the SVD of the transformed snapshot data from 34 impact conditions (see ( 13)), and Figure 7 shows the fast decay of the (log) singular values in POD modes, and a low-rank POD truncation can already provide low leastsquares errors.For example, keeping 5, 10, 20, and 30 POD modes can respectively retain 96.23%, 98.29%, 99.10%, and 99.42% of the total variation in data.The 1st to the 10th POD bases are shown in Figure 8.Note that, we obtained the POD bases where the column v i ∈ R 3N contains the displacements (either positive or negative) along the X, Y, and Z directions.For a better visualization, we plot in Figure 8 the total displacement of the N nodes, that is, squared root of the sum of the squared displacements along all three directions.We see that the ten POD modes successfully capture the prominent deformation patterns, including the major deformation areas directly impacted by the UAV, and other major deformation areas (away from the collision center) due to the propagated stress waves.
Based on the selected POD bases, the reduced-order state q(t) and external force F(t) in the reduced-order model ( 15) are obtained for the 34 collision conditions.At each impact condition, q(t; p) are F(t) are respectively vectors in the 10dimensional vectors space of (time-dependent) functions in L 2 (T ).As shown in Figure 9, for each POD basis, there exist 34 time-dependent curves representing the component in q(t) and F(t) corresponding to the 34 collision conditions.The functionto-function regression ( 19) is used to establish the functional relationship between the profiles of q(t) and F(t).The ranks, L and M of the mFPCA ( 17) and ( 18) are chosen as L = 20 and M = 12 which explain 97.80% and 97.43% of the variation, respectively (In Investigation-II, the effects of the chosen ranks, L and M, are further investigated).Finally, an mGPR model ( 22) is constructed to predict the external force F(t) at the 35th collision condition which is not used for model training.
Figure 10 first shows the (out-of-sample) predicted temporal trajectories of the nose skin deformation and their 95% confidence band at three selected nodes under the 35th collision condition.The three nodes are particularly selected such that • node #1720 and #1799 are respectively located within a major deformation area where the collision happens in the earlier (first 2 milliseconds) and later stage (after 2 milliseconds).
• node #1667 is located within a major deformation area away from the impact center and the deformation is induced by propagated stress wave.
Both the actual (from FEA) and the predicted (from our statistical model) deformation at the three nodes along the X, Y, are Z directions are shown in Figure 10.We see that the predicted trajectories present the same temporal dynamics as the FEA outputs, and the 95% confidence bands cover the true trajectories.It is worth noting that, the predictions appear to be very accurate at nodes #1799 and #1667, while at node #1720, the confidence band becomes wider.This observation can be well explained by the underlying physical collision process.Due to the configuration of UAV, different flight attitudes determine the parts of UAV that first hit the aircraft nose structure (e.g., camera, motor, etc.), creating higher uncertainty of the structural response at node #1720.At nodes #1799 and #1667, on the other hand, the deformation is less impacted by the UAV parts that first hit the aircraft, and smaller uncertainties are thus expected at these two nodes.
Figures 11-12 show the actual, predicted and standard deviation of the deformation fields along the X, Y, and Z directions and the total deformation.The proposed physicsinformed statistical modeling approach successfully generates accurate out-of-sample predictions of the surface deformation at a new collision condition (not included in the training dataset).
It is worth noting that the proposed statistical approach is significantly faster than FEA simulations (16 GB RAM and Intel Xeon 3.41 GHz E-2124G CPU).As shown in Figure 13, for impact condition #35, the high-fidelity FEA simulation consumes 1680 min, while the proposed statistical approach only requires 95.53 min.In fact, once the statistical model has been trained (which consumes 94.15 min), no repeated model training is required and the time used for predicting the collision process at a given impact condition only requires 1.38 min in our experiment.Although parallel computing can accelerate the computation for both FEA and the statistical approach, the encouraging result presented above strongly justifies the significant potential of using physics-informed statistical models for aircraft-UAV collision assessment (as a viable complement to high-fidelity compute simulations) and similar engineering/scientific problems residing in domain-knowledgeintensive environments.For one example, the statistical model can be used for guiding sequential experimental design that determines the experimental conditions for subsequent FEA simulations, so that the aircraft-UAV collision assessment can be significantly accelerated.

Investigation II: Cross-Validation of Model
Performance Using the data generated from the 35 collision conditions, Leave-One-Out Cross-Validation (LOOCV) is performed to investigate the performance of the proposed statistical model.We focus on the deformation 8 milliseconds after the collision (i.e., the final moment), and use the Mean Relative Error (MRE) as the performance measure (Mak et al. 2018 Figure 14 first shows the LOOCV MRE for all impact conditions.It is seen from this figure that the MRE of the predicted surface deformation at the final moment ranges between 10% and 20%.It is also seen that the MRE generally decreases over time, which is related to the use of global POD bases.Because of the relatively smaller deformation at the earlier stage of a collision, more POD modes (e.g., more than 30) are needed to capture these smaller scale dynamics.In addition, at the earlier stage of a collision, the denominator of MRE is so small that a little discrepancy in the numerator can lead to a large MRE value.
Next, we present the LOOCV MRE for the final-moment surface deformation for all 35 impact conditions in Figure 15(a).In particular, we investigate the effects of the number of POD bases on the model performance by ranging the value of K from 5, 10, 20 to 30 (which retain 96.23% to 99.42% of the total cumulative energy).It is clearly seen that, as long as the number of POD bases is not less than 10, the MRE is below 20% for all 35 impact conditions, and less than 15% for 24 out of the 35 impact conditions.The average MRE is close to 12%.We also investigate the effects of the chosen ranks, L and M in ( 17) and (18), on the MRE. Figure 15(b) shows the LOOCV MRE for the final-moment surface deformation for all 35 impact conditions and for different choices of L and M (the number of POD bases is fixed to 30 in this figure).We see that the MRE is very robust against L and M, and the overall MRE remains at 12% for a range of combinations of L and M.
To quantify the uncertainty of MRE, we randomly sample from ξ q ∼ N( ξ q , ˆ q ) for 1000 times.For each sampled ξ q , the reduced-order system state q (i) (x, t) and the highdimensional system state u (i) (x, t) are computed respectively using the mFPCA and POD, for i = 1, 2, . . ., 1000.Finally, the MRE corresponding to each u (i) (x, t) is computed.The boxplot of MRE, for all 35 impact conditions, is shown in Figure 15(c).In this figure, we also include the MRE which is computed by directly inserting ûq into (27) (shown by the blue solid line).Note that, the MRE obtained in this way is not the same as the mean MRE of the boxplot.It is easy to see that,

Conclusions
This article investigated a physics-informed statistical modeling approach for nonlinear dynamical systems as represented by the aircraft-UAV collision.The proposed statistical approach is motivated by the governing physics and is constructed using data generated from computer simulations.It has been shown that the model provides accurate out-of-sample predictions with its LOOCV MRE ranging between 10% and 20%.The interpretability of the proposed model has been greatly enhanced by integrating fundamental physics.The uncertainty associated with the predicted system dynamics has also been systematically quantified by leveraging the advantages of statistical approaches.
Finally, the proposed model has been shown to be at least 10 times faster than FEA when considering both model training and prediction and the prediction step is 10 3 times faster than that of FEA.Such findings strongly demonstrate the great potential of using physics-informed statistical model for engineering applications that traditionally rely on expensive computer simulations and field experiments.Handling nonlinear dynamics is known to be a challenging task.In this work, the nonlinear term is bypassed by directly establishing the mapping from external force and reduced-order state.Ideally, if the nonlinear term can be explicitly accounted for by combining statistical learning and nonlinear model reduction techniques, both the model accuracy and interpretability are expected to be further enhanced.In addition, the numerical example only considered the impact attitudes as input parameters.When the parameter space becomes larger with more parameters, such as speed, UAV weight, materials, etc., future research is needed to investigate the performance of the proposed framework.

Figure 2 .
Figure 2.An overview of the proposed physics-informed statistical learning framework for learning and prediction of nonlinear dynamical systems.

Figure 3 .
Figure 3.An example of the Finite Element mesh grid.

Figure 4 .
Figure 4. (a) Aircraft (nose)-UAV collision and the coordinate system for UAV's flight attitude (pitch, yaw, and roll degrees).The length of each simulation time step is 0.02 millisecond, and the impact velocity is set to 151 m • s −1 along the impact direction; (b) The fixed boundary condition of the aircraft nose in our FEA experiment.

Figure 5 .
Figure 5.A total number of 35 impact conditions generated by space-filling design from the design space of UAV's flight attitudes (pitch, yaw, and roll).The triangle indicates the impact condition which is used to demonstrate the proposed approach in Section 4.2.1.

Figure 6 .
Figure 6.Deformation (in mm) of the aircraft nose metal skin for the 35 collision conditions at time step 400 (i.e., 0.02 × 400 = 8 milliseconds after collisions), and different UAV impact attitudes cause different deformation on the nose metal skin with different severity levels.

Figure 7 .
Figure 7.The change of singular values (top) and relative cumulative energy (bottom) against the number of POD modes used for constructing the reduced-order model.

Figure 9 .
Figure 9. Profiles of the POD scores of displacement and cumulative external force obtained for the 34 collision conditions.

Figure 10 .
Figure 10.The actual (from FEA) and the predicted (from our statistical model) deformation at three selected nodes along the X, Y, are Z directions.

Figure 11 .
Figure 11.(a) Predicted deformation field along the X direction; (b) Predicted deformation field along the Y direction (top: predicted; middle: actual; bottom: standard deviation).

Figure 13 .
Figure13.The proposed statistical approach is approximately 17 times faster FEA simulations, and more than 10 3 times faster than FEA for prediction when no repeated model training is needed, justifying the great potential of physics-informed statistical learning for engineering/scientific problems residing in domain-knowledge-intensive environments.

Figure 14 .
Figure 14.LOOCV MRE at all impact conditions with 30 POD bases.

Figure 15 .
Figure 15.(a) LOOCV MRE for the final-moment surface deformation for all 35 conditions: the MRE is below 20% for all 35 impact conditions, and less than 15% for 24 out of the 35 impact conditions; (b) LOOCV MRE for all 35 conditions with different ranks of mFPCA bases: the MRE is robust against L and M; (c) Boxplots that show the uncertainty associated with the LOOCV MRE for all 35 conditions, and the line is obtained by directly inserting ûq into (27).
) is the actual deformation at the new condition p * (from FEA), û(x, t; p * ) is the predicted deformation, and denotes the aircraft nose surface. *