Robust Subspace Tracking with Contamination Mitigation via α-Divergence

We studied the problem of robust subspace tracking (RST) in contaminated environments. Leveraging the fast approximated power iteration and α-divergence, a novel robust algorithm called αFAPI was developed for tracking the underlying principal subspace of streaming data over time. αFAPI is fast and it outperforms many RST methods while only having a low complexity linear to the data dimension. Some experiments were conducted to illustrate the performance of αFAPI.


INTRODUCTION
Subspace tracking (ST), which corresponds to the problem of estimating the underlying subspace of streaming data over time, plays an important role in adaptive signal processing [1].In practice, ST has already been applied to many applications, from direction of arrivals (DOA) [2] and channel estimation [3] to video background/foreground separation [4].The emergence of big data streams has recently led to several challenges for streaming data analysis and subspace tracking in particular [5,6].Among them is the problem of dealing with uncertainty, imperfection, and contamination which may occur at any level in modern online applications [6].ST handling such issues is referred to as robust ST (RST).Particularly, RST requires not only robustness against data corruption together with high subspace estimation accuracy but also fast implementation.Over the years, many RST methods have been proposed for specific scenarios (e.g., sparse outliers and missing data) and we refer the readers to [7][8][9][10] for good surveys.Most of the existing (robust) subspace trackers are, however, not designed for dealing with contaminated noises (i.e., non-standard Gaussian noises) which have been observed in several signal processing applications [11].In the literature, there exist some RST methods capable of tracking the underlying subspace in contaminated environments [10].To deal with impulsive noises for example, a few of them were developed by incorporating robust statistics into the well-known projection approximation subspace tracking (PAST) algorithm [12], such as RPAST [13], MCC-PAST [14], and BNC-PAST [15].Some others were based on the adaptive Kalman filtering technique, e.g., KFVM [16] and its variant KFVNM [17].Another good approach is the weighted recursive least-squares (WRLS) method.Some notable WRLSbased RST trackers are ROBUSTA [18], ROBUSTQR [19], RYAST [19], and TRPAST [20].To deal with colored noises, the authors in [21] proposed two variants of PAST, namely IV-PAST and extended IV-PAST, using instrumental variable (IV) technique.Another IV-based PAST's variant with a variable forgetting factor and a variable regularization was proposed in [22].Oblique projection is also a potential direction to deal with colored noises, e.g., obPAST [23] and obYAST [24].In parallel, there are several RST algorithms capable of dealing with sparse outliers, such as GRASTA [25], PETRELS-ADMM [4], ReProCS [26], and FedOA-PM [27].They are mainly based on using ℓ p -norm minimization techniques.Despite having some advantages, most of the subspace trackers above are either designed for a specific kind of noise or sensitive to a particular choice of algorithmic parameters.This motivates us to develop a new generalized robust subspace tracker to avoid the drawbacks.Contribution: The main contribution of this paper is a robust subspace tracker called αFAPI which is capable of tracking successfully the underlying subspace over time in the presence of data corruption.αFAPI provides several appealing advantages over the state-of-the-art subspace trackers.Among them is that it offers robustness against outliers and contaminated mixture noises.Specifically, a robust version of the sample covariance matrix with a novel weight is introduced to mitigate the detrimental effect of such contamination, thanks to α-divergence.As a result, αFAPI outperforms most of the existing (robust) ST algorithms.In addition, αFAPI is a fast and efficient subspace tracker which only requires a low complexity linear to the dimension of data.This feature is highly beneficial for dealing with high-dimensional and high-velocity data streams in fast time-varying environments.Moreover, αFAPI guarantees the orthonormality of the estimated subspace basis which is useful to improve the numerical stability and also for some tracking applications, such as DOA estimation.

PRELIMINARIES
In this section, we first formulate the problem of robust subspace tracking and then recall the well-known FAPI algorithm on which we highly leverage for developing our method.After that, we briefly introduce the alpha-divergence which is exploited to design a novel weight against data contamination.

Subspace Tracking Problem
In this work, we consider the following signal model Here, x(t) ∈ C n×1 is an observation vector, ℓ(t) ∈ C n×1 is the low-rank component of x(t) which lives in a subspace spanned by a deterministic rank-r matrix1 A ∈ C n×r (i.e., ℓ(t) = As(t) where s(t) ∈ C r×1 is a coefficient vector), and n(t) ∈ C n×1 is a noise vector independent of the signal ℓ(t).On the arrival of x(t) at time t, we want to estimate a principal subspace such that it covers the span of A.
Ideally, the underlying subspace can be derived from the spectral analysis of C xx (t) = E{x(t)x(t) H }. Denote by {λ i , u i } the i-th pair of eigenvalue and eigenvector of C xx (t) and assume that the eigenvalues are sorted in decreasing order.Accordingly, the matrix is not always available in practice, we often use the following sample covariance matrix as a surrogate where 1 ≥ β > 0 is a forgetting factor aimed to discount the effect of past observations and to facilitate the tracking ability of subspace estimators in nonstationary environments.

FAPI Algorithm
We begin with the conventional power iteration method for computing the dominant eigenvectors of C xx (t): where U(t−1) is the old estimation of the subspace at t−1 and R(t) ∈ R r×r is a square root of C xy (t) H C xy (t).Badeau et al. in [28] proposed the well-known FAPI algorithm which offers a fast approximated implementation of (3).In particular, the following orthogonal projection was exploited: , where Θ(t) plays the role of a state transition matrix.As a result, we can rewrite the matrix C xy (t) in (3) as follows with y(t) = U(t − 1) H x(t). Based on (4), the underlying subspace U(t) can be recursively updated by and Θ(t) can be an inverse square root of I+∥e(t)∥ 2 2 g(t)g(t) H .We refer the readers to [28] for its derivations and variants.

α-Divergence
α-divergence is a family of divergence measures generalizing the Kullback-Leibler divergence (KLD) [29,30].There are Algorithm 1: αFAPI ] and Z = Ir MAIN PROGRAM: for t = 1, 2, . . ., N do // Compute weight several forms, modifications, and extensions of this divergence (see [30] for a good survey).Here, we adopt the form considered in [20,31] where g and f are two given distributions and α ∈ R/{0, 1}.At α = 0 and α = 1, D α {p || q} is defined as limiting cases for α → 0 and α → 1, respectively.Specifically when α → 1, D α {p || q} boils down to the KLD between p and q.One of the most appealing features of α-divergence is that it can offer robustness to outliers and non-Gaussian noises which is then exploited in the next section.

PROPOSED METHOD
In this section, we introduce a novel robust variant of FAPI, called αFAPI, to track the underlying principal subspace using α-divergence.The main steps of αFAPI are summarized in Algorithm 1.In the following, we show how to incorporate this divergence into FAPI in order to bolster its robustness against data contamination.Robust Sample Covariance Matrix: Instead of C xx (t), we propose to use the following robust sample covariance matrix ) Here, 0 ≤ ω(t) ≤ 1 plays as a weight to mitigate the effect of corruption caused by, for examples, outliers and impulsive noises.In the presence of strong corruption, a small weight ω(t) is set to reduce its impact on the subspace tracking.When ω(t) = 1, Cxx (t) becomes the sample covariance matrix C xx (t) in (2).The form of ω(t) will be provided later.
where the vector ḡ(t) is given by Now, we can utilize the fast implementation of FAPI to update Z(t) in ( 13) and the underlying subspace U(t) with a linear complexity as follows.Let and ξ(t) = 1 − τ (t)∥ḡ(t)∥2 .The state transition matrix Θ(t) can be given by which is an inverse square root of I + ϵ 2 (t)ḡ(t)ḡ(t) H . Finally, substituting ( 16) into ( 13) and ( 5) results in where How To Choose The Weight?Here, we propose to use the following weight: where 0 < α ≤ 1 and 0 < p ≤ 2. Particularly, the choice of ω(t) stems from the following observations.Assume that the assumed density and true density of (1) are, respectively, f (x, U) and g(x, U * ) = (1−δ)f (x, U * )+δh(x) where f (x, U * ) is the "nominal" one, h(x) is to represent contamination in the data, and δ is to control the contamination proportion.The authors in [20] indicated that when data samples are corrupted, the underlying principal subspace can be obtained by minimizing α-divergence D α (g(x, U * )||f (x, U)) which is defined as in (10).In practice g(x, U * ) is not known generally, we can solve the following optimization instead see [20, appendix] for details.In parallel, the multivariate Gaussian density is often adopted for modeling f (x, U) (i.e., standard Gaussian noise).In such a case, ( 20) is equivalent to with c = (1 − α)/2.Under the assumption that U(t)U(t) H ≃ U(t − 1)U(t − 1) H , taking the gradient of ℓ α (.) with respect to U results in the following score function ℓ ′ α (x(t), U) ≃ exp(−c∥e(t)∥ 2 )(x(t) − Uy(t))y(t) H . (23) Interestingly, (23) can lead to a weighted recursive leastsquares (LS) estimation with a weight relatively proportional to exp(−c∥e(t)∥ 2 ).On the other hand, the principal subspace of Cxx (t) in (11) can also be derived from minimizing the following exponentially weighted LS function [18] As a result, we can set ω(t) = exp(−c∥e(t)∥ p ) with 0 < p ≤ 2.
The introduction of p is to remedy the sensitivity of αFAPI to the choice of α, see Fig. 1 for an illustration.Some features of ω(t) are as follows "clean" data: lim outlier data: lim for every 0 < α < 1 and 0 < p ≤ 2. When α = 1, ω(t) = 1 which corresponds to the case where observations are outlier-free.
Complexity: With respect to the space complexity, αFAPI only requires a nr + r 2 words of memory to save U(t) and Z(t) at time t.With respect to the computational complexity, the overall cost of αFAPI is O(nr) flops, similar to the conventional FAPI algorithm.

Robust Subspace Tracking
Signals {x(t)} t≥1 were generated under the following model: Here, the underlying subspace A(t) ∈ R n×r was simulated as A(t) = A(t − 1)+εV(t) where V(t) was a normalized Gaussian noise matrix and ε > 0 was the time-varying factor aimed to control the subspace variation over time.At t = 0, each element in A 0 (i, j) was independent and identically distributed (i.i.d.) from N (0, 1).The subspace coefficient s(t) ∈ R r×1 was an i.i.d.Gaussian random vector of zero mean and unit variance.The noise vector n(t) ∈ R n×1 was derived from the following contaminated mixture model: ) where (1−δ) and δ were the mixing probabilities (i.e., δ represents the contamination rate 2 ), σ n > 0 was a standard deviation of the "main" noise distribution, and the pair {µ ≥ 0, η ≥ 1} was to define the "contaminated" distribution.To measure the tracking ability of ST algorithms, we used the following Subspace Estimation Performance (SEP) metric where (.) # denotes the pseudo-inverse operator and U(t) is the tracked subspace at time t.The smaller the value of SEP is, the better accuracy the tracker has.The experimental results were averaged over 10 independent runs.We set the experimental parameters as follows: data dimension n = 50, rank r = 5, the time varying factor ϵ = 10 −3 , the noise level σ n = 1, the contaminate rate δ = 0.2, and µ = η = 1.
The forgetting factor β was fixed at 0.99 for all subspace trackers.Both TRPAST and αFAPI are dependent on the αdivergence parameter which was set to 0.9.Fig. 1 illustrates the performance of αFAPI with different choices of α and p.
It can be seen that when p < 2, it can remedy the sensitivity of αFAPI to the particular choice of α.Thus, we fixed p = 1.5 for αFAPI in all performance comparisons.Fig. 2(a) and (b) show the performance of subspace tracking algorithms when abrupt changes (at t = 400, 600, and 800) were created by strong contaminated mixture noises {µ = 10, η = 5} and strong uniform noises with magnitude in interval [0, 10], respectively.FAPI and GRASTA failed to track the underlying subspace.RYAST and ROBUSTA provided better estimation accuracy than the two former trackers, but they were still affected by abrupt changes.TRPAST and αFAPI were capable of tracking successfully the underlying subspace over time.Specifically, the convergence rate of αFAPI was faster than that of TRPAST.

Direction of Arrival (DOA) Tracking
In this task, we considered a uniform linear array with 20 sensors (two adjacent sensors were separated by half of a wavelength) and 3 target signals impinging on it.The steering matrix had the form A(t) = [a(ω 1 (t)), . . ., a(ω 3 (t))] with a(ω k (t)) = [1, e jω k (t) , . . ., e j(n−1)ω k (t) ] ⊺ where ω k (t) was the angular frequency associated to the k-th source at time t, i.e., ω k (t) = π sin θ k (t).The 3 × 1 vector s(t) (source signal) is a complex normal random vector following a covariance matrix C s = I 3 .The background Gaussian noise level was fixed at SNR = 0dB.To create abrupt changes, data observations were corrupted by the contaminated mixture noises (with δ = 0.2, µ = 10, η = 5) from t = 500 to t = 599.
To track the DOAs of the moving sources (by linearly varying the value of θ k (t)), we first applied ST algorithms to estimate the subspace A(t) and then used the well-known ES-PRIT method [32] to extract {ω k (t)} 3 k=1 .The results are illustrated in Fig. 3.We can see that FAPI, RYAST, and RO-BUSTA were capable of tracking the DOAs over time when the background noises were Gaussian.However, they failed to deal with the contaminated mixture noises.Only TRPAST and αFAPI were robust to abrupt changes due to such noises.

CONCLUSIONS
In this paper, we have addressed the problem of robust subspace tracking in the presence of nonstandard Gaussian noises.A novel robust adaptive algorithm called αFAPI has been proposed to track the principal subspace of data streams over time.The experimental results indicated that αFAPI was capable of dealing with different data contamination and outperformed other (robust) subspace tracking algorithms.