Adaptive Joint Channel Estimation/Data Detection in Flexible Multicarrier MIMO Systems: A Tensor-Based Approach

Multiple-input multiple-output (MIMO) systems employing multicarrier modulation (MCM), including flexible MCM that unifies several MCM schemes, have recently been well studied, also in their tensor-based formulation. The latter naturally allows for (semi-)blind joint channel estimation/data detection (JCD), of particular interest in future generation massive MIMO systems, given the uniqueness properties of the tensor decomposition (TD) models. Of course, in the presence of time-varying channels, adaptive JCD needs to be implemented. However, the rich literature on recursive TD has received very little attention in this context, and only for orthogonal frequency division multiplexing (OFDM). In this paper, online TD is adapted to realize adaptive JCD for time-varying flexible MIMO-MCM systems, with the emphasis put on the single-input multiple-output (SIMO) case. The common assumptions underlying online TD schemes are revisited on this occasion and shown to be inadequate to cope with fast-fading channels. This is demonstrated via simulation results on two of the most popular MCM systems. The results of this work can serve to motivate further study of online TD to extend its applicability in this and other demanding application contexts.


INTRODUCTION
Filter bank-based multicarrier (FBMC) systems have been extensively studied over the last two decades or so as a modulation solution that extends the classical cyclic prefix-based orthogonal frequency division multiplexing (CP-OFDM) scheme, mitigating a number of its drawbacks and widening its application range [1]. Several FBMC variants have been considered as candidate waveforms for 5th generation (5G) communications [1] and, in their multiple-input multiple-output (MIMO) configurations [2], proved to fit well in the massive/mmWave MIMO concept [3,4]. As shown in [5], the 5G waveforms can be better viewed and understood in the framework of filter bank-based modulation.
Given the difficulties that the size of massive MIMO systems implies in training-based estimation, semi-blind joint channel estimation/data detection (JCD) has regained interest for future generation communication systems [6], including, of course, multicarrier modulation (MCM)-based ones (e.g., [7]). Tensor-based models and methods [8] have been extensively studied in this context (see, e.g., [9] and references therein) because of their inherent ability to capture the relations among the system's various dimensions (including space, time, and frequency) in a way that is unique under mild conditions and/or constraints. Thus, applying tensor decomposition (TD) to the received (Rx) signal tensor may help estimate the channel(s) and recover the transmitted (Tx) symbols, at little or no training cost [9]. This was recently demonstrated in [9] for MIMO systems using flexible MCM, a unifying framework encompassing a number of FBMC schemes, including among others CP-OFDM and offset quadrature amplitude modulation-based FBMC (FBMC/OQAM) [10]. The Rx signal tensor was shown to admit a canonical polyadic decomposition (CPD) (also known as parallel factor (PARAFAC) analysis [8]) model, for which the commonly used alternating least squares (ALS) fitting algorithm, with projection onto the input symbol constellation, includes the iterative JCD schemes classically employed in MIMO-MCM systems as its special cases [9]. Important estimation and detection performance gains from the adoption of such a semi-blind tensor-based receiver were demonstrated over its training-only-based counterpart [9].
Of course, in the presence of time-varying channels, involved in modern and future wireless communications characterized by high mobility, adaptive JCD should be implemented, allowing the channel variations to be tracked in time. The literature on adaptive receivers for MCM systems mostly comprises Kalman filtering-based approaches (e.g., [11][12][13][14]), which rely on, e.g., scattered pilots and interpolation among them to realize channel tracking. The statistics needed to implement the Kalman filter, whenever a-priori unknown, can be estimated along with the filter parameters. This is made possible via an extended (nonlinear) version of the filter [11] or dual Kalman filtering [13].
Given the results of [9] and the availability of a large variety of recursive TD methods [15], it thus makes sense to consider tensor-based adaptive JCD schemes for flexible MIMO-MCM systems, which will require little or no training overhead and ask for no statistical information. It should be noted that (to the best of our knowledge) recursive TD has received very little attention in this context, and only for MIMO-OFDM [16]. In this paper, the (historically first) online TD method of [17] of the recursive least squares (RLS) type, RLST, is adapted to the TD model of [9] to realize adaptive JCD for time-varying flexible MIMO-MCM systems, with the emphasis put on the single-input multiple-output (SIMO) case. The assumptions underlying online TD schemes include the invariably made one of the factors in the non-evolving modes being only slowly varying [15]. In the present context, this translates to the requirement of slow channel variation, which is verified below, via simulations, to be inadequate for coping with fast-fading channels, contrary to what was originally claimed in [16]. Simulation-based results are included, for a realistic transmission model, on two of the most popular MCM systems, namely CP-OFDM and FBMC/OQAM. The results of this work can serve to motivate further study of the otherwise powerful online TD to overcome this limitation and extend its applicability in this and other demanding application contexts involving fast changes in more than one of the tensor dimensions.
Notation. The superscript † stands for the Moore-Penrose pseudoinverse. The symbols ⊗ and ⋄ denote the Kronecker and Khatri-Rao products, respectively. diag($\mathbf{A}$) is the column vector on the main diagonal of the matrix $\mathbf{A}$, whereas Diag($\mathbf{a}$) stands for the diagonal matrix with the vector $\mathbf{a}$ on its main diagonal. The $n$th mode unfolding (or matricization) of a higher-order tensor $\mathcal{A}$ is denoted by $\mathbf{A}_{(n)}$. $\jmath = \sqrt{-1}$ is the imaginary unit.
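The notation above maps directly onto a few array operations. The following is a minimal NumPy sketch, not part of the original paper; the helper names `khatri_rao` and `unfold` are hypothetical, and the column ordering used in `unfold` (C order) is an assumption, since unfolding conventions differ across references.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x K) and B (J x K), giving (I*J x K)."""
    if A.shape[1] != B.shape[1]:
        raise ValueError("A and B must have the same number of columns")
    # Row i*J + j of column k holds A[i, k] * B[j, k], as in np.kron per column.
    return np.einsum("ik,jk->ijk", A, B).reshape(-1, A.shape[1])

def unfold(T, mode):
    """Mode-n unfolding: the mode-n fibers of T become the columns of a matrix.
    Column ordering follows NumPy's C order (an assumption of this sketch)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

A = np.arange(1, 5).reshape(2, 2)      # [[1, 2], [3, 4]]
B = np.arange(1, 7).reshape(3, 2)
KR = khatri_rao(A, B)                  # Khatri-Rao product, shape (6, 2)
K = np.kron(A, B)                      # Kronecker product, shape (6, 4)
A_pinv = np.linalg.pinv(A)             # Moore-Penrose pseudoinverse

a = np.array([1.0, 2.0, 3.0])
D = np.diag(a)                         # Diag(a): diagonal matrix built from a vector
d = np.diag(D)                         # diag(A): main diagonal of a matrix
```

Note that `np.diag` plays both roles, depending on whether its argument is a vector or a matrix.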

SYSTEM MODEL
Flexible FBMC
Consider a MIMO system with $N_T$ Tx and $N_R$ Rx antennas, employing flexible MCM, with a synthesis filter bank (SFB) and an analysis filter bank (AFB) per Tx and Rx antenna, respectively. The output of the $t$th SFB, $t = 1, 2, \ldots, N_T$, is formed as [...] where $M$ and $N$ are the (even) number of subcarriers and the total number of FBMC symbols transmitted, respectively, $x^{(t)}_{m,n}$ is the (phase-rotated) symbol transmitted at the frequency-time (FT) point $(m, n)$ from the $t$th antenna, and $g_{m,n}(l) \triangleq g(l - nM_{ss}) e^{\jmath \frac{2\pi}{P} ml}$, with $g$ being the prototype filter impulse response, assumed real symmetric, of length $L_g$ and unit energy, and $M_{ss}$ and $P$ respectively denoting the symbol period and the subcarrier period (in samples). The length of the resulting FBMC signal is [...] $d^{(t)}_{m,n}$ are real (pulse amplitude modulated (PAM)) symbols, $\phi_{m,n}$ is such that $\phi_{m,n} \bmod \pi = (m + n)\frac{\pi}{2}$, and $L_g = KM$ (or [...]), with $K$ being the overlapping factor [1]. The result of the AFB at the $(p, q)$ FT point of the $r$th Rx antenna, $r = 1, 2, \ldots, N_R$, in a back-to-back configuration, can be written as [9] [...] with $I^{p,q}_{m,n}$ being the inter-symbol (ISI) and inter-carrier (ICI) interference weights, commonly known as self- or intrinsic interference. This can often be assumed to be (approximately) confined to the first-order time-frequency neighborhood, $\Omega_{p,q}$, of the FT point under consideration, based on the assumption of prototype filter designs with good time-frequency localization. In general, the interference weights follow a 3 × 3 time-frequency pattern detailed in [9, Eq. (8)]. Note that no such interference exists in CP-OFDM. In that case, the virtual symbol $c^{(t)}_{p,q}$, defined (see (1)) as the response of the transmultiplexer (TMUX) at the FT point $(p, q)$ to $x^{(t)}_{p,q}$, coincides with $d^{(t)}_{p,q}$.
Making the common assumption that the channels are of sufficiently low (relative to the FBMC symbol size) frequency selectivity so that their frequency responses, $\mathbf{h}^{(r,t)} \triangleq$ [...], are invariant over $\Omega_{p,q}$, and assuming perfect synchronization, the samples received at the $r$th AFB output can be written in the form of an $M \times N$ matrix as [9] [...] where $\mathbf{W}^{(r)}$ is built from the (assumed) white Gaussian frequency-domain noise, and the $M \times N$ matrix $\mathbf{C}^{(t)} \triangleq [c^{(t)}_{m,n}]$ collects the virtual symbols for Tx antenna $t$.
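For orientation, one way to write the SFB output and the per-antenna received matrix that agrees with the symbol definitions of this subsection is sketched below. This is a reconstruction from those definitions rather than a quotation of [9]; in particular, the zero-based summation limits are an assumption.

```latex
\[
s^{(t)}(l) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x^{(t)}_{m,n}\, g_{m,n}(l),
\qquad t = 1, 2, \ldots, N_T,
\]
\[
\mathbf{Y}^{(r)} = \sum_{t=1}^{N_T} \mathrm{Diag}\big(\mathbf{h}^{(r,t)}\big)\, \mathbf{C}^{(t)} + \mathbf{W}^{(r)},
\qquad r = 1, 2, \ldots, N_R .
\]
```

In the second expression, the $m$th row of $\mathbf{C}^{(t)}$ is scaled by the $m$th entry of $\mathbf{h}^{(r,t)}$, which is the per-subcarrier flat-fading model implied by the low frequency selectivity assumption.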

Tensor-based Formulation
Eq. (2) can be equivalently written as [...] with [...]. Viewing the above as the $r$th frontal slice of an $M \times N \times N_R$ tensor $\mathcal{Y}$ leads to the conclusion that its noise-free part obeys a CPD model [8] of rank $N_T M$, and hence the JCD problem becomes that of fitting (in the least squares (LS) sense) such a model to $\mathcal{Y}$: min [...] where the $N_R \times N_T M$ matrix $\mathbf{H}$ has rows $\mathbf{h}^{(r,\cdot)T}$, $r = 1, 2, \ldots, N_R$. Since $\boldsymbol{\Gamma}$ is a-priori known, the CPD uniqueness conditions given in [18] apply here. For a general MIMO system, i.e., $N_T > 1$, Proposition 3.2 of [18] applies, which, however, can be verified not to guarantee the uniqueness of the above CPD. Instead, as detailed in [9], the system is then best described by a block term decomposition (BTD) model [19] with $M$ rank-$(1, N_T, N_T)$ terms. Identifiability is attained by decomposing the JCD problem into $M$ bilinear ones, one per subcarrier, and exploiting the finite alphabet property of the (virtual) symbols in each to obtain a solution (see [9] for more).
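As a worked illustration, assume that the per-antenna model is the sum over the Tx antennas of $\mathrm{Diag}(\mathbf{h}^{(r,t)})\mathbf{C}^{(t)}$ plus noise. Stacking the $N_T$ terms then gives a slice model and an LS fitting problem of the following form. The stacking order and the double-bracket CPD notation are assumptions of this sketch; they are chosen so that $\boldsymbol{\Gamma}$ reduces to $\mathbf{I}_M$ for $N_T = 1$ and $\mathbf{H}$ is $N_R \times N_T M$, as stated in the text.

```latex
\[
\mathbf{Y}^{(r)} = \boldsymbol{\Gamma}\, \mathrm{Diag}\big(\mathbf{h}^{(r,\cdot)}\big)\, \mathbf{C} + \mathbf{W}^{(r)},
\qquad
\boldsymbol{\Gamma} = \mathbf{1}_{N_T}^{T} \otimes \mathbf{I}_M,
\quad
\mathbf{h}^{(r,\cdot)} = \begin{bmatrix} \mathbf{h}^{(r,1)} \\ \vdots \\ \mathbf{h}^{(r,N_T)} \end{bmatrix},
\quad
\mathbf{C} = \begin{bmatrix} \mathbf{C}^{(1)} \\ \vdots \\ \mathbf{C}^{(N_T)} \end{bmatrix},
\]
\[
\min_{\mathbf{H},\, \mathbf{C}} \; \big\| \mathcal{Y} - [\![ \boldsymbol{\Gamma}, \mathbf{C}^{T}, \mathbf{H} ]\!] \big\|_F^2 .
\]
```

Here $[\![ \boldsymbol{\Gamma}, \mathbf{C}^{T}, \mathbf{H} ]\!]$ denotes the rank-$N_T M$ tensor whose $r$th frontal slice is $\boldsymbol{\Gamma}\, \mathrm{Diag}(\mathbf{H}(r,:))\, \mathbf{C}$, which matches the slice model term by term.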
For the SIMO configuration, where $\boldsymbol{\Gamma} = \mathbf{I}_M$ is of full column rank, the BTD reduces to a CPD, whose uniqueness is guaranteed by [18, Proposition 3.1]. It is this case that will be treated henceforth, for the sake of simplicity. The intrinsic MIMO case is considered in the preprint [20].
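The SIMO structure can be checked numerically. The sketch below uses toy sizes and random data (all hypothetical) and assumes the noise-free slice model $\mathbf{Y}^{(r)} = \mathrm{Diag}(\mathbf{h}^{(r)})\mathbf{C}$. It verifies that the resulting tensor is a rank-$M$ CPD with factors $(\mathbf{I}_M, \mathbf{C}^T, \mathbf{H})$ and that each vectorized lateral slice equals $(\mathbf{H} ⋄ \mathbf{I}_M)\mathbf{c}_n$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, NR = 4, 5, 3                      # toy sizes (hypothetical)
H = rng.standard_normal((NR, M)) + 1j * rng.standard_normal((NR, M))
C = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

# Frontal slices Diag(h^(r)) C, stacked along the third mode: M x N x NR.
Y = np.stack([np.diag(H[r]) @ C for r in range(NR)], axis=2)

# Rank-M CPD with factor matrices (I_M, C^T, H): a sum of M outer products.
Y_cpd = np.einsum("mk,nk,rk->mnr", np.eye(M), C.T, H)

# G = H ⋄ I_M: column k is kron(H[:, k], e_k), i.e. the Diag(h^(r)) blocks stacked.
G = np.einsum("rk,mk->rmk", H, np.eye(M)).reshape(NR * M, M)
G_stack = np.vstack([np.diag(H[r]) for r in range(NR)])

# Column-major vectorization of the n-th lateral (M x NR) slice.
n = 2
y_n = Y[:, n, :].reshape(-1, order="F")

cpd_ok = np.allclose(Y, Y_cpd)
struct_ok = np.allclose(G, G_stack)
slice_ok = np.allclose(y_n, G @ C[:, n])
```

The block structure of `G` is what makes the per-subcarrier decoupling possible: entry $(r, m)$ of the slice depends only on $h^{(r)}_m$ and $c_{m,n}$.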

ADAPTIVE (SEMI-)BLIND FLEXIBLE MULTICARRIER SIMO RECEIVER
For a $1 \times N_R$ system, one can rewrite (3) in the form [...] where [...] $\mathbf{c}_n$, $n = 1, 2, \ldots, N$, as its columns. Let a new FBMC symbol, $\mathbf{x}_n$, be transmitted at each symbol instant $n$. This will generate a new $N_R M \times 1$ column for $\mathbf{Y}_{(2)}^T$, which can be viewed as the vectorized version, $\mathbf{y}_n$, of a new lateral ($M \times N_R$) slice of the streamed tensor $\mathcal{Y}$. Assuming that the channel, $\mathbf{H}_n$, remains invariant throughout the $n$th FBMC symbol (i.e., block fading) and, as suggested by the corresponding assumption in [17], only "slowly" changes from symbol to symbol, that is, [...] one can easily verify that the new $\mathbf{C}$ matrix will be approximately the same, only with the new virtual FBMC symbol $\mathbf{c}_n$ appended as its new column: $\mathbf{C}_n \approx [\mathbf{C}_{n-1} \;\; \mathbf{c}_n]$. The latter is quite natural as it complies with the idea of Tx symbol streaming. Furthermore, noting the analogy of (5) with an input/output relation where $\mathbf{G}$ is the 'channel' and $\mathbf{C}$ is the 'input', one can define the following exponentially weighted LS problem as a means of realizing adaptive JCD: [...] where $0 < \lambda \le 1$ is the forgetting factor, aiming at down-weighting past samples. Applying then the RLST approach of [17], adapted to the present context, yields Algorithm 1, where $\mathbf{F}$ stands for the pseudo-inverse of $\mathbf{G}$, while $\mathbf{P} \in \mathbb{C}^{M \times M}$ and $\mathbf{R} \in \mathbb{C}^{N_R M \times M}$ contain the data auto- and cross-correlation values, respectively.
Remarks.
1. The algorithm alternates between decision-directed channel estimation and equalization. The 'channel' estimate, $\mathbf{G}$, is updated in line 10, based on the recursively computed updates of $\mathbf{R}$ and $\mathbf{P}$ and the efficiently (via the matrix inversion lemma [17]) computed inverse of the latter. The special structure of $\mathbf{G}$, namely a stacking of the $\mathrm{Diag}(\mathbf{h}^{(r,\cdot)}_n)$, is explicitly taken into account to recover the new $\mathbf{H}$ estimate in lines 11-13. In line 14, the pseudo-inversion is done efficiently, taking the special structure of the matrix into account. A preliminary (virtual) Tx symbol estimate is computed in lines 4-6, to help update the $\mathbf{R}$ and $\mathbf{P}$ matrices, and is refined with the updated channel estimate in lines 15-17.

Algorithm 1: Adaptive semi-blind receiver
Data: $\mathcal{Y}$ in a streaming manner, $0 < \lambda \le 1$
Result: Estimates of the channel and transmitted symbols
1 Initialize $\mathbf{P}_0$, $\mathbf{R}_0$, $\mathbf{F}_0 \leftarrow \mathbf{P}_0 \mathbf{R}_0^{\dagger}$ (e.g., from training);
2 for $n = 1, 2, \ldots$ do
3   $\mathbf{y}_n \leftarrow \mathrm{vec}(\mathcal{Y}(:, n, :))$;
[...]
Apply matrix inversion lemma: [...]

2. The implementation of lines 5-6 and 16-17 depends on which MCM scheme is utilized. dec(·) extracts the Tx symbol estimate from the virtual symbol estimate. This may be a simple projection onto the input constellation, as in CP-OFDM, or, in addition, a recovery of the Tx symbol from its (time-frequency) filtered version with the interference weights [9], as in FBMC/OQAM. In the latter example, that would involve taking the real part of (the phase de-rotated) $\mathbf{c}$. The same holds for the TMUX(·) operation, which refines the estimate of $\mathbf{c}$ with the knowledge of the (constellation-informed) Tx symbol. Ideally, this should compute $\mathbf{c}$ as in (1), that is, based on symbols transmitted at previous and later time instants as well. This in turn would require inserting a delay in the tracking algorithm.
For the sake of simplicity, the ISI part of the interference is ignored in the simulations presented here. Thus, after having removed the phase rotation, TMUX(·) is simply implemented as $\mathbf{c}_n = \mathbf{d}_n + \jmath \Im\{\mathbf{c}_n\}$. Of course, in the CP-OFDM case, TMUX(·) is just the identity.
3. $\mathbf{P}$, $\mathbf{R}$ can be initialized in more than one way (line 1). It is quite common in the online TD literature to rely on an initial piece of the tensor, which can be processed in a batch manner. This would only make sense in the present context if the channel remained invariant throughout this part, which is not realistic to assume. A compromise is made here by setting $\mathbf{R}_0$ to $(\mathbf{H}_{\text{train}} ⋄ \mathbf{I}_M)\mathbf{P}_0$, where $\mathbf{H}_{\text{train}}$ is a rough estimate of the channel matrix found with the aid of a (single-symbol) training preamble [21] and $\mathbf{P}_0 = \sigma_c^2 \mathbf{I}_M$, with $\sigma_c^2$ being the average power of the (virtual) symbols, assumed independent and identically distributed (i.i.d.) for the sake of simplicity. The latter is, of course, valid only for self-interference-free MCM.
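The remarks above can be turned into a compact sketch of one tracking iteration. The code below is not Algorithm 1 itself, whose full listing is not reproduced here, but a simplified illustration for the CP-OFDM/QPSK case, where dec(·) is a constellation projection and TMUX(·) is the identity. It assumes the exponentially weighted recursions $\mathbf{P}_n = \lambda \mathbf{P}_{n-1} + \mathbf{c}_n \mathbf{c}_n^H$ and $\mathbf{R}_n = \lambda \mathbf{R}_{n-1} + \mathbf{y}_n \mathbf{c}_n^H$ with $\mathbf{G}_n = \mathbf{R}_n \mathbf{P}_n^{-1}$. All function names are hypothetical.

```python
import numpy as np

def qpsk_dec(z):
    """Project onto unit-energy QPSK; stands in for dec(.) in the CP-OFDM case."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)

def equalize(H, y):
    """Apply the pseudo-inverse of G = H ⋄ I_M to y, exploiting its diagonal blocks."""
    Yn = y.reshape(H.shape)              # row r holds the M subcarriers of antenna r
    return (H.conj() * Yn).sum(axis=0) / (np.abs(H) ** 2).sum(axis=0)

def init_state(H_train, sigma_c2=1.0):
    """Return the inverse of P_0 = sigma_c^2 I_M and R_0 = (H_train ⋄ I_M) P_0."""
    NR, M = H_train.shape
    G0 = np.vstack([np.diag(H_train[r]) for r in range(NR)])
    return np.eye(M) / sigma_c2, sigma_c2 * G0

def track_step(y, H, Pinv, R, lam):
    """One decision-directed update; y is the stacked (NR*M,) received vector."""
    NR, M = H.shape
    # Preliminary symbol estimate with the previous channel estimate.
    c = qpsk_dec(equalize(H, y))
    # Exponentially weighted cross-correlation update.
    R = lam * R + np.outer(y, c.conj())
    # Matrix inversion lemma for the inverse of lam * P + c c^H.
    k = Pinv @ c
    Pinv = (Pinv - np.outer(k, k.conj()) / (lam + c.conj() @ k)) / lam
    # Unstructured LS 'channel' estimate; keep only the diagonals of its blocks.
    G = R @ Pinv
    H = np.stack([np.diag(G[r * M:(r + 1) * M]) for r in range(NR)])
    # Refined symbol estimate with the updated channel estimate.
    c = qpsk_dec(equalize(H, y))
    return H, Pinv, R, c
```

A typical use is to call `init_state` once with a preamble-based channel estimate and then `track_step` for every incoming symbol. Carrying the inverse of $\mathbf{P}$ rather than $\mathbf{P}$ itself avoids an $M \times M$ inversion per symbol.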

SIMULATION RESULTS
Alg. 1 was tested with two Rx antennas (see [20] for simulation results with more antennas), utilizing CP-OFDM and FBMC/OQAM modulation with $M = 64$ subcarriers. The PHYDYAS filter bank [1] was employed in the latter case, with $K = 4$. The channel impulse responses followed the PedA power delay profile and were of length 16 for a subcarrier spacing of 15 kHz. Time variation was simulated in accordance with the Gauss-Markov model [22] as $\mathbf{h}_n = \rho \mathbf{h}_{n-1} + \sqrt{1 - \rho^2}\,\mathbf{x}_n$, where $0 < \rho \le 1$ quantifies the variation rate and $\mathbf{x}_n$ is a random standard normal vector. A total of 30 QPSK-based CP-OFDM symbols were transmitted (per realization), with the CP duration chosen as $M/4$, and of course twice as many symbols in (CP-free) FBMC/OQAM. Fig. 1 depicts the evolution of the normalized mean squared channel estimation error (NMSE), [...], and the (uncoded) bit error rate (BER), at a signal-to-noise ratio (SNR) of 5 dB, with $\rho = 0.999$ and $\lambda$ set to 0.98. The algorithm manages to track the system changes at this low variation rate. Its accuracy, manifested in the gap between the performance with estimated ("Est.") and perfect ("PCI") channel information, will, of course, improve at higher SNR levels. The FBMC/OQAM performance is somewhat inferior to that of CP-OFDM, and the reason for this is twofold. First, the input/output model in (2) is only approximately valid (which explains the performance error floors observed in such systems [21]), and second, the implementation of TMUX(·) is here simplified (Remark 2).
A lower value for $\rho$ would prevent the algorithm from tracking the system variations, given the crucial underlying assumption in (6). An extreme example of this is found in [16, Fig. 2], for a 2 × 3 MIMO-OFDM system. The fact that, as explained previously, the uniqueness of the CPD model is not guaranteed in such a setup is overlooked in [16]. The channel response is taken to be invariant throughout except for a sudden change (at symbol no. 501 of a total of 1000 CP-OFDM symbols). A quite similar scenario was simulated here, for a SIMO system with three Rx antennas and OFDM-modulated QPSK input, with the rest of the parameters being as in [16]: $M = 32$ subcarriers, with a channel of three equal-power taps, operating at SNR = 10 dB, and using $\lambda = 0.5$. The resulting NMSE is plotted in Fig. 2, showing that, contrary to what is claimed in [16] and as one would expect from the above discussion, the algorithm is unable to track such abrupt changes.
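The time-variation model and the error metric used in this section can be sketched as follows. This is a minimal illustration with hypothetical function names. Drawing the innovation as a unit-variance circular complex Gaussian vector is an assumption here, as is the NMSE definition (squared error norm divided by the squared norm of the true channel).

```python
import numpy as np

def gauss_markov(h0, rho, n_steps, rng):
    """Return the trajectory [h_0, ..., h_n_steps] of
    h_n = rho * h_{n-1} + sqrt(1 - rho^2) * x_n."""
    h0 = np.asarray(h0, dtype=complex)
    traj = [h0]
    for _ in range(n_steps):
        # Unit-variance circular complex Gaussian innovation (assumed here).
        x = (rng.standard_normal(h0.shape)
             + 1j * rng.standard_normal(h0.shape)) / np.sqrt(2)
        traj.append(rho * traj[-1] + np.sqrt(1.0 - rho ** 2) * x)
    return np.stack(traj)

def nmse(h_est, h_true):
    """Normalized squared error between an estimate and the true channel."""
    return np.sum(np.abs(h_est - h_true) ** 2) / np.sum(np.abs(h_true) ** 2)
```

Under this model, with a unit-variance $\mathbf{h}_0$, the correlation between $\mathbf{h}_0$ and $\mathbf{h}_n$ decays as $\rho^n$. For $\rho = 0.999$ it is still about 0.97 after 30 symbols, which is the slow-variation regime in which the tracking assumption holds.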

FUTURE WORK
This is only a preliminary study, aimed at the formulation of a tensor-based adaptive JCD solution for a large class of FBMC systems and the signaling of the pros and cons of this approach. In addition to incorporating the ability to track faster changes, further investigation of the MIMO case is needed (cf. [20, Alg. 2]), especially for systems of massive size. Allowing synchronization impairments, e.g., carrier frequency offsets (CFOs) [23], will be facilitated if the system model is translated to the time domain [24]. In the fully blind case, the permutation ambiguity problem, namely how to match the order of the antennas across frequencies, can be addressed through independent vector analysis (IVA) [25]. Furthermore, and as suggested in [9], viewing such alternating-type algorithms from a Bayesian viewpoint bears promise to result in better-performing and better-understood methods.