Automated Classification of Cognitive Visual Objects Using Multivariate Swarm Sparse Decomposition From Multichannel EEG-MEG Signals

In visual object decoding, magnetoencephalogram (MEG) and electroencephalogram (EEG) activation patterns offer highly discriminative information for cognitive analysis owing to their multivariate oscillatory nature. However, high noise in recorded EEG-MEG signals and subject-specific variability make it extremely difficult to classify subjects' cognitive responses to different visual stimuli. The proposed method is a multivariate extension of the swarm sparse decomposition method (MSSDM) for multivariate pattern analysis of EEG-MEG-based visual activation signals. It is an advanced technique for decomposing nonstationary multicomponent signals into a finite number of channel-aligned oscillatory components, significantly enhancing visual activation-related sub-bands. MSSDM adopts multivariate swarm filtering and a sparse spectrum to automatically deliver optimal frequency bands in channel-specific sparse spectrums, resulting in improved filter banks. By combining the advantages of multivariate SSDM and the Riemann's correlation-assisted fusion feature (RCFF), the MSSDM-RCFF algorithm is investigated to improve the visual object recognition ability of EEG-MEG signals. We also propose an MSSDM-based time-frequency representation to analyze discriminative cognitive patterns of different visual object classes from multichannel EEG-MEG signals. The proposed MSSDM is evaluated on multivariate synthetic signals and multivariate EEG-MEG signals using five classifiers. The proposed fusion-feature and linear discriminant analysis classifier-based framework outperformed all existing state-of-the-art methods used for visual object detection and achieved the highest accuracy of 86.42% using tenfold cross-validation on EEG-MEG multichannel signals.

Shailesh Vitthalrao Bhalerao is with the Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Indore 453552, India (e-mail: phd2001171009@iiti.ac.in).
Ram Bilas Pachori is with the Department of Electrical Engineering, Indian Institute of Technology Indore, Indore 453552, India (e-mail: pachori@iiti.ac.in).
This article has supplementary material provided by the authors and color versions of one or more figures available at https://doi.org/10.1109/THMS.2024.3395153.
Digital Object Identifier 10.1109/THMS.2024.3395153

To identify objects and object categories from brain responses, MEG and EEG signals are recorded in response to visual stimuli, i.e., images from different categories are presented to the participants in experimental trials while their brain activities are recorded [1]. Researchers have explored several experimental techniques for recognizing visual objects, including EEG [2], MEG [3], electrocorticogram, and functional magnetic resonance imaging [1], [3]. Among these, MEG and EEG are the most widely used techniques for visual recognition systems because of their noninvasive nature and their ability to offer fine-grained analysis by identifying the spatial, temporal, and spectral components underlying object category discrimination [1], [3].
In order to investigate the effectiveness of EEG and MEG signals for visual recognition, many studies have been carried out and promising results have been reported [1], [2], [3], [4]. Cichy et al. [1] introduced a feature, peak latency time points (PLSP), based on 306-channel MEG recordings of the categorical representation of objects in the human visual response. That study used PLSP with linear discriminant analysis (LDA) frameworks to identify the spatial and temporal MEG components that best discriminate object categories, obtaining a classification accuracy of 68.75% on subsets of brain responses. Further, Cichy et al. [3] extended the potential of combined EEG and MEG multichannel data to multivariate analysis of visual object problems. That study demonstrated that representational similarity analysis (RSA) features extracted from combined MEG and EEG signals enhance the correlation between temporal dynamics and category information in visual object recognition and deliver higher classification performance than MEG or EEG signals alone. In recent work, Guggenmos et al. [2] introduced a visual object classification approach that adopts an informative representational dissimilarity matrix (RDM) as a novel self-similarity measure on 306-channel MEG. It includes multivariate analysis based on the RDM feature, computed on 100 time points of the MEG signal and tested using four supervised classifiers, namely, LDA, support vector machine (SVM) with linear kernel, Gaussian Naïve Bayes (GNB), and weighted robust distance (WeiRD). That study also incorporates multivariate noise normalization in the preprocessing stage and reports the highest average accuracies across classifiers: LDA (74.5%), SVM (74%), WeiRD (85%), and GNB (75%). Furthermore, Kong et al. [4] employed a shallow convolutional neural network (CNN) to project the visual activation response into a 74-channel EEG-based learned RDM feature and achieved a classification accuracy of around 65.6% on a five-class visual category task. However, in the above approaches, the features computed from raw EEG or MEG multichannel signals show strong dependency on the formulation of the classification model; thus, they may fail to capture the most significant features with strong relevance to the visual object category [1], [2], [3], [4].
2168-2291 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

In real-world environments, complex visual object recognition faces significant challenges, such as channel selection, smoothing and data reduction, undeterminable scalp regions, and the nonstationary nature of EEG and MEG signals, which limit mutually informative features across channels and consequently degrade the classification performance of visual object detection. These mutual features show multivariate modulated oscillations, which correspond to the matching of oscillatory component (OC) modes with similar multiscale spectral content across multiple channels. Therefore, robust multivariate visual analysis is urgently demanded to detect the most significant features while optimizing channels and computational efficiency. In order to obtain improved fine-grained characteristics of the temporal dynamics of visual neural activity, the primary motivation of this work on nonstationary multicomponent EEG-MEG signals is to enhance signal denoising and the separation of spectral components so that multispectral features can be computed to improve multivariate analysis of visual objects. However, EEG-MEG signal decomposition-based rhythm extraction for multivariate analysis has not yet been investigated.
To explore the multiscale analysis of real signals, various multivariate decomposition methods have been developed in recent decades [5], [6], [7], [8]. Among them, the recently introduced adaptive decomposition method, swarm decomposition (SWD) [9], has demonstrated excellent performance in the analysis of nonstationary signals. In SWD, the underlying OCs are extracted through an iterative extension of swarm filtering (SwF)-based decomposition, which is designed using swarm intelligence. Many experiments have demonstrated its superiority over state-of-the-art methods in resolving mode mixing and in adaptive decomposition for real-time signal analysis [9], [10], [11]. However, this univariate SWD method is limited when simultaneously processing multivariate signals from multiple channels: it decomposes each channel into different OC modes, which yields fewer mutual common characteristics [9].
To resolve the aforementioned limitations, we present a novel extension of the swarm sparse decomposition method (SSDM) [10], termed multivariate SSDM (MSSDM), to classify visual stimulus activation (VSA) patterns from multivariate EEG-MEG multichannel signal data. In the proposed MSSDM, a mode alignment approach is used to determine the orders of effective OC modes in multivariate EEG-MEG signals so that the same frequency components across different channels can be extracted from different OCs and spectral correlation can be established among the channels. Using multivariate SwF and a sparse spectrum, the MSSDM method employs an adaptive scale-space approach to estimate spectral boundaries in the sparse Fourier spectrum (SFT) to obtain aligned OCs across different EEG-MEG channels. In addition, the approach utilizes a spectrum optimization technique, fused least absolute shrinkage and selection operator (Lasso)-based sparse spectrum estimation, for better TF localization and mode alignment in the channel-specific frequency domain [12]. The multiscale features obtained from these effective OCs extract enhanced spectral dynamics of neural activity related to the common or unique aspects of visual object representation. To our knowledge, MSSDM decomposition-specific oscillatory rhythm-based multiscale features have not previously been attempted for the recognition of visual objects. 4) It compares the developed automated visual object detection system against current state-of-the-art methods with subject-independent cross-validation on the cross-subject dataset. The rest of this article is organized as follows. Section II explains the EEG-MEG dataset used in this work. Section III discusses the basic principles of multivariate SSDM and the visual object classification framework. Results and discussions are elaborated in Section IV. Finally, Section V concludes this article.

II. MULTICHANNEL DATASET
In this study, we have evaluated the proposed technique on the publicly available human EEG and MEG dataset [3], with responses to 92 images used as visual stimuli. The dataset contains 74-electrode EEG and 306-electrode MEG responses from 16 subjects (eight females with a mean age of 23.87±4.5 years and eight males with a mean age of 24.37±4.1 years) viewing each of the 92 stimuli. In our work, we have employed five categories (12 images per category), namely, human body (HB), human face (HF), animal body (AB), animal face (AF), and inanimate objects (IO), from all 380 channels for further multivariate analysis. Data were recorded at a sampling rate of 1000 Hz, and online filtering between 0.03 and 330 Hz was applied. Fig. 1 illustrates the timing scheme of the paradigm used in the study. Every trial extracts the visual stimulus between −100 and +900 ms (around 1 s). In total, each participant completed 2820 trials, including 30 trials for each of 94 visual objects.

III. PROPOSED METHODOLOGY
The proposed MSSDM-based classification framework for visual object recognition using EEG and MEG multichannel signals is presented in Fig. 2 and detailed in the following section.

A. Preprocessing
The preprocessing stage mainly comprises filtering and segmentation. The filtering consists of two filters [a smoothing low-pass filter (LPF) (30 Hz) and a baseline-removal infinite impulse response high-pass filter (0.8 Hz)], which remove noise and artifacts from the EEG and MEG signals. A MATLAB MNE toolbox was initially used to extract and clean the EEG and MEG data from each participant. In segmentation, multichannel EEG and MEG signal epochs of 1-s duration are employed to decompose adaptive OCs. During the segmentation process, the data samples were cut into epochs of 1000 ms windows of time-locked post-stimulus responses. MEG and EEG signals are acquired for 72 min using 380 channels at a sampling rate of 1000 Hz for each image. In our work, we have considered the smallest epoch count of 30 among the 16 subjects' epochs. Further, the 30 considered epochs from the 380 channels of EEG and MEG are concatenated to construct the raw pattern vectors, separately as well as combined. Each vector contains 2220 (30 × 74) patterns, 9180 (30 × 306) patterns, and 10 800 (30 × 380) patterns, which correspond to EEG (74-channel), MEG (306-channel), and combined EEG-MEG (380-channel) for each class.
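As a rough illustration of this preprocessing pipeline, the sketch below band-limits multichannel data with zero-phase Butterworth filters and cuts the result into 1 s epochs. The filter order and the use of `filtfilt` are assumptions; the paper only specifies the 0.8 Hz and 30 Hz cutoffs.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1000  # sampling rate (Hz), as stated in the dataset description

def preprocess(raw, fs=FS):
    """Band-limit a (channels x samples) array: 0.8 Hz high-pass for
    baseline removal, 30 Hz low-pass for smoothing (orders assumed)."""
    b_hp, a_hp = butter(4, 0.8 / (fs / 2), btype='high')
    b_lp, a_lp = butter(4, 30.0 / (fs / 2), btype='low')
    x = filtfilt(b_hp, a_hp, raw, axis=-1)
    return filtfilt(b_lp, a_lp, x, axis=-1)

def epoch(data, fs=FS, epoch_ms=1000):
    """Cut continuous (channels x samples) data into 1 s epochs."""
    step = fs * epoch_ms // 1000
    n = data.shape[-1] // step
    return data[:, :n * step].reshape(data.shape[0], n, step)

# toy run: 4 channels, 5 s of noise -> 5 one-second epochs per channel
demo = preprocess(np.random.randn(4, 5 * FS))
epochs = epoch(demo)
```

On real data the same reshaping yields the 30 one-second post-stimulus epochs per channel described above.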
Fig. 3 shows the TF analysis computed using univariate SSDM [10] and Hilbert spectral analysis from the visual stimulus responses of three visual object categories, HB, HF, and AB, for five different EEG channels [EEG005 (FC2), EEG042 (T8), EEG003 (FC1), EEG041 (T7), and EEG036 (F4)] of subject 6. In the TF plots, the highest activation during the visual cognitive tasks was observed in the 8-30 Hz rhythm of the imagined stimulus response during the timeframe (−100 to 900 ms) from the onset of imagination. These TF responses exhibit discriminative neural activity patterns due to the cognitive imagination tasks (highlighted in Fig. 3). Fig. 3 demonstrates that the channel-wise approach is unable to recover multivariate modulated oscillation characteristics or any meaningful mutual information corresponding to the matching OCs with similar multiscale spectral content for the visual object classes. On the other hand, analyzing joint patterns with an appropriate selection of similar channel characteristics is significant in enhancing multivariate analysis for cognitive visual objects. For instance, selecting channels (FC2, T8, FC1, T7), (FC2, T8, FC1), and (FC2, T8, FC1, T7, F4) significantly enhances mutual patterns for the visual object classes HB, HF, and AB, respectively. The obtained significant multivariate patterns for the three-class imagined data are demonstrated in Fig. 3. Thus, we can infer that, due to the interdependent nature of the channels, raw MEG and EEG patterns may not be able to distinguish class-specific mutual information. In order to strongly correlate mutual multivariate information across channels, we propose the multivariate SSDM method to obtain channel-aligned multispectral features for enhanced multivariate analysis of visual objects.
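The Hilbert spectral analysis used for these TF plots rests on the analytic signal. A minimal sketch (not the authors' code) of recovering instantaneous amplitude and frequency for a single alpha-band tone:

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000
t = np.arange(fs) / fs                      # one 1 s epoch
x = np.sin(2 * np.pi * 10 * t)              # a 10 Hz alpha-band component

z = hilbert(x)                              # analytic signal
inst_amp = np.abs(z)                        # instantaneous amplitude
inst_freq = np.diff(np.unwrap(np.angle(z))) * fs / (2 * np.pi)

# away from the epoch edges, the instantaneous frequency tracks 10 Hz
mid = inst_freq[100:-100]
```

Applying this per OC, rather than per raw channel, is what turns the decomposition into a TF representation.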

B. Multivariate Extension of Swarm-Sparse Decomposition
The proposed MSSDM is a novel extended version of the SFT-assisted SWD [9], [10] method to decompose multichannel nonstationary signals into OCs. In this work, the OCs of the multichannel EEG-MEG signals are evaluated for multivariate analysis using the MSSDM method. The proposed MSSDM delivers a set of common multivariate modulated oscillations located in a multidimensional space of the input data that exhibits minimum collective bandwidth while fully reconstructing all input channels. With its exceptional ability for noise suppression and adaptive decomposition, the proposed method attains desirable properties, such as prominent superiority in mode mixing, mode alignment, and avoidance of extraneous mode selection [5], [6]. Unlike standard SWD, we pose the optimization problem directly in the Lasso-based sparse Fourier domain through a multivariate SwF optimization approach with Bhattacharya distance [13]-based convergence criteria. The proposed method is efficient in that multivariate modulated oscillations can be produced from the input data without additional user-defined parameters, such as threshold selection, filter order, and the number of iterations. The mathematical formulation of the proposed multivariate SSDM model, using a modified univariate SSDM-based filtering technique, is presented in the following section.

1) Modified Univariate SSDM Filtering: a) Sparse spectrum estimation: The sparse spectral model is formulated in the frequency domain as

s* = arg min_s ||z − F^{-1} s||_2^2 + μ ||s||_1

Here, s and z are the FT coefficients and the corresponding reconstructed data in the time-space domain derived from an adjoint FT. The sparse model allows simultaneous estimation of component reconstruction due to its independent spectral content and dominant peak concentration around the center frequency. In our work, the sparse spectrum optimization is reformulated via a joint-optimization FT in the TF domain [12], where L and L^H represent the L1-norms of the vectors for the FT (F) and IFT (F^{-1}) kernels. In order to obtain optimized frequency-dependent sparse FT coefficients, the L and L^H vectors are recursively selected using an iterative shrinkage algorithm [12]. The modified sparse FT model takes the fused-Lasso form

c_f* = arg min_{c_f} ||z − F^{-1} c_f||_2^2 + μ_0 ||c_f||_1 + μ_1 Σ_k |c_f[k] − c_f[k−1]|

where μ_0 and μ_1 denote the penalty factors that regulate the model's sparsity, both set to 0.01, and c and c_f are the FT and SFT coefficients of the time-domain data z. At each iteration, this model updates the basis weights using a matrix inversion and converges to an optimized sparse spectrum basis. Sparsity concentrates more energy at the center of the sparse coefficients, which leads to an improved FT with a unique and compact representation [10].
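A minimal sketch of the iterative shrinkage idea behind the sparse FT estimate, using a plain Lasso with an orthonormal FFT; the fused difference penalty and the exact solver of [12] are omitted, and `mu` is illustrative:

```python
import numpy as np

def soft(v, t):
    """Complex soft-thresholding operator."""
    mag = np.abs(v)
    return np.where(mag > t, (1 - t / np.maximum(mag, 1e-12)) * v, 0)

def sparse_fft(z, mu=0.01, n_iter=50):
    """ISTA estimate of sparse Fourier coefficients:
    min_s ||z - F^{-1} s||^2 + mu * ||s||_1, with an orthonormal FFT."""
    s = np.zeros(len(z), dtype=complex)
    for _ in range(n_iter):
        r = z - np.fft.ifft(s, norm='ortho')            # time-domain residual
        s = soft(s + np.fft.fft(r, norm='ortho'), mu / 2)
    return s

# toy signal: a 20 Hz tone in light noise, 1 s at 1 kHz
fs = 1000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
z = np.sin(2 * np.pi * 20 * t) + 0.1 * rng.standard_normal(fs)
s = sparse_fft(z)
peak = int(np.argmax(np.abs(s[:fs // 2])))              # dominant sparse bin
```

Because the FFT is orthonormal here, the shrinkage step leaves the dominant spectral peak intact while zeroing small noise bins, which is the "compact representation" property used by the boundary estimation below.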
b) Spectrum smoothing: In order to select the highest energy peak, a Savitzky-Golay filter (SGF) [14] is carefully designed to smooth the noisy spectrum and achieve optimal decomposition performance. Essentially, it functions as an LPF in which the threshold (P_th) in the energy peak selection parameter controls SWD performance and determines the number of OCs that are extracted. To obtain desirable SwF performance, P_th, the SGF length (SGF_ln), and the SGF degree (SGF_d) were experimentally selected as 0.1, 15, and 2, respectively. In consequence, the dominant frequency ω_dom^q for the qth component is estimated as the location of the qth selected peak of the smoothed energy spectrum E_ψ(s)(ω).

c) Sparse spectrum boundary estimation: Due to the overlapped nature of sparse spectra, accurate band selection in the sparse FT coefficients corresponding to the actual monocomponent signal is difficult in the case of close-frequency multicomponent signals. An inappropriate selection of the order and range of the sparse coefficients leads to an incorrect reconstruction of multicomponent signals, and no specific criteria are available to determine the best boundary conditions for band selection in sparse coefficients. To optimize sparse coefficient separability, SSDM employs a sliding window-based SwF filtering approach for accurate boundary estimation with improved parameters of the swarm model. In SwF, the sparse spectrums are first grouped into approximately K boundary segments B_i by computing the middle of two progressive local maxima as

B_i = (ω_i + ω_{i+1}) / 2

where ω_i and ω_{i+1} are two consecutive center frequencies obtained from E_ψ(s)(ω).

d) SwF filtering mechanism: SWD relies on the SwF concept, which incorporates a swarming model parameterization and a swarm-prey hunting mechanism. In SwF, the input signal represents the route of the prey for the swarm members, and the output is the OCs obtained from the swarm's trajectory. To extract OCs efficiently, an optimized swarming model is designed, as explained below.
Step 1: Initially, the ith individual swarm member M is characterized by its position p_i[k] and velocity vector v_i[k], where p_prey denotes the initial position of the prey. d_cr represents the smallest distance between members and is set at 0.5, 1, and 2.
Step 2: In swarm-prey hunting, the movement and hunting of each member are governed by two types of interaction, the driving force F_Dr,i^k and the cohesive force F_Coh,i^k, where f(d) gives the cohesion force contribution of the jth member to the ith member at distance d = (4, 4).
Step 4: To track the prey, the swarm updates its current states with respect to every member with hunting time interval δ ∈ [0, 1].
Step 5: The SwF output y[k] is calculated by adding all best positions of the swarm members rather than their mean values [9], where h denotes a weighting factor. In order to optimize the SwF response, a fitness function is employed to obtain the dominant OCs from the residual signal at every consecutive iteration: the residual signal z_{it+1}[k] is obtained by subtracting the filtered signal y[k] from the input signal z_it[k] at each iteration.
Step 6 (check convergence condition): In iterative SwF filtering, the algorithm stops when the Bhattacharya distance (B) [13]-based convergence criterion for the obtained residual signal is met.
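The smoothing, boundary-estimation, and iterative extract-and-subtract steps above can be sketched as follows. The `dominant_band` stand-in is a toy bandpass step, not the actual swarm-prey dynamics, and the Bhattacharyya stopping rule is a simplified reading of the criterion in the text:

```python
import numpy as np
from scipy.signal import savgol_filter, find_peaks

def band_boundaries(spectrum, p_th=0.1, sg_len=15, sg_deg=2):
    """Savitzky-Golay smoothing of the energy spectrum, dominant-peak
    picking, and boundaries midway between consecutive peaks; P_th and
    the SGF settings follow the values quoted in the text."""
    smooth = savgol_filter(spectrum, sg_len, sg_deg)
    peaks, _ = find_peaks(smooth, height=p_th * smooth.max())
    bounds = (peaks[:-1] + peaks[1:]) / 2      # B_i = (w_i + w_{i+1}) / 2
    return peaks, bounds

def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalized spectra."""
    p, q = p / p.sum(), q / q.sum()
    return -np.log(np.sum(np.sqrt(p * q)) + 1e-12)

def iterative_filtering(z, extract_oc, max_iter=10, tol=0.05):
    """Generic extract-and-subtract loop standing in for SwF: pull out
    the dominant OC, subtract it, and stop once the residual spectrum
    stops changing (Bhattacharyya-distance criterion)."""
    ocs, residual = [], z.copy()
    for _ in range(max_iter):
        before = np.abs(np.fft.rfft(residual)) ** 2 + 1e-12
        y = extract_oc(residual)
        residual = residual - y
        ocs.append(y)
        after = np.abs(np.fft.rfft(residual)) ** 2 + 1e-12
        if bhattacharyya(before, after) < tol:
            break
    return ocs, residual

def dominant_band(x, half_bw=2):
    """Toy OC extractor: keep a few FFT bins around the dominant peak."""
    S = np.fft.rfft(x)
    k = int(np.argmax(np.abs(S)))
    mask = np.zeros(len(S))
    mask[max(k - half_bw, 0):k + half_bw + 1] = 1
    return np.fft.irfft(S * mask, len(x))

# boundary estimation on a two-peak toy spectrum
freq = np.arange(200)
spec = (np.exp(-0.5 * ((freq - 20) / 2) ** 2)
        + np.exp(-0.5 * ((freq - 40) / 2) ** 2))
peaks, bounds = band_boundaries(spec)

# iterative filtering on a two-tone toy signal
fs = 1000
t = np.arange(fs) / fs
z = np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
ocs, residual = iterative_filtering(z, dominant_band)
```

The loop pulls the 20 Hz component first, then the 40 Hz one, and halts once subtracting another "OC" no longer changes the residual spectrum.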

2) Multivariate Extension of SSDM: Consider a multivariate signal of N data channels. Due to the mode alignment problem in the univariate SWD filtering approach, which adopts channel-specific mode reconstruction with univariate modulated oscillations, each channel is decomposed invariantly to the others, introducing difficulties for multivariate analysis. Therefore, to extend the analysis to find matched oscillatory signals, MSSDM adopts common boundary estimation in the obtained SFT spectrum by computing the mean spectrum magnitude of the multivariate signals from all channels

ψ(s) = (1/N) Σ_{n=1}^{N} |z_n|

where z_n represents the FT spectrum of the individual channel. Adaptive SSDM-based filter banks are then formed by applying the boundary estimation technique to the obtained mean spectrum. The obtained common filter bank is applied to all channel data to get aligned modes. These OCs are narrow-band components that deliver common frequency support at every oscillatory level. The MSSDM-based decomposition result of MEG 4-channel (MEG0111-0114) multivariate signals corresponding to the AF visual object category is shown in Fig. 7. Through the obtained common decomposed modes OC1-OC5, the proposed MSSDM method demonstrates its ability to detect common or joint OC modes, which have the same spectral content across the multiple channels. The spectral analysis using Welch power spectral density (PSD), shown in Fig. 6, illustrates the mode alignment of the selected OCs with the MSSDM filter banks. The similar spectral responses found in the PSD plots clearly demonstrate that the OCs consistently detect oscillations over multiple channels and are properly aligned for the MEG signals. In contrast to multivariate decomposition, a univariate decomposition will not align similar OC modes in the same mode numbers when decomposing the multichannel signal separately [5]. The proposed MSSDM is detailed in Algorithm 1.
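A sketch of the common-filter-bank idea: one shared set of spectral boundaries, applied to every channel, yields mode-aligned OCs. The boundary value here is hand-picked for the toy signal rather than estimated from ψ(s):

```python
import numpy as np

def common_filter_bank(channels, boundaries, fs):
    """Apply one shared set of spectral bands to every channel of a
    (n_channels x n_samples) array, yielding channel-aligned OCs."""
    n = channels.shape[-1]
    freqs = np.fft.rfftfreq(n, 1 / fs)
    edges = [0, *boundaries, fs / 2]
    ocs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = [np.fft.irfft(np.fft.rfft(ch) * mask, n) for ch in channels]
        ocs.append(np.array(band))            # one (channels x samples) per band
    return ocs

# two channels sharing 10 Hz and 25 Hz content with different weights
fs = 1000
t = np.arange(fs) / fs
chans = np.array([np.sin(2*np.pi*10*t) + 0.5*np.sin(2*np.pi*25*t),
                  0.8*np.sin(2*np.pi*10*t) + np.sin(2*np.pi*25*t)])

# mean magnitude spectrum psi(s); boundaries would be estimated from its peaks
mean_spec = np.mean(np.abs(np.fft.rfft(chans, axis=-1)), axis=0)

ocs = common_filter_bank(chans, boundaries=[17.5], fs=fs)
```

Both channels land their 10 Hz content in OC1 and their 25 Hz content in OC2, which is exactly the alignment property a channel-by-channel decomposition cannot guarantee.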
To verify the mode alignment and denoising ability of MSSDM to identify and align common mode oscillations present across multiple channels for efficient extraction of channel-aligned OCs, we have conducted an ablation experiment to examine how the various elements of the proposed MSSDM approach affect the multivariate analysis of nonstationary signals (EEG and MEG). To demonstrate the significance of these elements, we have assessed four approaches: 1) FT spectrum without SGF, 2) FT spectrum with SGF, 3) SFT spectrum without SGF, and 4) SFT spectrum with SGF.
To test the effectiveness of the proposed MSSDM, we have conducted an ablation study. In this paradigm, a synthetic three-channel multivariate signal x(n) has been designed. Each channel signal consists of three nonstationary amplitude-modulated (AM) monocomponent signals, i.e., three sinusoids with different frequencies and three linearly varying amplitudes. Here, s_c1(n), s_c2(n), and s_c3(n) represent the three channels. To inherit the complexity of EEG signal characteristics, the test synthetic signal is designed with sinusoids selected in the frequency range 10-50 Hz. The parameters are f_1 = 10 Hz, f_2 = 20 Hz, f_3 = 30 Hz, f_4 = 40 Hz, f_5 = 50 Hz, and the sampling frequency F_s is 1 kHz. The signal-to-noise ratio of the white Gaussian noise ω[n] is −10 dB. ω is the discrete carrier angular frequency and a is the modulation index.
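Since the exact channel equations are not reproduced here, the following is one plausible construction consistent with the stated parameters (f1-f5, F_s = 1 kHz, linear AM envelopes, −10 dB white Gaussian noise); the envelope slopes and the frequency assignment per channel are assumptions:

```python
import numpy as np

fs, n = 1000, 1000
t = np.arange(n) / fs
f = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}   # f1..f5 from the text (Hz)
rng = np.random.default_rng(0)

def am(fc, slope):
    """AM monocomponent: linearly varying envelope times a sinusoid."""
    return (1 + slope * t) * np.sin(2 * np.pi * fc * t)

def awgn(x, snr_db):
    """Add white Gaussian noise at the requested SNR (-10 dB in the text)."""
    p_sig = np.mean(x ** 2)
    p_noise = p_sig / 10 ** (snr_db / 10)
    return x + np.sqrt(p_noise) * rng.standard_normal(len(x))

# one plausible layout with overlapping frequency content across channels,
# so that mode alignment is actually exercised
s_c1 = awgn(am(f[1], 0.5) + am(f[2], 0.3) + am(f[3], 0.2), -10)
s_c2 = awgn(am(f[2], 0.4) + am(f[3], 0.5) + am(f[4], 0.1), -10)
s_c3 = awgn(am(f[3], 0.2) + am(f[4], 0.4) + am(f[5], 0.3), -10)
x = np.vstack([s_c1, s_c2, s_c3])
```

Sharing f2-f4 across channels mimics the common oscillations that the multivariate decomposition is expected to align.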
In the ablation study, the approaches used to design the proposed MSSDM have been evaluated using spectrum analysis, decomposition results, and reconstruction performance, which are shown in Figs. 5 and 4 and Table I, respectively. Signal reconstruction efficiency is evaluated using the error-to-signal ratio (ESR) performance metric. The spectrum analysis in Fig. 5(e) shows that, compared with the other versions, the proposed MSSDM, which combines the SFT spectrum and the SGF filter, achieves improved frequency resolution and optimally detects boundary frequencies in the sparse domain. Fig. 5(b)-(d) shows that the spectra obtained by the other approaches overlap and their spectral boundaries are indistinguishable, which makes it difficult to reconstruct channel-aligned components. The performance of our approach declines sharply when both components (SFT and SGF) are removed, demonstrating the significance of these parts, as shown in Fig. 5(b).

Table I compares the ESR values obtained for x(n) for the AM reconstructed monocomponent signals s_1(n), s_2(n), s_3(n), s_4(n), and s_5(n) using the FT (without SGF), FT (with SGF), SFT (without SGF), and SFT (with SGF) approaches. The lowest ESR values for every reconstructed monocomponent signal using the MSSDM method (SFT with SGF) prove the efficient reconstruction of components, whereas the reconstruction efficiency of the other approaches is poor, as their ESR values were found to be higher. This is also demonstrated in Fig. 4, which indicates the decomposition performance of all approaches. It can be readily observed that the extracted modes from the proposed MSSDM (SFT with SGF) are channel-aligned in terms of their joint frequency content, as shown in Fig. 4(d). The extracted modes OC2 and OC3 from channels 1, 2, and 3 are aligned with the common time-varying frequency components, i.e., 20 and 30 Hz, which are locally present in the single modes OC2 and OC3, respectively. The remaining extracted components are locally associated with the respective modes of the channels. The proposed method shows a better ability to avoid mode mixing with the use of SFT and SGF and demonstrates all five converged components in five modes with the lowest ESR [shown in Fig. 4(d)], whereas the other approaches show inferior decomposition performance and extract components with nonaligned modes and overlapping nature [shown in Fig. 4(a)-(c)]. Overall, the above test results demonstrate the effectiveness of MSSDM in separating effective multivariate modulated oscillations from multichannel data while verifying mode alignment. These observations highlight the relative significance of each element and their interdependence in designing the proposed MSSDM method for multivariate analysis to obtain mutual features across channels and attain the best classification performance for visual cognitive analysis.
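The ESR metric can be computed as residual energy relative to signal energy; the exact normalization used in Table I is not stated, so this is one common form:

```python
import numpy as np

def esr(original, reconstructed):
    """Error-to-signal ratio: residual energy over signal energy
    (lower means better reconstruction)."""
    err = original - reconstructed
    return np.sum(err ** 2) / np.sum(original ** 2)

rng = np.random.default_rng(0)
s = np.sin(2 * np.pi * 20 * np.arange(1000) / 1000)
perfect = esr(s, s)                                  # exact reconstruction
noisy = esr(s, s + 0.1 * rng.standard_normal(1000))  # imperfect reconstruction
```

A perfect reconstruction gives ESR = 0; any residual raises it, which is why the lowest ESR column in Table I identifies the best-performing variant.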

C. Feature Extraction and Selection
In the literature, several features have been reported for the classification of visual objects [1], [3], [15]. Here, we have adopted feature parameters from multiple domains, RE [16], CSD [17], SE, and CSPTE, computed from the α-β rhythms of 1 s multichannel EEG-MEG epochs. These selected features have proved a strong exploration of the spectral variability, amplitude, and complexity of real signals [16], [17], [18]. To enhance feature selection, a novel fusion feature approach is presented based on Riemann's correlation [19] that finds the best-correlated features. Among the employed features, the two newly proposed feature extraction techniques are explained as follows.

1) Common Spatial Filter on Teager Energy:
In this feature, we employ a multiclass CSP algorithm on the Teager energy (TE), which is computed from the MSSDM modes y_c^N of total sample N for the cth class. It is common to explore EEG signals using CSP features, but CSP with TE is a new feature approach for solving multiclass visual categorization problems [18]. A multiclass CSP algorithm with a one-versus-rest scheme is used to compute the TE that discriminates one class from the rest; the features are then concatenated to form a feature vector. The extracted CSPTE feature f_i is the binary-class CSPTE feature in the selected band after the projection of the spatial filters ω [20].

2) Sparse Entropy: The SE feature is introduced to enhance the ability of the entropy feature to extract jointly significant features using a sparse filter bank and an adaptive windowing technique. Essentially, it is a multitask filtering model that combines sparse spectrum optimization with Lasso penalties (||U||_1, ||U||_{2,1}, and ||u_t − u_{t+1}||_1) and entropy feature vectors H(t) = −Σ_{i=1}^{N} P(y_i) · log_2 P(y_i) computed from the obtained rhythm-based OCs y_i of the ith channel, to enhance the temporal characteristics of the features. In the multiclass feature set (F_i) formulation [21], the penalty parameters (β_1, β_2, and β_3) are set to 0.1, and u_t denotes the learned projection vector at the T-th sliding window.
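Illustrative sketches of the building blocks named above: the discrete Teager energy operator, a standard two-class CSP with log-variance features (the paper wraps this in a one-versus-rest multiclass scheme applied to TE), and the Shannon entropy term used inside SE. The Lasso-penalized multitask part of SE is omitted:

```python
import numpy as np

def teager(x):
    """Discrete Teager energy operator: x[n]^2 - x[n-1] * x[n+1]."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def csp_logvar(trials_a, trials_b):
    """Two-class CSP sketch: whiten the composite covariance, then use
    log-variance of the projected signals as features."""
    cov = lambda X: np.mean([x @ x.T / np.trace(x @ x.T) for x in X], axis=0)
    ca, cb = cov(trials_a), cov(trials_b)
    evals, evecs = np.linalg.eigh(ca + cb)
    white = evecs @ np.diag(evals ** -0.5) @ evecs.T     # whitening matrix
    d, v = np.linalg.eigh(white @ ca @ white.T)
    w = v.T @ white                                      # spatial filters
    feats = lambda x: np.log(np.var(w @ x, axis=1))      # log-variance features
    return w, feats

def shannon_entropy(x, bins=16):
    """Entropy term H = -sum p * log2(p), from a histogram estimate of p."""
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log2(p))

# TE of a pure tone is constant (A^2 * sin^2(Omega), an exact identity)
x10 = np.sin(2 * np.pi * 10 * np.arange(200) / 1000)
te = teager(x10)

# CSP demo: class A strong on channel 0, class B strong on channel 1
rng = np.random.default_rng(1)
trials_a = [np.vstack([2.0 * rng.standard_normal(100),
                       0.2 * rng.standard_normal(100)]) for _ in range(8)]
trials_b = [np.vstack([0.2 * rng.standard_normal(100),
                       2.0 * rng.standard_normal(100)]) for _ in range(8)]
w, feats = csp_logvar(trials_a, trials_b)
h = shannon_entropy(rng.standard_normal(1000))
```

In the paper's pipeline the CSP filters would be learned on the TE of MSSDM modes, one-versus-rest per class, with the per-class feature vectors concatenated.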

IV. RESULTS AND DISCUSSION
To prove the effectiveness of the proposed MSSDM-based visual object recognition framework, we have tested SSDM-based features from the EEG signals (74-channel), MEG signals (74-channel), MEG signals (308-channel), and combined EEG-MEG signals (380-channel) of the visual stimulus of the MEGEEG92 Objects Dataset [3].The five different visual object classes, namely, HB, HF, AB, AF, and IO from the EEG-MEG dataset are classified using LDA and validated using performance metrics, such as accuracy (Acc), sensitivity (Sen), specificity (Spe), and Fisher (F)1-score.LDA classifier has been chosen based on its proven capability to investigate features with appropriate parameters and counteract over-fitting problems with less computational complexity in VSA decoding applications [2].To prove its potential against the existing methods [1], [2], [3], [4], the proposed method has been tested for its mode-alignment property by aligning common frequency scales across multiple EEG-MEG data.Also, we have compared the proposed method with univariate SSDM (USSDM) approach (channel specific analysis) [10], and direct rhythms (delta, theta, alpha, beta, and gamma) analysis, which is computed using a bandpass filter (BPF) [22] from raw EEG signals.In our work, the 1 s epoch of multichannel data are decomposed using MSSDM into different OCs.Further, rhythms: Delta (δ: 0.1-4 Hz), theta (θ: 4-8 Hz), alpha (α: 8-13 Hz), beta (β: 13-30 Hz), gamma (γ: 30-80 Hz), and combined alpha and beta (α-β: 8-30 Hz), are computed on mean frequency from the decomposed MSSDM modes.In our work, features are extracted from the combined α-β rhythms.α-β rhythms selection depends exclusively on experimentation and delivers highest performance (see the Supplementary Materials).Fig. 
8 shows the multivariate TF representations of the α-β rhythms from three selected channels (FC3, T8, and F4 of subject 1) using proposed MSSDM method for the five-class visual object MEG data.It exhibited discriminative temporal and spectral characteristics in the obtained α-β rhythms related to the different visual object classes.In order to extract the most discriminative features from decomposed OCs, a novel feature scheme has been implemented, in which fusion features are computed by finding the most correlated features from the normalized features (RE, SE, CSPTE, and CSD) using Riemann's correlation.The objective is to formulate the most significant VSA features set by separating and eliminating nondiscriminative features in subjectspecific without compromising classification performance.Table II tabulates the classification accuracy achieved in the different channel schemes based on different Riemann's correlation coefficient factors (Q) from computed rhythms using MSSDM, USSDM, and BPF methods.Here, FF features are formulated with a selection of different correlation values (>0.4,>0.5, >0.6, >0.7, >0.8, and >0.9) and all remaining features are eliminated.From Table II, it is shown that FF features with a correlation value (> 0.7) demonstrated the best classification results for five visual object classes corresponding to four channel selection schemes: EEG (74), MEG (74), MEG (306), and EEG-MEG (380).In the study, a subject-specific and cross-subject tenfold cross-validation scheme was conducted, along with a statistical t-test (p < 0.05) [10].It employed leave-one-out cross-validation in feature vectors from the cross-subject dataset considered in a trainingtesting configuration of 80%-20%.This process was repeated for all subjects and average results were reported.To test the computed FF features, different MSSDM-based classification frameworks have been developed, such as the MSSDM method with LDA (MSSDM-FF-LDA), in with the USSDM method with LDA 
(USSDM-FF-LDA), and the BPF method with LDA (BPF-FF-LDA), each under the four channel selection schemes. Table III tabulates the feature-specific average classification performance obtained from all 16 subjects using the MSSDM, USSDM, and BPF methods. Here, classification has been carried out over ten repetitions, and average accuracies are reported for the four channel selection schemes of MEG and EEG sensor data.
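The ten-repetition, tenfold evaluation described above can be sketched as a stratified cross-validation loop around an LDA classifier. The snippet below is an illustrative reconstruction using scikit-learn, not the authors' exact pipeline; the function name `evaluate_lda_tenfold`, the macro-averaging choices, and the random seed are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import StratifiedKFold

def evaluate_lda_tenfold(X, y, n_splits=10, seed=0):
    """Stratified tenfold CV around LDA, returning mean accuracy,
    sensitivity (macro recall), specificity (macro one-vs-rest
    true-negative rate), and macro F1-score."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    classes = np.unique(y)
    accs, sens, spes, f1s = [], [], [], []
    for tr, te in skf.split(X, y):
        pred = LinearDiscriminantAnalysis().fit(X[tr], y[tr]).predict(X[te])
        accs.append(accuracy_score(y[te], pred))
        sens.append(recall_score(y[te], pred, average="macro"))
        f1s.append(f1_score(y[te], pred, average="macro"))
        # Specificity per class: TN / (TN + FP) = TN / (actual negatives),
        # then macro-averaged over the five classes.
        spe = [np.sum((y[te] != c) & (pred != c)) /
               max(np.sum(y[te] != c), 1) for c in classes]
        spes.append(np.mean(spe))
    return tuple(float(np.mean(v)) for v in (accs, sens, spes, f1s))
```

Averaging this over ten repeated runs (varying `seed`) would reproduce the ten-repetition protocol reported for Table III.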
As given in Table III, the proposed classification framework MSSDM-FF-LDA has achieved the highest average accuracy of 86.42% among all frameworks in the case of the EEG-MEG (380-channel) scheme with the fusion feature (RE, CSPTE, SE, CSD). The obtained sensitivity, specificity, and F1-score are 87.08%, 83.11%, and 84.89%, respectively. The performance of MSSDM-SE-LDA with the SE feature deteriorates slightly when using the combined EEG-MEG channels, achieving an average accuracy of 84.06%, sensitivity of 82.65%, specificity of 88.15%, and F1-score of 80.50%. In contrast, MSSDM-RE-LDA reported relatively low performance of 77.90% for the RE feature with the combined EEG-MEG channels. It is noted that the newly introduced features, SE, CSPTE, and FF, with LDA have delivered the highest average accuracy for visual object classification in the case of the EEG-MEG (380-channel) scheme. Fig. 9 presents an evaluation of the classification accuracies obtained using FF features from the MSSDM, USSDM, and BPF methods in all 16 subject-specific cases. In our analysis, we have computed the FF features based on Riemann's correlation index (Q) with a threshold value (Q > 0.7) from the features (RE, SE, CSPTE, and CSD). FF feature selection is intended to reduce feature complexity while improving performance. In the subject-specific setting, the MSSDM-FF-LDA framework demonstrated the highest average accuracy (86.42%) within the EEG-MEG (380-channel) scheme. However, the USSDM-FF-LDA

TABLE IV PERFORMANCE COMPARISON AGAINST STATE-OF-THE-ART METHODS AVAILABLE IN THE LITERATURE
and BPF-FF-LDA classification frameworks achieved comparatively lower average accuracies of 77.29% and 69.96%, respectively. In tenfold cross-validation, the MSSDM-FF-LDA framework achieved the highest performance with an accuracy of 84.98%.
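The thresholded fusion-feature (FF) selection used in these frameworks can be illustrated with a short sketch. The paper's Riemann's correlation index Q is the authors' own construct; as a stand-in, the hypothetical `select_fusion_features` below uses ordinary Pearson correlation between normalized feature columns and keeps those whose maximum absolute correlation with another feature exceeds the threshold (here Q > 0.7), eliminating the rest.

```python
import numpy as np

def select_fusion_features(F, q=0.7):
    """Keep feature columns of F (n_trials x n_features) whose maximum
    absolute correlation with any other feature exceeds q; drop the rest.
    Plain Pearson correlation is an illustrative stand-in for the paper's
    Riemann's correlation index Q."""
    Fz = (F - F.mean(0)) / (F.std(0) + 1e-12)   # normalize each feature
    C = np.abs(np.corrcoef(Fz, rowvar=False))   # pairwise |correlation|
    np.fill_diagonal(C, 0.0)                    # ignore self-correlation
    keep = C.max(axis=0) > q                    # apply threshold Q
    return F[:, keep], np.flatnonzero(keep)
```

Once the actual Riemann's correlation is computed, only the correlation matrix `C` would change; the thresholding step stays the same.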
In the case of MEG or EEG data alone, the proposed classification frameworks show slightly inferior performance compared with the combined EEG-MEG channel data. In the MEG (306-channel) scheme, the proposed MSSDM-FF-LDA delivered comparatively improved accuracy, sensitivity, specificity, and F1-score of 83.05%, 85.80%, 83.04%, and 83.69%, respectively, whereas the lowest accuracies obtained for the USSDM-FF-LDA and BPF-FF-LDA classification frameworks are 75.71% and 68.34%, respectively. The performance of the LDA classifier with the new features (SE and CSPTE) in evaluating the proposed method is also noteworthy, as the obtained average accuracy rates are 80.98% and 81.97%, respectively. The sensitivity and specificity rates with the CSPTE feature are found to be higher for subjects 2, 11, 12, and 13 (an average of 86.05% of VSA classes are correctly classified), while maintaining a high average F1-score (82.21%) for all these subjects (given in Table III). For subjects 2-9 and 11-13, the LDA classifier with FF achieved very good sensitivity rates (more than 85.80%) together with high specificity rates.
In similar cases, the MEG (74-channel) and EEG (74-channel) schemes also provided good sensitivity and specificity rates for most of the subjects. For the MEG (74-channel) scheme, the highest average accuracy, sensitivity, specificity, and F1-score for the SE feature with LDA (MSSDM-SE-LDA) are 75.85%, 72.40%, 70.01%, and 69.49%, respectively. In the case of EEG (74-channel), the performance of features computed from the 74-channel EEG signals falls significantly for all subjects; the lowest classification performance is obtained for the RE feature, with average accuracy, sensitivity, specificity, and F1-score of 69.65%, 66.38%, 65.91%, and 66.85%, respectively. It is also clear from Table III that the sensitivity of the proposed method for EEG (74-channel) is comparatively low for all subjects. For these subjects, visual objects are difficult to detect using the obtained MSSDM features because of nondiscriminative overlapping multivariate modulated oscillations and heavy artifact contamination combined with a short visual stimulation response. The overall classification performance for the different feature schemes is presented in Table III and Fig. 10.
In Table IV, we compare the proposed MSSDM method with existing methods for classifying visual objects using EEG and MEG signals already reported in the literature [1], [2], [3], [4]. The table illustrates the classification accuracy across subjects and the statistical analysis under the different experimental conditions of the proposed method. In our work, the comparison considered the databases of [1], [2], [3], [4] under the same experimental conditions. For classifying visual objects, researchers have mostly explored different machine learning or deep learning methods to compute discriminative features from raw EEG or MEG signals. The classification method proposed by Cichy et al. [1] achieved the lowest classification accuracy of 68.75% for six-class visual objects using the PLTP feature with an SVM classifier on MEG (306-channel) signals. Similar work mentioned in [2] delivered five-class performance of 74.5%, 74%, 85%, and 75% for LDA, SVM, GNB, and WeiRD-based supervised classification frameworks with RDM features from the MEG 306-channel scheme, respectively. In contrast, our suggested MSSDM-FF-LDA framework using the MEG 306-channel scheme yields an enhanced average classification accuracy of 86.05%. Against the above approaches, the classification model proposed by Kong et al. [4] achieved a competent accuracy of 65.60% using a CNN-based approach. The mentioned studies delivered poor classification performance even though they utilized deep learning-based approaches. In comparison, our proposed MSSDM-FF-LDA framework achieves an improved average classification accuracy of 70.35% on the EEG 74-channel scheme. However, the above studies are limited to EEG-based visual cognitive analysis. On the other side, Cichy et al.
[3] employed raw extracted features using RSA and SVM for different channel selection schemes, i.e., EEG (74-channel), MEG (306-channel), and combined EEG-MEG (380-channel), to improve visual object classification and yielded the highest average classification performance. In extension to this, our work delivers competent VSA classification results whether EEG and MEG channels are selected separately or combined. In addition, we have compared the proposed method with the univariate SSDM approach (channel-specific analysis) and with direct rhythm analysis computed using a BPF [22]. The comparative performance is demonstrated in Table IV. Additional results are presented in Tables V-XX and Figs. 11-14 in the Supplementary Material. Overall, we argue that the proposed MSSDM method provides more significant features, enhancing interpretability and interchannel information, which boosts the performance of visual object classification. The obtained features capture the most discriminative multivariate modulation patterns of EEG-MEG data for effective visual object discrimination. Even under subject-independent cross-validation, the proposed MSSDM method delivers significantly higher performance, which verifies its suitability for practical applications, especially visual object classification from EEG and MEG signals. The proposed MSSDM method effectively captures homogeneous spectral characteristics across channels through channel-aligned mode extraction and improves joint multivariate mutual features related to visual cognitive analysis. However, it requires precise parameter tuning in the designed SFT spectrum estimation and multivariate filter bank.
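The direct-rhythm BPF baseline used in this comparison can be sketched with zero-phase Butterworth band-pass filters. The band edges follow the rhythm definitions given earlier in this section; the sampling rate, filter order, and function name are illustrative assumptions, not details taken from [22].

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Rhythm bands (Hz) as defined in the text; the combined alpha-beta
# band follows the paper's 8-30 Hz definition.
BANDS = {"delta": (0.1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 80), "alpha_beta": (8, 30)}

def bpf_rhythms(x, fs=1000.0, order=4):
    """Zero-phase Butterworth band-pass filtering of a (channels x samples)
    epoch into the classical EEG/MEG rhythms -- a sketch of the direct
    BPF baseline; filter order and fs are assumed values."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out[name] = sosfiltfilt(sos, x, axis=-1)  # forward-backward filter
    return out
```

Unlike MSSDM, this baseline imposes fixed band edges on every channel; the data-driven filter bank of MSSDM instead adapts the bands to the sparse spectrum.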

V. CONCLUSION
This article proposed an integrated approach, combining a novel multivariate SSDM and the RCFF feature, for multiclass visual imagery decoding. A novel extension of SSDM is developed for multivariate analysis of nonstationary multichannel EEG-MEG signals as well as for performance improvement. We have also used a multivariate TF representation to analyze multichannel EEG-MEG signals and enhance the underlying VSA patterns for visual object recognition. The sparse FT spectrum-assisted MSSDM delivers optimized OC modes with high spectral resolution and can thus exhibit highly correlated multispectral modulation among all channels. It has been demonstrated that the novel fusion features are very effective in distinguishing five-class visual objects and improve classification performance on 1 s short-duration epochs. The results on the MEGEEG92 objects dataset demonstrate that the proposed MSSDM-FF-LDA framework achieves distinguished performance, with average recognition accuracies of 70.35%, 78.25%, 86.05%, and 86.42% for the EEG (74-channel), MEG (74-channel), MEG (306-channel), and EEG-MEG (380-channel) schemes, respectively. Overall, this indicates that the proposed MSSDM method with combined EEG-MEG data can be a useful tool for the analysis of visual objects instead of using EEG and MEG separately. Future work involves the development of a multivariate TF-based methodology to enhance VSA classification performance over a broader range of cognitive visual object classes.

Shailesh Vitthalrao Bhalerao and Ram Bilas Pachori, Senior Member, IEEE

Index Terms: Fusion features, multivariate swarm sparse decomposition method, sparse spectrum, visual object recognition.

Manuscript received 25 January 2024; accepted 24 April 2024. Date of publication 17 May 2024; date of current version 17 July 2024. This article was recommended by Associate Editor T. H. Falk. (Corresponding author: Ram Bilas Pachori.)

I. INTRODUCTION

Over the recent decades, substantial improvement in multivariate pattern analysis of magnetoencephalogram (MEG) and electroencephalogram (EEG) has placed fundamental importance on the recognition of visual objects through neural activation patterns.

Fig. 1. Timing scheme of the experimental paradigm for the visual object database.

Fig. 2. Block diagram of the proposed approach for visual object recognition using the MSSDM method.

1) Univariate SSDM-Based SwF: An SSDM is an adaptive algorithm that uses a data-driven approach to decompose signals into OCs through iterative SwF. In SwF, the dominant OCs are derived from the frequency band that exhibits the highest energy peak on the basis of the energy spectral density (ESD). The following steps are performed for efficient extraction of OCs through sparse Fourier transform (FT) spectrum-based SwF.
a) Sparse FT spectrum estimation: The sparse FT technique is employed to obtain spectral coefficients for multivariate nonstationary signals. A sparse-FT spectrum estimation model uses an iteratively reweighted least-squares basis estimation function to optimize sparse constraints in the time domain. The sparse FT model of the signal z(v) is designed with the use of the forward sparse FT and its adjoint operator,
wherein |y_{δ,M}[k]| and |z[k]| are the FT amplitudes of the output y_{δ,M}[k] and the input signal z[k], and d is the number of extracted OCs.
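The ESD-peak selection that seeds the swarm filter can be sketched for a single channel as follows. Here the plain FFT stands in for the sparse-FT spectrum, and the helper name is hypothetical; the sketch returns the peak frequency that would seed the dominant OC together with the ESD-weighted mean frequency later used to assign OCs to rhythm bands.

```python
import numpy as np

def dominant_band_and_mean_freq(z, fs):
    """Locate the ESD peak of a single-channel signal z (the band that
    seeds the swarm filter) and its ESD-weighted mean frequency.
    The plain FFT is an illustrative stand-in for the sparse-FT
    spectrum estimation."""
    Z = np.fft.rfft(z)
    esd = np.abs(Z) ** 2                     # energy spectral density
    f = np.fft.rfftfreq(len(z), d=1.0 / fs)  # frequency axis (Hz)
    f_peak = f[np.argmax(esd)]               # highest-energy band centre
    f_mean = np.sum(f * esd) / np.sum(esd)   # mean frequency of the signal
    return float(f_peak), float(f_mean)
```

In the full SSDM, this peak selection is applied iteratively: the dominant OC is extracted, subtracted, and the next peak is sought in the residual.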

Fig. 4. Decomposition of a multivariate three-channel synthetic signal with similar frequency modes using the MSSDM method with (a) FFT without SGF, (b) FFT with SGF, (c) SFT without SGF, and (d) SFT with SGF approaches.

Fig. 6. Filter-bank structure of the MSSDM-based SwF filter bank for four-channel EEG and MEG signals with selected OCs.
{y_c^1, y_c^2, y_c^3, ..., y_c^n, ..., y_c^N} are the obtained nth rhythm-based OCs of class c (c = 1, 2, 3, 4, 5) for a total of N samples, and T indicates the matrix transpose. In our work, a pairing selection parameter m = 2 is used to formulate a significant feature subset for the spatial filter.
In order to apply the concept of SWD to multichannel signals, it is necessary to extract the common OC mode for each channel, which locates the common spectral information of the analyzed signal. In MSSDM, the OCs are extracted by selecting the common frequency band from the multichannel signals that possesses the highest amplitude peak through the ESD. Therefore, the prime task of MSSDM is to extract matched OCs from multichannel signals so that the obtained joint instantaneous frequency at each oscillatory level possesses common centre frequencies with aligned compact bandwidths. This extension of SSDM delivers the most common multivariate modulated oscillations in the multichannel data for the following reasons: 1) MSSDM extracts common modes from multichannel data to accurately reconstruct the original signal with the least common minimum bandwidth; and 2) contrary to the univariate SWD approach, MSSDM adopts the mode-alignment property to obtain multivariate modulated oscillations, which correspond to the matching of modes with similar frequency content across multiple channels. We now find the k modes of multivariate modulated oscillation from the nonstationary multichannel input data z(t) containing N channels.
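The channel-aligned band selection described above can be sketched by summing the per-channel ESDs so that a single common peak is chosen for all channels. This is an illustrative simplification of MSSDM's matched-OC extraction; the fixed bandwidth `bw` and the function name are assumptions.

```python
import numpy as np

def common_dominant_band(Z, fs, bw=4.0):
    """Select the frequency band shared by all channels of Z
    (channels x samples): per-channel ESDs are summed so the band with
    the highest joint energy peak is chosen once for every channel,
    mimicking MSSDM's channel-aligned mode extraction. Returns the
    (low, high) edges of a band of assumed width bw (Hz) around the
    common centre frequency."""
    esd = np.abs(np.fft.rfft(Z, axis=-1)) ** 2   # per-channel ESD
    joint = esd.sum(axis=0)                      # joint multichannel ESD
    f = np.fft.rfftfreq(Z.shape[-1], d=1.0 / fs)
    fc = f[np.argmax(joint)]                     # common centre frequency
    return max(fc - bw / 2, 0.0), fc + bw / 2    # aligned compact band
```

Because the peak is picked on the summed spectrum, every channel's OC is extracted from the same band, which is what gives the decomposed modes their mode-alignment property.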

TABLE II CLASSIFICATION ACCURACY (IN %) OF COMBINED (α-β) RHYTHM-BASED FUSION FEATURES COMPUTED FROM EXTRACTED OCS USING MSSDM, USSDM, AND BPF WITH RIEMANN'S CORRELATION ANALYSIS