Terahertz Sensing using Deep Neural Network for Material Identification

The terahertz (THz) spectrum is regarded as a potential enabler for advanced sensing and positioning, and THz time-domain spectroscopy (THz-TDS) is well suited to investigating the unique properties of materials. Transmission THz-TDS measures the light absorption of materials. This paper proposes a novel low-complexity deep neural network (DNN)-based multi-class classification architecture to sense a wide variety of materials from transmission spectroscopy. Based on spectroscopic measurements made across a chosen THz region of interest, the DNN extracts and learns the distinctive crystal structure of materials as features. We train and validate the model with sufficient quantities of noisy spectroscopic data and labels. In the low signal-to-noise ratio (SNR) regime, the proposed DNN classification architecture achieves a success rate of about 92%, which is higher than that of the state-of-the-art methods.


I. INTRODUCTION
Wireless communication has evolved steadily over several decades through the expansion of carrier frequency and bandwidth. Accordingly, the upcoming sixth generation (6G) is anticipated to offer truly intelligent wireless services [1] with a wide variety of new applications such as holoport (holographic teleportation), extended reality (XR), metaverse, self-sustaining machinery, and many more [2], [3]. Consequently, the unexplored terahertz (THz) spectrum (0.1 to 10 THz, of which 0.1 to 0.3 THz is called the sub-THz band) attracts interest because of its greater bandwidth and distinctive characteristics [4]. Even though the THz spectrum offers a broader bandwidth, it is highly sensitive to molecular absorption and obstructions, which limits its coverage range. Nevertheless, the THz band is a potential enabler for accurate localization and advanced sensing [5]. Exploring localization and sensing under realistic circumstances may help address a wide range of critical wireless challenges.
THz sensing has the potential to support a wide variety of applications in various domains, such as biomedicine, manufacturing, security, and agriculture, for measuring the quality of products, sensing environmental conditions, and identifying or detecting materials that must be handled carefully [4]. THz spectroscopy is the technique of measuring the properties of objects using THz signals, and THz time-domain spectroscopy (THz-TDS) is regarded as a promising technique for examining materials because it can identify their "fingerprints" [6]. Specifically, THz spectroscopy probes the crystal structure of molecules through their hydrogen bonds, which yields features in the spectrum that are unique to each material [6]. THz-TDS uses a pulsed wave to measure the properties passively: the transmitter sends a short pulse toward the specimen, and the receiver measures the strength of the received electromagnetic field both with and without the specimen [4]. The ratio between these two measurements characterizes the material properties. Two varieties of THz-TDS are available to measure these properties: transmission and reflection spectroscopy [4], [7]. Transmission spectroscopy measures the amount of light absorbed, whereas reflection spectroscopy measures the light reflected or scattered. Because of its simpler architecture and more accurate observations, transmission spectroscopy is much more common in sensing applications [4].
This work was supported by the Academy of Finland 6Genesis Flagship (grant no. 346208) and ALMLBEAM project (grant no. 24303959).
Extracting the entangled features of transmission spectroscopy creates many opportunities to detect materials and their potential use cases. Recent advancements in artificial intelligence (AI) and machine learning (ML) [8] show the prospect of investigating detection [9], estimation, and prediction in wireless communication using realistic data-driven approaches [10]-[12]. In this regard, joint feature extraction and detection from noisy spectroscopic measurements are required to identify materials uniquely. However, THz-TDS measurements can be perturbed by noise due to absorption. Furthermore, it is necessary to extract the specific latent features of each material in order to distinguish a wide variety of materials [4].
Several studies explore material identification using conventional feature extraction techniques and deep learning [7], [13]. A material identification method based on approximate entropy and a deep neural network (DNN) is presented in [13], where 14 different materials are investigated. That approach achieves 80.4% accuracy in detecting the materials without noise. A recent study on THz sensing based on signal processing and ML is presented in [7], where pre-processing, feature extraction, and performance analysis of THz spectroscopy are discussed. The authors of [7] compare different pre-processing techniques and existing ML algorithms for 5 materials. Together, these studies suggest that a low-complexity classification algorithm that handles numerous materials without additional pre-processing is a promising opening for THz sensing.
The main contribution of this paper is a low-complexity DNN algorithm for THz sensing that integrates feature extraction and material identification in a single unit. We further investigate a large number of material properties and tune the model to detect them over a signal-to-noise ratio (SNR) range of interest. The rest of the paper is structured as follows: Section II describes the system model of THz sensing and the details of the transmission spectroscopy. The proposed DNN architecture is described in Section III. Numerical simulations are presented in Section IV. Finally, Section V concludes the paper.

II. SYSTEM MODEL AND PROBLEM DEFINITION
A. System Model
Consider a THz sensing system that identifies a set $\mathcal{N}$ of $N$ materials, using transmission spectroscopy to measure the absorption of each material. Each material sample is powdered from the source components, pressed into solid pieces¹, and tested in the THz-TDS experimental setup shown in [7]. As mentioned earlier, each measurement is taken both with and without the sample at the test probe, which gives the transmitted and incident pulses, respectively. The complex-valued spectra of the two measurements are then obtained using the Fourier transform. Consequently, the optical properties of the sample in the frequency domain are calculated from the ratio of the electric field strengths of the transmitted and incident pulses. The amplitude of the THz-TDS transmittance of the $i$-th material, $T_i(f)$, is defined in (1), where $E_{t,i}$ is the transmitted amplitude, $E_0$ is the incident amplitude, $n_i$ is the refractive index, $\tilde{n}_i$ is the complex refractive index, $d_i$ is the thickness of the material, $f$ is the frequency, and $c$ is the free-space speed of light. The refractive index $n_i$ is calculated from the measurement as in (2), where $\varphi_i$ is the phase difference between the measurements with and without the sample. Based on (1) and (2), the absorption coefficient $A_i$ is calculated as in (3).
Feature extraction and identification of materials can be based on the transmittance and absorption coefficients over the THz spectrum of interest. In this work, we base our investigation on the transmittance properties given in [14].
¹ The detailed procedure of the measurements is given in NIST.
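For reference, relations (1)-(3) can be written in a form that is common in the THz-TDS literature. The block below is a sketch under the usual thick-sample, low-absorption approximation and uses only the symbols defined above, plus the assumed decomposition $\tilde{n}_i = n_i - j\kappa_i$, where the extinction coefficient $\kappa_i$ is introduced here for illustration. The exact expressions and sign conventions are assumptions and should be checked against [7] and [14].

```latex
\begin{align}
T_i(f) &= \frac{E_{t,i}(f)}{E_0(f)}
        = \frac{4 n_i}{(n_i + 1)^2}
          \exp\!\left(-j\,\frac{2\pi f d_i}{c}\,(\tilde{n}_i - 1)\right), \tag{1}\\
n_i(f) &= 1 + \frac{c\,\varphi_i(f)}{2\pi f d_i}, \tag{2}\\
A_i(f) &= -\frac{2}{d_i}\,
          \ln\!\left(\frac{(n_i + 1)^2}{4 n_i}\,\lvert T_i(f)\rvert\right). \tag{3}
\end{align}
```

Under this form, taking the magnitude of (1) gives $\lvert T_i \rvert = \frac{4 n_i}{(n_i+1)^2} e^{-A_i d_i / 2}$ with $A_i = 4\pi f \kappa_i / c$, which rearranges to (3), while the phase of (1) gives (2).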

B. Problem Definition
We measure the transmission spectrum of each material over a range of THz bands and calculate (1). The output measurement vector for the $i$-th material is given by
$$\mathbf{y}_i = [y_{i,1}, \ldots, y_{i,M}]^{\mathsf{T}} + \mathbf{w}, \tag{4}$$
where $y_{i,m} = T_i(f_m) \in \mathbb{R}$, $M$ is the number of sub-wave numbers in the spectrum, and $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$ is the Gaussian noise vector. We also assume that a sufficient number of features, $K_i \le M$, is available in the measured spectrum; however, we do not extract them separately. We denote the label of the measurement $\mathbf{y}_i$ by $L(\mathbf{y}_i)$. Then, based on (1) and (4), we define the classification problem (5), in which $g_i$ is the classifier for the $i$-th material. The main challenge is to determine distinctive classifiers as $N$ increases, since only classifiers built on mutually non-exclusive features can identify the materials accurately.
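The noisy measurement model can be illustrated with a short sketch. It assumes that SNR is defined as the mean power of the clean transmittance samples divided by the noise variance $\sigma^2$, which the text does not state; the function name `add_awgn` and the synthetic spectrum are hypothetical.

```python
import math
import random


def add_awgn(transmittance, snr_db, rng):
    """Return (y, sigma2) with y = T + w and w ~ N(0, sigma2 * I).

    Assumption: SNR = mean(T^2) / sigma2, expressed in dB.
    """
    signal_power = sum(t * t for t in transmittance) / len(transmittance)
    sigma2 = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(sigma2)
    noisy = [t + rng.gauss(0.0, sigma) for t in transmittance]
    return noisy, sigma2


# Hypothetical clean transmittance spectrum with M samples (not measured data).
M = 2000
clean = [0.5 + 0.3 * math.sin(2 * math.pi * m / M) for m in range(M)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy, sigma2 = add_awgn(clean, snr_db=0.0, rng=rng)
```

At 0 dB the noise variance equals the mean signal power, so the spectral features are of the same order as the perturbation, which is the regime the classifier is expected to handle.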

III. PROPOSED DNN ARCHITECTURE
A. Deep Neural Network based Classification
The objective is to determine the classifiers for problem (5). Considering the mutually non-exclusive nature of the data, we formulate this as a multi-class classification problem with classes {1, . . . , N}. The noisy measurement (4) is the input to the model, and a binary vector of N elements is the corresponding label, where the index of the "1" identifies the specific material. We propose a low-complexity DNN architecture to solve this problem, as shown in Fig. 1. The proposed architecture consists of three main parts: the input layer, the hidden layers, and the output layer, where the multiple hidden layers are designed to learn the latent variables of the materials.
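The label encoding described above can be sketched as follows. The helper names `one_hot` and `decode` are hypothetical, and material indices are assumed to run from 1 to N as in the class set {1, . . . , N}.

```python
def one_hot(material_index, num_materials):
    """Binary label vector with a single 1 at the 1-based material index."""
    if not 1 <= material_index <= num_materials:
        raise ValueError("material index out of range")
    return [1 if n == material_index else 0 for n in range(1, num_materials + 1)]


def decode(label):
    """Recover the 1-based material index from a one-hot label."""
    return label.index(1) + 1


label = one_hot(3, 5)  # third of five materials
```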
The input layer is designed as a fully connected flattened layer; the hidden part comprises four dense layers with ReLU activation, batch normalization, and dropout. Finally, softmax is chosen for the output layer. As training data, we produce a set $\mathcal{D}$ of $D$ spectroscopic measurements. The input is fed into the flattening layer, whose output vector $z_{in} = z_i^j \in \mathbb{R}^{\alpha \times 1}$ is
$$z_{in} = W_{in}\,\mathbf{y}_i + b_{in}, \tag{6}$$
where $W_{in} \in \mathbb{R}^{\alpha \times M}$, $b_{in} \in \mathbb{R}^{\alpha \times 1}$, and $\alpha$ are the weight, bias, and width of the first layer, respectively. The features of the materials are then learned from $z_i^j$ in the set of hidden layers with ReLU activation, where each layer has the same structure as (6). For an input value $x$, ReLU is defined as $\mathrm{ReLU}(x) = \max(0, x)$.
Furthermore, the batch normalization layers reduce the effect of additive white Gaussian noise [11] in the measurements and improve the learning of the architecture. At the same time, dropout prevents overfitting during the training phase and enhances layer weight optimization. The architecture learns the features of the material using the collection of hidden layers [15] and produces the output $z_{out} = \tilde{z}_i^j \in \mathbb{R}^{N \times 1}$ as $z_{out} = W_{out}\,\mathbf{h} + b_{out}$, where $\mathbf{h} \in \mathbb{R}^{\alpha \times 1}$ denotes the output of the last hidden layer, and $W_{out} \in \mathbb{R}^{N \times \alpha}$ and $b_{out} \in \mathbb{R}^{N \times 1}$ are the weight and bias of the last layer, respectively. Finally, the softmax layer produces the probability that the input measurement belongs to each of the $N$ potential materials. In effect, the model is trained to realize $N$ unique classifiers $(g_1, \ldots, g_N)$ that solve this problem. The output probability produced by the softmax for the $i$-th material is $p_i = \exp(\tilde{z}_i) / \sum_{n=1}^{N} \exp(\tilde{z}_n)$, where $\tilde{z}_n$ is the $n$-th element of $z_{out}$. In the end, the spectroscopic input is classified as the material whose index has the highest probability.
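The inference path described above (affine input layer, ReLU hidden layers, softmax output, and selection of the most likely index) can be sketched in plain Python. This is a minimal illustration, not the trained model: the weights are random, the sizes `M`, `alpha`, and `N` are hypothetical, and batch normalization and dropout are omitted for brevity (dropout is inactive at inference in any case).

```python
import math
import random


def dense(W, b, x):
    """Affine map W x + b for a weight matrix W (list of rows) and bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Hypothetical untrained layer: small random weights and zero bias."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def classify(y, input_layer, hidden_layers, output_layer):
    """Return (class probabilities, 1-based index of the most likely material)."""
    z = dense(*input_layer, y)            # input layer: W_in y + b_in
    for W, b in hidden_layers:            # hidden dense layers with ReLU
        z = relu(dense(W, b, z))
    probs = softmax(dense(*output_layer, z))
    return probs, probs.index(max(probs)) + 1


# Hypothetical sizes: M spectral samples, layer width alpha, N materials.
M, alpha, N = 8, 4, 3
rng = random.Random(1)
input_layer = random_layer(alpha, M, rng)
hidden_layers = [random_layer(alpha, alpha, rng) for _ in range(4)]
output_layer = random_layer(N, alpha, rng)

y = [rng.random() for _ in range(M)]      # stand-in for a noisy measurement
probs, material = classify(y, input_layer, hidden_layers, output_layer)
```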

B. Computational Complexity
We study the computational complexity of the proposed DNN architecture in terms of the number of floating-point operations (FLOPs). First, the complexity associated with the input layer is [(2M − 1)α + α] FLOPs. Each of the four dense layers has a ReLU activation unit; therefore, their total complexity is 4[(2M − 1)α + α + α] FLOPs. Furthermore, the dropout layer requires α FLOPs, and the three batch normalization layers require 3(4α) FLOPs. Finally, the softmax layer requires (3N − 1) FLOPs. Overall, the complexity of the proposed DNN is given by the sum of these terms.
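As a quick check of the bookkeeping, the per-layer counts stated above can be summed directly. The sketch below adds only the terms listed in the text (any layer not listed, such as the final dense layer, is not counted here), and the example sizes are hypothetical. Algebraically, the sum simplifies to 10Mα + 17α + 3N − 1.

```python
def dnn_flops(M, alpha, N):
    """Sum of the per-layer FLOP counts listed in the text."""
    input_layer = (2 * M - 1) * alpha + alpha               # input layer
    dense_layers = 4 * ((2 * M - 1) * alpha + alpha + alpha)  # four dense + ReLU
    dropout = alpha                                         # dropout layer
    batch_norm = 3 * (4 * alpha)                            # three batch-norm layers
    softmax = 3 * N - 1                                     # softmax layer
    return input_layer + dense_layers + dropout + batch_norm + softmax


# Hypothetical sizes: M spectral samples, layer width alpha, N materials.
total = dnn_flops(M=100, alpha=10, N=5)
```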

IV. NUMERICAL SIMULATIONS
A. Simulation Configuration
We study the performance of the proposed DNN in classifying different materials from noisy measurements. First, we present the transmittance as a function of wave number for selected materials. Then, we evaluate the same collection of materials considered in [7] to compare the probability of detection, where we assess principal component analysis (PCA) [16], partial least squares (PLS), t-distributed stochastic neighbor embedding (t-SNE), and non-negative matrix factorization (NMF) for feature extraction. We then classify the extracted features with generalized regression neural networks (GRNN), which performed best in [7]. Note that here, GRNN is a standard feed-forward neural network. Finally, we show our model's classification performance for various numbers of classes. The numerical investigation parameters are presented in Table I, and the proposed DNN parameters are given in Table II.
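We train our DNN model using the binary cross-entropy loss function and the Adam optimizer in Keras with the TensorFlow back end. We train, validate, and test our model with the THz spectroscopic data and evaluate it based on the confusion matrix. Our main objective is to achieve higher recall (probability of detection). However, we also take precision, their harmonic mean (F1-score), area under the curve (AUC), and overall accuracy into consideration for the calibration. The precision, recall, and F1-score are defined as $\text{Precision} = \frac{TP}{TP + FP}$, $\text{Recall} = \frac{TP}{TP + FN}$, and $F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$, where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives, respectively.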
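For a multi-class problem, these metrics are usually computed per material from the confusion matrix. The sketch below assumes that rows index the true class and columns the predicted class; the function name and the example matrix are hypothetical and are not results from this work.

```python
def per_class_metrics(confusion, k):
    """Precision, recall, and F1-score of class k.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    tp = confusion[k][k]
    fp = sum(row[k] for row in confusion) - tp   # predicted k, true class differs
    fn = sum(confusion[k]) - tp                  # true k, predicted otherwise
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical 3-material confusion matrix (illustrative counts only).
confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]
precision, recall, f1 = per_class_metrics(confusion, 0)
```

Here class 0 has 8 true positives, 1 false positive, and 2 false negatives, so its recall (probability of detection) is 0.8.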

B. Results
First, we examine the transmission spectroscopy of selected materials over a THz frequency range to identify potential feature variations. Fig. 2 shows the transmittance versus wave number for alumina, aspirin, baking powder, baking soda, cascade, cellulose, and chalk. The variation of each material's features with wave number indicates the uniqueness of the THz-TDS response.
Second, we study the classification performance for the selected materials considered in [7], where we conduct the simulation with the same number of measurements for a fair comparison. Fig. 3 presents the probability of detection versus SNR. We observe a significant improvement in identifying the materials in the low SNR regime, where the DNN achieves a classification success rate of more than 90% at SNR = 0 dB. Moreover, the trained DNN takes the measurement data directly and detects the material without pre-processing, which significantly reduces time consumption.
Finally, we study the scalability of the proposed DNN classification approach and show the performance for various numbers of materials in Fig. 4. We observe a performance degradation as the number of materials increases from N = 5 to N = 30. Interestingly, the five initially selected materials have distinctive features over the spectroscopic region, which makes them easier to detect accurately. Nevertheless, the proposed DNN still achieves a success rate of more than 90% for N = 30 at SNR = 8 dB.

V. CONCLUSION
THz-TDS creates a potential pathway for sensing materials through feature extraction based on their unique crystal structure. Sophisticated feature extraction and classification techniques are necessary to identify a wide variety of materials with moderate complexity. This paper proposes a novel DNN multi-class classification architecture. Transmission spectroscopic measurements and the corresponding material indices are collected and assembled to create the dataset. The proposed DNN model is trained and validated to classify the materials in the test set. The proposed architecture achieves higher classification accuracy than state-of-the-art classification schemes at lower complexity. As future work, we expect to investigate misclassification scenarios in the low SNR regime with feature-noise perturbation.