Rapid Detection of Physicochemical Indicators of Tobacco Flavorings Using Fourier-Transform Near Infrared Spectroscopy with Chemometrics and Machine Learning
posted on 2025-05-06, 12:08 authored by Qinlin Xiao, Jian Zheng, Jing Wen, Fada Deng, Ruifang Gu, Li Li, Yong He, Juan Yang
Timely and rapid monitoring of the quality of tobacco flavorings
is crucial for the accurate quality management of cigarette products.
In this study, FT-NIR spectroscopy combined with chemometrics and
machine learning was used to detect physicochemical indicators of
tobacco flavorings. FT-NIR spectra of 1,608 flavoring samples, encompassing
145 categories and 90 production batches from actual industrial scenarios,
were collected. The physicochemical indicators, including the acid
value, relative density, and refractive index, were accurately measured.
The effects of different spectral preprocessing methods (standard normal
variate transformation (SNV), multiplicative scatter correction (MSC),
and normalization) were compared. Least angle regression (LAR), the
successive projections algorithm (SPA), and random frog (RF) were used
to select characteristic wavelengths. Partial least-squares regression
(PLSR), decision tree (DT), least-squares support vector machine (LSSVM),
and convolutional neural network regression (CNNR) were applied to
establish detection models. For acid value, the normalization-SPA-LSSVM
model achieved the best performance, reaching an R²p of
0.929, an RMSEP of 1.155, and an RPD of 3.741. For relative density,
the MSC-LAR-LSSVM model performed best, with an R²p of
0.951, an RMSEP of 0.018, and an RPD of 4.481. For the refractive index,
the SNV-SPA-LSSVM model obtained satisfactory results, with an R²p of
0.955, an RMSEP of 0.004, and an RPD of 4.664. The results
demonstrate that FT-NIR spectroscopy is an effective approach for
detecting physicochemical indicators of large-scale industrial tobacco
flavorings and holds promise for accurate quality assessment of tobacco
flavoring products. In addition, the CNNR model did not
consistently outperform conventional models, especially
when the number of features used for building models is
relatively limited.
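
For readers unfamiliar with the preprocessing steps and evaluation metrics named above, the following is a minimal sketch in plain Python of textbook SNV, MSC, and the three reported metrics (R²p, RMSEP, RPD). It is not the authors' implementation: the function names `snv`, `msc`, and `prediction_metrics` are hypothetical, the demo spectra are made up, and RPD is assumed here to be the sample standard deviation of the reference values divided by RMSEP (conventions differ on sample versus population standard deviation).

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the mean spectrum (s = intercept + slope * ref),
    then corrected as (s - intercept) / slope.
    """
    n = len(spectra)
    ref = [sum(col) / n for col in zip(*spectra)]  # mean spectrum
    ref_mean = statistics.fmean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = statistics.fmean(s)
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def prediction_metrics(y_true, y_pred):
    """Return (R2p, RMSEP, RPD) for a prediction set.

    Assumption: RPD = sample SD of reference values / RMSEP.
    """
    n = len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_t = statistics.fmean(y_true)
    sst = sum((t - mean_t) ** 2 for t in y_true)
    rmsep = math.sqrt(sse / n)
    r2p = 1.0 - sse / sst
    rpd = statistics.stdev(y_true) / rmsep
    return r2p, rmsep, rpd


if __name__ == "__main__":
    # Made-up toy spectra: the same shape with different offsets and gains.
    base = [1.0, 2.0, 4.0, 3.0]
    spectra = [base, [2 * x + 1 for x in base], [0.5 * x - 0.2 for x in base]]
    print(snv(base))
    print(msc(spectra))
    print(prediction_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

Under this definition, RPD is roughly 1 / sqrt(1 - R²p) when the prediction set is large, which is consistent with the pairs reported above (for example, an R²p of 0.929 alongside an RPD of 3.741).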