Posted on 2024-03-01, 19:05. Authored by Yiming Chen, Chi Chen, Inhui Hwang, Michael J. Davis, Wanli Yang, Chengjun Sun, Gi-Hyeok Lee, Dylan McReynolds, Daniel Allan, Juan Marulanda Arias, Shyue Ping Ong, Maria K. Y. Chan
X-ray absorption spectroscopy (XAS) is a commonly employed technique
for characterizing functional materials. In particular, X-ray absorption
near edge spectra (XANES) encode local coordination and electronic
information, and machine learning (ML) approaches to extract this
information are of significant interest. To date, most ML approaches
for XANES have used the raw spectral intensities as input,
overlooking the potential benefits of incorporating spectral transformations
and dimensionality reduction techniques into ML predictions. In this
work, we systematically compare the impact of different
featurization methods on the performance of ML models for XAS analysis.
We evaluated the classification and regression capabilities of these
models on computed data sets and validated their performance on previously
unseen experimental data sets. Our analysis shows that the cumulative
distribution function (CDF) feature achieves both high prediction
accuracy and exceptional transferability. This robust performance
can be attributed to its tolerance to horizontal (energy-axis)
shifts in the spectra, which is crucial when validating models using
experimental data. While this work focuses exclusively on XANES analysis,
we anticipate that the methodology presented here will be useful
to the broader spectroscopy community.
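
The shift tolerance of the cumulative distribution function feature can be illustrated with a small sketch. The abstract does not give the exact definition, so the code below assumes one plausible form: the running trapezoidal integral of the spectrum, normalized to end at 1. The names `cdf_feature` and `toy_spectrum`, the synthetic edge-plus-peak curve, and the 1 eV shift are all hypothetical and chosen only for illustration; they are not taken from the work itself.

```python
import numpy as np


def cdf_feature(intensity, energy):
    """Running integral of a spectrum, normalized to end at 1.

    Assumed definition for illustration; the source does not specify one.
    """
    intensity = np.asarray(intensity, dtype=float)
    energy = np.asarray(energy, dtype=float)
    # Trapezoidal area of each interval between neighboring energy points.
    areas = 0.5 * (intensity[1:] + intensity[:-1]) * np.diff(energy)
    cumulative = np.concatenate(([0.0], np.cumsum(areas)))
    return cumulative / cumulative[-1]


def toy_spectrum(energy, edge):
    """Hypothetical XANES-like curve: a smooth edge step plus one peak."""
    step = 0.5 * (1.0 + np.tanh(energy - edge))
    peak = 0.8 * np.exp(-0.5 * ((energy - (edge + 3.0)) / 1.5) ** 2)
    return step + peak


# Relative energy grid in eV (arbitrary origin) and two spectra that
# differ only by a 1 eV horizontal shift of the edge position.
energy = np.linspace(0.0, 50.0, 501)
reference = toy_spectrum(energy, edge=15.0)
shifted = toy_spectrum(energy, edge=16.0)

# Largest pointwise change in the max-normalized raw intensities.
raw_diff = np.max(np.abs(reference / reference.max() - shifted / shifted.max()))

# Largest pointwise change in the CDF feature.
cdf_ref = cdf_feature(reference, energy)
cdf_shift = cdf_feature(shifted, energy)
cdf_diff = np.max(np.abs(cdf_ref - cdf_shift))

print(f"max change in raw intensities: {raw_diff:.3f}")
print(f"max change in CDF feature:     {cdf_diff:.3f}")
```

In this toy case the shift produces a large pointwise change near the edge in the raw intensities, while the integrated feature changes only slightly, because integration spreads a local displacement over the whole energy window. The size of the effect depends on the window, the normalization, and the spectral shape, so the numbers should be read as qualitative only.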