ViBench: vibrational spectrum-to-structure benchmark
Introduction
This is an official dataset used to develop Vib2Mol. We have established a vibrational spectrum-to-structure benchmark (ViBench, VB), which consists of eight parts: VB-qm9, VB-zinc15, VB-mols, VB-geometry, VB-PAHs, VB-RXN, VB-peptide, and VB-peptide-mod. Details are listed in our paper.
Density functional theory (DFT) was used to optimize the geometries of these molecules and to calculate the corresponding infrared and Raman spectra. All quantum chemical calculations were carried out with the Gaussian 16 program. Geometries were optimized with the B3LYP-D3BJ functional and the 6-311+G** basis set. Frequency calculations were performed at the same level of theory on the optimized geometries.
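For readers who want to set up a comparable calculation, the sketch below builds a Gaussian 16 input file for a combined optimization and IR/Raman frequency job at the stated level of theory. The route line is an assumption: the dataset description only names the functional, dispersion correction, and basis set, so the exact keywords the authors used (convergence criteria, grids, and so on) may differ. The helper name `gaussian_input` and the water geometry are hypothetical and purely illustrative.

```python
def gaussian_input(title, atom_lines, charge=0, multiplicity=1):
    """Return the text of a Gaussian input file (illustrative sketch).

    Assumed route section: B3LYP with D3(BJ) dispersion, 6-311+G** basis,
    geometry optimization followed by a frequency job that also requests
    Raman intensities. These keywords are not taken from the dataset itself.
    """
    route = "#p opt freq=raman B3LYP/6-311+G** EmpiricalDispersion=GD3BJ"
    geometry = "\n".join(atom_lines)
    # Gaussian expects blank lines after the route, the title,
    # and the molecule specification.
    return f"{route}\n\n{title}\n\n{charge} {multiplicity}\n{geometry}\n\n"


# Hypothetical example molecule (water, Cartesian coordinates in angstrom).
water = [
    "O   0.000000   0.000000   0.117300",
    "H   0.000000   0.757200  -0.469200",
    "H   0.000000  -0.757200  -0.469200",
]
text = gaussian_input("water opt+freq", water)
```

Writing `text` to a `.gjf` or `.com` file gives an input that Gaussian can run; the charge and multiplicity defaults assume a neutral closed-shell molecule.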
Furthermore, to test the model's generalization to experimental spectra, we collected experimentally measured infrared spectra from the public NIST dataset. To facilitate calculations, all spectra were unified to a dimension of 1024, and all molecular structures were represented as SMILES strings.
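As an illustration of what unifying spectra to 1024 points can look like, the sketch below resamples a spectrum onto a uniform wavenumber grid by linear interpolation. This is an assumption, not the documented ViBench procedure: the interpolation scheme, the wavenumber range, and the function name `resample_spectrum` are all hypothetical.

```python
import numpy as np


def resample_spectrum(wavenumbers, intensities, n_points=1024, lo=None, hi=None):
    """Linearly interpolate a spectrum onto a uniform grid of n_points.

    lo/hi default to the range of the input; the range actually used
    for the dataset is not stated in this document.
    """
    x = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(intensities, dtype=float)
    order = np.argsort(x)  # np.interp requires increasing x values
    x, y = x[order], y[order]
    lo = x[0] if lo is None else lo
    hi = x[-1] if hi is None else hi
    grid = np.linspace(lo, hi, n_points)
    return grid, np.interp(grid, x, y)


# Synthetic example: a single Gaussian band sampled at 1 cm^-1 spacing.
raw_x = np.arange(400.0, 4001.0, 1.0)
raw_y = np.exp(-((raw_x - 1700.0) / 20.0) ** 2)
grid, spectrum = resample_spectrum(raw_x, raw_y)
```

Linear interpolation is the simplest choice; when the raw spectrum is sampled much more densely than the target grid, binning or smoothing before resampling may preserve narrow bands better.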
Funding
This work was supported by the National Natural Science Foundation of China (Grant Nos. 22227802, 22021001, 22474117, 22272139), the Fundamental Research Funds for the Central Universities (20720220009), and the Shanghai Innovation Institute.