Computational quantum chemical dataset of interacting molecular pairs (IMP dataset version 1.0)

dataset
posted on 2025-05-22, 23:34 authored by Seunghun Jang

The Interacting Molecular Pairs (IMP) dataset includes relaxed atomic geometries (XYZ format) for individual molecules and molecular pairs, as well as quantum chemical properties extracted from the structural optimization results. The dataset for individual molecules contains the optimization results for both members of each pair: the chromophore and solvent (or solute and solvent, or the two molecules of a drug-drug pair). For user convenience, the ID information in our dataset is synchronized with the Tag information in Ref. 1, so the computed data can be matched directly to existing experimental data.

For individual molecules, the quantum chemical properties (in CSV format) extracted from the structural optimization calculations are the total energy, the HOMO-LUMO gap, the HOMO energy, and the LUMO energy. The dataset also includes classification numbers, which distinguish solvents from chromophores (or solutes from solvents, or the members of drug-drug pairs), and gradient norm values, which can be used to assess the reliability of the structure optimization results. Here, Eh represents the Hartree energy unit, and a0 denotes the Bohr radius.

The quantum chemical property data format for molecular pairs is nearly identical to that of individual molecules. For molecular pairs, however, the extracted properties also include the interaction energy arising from molecule-molecule interactions. The interaction energy is defined as E_molecular-pair − (E_1st-molecule + E_2nd-molecule), where E_molecular-pair is the total energy of the physically bound molecular pair, and E_1st-molecule and E_2nd-molecule denote the total energies of the isolated first and second molecules, respectively. Because each molecule-molecule combination is provided in 10 different configurational geometries, the molecular-pair data contain more structural optimization results than the individual-molecule data.
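As an illustration of the interaction energy definition, the following minimal Python sketch evaluates the pair energy minus the sum of the two isolated-molecule energies and converts the result from Eh to kcal/mol. The function name and the numerical energies are hypothetical and chosen only for illustration; they are not taken from the dataset, and the dataset's actual CSV column names are not assumed here.

```python
# Conversion factor from Hartree (Eh) to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def interaction_energy(e_pair, e_first, e_second):
    """Return E_pair - (E_first + E_second), in the same unit as the inputs."""
    return e_pair - (e_first + e_second)


# Hypothetical total energies in Eh, for illustration only.
e_int = interaction_energy(-10.012, -6.0, -4.0)
print(f"{e_int:.6f} Eh = {e_int * HARTREE_TO_KCAL_PER_MOL:.3f} kcal/mol")
```

With this sign convention, a negative value means the bound pair is lower in energy than the two isolated molecules.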
For the relaxed geometries of molecular pairs, structural information for the first and second molecules is also provided separately, which is expected to be useful when building models that predict quantities such as the interaction energy from two different molecular structures.
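A short Python sketch of how such geometries might be read is given below. It assumes the conventional XYZ layout (an atom count on the first line, a comment on the second, then one "symbol x y z" line per atom); the exact layout of the files in this dataset should be checked before use. The function name `parse_xyz` and the sample geometry are hypothetical and not taken from the dataset.

```python
def parse_xyz(text):
    """Parse one XYZ-format structure into (comment, [(symbol, x, y, z), ...])."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])          # first line: number of atoms
    comment = lines[1] if len(lines) > 1 else ""  # second line: free-form comment
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return comment, atoms


# Illustrative water geometry, not taken from the dataset.
sample = """3
water (illustrative)
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""
comment, atoms = parse_xyz(sample)
print(comment, len(atoms), atoms[0])
```

Parsing the separately provided first-molecule and second-molecule files in the same way yields two atom lists that can serve as paired model inputs.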

1. Joung, J. F., Han, M., Jeong, M. & Park, S. Experimental database of optical properties of organic compounds. Sci. Data 7, 295, https://doi.org/10.1038/s41597-020-00634-8 (2020).

Funding

Ministry of Trade, Industry, and Energy/TS251-10R

Korea Research Institute of Chemical Technology/KK2551-10
