Geometries and Dipole Moments calculated by B3LYP/6-31G(d,p) for 10071 Organic Molecular Structures

modified on 19.08.2018, 00:38

Machine Learning for the Prediction of Molecular Dipole Moments Obtained by Density Functional Theory.

J. Cheminf. (2018)
DOI: 10.1186/s13321-018-0296-5

This data set is publicly available at http://dx.doi.org/10.6084/m9.figshare.5716246


dipole_moments_10071mols_sdf.tar.gz - 10071 molecules in the MDL SDFile format including the atomic coordinates of equilibrium geometries calculated by B3LYP/6-31G(d,p).

dipole_moments_10071mols.xlsx - Dipole moments calculated by B3LYP/6-31G(d,p) for 10071 neutral organic molecules.
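As a minimal sketch of how the structure archive might be unpacked, the Python snippet below extracts every ".sdf" member of a gzip-compressed tar file into a single directory. Only the standard library is used. The function name extract_sdf_files and the output directory are illustrative choices, and the internal folder layout of the archive is not described here, so member paths are simply flattened.

```python
import tarfile
from pathlib import Path


def extract_sdf_files(archive_path, out_dir):
    """Extract every .sdf member of a .tar.gz archive; return the sorted paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories and anything that is not an SDFile.
            if not member.isfile() or not member.name.lower().endswith(".sdf"):
                continue
            # Keep only the file name, so files land directly in out_dir
            # regardless of how the archive is organized internally.
            target = out_dir / Path(member.name).name
            with archive.extractfile(member) as src:
                target.write_bytes(src.read())
            extracted.append(target)
    return sorted(extracted)
```

A call such as extract_sdf_files("dipole_moments_10071mols_sdf.tar.gz", "sdf") would then be expected to return one path per molecule.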


Molecular structures were retrieved from the ZINC database [1], the PubChem database [2] and the GDB-13 database [3] of small organic molecules containing up to 7 atoms of C, N, O, F, S, Cl and Br. The structures were standardized with ChemAxon Standardizer (JChem 15.4.6, 2015, ChemAxon Ltd., Budapest, Hungary, http://www.chemaxon.com) and OpenBabel (Open Babel Package, version 2.3.1, http://openbabel.org) to neutralize them and to add all hydrogen atoms. Duplicate molecules were discarded on the basis of canonical SMILES and InChI codes (stereoisomers were treated as duplicates). The final database consists of 10,071 molecules with molecular weights (MWs) in the range 40 – 251 g/mol, containing up to 19 atoms of the elements C, N, O, F, S, Cl, Br, and P. The total number of atoms in a molecule (including hydrogen atoms) ranges from 6 to 43.
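The reported composition limits can be turned into a quick sanity check on a list of element symbols, for example when verifying a downloaded structure. The sketch below is not part of the original workflow. It assumes that "up to 19 atoms" refers to non-hydrogen atoms, it uses rounded standard atomic weights, and the function names are illustrative. The MW bounds are the rounded values quoted above, so molecules very close to the limits may be misjudged.

```python
# Rounded standard atomic weights in g/mol for the elements named above.
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}


def summarize_composition(symbols):
    """Count atoms and estimate the molecular weight from element symbols."""
    unknown = sorted(set(symbols) - set(ATOMIC_WEIGHT))
    if unknown:
        raise ValueError(f"elements outside the data set: {unknown}")
    return {
        "heavy_atoms": sum(1 for s in symbols if s != "H"),
        "total_atoms": len(symbols),
        "mw": sum(ATOMIC_WEIGHT[s] for s in symbols),
    }


def within_reported_ranges(summary):
    """Check a summary against the ranges quoted in the description."""
    return (
        summary["heavy_atoms"] <= 19           # assumed to mean non-hydrogen atoms
        and 6 <= summary["total_atoms"] <= 43  # hydrogens included
        and 40 <= summary["mw"] <= 251         # rounded bounds, g/mol
    )
```

For ethanol (2 C, 6 H, 1 O) the summary reports 3 heavy atoms and 9 atoms in total, which falls inside all three ranges.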

Molecular geometries were first relaxed with the PM7 method using the MOPAC software [4] and then optimized with the GAMESS program [5] using the B3LYP functional and the 6-31G(d,p) basis set, followed by calculation of the dipole moment at the same level of theory.


Each molecule is stored in its own file, ending in ".sdf". These files contain the structures optimized at the B3LYP/6-31G(d,p) level.

The format is the standard MDL SDFile generated with ChemAxon Standardizer and OpenBabel.
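For readers who want the optimized coordinates without a cheminformatics toolkit, the following sketch reads the atom block of a single record using the fixed-width column layout of the MDL V2000 connection table. It assumes the files are V2000 records (this is not stated above) and the function name parse_molblock_atoms is illustrative. A toolkit such as Open Babel or RDKit is the more robust choice for real work.

```python
def parse_molblock_atoms(sdf_text):
    """Return [(symbol, x, y, z), ...] for the first record of a V2000 SDFile."""
    lines = sdf_text.splitlines()
    # Lines 1-3 are the header block; line 4 is the counts line.
    counts = lines[3]
    if "V3000" in counts:
        raise ValueError("V3000 records are not handled by this sketch")
    n_atoms = int(counts[0:3])  # first 3 characters: number of atoms
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # Coordinates occupy three 10-character fields; the element symbol
        # follows in columns 32-34 (1-based).
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms
```

The coordinates in an SDFile are given in ångström, so the returned tuples can be used directly for distance or geometry checks.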

Dipole moments are stored in the dipole_moments_10071mols.xlsx file.

Columns of the .xlsx file:

Column  Content
1       Molecule ID (as it appears in the corresponding .sdf file name)
2       Dipole moment (in Debye)
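Once the two columns have been read from the spreadsheet (for instance with a third-party reader such as openpyxl or pandas, which is not shown here), the values can be keyed by molecule ID and converted to other units. The sketch below works on plain (ID, dipole) rows. It assumes that the molecule ID equals the .sdf file name without its extension, and the function names and the example IDs in the usage note are hypothetical.

```python
from pathlib import Path

DEBYE_TO_COULOMB_METRE = 3.33564e-30  # 1 D expressed in C·m
DEBYE_PER_ATOMIC_UNIT = 2.541746      # 1 atomic unit (e·a0) expressed in D


def build_dipole_table(rows):
    """Map molecule ID -> dipole moment in Debye from (id, dipole) rows."""
    table = {}
    for mol_id, dipole in rows:
        if mol_id is None or dipole is None:
            continue  # skip empty spreadsheet rows
        try:
            value = float(dipole)
        except (TypeError, ValueError):
            continue  # skip a header row or other non-numeric cells
        table[str(mol_id).strip()] = value
    return table


def dipole_for_file(table, sdf_path):
    """Look up the dipole for an .sdf file, assuming the file stem is the ID."""
    return table[Path(sdf_path).stem]


def debye_to_atomic_units(dipole_debye):
    """Convert a dipole moment from Debye to atomic units (e·a0)."""
    return dipole_debye / DEBYE_PER_ATOMIC_UNIT


def debye_to_si(dipole_debye):
    """Convert a dipole moment from Debye to coulomb metres."""
    return dipole_debye * DEBYE_TO_COULOMB_METRE
```

With rows such as [("mol_1", 1.85), ("mol_2", 0.0)], dipole_for_file(table, "sdf/mol_1.sdf") returns the Debye value stored for mol_1. IDs that the spreadsheet stores as numbers may need extra normalization before they match the file names.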


Financial support from Fundação para a Ciência e a Tecnologia (FCT), Portugal, under project PEst-C/EQB/LA0006/2013 and grant SFRH/BPD/63192/2009 (D. Latino), is acknowledged.