Geometries and Dipole Moments calculated by B3LYP/6-31G(d,p) for 10071 Organic Molecular Structures
Related publication:
* Florbela Pereira and Joao Aires-de-Sousa: Machine Learning for the Prediction of Molecular Dipole Moments Obtained by Density Functional Theory. J. Cheminf. (2018)
  https://doi.org/10.1186/s13321-018-0296-5
This data set is publicly available at http://dx.doi.org/10.6084/m9.figshare.5716246
Files
-----
dipole_moments_10071mols_sdf.tar.gz - 10071 molecules in the MDL SDFile format, including the atomic coordinates of equilibrium geometries calculated by B3LYP/6-31G(d,p).
dipole_moments_10071mols.xlsx - Dipole moments calculated by B3LYP/6-31G(d,p) for 10071 neutral organic molecules.
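As a quick check after downloading, the archive contents can be listed without unpacking everything. The following is a minimal sketch using only the Python standard library; the function name list_sdf_members is illustrative, and the internal directory layout of the archive is not assumed.

```python
import tarfile


def list_sdf_members(archive_path):
    """Return the sorted names of all .sdf entries in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".sdf")
        )


# Example call (assumes the archive sits in the working directory):
# names = list_sdf_members("dipole_moments_10071mols_sdf.tar.gz")
# print(len(names), "SDF files found")
```

If the download is complete, the number of names returned should match the 10071 molecules stated above.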
Molecules
---------
Molecular structures were retrieved from the ZINC database [1], the PubChem database [2] and the GDB-13 database [3] of small organic molecules containing up to 7 atoms of C, N, O, F, S, Cl and Br. The structures were standardized with ChemAxon Standardizer (JChem 15.4.6, 2015, ChemAxon Ltd., Budapest, Hungary, http://www.chemaxon.com) and OpenBabel (Open Babel Package, version 2.3.1, http://openbabel.org) for neutralization and inclusion of all hydrogen atoms. Duplicate molecules were discarded based on canonical SMILES and InChI codes (stereoisomers were treated as duplicate structures). The final database consists of 10,071 molecules with molecular weights (MWs) in the range 40 – 251 g/mol, containing up to 19 atoms of the elements C, N, O, F, S, Cl, Br, and P. The total number of atoms in a molecule (including hydrogen atoms) ranges from 6 to 43.
Molecular geometries were first relaxed by the PM7 method using the MOPAC software [4] and then optimized with the GAMESS program [5] with the B3LYP functional and the 6-31G(d,p) basis set, followed by dipole moment calculation at the same level of theory.
Format
------
Each molecule is stored in its own file, ending in ".sdf". These files contain the structures optimized at the B3LYP/6-31G(d,p) level.
The format is the standard MDL SDFile generated with ChemAxon Standardizer and OpenBabel.
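To illustrate how the optimized coordinates can be read, here is a minimal sketch that parses the atom block of one record. It assumes the files use the fixed-width V2000 connection-table layout (three header lines, a counts line, then one line per atom); this is the usual SDF layout but has not been verified against every file in the archive. The function names are illustrative, and a cheminformatics toolkit would be the more robust choice for real work.

```python
def parse_sdf_atoms(sdf_text):
    """Parse the atom block of the first record in an SDF string (V2000 layout assumed).

    Returns a list of (element, x, y, z) tuples with coordinates as floats.
    """
    lines = sdf_text.splitlines()
    counts_line = lines[3]            # lines 0-2 form the header block
    n_atoms = int(counts_line[0:3])   # first 3-character field is the atom count
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z occupy three 10-character fields; the element symbol follows
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        element = line[31:34].strip()
        atoms.append((element, x, y, z))
    return atoms


def count_heavy_atoms(atoms):
    """Count the non-hydrogen atoms in a list returned by parse_sdf_atoms."""
    return sum(1 for element, *_ in atoms if element != "H")
```

Calling parse_sdf_atoms on the text of one ".sdf" file gives the element symbols and Cartesian coordinates of the equilibrium geometry; count_heavy_atoms can then be compared with the atom-count limits given in the Molecules section.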
Dipole moments are stored in the dipole_moments_10071mols.xlsx file.
Column Content of .xlsx files
-----------------------------
1 Molecule ID (as it appears in the corresponding .sdf file name)
2 Dipole moment (in Debye).
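The two-column layout above maps naturally onto a dictionary keyed by molecule ID. The sketch below is one possible way to load it; it assumes the third-party openpyxl package for reading the spreadsheet and assumes the data are on the first worksheet. Whether the file has a header row is not stated here, so rows whose second cell is not numeric are simply skipped. Both function names are illustrative.

```python
def dipoles_from_rows(rows):
    """Build {molecule_id: dipole_in_debye} from (id, dipole, ...) rows.

    Rows with a missing ID or a non-numeric dipole cell (for example a
    header row, if one is present) are skipped.
    """
    dipoles = {}
    for row in rows:
        if len(row) < 2 or row[0] is None:
            continue
        mol_id, value = row[0], row[1]
        # Spreadsheet readers may return whole-number IDs as floats (12.0)
        if isinstance(mol_id, float) and mol_id.is_integer():
            mol_id = int(mol_id)
        try:
            dipoles[str(mol_id)] = float(value)
        except (TypeError, ValueError):
            continue  # header text or empty cell
    return dipoles


def load_dipoles_xlsx(path):
    """Read the first worksheet with openpyxl (an assumption of this sketch)."""
    from openpyxl import load_workbook  # third-party; imported lazily

    workbook = load_workbook(path, read_only=True, data_only=True)
    try:
        sheet = workbook.worksheets[0]
        return dipoles_from_rows(sheet.iter_rows(values_only=True))
    finally:
        workbook.close()
```

The resulting keys are strings, so they can be compared directly with the ".sdf" file names after stripping the extension.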
References
----------
[1] Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG: ZINC: a free tool to discover chemistry for biology. J Chem Inf Model 2012, 52:1757-1768.
[2] Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH: PubChem Substance and Compound databases. Nucleic Acids Res 2016, 44(D1):D1202-13.
[3] Blum LC, Reymond J-L: 970 Million druglike small molecules for virtual screening in the chemical universe database GDB-13. J Am Chem Soc 2009, 131:8732-8733.
[4] MOPAC2012, James J. P. Stewart, Stewart Computational Chemistry, Colorado Springs, CO, USA, http://OpenMOPAC.net (2012).
[5] Schmidt MW, Baldridge KK, Boatz JA, Elbert ST, Gordon MS, Jensen JH, Koseki S, Matsunaga N, Nguyen KA, Su S, Windus TL, Dupuis M, Montgomery JA: General atomic and molecular electronic structure system. J Comput Chem 1993, 14:1347-1363. GAMESS Version 1 May 2013 (R1).
Funding
-------
Financial support from Fundação para a Ciência e a Tecnologia (FCT) Portugal, under Projects PEst-C/EQB/LA0006/2013 and grant SFRH/BPD/63192/2009 (D. Latino), is acknowledged.
Categories
----------
- Inorganic materials (incl. nanomaterials)
- Macromolecular materials
- Structure and dynamics of materials
- Cheminformatics and quantitative structure-activity relationships
- Computational chemistry
- Organic chemistry not elsewhere classified
- Physical organic chemistry
- Physical properties of materials
- Theoretical quantum chemistry