“Mindless” DFT Benchmarking
journal contribution posted on 14.04.2009, 00:00 by Martin Korth, Stefan Grimme
A diversity-oriented approach for the generation of thermochemical benchmark sets is presented. Test sets consisting of randomly generated “artificial molecules” (AMs) are proposed that rely on systematic constraints rather than uncontrolled chemical biases. In this way, the narrow structural space of chemical intuition is opened up and electronically difficult cases can be produced in an unforeseeable manner. For the calculation of chemically meaningful relative energies, AMs are systematically decomposed into small molecules (hydrides and diatomics). Two different example test sets containing eight-atom, single-reference, main group AMs with chemically very diverse and unusual structures are generated. Highly accurate all-electron, estimated CCSD(T)/complete basis set reference energies are also provided. They are used to benchmark the density functionals S-VWN, BP86, B-LYP, B97-D, PBE, TPSS, PBEh, BH-LYP, B3-PW91, B3-LYP, B2-PLYP, B2GP-PLYP, BMK, MPW1B95, M05, M05-2X, PW6B95, M06, M06-L, and M06-2X. In selected cases, an empirical dispersion correction (DFT-D) has been applied. Because of the composition of the sets, good performance is expected to indicate “robustness” in many different chemical applications. The results of a statistical analysis of the errors for the entire set with 165 entries (average reaction energy of 117 kcal/mol, dubbed the MB08-165 set) perfectly fit the “Jacob’s ladder” metaphor for the ordering of density functionals according to their theoretical complexity. The mean absolute deviation (MAD) decreases very strongly from LDA (20 kcal/mol) to GGAs (MAD of about 10 kcal/mol), but the further improvement from GGAs to hybrid-GGAs (MAD of about 6−8 kcal/mol) is less pronounced. The best performance (MAD of 4.1−4.2 kcal/mol) is found for the (fifth-rung) double-hybrid functionals B2-PLYP-D and B2GP-PLYP-D, followed by the M06-2X meta-hybrid (MAD of 4.8 kcal/mol).
The significance of the proposed approach for thermodynamic benchmarking is discussed and related to the observed performance ranking, including that of wave-function-based methods.
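
The ranking reported above rests on the mean absolute deviation (MAD) between calculated and reference reaction energies. As a minimal sketch of that statistic, assuming its usual definition as the average of |E_calc − E_ref| over all reactions, the following Python snippet may help. The function name `mean_absolute_deviation` and the four energy values are hypothetical and chosen only for illustration; they are not taken from the MB08-165 set.

```python
from statistics import mean


def mean_absolute_deviation(calculated, reference):
    """Return the mean of |calculated - reference| over paired reaction energies.

    Both sequences are assumed to hold reaction energies in kcal/mol,
    listed in the same reaction order.
    """
    if len(calculated) != len(reference):
        raise ValueError("calculated and reference must have the same length")
    if not calculated:
        raise ValueError("at least one reaction energy is required")
    return mean(abs(c - r) for c, r in zip(calculated, reference))


# Hypothetical reaction energies (kcal/mol), NOT values from the MB08-165 set.
reference = [100.0, -50.0, 20.0, 75.0]
calculated = [104.0, -47.0, 18.0, 76.0]

mad = mean_absolute_deviation(calculated, reference)
print(f"MAD = {mad:.1f} kcal/mol")  # prints "MAD = 2.5 kcal/mol"
```

Because the absolute value is taken before averaging, errors of opposite sign do not cancel, which is why the MAD is a stricter summary than a mean signed error when reaction energies span a wide range.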