DFT benchmarking for adsorption energy in wastewater treatment

Despite its potential importance, the computational chemistry of adsorption processes for wastewater treatment has received negligible attention. Exploring the literature shows several limitations in applying quantum chemistry to study adsorption processes in wastewater treatment. The choice of suitable functionals of density functional theory (DFT) is one of the critical limits of the current application of quantum chemistry in wastewater treatment. Therefore, in this work, we performed a benchmark study of sixteen DFT functionals (including dispersions) to select the most suitable one. The def2-TZVP basis set has been used with the sixteen DFT functionals. The sixteen DFT functionals are benchmarked to the CCSD(T)/CBS level of theory. We used four different pollutants (p-aminobenzoic acid, aniline, p-chloro phenol, and phenol) adsorbed on coronene to perform this benchmarking. In addition to the coronene and the pollutant, four explicit water molecules are used to consider the environmental effects. The results show that the functional MN15 and PW6B95-D3 have the lowest mean absolute deviation relative to the CCSD(T)/CBS adsorption energies. Overall, the functionals MN15, PW6B95-D3, ωB97X-V, M05-2X-D3, M05-D3, and M06-2X-D3 are recommended for studying the adsorption processes. GRAPHICAL ABSTRACT


Introduction
Computational chemistry of adsorption processes for wastewater treatment can significantly assist experimentalists in identifying the most suitable materials with high adsorption capacity.To perform this task, significant attention should be accorded to the computational modeling of the adsorption process.Literature mining shows that ground studies have been reported on the adsorption processes using computational Chemistry.These studies are mainly performed for the adsorption of atoms or small molecules that do not cover the spectrum of possible pollutants.Therefore, an appropriate methodology for studying adsorption processes for wastewater treatment needs to be developed.
Adsorption of cathinone drugs onto a covalent organic framework of boron nitride (B 6 N 6 ) has been investigated using density functional theory (DFT) [1].The investigation has been performed using the PBE functional, and the implicit solvation using the COSMO model has been used.The authors calculated the adsorption energy, which has been reported to vary from 0.268 to 0.585 eV [1].Another structure of boron nitride (a B 2 N 2 monolayer) has been assessed for the adsorption of propylene oxide using DFT [2].Calculations have been performed at the B3LYP/6-31G(d) with the inclusion of empirical dispersions.The adsorption energy of propylene oxide onto the B 2 N 2 monolayer is evaluated to be 1.23 eV [2].Recently BN, BP, AlN, and AlP edge-doped graphenes have been assessed for the adsorption (storage) of H 2 using DFT, the M06-2X/6-311G++(d,p) level of theory [3].It has been found that only the AlP edge-doped graphene has non-negligible adsorption energy, highlighting the potential of the material for H 2 storage [3].Jaiswal and Sahu [4] have assessed the performance of Si 4 Li n (n = 1 − 3) for hydrogen adsorption at the DFT level.Several other adsorption studies involving different materials and different adsorbates have been reported in the literature in recent years [5][6][7][8][9][10][11].
There are many limitations in the current computational modeling of the adsorption process in the literature.The main limitations are modeling the environment (constituted of water molecules) and modeling temperature effects.Recently, we recognized these limitations and proposed a methodological approach to address the effects of temperature and the environmental effects [12,13].In the proposed model, we used the example of phenol as a pollutant and coronene as an adsorbent.The solvent effects are considered by using explicit and implicit solvent water, adopting a hybrid solvation model.The number of explicit water molecules varies increasingly from one to twelve explicit water molecules [13].The particularity of this model lies in the accurate and affordable description of the solvation process.A schematic representation of the model used to compute the adsorption free energy is provided in Figure 1.The temperature effects have been considered by calculating the adsorption free energy as a function of temperature instead of the adsorption electronic energy, which is temperature-independent [13].It is worth mentioning that previous investigations have mainly reported the adsorption electronic energy without considering the temperature effects [14][15][16][17].We have shown in our previous work that the adsorption electronic energy tends to overestimate the adsorption power of an adsorbent toward a given pollutant [12,13].
Despite addressing some limitations, we have noted that several limitations remain unresolved toward accurate computational modeling of the adsorption processes [13].One of the limitations pointed out in our previous work is that the calculations have been performed using a randomly chosen DFT functional.The functional choice can considerably affect the evaluation of the adsorption free energy.Therefore, in the current work, we undertook to perform a benchmark study of DFT functionals to choose the most suitable DFT functional for calculating the adsorption energy.Considering the dispersive nature of the systems, we have chosen sixteen DFT functionals that include Grimme's empirical dispersion [18,19].All the functionals recommended by Mardirossian and Head-Gordon [20], and Grimme and coworkers [21], and implemented in the Gaussian 16 suite of codes [22] have been considered for the assessment.In addition, the best functional (ωB97M-V [23]) obtained by Mardirossian and Head-Gordon [20] after benchmarking of 200 functionals over nearly 5000 data points has been considered in this work.As benchmark, we used four accurate ab-initio levels of theory, including DLPNO-MP2/def2-TZVP, MP2/CBS, DLPNO-CCSD(T)/def2-TZVP, and CCSD(T)/CBS.The CCSD(T)/CBS adsorption energy has been estimated using a scheme proposed by Chen and coworkers [24].

Methodology
The methodology section comprises two subsections.In Subsection 2.1, we present the adsorption procedure and molecular systems used in this work.In Subsection 2.2, we provided the computational details and the selected DFT functionals used for the benchmarking.

Adsorption energy and studied systems
For each of the four systems reported in Figure 2, we have calculated the adsorption energy using DFT functionals and ab-initio methods.As noted in Figure 2, we used four water molecules as an example for the explicit solvation to consider the environmental effects.The adsorption energy is calculated using the Equation (1).
X represents the pollutant molecule, which can be p-aminobenzoic acid, aniline, p-chloro phenol, and phenol, for the four examples chosen in this work.The adsorption procedure followed in this work is the one reported in our previous work [13] and reproduced in Figure 1.However, we have not considered the implicit solvation for simplicity, and we used only four water molecules.For each of the four systems, we need the structures reported in Figure 2 and the structure of coronene.The structure of coronene is unique; only its energy varies depending on the computational level of the theory.For X − (H 2 O) 4 @Coronene and X − (H 2 O) 4 , there are several possible configurations.
We have explored all the possible configurations and identified the most stable structures reported in Figure 2. Thus, the structures used in this work are the most stable configurations obtained after exploring their potential energy surfaces.The exploration started with classical molecular dynamics, followed by full optimizations using a DFT functional.
Examination of the structures shows that the water molecules interact mainly with the pollutant.The water   2).Therefore, the most suitable functional is the functional that can accurately describe the bonding mentioned above interactions.Hence, the choice of DFT functionals, including empirical dispersions.
The sixteen DFT functionals have been benchmarked to ab-initio methods, including DLPNO-MP2/def2-TZVP, MP2/CBS, DLPNO-CCSD(T)/def2-TZVP, and CCSD(T)/CBS.The CCSD(T)/CBS adsorption energies have been estimated using the formula proposed by Chen and coworkers [24]: Chen et al. [24] have reported that the proposed scheme leads to a maximum deviation of 0.28 kcal/mol and a mean absolute deviation of 0.09 kcal/mol comparing to the original/canonical CCSD(T)/CBS level of theory.In this study, the medium basis set used is the def2-TZVP basis set.Therefore, to estimate the adsorption energy at the CCSD(T)/CBS level of theory, we need to compute the adsorption energy at MP2/CBS, DLPNO-MP2/def2-TZVP, and DLPNO-CCSD(T)/def2-TZVP levels of theory.Calculations at these levels of theory have been performed using the ORCA computational Chemistry code [40].For the accuracy of the calculations, we used tightpno and tightscf options.The AutoAux generation procedure has been used to generate the auxiliary basis sets automatically [41].The CBS extrapolation has been performed using the two points strategy involving electronic energies calculated using the def2-TZVPP and the def2-QZVPP basis sets.Further details on the CBS extrapolation can be found in our previous works [42,43].
The DLPNO-CCSD(T) version used in this work corresponds to the previous implementation of the method, often referred to by DLPNO-CCSD(T0).An improved implementation of the method, DLPNO-CCSD(T1), has been implemented in the ORCA computational chemistry code [44].As pointed out by the authors, the DLPNO-CCSD(T0) method can fail to reproduce the canonical CCSD(T) in a few cases.The possible limitation of DLPNO-CCSD(T0) in reproducing the canonical CCSD(T) method has also been reported by recent authors [24,45].Therefore, the DLPNO-CCSD(T0) used in this work may fail to reproduce the CCSD(T) method.
To be safe, one should have performed a test on one or two cases to assess the difference between DLPNO-CCSD(T0) and DLPNO-CCSD(T1).However, the computational resources have not allowed us to perform calculations using the DLPNO-CCSD(T1) method.Consequently, the reader should consider this possible limit of the results provided in this work.

Results and discussions
In this section, we start by presenting the adsorption energies calculated using the four ab-initio methods, followed by the adsorption energies calculated using the sixteen DFT functionals.
The results show that the adsorption energies calculated at the DLPNO-MP2/def2-TZVP reproduce with an acceptable accuracy those calculated at the MP2/CBS.The mean absolute deviation between the two adsorption energy sets is evaluated as 1.3 kcal/mol.This result highlights the accuracy of DLPNO-MP2 approximation, which reproduces its canonical form MP2. As our CCSD(T)/CBS is estimated from Equation (2), the MAD between the DLPNO-CCSD(T)/def2-TZVP and CCSD(T)/CBS is also estimated to be 1.3 kcal/mol (see Table 1).It comes out from the calculated adsorption energies that the p-aminobenzoic acid has the highest adsorption energy toward coronene.This high adsorption energy is due to the number and strength of the non-covalent bondings that the p-aminobenzoic acid establishes with coronene.Therefore, based on these adsorption energies, the adsorption power of coronene toward the studied pollutants is increasing in the following order: aniline < phenol < p-chloro-phenol < p-aminobenzoic acid (see Table 1).

Adsorption energy and DFT benchmarking
After calculating the adsorption energies using the four ab-initio levels of theory, we have calculated the adsorption energies of the four systems using sixteen different DFT functionals, including Grimme's dispersion corrections (except MN15, ωB97M-V, and ωB97X-V).The calculated adsorption energies are reported in Table 2.It has been found that the functionals B3PW91-D3, B97-D3, and BLYP-D3 have the worse performance.Therefore, the corresponding adsorption energies have not been reported in Table 2 to avoid cumbersomeness.In addition, we have also evaluated four statistical descriptors (MAD, MAX, STD, and MD) of the sixteen DFT functionals in reference to the CCSD(T)/CBS level of theory.For straightforward interpretation of the statistical descriptors, we have also plotted the diagram of the descriptors in Figure 3. Examination of the calculated adsorption energies shows that all the sixteen functionals have correctly predicted (in reference to CCSD(T)/CBS) the order of the adsorption power of coronene toward the studied pollutants.This indicates the importance of the empirical dispersion, which provides acceptable accuracy when combined with a DFT functional from GGA (generalized gradient approximation) and beyond.
It comes out from Figure 3 that the DFT functional MN15 (which does not include Grimme's empirical dispersion) is found to be the best functional in calculating the adsorption energy of the studied systems.The MAD, MAX, STD, and MD of MN15 are evaluated to be 0.2, 0.3, 0.2 and 0.0 kcal/mol, respectively (see Table 2).The MN15 functional has already been trained by its authors to be suitable for describing non-covalent interactions [32,33].As a reminder, the MN15 functional has the smallest MUE (mean unsigned error equivalent to the MAD used in this work) on 87 non-covalent data (evaluated to be 0.25 kcal/mol).Therefore, the present work is another confirmation of the accuracy of the MN15 functional.It should also be reminded that the MN15 functional has been tested to be the best DFT functional suitable for the study of neutral acetonitrile clusters [46], stabilized by non-covalent interactions.In addition, we noted that the functional PW6B95-D3 has almost the same MAD as MN15 (within a DFT accuracy).Therefore, the MN15 and the PW6B95-D3 functional can be considered to be of the same accuracy in computing the adsorption energies of the studied systems.The results show that the functional ωB97X-V has a MAD of 0.7 kcal/mol, and it is found to be the third most performant functional among the assessed DFT functionals.Generally, we noted that the mean absolute deviation of the studied functionals varies from 0.2 to 3.4 kcal/mol.As pointed out previously, the maximum value of MAD (3.4 kcal/mol) is indicative of the accuracy of the selected functionals.This accuracy is probably ascribed to empirical dispersion corrections, which are important in non-covalent systems.In addition to MN15, PW6B95-D3, and ωB97X-V, the functionals M05-2X-D3, M05-D3, and M06-2X-D3 have their MADs and RMSEs within 1.0 kcal/mol.The investigation shows that the B3PW91-D3, the B97-D3, the BLYP-D3, and the M06-D3 are the four least performant functionals among the functionals tested in this work (see Figure 3).Overall, the functionals MN15, PW6B95-D3, ωB97X-V, M05-2X-D3, M05-D3, and M06-2X-D3 are recommended for studying the adsorption energy in wastewater treatment.
It is worth noting that the functional ωB97X-V, which has been found to be the third most performant in this work, has been reported by Grimme and coworkers [21] to be the best hybrid functional followed by M052X-D3(0), and ωB97X-D3.In addition, the most successful functional ωB97M-V, reported by Mardirossian and Head-Gordon [20], has a MAD of 1.3 kcal/mol in this work.Previous DFT benchmarking by Mardirossian and Head-Gordon [20] and Grimme and coworkers [21] have suggested some functionals that have not been included in this work.Consequently, including these functionals may affect some conclusions of this work.Nevertheless, considering the smallest MAD obtained with MN15 (0.2 kcal/mol), it can be safely recommended even without testing the other functionals.
Previously, benchmarking some DFT functionals related to the interaction between aromatic molecules was performed.Prampolini, Livotto, and Cacelli [47] performed a benchmark study of four DFT functionals M06-2X, CAM-B3LYP-D3 [48], BLYP-D3, and B3LYP-D3.The functionals were used to study the interaction potential energy surfaces of benzene dimers compared to the CCSD(T) method.It has been found that the CAM-B3LYP-D3 functional lead to the most accurate results [47].The functionals M06-2X, BLYP-D3, and B3LYP-D3, over which the CAM-B3LYP-D3 is the most accurate, perform poorly in this work.Thus, the identification of CAM-B3LYP-D3 as the best functional would be attributed to the limited set of functionals used by the authors [47].Another study by Smith and Patkowski [49] has assessed the performance of B3LYP-D2, B3LYP-D3, B97-D2, B97-D3, PBE-D2, PBE-D3, M05-2X, M06-2X, and ωB97X-D in the evaluation of the interaction energy between methane and some aromatic molecules.The authors reported that the B3LYP-D3/aug-cc-pVDZ level of theory, including the counterpoise correction, performs best among the assessed functionals [49].A similar study by the authors has been reported to assess the performance of several DFT functionals in calculating the interaction energy between carbon dioxide and polyheterocyclic aromatic compounds [50].The assessed functionals include M05-2X, M06-2X, B2PLYP, B3LYP, BLYP, PBE, PBE0, BP86, B97, and LC-ωPBE associated with def2-SVP, TZVP, QZVP, aug-cc-pVDZ, and aug-cc-PVTZ.D2 and D3 dispersion corrections with different damping orders and counterpoise corrections have been considered.Overall, the authors found   Note: For DFT functionals, the basis set used in the calculations is def2-TZVP.All the DFT functionals used include the D3 empirical dispersion, except MN15, ωB97M-V, and ωB97X-V.Statistical descriptors, including the mean absolute deviation (MAD), the maximum deviation (MAX), the standard deviation (STD), and the mean (signed) deviation (MD) are calculated in reference to the CCSD(T)/CBS adsorption energies.Some names have been truncated to fit the table on the page: PW6 = PW6B95-D3.All energies are reported in kcal/mol.Amino stands for p-Aminobenzoic acid, while pChloro is p-Chloro Phenol.
We have calculated the adsorption energies using a double zeta basis set (def2-SVP) to seek a cheap solution.The calculated adsorption energies are reported in Table 3.Similar to Table 2, statistical descriptors (MAD, MAX, and STD) are presented in Table 3.The results show that the smallest MAD is 2.9 kcal/mol, which is significant when accuracy is sought.Moreover, it has been found that the largest value of the maximum deviation (MAX) is 11.1 kcal/mol, while the corresponding MAD is 8.9 kcal/mol.This result indicates that the def2-SVP does not achieve similar accuracy as the def2-TZVP.Therefore, to accurately calculate the adsorption energy of the studied systems using DFT, one needs to use the def2-TZVP basis set.On the other hand, The results show that the MN15 functional has the smallest MAD (among the studied functionals) even with the def2-SVP basis set.Therefore, in the case of limited computational resources, the MN15 functional associated with a double zeta basis set can be recommended.
The results show that the adsorption power of coronene toward the studied pollutants is increasing in the following order: aniline < phenol < pchloro-phenol < p-aminobenzoic acid.In addition, we have found that the MN15 functional has the smallest mean absolute deviation (MAD) compared to the CCSD(T)/CBS level of theory.The MN15 MAD is evaluated to be 0.2 kcal/mol.Besides, the PW6B95-D3 functional has been found to perform with similar accuracy to the MN15 functional.Overall, we have noted that the functionals MN15, PW6B95-D3, ωB97X-V, M05-2X-D3, M05-D3, and M06-2X-D3 have their MADs and RMSEs within 1.0 kcal/mol, and can be recommended for further investigation of adsorption processes.Furthermore, we noted that the functionals B3PW91-D3, B97-D3, BLYP-D3, and M06-D3 have the largest MADs among the assessed DFT functionals.Therefore, these functionals should be avoided when considering non-covalent systems for adsorption processes.
We calculated the adsorption energies using the def2-SVP basis set to explore possible cheap solutions.It has been found that the smallest MAD obtained using the def2-SVP is 2.9 kcal/mol, which is considerable.It has been concluded that the MN15 functional can be recommended associated with a double zeta basis set in the case of limited computational resources.

Figure 1 .
Figure 1.Schematic representation of the model used in this work to compute the adsorption energy of phenol onto coronene with four explicit water molecules.

Figure 2 .
Figure 2. The four systems studied in this work for the benchmarking of DFT functionals.The structures reported here are optimized at the PW6B95-D3/def2-TZVP level of theory.The structures are the most stable configurations obtained after optimizations in our previous works.

Figure 3 .
Figure 3. Three statistical descriptors of the sixteen DFT functionals as compared to the CCSD(T)/CBS level of theory.All the assessed DFT functionals include the D3 empirical dispersion, except MN15, ωB97M-V, and ωB97X-V.The statistical descriptors are the mean absolute deviation (MAD), the maximum deviation (MAX), and the standard deviation (STD).

Table 3 .
Adsorption energies of the studied systems were calculated using sixteen different DFT functionals benchmarked to our estimated CCSD(T)/CBS level of theory.Note: For DFT functionals, the basis set used in the calculations is def2-SVP.All the DFT functionals used include the D3 empirical dispersion, except MN15, ωB97M-V, and ωB97X-V.See caption of Table2for details.All energies are reported in kcal/mol.Amino stands for p-Aminobenzoic acid, while pChloro is p-Chloro Phenol.