Applicability of medium-size basis sets in calculations of molecular dynamic polarisabilities

Static and dynamic average polarisabilities and polarisability anisotropies of seven linear non-polar and polar molecules are calculated within the CCS, CC2, and CC3 approximations using a range of medium-sized basis sets: the polarised LPol-n (n = ds, dl, fs, fl), the aug-pc-n (n = 1, 2), the def2-SVPD, and the def2-TZVPD basis sets. Reference values are obtained using a hierarchy of Dunning's (d-)aug-cc-pVXZ (X = D, T, Q, 5) basis sets. The results are discussed together with the available CCSD values in terms of basis set and correlation method errors, and their ratio. Detailed analysis shows that the def2-SVPD basis set is already adequate for CCS polarisability calculations. When affordable, the slightly larger aug-pc-1 basis set is recommended, as it leads to a significant reduction of the basis set error. The def2-TZVPD, LPol-ds, and aug-pc-2 basis sets are the optimal choices within the CC2 approximation, with the last of these allowing one to approach the CC2 basis set limit. The LPol-ds, LPol-dl, and def2-TZVPD sets outperform the aug-cc-pVTZ set in CCSD calculations of the average polarisability, with def2-TZVPD also being competitive with the other reduced-size sets in the determination of the polarisability anisotropy. The aug-pc-2 basis is a particularly attractive choice for CCSD, giving the accuracy of aug-cc-pVQZ at a significantly reduced computational cost. The polarisability anisotropy is shown to be more computationally demanding than the average polarisability, in particular with respect to the accuracy of the correlation method, and an accurate evaluation of this property requires at least the CCSD model.


Introduction
In a panorama that sees the fast ongoing progress in the development of efficient quantum chemical methods, pioneered by the late and much missed Prof. Nicholas Handy, reduction of the basis set size in the evaluation of electric properties remains a necessity whenever large molecular systems are considered or a method exhibiting a steep computational scaling with the number of basis functions is employed.
In contemporary quantum chemistry, methods based on the coupled cluster (CC) theory are the choice for molecular property calculations that ensures high accuracy of the results. Exponential parametrisation of the CC wave function yields highly accurate results in a twofold manner: (1) it guarantees size-extensivity and intensivity, i.e. the proper scaling of the calculated molecular properties with the molecular size, also when the CC expansion is truncated at a given excitation level [1,2] and (2) it leads to a fast convergence of the CC expansion and thus to reliable values of molecular properties within the truncated CC models [2]. In particular, the CC model involving the single and double triples equations, reducing the cost compared to CCSDT calculations.
Contrary to the fast convergence of the CC expansion, one-electron expansions commonly employed in CC calculations are based on Gaussian-type functions and are, therefore, slowly convergent towards the basis set limit. To ensure an accuracy of the one-electron expansion matching that of the CC expansion, it is necessary to use large one-electron basis sets in CC calculations. Combined with the steep computational scaling of CC models with the number of basis functions, the large one-electron basis sets limit the size of systems which can routinely be investigated with CC models. In particular, the commonly used all-purpose multiply augmented correlation consistent polarised valence multiple-zeta, x-aug-cc-pVXZ basis sets (abbreviated as xaVXZ throughout) developed by Dunning and co-workers [10-12] are well known to lead to highly accurate CC results, but also to a high cost of the calculations due to their large number of basis functions. Simply reducing the basis set size by employing the aVTZ (or even the aVDZ) basis set very often leads to a large deterioration of the accuracy of the results.
Theoretical developments aimed at reducing the computational cost of the CC calculations, without sacrificing their accuracy, are currently pursued in two major directions. One direction is the development of the linear scaling CC methods and the other is the development of property-oriented, size-reduced one-electron basis sets.
In the linear scaling CC methods (see the Introduction to Ref. [13] for a concise review of the linear scaling CC methods), the steep computational scaling with the number of orbitals is reduced to linear. In many of these methods, this is achieved by replacing the standard molecular orbitals, which by construction are delocalised and therefore generate the steep computational scaling of the CC methods, by a set of local orbitals [14-28], which allow for exploiting the local nature of inter-electronic interaction, thus reducing the computational scaling to linear. As exploiting the locality of inter-electronic interaction usually does not give any computational gains for small systems and, simultaneously, the construction of local orbitals introduces an additional computational overhead, there exists an onset point that defines the size of a molecular system for which linear scaling CC calculations become computationally more efficient than standard molecular-orbital-based CC calculations.
The presence of the onset point in linear scaling CC methods hinders these methods from being efficiently applied to small molecular systems, for which standard molecular-orbital-based implementations are usually more efficient. For small molecular systems, development of reduced-size property-oriented basis sets is, therefore, an attractive remedy lowering the cost of steep-scaling standard CC methods. Moreover, such basis sets may also be applied in linear scaling CC calculations on large molecular systems, further reducing the cost of the calculations.
We have shown in earlier work that the use of the LPol-n basis sets [29] belonging to the Pol family, as well as Jensen's aug-pc-n (n = 1, 2) [30-33], and Rappoport and Furche's def2 [34] basis sets yields reasonable CCSD electric property values for non-interacting systems [35,36]. Furthermore, the LPol-n and Jensen's basis sets have been successfully employed in CCSD calculations of induced linear electric properties in the CO-Ne van der Waals complex [37], as well as in finite-field CCSD and CCSD(T) calculations of induced linear and nonlinear electric properties in model hydrogen-bonded complexes [38-40].
Recently, we have evaluated the accuracy guaranteed by the above-mentioned basis sets in CCSD calculations of static and dynamic electric polarisabilities and first hyperpolarisabilities in non-interacting molecules [36]. For that purpose, we have determined reference results using a hierarchy of the xaVXZ basis sets. The main conclusion of that work was that the LPol-ds and TZVPD basis sets are a promising alternative to larger basis sets in the calculations of electric dipole polarisabilities, in particular when large computational savings are a priority. The LPol-dl basis set has been found to be an interesting choice for the evaluation of electric first hyperpolarisabilities and the aug-pc-2 basis set has been recommended for the calculations of both polarisabilities and first hyperpolarisabilities in molecules.
In the present work, we extend the polarisability part of our previous study [36] to include also the CCS, CC2, and CC3 members of the CC response hierarchy. We thus investigate the convergence of each basis set towards the basis set limit for a given correlation method, by comparison with the largest basis set (lbs) for that method. We also analyse the convergence of CCS, CC2, and CCSD methods towards the FCI limit in the given basis set, by comparison with the CC3 results in that basis set. We further determine the ratio of the basis set error to the correlation method error, obtaining in this way information about the convergence of each combination of correlation method and basis set towards the exact solution to the Schrödinger equation, which may be viewed as an FCI result in a complete basis. We compare the basis set error, correlation error, and their ratio obtained for the reduced-size basis sets against the corresponding values obtained for Dunning's basis sets. This allows us to draw conclusions on how the application of the reduced-size basis sets may lower the cost of the calculations compared to Dunning's basis sets, without sacrificing the accuracy. We choose the same basis sets as previously [36]: LPol-n (n = ds, dl, fs, fl), Jensen's aug-pc-n (n = 1, 2), and Rappoport and Furche's def2-SVPD and def2-TZVPD.
Among the reduced-size basis sets which we use, the polarised basis sets of the Pol family have a long tradition. The idea of Pol basis sets [41,42] originated from the physical model of a harmonic oscillator in a static external electric field [43], and was next generalised to the case of a dynamic electric field [44]. This allowed the development of new types of basis sets: the ZPol sets intended for moderately accurate calculations of linear electric properties in molecules [45-47] and the LPol-n [29] basis sets developed for accurate calculations of linear and non-linear molecular electric properties. The latter have been shown to be competitive with larger all-purpose basis sets in the evaluation of electric properties in non-interacting systems [29,35,36,48] as well as in hydrogen-bonded [38,39] and van der Waals complexes [37].
Jensen's polarisation consistent basis sets are based on convergence studies of Hartree-Fock energies. They include higher angular momentum functions, the importance of which decreases geometrically, providing a faster convergence than correlation consistent basis sets [30]. The bases have been extended to the evaluation of molecular properties through the addition of diffuse functions. It is well known that for the evaluation of molecular properties like dipole and quadrupole moments, and static polarisabilities, the addition of diffuse functions up to d functions is required to reach the basis set limit in a consistent fashion, but higher order angular momentum functions are significantly less important [33]. Rappoport and Furche's bases were recently built on the Karlsruhe segmented contracted basis sets of split-valence to quadruple-zeta valence quality, and optimised variationally for the evaluation of polarisabilities.
The paper is organised as follows. In Section 2, we briefly outline the details of our computational analysis, giving the essential definitions. Section 3 is devoted to the discussion of our observations. Conclusions complete the study in Section 4.

Computational details
As in our previous study on the evaluation of electric dipole dynamic polarisabilities and first hyperpolarisabilities of isolated molecules [36], we select a test set of seven linear molecules: homonuclear (F2, N2) and heteronuclear diatomics (HF, CO), triatomic non-polar (CO2) and polar (HCN) molecules, and a small four-atomic system (C2H2). As in Ref. [36], the geometries are taken from Ref. [49]. We evaluate the average static and dynamic electric dipole polarisability, α ave (λ), at the radiation wavelength λ, and the polarisability anisotropy, α ani (λ), see Ref. [50]. Here, α ii (λ) = α ii (−ω; ω) denotes a diagonal component of the polarisability tensor and may be calculated using the frequency-dependent linear response function, where the angular frequency ω of the laser field is related to the wavelength λ by ω = 2πc 0 /λ, with c 0 denoting the speed of light in vacuo.
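The quantities introduced above can be illustrated with a minimal Python sketch. It assumes the conventional definitions for a linear molecule aligned along z, namely α ave = (α xx + α yy + α zz)/3 and α ani = α zz − α xx; these are the standard forms and are stated here as an assumption, since the defining equation is not reproduced in this text. The function names and the tensor components are purely illustrative placeholders, not computed results.

```python
import math

# Wavelength (nm) of a photon whose energy is 1 hartree, i.e. h*c/E_h.
# Dividing this constant by a wavelength in nm gives omega in atomic units.
HARTREE_WAVELENGTH_NM = 45.56335253


def wavelength_to_au(lambda_nm):
    """Convert a radiation wavelength in nm to a frequency in atomic units."""
    if math.isinf(lambda_nm):
        return 0.0  # static limit, lambda -> infinity
    return HARTREE_WAVELENGTH_NM / lambda_nm


def average_polarisability(a_xx, a_yy, a_zz):
    """Assumed definition: one third of the trace of the tensor."""
    return (a_xx + a_yy + a_zz) / 3.0


def polarisability_anisotropy(a_xx, a_zz):
    """Assumed definition for a linear molecule along z: parallel minus perpendicular."""
    return a_zz - a_xx


# Placeholder tensor components in au (illustrative only).
a_xx = a_yy = 10.0
a_zz = 15.0

omega = wavelength_to_au(632.8)  # approximately 0.0720 au
alpha_ave = average_polarisability(a_xx, a_yy, a_zz)
alpha_ani = polarisability_anisotropy(a_xx, a_zz)
```

The conversion reproduces the frequency of about 0.0720 au quoted for λ = 632.8 nm, which is a convenient check that the angular-frequency convention is the one in use.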
The polarisability is calculated for λ = 632.8 nm (corresponding to ω = 0.07200 au) and λ → ∞ nm (ω = 0 au) using the CC linear response code within the DALTON 2013 package [51,52]. To complement the CCSD results of our previous work [36], the calculations are carried out using CCS and CC2 methods. To be able to estimate the rate of convergence towards the FCI limit for the CCS, CC2, and CCSD methods in each basis set, additional reference calculations are carried out using the CC3 model. Since the performance of the finite basis sets and, to a smaller extent, of different correlation methods depends on the frequency used in the calculations, conclusions of the present study may not be applicable at frequencies distant from the static limit.
To analyse the basis set convergence within each correlation method, we evaluate the basis set error as the root-mean-square error (RMSE, over the seven investigated systems and the two frequencies) for each basis set with respect to the lbs values, denoted RMSE 1 (Equation (2)). Here, i = ave and ani for the average polarisability and its anisotropy, respectively, the index s denotes the molecule, f stands for the frequency, m labels the correlation method, and bs the basis set used in the calculations. To investigate the convergence of the correlation methods towards the FCI limit for a given basis, we evaluate the RMSEs for each basis set and method with respect to the corresponding CC3 value, denoted RMSE 2 (Equation (3)).
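With the index conventions given in the text, the two error measures can be written out explicitly. The form below is a reconstruction consistent with the verbal description, assuming the usual RMSE normalisation over the N_s = 7 molecules and N_f = 2 frequencies and absolute errors in au; the exact normalisation is an assumption.

```latex
\[
\mathrm{RMSE}_1^{\,i,m,bs} =
\sqrt{\frac{1}{N_s N_f}\sum_{s=1}^{N_s}\sum_{f=1}^{N_f}
\left(\alpha_i^{\,s,f,m,bs} - \alpha_i^{\,s,f,m,lbs}\right)^{2}},
\qquad i \in \{\mathrm{ave},\mathrm{ani}\},
\]
\[
\mathrm{RMSE}_2^{\,i,m,bs} =
\sqrt{\frac{1}{N_s N_f}\sum_{s=1}^{N_s}\sum_{f=1}^{N_f}
\left(\alpha_i^{\,s,f,m,bs} - \alpha_i^{\,s,f,\mathrm{CC3},bs}\right)^{2}} .
\]
```

In the first expression the method m is held fixed and the basis set is compared with lbs; in the second the basis set bs is held fixed and the method is compared with CC3.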

Results and discussion
The results of the polarisability calculations that are new with respect to our previous work [36] are given in the Supplementary Material. In Table 1, we report the average polarisability and the corresponding anisotropy obtained using the CCS, CC2, and CCSD models and the daV5Z basis set, together with CC3/aV5Z results. The values in Table 1 are used throughout as the reference values. In order to estimate the basis set errors for the investigated reduced-size basis sets, we report in Table 2 and in Figures 1 and 2 the values of RMSE 1 of Equation (2) for each basis set, plotted in the figures versus the number of basis functions for the oxygen atom. To be able to draw conclusions about the balance between the accuracy and computational gains obtainable with the investigated reduced-size basis sets compared to the standard Dunning's (d)aVXZ basis sets, the latter are also included in the table. Note that for CCS, CC2, and CCSD, we use the daV5Z basis as the reference, whereas for CC3 we use aV5Z. The values of RMSE 1 for CC3 are, therefore, not directly comparable to those for the other methods.
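To give a sense of the size measure used on the horizontal axis of the figures, the number of basis functions on a first-row atom such as oxygen can be counted from the contraction pattern. The sketch below does this for the singly augmented Dunning sets only, assuming the standard aug-cc-pVXZ contractions and spherical-harmonic (2l + 1) angular functions; the dictionary and function names are illustrative.

```python
# Number of spherical-harmonic functions in one shell of each angular momentum.
SHELL_SIZE = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9, "h": 11}

# Contracted shells of aug-cc-pVXZ for a first-row atom (standard patterns).
AUG_CC_PVXZ = {
    "aVDZ": {"s": 4, "p": 3, "d": 2},
    "aVTZ": {"s": 5, "p": 4, "d": 3, "f": 2},
    "aVQZ": {"s": 6, "p": 5, "d": 4, "f": 3, "g": 2},
    "aV5Z": {"s": 7, "p": 6, "d": 5, "f": 4, "g": 3, "h": 2},
}


def count_functions(contraction):
    """Total number of contracted basis functions for one atom."""
    return sum(n * SHELL_SIZE[shell] for shell, n in contraction.items())


counts = {name: count_functions(c) for name, c in AUG_CC_PVXZ.items()}
for name, n in counts.items():
    print(f"{name}: {n} functions")
```

The count roughly doubles with each increase of the cardinal number, which is why replacing a Dunning set by a reduced-size set of comparable accuracy translates into a substantial saving for steeply scaling CC methods.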
For the CCS, CC2, and CCSD methods, the basis set errors in Table 2 are in general smallest for CCS, reflecting that CCS is the simplest member of the CC response hierarchy and therefore has the lowest requirements with respect to the basis set size. The errors for CC2 and CCSD results are close to each other, showing that while the approximations introduced in the doubles equations lead to the reduced computational scaling of CC2 compared to CCSD, the rate of convergence towards the basis set limit seems to be quite similar for the two methods, at least for the properties and molecules investigated here. That, of course, does not mean that CC2 and CCSD have the same accuracy, as their convergence rate towards the FCI limit is very different, as we shall see later.
Convergence within Dunning's hierarchy of augmented and doubly-augmented basis sets is smooth as expected. Comparing aVXZ and daVXZ series, we observe that increasing the cardinal number, X, is more important than additional augmentation.
Turning our attention to the reduced-size basis sets, we note from Table 2 that the SVPD basis set has the quality of Dunning's aVDZ (or even worse in some cases) for all investigated methods and properties. The TZVPD basis, with its size slightly smaller than that of aVTZ, gives results of the same, or slightly better, accuracy as aVTZ in the case of CCS, CCSD, and CC3 methods. For CC2, TZVPD provides significantly higher accuracy than aVTZ, especially for the average polarisability. Jensen's aug-pc-1 basis is very close to the aVTZ basis set for CCS, but moving towards higher members of the CC response hierarchy the similarity between the two basis sets vanishes and the quality of aug-pc-1 deteriorates, particularly for the polarisability anisotropy. This behaviour most likely results from the fact that aug-pc-1 is designed to decrease computational demands in density functional theory, rather than to be applied in CC calculations. The aug-pc-2 basis set, on the other hand, has the quality close to aVQZ across the whole CC hierarchy. Combined with the fact that for aug-pc-2, the number of basis functions is half of that in aVQZ (see Figures 1 and 2), this makes aug-pc-2 a very attractive choice for CC calculations of polarisabilities. For the LPol-n series, we observe that the basis sets containing only the first-order polarisation functions (ds and dl) have for the averaged polarisability smaller errors than the fs and fl sets, but the opposite happens when the anisotropy is considered. The LPol-ds and LPol-dl basis sets give in general better results than Dunning's aVTZ basis set, particularly for the average polarisability, although having a smaller number of basis functions (see Figures 1 and 2) and therefore constitute an attractive replacement for aVTZ.
Figure 1. α ave RMSE 1 (au) for the different basis sets with respect to the results obtained in the largest basis set used for the given method. Basis sets are denoted with shortened labels (aX and dX with X = D, T, Q, 5 for aug-cc-pVXZ and d-aug-cc-pVXZ, respectively; an with n = 1, 2 for aug-pc-n; xy with x = d, f and y = s, l for LPol-xy; and S and TZ for SVPD and TZVPD, respectively).
Figure 2. α ani RMSE 1 (au) for the different basis sets with respect to the results obtained in the largest basis set used for the given method. For basis set notation see Figure 1.
The basis set errors of Table 2 have been obtained by comparison with large Dunning basis set results and therefore reflect the convergence of the investigated basis sets towards the basis set limit, within a given correlation method. To obtain the full picture of applicability of the reduced-size basis sets, in particular from the point of view of a possible cost reduction, correlation method errors need also to be discussed. We have compiled the latter in Table 3 in terms of RMSE 2 obtained using CC3 results as the reference (see Equation (3)). As CC3 yields results very close to FCI, the values of RMSE 2 in Table 3 reflect the convergence of the investigated correlation methods towards the FCI limit in a given basis set. For a given basis set, the values of RMSE 2 systematically decrease when going from CCS to CCSD, showing, as expected, an increasing accuracy across the CC response hierarchy. For all basis sets, the error of a given correlation method is larger for the polarisability anisotropy than for the average polarisability. This, together with our earlier observations based on Table 2, shows that the average polarisability has lower computational demands than the polarisability anisotropy.
Furthermore, within a given method, RMSE 2 remains almost independent of the basis set. This implies that, in all cases investigated here, there is approximately no coupling between the correlation method error (RMSE 2 in Table 3) and the basis set error (RMSE 1 in Table 2) and these two errors may thus safely be compared against each other.
To facilitate such comparison, we have reported in Table 4 the ratio RMSE 1 /RMSE 2 . The ratio shows, for a given combination of correlation method and basis set, whether it is more important to improve the basis set or the correlation description to ensure convergence towards the exact solution to the Schrödinger equation, which may be viewed as FCI in a complete basis set.
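The construction of the ratio can be sketched in a few lines of Python. The sketch assumes the RMSE definitions described in the Computational details (absolute errors averaged over all molecule/frequency pairs); the basis set labels "small" and "large" and all numerical values are hypothetical placeholders chosen only to make the arithmetic easy to follow, with just two molecule/frequency pairs instead of fourteen.

```python
import math


def rmse(values, reference):
    """Root-mean-square error between two equally long sequences."""
    if len(values) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(v - r) ** 2 for v, r in zip(values, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical polarisabilities (au), indexed by (method, basis set);
# each list runs over the molecule/frequency pairs.
alpha = {
    ("CCSD", "small"): [10.0, 11.0],
    ("CCSD", "large"): [10.3, 11.4],  # "large" plays the role of lbs
    ("CC3", "small"): [10.6, 11.8],
}

# Basis set error: same method, compared with the largest basis set.
rmse_1 = rmse(alpha[("CCSD", "small")], alpha[("CCSD", "large")])

# Correlation method error: same basis set, compared with CC3.
rmse_2 = rmse(alpha[("CCSD", "small")], alpha[("CC3", "small")])

ratio = rmse_1 / rmse_2
print(f"RMSE1 = {rmse_1:.4f}, RMSE2 = {rmse_2:.4f}, ratio = {ratio:.2f}")
```

In this toy case the ratio is below one, meaning that the correlation treatment, not the basis set, would be the larger source of error for that combination.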
We see from Table 4 that for a given basis set, the RMSE 1 /RMSE 2 ratio systematically increases from CCS to CCSD, reflecting, as expected, the increasing basis set requirements across the CC response hierarchy. Furthermore, almost all considered basis sets have an accuracy significantly outperforming the accuracy of the CCS method (the RMSE 1 /RMSE 2 ratio is close to zero for CCS in almost all cases), with the exception of aVDZ and SVPD. These two basis sets are thus a good match for CCS, as they correspond to the accuracy of the method and do not introduce an excessive computational cost. The use of the aug-pc-1 basis set allows for a very significant reduction of the basis set error at no additional cost with respect to that obtained with the aVDZ basis set. For Dunning's basis sets, the values of the RMSE 1 /RMSE 2 ratio obtained for CC2 with the cardinal number X roughly correspond to CCSD values with the cardinal number X + 1. This is in line with ample numerical evidence showing that to sustain the accuracy of the results across the CC hierarchies, the cardinal number of Dunning's basis sets has to be increased at least by one when moving from one level of the hierarchy to the next one (see, for example, Chapter 15 of Ref. [2] for the discussion of this issue in the case of static properties). The aVDZ basis set deteriorates the accuracy of the CCSD calculations (the ratio is much larger than one) and should, therefore, not be combined with CCSD.
Turning our attention to the results reported in Table 4 for the reduced-size basis sets, we note that all these basis sets match the quality of the CC2 and CCSD models (the values of the RMSE 1 /RMSE 2 ratio not larger than one), except SVPD which is not sufficiently accurate for the average polarisability calculations at the CCSD level (the value of the RMSE 1 /RMSE 2 ratio much larger than one). For the aug-pc-n basis sets of Jensen, we observe a similar behaviour as for Dunning's basis sets: to keep the RMSE 1 /RMSE 2 ratio approximately constant, aug-pc-1 has to be replaced by aug-pc-2, when going from CC2 to CCSD, reflecting a hierarchical nature of the aug-pc-n basis sets. At the same time, we note that the accuracy of the aug-pc-2 basis significantly outperforms that of CC2 method, as is the case also for LPol-fs and LPol-fl in the case of the polarisability anisotropy. In general, the accuracy of LPol basis sets matches the accuracy of both CC2 and CCSD models (the values of the RMSE 1 /RMSE 2 ratio less than one). The same is true for the TZVPD basis.
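The qualitative reading of the ratio used in this discussion (close to zero, not larger than one, much larger than one) can be expressed as a small helper. This is only an illustrative sketch: the numerical thresholds `low` and `high` are hypothetical choices made here for demonstration and are not values taken from the paper, which uses the three regimes qualitatively.

```python
def classify_balance(ratio, low=0.1, high=1.0):
    """Label a basis-set/method combination from its RMSE1/RMSE2 ratio.

    The thresholds are illustrative assumptions:
      ratio > high : the basis set error dominates
      ratio < low  : the basis set is far more accurate than the method
      otherwise    : basis set and method errors are comparable
    """
    if ratio < 0:
        raise ValueError("an RMSE ratio cannot be negative")
    if ratio > high:
        return "basis set limits the accuracy"
    if ratio < low:
        return "correlation method limits the accuracy"
    return "balanced"


# Hypothetical ratios for three method/basis combinations.
examples = {"case A": 0.02, "case B": 0.6, "case C": 3.5}
labels = {name: classify_balance(r) for name, r in examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

A combination in the first category is a candidate for a smaller basis set, whereas one in the last category calls for a larger basis set before a higher-level method is worth its cost.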

Summary and conclusions
The focus of this article has been the applicability of the reduced-size basis sets to CC calculations of the averaged polarisability and polarisability anisotropy in small linear molecules. For that purpose, we have estimated basis set errors, correlation method errors, and their ratio for each combination of correlation method and basis set. The basis set error reflects the convergence towards the basis set limit for a given correlation method. The correlation method error gives information about the convergence towards FCI in a given basis set. The ratio of the two errors shows whether it is more important to improve the basis set or the correlation description to move towards the exact solution to the Schrödinger equation.
Considering the CCS model, the basis set limit may be achieved for this model with aug-pc-2 and all LPol-n basis sets, whereas aug-pc-1 and TZVPD give the quality of the aVTZ basis set. The CCS/SVPD combination, corresponding to the accuracy of CCS/aVDZ, is balanced, in the sense that the basis set error and the correlation method error are comparable, and the size of the basis set, therefore, does not introduce an additional computational cost that would be unnecessary from the point of view of the quality of the correlation method. A significant improvement of the CCS results with respect to the aVDZ values can be obtained at no additional cost if the aug-pc-1 basis set is employed.
For CC2 and CCSD in general, all the reduced-size basis sets ensure the accuracy matching that of the correlation model, except SVPD, which should not be combined with CCSD when the averaged polarisability is calculated. The TZVPD basis set error is the same for both CC2 and CCSD and corresponds to the accuracy of aVTZ. The same is true for the LPol-n basis sets. Taking into account the correlation method error for CC2 and the basis set size, TZVPD, LPol-dl, and LPol-ds seem to be particularly well suited for CC2 calculations, yielding an accuracy higher than aVTZ and at the same time lowering the cost of the calculations. The CC2 basis set limit may be achieved already with the aug-pc-2 set, which is only marginally larger than the LPol-dl set. For CCSD calculations, TZVPD gives the accuracy of aVTZ or better for both the average polarisability and the polarisability anisotropy, and among the LPol-n sets, ds and dl outperform aVTZ in average polarisability calculations. The aug-pc-2 basis is a particularly attractive choice for CCSD, giving the accuracy of aVQZ and reducing the computational cost significantly.
The polarisability anisotropy has higher computational demands than the averaged polarisability. This is much more pronounced for the convergence towards FCI than for that towards the basis set limit. Finally, the correlation method error has been shown to be practically independent of the choice of the basis set.