Nonorthogonal tight-binding model with H–C–N–O parameterisation

A parametric nonorthogonal tight-binding model (NTBM1) with the set of parameters for H–C–N–O systems is presented. This model compares well with widely used semi-empirical AM1 and PM3/PM7 models but contains less fitting parameters per atom. All NTBM1 parameters are derived based on a criterion of the best agreement between the calculated and experimental values of bond lengths, valence angles and binding energies for various H–C–N–O molecules. Results for more than 200 chemical compounds are reported. Parameters are currently available for hydrogen, carbon, nitrogen, oxygen atoms and corresponding interatomic interactions. The model has a good transferability and can be used for both relaxation of large molecular systems (e.g., high-molecular compounds or covalent cluster complexes) and long-timescale molecular dynamics simulation (e.g., modelling of thermal decomposition processes). The program package based on this model is available for download at no cost from http://ntbm.info.


Introduction
Currently available methods of electronic structure calculation can be conventionally divided into ab initio techniques, semi-empirical schemes and empirical approaches. It stands to reason that each of the abovementioned method has its own advantages and drawbacks. For instance, ab initio methods (say, density functional theory [1,2]) are very accurate but computer-resources demanding and hence applicable to small systems only. In contrast, empirical approaches (e.g., Tersoff-Brenner potentials [3,4]) make it possible to simulate systems consisting of more than 1000 atoms but are applicable to restricted class of materials, and their accuracy sometimes leaves something to be desired. Semi-empirical schemes represent a reasonable compromise between the oversimplified classical interatomic potentials and more rigorous approaches based on first principles. They are suitable for modelling quite large molecular systems and long-timescale molecular dynamics simulation.
The most popular semi-empirical methods, which are realised in a majority of modern program packages, are Austin Model 1 (AM1) [5] and parameterised model number 3 (PM3), [6,7] as well as its last modification number 7 (PM7). [8] These methods are based on the modified neglect of diatomic overlap (MNDO) approximation. [9] Another approach is the tight-binding models, which are frequently referred to semi-empirical schemes as well. Contrary to classical potentials, those models explicitly account for the contribution of electron subsystem to the total energy, being not so computer time demanding as ab initio methods. The tight-binding methods provide a possibility to calculate geometrical and energy characteristics for the systems consisting of 100-1000 atoms, and to examine the evolution of a system consisting of 10 -100 atoms for a sufficiently long (on the atomic scale) time 10 ns to 1 ms (the characteristic times for ab initio approaches are 1 -10 ps). Currently, different sets of parameters of the tight-binding models are known not only for single-element compounds (e.g., carbon- [10], silicon- [11,12] and germanium-based [13,14]), but also for compounds composed of different chemical elements (for example, hydrogen-silicon, [15] hydrogen -carbon -oxygen, [16] A II B VI semiconductors, [17] boron and nitrogen dopants in graphene, [18] silicon nitride systems, [19] etc.).
Earlier, we successfully used the nonorthogonal tightbinding model for hydrocarbons [20] to study the energy, structural, electronic and kinetic characteristics of a wide class of hydrocarbon systems, such as cubane [21] and its derivatives, [22 -24] graphane nanoribbons [25] and carbon peapods. [26] The results obtained are in good agreement with published experimental data and first principles calculations. As a next step, we decided to extend this model to the HZCZNZO compounds which represent the important class of materials. They form the basis for bionanotechnology, nanoelectronics, high-energy-density materials, pharmaceutics, etc., which is why there is a need to accumulate the knowledge about them. In the case that experimental data remain incomplete and controversial, theoretical calculations become of particular importance. In most cases, both AM1 and PM3/PM7 correctly describe the geometries and binding energies of HZCZNZO molecules. [7] However a serious drawback of these approaches is unsatisfactory description of binding energies of small diatomic molecules and radicals, e.g., C 2 and NH. For example, the binding energies predicted for C 2 dimer by AM1 and PM3 are equal to 2.688 and 1.778 eV/atom, respectively, while the experimental value is 3.120 eV/ atom. [27] Note that the development of PM7 makes possible to overcome many PM3 faults, but unfortunately, some other shortcomings appear (for example, PM7 is less accurate than PM3 for calculating binding energy of nitrogen molecule). As a consequence, these approaches lead to qualitatively incorrect results in the cases that accurate values of binding energies of small clusters are needed (e.g., the thermal fragmentation processes).
The present paper is aimed at the search for such a set of parameters of the nonorthogonal tight-binding model that adequately describes a wide class of HZCZNZO molecular systems, from diatomic molecules to macromolecules, peptides consisted from the main amino acids except sulphur-containing ones, and crystals (results for a specific class of high-energy-density materials are presented in Refs. [28,29]). The parameters' fitting is based on the criterion of the best correspondence between the computed and experimental (not derived from ab initio calculations) values of binding energies, bond lengths and valence angles of several selected small H k C l N m O n molecules. The resulting tight-binding potential has a good transferability and appears to work well for various relatively large H k C l N m O n molecules and clusters. In this work, we present parameters for hydrogen, carbon, nitrogen and oxygen atoms, and in future we plan to extend the set of parameters to the other chemical elements (e.g., sulphur, phosphorus, boron, etc.).
The paper is organised as follows. In Section 2, we make an overview of the nonorthogonal tight-binding model for HZCZNZO molecular systems, and describe the algorithm used to find its parameters. In Section 3, we compare the computed binding energies, bond lengths and valence angles of various H k C l N m O n molecules and peptides with the corresponding experimental values and the results of ab initio (DFT/B3LYP/6-311G) and semiempirical (AM1, PM3/PM7, tight-binding model [16]) calculations. Section 4 concludes the paper.

Overview of the nonorthogonal tight-binding model for HZCZNZO systems
In the tight-binding model, the total potential energy of the system for a given set of atomic coordinates fR i } is where is the quantum-mechanical electronic component of E, which is equal to the sum of the one-electron energies 1 n for the occupied states (in the absence of magnetic field 1 n do not depend on the spin projection s ¼" or #), is the 'classical' component of E, which is equal to the sum of the pairwise ionic repulsive potentials, where The energy spectrum f1 n } is determined from the stationary Schrödinger equation by expanding the eigenfunctions in terms of the nonorthogonal atomic orbitals ff ia ðrÞ}, where i is the number of the atom, a is the type of atomic orbital of this atom (1S-orbitals of the hydrogen atoms and 2S, 2P x , 2P y and 2P z orbitals of the carbon, nitrogen and oxygen atoms are taken into account). Upon substitution of Equation (5) into Equation (4), multiplication from the left side by f * jb ðrÞ, and integration over r, the eigenvalue equations take the form X are the overlap matrix elements of nonorthogonal atomic orbitals, and are the matrix elements of the Hamiltonian. Thus, the one-electron Schrödinger equation (4) reduces to the generalised eigenvalue problem (6) which can be solved numerically. The number of equations in Equation (6) is equal to the total number of atomic orbitals under consideration, namely, where N H , N C , N N and N O are the numbers of hydrogen, carbon, nitrogen and oxygen atoms, respectively. Molecular orbitals (5) are occupied by electrons according to the Fermi -Dirac distribution function and the Pauli exclusion principle. The number of electrons is N H þ 4N C þ 5N N þ 6N O (one valence electron on each hydrogen atom and four, five and six valence electrons on each carbon, nitrogen and oxygen atom, respectively).
After calculation of eigenvalues 1 n and eigenvectors C n ia , the force F k ¼ 2›E=›R k acting on the k-th atom can be determined from Equations (1) -(3) using the formula derived from Equation (6): Explicit expressions for S jb ia and H jb ia (see later) greatly simplify simulations of relatively large systems. Furthermore, the second-order differentiation of Equation (6) allows one to obtain the Hessian matrix, [30] which is useful for geometry optimisation and necessary for calculation of vibrational frequencies.

NTBM1 parameterisation and fitting procedure for HZCZNZO systems
For H jb ia , we used the following parameterisation [31] (similar parameterisation was used in [16,20]): where  (3) were chosen in the form There are ten parameters b ij , according to the number of pairs of atoms of different sort (we take F 0 ¼ 1 eV for all possible atom pairs). Thus, the total number of parameters to be determined We leave unchanged (see Ref. [20]) the parameters describing the HZH, CZH, and CZC interactions, because the tight-binding model presented in Ref. [20] not only provides rather accurate description of CZH interactions in hydrocarbons H k C l , but also allows one to simulate the structure and energetics of both small (C 2 , C 3 ) and relatively large (fullerenes) carbon clusters, as well as bulk crystalline forms of carbon (diamond, graphene where S energy ¼ E exp b 2 E calc b , l exp 2 l calc and a exp 2 a calc are the differences between the experimental and calculated values of the binding energies, bond lengths and valence angles, respectively, N e ; N l and N a are the numbers of terms in the corresponding sums. Note that the root-mean-square deviations of the calculated values of binding energies s energy , bond lengths s length and valence angles s angle from the experimental ones have the following form: In order to find the global minimum of the function S, we used the Monte-Carlo approach and Rosenbrock's method. [34] The strategy was to explore the neighbourhood of the last 'best' (i.e., having the minimum value of S) point in the parameter space among the points considered up to a given iteration step. The coordinates of each new point were determined using the generator of pseudo-random numbers. The size of the region for the search of new points was periodically changed, which is necessary for both refining the coordinates of the S minimum found at the preceding step and finding other, probably more deep minima. The parameters determined in this way are listed in Tables 1 and 2. Note that the total number of fitting parameters in the NTBM1 model for HZCZNZO systems equals to 54, while widely used semi-empirical AM1 and PM3/PM7 methods contain 61 [5] and 65 [6] parameters, respectively.

Binding energies and structures of H k C l N m O n molecules
To validate the model developed, we computed the binding energies, bond lengths and valence angles for various HZCZNZO compounds not used in the fitting procedure and compared them with the experimental data from Refs. [7,27], AM1/PM3 [7] and PM7 calculations. The PM7 geometrical and energy characteristics are obtained using MOPAC2012. [35,36] The binding energies E b of the H k C l N m O n molecules were determined as where where E 0 C ¼ 7:3768 eV, [4] E 0 H ¼ 2:2398 eV, E 0 N ¼ 4.8806 eV and E 0 O ¼ 2:5585 eV. [37] The results obtained for several selected HZCZNZO molecules and clusters are listed in Table 3 along with the corresponding experimental values and the values calculated using AM1/ PM3/PM7 and the set of the model parameters from Ref. [16]. The data for other HZCZNZO compounds studied are available in Supplementary Materials, Tables S1 -S3.
The root-mean-square deviations of binding energies s energy , bond lengths s length and valence angles s angle (see Equations (20) -(22)) calculated using AM1/PM3/PM7, nonorthogonal tight-binding model from Ref. [16], and NTBM1 model are summarised in Table 4. The numbers N e ; N l and N a are equal to 200, 125 and 49, respectively, for all above-mentioned methods, except for the nonorthogonal tight-binding model from Ref. [16], which is applicable to nitrogen-free H k C l O n systems only, and hence the values of N e ; N l and N a for this model were reduced to 128, 89 and 34, respectively.
From Table 4 one can see that the accuracy of the NTBM1 compares well with the AM1 and PM3/PM7 Table 2. Parameters describing the interactions between the hydrogen, carbon, nitrogen and oxygen atoms.

Conclusions
We present the set of HZCZNZO parameters for the nonorthogonal tight-binding model NTBM1. This model provides accurate description of interatomic interactions in H k C l N m O n compounds and allows one to simulate the structure and energetics of both small and relatively large molecules and clusters. Analytical expressions for the overlap integrals greatly simplify the calculation of forces acting on the atoms. This makes possible the molecular dynamics simulations of systems for which ab initio calculations are problematic due to the limited computer power. The model proposed compares well with widely used semi-empirical AM1 and PM3/PM7 models but contains less fitting parameters per atom and is less computer-time demanding. In future, we plan to extend the set of NTBM1 parameters to other elements (e.g. sulphur, phosphorus, boron). The program package based on this NTBM1 model and corresponding documentation are available for download at no cost from http://ntbm.info. We plan to support this on-line resource regularly and update the software periodically. We hope that the model presented will be useful to the specialists in quantumchemistry modelling.