High-temperature multi-element 2021 (HME21) dataset
HME21 is an atomic structure dataset intended for the development of neural network potentials. It was created during the development of PFP, a universal neural network potential for material discovery.
Each structure contains multiple elements, and the structures were sampled through high-temperature molecular dynamics simulations. The HME21 dataset covers a total of 37 elements: H, Li, C, N, O, F, Na, Mg, Al, Si, P, S, Cl, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Mo, Ru, Rh, Pd, Ag, In, Sn, Ba, Ir, Pt, Au, and Pb.
All structures were calculated with spin-polarized DFT using the PBE exchange-correlation functional as implemented in VASP version 5.4.4. All structures are under periodic boundary conditions.
For details of the DFT calculation conditions and the structure sampling method, please see the reference.
Please cite the reference if you use this dataset.
HME21 consists of three files in extxyz format:
- train.xyz: 19956 structures
- valid.xyz: 2498 structures
- test.xyz: 2495 structures
The structures were randomly split into training, validation, and test subsets at a ratio of 8:1:1. These subsets serve as the training, validation, and test datasets for benchmarking neural network potentials.
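As a quick sanity check on the split described above, the following sketch computes each file's share of the dataset from the structure counts listed earlier. The variable names are illustrative only; the counts are the ones stated in this description.

```python
# Structure counts per file, as listed in this description.
counts = {"train.xyz": 19956, "valid.xyz": 2498, "test.xyz": 2495}

# Total number of structures across the three files.
total = sum(counts.values())

# Fraction of the dataset held by each file (nominally 0.8 / 0.1 / 0.1).
fractions = {name: n / total for name, n in counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {counts[name]} structures ({frac:.4f} of {total})")
```

The fractions differ slightly from exactly 0.8, 0.1, and 0.1 because the total is not divisible by 10.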
The target values are the energy and the atomic forces. The energy is shifted so that the energy of a single isolated atom in vacuum is zero. Lengths are in angstroms (10^−10 m), and energies are in electronvolts (eV). As a supplement, vasp_shift_energies.json, which contains the single-atom reference energy for each element, is also included.
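The energy shift can be illustrated with a minimal sketch. It assumes that the shifted energy equals the raw DFT energy minus the sum of the single-atom reference energies of all atoms in the structure, and that vasp_shift_energies.json maps element symbols to reference energies in eV. Both the JSON layout and the sign convention are assumptions that should be checked against the file itself. The function names and the numbers in the example are hypothetical.

```python
import json
from collections import Counter


def load_reference_energies(path):
    """Load per-element reference energies.

    Assumption: the JSON file is a flat mapping such as {"H": ..., "O": ...}.
    """
    with open(path) as f:
        return json.load(f)


def reference_offset(symbols, reference_energies):
    """Sum of single-atom reference energies for a list of element symbols."""
    element_counts = Counter(symbols)
    return sum(reference_energies[el] * n for el, n in element_counts.items())


def shift_energy(raw_energy, symbols, reference_energies):
    """Raw DFT energy -> shifted energy (isolated atoms sit at zero)."""
    return raw_energy - reference_offset(symbols, reference_energies)


def unshift_energy(shifted_energy, symbols, reference_energies):
    """Shifted energy -> raw DFT energy (inverse of shift_energy)."""
    return shifted_energy + reference_offset(symbols, reference_energies)


# Hypothetical numbers for illustration only, not values from the dataset.
example_refs = {"H": -1.0, "O": -2.0}
example_symbols = ["H", "H", "O"]
example_shifted = shift_energy(-10.0, example_symbols, example_refs)
example_raw = unshift_energy(example_shifted, example_symbols, example_refs)
```

Under these assumptions, a single isolated atom has a shifted energy of zero by construction, and unshift_energy recovers the raw value when a model's predictions need to be compared with unshifted DFT energies.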