Posted on 2024-01-26, 18:07; authored by Patrick W. V. Butler, Roohollah Hafizi, Graeme M. Day
A primary challenge in organic molecular crystal structure prediction
(CSP) is accurately ranking the energies of potential structures.
While high-level solid-state density functional theory (DFT) methods
allow for mostly reliable discrimination of the low-energy structures,
their high computational cost is problematic because of the need to
evaluate tens to hundreds of thousands of trial crystal structures
to fully explore typical crystal energy landscapes. Consequently,
lower-cost but less accurate empirical force fields are often used,
sometimes as the first stage of a hierarchical scheme involving multiple
stages of increasingly accurate energy calculations. Machine-learned
interatomic potentials (MLIPs), trained to reproduce the results of
ab initio methods with computational costs close to those of force
fields, can improve the efficiency of CSP by reducing or eliminating
the need for costly DFT calculations. Here, we investigate active
learning methods for training MLIPs with CSP datasets. Combining
active learning with the well-developed sampling methods from CSP
yields, in a highly automated workflow, potentials that are applicable
across a wide range of crystal packing space. To demonstrate these
potentials, we use them to efficiently rerank large, diverse crystal
structure landscapes from force field-based CSP to near-DFT accuracy,
improving the reliability of the final energy ranking. Furthermore,
we demonstrate how these potentials can be extended to more accurately
model structures far from lattice energy minima through additional
on-the-fly training within Monte Carlo simulations.
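
The active learning idea summarized above can be illustrated with a minimal sketch. It is not the authors' workflow. It assumes a toy one-dimensional "structure descriptor", a hypothetical `reference_energy` function standing in for a DFT calculation, and a leave-one-out committee of polynomial fits standing in for an MLIP. Committee disagreement is used as the uncertainty estimate, which is one common choice and not necessarily the one used in the paper. In each round, the pool structure with the largest disagreement is labelled and added to the training set.

```python
import numpy as np

def reference_energy(x):
    # Hypothetical stand-in for an expensive DFT lattice-energy calculation.
    return np.sin(3.0 * x) + 0.5 * x**2

def fit_committee(x, y, degree=3):
    # Leave-one-out committee: each member is fitted with one training point omitted.
    return [np.polyfit(np.delete(x, i), np.delete(y, i), degree)
            for i in range(len(x))]

def committee_predict(models, x):
    # Mean prediction and standard deviation (disagreement) across the committee.
    preds = np.array([np.polyval(m, x) for m in models])
    return preds.mean(axis=0), preds.std(axis=0)

def active_learning(pool, n_initial=6, n_rounds=10, tol=0.05, seed=0):
    # Start from a random subset of the pool (toy analogue of CSP-generated structures).
    rng = np.random.default_rng(seed)
    labelled = list(rng.choice(len(pool), size=n_initial, replace=False))
    for _ in range(n_rounds):
        x_train = pool[labelled]
        models = fit_committee(x_train, reference_energy(x_train))
        _, spread = committee_predict(models, pool)
        spread[labelled] = 0.0          # never re-select labelled structures
        worst = int(np.argmax(spread))  # most uncertain unlabelled structure
        if spread[worst] < tol:         # committee agrees everywhere: stop early
            break
        labelled.append(worst)          # "run DFT" on it in the next round
    # Refit on the final training set so the returned models include every label.
    x_train = pool[labelled]
    models = fit_committee(x_train, reference_energy(x_train))
    return models, labelled

pool = np.linspace(-2.0, 2.0, 200)
models, labelled = active_learning(pool)
```

In a real setting the pool would be the force-field CSP landscape, the committee would be replaced by the MLIP's own uncertainty measure, and the final model would be used to rerank the full landscape.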