ORDerly benchmarks of chemical reactions
Benchmark datasets generated with ORDerly for chemical reaction prediction tasks
ORDerly-forward: Forward reaction prediction (predict reaction products given reactants, solvents, and agents)
ORDerly-retro: Retrosynthesis prediction (predict reactants given a desired product)
ORDerly-condition: Reaction condition prediction (predict solvents and agents given reactants and products). Note that reactions with rare solvents and agents (frequency <100) have been removed.
ORDerly-condition-with-rare: Reaction condition prediction (predict solvents and agents given reactants and products). Reactions with rare solvents and agents have not been removed.
Config: Contains the .log and .json files showing the parameters used in cleaning and the impact on dataset size after each cleaning step.
Note that all datasets here were created using the reaction string and our chemically informed logic to assign reaction roles.
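The rare-condition filter that separates ORDerly-condition from ORDerly-condition-with-rare can be illustrated with a minimal sketch. This is not the ORDerly implementation. The function name remove_rare_conditions, the dictionary keys "solvents" and "agents", and the choice to count solvents and agents together (once per reaction) are assumptions made for illustration; see the code repository for the actual cleaning logic.

```python
from collections import Counter


def remove_rare_conditions(reactions, min_frequency=100):
    """Drop reactions that use any solvent or agent seen in fewer than
    min_frequency reactions (hypothetical helper, not part of ORDerly)."""

    def conditions(reaction):
        # Assumption: each reaction is a dict with lists of SMILES strings.
        return set(reaction["solvents"]) | set(reaction["agents"])

    # Count in how many reactions each solvent or agent appears.
    counts = Counter(
        molecule for reaction in reactions for molecule in conditions(reaction)
    )

    # Keep a reaction only if every one of its conditions is frequent enough.
    return [
        reaction
        for reaction in reactions
        if all(counts[molecule] >= min_frequency for molecule in conditions(reaction))
    ]


# Toy data with a low threshold so the effect is visible.
reactions = [
    {"solvents": ["O"], "agents": ["[Na+].[OH-]"]},
    {"solvents": ["O"], "agents": []},
    {"solvents": ["CCO"], "agents": []},
]
kept = remove_rare_conditions(reactions, min_frequency=2)
print(len(kept))
```

In this toy example only the second reaction is kept: water appears in two reactions, while sodium hydroxide and ethanol each appear in one. With the default threshold of 100 the same logic would mirror the "frequency <100" cutoff described above, under the stated assumptions.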
Paper: https://chemrxiv.org/engage/chemrxiv/article-details/64ca5d3e4a3f7d0c0d78ca42
NeurIPS workshop paper: https://openreview.net/forum?id=R8FQMsECIS
Code: https://github.com/sustainable-processes/orderly
The supplementary datasets used for this work can be found here: https://doi.org/10.6084/m9.figshare.23502372.v3
For questions, feel free to contact me, Daniel Wigh, at dsw46@cam.ac.uk, or my supervisor, Alexei A. Lapkin.