Geometric Accuracy of Three-Dimensional Molecular Overlays

Chen, Qi; Higgs, Richard E.; Vieth, Michal

doi:10.1021/ci060134h.s001

ci060134h_si_001.doc (229.5 kB)

journal contribution

Posted on 2006-09-25; authored by Qi Chen, Richard E. Higgs, Michal Vieth

This study examines how molecular alignment accuracy depends on a variety of factors, including the choice of molecular template, alignment method, conformational flexibility, and type of protein target. We used eight test systems for which X-ray data on 145 ligand–protein complexes were available. The use of X-ray structures allowed an unambiguous assignment of bioactive overlays for each compound set. The alignment accuracy depended on multiple factors and ranged from 6% for flexible overlays to 73% for X-ray rigid overlays, when the conformation of the template ligand came from X-ray structures. The choice of templates and the choice of molecules to be aligned were the most significant factors in six and seven of the eight ligand–protein complex data sets, respectively. While finding little preference for the overlay method, we observed that the introduction of molecule flexibility resulted in a decrease of overlay accuracy in 50% of the cases. We derived rules to maximize the accuracy of alignment, leading to a more than 2-fold improvement in accuracy (from 19% to 48%). The rules also allowed the identification of compounds with a low (<5%) chance of being correctly aligned. Finally, the accuracy of the alignment derived without any use of X-ray conformers varied from <1% for the human immunodeficiency virus data set to 53% for the trypsin data set. We found that this accuracy was directly proportional to the product of the overlay accuracy obtained from the templates in their bioactive conformations and the chance of obtaining the correct bioactive conformation of the templates. This study provides a much-needed benchmark for the expectations of molecular alignment accuracy and shows appropriate usage and best practices to maximize hypothesis generation success.
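
The proportionality stated in the abstract can be illustrated with a minimal Python sketch. The function names, the proportionality constant `k` (set to 1 here), and the input values 0.6 and 0.3 are assumptions made for illustration only; they are not taken from the study. Only the 19% and 48% figures in the fold-improvement check come from the abstract.

```python
def expected_accuracy(template_overlay_accuracy, conformer_success_rate, k=1.0):
    """Estimate alignment accuracy without X-ray conformers.

    Follows the relationship described in the abstract: accuracy is
    proportional to (overlay accuracy with bioactive template conformations)
    times (chance of obtaining the correct bioactive template conformation).
    The constant k is an assumption; the abstract does not report its value.
    """
    for name, value in (
        ("template_overlay_accuracy", template_overlay_accuracy),
        ("conformer_success_rate", conformer_success_rate),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return k * template_overlay_accuracy * conformer_success_rate


def fold_improvement(before, after):
    """Ratio of two accuracies, e.g. before and after applying selection rules."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


if __name__ == "__main__":
    # Hypothetical inputs, not values from the study.
    estimate = expected_accuracy(0.6, 0.3)
    print(f"estimated accuracy: {estimate:.2f}")

    # Accuracies quoted in the abstract: 19% before the rules, 48% after.
    print(f"fold improvement: {fold_improvement(0.19, 0.48):.2f}")
```

Because the two factors multiply, a low chance of recovering the bioactive template conformation caps the achievable accuracy even when the overlay step itself performs well.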