Simoncini, David; Allouche, David; de Givry, Simon; Delmas, Céline; Barbe, Sophie; Schiex, Thomas

Guaranteed Discrete Energy Optimization on Large Protein Design Problems

In Computational Protein Design (CPD), assuming a rigid backbone and an amino-acid rotamer library, the problem of finding a sequence with an optimal conformation is NP-hard. In this paper, using Dunbrack’s rotamer library and the Talaris2014 decomposable energy function, we use an exact deterministic method combining branch and bound, arc consistency, and tree-decomposition to provably identify the global minimum energy sequence-conformation on full-redesign problems, defining search spaces of size up to 10^234. This is achieved on a single core of a standard computing server, requiring a maximum of 66 GB of RAM. A variant of the algorithm is able to exhaustively enumerate all sequence-conformations within an energy threshold of the optimum. These proven optimal solutions are then used to evaluate the frequencies and amplitudes, in energy and sequence, at which an existing CPD-dedicated simulated annealing implementation may miss the optimum on these full-redesign problems. The probability of finding an optimum drops close to 0 very quickly. In the worst case, despite 1,000 repeats, the annealing algorithm remained more than 1 Rosetta unit away from the optimum, leading to design sequences that could differ from the optimal sequence by more than 30% of their amino acids.

CPD; arc consistency; 1 Rosetta unit; annealing implementation; energy threshold; design sequences; 66 GB RAM; rotamer; Talaris 2014 decomposable energy function; redesign problems; Discrete Energy Optimization; Large Protein Design Problems; Computational Protein Design; annealing algorithm; search spaces

2015-12-08
https://acs.figshare.com/articles/dataset/Guaranteed_Discrete_Energy_Optimization_on_Large_Protein_Design_Problems/2101636
10.1021/acs.jctc.5b00594.s002