pcbi.1011655.g002.tif (193.31 kB)

Optimization for the latent dimension for BPTI (left panel) and EGFR (right panel).

posted on 2023-11-27, 19:10 authored by Hoda Akl, Brooke Emison, Xiaochuan Zhao, Arup Mondal, Alberto Perez, Purushottam D. Dixit

The optimized value $\Delta^2 = \left(\langle H_{\min}\rangle_{\text{generated}} - \langle H_{\min}\rangle_{\text{natural}}\right)^2$ (y-axis) is plotted against the latent dimension $K$ (x-axis). For each sequence in an ensemble (natural or generated), $H_{\min}$ is the minimum fractional Hamming distance from that sequence to the natural sequences. Each box plot summarizes 10 runs at a given latent dimension, each run with a different random initialization. The optimal latent dimension is the one that minimizes the average value of $\Delta^2$ across the 10 runs. The optimal dimension is 42 for BPTI and 19 for EGFR.
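The quantities in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, sequences are assumed to be aligned strings of equal length, and a natural sequence is assumed to be excluded from its own comparison set (otherwise its minimum distance would trivially be zero). The caption does not state how that case is handled.

```python
from statistics import mean


def fractional_hamming(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_h_min(ensemble, natural, exclude_self=False):
    """Average over the ensemble of each sequence's minimum distance to the natural set.

    With exclude_self=True, sequence i of the ensemble is not compared with
    sequence i of the natural set (assumption: used when ensemble is the natural set).
    """
    h_mins = []
    for i, seq in enumerate(ensemble):
        distances = [
            fractional_hamming(seq, ref)
            for j, ref in enumerate(natural)
            if not (exclude_self and i == j)
        ]
        h_mins.append(min(distances))
    return mean(h_mins)


def delta_squared(generated, natural):
    """Squared difference between the generated and natural averages of H_min."""
    gen = mean_h_min(generated, natural)
    nat = mean_h_min(natural, natural, exclude_self=True)
    return (gen - nat) ** 2


def best_latent_dimension(delta_sq_by_k):
    """Return the K whose Delta^2 values, averaged over runs, are smallest.

    delta_sq_by_k maps each latent dimension K to a list of Delta^2 values,
    one per run (10 runs per K in the figure).
    """
    return min(delta_sq_by_k, key=lambda k: mean(delta_sq_by_k[k]))
```

In this sketch, `delta_squared` would be evaluated once per trained model, and `best_latent_dimension` would then be applied to the per-run values collected for each K.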
