FusNet: an advanced fusion neural network for predicting diverse protein-mediated loops exclusively from genomic DNA sequences
The three-dimensional (3D) organization of the genome is intricately structured and requires an array of proteins to ensure the accurate formation and regulation of chromatin loops and higher-order structures. Investigating protein-mediated loops provides insight into how these loops interact and into their pivotal roles in modulating gene expression across distinct cellular environments. However, current targeted chromatin conformation capture assays are limited in their ability to identify protein-mediated loops. Moreover, existing computational methods not only fall short in predicting the diverse spectrum of protein-mediated loops but also require extensive epigenomic datasets, which limits their applicability. To address these challenges, we present FusNet, an advanced fusion neural network designed to predict loop anchors and the loops formed between them, requiring only genomic DNA sequences as input. To systematically extract salient features associated with sequence interactions, FusNet applies convolutional and transformer operations. To enhance the accuracy and generalization capacity of FusNet, we built a fusion layer that stacks the predictions of the base models with a meta-model. FusNet proved effective in predicting protein-mediated loops, and its predictions were highly consistent with Hi-C data. Moreover, we found that the loops predicted by FusNet are strongly associated with regulatory functions. Through association analysis with genetic risk variants, FusNet further showed potential for unraveling disease-related mechanisms. In conclusion, our study offers a novel computational approach for predicting various protein-mediated chromatin loops, which could substantially enhance research on the functional significance of chromatin loop structures across diverse cellular contexts.
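To make the two ideas named above concrete (DNA sequence as the only input, and a fusion layer that stacks base-model predictions with a meta-model), the following is a minimal sketch and not the FusNet implementation. It assumes a one-hot encoding with channel order A, C, G, T, two hypothetical base models whose loop probabilities are given as toy numbers, and a logistic-regression meta-model; none of these details are specified in the abstract, and all function names are illustrative.

```python
import math

# Assumption: channel order A, C, G, T (not specified in the abstract).
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats; unknown bases (e.g. N) map to all zeros."""
    rows = []
    for base in seq.upper():
        row = [0.0] * len(BASES)
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_meta_model(base_preds, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-model on base-model probabilities
    using plain batch gradient descent (a stand-in for any meta-learner)."""
    n_features = len(base_preds[0])
    n = len(labels)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(base_preds, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_meta(weights, bias, x):
    """Fused loop probability from one vector of base-model probabilities."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Toy, made-up held-out predictions from two hypothetical base models
# (e.g. a convolutional and a transformer branch) for six anchor pairs.
base_preds = [[0.9, 0.8], [0.8, 0.7], [0.2, 0.3],
              [0.1, 0.4], [0.7, 0.9], [0.3, 0.1]]
labels = [1, 1, 0, 0, 1, 0]  # 1 = loop, 0 = no loop (illustrative labels)

weights, bias = fit_meta_model(base_preds, labels)
encoded_anchor = one_hot("ACGTN")  # 5 positions x 4 channels
fused_probability = predict_meta(weights, bias, [0.85, 0.75])
```

In a stacking setup of this kind, the meta-model is normally fitted on base-model predictions for examples the base models did not see during training, so that it learns how much to trust each model rather than memorizing their training-set outputs.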