HMRLBA_V1.0
HMRLBA
This repository contains the data and code for the HMRLBA model. HMRLBA is a hierarchical multi-scale representation learning model for predicting protein-ligand binding affinity.
Main files
Datasets:
a). Raw_data: Three PDBbind v2019 benchmark datasets, the CASF-2016 dataset from PDBbind v2016, a filtered dataset from BindingDB, and an enzyme classification dataset.
b). Hard_samples: 21 hard samples.
c). Virtual screening: 1). SMILES strings of 2616 FDA-approved drugs and 18 EGFR inhibitors. 2). A BindingDB dataset with 69 test samples, seven of which are compounds that bind specifically to the target protein Dot1L (PDB ID 1NW3).
d). PDB_id_list: The list of proteins in each dataset split.
hmrlba_code: Main code file for the HMRLBA model.
PLMs: Three protein language models: ESM-1b, Ankh, and ProtTrans.
SOTA: Baseline methods used in the comparison experiments.
Configuration
A conda environment with Python 3.7 is recommended. The main dependencies are:
- pytorch (1.9.0), torch-geometric (2.0.4), dgl-cu111 (0.6.1), cudatoolkit (11.1.74)
- msms (2.6.1), dssp (3.0.0), blender (3.5.1), pdb2pqr (2.1.1), biopython (1.79), rdkit (2023.3.1), transformers (4.24.0), wandb (0.15.4), pymesh2 (0.3), pdbfixer (1.6)
See environment.yaml for details.
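
After the environment is created, the pinned versions above can be checked against what is actually installed. The following is a minimal sketch and not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the sketch assumes the packages are visible to Python's packaging metadata under the names shown (for example, the conda package `pytorch` is assumed to appear as `torch`). Tools that are not Python distributions, such as msms, dssp, or blender, are not covered.

```python
# Pinned versions copied from the dependency list above.
# Assumption: these are the names Python's packaging metadata uses,
# which may differ from the conda package names.
EXPECTED = {
    "torch": "1.9.0",
    "torch-geometric": "2.0.4",
    "biopython": "1.79",
    "transformers": "4.24.0",
    "wandb": "0.15.4",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.7 (needs setuptools)

        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop local build tags so that "1.9.0+cu111" still matches "1.9.0".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Run as a script inside the activated environment, it prints one line per package that is missing or at a different version, and prints nothing when every listed pin matches. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.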