John studied Chemistry at the University of Bath, from where he graduated in 1987. He then studied for a PhD at Birkbeck College, University of London, in the Department of Crystallography. Whilst there he was involved in developing automated approaches to protein modelling, contributing to the development of the software programs COMPOSER and MODELLER; however, his major research was on sequence-structure relationships, exploring the constraints applied by the local physical environment of a residue on its mutation patterns (JOY and HOMSTRAD). After completing his PhD, John held a postdoctoral position at the Imperial Cancer Research Fund (ICRF, now CRUK), extending this research. John then joined Pfizer, originally as a computational chemist, progressing to a role where he led a multidisciplinary group combining rational drug design with structural biology. During this time, John became fascinated by the reasons for target/drug attrition, by target validation, and by the falling productivity of the entire pharmaceutical industry. John then moved to a small biotech company, Inpharmatica, where he and his colleagues developed a series of platforms to improve drug discovery, including the SAR database StARLite. In 2008 John was centrally involved in the transfer of this database to the EMBL-EBI, where the successor is known as ChEMBL, a large Open database of drug discovery data. More recently, the work has extended into patent informatics with the patent database SureChEMBL. John holds an honorary chair at University College London within the Institute of Cardiovascular Science at the Farr Institute. Most recently John joined Stratified Medical, a biotech company in central London, where he applies artificial intelligence algorithms to drug discovery challenges and continues his translational informatics research.

Publications

  • myChEMBL: a virtual machine implementation of open data and cheminformatics tools. DOI: 10.1093/bioinformatics/btt666
  • The ChEMBL bioactivity database: an update. DOI: 10.1093/nar/gkt1031
  • The functional therapeutic chemical classification system. DOI: 10.1093/bioinformatics/btt628
  • Target Prediction for an Open Access Set of Compounds Active against Mycobacterium tuberculosis. DOI: 10.1371/journal.pcbi.1003253
  • Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets. PMID: 24059743
  • Brain: biomedical knowledge manipulation. DOI: 10.1093/bioinformatics/btt109
  • The EBI enzyme portal. DOI: 10.1093/nar/gks1112
  • UniChem: a unified chemical structure cross-referencing and identifier tracking system. DOI: 10.1186/1758-2946-5-3
  • Annotating Human P-Glycoprotein Bioassay Data. DOI: 10.1002/minf.201200059
  • ChEMBL: a large-scale bioactivity database for drug discovery. DOI: 10.1093/nar/gkr777
  • Global analysis of small molecule binding to related protein targets. DOI: 10.1371/journal.pcbi.1002333
  • Mapping small molecule binding data to structural domains. DOI: 10.1186/1471-2105-13-S17-S11
  • Toxicogenomics Investigation Under the eTOX Project OTHER_ID: c6903
  • Collation and data-mining of literature bioactivity data for drug discovery. DOI: 10.1042/BST0391365
  • Minimum information about a bioactive entity (MIABE). DOI: 10.1038/nrd3503
  • PSICQUIC and PSISCORE: accessing and scoring molecular interactions. DOI: 10.1038/nmeth.1637
  • Chemogenomics approaches for receptor deorphanization and extensions of the chemogenomics concept to phenotypic space. DOI: 10.2174/156802611796391230
  • Rapid analysis of pharmacology for infectious diseases. DOI: 10.2174/156802611795429130
  • Ligand efficiency indices for an effective mapping of chemico-biological space: the concept of an atlas-like representation. DOI: 10.1016/j.drudis.2010.08.004
  • The genome of the blood fluke Schistosoma mansoni. DOI: 10.1038/nature08160
  • ChEMBL. An interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Interview by Wendy A. Warr. DOI: 10.1007/s10822-009-9260-9
  • New open drug activity data at EBI. DOI: 10.1186/1752-153X-3-S1-O3
  • Genomic-scale prioritization of drug targets: the TDR Targets database. DOI: 10.1038/nrd2684
  • Can we rationally design promiscuous drugs? DOI: 10.1016/j.sbi.2006.01.013
  • PDBLIG: classification of small molecular protein binding in the Protein Data Bank. DOI: 10.1021/jm040804f
  • Protein sequence analysis in silico: application of structure-based bioinformatics to genomic initiatives. DOI: 10.1016/S1471-4892(02)00202-3
  • Synthesis of macrocyclic, potential protease inhibitors using a generic scaffold. DOI: 10.1021/jo025615o
  • Design of selective thrombin inhibitors based on the (R)-Phe-Pro-Arg sequence. DOI: 10.1021/jm011133d
  • Chapter 19. Expanding and exploring cellular pathways for novel drug targets. DOI: 10.1016/S0065-7743(02)37020-9
  • Nicastrin, a presenilin-interacting protein, contains an aminopeptidase/transferrin receptor superfamily domain. DOI: 10.1016/S0968-0004(01)01789-3
  • HOMSTRAD: a database of protein structure alignments for homologous families. PMID: 9828015
  • Protein three-dimensional structural databases: domains, structurally aligned homologues and superfamilies. DOI: 10.1107/S0907444998007148
  • JOY: protein sequence-structure representation and analysis. DOI: 10.1093/bioinformatics/14.7.617
  • Discrimination of common protein folds: application of protein structure to sequence/structure comparisons. DOI: 10.1016/S0076-6879(96)66036-4
  • Derivation of rules for comparative protein modeling from a database of protein structure alignments. PMID: 7833817
  • Comparative modelling of major house dust mite allergen Der p I: structure validation using an extended environmental amino acid propensity table. DOI: 10.1093/protein/7.7.869
  • The prediction and orientation of alpha-helices from sequence alignments: the combined use of environment-dependent substitution tables, Fourier transform methods and helix capping rules. DOI: 10.1093/protein/7.5.645
  • Molecular recognition in protein families: a database of aligned three-dimensional structures of related proteins. PMID: 8224474
  • Alignment and searching for common protein folds using a data bank of structural templates. DOI: 10.1006/jmbi.1993.1323
  • Modelling of the lignin peroxidase LIII of Phlebia radiata: use of a sequence template generated from a 3-D structure. DOI: 10.1093/protein/6.2.177
  • Fragment ranking in modelling of protein structure. Conformationally constrained environmental amino acid substitution tables. DOI: 10.1006/jmbi.1993.1018
  • Modeling alpha-helical transmembrane domains: the calculation and use of substitution tables for lipid-facing residues. PMID: 8443590
  • Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. PMID: 1304904
  • Structural constraints on residue substitution. PMID: 1368278
  • Structure-function relationships in the cysteine proteinases actinidin, papain and papaya proteinase omega. Three-dimensional structure of papaya proteinase omega deduced by knowledge-based modelling and active-centre characteristics determined by two-hydronic-state reactivity probe kinetics and kinetics of catalysis. PMID: 1741760
  • From the comparative analysis of proteins to knowledge based modeling. DOI: 10.1016/0263-7855(90)80017-A
  • Three-dimensional structure and thiol reactivity characteristics of chymopapain M (papaya proteinase IV). PMID: 2083746
  • Three-dimensional structure of a B-type chymopapain. PMID: 2083745
  • Investigation of mechanistic consequences of natural structural variation within the cysteine proteinases by knowledge-based modelling and kinetic methods. PMID: 2276446
  • Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. PMID: 1978340
  • From comparisons of protein sequences and structures to protein modelling and design. DOI: 10.1016/0968-0004(90)90036-B
  • An assessment of COMPOSER: a rule-based approach to modelling protein structure. PMID: 2099735
  • From comparative structure analysis to protein engineering: knowledge-based protein modelling and design (in Symposium 1: Structure and engineering of proteins: New Developments). DOI: 10.1007/BF00325709
  • Symposium 1: Structure and engineering of proteins: New developments
  • X-ray analysis of HIV-1 proteinase at 2.7 Å resolution confirms structural homology among retroviral enzymes. DOI: 10.1038/342299a0
  • Protein engineering and design. PMID: 2573083
  • A ligand's-eye view of protein similarity. DOI: 10.1038/nmeth.2339
  • A Structural Basis for Sequence Comparisons. DOI: 10.1006/jmbi.1993.1548
  • Chapter 28. Recent development in cheminformatics and chemogenomics. DOI: 10.1016/s0065-7743(03)38029-7
  • Cheminformatics. DOI: 10.1145/2366316.2366334
  • Comparison of three-dimensional structures of homologous proteins. DOI: 10.1016/0959-440x(92)90231-u
  • Comparison of three-dimensional structures of homologous proteins. DOI: 10.1016/0960-9822(92)90075-l
  • Derivation of rules for comparative protein modeling from a database of protein structure alignments. DOI: 10.1002/pro.5560030923
  • From modelling homologous proteins to prediction of structure OTHER_ID: c6887
  • How many drug targets are there? DOI: 10.1038/nrd2199
  • Insights into protein function through large-scale computational analysis of sequence and structure. DOI: 10.1016/s0167-7799(01)01794-2
  • Insights into protein function through large-scale computational analysis of sequence and structure. DOI: 10.1016/s0167-7799(01)00011-7
  • Knowledge-based protein modelling and design. DOI: 10.1111/j.1432-1033.1988.tb13917.x
  • Knowledge-Based Protein Modelling: Human Plasma Kallikrein and Human Neutrophil Defensin OTHER_ID: c6943
  • Modeling α-helical transmembrane domains: The calculation and use of substitution tables for lipid-facing residues. DOI: 10.1002/pro.5560020106
  • MODELLER: A Program for Protein Structure Modeling OTHER_ID: c6886
  • MODULATING CELL ACTIVITY BY USING AN AGENT THAT REDUCES THE LEVEL OF CHOLESTEROL WITHIN A CELL OTHER_ID: WO2005023305
  • NICASTRIN PROTEIN OTHER_ID: WO0229023
  • Open data for drug discovery: learning from the biological community. DOI: 10.4155/fmc.12.159
  • Pleiotropic Effects of Statins. DOI: 10.1016/s0065-7743(04)39019-6
  • Prioritizing the proteome: identifying pharmaceutically relevant targets. DOI: 10.1016/s1359-6446(02)02250-x
  • Probing the links between in vitro potency, ADMET and physicochemical parameters. DOI: 10.1038/nrd3367
  • Role of open chemical data in aiding drug discovery and design. DOI: 10.4155/fmc.10.191
  • Structural Constraints on Residue Substitution. DOI: 10.1007/978-1-4615-3424-2_13
  • The comparison of structures and sequences: alignment, searching and the detection of common folds. DOI: 10.1109/HICSS.1994.323567
  • The Molecular Basis of Predicting Druggability. DOI: 10.1002/9783527619368.ch36
  • ChEMBL web services: streamlining access to drug discovery data and utilities. PMID: 25883136
  • Mycobacterial dihydrofolate reductase inhibitors identified using chemogenomic methods and in vitro validation. PMID: 25799414
  • UniChem: extension of InChI-based compound mapping to salt, connectivity and stereochemistry layers. PMID: 25221628
  • diXa: a Data Infrastructure for Chemical Safety Assessment. PMID: 25505093
  • A document classifier for medicinal chemistry publications trained on the ChEMBL corpus. PMID: 25221627
  • A community computational challenge to predict the activity of pairs of compounds. PMID: 25419740
  • Antibody informatics for drug discovery. PMID: 25110827
  • Towards predictive resistance models for agrochemicals by combining chemical and protein similarity via proteochemometric modelling. PMID: 25320644
  • PPDMs-a resource for mapping small molecule bioactivities from ChEMBL to Pfam-A protein domains. PMID: 25348214
  • Transporter assays and assay ontologies: useful tools for drug discovery. PMID: 25027375
  • An atlas of genetic influences on human blood metabolites. PMID: 24816252
  • A community effort to assess and improve drug sensitivity prediction algorithms. PMID: 24880487
  • 'Big data' in pharmaceutical science: challenges and opportunities. PMID: 24962278
  • Chemical, target, and bioactive properties of allosteric modulation. PMID: 24699297
  • The ChEMBL database: a taster for medicinal chemists. PMID: 24635517
  • Shouldn't enantiomeric purity be included in the minimum information about a bioactive entity? Response from the MIABE group OTHER_ID: c7311
  • A structural basis for sequence comparisons. An evaluation of scoring methodologies. PMID: 8411177

Co-workers & collaborators

Gerard JP Van westen

Full Professor of AI & Medicinal Chemistry - Leiden (NL)

Alex Gutteridge

John Overington's public data