tbsd_a_770371_sm1530.pdf (475.56 kB)

Identifying subset errors in multiple sequence alignments

Version 2 2014-01-03, 14:17
Version 1 2014-01-20, 15:48
journal contribution
posted on 2014-01-03, 14:17 authored by Aparna Roy, Bruck Taddese, Shabana Vohra, Phani K. Thimmaraju, Christopher J.R. Illingworth, Lisa M. Simpson, Keya Mukherjee, Christopher A. Reynolds, Sree V. Chintapalli

Multiple sequence alignment (MSA) accuracy is important, but there is no widely accepted method for judging the accuracy of the alignments that different algorithms produce. We present a simple approach to detecting two types of error, namely block shifts and the misplacement of residues within a gap. Given an MSA, subsets of very similar sequences are generated with a redundancy filter, typically using a 70–90% sequence identity cut-off. The subsets thus produced are typically small and degenerate, so errors can be detected easily, even by manual examination. The errors, albeit minor, are inevitably associated with gaps in the alignment, and so the procedure is particularly relevant to homology modelling of protein loop regions. The usefulness of the approach is illustrated in the context of the universal but little-known [K/R]KLH motif that occurs in intracellular loop 1 of G protein-coupled receptors (GPCRs); other issues relevant to GPCR modelling are also discussed.
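
The subset-generation step described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The function names (pairwise_identity, similar_subsets, gap_disagreements) are hypothetical. Identity is computed over columns where both sequences have a residue, which is one of several possible definitions. A greedy grouping against the first member of each subset stands in for the redundancy filter. Columns where members of a subset disagree on gap versus residue are reported only as candidates for the manual examination the abstract mentions.

```python
from typing import Dict, List

GAP = "-"


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither row has a gap.

    Assumption: this definition of identity is chosen for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def similar_subsets(msa: Dict[str, str], cutoff: float = 0.8) -> List[List[str]]:
    """Greedy grouping: a sequence joins the first subset whose first member
    it matches at or above the cutoff; otherwise it starts a new subset."""
    subsets: List[List[str]] = []
    for name, seq in msa.items():
        for subset in subsets:
            representative = msa[subset[0]]
            if pairwise_identity(seq, representative) >= cutoff:
                subset.append(name)
                break
        else:
            subsets.append([name])
    return subsets


def gap_disagreements(msa: Dict[str, str], subset: List[str]) -> List[int]:
    """0-based columns where some, but not all, subset members have a gap."""
    rows = [msa[name] for name in subset]
    flagged = []
    for index, column in enumerate(zip(*rows)):
        gaps = sum(1 for residue in column if residue == GAP)
        if 0 < gaps < len(column):
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    # Toy alignment (invented data): s2 places a residue on the other side of a gap.
    toy = {
        "s1": "MKLH-AVT",
        "s2": "MKLHA-VT",
        "s3": "MKLH-AVT",
        "s4": "GGPPWWQQ",
    }
    for group in similar_subsets(toy, cutoff=0.8):
        print(group, gap_disagreements(toy, group))
```

In the toy data, s1, s2 and s3 fall into one subset and columns 4 and 5 are flagged, because the near-identical sequences disagree about where the gap sits. In a real alignment, such columns would be the places to inspect by eye.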


    Journal of Biomolecular Structure and Dynamics
