Protein Family Classification with Partial Least Squares

2007-02-02T00:00:00Z (GMT) by Stephen O. Opiyo, Etsuko N. Moriyama
The quality of protein function predictions depends on appropriate training of protein classification methods. The performance of these methods can degrade when only a limited number of protein samples are available, which is often the case in divergent protein families. Whereas profile hidden Markov models and PSI-BLAST showed a significant decrease in performance in such cases, alignment-free partial least squares classifiers performed consistently better, even when used to identify short, fragmented sequences.

Keywords: partial least squares • physico-chemical properties • amino acid composition • profile hidden Markov model • G-protein-coupled receptors
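To illustrate what "alignment-free" can mean in practice, here is a minimal sketch of one descriptor named in the keywords, amino acid composition. It covers only the feature-extraction step and is not the authors' implementation. The function name `aa_composition` and the example fragment are hypothetical. A partial least squares classifier would then be trained on vectors like these, typically alongside physico-chemical descriptors.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every sequence
# maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    No alignment is needed: the result is a fixed-length vector
    (20 values) regardless of sequence length, so short fragments
    and full-length proteins can be compared directly.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X are skipped.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical short fragment, used only for illustration.
fragment = "MKTAYIAKQR"
features = aa_composition(fragment)
```

Because the vector length does not depend on the input length, the same descriptor can be computed for a short fragment and for a full-length sequence.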