A novel protein motif finding algorithm for classification of the ligase subfamilies

Extracting motifs from a protein family or subfamily remains an active research topic in bioinformatics. Motif extraction not only helps in understanding protein function and in predicting the class to which an unknown protein sequence belongs, but also supports the study of protein-protein interactions. In this paper, we present a novel algorithm for extracting the motifs of a subfamily, based on feature selection and position connection. Position connection is applied to generate motifs, and these are used in a hybrid method with a vote-based decision mechanism to construct a classifier for the ligase subfamilies. In tests on the database, a predictive accuracy of more than 95.87% is achieved, which demonstrates that the method is practical. In addition, the method shows that motifs play an important role in classifying proteins and in studying the characteristics of the families and subfamilies in a protein database.

PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology; Chetty, Madhu; Ahmad, Shandar; Ngom, Alioune; Teng, Shyh Wei; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia)
Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.