What are the most important features contributing to xylanase thermostability? Applying a feature selection modeling

Xylan is the main component of hemicellulose, which is present in nature in large amounts. It can be degraded by either acid or enzymatic catalysis, the latter offering a highly efficient conversion rate under non-corrosive and environmentally friendly conditions. Although the complete breakdown of xylan requires the action of several different enzymes, the depolymerizing endo-1,4-β-xylanase (EC is the key enzyme, with possible applications in waste treatment, fuel and chemical production, and paper manufacture. Consequently, the importance of finding or making thermostable xylanases has been highlighted, and it is essential to understand the features involved in xylanase thermostability. Here, we examined more than seventy attributes of 30 xylanase proteins (active at different temperatures) by applying a feature selection algorithm that assigns a p value to each attribute, based on the asymptotic distribution of a transformation of the Pearson correlation coefficient, and then sorts the attributes by p value to find those contributing most to the thermostability of xylanase proteins. The results showed that the counts of oxygen, nitrogen, Glu, Lys, Cys, Phe and Trp, the counts of positively and negatively charged residues, and the count of other residues were the most important features with respect to xylanase thermostability. Twelve more properties were found to have a marginal effect, while the rest were found to be unimportant. The roles of the "important" and "marginal" features in xylanase thermostability are discussed in this paper.

PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology ; Chetty, Madhu ; Ahmad, Shandar ; Ngom, Alioune ; Teng, Shyh Wei ; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia)
Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.
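
The feature selection step described in the abstract (a p value per attribute derived from a transformation of the Pearson correlation coefficient, followed by sorting on that p value) can be illustrated with a minimal Python sketch. The abstract does not name the transformation, so the Fisher z-transformation used here is an assumption, chosen because it is asymptotically normal; the function names `pearson_r`, `fisher_p_value` and `rank_features`, and the toy numbers in the usage section, are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p value for H0: rho = 0 via the Fisher z-transformation.

    ASSUMPTION: the abstract only says "a transformation"; Fisher's z is
    one common choice, not necessarily the one the authors used.
    """
    if n < 4:
        raise ValueError("Fisher's z needs at least 4 observations")
    # Clamp r so atanh stays finite for perfectly correlated columns.
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)
    z = math.atanh(r) * math.sqrt(n - 3)  # approximately N(0, 1) under H0
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area


def rank_features(features, target):
    """Return (name, r, p) tuples sorted from smallest to largest p value."""
    n = len(target)
    scored = []
    for name, values in features.items():
        r = pearson_r(values, target)
        scored.append((name, r, fisher_p_value(r, n)))
    return sorted(scored, key=lambda item: item[2])


if __name__ == "__main__":
    # Hypothetical toy data: six proteins, optimum temperature as the target.
    optimum_temp = [40, 50, 60, 70, 80, 90]
    attributes = {
        "count_Glu": [1, 2, 3, 4, 5, 6],
        "count_Ala": [3, 1, 4, 1, 5, 9],
        "constant": [7, 7, 7, 7, 7, 7],
    }
    for name, r, p in rank_features(attributes, optimum_temp):
        print(f"{name}: r={r:.3f}, p={p:.4f}")
```

Attributes with the smallest p values appear first in the ranking. Any cut-offs for labelling attributes "important", "marginal" or "unimportant" would have to come from the paper itself; none are assumed here.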