A Hybrid Classifier for Protein Secondary Structure Prediction
Saad Osman Abdalla Subair,
Mohd Saberi Mohamad
Advances in molecular biology over the last few decades, together with the availability of equipment in this field, have led to the rapid sequencing of the genomes of several species. These large genome sequencing projects generate a huge number of protein sequences in their primary structures, whose corresponding 3D structures are difficult to determine with conventional molecular biology laboratory techniques such as X-ray crystallography and NMR. Protein secondary structure prediction is a fundamental step in determining the 3D structure of a protein. In this study, a new method for predicting protein secondary structure from amino acid sequences is proposed and implemented. The method was analyzed together with five other well-known prediction methods in this domain to allow easy comparison and clear conclusions. The Cuff and Barton 513-protein data set was used to train and test the prediction methods under the same hardware, platforms, and environments. The newly developed method (NN-GORV) combines the knowledge of the GOR-V information theory with the power of neural networks to classify a novel protein sequence into one of its three secondary structure classes. NN-GORV was rigorously tested together with the other methods and was observed to outperform the GOR-V method by 7.4% Q3 accuracy and the neural network method (NN-II) by 5.6% Q3 accuracy. The Matthews Correlation Coefficients (MCC) showed that the secondary structure states predicted by NN-GORV are strongly related to the observed secondary structure states.
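For readers unfamiliar with the two evaluation measures named above, the following is a minimal sketch of how Q3 accuracy and a per-state Matthews correlation coefficient are commonly computed. It is not the authors' evaluation code. The three-state labels `H` (helix), `E` (strand), and `C` (coil), the function names `q3_accuracy` and `mcc`, and the toy sequences are assumptions made for illustration only.

```python
import math


def q3_accuracy(observed: str, predicted: str) -> float:
    """Percentage of residues whose predicted state matches the observed state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def mcc(observed: str, predicted: str, state: str) -> float:
    """Matthews correlation coefficient for one state, treated as one-vs-rest."""
    tp = tn = fp = fn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1  # state predicted and observed
        elif p == state:
            fp += 1  # state predicted but not observed
        elif o == state:
            fn += 1  # state observed but not predicted
        else:
            tn += 1  # state neither predicted nor observed
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy sequences; one helix residue is mispredicted as strand.
observed = "HHHHEEEECC"
predicted = "HHHEEEEECC"

print(f"Q3 = {q3_accuracy(observed, predicted):.1f}%")
for s in "HEC":
    print(f"MCC({s}) = {mcc(observed, predicted, s):.3f}")
```

Q3 summarizes overall per-residue agreement, while the per-state MCC ranges from -1 to 1 and stays informative when the three states are unevenly represented, which is presumably why the two measures are reported together.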