Nazar Zaki
College of IT, UAE University, AlAin 17555, UAE
Safaai Deris
SPS, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
ABSTRACT
This study introduces a simple method that represents a protein sequence as a fixed-dimension vector of length three. We present a hidden Markov model score-combining method: three scoring algorithms are combined to represent a protein sequence of amino acids for better remote homology detection. We tested the method on the SCOP version 1.37 dataset. The results show that, with such a simple representation, we are able to achieve performance superior to previously presented protein homology detection methods while also being more computationally efficient.
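As a rough illustration of what a three-dimensional representation built from hidden Markov model scores might look like, the sketch below maps a sequence to a tuple of three log-scores. The abstract does not name the three scoring algorithms, so the forward score, the Viterbi score, and a log-odds score against a uniform background are stand-ins chosen for illustration only. The function `hmm_scores`, the two-state model, and the three-letter alphabet are all hypothetical and are not taken from the paper.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_scores(seq, start, trans, emit):
    """Return (forward, viterbi, log_odds) log-scores of seq under a toy HMM.

    Assumption: these three scores are illustrative stand-ins; the paper's
    actual scoring algorithms are not specified in the abstract.
    """
    states = list(start)
    # Initialise both recursions with the first symbol.
    fwd = {s: _log(start[s]) + _log(emit[s][seq[0]]) for s in states}
    vit = dict(fwd)
    for sym in seq[1:]:
        # Forward: sum over all predecessor states.
        fwd = {
            s: _logsumexp([fwd[r] + _log(trans[r][s]) for r in states])
            + _log(emit[s][sym])
            for s in states
        }
        # Viterbi: keep only the best predecessor state.
        vit = {
            s: max(vit[r] + _log(trans[r][s]) for r in states)
            + _log(emit[s][sym])
            for s in states
        }
    forward = _logsumexp(list(fwd.values()))
    viterbi = max(vit.values())
    # Log-odds of the forward score against a uniform background model.
    alphabet_size = len(next(iter(emit.values())))
    log_odds = forward - len(seq) * math.log(1.0 / alphabet_size)
    return (forward, viterbi, log_odds)


# Hypothetical two-state model over a toy three-letter alphabet.
START = {"s1": 0.6, "s2": 0.4}
TRANS = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
EMIT = {
    "s1": {"A": 0.5, "G": 0.3, "K": 0.2},
    "s2": {"A": 0.1, "G": 0.3, "K": 0.6},
}

vector = hmm_scores("AGKA", START, TRANS, EMIT)
print(vector)  # a tuple of three log-scores
```

Whatever the sequence length, the output has exactly three components, so it could be passed to any classifier that expects fixed-length input.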
How to cite this article
Nazar Zaki and Safaai Deris, 2005. Representing Protein Sequence with Low Number of Dimensions. Journal of Biological Sciences, 5: 795-800.
DOI: 10.3923/jbs.2005.795.800
URL: https://scialert.net/abstract/?doi=jbs.2005.795.800