Trends in Applied Sciences Research

Year: 2008 | Volume: 3 | Issue: 4 | Page No.: 285-291
DOI: 10.17311/tasr.2008.285.291
On the Accuracy of Sequence-Based Computational Inference of Protein Residues Involved in Interactions with DNA
Zhenkun Gou and Igor B. Kuznetsov

Abstract: Methods for computational inference of DNA-binding residues in DNA-binding proteins are usually developed using classification techniques trained to distinguish between binding and non-binding residues on the basis of known examples observed in experimentally determined high-resolution structures of protein-DNA complexes. The degree of accuracy that can be expected when a computational method is applied to a particular novel protein remains largely unknown. We test the utility of classification methods using Kernel Logistic Regression (KLR) predictors of DNA-binding residues as an example. We show that predictors that utilize sequence properties of proteins can successfully predict DNA-binding residues in proteins from a novel structural class. We use Multiple Linear Regression (MLR) to establish a quantitative relationship between protein properties and the expected accuracy of KLR predictors. Our results indicate that, for novel proteins, the expected accuracy provided by an MLR model is close to the actual accuracy and can be used to assess the overall confidence of the prediction.
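
The MLR step described in the abstract can be illustrated with a minimal sketch. It fits an ordinary least-squares model that maps per-protein properties to a predictor's accuracy, then estimates the expected accuracy for a new protein. The abstract does not list the protein properties used, so the two features here are hypothetical placeholders, and the training values are synthetic numbers generated from a made-up linear rule purely so the example runs. The function names `solve_linear_system`, `fit_mlr`, and `predict_mlr` are likewise assumptions, not the authors' implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Least-squares fit via the normal equations; returns [intercept, coef_1, ...]."""
    rows = [[1.0] + list(f) for f in features]  # prepend an intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, feature_vector):
    """Expected accuracy for one protein described by feature_vector."""
    return beta[0] + sum(b * f for b, f in zip(beta[1:], feature_vector))


# Synthetic illustration only: two hypothetical protein properties on a grid,
# with "observed accuracy" generated from an invented linear rule.
features = [(a / 4, b / 4) for a in range(5) for b in range(5)]
targets = [0.6 + 0.2 * f1 - 0.1 * f2 for f1, f2 in features]

beta = fit_mlr(features, targets)
expected_accuracy = predict_mlr(beta, (0.5, 0.5))
```

In practice the expected accuracy returned by such a model would be compared with the accuracy actually measured on held-out proteins, which is the comparison the abstract reports as being close.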

How to cite this article
Zhenkun Gou and Igor B. Kuznetsov, 2008. On the Accuracy of Sequence-Based Computational Inference of Protein Residues Involved in Interactions with DNA. Trends in Applied Sciences Research, 3: 285-291.

© Science Alert. All Rights Reserved