Title: Accurate prediction of DnaK-peptide binding via homology modelling and experimental data
Authors: Van Durme J.
Maurer-Stroh S. 
Gallardo R.
Wilkinson H.
Rousseau F.
Schymkowitz J.
Keywords: cellulose
protein DnaK
dnaK protein, E coli
Escherichia coli protein
heat shock protein 70
immobilized protein
amino acid sequence
classification algorithm
controlled study
membrane binding
protein analysis
protein binding
protein protein interaction
protein structure
structural homology
automated pattern recognition
binding site
chemical structure
receiver operating characteristic
sequence analysis
Binding Sites
Computational Biology
Escherichia coli Proteins
HSP70 Heat-Shock Proteins
Immobilized Proteins
Models, Molecular
Pattern Recognition, Automated
Protein Binding
ROC Curve
Sequence Analysis, Protein
Issue Date: 2009
Citation: Van Durme J., Maurer-Stroh S., Gallardo R., Wilkinson H., Rousseau F., Schymkowitz J. (2009). Accurate prediction of DnaK-peptide binding via homology modelling and experimental data. PLoS Computational Biology 5 (8) : e1000475. ScholarBank@NUS Repository.
Rights: Attribution 4.0 International
Abstract: Molecular chaperones are essential elements of the protein quality control machinery that governs translocation and folding of nascent polypeptides, refolding and degradation of misfolded proteins, and activation of a wide range of client proteins. The prokaryotic heat-shock protein DnaK is the E. coli representative of the ubiquitous Hsp70 family, which specializes in the binding of exposed hydrophobic regions in unfolded polypeptides. Accurate prediction of DnaK binding sites in E. coli proteins is an essential prerequisite to understand the precise function of this chaperone and the properties of its substrate proteins. In order to map DnaK binding sites in protein sequences, we have developed an algorithm that combines sequence information from peptide binding experiments and structural parameters from homology modelling. We show that this combination significantly outperforms either single approach. The final predictor had a Matthews correlation coefficient (MCC) of 0.819 when assessed over the 144 tested peptide sequences to detect true positives and true negatives. To test the robustness of the learning set, we have conducted a simulated cross-validation, where we omit sequences from the learning sets and calculate the rate of repredicting them. This resulted in a surprisingly good MCC of 0.703. The algorithm was also able to perform equally well on a blind test set of binders and non-binders, of which there was no prior knowledge in the learning sets. The algorithm is freely available. © 2009 Van Durme et al.
Source Title: PLoS Computational Biology
ISSN: 1553734X
DOI: 10.1371/journal.pcbi.1000475
Appears in Collections: Elements
Staff Publications
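
Note on the MCC values quoted in the abstract: the Matthews correlation coefficient summarises a binary classifier's confusion matrix as a single value between -1 and 1. The sketch below shows only the standard formula. The function name and the counts are hypothetical and chosen for illustration; they are not the confusion matrix behind the reported 0.819 or 0.703, and returning 0.0 for an undefined denominator is a common convention assumed here, not something stated in this record.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn: binders/non-binders predicted correctly.
    fp/fn: non-binders predicted as binders, and binders that were missed.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0.0 when a row or column of the
        # confusion matrix is empty and the ratio is undefined.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the paper).
    print(round(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10), 3))
```

A value of 1 means every peptide is classified correctly, 0 is no better than chance, and -1 means every prediction is inverted.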

Files in This Item:
File: 10_1371_journal_pcbi_1000475.pdf
Size: 603.28 kB
Format: Adobe PDF







This item is licensed under a Creative Commons Attribution 4.0 International License.