Please use this identifier to cite or link to this item: https://doi.org/10.1371/journal.pone.0008155
Title: A biophysical model for analysis of transcription factor interaction and binding site arrangement from genome-wide binding data
Authors: He X.
Chen C.-C.
Hong F.
Fang F.
Sinha S.
Ng H.-H. 
Zhong S.
Keywords: transcription factor NANOG
Drosophila protein
homeodomain protein
transcription factor
article
binding affinity
binding site
cis isomer
computer prediction
controlled study
DNA binding
DNA microarray
Drosophila
Drosophila melanogaster
Drosophila pseudoobscura
embryonic stem cell
gene control
gene interaction
gene location
gene targeting
genetic analysis
genetic association
genetic conservation
genetic transcription
genome
protein motif
quantitative study
regulatory sequence
sequence homology
theoretical model
animal
biological model
biology
biophysics
chromatin immunoprecipitation
genetic database
genetics
metabolism
mouse
nucleotide sequence
protein binding
receiver operating characteristic
reproducibility
Animals
Base Sequence
Binding Sites
Biophysical Phenomena
Chromatin Immunoprecipitation
Computational Biology
Databases, Genetic
Drosophila melanogaster
Drosophila Proteins
Genome
Homeodomain Proteins
Mice
Models, Genetic
Protein Binding
Reproducibility of Results
ROC Curve
Transcription Factors
Issue Date: 2009
Citation: He X., Chen C.-C., Hong F., Fang F., Sinha S., Ng H.-H., Zhong S. (2009). A biophysical model for analysis of transcription factor interaction and binding site arrangement from genome-wide binding data. PLoS ONE 4 (12) : e8155. ScholarBank@NUS Repository. https://doi.org/10.1371/journal.pone.0008155
Rights: Attribution 4.0 International
Abstract: Background: How transcription factors (TFs) interact with cis-regulatory sequences and with each other is a fundamental, but not well understood, aspect of gene regulation. Methodology/Principal Findings: We present a computational method to address this question, relying on established biophysical principles. This method, STAP (sequence to affinity prediction), takes into account all combinations and configurations of strong and weak binding sites to analyze large-scale TF-DNA binding data, in order to discover cooperative interactions among TFs, infer sequence rules of interaction, and predict TF target genes in new conditions with no TF-DNA binding data. STAP differs from other statistical approaches for analyzing cis-regulatory sequences in its use of physical principles and its treatment of the DNA binding data as a quantitative representation of binding strengths. Applying this method to the ChIP-seq data of 12 TFs in mouse embryonic stem (ES) cells, we found that the strength of TF-DNA binding could be significantly modulated by cooperative interactions among TFs with adjacent binding sites. However, further analysis of five putatively interacting TF pairs suggests that such interactions may be relatively insensitive to the distance and orientation of binding sites. Testing a set of putative Nanog motifs, STAP showed that a novel Nanog motif could better explain the ChIP-seq data than previously published ones. We then experimentally tested and verified the new Nanog motif. A series of comparisons showed that STAP has more predictive power than several state-of-the-art methods for cis-regulatory sequence analysis. We took advantage of this power to study the evolution of TF-target relationships in Drosophila. By learning the TF-DNA interaction models from the ChIP-chip data of D. melanogaster (Mel) and applying them to the genome of D. pseudoobscura (Pse), we found that only about half of the sequences strongly bound by TFs in Mel have high binding affinities in Pse. We show that prediction of functional TF targets from ChIP-chip data can be improved by using the conservation of STAP-predicted affinities as an additional filter. Conclusions/Significance: STAP is an effective method to analyze binding site arrangements, TF cooperativity, and TF target genes from genome-wide TF-DNA binding data. © 2009 He et al.
Source Title: PLoS ONE
URI: https://scholarbank.nus.edu.sg/handle/10635/161824
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0008155
Appears in Collections: Elements
Staff Publications

Files in This Item:
File: 10_1371_journal_pone_0008155.pdf
Size: 523.8 kB
Format: Adobe PDF
Access Settings: OPEN
Version: None

This item is licensed under a Creative Commons License.