Please use this identifier to cite or link to this item:
https://doi.org/10.1186/s12859-017-1470-x
Title: | BATVI: Fast, sensitive and accurate detection of virus integrations |
Authors: | Tennakoon, C; Sung, W.K |
Keywords: | Alignment; Diagnosis; Diseases; Forecasting; Gene expression; Genes; Detection of virus; False positive rates; Fast screening; Gene Expression Data; Liver cancers; Next-generation sequencing; Real life datasets; True positive; Viruses; DNA binding protein; MLL4 protein, human; telomerase; TERT protein, human; virus DNA; algorithm; cluster analysis; DNA sequence; genetics; high throughput sequencing; host pathogen interaction; human; human genome; Liver Neoplasms; metabolism; software; theoretical model; virology; virus DNA cell DNA interaction; Algorithms; Cluster Analysis; DNA, Viral; DNA-Binding Proteins; Genome, Human; High-Throughput Nucleotide Sequencing; Host-Pathogen Interactions; Humans; Liver Neoplasms; Models, Theoretical; Sequence Analysis, DNA; Software; Telomerase; Virus Integration |
Issue Date: | 2017 |
Citation: | Tennakoon, C, Sung, W.K (2017). BATVI: Fast, sensitive and accurate detection of virus integrations. BMC Bioinformatics 18 : 71. ScholarBank@NUS Repository. https://doi.org/10.1186/s12859-017-1470-x |
Rights: | Attribution 4.0 International |
Abstract: | Background: The study of virus integrations in human genome is important since virus integrations were shown to be associated with diseases. In the literature, few methods have been proposed that predict virus integrations using next generation sequencing datasets. Although they work, they are slow and are not very sensitive. Results and discussion: This paper introduces a new method BatVI to predict viral integrations. Our method uses a fast screening method to filter out chimeric reads containing possible viral integrations. Next, sensitive alignments of these candidate chimeric reads are called by BLAST. Chimeric reads that are co-localized in the human genome are clustered. Finally, by assembling the chimeric reads in each cluster, high confident virus integration sites are extracted. Conclusion: We compared the performance of BatVI with existing methods VirusFinder and VirusSeq using both simulated and real-life datasets of liver cancer patients. BatVI ran an order of magnitude faster and was able to predict almost twice the number of true positives compared to other methods while maintaining a false positive rate less than 1%. For the liver cancer datasets, BatVI uncovered novel integrations to two important genes TERT and MLL4, which were missed by previous studies. Through gene expression data, we verified the correctness of these additional integrations. BatVI can be downloaded from http://biogpu.ddns.comp.nus.edu.sg/~ksung/batvi/index.html. © 2017 The Author(s). |
Source Title: | BMC Bioinformatics |
URI: | https://scholarbank.nus.edu.sg/handle/10635/181296 |
ISSN: | 1471-2105 |
DOI: | 10.1186/s12859-017-1470-x |
Appears in Collections: | Elements Staff Publications |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
10_1186_s12859-017-1470-x.pdf | | 1.65 MB | Adobe PDF | OPEN | None | View/Download |
This item is licensed under a Creative Commons License