Please use this identifier to cite or link to this item: https://doi.org/10.1186/s12859-017-1470-x
Title: BATVI: Fast, sensitive and accurate detection of virus integrations
Authors: Tennakoon, C
Sung, W.K 
Keywords: Alignment
Diagnosis
Diseases
Forecasting
Gene expression
Genes
Detection of virus
False positive rates
Fast screening
Gene Expression Data
Liver cancers
Next-generation sequencing
Real life datasets
True positive
Viruses
DNA binding protein
MLL4 protein, human
telomerase
TERT protein, human
virus DNA
algorithm
cluster analysis
DNA sequence
genetics
high throughput sequencing
host pathogen interaction
human
human genome
Liver Neoplasms
metabolism
software
theoretical model
virology
virus DNA cell DNA interaction
Algorithms
Cluster Analysis
DNA, Viral
DNA-Binding Proteins
Genome, Human
High-Throughput Nucleotide Sequencing
Host-Pathogen Interactions
Humans
Liver Neoplasms
Models, Theoretical
Sequence Analysis, DNA
Software
Telomerase
Virus Integration
Issue Date: 2017
Citation: Tennakoon, C, Sung, W.K (2017). BATVI: Fast, sensitive and accurate detection of virus integrations. BMC Bioinformatics 18 : 71. ScholarBank@NUS Repository. https://doi.org/10.1186/s12859-017-1470-x
Rights: Attribution 4.0 International
Abstract: Background: Studying virus integrations in the human genome is important because virus integrations have been shown to be associated with diseases. In the literature, few methods have been proposed to predict virus integrations from next-generation sequencing datasets. Although they work, they are slow and not very sensitive. Results and discussion: This paper introduces BatVI, a new method for predicting viral integrations. BatVI first uses a fast screening step to select chimeric reads that may contain viral integrations. Next, these candidate chimeric reads are sensitively aligned using BLAST. Chimeric reads that are co-localized in the human genome are then clustered. Finally, the chimeric reads in each cluster are assembled to extract high-confidence virus integration sites. Conclusion: We compared the performance of BatVI with the existing methods VirusFinder and VirusSeq using both simulated and real-life datasets of liver cancer patients. BatVI ran an order of magnitude faster and predicted almost twice as many true positives as the other methods while maintaining a false positive rate below 1%. For the liver cancer datasets, BatVI uncovered novel integrations in two important genes, TERT and MLL4, which were missed by previous studies. We verified the correctness of these additional integrations using gene expression data. BatVI can be downloaded from http://biogpu.ddns.comp.nus.edu.sg/~ksung/batvi/index.html. © 2017 The Author(s).
Source Title: BMC Bioinformatics
URI: https://scholarbank.nus.edu.sg/handle/10635/181296
ISSN: 14712105
DOI: 10.1186/s12859-017-1470-x
Appears in Collections: Elements
Staff Publications
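
Illustrative note on the abstract: the clustering step ("chimeric reads that are co-localized in the human genome are clustered") can be pictured with a minimal Python sketch. This is not BatVI's implementation. The `ChimericRead` record, the `max_gap` window of 500 bp, the `min_reads` support threshold, and the example coordinates are all assumptions made for illustration; the abstract does not state the criteria BatVI actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChimericRead:
    """Hypothetical record for a read spanning a human/virus junction."""
    name: str
    chrom: str  # human chromosome of the candidate breakpoint
    pos: int    # human coordinate of the candidate breakpoint


def cluster_colocalized(reads: List[ChimericRead],
                        max_gap: int = 500) -> List[List[ChimericRead]]:
    """Group reads on the same chromosome whose breakpoints are chained
    together by gaps of at most max_gap bases (assumed threshold)."""
    clusters: List[List[ChimericRead]] = []
    for read in sorted(reads, key=lambda r: (r.chrom, r.pos)):
        last = clusters[-1][-1] if clusters else None
        if (last is not None and last.chrom == read.chrom
                and read.pos - last.pos <= max_gap):
            clusters[-1].append(read)   # close enough: extend current cluster
        else:
            clusters.append([read])     # otherwise start a new cluster
    return clusters


def supported_clusters(clusters: List[List[ChimericRead]],
                       min_reads: int = 2) -> List[List[ChimericRead]]:
    """Keep clusters backed by at least min_reads reads (assumed threshold)."""
    return [c for c in clusters if len(c) >= min_reads]


# Made-up coordinates, for illustration only.
example_reads = [
    ChimericRead("r1", "chr5", 1000),
    ChimericRead("r2", "chr5", 1200),
    ChimericRead("r3", "chr5", 90000),
    ChimericRead("r4", "chr19", 1100),
    ChimericRead("r5", "chr5", 1650),
]
example_clusters = supported_clusters(cluster_colocalized(example_reads))
```

In a pipeline of the kind the abstract describes, each surviving cluster would then be handed to an assembly step to refine the integration site; that step is outside the scope of this sketch.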

Files in This Item:
File: 10_1186_s12859-017-1470-x.pdf
Size: 1.65 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None
