Please use this identifier to cite or link to this item: https://doi.org/10.1371/journal.pone.0059494
Title: Comprehensive Characterization of Human Genome Variation by High Coverage Whole-Genome Sequencing of Forty Four Caucasians
Authors: Shen H.; Li J.; Zhang J.; Xu C.; Jiang Y.; Wu Z.; Zhao F.; Liao L.; Chen J.; Lin Y.; Tian Q.; Papasian C.J.; Deng H.-W.
Keywords: adult; article; Caucasian; female; gene deletion; gene frequency; gene inactivation; gene insertion; gene product; genetic procedures; genetic variability; high coverage whole genome sequencing; homozygosity; human; loss of function mutation; male; nucleic acid base substitution; single nucleotide polymorphism; Adult; Chromosomes, Human, Y; Disease; DNA Copy Number Variations; DNA, Mitochondrial; European Continental Ancestry Group; Female; Genome, Human; Genomics; Humans; INDEL Mutation; Male; Mutation Rate; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
Issue Date: 2013
Citation: Shen H., Li J., Zhang J., Xu C., Jiang Y., Wu Z., Zhao F., Liao L., Chen J., Lin Y., Tian Q., Papasian C.J., Deng H.-W. (2013). Comprehensive Characterization of Human Genome Variation by High Coverage Whole-Genome Sequencing of Forty Four Caucasians. PLoS ONE 8 (4) : e59494. ScholarBank@NUS Repository. https://doi.org/10.1371/journal.pone.0059494
Rights: Attribution 4.0 International
Abstract: Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, the majority of which (~96%) have a minor allele frequency less than 5%. On average, each individual genome carried ~3.3 million SNPs and ~492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely "knock out" the corresponding genes. Across all 44 genomes, a total of 182 genes were "knocked out" in at least one individual genome, among which 46 genes were "knocked out" in over 30% of our samples, suggesting that a number of genes are commonly "knocked out" in general populations. Gene ontology analysis suggested that these commonly "knocked-out" genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases. © 2013 Shen et al.
Source Title: PLoS ONE
URI: https://scholarbank.nus.edu.sg/handle/10635/161331
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0059494
Appears in Collections: Elements; Staff Publications
Files in This Item:
File: 10_1371_journal_pone_0059494.pdf; Size: 892.97 kB; Format: Adobe PDF; Access Settings: OPEN; Version: None