Please use this identifier to cite or link to this item: https://doi.org/10.1038/sdata.2018.136
Title: A merged lung cancer transcriptome dataset for clinical predictive modeling
Authors: Lim, S.B
Tan, S.J
Lim, W.-T 
Lim, C.T 
Keywords: transcriptome
biology
data analysis
factual database
gene expression profiling
genetics
human
lung tumor
non small cell lung cancer
procedures
Carcinoma, Non-Small-Cell Lung
Computational Biology
Data Analysis
Databases, Factual
Gene Expression Profiling
Humans
Lung Neoplasms
Transcriptome
Issue Date: 2018
Citation: Lim, S.B, Tan, S.J, Lim, W.-T, Lim, C.T (2018). A merged lung cancer transcriptome dataset for clinical predictive modeling. Scientific data 5 : 180136. ScholarBank@NUS Repository. https://doi.org/10.1038/sdata.2018.136
Abstract: The Gene Expression Omnibus (GEO) database is an excellent public source of whole transcriptomic profiles of multiple cancers. The main challenge is the limited accessibility of such large-scale genomic data to people without a background in bioinformatics or computer science. This presents difficulties in data analysis, sharing and visualization. Here, we present an integrated bioinformatics pipeline and a normalized dataset that has been preprocessed using a robust statistical methodology, allowing others to perform large-scale meta-analysis without having to conduct time-consuming data mining and statistical correction. Comprising 1,118 patient-derived samples, the normalized dataset includes primary non-small cell lung cancer (NSCLC) tumors and paired normal lung tissues from ten independent GEO datasets, facilitating differential expression analysis. The data has been merged, normalized, batch effect-corrected and filtered for genes with low variance via multiple open source R packages integrated into our workflow. Overall, this dataset (with associated clinical metadata) better represents the diseased population and serves as a powerful tool for early predictive biomarker discovery.
Source Title: Scientific data
URI: https://scholarbank.nus.edu.sg/handle/10635/175051
ISSN: 2052-4463
DOI: 10.1038/sdata.2018.136
Appears in Collections: Elements
Staff Publications

Files in This Item:
File: 10_1038_sdata_2018_136.pdf
Size: 2.71 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.