Please use this identifier to cite or link to this item: https://doi.org/10.1186/s12864-017-3490-3
Title: Protein complex-based analysis is resistant to the obfuscating consequences of batch effects --- a case study in clinical proteomics
Authors: GOH WEN BIN,WILSON 
Wong L. 
Keywords: bioinformatics
false positive result
phenotype
prediction
principal component analysis
proteomics
reproducibility
simulation
chemistry
cluster analysis
human
kidney tumor
procedures
protein multimerization
specimen handling
standards
statistics and numerical data
protein binding
tumor protein
Cluster Analysis
Humans
Kidney Neoplasms
Neoplasm Proteins
Principal Component Analysis
Protein Binding
Protein Multimerization
Proteomics
Reproducibility of Results
Specimen Handling
Issue Date: 2017
Publisher: BioMed Central Ltd.
Citation: GOH WEN BIN,WILSON, Wong L. (2017). Protein complex-based analysis is resistant to the obfuscating consequences of batch effects --- a case study in clinical proteomics. BMC Genomics 18 : 142. ScholarBank@NUS Repository. https://doi.org/10.1186/s12864-017-3490-3
Abstract: Background: In proteomics, batch effects are technical sources of variation that confound proper analysis, preventing effective deployment in clinical and translational research. Results: Using simulated and real data, we demonstrate that existing batch effect-correction methods do not always eradicate all batch effects. Worse still, they may alter data integrity and introduce false positives. Moreover, although principal component analysis (PCA) is commonly used for detecting batch effects, the principal components (PCs) themselves may be used as differential features, from which relevant differential proteins may be effectively traced. Batch effects are removable by identifying PCs highly correlated with batch but not class effects. However, neither PC-based nor existing batch effect-correction methods adequately address subtle batch effects, which are difficult to eradicate, and both involve data transformation and/or projection, which is error-prone. To address this, we introduce the concept of batch effect-resistant methods and demonstrate how such methods incorporating protein complexes are particularly resistant to batch effects without compromising data integrity. Conclusions: Protein complex-based analyses are powerful, offering unparalleled differential protein-selection reproducibility and high prediction accuracy. We demonstrate for the first time their innate resistance against batch effects, even subtle ones. As complex-based analyses require no prior data transformation (e.g. batch-effect correction), data integrity is protected. Individual checks on top-ranked protein complexes confirm strong association with phenotype classes and not batch. Therefore, the constituent proteins of these complexes are more likely to be clinically relevant. © 2017 The Author(s).
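Illustrative note: the PC-based idea mentioned in the abstract (find PCs that correlate strongly with batch but not with class, then discard them) can be sketched as follows. This is a minimal sketch on simulated data, not the authors' implementation. The function name `remove_batch_pcs`, the correlation thresholds, and the simulated matrix are all assumptions made for illustration. As the abstract cautions, this kind of projection may not handle subtle batch effects.

```python
import numpy as np

def remove_batch_pcs(X, batch, cls, batch_thresh=0.7, class_thresh=0.3, n_pcs=10):
    """Drop PCs that track batch but not class, then reconstruct X.

    X: (samples, proteins) matrix; batch, cls: binary 0/1 label arrays.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD: sample scores are U * S, loadings are the rows of Vt.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * S
    dropped = []
    for i in range(min(n_pcs, scores.shape[1])):
        r_batch = abs(np.corrcoef(scores[:, i], batch)[0, 1])
        r_class = abs(np.corrcoef(scores[:, i], cls)[0, 1])
        if r_batch > batch_thresh and r_class < class_thresh:
            dropped.append(i)
    keep = np.ones(len(S), dtype=bool)
    keep[dropped] = False
    # Rebuild the data from the retained PCs only.
    X_clean = scores[:, keep] @ Vt[keep] + mean
    return X_clean, dropped

# Simulated example (assumed design: batch and class are balanced against each other).
rng = np.random.default_rng(0)
n, p = 40, 200
batch = np.tile([0, 1], n // 2)          # alternating batch labels
cls = np.repeat([0, 1], n // 2)          # first half class 0, second half class 1
X = rng.normal(size=(n, p))
X[:, :20] += 2.0 * cls[:, None]          # class signal in the first 20 proteins
batch_offset = 1.5 * rng.normal(size=p)  # batch-specific shift on every protein
X += batch[:, None] * batch_offset

X_clean, dropped = remove_batch_pcs(X, batch, cls)
print("dropped PCs:", dropped)
```

The design choice here is to test each PC against both labels, so that a component carrying class signal is retained even if it also correlates with batch.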
Source Title: BMC Genomics
URI: https://scholarbank.nus.edu.sg/handle/10635/173856
ISSN: 1471-2164
DOI: 10.1186/s12864-017-3490-3
Appears in Collections: Elements
Staff Publications

Files in This Item:
File: 10_1186_s12864-017-3490-3.pdf | Size: 1.41 MB | Format: Adobe PDF | Access Settings: OPEN | Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.