Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/37832
Title: Efficient Computational Techniques for Tag SNP Selection, Epistasis Analysis, and Genome-wide Association Study
Authors: WANG YUE
Keywords: SNP,Genome-wide association study,Hadoop,Epistasis,Tag SNP,Analysis
Issue Date: 30-Nov-2012
Source: WANG YUE (2012-11-30). Efficient Computational Techniques for Tag SNP Selection, Epistasis Analysis, and Genome-wide Association Study. ScholarBank@NUS Repository.
Abstract: GWAS is among the most popular study designs to identify potential genetic variants that are linked to the etiologies of diseases. We first give an independent, empirical comparison of epistasis detection methods in GWAS. The experimental results show that the methods which examine all possible candidate pairs are more powerful. This observation leads us to use a scalable, fault-tolerant, flexible and parallel technology: Hadoop. We are probably the first practitioners to effectively "marry" epistasis detection in GWAS with Hadoop, resulting in two new computing tools for detecting epistasis called CEO and efficient CEO (eCEO). Seeing the advantage of using Hadoop in GWAS, we adapt a powerful machine learning technique, Random Forest (RF), to develop a Parallel Random Forest Regression (PaRFR) algorithm on Hadoop for high-dimensional quantitative traits in GWAS. We finally propose an efficient tag SNP selection algorithm (Fasttagger) using multi-marker linkage disequilibrium for genome-wide data.
URI: http://scholarbank.nus.edu.sg/handle/10635/37832
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: WangY.pdf
Size: 8.58 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None




Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.