Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/135848
Title: MULTI-DIMENSIONAL INTERROGATION OF DNA MUTATIONS IN CANCER
Authors: MOHAMED FEROZ BIN MOHAMMED OMAR
Keywords: mutational signatures, machine learning, next generation sequencing, oncology, mutations, circulating tumour cells, cancers of unknown primary origin
Issue Date: 25-Apr-2017
Source: MOHAMED FEROZ BIN MOHAMMED OMAR (2017-04-25). MULTI-DIMENSIONAL INTERROGATION OF DNA MUTATIONS IN CANCER. ScholarBank@NUS Repository.
Abstract: Cancers are known to develop through distinct DNA mutation events that occur during their initiation and progression. Recent studies have revealed that the analysis of point mutation patterns across cancer types can provide information on the carcinogenic origins and mutation susceptibilities of different cancers. In this study, it was postulated that additional insight can be obtained through the integrated analysis of multiple dimensions of DNA mutation events and that a tool for predicting cancer type from mutation patterns could be developed from this knowledge. The dimensions considered included the frequency, types and co-occurrence of point mutations, insertions and deletions, genes mutated and the genomic distribution of mutations. As a first step, a program was designed to provide efficient conversion of MAF files from next-generation sequencing (NGS) data into multi-dimensional mutation profiles. The program was then applied to characterise mutation patterns from all data in The Cancer Genome Atlas (TCGA) database, comprising more than 8,000 tumours from 31 cancer types. Analysis of the results provided insights into the heterogeneity, subtypes, interrelation and biology of different cancer types. As a second step, multiple statistical and machine learning approaches were tested to determine optimal methods for building a tool to predict cancer type based on the mutation pattern of an unknown sample. Using the optimised method, close to 100% prediction accuracy was obtained in the analysis of random bootstrapped sample series from the TCGA. When applied to 5 non-TCGA NGS datasets, the prediction accuracy was 30-60%. While encouraging, the results also highlighted many issues, such as the need for standardisation of NGS protocols.
In conclusion, these results have shown that multi-dimensional interrogation of DNA mutation patterns can provide novel insights into cancer biology, and may be useful for predicting cancer types of samples of unknown origin through future development.
URI: http://scholarbank.nus.edu.sg/handle/10635/135848
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: Feroz Omar_Thesis.pdf
Size: 13.63 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.