Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/237688
Title: TOWARDS MORE ACCURATE PROTEIN FUNCTION PREDICTION IN THE TWILIGHT ZONE
Authors: MOHAMMAD NEAMUL KABIR
ORCID iD: orcid.org/0000-0002-3616-896X
Keywords: Protein function prediction, twilight zone, dissimilarity feature, similarity of dissimilarities, ensemble model, support vector machine
Issue Date: 17-Aug-2022
Citation: MOHAMMAD NEAMUL KABIR (2022-08-17). TOWARDS MORE ACCURATE PROTEIN FUNCTION PREDICTION IN THE TWILIGHT ZONE. ScholarBank@NUS Repository.
Abstract: With the advancement of next-generation sequencing technology, protein sequences are being generated at an ever-increasing rate, and public databases are overwhelmed by the exponential growth in available sequences. Functional assignment of protein sequences is essential to understanding how biological systems operate, and it is a highly challenging task in biology. In this thesis, we propose the novel idea of using similarity of dissimilarities for protein function prediction from sequence information alone, and we build computational methods based on this idea. Our first method, EnsembleFam, aims at better protein family modeling for twilight zone proteins. Our second method, e-EnsembleFam, focuses on Enzyme Commission (EC) number prediction at EC Level 3 and Level 4 for both high-similarity and twilight zone proteins. Our third method, m-EnsembleFam, provides better annotation for multi-domain enzymes and a framework for working with multi-domain proteins in general. All of these methods use dissimilarity features to build, for a given protein class, an ensemble model consisting of three base SVM classifiers. Lastly, we illustrate a real-life application in which we take a genome as input and make protein function predictions for it.
URI: https://scholarbank.nus.edu.sg/handle/10635/237688
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: KabirMN.pdf
Size: 2.6 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.