Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/237688
DC Field | Value | |
---|---|---|
dc.title | TOWARDS MORE ACCURATE PROTEIN FUNCTION PREDICTION IN THE TWILIGHT ZONE | |
dc.contributor.author | MOHAMMAD NEAMUL KABIR | |
dc.date.accessioned | 2023-02-28T18:01:13Z | |
dc.date.available | 2023-02-28T18:01:13Z | |
dc.date.issued | 2022-08-17 | |
dc.identifier.citation | MOHAMMAD NEAMUL KABIR (2022-08-17). TOWARDS MORE ACCURATE PROTEIN FUNCTION PREDICTION IN THE TWILIGHT ZONE. ScholarBank@NUS Repository. | |
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/237688 | |
dc.description.abstract | Advances in next-generation sequencing technology are generating protein sequences at an ever-increasing rate, and public databases are overwhelmed by the exponential growth in available sequences. Understanding how biological systems operate requires assigning functions to these protein sequences, which is one of the most challenging tasks in biology. In this thesis, we propose a novel idea, similarity of dissimilarities, for predicting protein function from sequence information alone, and we build computational methods based on it. Our first method, EnsembleFam, aims at better protein family modeling for twilight zone proteins. Our second method, e-EnsembleFam, focuses on Enzyme Commission (EC) number prediction at EC Level 3 and Level 4 for both high-similarity and twilight zone proteins. Our third method, m-EnsembleFam, provides better annotation for multi-domain enzymes and a framework for working with multi-domain proteins in general. All of these methods use dissimilarity features to build, for each protein class, an ensemble model consisting of three base SVM classifiers. Lastly, we illustrate a real-life application in which we take a genome as input and predict the functions of its proteins. | |
dc.language.iso | en | |
dc.subject | Protein function prediction, twilight zone, dissimilarity feature, similarity of dissimilarities, ensemble model, support vector machine | |
dc.type | Thesis | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.contributor.supervisor | Lim Soon Wong | |
dc.description.degree | Ph.D | |
dc.description.degreeconferred | DOCTOR OF PHILOSOPHY (SOC) | |
dc.identifier.orcid | 0000-0002-3616-896X | |
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version |
---|---|---|---|---|---|
KabirMN.pdf | | 2.6 MB | Adobe PDF | OPEN | None |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.