Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/241491
DC Field / Value
dc.title: IMPROVING ATTENTION-BASED DEEP LEARNING MODELS WITH LOCALITY
dc.contributor.author: JIANG ZIHANG
dc.date.accessioned: 2023-05-31T18:01:35Z
dc.date.available: 2023-05-31T18:01:35Z
dc.date.issued: 2022-12-19
dc.identifier.citation: JIANG ZIHANG (2022-12-19). IMPROVING ATTENTION-BASED DEEP LEARNING MODELS WITH LOCALITY. ScholarBank@NUS Repository.
dc.identifier.uri: https://scholarbank.nus.edu.sg/handle/10635/241491
dc.description.abstract: Recent attention-based deep learning models adopt the transformer architecture, which uses an attention mechanism to determine which parts of the input the model should focus on. Notably, these models significantly outperform traditional CNN- and RNN-based methods. However, due to the permutation-invariance property of attention, these models suffer from data inefficiency during training. They also suffer from redundancy in the attention mechanism, so their computational cost is generally high. We therefore propose to introduce locality to improve attention-based models. Specifically, we propose a novel token labeling training objective that provides dense local supervision to improve training efficiency. To address the redundancy in the attention mechanism, we design novel, naturally local operators. We introduce two carefully designed modules, outlook attention and span-based dynamic convolution, to improve efficiency and enhance model performance.
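To make the "naturally local operator" idea in the abstract concrete, below is a minimal single-head sketch of an outlook-attention-style module: attention weights for each K×K local window are generated directly from the center token by a linear layer, rather than computed via query-key dot products. All names, shapes, and layout choices here are illustrative assumptions, not the thesis's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OutlookAttention(nn.Module):
    """Single-head sketch: local KxK value aggregation with
    attention weights predicted directly from each center token."""

    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        self.v = nn.Linear(dim, dim)                  # value projection
        self.attn = nn.Linear(dim, kernel_size ** 4)  # KxK weights for each of KxK slots
        self.proj = nn.Linear(dim, dim)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):                             # x: (B, H, W, C)
        B, H, W, C = x.shape
        v = self.v(x).permute(0, 3, 1, 2)             # (B, C, H, W)
        # gather the KxK value neighbourhood of every spatial location
        v = self.unfold(v).reshape(B, C, self.k * self.k, H * W)
        v = v.permute(0, 3, 2, 1)                     # (B, HW, K*K, C)
        # attention weights come straight from the center token (no Q.K^T)
        a = self.attn(x).reshape(B, H * W, self.k * self.k, self.k * self.k)
        a = a.softmax(dim=-1)
        out = (a @ v).permute(0, 3, 2, 1)             # (B, C, K*K, HW)
        # fold sums overlapping windows back onto the HxW grid
        out = F.fold(out.reshape(B, C * self.k * self.k, H * W),
                     output_size=(H, W), kernel_size=self.k,
                     padding=self.k // 2)
        return self.proj(out.permute(0, 2, 3, 1))     # (B, H, W, C)
```

Because the weights are predicted by a single linear layer over each token and applied only within a K×K window, the operator is local by construction, which is the design principle the abstract attributes to both proposed modules.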
dc.language.iso: en
dc.subject: network architecture, deep learning, attention, local, transformer, machine learning
dc.type: Thesis
dc.contributor.department: ELECTRICAL & COMPUTER ENGINEERING
dc.contributor.supervisor: Xinchao Wang
dc.description.degree: Ph.D
dc.description.degreeconferred: DOCTOR OF PHILOSOPHY (CDE-ENG)
dc.identifier.orcid: 0000-0002-8096-842X
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File | Size | Format | Access Settings | Version
JiangZH.pdf | 3.16 MB | Adobe PDF | OPEN | None


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.