Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/241491
DC Field | Value
---|---
dc.title | IMPROVING ATTENTION-BASED DEEP LEARNING MODELS WITH LOCALITY
dc.contributor.author | JIANG ZIHANG
dc.date.accessioned | 2023-05-31T18:01:35Z
dc.date.available | 2023-05-31T18:01:35Z
dc.date.issued | 2022-12-19
dc.identifier.citation | JIANG ZIHANG (2022-12-19). IMPROVING ATTENTION-BASED DEEP LEARNING MODELS WITH LOCALITY. ScholarBank@NUS Repository.
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/241491
dc.description.abstract | Recent attention-based deep learning models adopt the transformer architecture, which uses an attention mechanism to determine which parts of the input the model should focus on. Notably, these models significantly outperform traditional CNN- and RNN-based methods. However, owing to the permutation-invariance property of attention, these models are data-inefficient during training. They also suffer from redundancy in the attention mechanism, so their computation cost is generally very high. We therefore propose to introduce locality to improve attention-based models. Specifically, we propose a novel token labeling training objective that strengthens local supervision and improves training efficiency. To address the redundancy in the attention mechanism, we design novel, naturally local operators: two carefully designed modules, outlook attention (a minimal sketch appears after this record) and span-based dynamic convolution, which improve efficiency and enhance model performance.
dc.language.iso | en
dc.subject | network architecture, deep learning, attention, local, transformer, machine learning
dc.type | Thesis
dc.contributor.department | ELECTRICAL & COMPUTER ENGINEERING
dc.contributor.supervisor | Xinchao Wang
dc.description.degree | Ph.D
dc.description.degreeconferred | DOCTOR OF PHILOSOPHY (CDE-ENG)
dc.identifier.orcid | 0000-0002-8096-842X
Appears in Collections: | Ph.D Theses (Open)
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
JiangZH.pdf | | 3.16 MB | Adobe PDF | OPEN | None
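
The abstract above names outlook attention as one of the locality-aware modules. The following is a minimal PyTorch-style sketch of the general idea only, not the thesis implementation: the class name, default hyperparameters, the stride-1 setting, and the pre-softmax scaling are illustrative assumptions. The core mechanism shown is that each spatial token predicts attention weights over its K×K neighbourhood directly from its own features (no query-key dot product) and uses them to aggregate the neighbourhood values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OutlookAttention(nn.Module):
    """Simplified outlook-attention sketch (stride 1, no dropout).

    Each token predicts a (K*K x K*K) attention map per head from its own
    features and uses it to aggregate the values of its K x K neighbourhood.
    Hyperparameters and naming here are illustrative, not from the thesis.
    """
    def __init__(self, dim, num_heads=2, kernel_size=3):
        super().__init__()
        self.num_heads = num_heads
        self.kernel_size = kernel_size
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.v = nn.Linear(dim, dim)
        self.attn = nn.Linear(dim, num_heads * kernel_size ** 4)
        self.proj = nn.Linear(dim, dim)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2, stride=1)

    def forward(self, x):                        # x: (B, H, W, C)
        B, H, W, C = x.shape
        K = self.kernel_size
        v = self.v(x).permute(0, 3, 1, 2)        # (B, C, H, W)
        # Gather the K x K neighbourhood values for every spatial location.
        v = self.unfold(v).reshape(B, self.num_heads, self.head_dim, K * K, H * W)
        v = v.permute(0, 1, 4, 3, 2)             # (B, heads, HW, K*K, head_dim)
        # Predict local attention weights directly from the centre token.
        attn = self.attn(x).reshape(B, H * W, self.num_heads, K * K, K * K)
        attn = attn.permute(0, 2, 1, 3, 4)       # (B, heads, HW, K*K, K*K)
        attn = (attn * self.scale).softmax(dim=-1)
        out = attn @ v                           # weighted sum over the neighbourhood
        # Fold the per-location K x K outputs back onto the H x W grid.
        out = out.permute(0, 1, 4, 3, 2).reshape(B, C * K * K, H * W)
        out = F.fold(out, output_size=(H, W), kernel_size=K,
                     padding=K // 2, stride=1)
        return self.proj(out.permute(0, 2, 3, 1))  # (B, H, W, C)
```

As a quick shape check under these assumptions, `OutlookAttention(dim=64)(torch.randn(2, 14, 14, 64))` returns a tensor of shape `(2, 14, 14, 64)`; the locality comes from restricting aggregation to each token's K×K window rather than attending over all tokens.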