Please use this identifier to cite or link to this item:
https://doi.org/10.1109/WACV45572.2020.9093343
DC Field | Value
---|---
dc.title | GradMix: Multi-source transfer across domains and tasks
dc.contributor.author | Li, J
dc.contributor.author | Xu, Z
dc.contributor.author | Wang, Y
dc.contributor.author | Zhao, Q
dc.contributor.author | Kankanhalli, MS
dc.date.accessioned | 2020-08-12T08:48:29Z
dc.date.available | 2020-08-12T08:48:29Z
dc.date.issued | 2020-03-01
dc.identifier.citation | Li, J, Xu, Z, Wang, Y, Zhao, Q, Kankanhalli, MS (2020-03-01). GradMix: Multi-source transfer across domains and tasks. Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020 : 3008-3016. ScholarBank@NUS Repository. https://doi.org/10.1109/WACV45572.2020.9093343
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/172483
dc.description.abstract | © 2020 IEEE. The computer vision community is witnessing an unprecedented rate of new tasks being proposed and addressed, thanks to the capability of deep convolutional networks to find complex mappings from X to Y. The advent of each task is often accompanied by the release of a large-scale annotated dataset for supervised training of deep networks. However, it is expensive and time-consuming to manually label a sufficient amount of training data. Therefore, it is important to develop algorithms that can leverage off-the-shelf labeled datasets to learn useful knowledge for the target task. While previous works mostly focus on transfer learning from a single source, we study multi-source transfer across domains and tasks (MS-DTT) in a semi-supervised setting. We propose GradMix, a model-agnostic method applicable to any model trained with a gradient-based learning rule, which transfers knowledge via gradient descent by weighting and mixing the gradients from all sources during training. GradMix follows a meta-learning objective that assigns layer-wise weights to the source gradients, such that the combined gradient follows the direction that minimizes the loss for a small set of samples from the target dataset. In addition, we propose to adaptively adjust the learning rate for each mini-batch based on its importance to the target task, and a pseudo-labeling method to leverage the unlabeled samples in the target domain. We conduct MS-DTT experiments on two tasks, digit recognition and action recognition, and demonstrate the advantageous performance of the proposed method against multiple baselines.
dc.publisher | IEEE
dc.source | Elements
dc.subject | cs.CV
dc.type | Article
dc.date.updated | 2020-08-12T07:37:21Z
dc.contributor.department | DEPARTMENT OF COMPUTER SCIENCE
dc.description.doi | 10.1109/WACV45572.2020.9093343
dc.description.sourcetitle | Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
dc.description.page | 3008-3016
dc.published.state | Published
Appears in Collections: | Staff Publications Elements |
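The abstract describes GradMix as weighting and mixing gradients from multiple sources so that the combined update direction reduces the loss on a small target set. The toy sketch below illustrates that core idea in NumPy on a linear model; the weighting scheme here (a softmax over source/target gradient alignments), the data, and all names are illustrative assumptions, not the paper's actual meta-learning implementation, which learns layer-wise weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared linear model: predict y = X @ w, with squared-error loss.
def loss(w, X, y):
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def grad(w, X, y):
    r = X @ w - y
    return X.T @ r / len(y)

# Two labeled "source" datasets and a small labeled "target" set
# (all synthetic; in the paper the sources differ in domain/task).
w_true = np.array([1.0, -2.0, 0.5])
def make_data(n, noise):
    X = rng.normal(size=(n, 3))
    return X, X @ w_true + noise * rng.normal(size=n)

Xs1, ys1 = make_data(200, 0.1)
Xs2, ys2 = make_data(200, 0.5)
Xt, yt = make_data(10, 0.1)  # small target set guides the mixing weights

w = np.zeros(3)
lr = 0.1
for step in range(100):
    g1, g2 = grad(w, Xs1, ys1), grad(w, Xs2, ys2)
    gt = grad(w, Xt, yt)
    # Simple stand-in for the meta-objective: weight each source gradient
    # by how well it aligns with the target gradient direction.
    scores = np.array([g1 @ gt, g2 @ gt])
    a = np.exp(scores - scores.max())
    a /= a.sum()
    # Mixed update: a convex combination of the source gradients.
    w -= lr * (a[0] * g1 + a[1] * g2)

print(round(loss(w, Xt, yt), 4))  # target loss after training
```

Because the mixed gradient is steered by the target set, the model converges to a solution with low target loss even though the updates come only from source data; GradMix additionally learns these weights per layer through a meta-learning objective rather than computing them from a fixed alignment rule.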
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
2002.03264v1.pdf | | 1.83 MB | Adobe PDF | OPEN | Post-print
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.