Please use this identifier to cite or link to this item: https://doi.org/10.18653/v1/P18-1172
Title: Batch IS NOT Heavy: Learning Word Representations From All Samples
Authors: Xin Xin
Fajie Yuan
Xiangnan He
Joemon M. Jose
Issue Date: 20-Jul-2018
Publisher: Association for Computational Linguistics (ACL)
Citation: Xin Xin, Fajie Yuan, Xiangnan He, Joemon M. Jose (2018-07-20). Batch IS NOT Heavy: Learning Word Representations From All Samples. ACL 2018: 1853-1862. ScholarBank@NUS Repository. https://doi.org/10.18653/v1/P18-1172
Abstract: Stochastic Gradient Descent (SGD) with negative sampling is the most prevalent approach for learning word representations. However, sampling methods are known to be biased, especially when the sampling distribution deviates from the true data distribution. Moreover, SGD suffers from dramatic fluctuation due to its one-sample learning scheme. In this work, we propose AllVec, which uses batch gradient learning to generate word representations from all training samples. Remarkably, the time complexity of AllVec remains at the same level as SGD, being determined by the number of positive samples rather than all samples. We evaluate AllVec on several benchmark tasks. Experiments show that AllVec outperforms sampling-based SGD methods with comparable efficiency, especially for small training corpora. © 2018 Association for Computational Linguistics
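The abstract's central efficiency claim is that an exact batch gradient over all |V|^2 word-context pairs can be computed at a cost dominated by the positive pairs alone. The sketch below illustrates one way such a decoupling can work, assuming a weighted squared loss over all pairs; the names (U, V, lam, r0, alpha) are illustrative, and the exact loss, weighting scheme, and update rule used by AllVec are given in the paper.

import numpy as np

# Batch gradient over ALL word-context pairs whose cost is dominated by
# the positive pairs. Assumed (illustrative) loss:
#   L = sum over positives (w,c):  alpha_wc * (u_w . v_c - r_wc)^2
#     + sum over ALL pairs (w,c):  lam_c    * (u_w . v_c - r0)^2
# The naive all-pairs term costs O(|V|^2 k); caching two sufficient
# statistics over the contexts reduces it to O(|V| k^2).

def full_batch_grad_U(U, V, lam, r0, pos_w, pos_c, pos_r, alpha):
    """Exact gradient of L w.r.t. every word vector (no sampling).

    U: |V| x k word vectors;  V: |V| x k context vectors
    lam: per-context weights; r0: target value for unobserved pairs
    pos_w, pos_c, pos_r, alpha: parallel arrays over the positive pairs
    """
    # Cached once per gradient step: O(|V| k^2) and O(|V| k).
    S = (V * lam[:, None]).T @ V        # k x k: sum_c lam_c v_c v_c^T
    s = (V * lam[:, None]).sum(axis=0)  # k:     sum_c lam_c v_c
    # All-pairs part for every word at once: O(|V| k^2), no |V|^2 work.
    grad = 2.0 * (U @ S - r0 * s)
    # Positive-pair part: O(|P| k), one correction per observed pair.
    pred = np.einsum('ij,ij->i', U[pos_w], V[pos_c])
    coef = 2.0 * alpha * (pred - pos_r)
    np.add.at(grad, pos_w, coef[:, None] * V[pos_c])
    return grad

A full implementation would add the symmetric gradient for the context matrix V (caching the same statistics over U), the paper's particular choices of alpha and lam, and a batch optimizer on top of these gradients.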
Source Title: ACL 2018
URI: https://scholarbank.nus.edu.sg/handle/10635/167277
ISBN: 9781948087322
DOI: 10.18653/v1/P18-1172
Appears in Collections: Staff Publications; Elements

Files in This Item:
File: Batch IS NOT Heavy Learning Word Representations From All Samples.pdf
Size: 365.38 kB
Format: Adobe PDF
Access Settings: Open
