Please use this identifier to cite or link to this item: https://doi.org/10.1109/CIS.2013.93
Title: Inducing word senses for cross-lingual document clustering
Authors: Tang, G.
Xia, Y.
Cambria, E.
Jin, P.
Keywords: Cross-lingual document clustering
Cross-lingual document representation
Word sense
Issue Date: 2013
Citation: Tang, G., Xia, Y., Cambria, E., Jin, P. (2013). Inducing word senses for cross-lingual document clustering. Proceedings - 9th International Conference on Computational Intelligence and Security, CIS 2013: 409-414. ScholarBank@NUS Repository. https://doi.org/10.1109/CIS.2013.93
Abstract: Cross-lingual document clustering is the task of automatically organizing a large collection of cross-lingual documents into a few groups according to their content or topic. It is well known that the language barrier and translation ambiguity are two challenging issues for cross-lingual document representation. To address these issues, we propose to represent cross-lingual documents through statistical word senses, which are learned from a parallel corpus by means of a novel cross-lingual word sense induction model. Furthermore, a sense clustering method is adopted to discover semantic relations among word senses, which are then used to represent cross-lingual documents in a sense-based vector space model. Evaluation on a benchmark dataset shows that the proposed model outperforms two state-of-the-art models in cross-lingual document clustering. © 2013 IEEE.
Source Title: Proceedings - 9th International Conference on Computational Intelligence and Security, CIS 2013
URI: http://scholarbank.nus.edu.sg/handle/10635/128923
ISBN: 9781479925483
DOI: 10.1109/CIS.2013.93
Appears in Collections: Staff Publications
