Please use this identifier to cite or link to this item: https://doi.org/10.1142/S0219649207001755
Title: Using latent semantic indexing to improve the accuracy of document clustering
Authors: Zhan, J.; Loh, H.T.
Keywords: document clustering; information retrieval; latent semantic indexing; text mining; vector space model
Issue Date: 2007
Citation: Zhan, J., Loh, H.T. (2007). Using latent semantic indexing to improve the accuracy of document clustering. Journal of Information and Knowledge Management 6(3): 181-188. ScholarBank@NUS Repository. https://doi.org/10.1142/S0219649207001755
Abstract: Document clustering is a significant research issue in information retrieval and text mining. Traditionally, most clustering methods have been based on the vector space model, which has limitations such as high dimensionality and weakness in handling synonymy and polysemy. Latent semantic indexing (LSI) can address these problems to some extent. Previous studies have shown that LSI can reduce the time needed to cluster a large document set while having little effect on clustering accuracy. However, when clustering a small document set, accuracy matters more than efficiency. In this paper, we demonstrate that LSI can improve the clustering accuracy of a small document set, and we recommend the number of dimensions needed to achieve the best clustering performance. © 2007 iKMS & World Scientific Publishing Co.
Source Title: Journal of Information and Knowledge Management
URI: http://scholarbank.nus.edu.sg/handle/10635/85822
ISSN: 0219-6492
DOI: 10.1142/S0219649207001755
Appears in Collections: Staff Publications



