Please use this identifier to cite or link to this item:
https://doi.org/10.1142/S0219649207001755
Title: Using latent semantic indexing to improve the accuracy of document clustering
Authors: Zhan, J.; Loh, H.T.
Keywords: document clustering, information retrieval, latent semantic indexing, text mining, vector space model
Issue Date: 2007
Citation: Zhan, J., Loh, H.T. (2007). Using latent semantic indexing to improve the accuracy of document clustering. Journal of Information and Knowledge Management 6 (3): 181-188. ScholarBank@NUS Repository. https://doi.org/10.1142/S0219649207001755
Abstract: Document clustering is a significant research issue in information retrieval and text mining. Traditionally, most clustering methods have been based on the vector space model, which has limitations such as high dimensionality and weakness in handling synonymy and polysemy. Latent semantic indexing (LSI) can deal with such problems to some extent. Previous studies have shown that using LSI can reduce the time needed to cluster a large document set while having little effect on clustering accuracy. However, when clustering a small document set, accuracy is of greater concern than efficiency. In this paper, we demonstrate that LSI can improve the clustering accuracy of a small document set, and we also recommend the number of dimensions needed to achieve the best clustering performance. © 2007 iKMS & World Scientific Publishing Co.
Source Title: Journal of Information and Knowledge Management
URI: http://scholarbank.nus.edu.sg/handle/10635/85822
ISSN: 0219-6492
DOI: 10.1142/S0219649207001755
Appears in Collections: Staff Publications
Files in This Item: There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.