Comparing keyword extraction techniques for WEBSOM text archives

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/40401

Title:	Comparing keyword extraction techniques for WEBSOM text archives
Authors:	Azcarraga, A.P.; Yap Jr., T.N.
Issue Date:	2001
Citation:	Azcarraga, A.P., Yap Jr., T.N. (2001). Comparing keyword extraction techniques for WEBSOM text archives. Proceedings of the International Conference on Tools with Artificial Intelligence: 187-194. ScholarBank@NUS Repository.
Abstract:	The WEBSOM methodology for building very large text archives has a very slow method for extracting meaningful unit labels. The method is slow because it computes the relative frequencies of all the words in all the documents associated with each unit and then compares these to the relative frequencies of all the words in all the other units of the map. Since maps may have more than 100,000 units and the archive may contain up to 7 million documents, the existing WEBSOM method is not practical. A fast alternative method is based on the distribution of weights in the weight vectors of the trained map, plus a simple manipulation of the random projection matrix used for input data compression. Comparisons made using a WEBSOM archive of the Reuters text collection reveal that a high percentage of keywords extracted using this method match the keywords extracted for the same map units using the original WEBSOM method.
Source Title:	Proceedings of the International Conference on Tools with Artificial Intelligence
URI:	http://scholarbank.nus.edu.sg/handle/10635/40401
ISSN:	1063-6730
Appears in Collections:	Staff Publications
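
The two labeling strategies contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives neither the scoring rule used to compare relative frequencies nor the exact manipulation of the random projection matrix, so both are assumptions here: the first function scores a word by the ratio of its relative frequency in a unit to its relative frequency in all other units pooled, and the second scores a word by projecting a unit's weight vector back through the projection matrix. All function names, parameters, and the toy data are hypothetical.

```python
from collections import Counter


def relative_frequencies(docs):
    """Relative frequency of each word over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def labels_by_frequency_comparison(unit_docs, top_k=1, eps=1e-6):
    """Frequency-comparison labeling (scoring rule is an assumption).

    unit_docs maps a unit id to the tokenized documents assigned to it.
    Each unit is compared against every other unit, which is the step
    the abstract identifies as the source of the slowness.
    """
    labels = {}
    for unit, docs in unit_docs.items():
        own = relative_frequencies(docs)
        other_docs = [d for u, ds in unit_docs.items() if u != unit for d in ds]
        other = relative_frequencies(other_docs)
        # Assumed score: frequency in this unit over frequency elsewhere.
        scores = {w: f / (other.get(w, 0.0) + eps) for w, f in own.items()}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        labels[unit] = ranked[:top_k]
    return labels


def labels_by_back_projection(weights, projection, vocabulary, top_k=1):
    """Weight-based labeling (one plausible reading, an assumption).

    weights is a unit's weight vector in the compressed space (length d),
    projection is a d x V matrix given as a list of rows, and vocabulary
    lists the V words. Each word is scored by the dot product of its
    projection column with the weight vector; no documents are scanned.
    """
    scores = [
        sum(projection[i][j] * weights[i] for i in range(len(weights)))
        for j in range(len(vocabulary))
    ]
    ranked = sorted(range(len(vocabulary)), key=lambda j: (-scores[j], vocabulary[j]))
    return [vocabulary[j] for j in ranked[:top_k]]


# Toy data, invented purely for illustration.
unit_docs = {
    0: [["oil", "price", "oil"], ["oil", "market"]],
    1: [["bank", "rate", "market"], ["bank", "price"]],
}
frequency_labels = labels_by_frequency_comparison(unit_docs, top_k=1)

vocabulary = ["oil", "bank", "market"]
projection = [[1, 0, 1], [0, 1, -1]]  # toy 2 x 3 matrix, not a real random projection
weight_label = labels_by_back_projection([0.9, 0.1], projection, vocabulary)
```

In this sketch the first function revisits the documents of every other unit for each unit it labels, so its cost grows with both map size and archive size, while the second touches only one weight vector and the projection matrix per unit.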

There are no files associated with this item.
