Extracting meaningful labels for WEBSOM text archives | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/40405

DC Field	Value
dc.title	Extracting meaningful labels for WEBSOM text archives
dc.contributor.author	Azcarraga, A.P.
dc.contributor.author	Yap Jr., T.N.
dc.date.accessioned	2013-07-04T08:03:33Z
dc.date.available	2013-07-04T08:03:33Z
dc.date.issued	2001
dc.identifier.citation	Azcarraga, A.P., Yap Jr., T.N. (2001). Extracting meaningful labels for WEBSOM text archives. International Conference on Information and Knowledge Management, Proceedings: 41-48. ScholarBank@NUS Repository.
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/40405
dc.description.abstract	Self-Organizing Maps, being used mainly with data that are not pre-labeled, need automatic procedures for extracting keywords as labels for each of the map units. The WEBSOM methodology for building very large text archives has a very slow method for extracting such unit labels. It computes the relative frequencies of all the words of all the documents associated with each unit and then compares these to the relative frequencies of all the words of all the other units of the map. Since maps may have more than 100,000 units and the archive may contain up to 7 million documents, the existing WEBSOM method is not practical. This paper describes how meaningful labels for each map unit can be deduced by analyzing the relative weight distribution of the SOM weight vectors and by taking advantage of some characteristics of the random projection method used in dimensionality reduction. The effectiveness of this technique is demonstrated on archives of the well-studied Reuters and CNN collections. Comparisons with the WEBSOM method are provided.
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.sourcetitle	International Conference on Information and Knowledge Management, Proceedings
dc.description.page	41-48
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications
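
The frequency-comparison labeling that the abstract attributes to the existing WEBSOM method can be illustrated with a minimal sketch. This is not the authors' or WEBSOM's implementation: the function names, the `unit_docs` layout, and in particular the scoring formula (a word scores high when it is frequent in one unit and rare in all the others) are assumptions made for illustration, since the abstract does not state the exact formula.

```python
from collections import Counter


def relative_frequencies(docs):
    """Relative frequency of each word over all documents mapped to one unit."""
    counts = Counter(word for doc in docs for word in doc.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def label_units(unit_docs, top_k=1):
    """Pick top_k label words per map unit.

    unit_docs: dict mapping a unit id to a list of document strings
    (a hypothetical layout chosen for this sketch).
    """
    freqs = {u: relative_frequencies(docs) for u, docs in unit_docs.items()}
    labels = {}
    for u, f in freqs.items():
        scores = {}
        for w, p in f.items():
            # Relative frequency of the same word summed over all other units.
            elsewhere = sum(g.get(w, 0.0) for v, g in freqs.items() if v != u)
            # Assumed score: high when frequent in this unit, rare elsewhere.
            scores[w] = p * p / (p + elsewhere)
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        labels[u] = ranked[:top_k]
    return labels


if __name__ == "__main__":
    toy = {
        0: ["oil prices rise", "oil exports fall"],
        1: ["election results", "election polls rise"],
    }
    print(label_units(toy))
```

In this sketch every word of every unit is compared against every other unit, so the work grows roughly with the square of the number of units, which suggests why such a comparison becomes costly on maps of the size the abstract mentions.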


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.