Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/39303
DC Field | Value | |
---|---|---|
dc.title | Clustering web pages about persons and organizations | |
dc.contributor.author | Ye, S. | |
dc.contributor.author | Chua, T.-S. | |
dc.contributor.author | Kei, J.R. | |
dc.date.accessioned | 2013-07-04T07:38:36Z | |
dc.date.available | 2013-07-04T07:38:36Z | |
dc.date.issued | 2005 | |
dc.identifier.citation | Ye, S., Chua, T.-S., Kei, J.R. (2005). Clustering web pages about persons and organizations. Web Intelligence and Agent Systems 3 (4): 203-216. ScholarBank@NUS Repository. | |
dc.identifier.issn | 15701263 | |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/39303 | |
dc.description.abstract | One of the most frequent Web surfing tasks is to search for persons and organizations by their names. Such names are often not distinctive, commonly occurring, and non-unique. Thus, a single name may be mapped to several named target entities. This paper describes a new methodology to cluster web pages returned by a search engine so that pages belonging to different entities are clustered into different groups. The algorithm uses a combination of named entities, and link-based and structure-based information as features to partition the document set into direct and indirect pages by means of a decision-tree model. It then chooses the appropriate distinctive direct pages as seeds to cluster the document set into different clusters. The algorithm has been found to be effective for web-based information retrieval applications. © 2005 IOS Press and the authors. All rights reserved. | |
dc.source | Scopus | |
dc.subject | Information retrieval | |
dc.subject | Machine learning | |
dc.subject | Named entity | |
dc.subject | Persons and organizations | |
dc.subject | Text classification | |
dc.subject | Web clustering | |
dc.type | Article | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.description.sourcetitle | Web Intelligence and Agent Systems | |
dc.description.volume | 3 | |
dc.description.issue | 4 | |
dc.description.page | 203-216 | |
dc.identifier.isiut | NOT_IN_WOS | |
Appears in Collections: | Staff Publications |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.