Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/39303
DC Field: Value
dc.title: Clustering web pages about persons and organizations
dc.contributor.author: Ye, S.
dc.contributor.author: Chua, T.-S.
dc.contributor.author: Kei, J.R.
dc.date.accessioned: 2013-07-04T07:38:36Z
dc.date.available: 2013-07-04T07:38:36Z
dc.date.issued: 2005
dc.identifier.citation: Ye, S., Chua, T.-S., Kei, J.R. (2005). Clustering web pages about persons and organizations. Web Intelligence and Agent Systems 3 (4): 203-216. ScholarBank@NUS Repository.
dc.identifier.issn: 15701263
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/39303
dc.description.abstract: One of the most frequent Web surfing tasks is to search for persons and organizations by their names. Such names are often not distinctive, commonly occurring, and non-unique. Thus, a single name may be mapped to several named target entities. This paper describes a new methodology to cluster web pages returned by a search engine so that pages belonging to different entities are clustered into different groups. The algorithm uses a combination of named entities, and link-based and structure-based information as features to partition the document set into direct and indirect pages by means of a decision-tree model. It then chooses the appropriate distinctive direct pages as seeds to cluster the document set into different clusters. The algorithm has been found to be effective for web-based information retrieval applications. © 2005 IOS Press and the authors. All rights reserved.
dc.source: Scopus
dc.subject: Information retrieval
dc.subject: Machine learning
dc.subject: Named entity
dc.subject: Persons and organizations
dc.subject: Text classification
dc.subject: Web clustering
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: Web Intelligence and Agent Systems
dc.description.volume: 3
dc.description.issue: 4
dc.description.page: 203-216
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications
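
The abstract outlines a two-stage idea: separate direct pages from indirect ones, then use distinctive direct pages as seeds for clustering. The following is a minimal Python sketch of that idea only, not the authors' implementation. Several things here are assumptions made for illustration: the `Page` type, the `is_direct` flag (a stand-in for the output of the decision-tree model, which is not reproduced), the use of Jaccard overlap on named-entity sets as the only similarity signal (the paper also mentions link-based and structure-based features), and the `max_sim` threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    entities: frozenset  # named entities found on the page (assumed already extracted)
    is_direct: bool      # hypothetical stand-in for the decision-tree output


def jaccard(a, b):
    """Jaccard overlap between two entity sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_seeds(pages, max_sim=0.3):
    """Greedily keep direct pages that are dissimilar to every seed chosen so far."""
    seeds = []
    for page in pages:
        if page.is_direct and all(
            jaccard(page.entities, seed.entities) <= max_sim for seed in seeds
        ):
            seeds.append(page)
    return seeds


def cluster(pages, max_sim=0.3):
    """Assign every page to the seed whose entity set it overlaps most."""
    seeds = pick_seeds(pages, max_sim)
    if not seeds:
        return {}
    clusters = {seed.url: [] for seed in seeds}
    for page in pages:
        best = max(seeds, key=lambda seed: jaccard(page.entities, seed.entities))
        clusters[best.url].append(page.url)
    return clusters


# Made-up example: two different people who share the name "John Smith".
pages = [
    Page("a.example/home", frozenset({"John Smith", "MIT", "Boston"}), True),
    Page("b.example/home", frozenset({"John Smith", "Sydney Opera", "Sydney"}), True),
    Page("c.example/news", frozenset({"John Smith", "MIT"}), False),
    Page("d.example/blog", frozenset({"John Smith", "Sydney"}), False),
]
result = cluster(pages)
```

In this toy input the two home pages share only the name itself, so both become seeds, and each indirect page joins the seed with which it shares the most entities. A real system would replace `is_direct` with a trained classifier and would likely need a richer similarity measure.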

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.