Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/40535
DC Field: Value
dc.title: Document clustering based on cluster validation
dc.contributor.author: Niu, Z.-Y.
dc.contributor.author: Ji, D.-H.
dc.contributor.author: Tan, C.-L.
dc.date.accessioned: 2013-07-04T08:06:33Z
dc.date.available: 2013-07-04T08:06:33Z
dc.date.issued: 2004
dc.identifier.citation: Niu, Z.-Y., Ji, D.-H., Tan, C.-L. (2004). Document clustering based on cluster validation. International Conference on Information and Knowledge Management, Proceedings: 501-506. ScholarBank@NUS Repository.
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/40535
dc.description.abstract: This paper presents a document clustering algorithm based on cluster validation that can identify both the important feature words and the true model order (cluster number). An important feature subset is selected by optimizing a cluster validity criterion subject to a constraint. To identify the model order, this feature selection procedure is run for each candidate cluster number. The feature subset and cluster number that maximize the cluster validity criterion are chosen as the answer. We have applied our algorithm to several datasets from the 20Newsgroup corpus. Experimental results show that our algorithm can find an important feature subset, estimate the model order, and yield higher micro-averaged precision than four other document clustering algorithms that require the cluster number to be provided. Copyright 2004 ACM.
dc.source: Scopus
dc.subject: Cluster number estimation
dc.subject: Cluster validation
dc.subject: Document clustering
dc.subject: Feature selection
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: International Conference on Information and Knowledge Management, Proceedings
dc.description.page: 501-506
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.