Please use this identifier to cite or link to this item: https://doi.org/10.1016/S0169-023X(02)00138-6
DC Field: Value
dc.title: Fast hierarchical clustering and its validation
dc.contributor.author: Dash, M.
dc.contributor.author: Liu, H.
dc.contributor.author: Scheuermann, P.
dc.contributor.author: Tan, K.L.
dc.date.accessioned: 2013-07-04T07:33:13Z
dc.date.available: 2013-07-04T07:33:13Z
dc.date.issued: 2003
dc.identifier.citation: Dash, M., Liu, H., Scheuermann, P., Tan, K.L. (2003). Fast hierarchical clustering and its validation. Data and Knowledge Engineering 44 (1) : 109-138. ScholarBank@NUS Repository. https://doi.org/10.1016/S0169-023X(02)00138-6
dc.identifier.issn: 0169023X
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/39068
dc.description.abstract: Clustering is the task of grouping similar objects into clusters. A prominent and useful class of algorithm is hierarchical agglomerative clustering (HAC) which iteratively agglomerates the closest pair until all data points belong to one cluster. It outputs a dendrogram showing all N levels of agglomerations where N is the number of objects in the dataset. However, HAC methods have several drawbacks: (1) high time and memory complexities for clustering, and (2) inefficient and inaccurate cluster validation. In this paper we show that these drawbacks can be alleviated by closely studying the dendrogram. Empirical study shows that most HAC algorithms follow a trend where, except for a number of top levels of the dendrogram, all lower levels agglomerate clusters which are very small in size and close in proximity to other clusters. Methods are proposed that exploit this characteristic to reduce the time and memory complexities significantly and to make validation very efficient and accurate. Analyses and experiments show the effectiveness of the proposed method. © 2002 Elsevier Science B.V. All rights reserved.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1016/S0169-023X(02)00138-6
dc.source: Scopus
dc.subject: Clustering
dc.subject: Large and high-dimensional datasets
dc.subject: Validation
dc.subject: Voronoi diagram
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1016/S0169-023X(02)00138-6
dc.description.sourcetitle: Data and Knowledge Engineering
dc.description.volume: 44
dc.description.issue: 1
dc.description.page: 109-138
dc.description.coden: DKENE
dc.identifier.isiut: 000180076700005
Appears in Collections: Staff Publications
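
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of plain hierarchical agglomerative clustering, not the fast method proposed in the paper. It assumes single linkage and Euclidean distance (the abstract does not fix either), and the name naive_hac and the tuple layout of each merge are hypothetical choices made for illustration. It repeatedly merges the closest pair of clusters and records every merge, which is the information a dendrogram displays.

```python
from itertools import combinations
import math


def naive_hac(points):
    """Naive single-linkage HAC (illustrative sketch, not the paper's method).

    Returns one tuple per merge: (cluster_a, cluster_b, distance, merged_size).
    Original points get ids 0..N-1; each merged cluster gets the next free id.
    """
    # Each cluster id maps to the indices of the points it contains.
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest members.
        return min(
            math.dist(points[i], points[j])
            for i in clusters[a]
            for j in clusters[b]
        )

    while len(clusters) > 1:
        # Scan every pair of current clusters and pick the closest one.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        distance = linkage(a, b)
        size = len(clusters[a]) + len(clusters[b])
        merges.append((a, b, distance, size))
        # Replace the two clusters with their union under a fresh id.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges
```

Calling naive_hac on N points yields N - 1 merges. Rescanning all cluster pairs at every step is what makes this naive form expensive in time, which is the kind of cost the abstract lists as a drawback of HAC.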

Files in This Item:
There are no files associated with this item.

Scopus citations: 45 (checked on Nov 20, 2020)
Web of Science citations: 28 (checked on Nov 20, 2020)
Page view(s): 139 (checked on Nov 24, 2020)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.