Title: A comprehensive validity index for clustering
Authors: Saitta, S.
Raphael, B. 
Smith, I.F.C.
Keywords: Clustering
Number of clusters
Validity index
Issue Date: 2008
Citation: Saitta, S., Raphael, B., Smith, I.F.C. (2008). A comprehensive validity index for clustering. Intelligent Data Analysis 12 (6): 529-548. ScholarBank@NUS Repository.
Abstract: Cluster validity indices are used both for estimating the quality of a clustering algorithm and for determining the correct number of clusters in data. Even though several indices exist in the literature, most of them are only relevant for data sets that contain at least two clusters. This paper introduces a new bounded index for cluster validity called the score function (SF), a double exponential expression that is based on a ratio of standard cluster parameters. Several artificial and real-life data sets are used to evaluate the performance of the score function. These data sets contain a range of features and patterns such as unbalanced, overlapped and noisy clusters. In addition, cases involving sub-clusters and perfect clusters are tested. The score function is tested against six previously proposed validity indices. In the case of hyper-spheroidal clusters, the index proposed in this paper is found to be always as good as or better than these indices. In addition, it is shown to work well on multidimensional and noisy data sets. One of its advantages is the ability to handle the single-cluster case and sub-cluster hierarchies. © 2008 IOS Press. All rights reserved.
Source Title: Intelligent Data Analysis
ISSN: 1088-467X
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.