NIFTI: An evolutionary approach for finding number of clusters in microarray data

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/89584

DC Field	Value
dc.title	NIFTI: An evolutionary approach for finding number of clusters in microarray data
dc.contributor.author	Jonnalagadda, S.
dc.contributor.author	Srinivasan, R.
dc.date.accessioned	2014-10-09T06:55:29Z
dc.date.available	2014-10-09T06:55:29Z
dc.date.issued	2009-01-30
dc.identifier.citation	Jonnalagadda, S., Srinivasan, R. (2009-01-30). NIFTI: An evolutionary approach for finding number of clusters in microarray data. BMC Bioinformatics 10 : -. ScholarBank@NUS Repository.
dc.identifier.issn	14712105
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/89584
dc.description.abstract	Background: Clustering techniques are routinely used in gene expression data analysis to organize the massive data. Clustering techniques arrange a large number of genes or assays into a few clusters while maximizing the intra-cluster similarity and inter-cluster separation. While clustering of genes facilitates learning the functions of uncharacterized genes using their association with known genes, clustering of assays reveals the disease stages and subtypes. Many clustering algorithms require the user to specify the number of clusters a priori. A wrong specification of the number of clusters generally leads to either failure to detect novel clusters (disease subtypes) or unnecessary splitting of natural clusters. Results: We have developed a novel method to find the number of clusters in gene expression data. Our procedure evaluates different partitions (each with a different number of clusters) from the clustering algorithm and finds the partition that best describes the data. In contrast to the existing methods that evaluate the partitions independently, our procedure considers the dynamic rearrangement of cluster members when a new cluster is added. Partition quality is measured based on a new index called Net InFormation Transfer Index (NIFTI) that measures the information change when an additional cluster is introduced. Information content of a partition increases when clusters do not intersect and decreases if they are not clearly separated. A partition with the highest Total Information Content (TIC) is selected as the optimal one. We illustrate our method using four publicly available microarray datasets. Conclusion: In all four case studies, the proposed method correctly identified the number of clusters and performed better than other well-known methods. Our method also showed invariance to the clustering technique used. © 2009 Jonnalagadda and Srinivasan; licensee BioMed Central Ltd.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1186/1471-2105-10-40
dc.source	Scopus
dc.type	Article
dc.contributor.department	CHEMICAL & BIOMOLECULAR ENGINEERING
dc.description.sourcetitle	BMC Bioinformatics
dc.description.volume	10
dc.description.page	-
dc.description.coden	BBMIC
dc.identifier.isiut	000265603300001
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.
