Please use this identifier to cite or link to this item:
Title: Breaking the small cluster barrier of graph clustering
Authors: Ailon, N.
Chen, Y.
Xu, H. 
Issue Date: 2013
Source: Ailon, N.,Chen, Y.,Xu, H. (2013). Breaking the small cluster barrier of graph clustering. 30th International Conference on Machine Learning, ICML 2013 (PART 3) : 2032-2040. ScholarBank@NUS Repository.
Abstract: This paper investigates graph clustering in the planted cluster model in the presence of small clusters. Traditional results dictate that for an algorithm to provably correctly recover the clusters, all clusters must be sufficiently large (in particular, Ω(√n) where n is the number of nodes of the graph). We show that this is not really a restriction: by a more refined analysis of the trace-norm based matrix recovery approach proposed in Jalali et al. (2011) and Chen et al. (2012), we prove that small clusters, under certain mild assumptions, do not hinder recovery of large ones. Based on this result, we further devise an iterative algorithm to recover almost all clusters via a "peeling strategy", i.e., recover large clusters first, leading to a reduced problem, and repeat this procedure. These resuits are extended to the partial observation setting, in which only a (chosen) part of the graph is observed. The peeling strategy gives rise to an active learning algorithm, in which edges adjacent to smaller clusters are queried more often as large clusters are learned (and removed). From a high level, this paper sheds novel insights on high-dimensional statistics and learning structured data, by presenting a structured matrix learning problem for which a one shot convex relaxation approach necessarily fails, but a carefully constructed sequence of convex relaxations does the job. Copyright 2013 by the author(s).
Source Title: 30th International Conference on Machine Learning, ICML 2013
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Page view(s)

checked on Feb 16, 2018

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.