Please use this identifier to cite or link to this item: https://doi.org/10.5244/C.22.90
DC Field: Value
dc.title: Semi-supervised clustering via learnt codeword distances
dc.contributor.author: Batra D.
dc.contributor.author: Sukthankar R.
dc.contributor.author: Chen T.
dc.date.accessioned: 2018-08-21T05:06:05Z
dc.date.available: 2018-08-21T05:06:05Z
dc.date.issued: 2008
dc.identifier.citation: Batra D., Sukthankar R., Chen T. (2008). Semi-supervised clustering via learnt codeword distances. BMVC 2008 - Proceedings of the British Machine Vision Conference 2008. ScholarBank@NUS Repository. https://doi.org/10.5244/C.22.90
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/146248
dc.description.abstract: This paper focuses on semi-supervised clustering, where the goal is to cluster a set of data-points given a set of similar/dissimilar examples. These examples provide instance-level equivalence/in-equivalence constraints (e.g., similar pairs belong to the same cluster while dissimilar pairs belong to different clusters), but in order to aid final clustering we must propagate them to feature-space level constraints (i.e., how similar are two regions in the feature space?). An increasingly popular approach to accomplish this is by learning distance metrics over the feature space that are guided by the instance-level constraints. Inspired by the success of recent bag-of-words models, we utilize codewords (or visual-words) as building blocks. Our proposed technique learns non-parametric distance metrics over codewords from these equivalence (and optionally, in-equivalence) constraints, which we are then able to propagate back to compute a dissimilarity measure between any two points in the feature space. There are two significant advances over previous work. First, unlike past efforts on global distance metric learning which try to transform the entire feature space so that similar pairs are close, we transform modes in data distribution or pockets of the feature space. This transformation is non-parametric and thus allows arbitrary non-linear deformations of the feature space. Second, while most Mahalanobis metrics are learnt using Semi-Definite Programming (SDP), our proposed solution is developed as a Linear Program (LP) and in practice, is extremely fast. Finally, we provide quantitative analysis on image datasets (MSRC, Corel) where ground-truth segmentation is available, and show that our learnt metrics can significantly improve clustering accuracy.
dc.publisher: British Machine Vision Association, BMVA
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: OFFICE OF THE PROVOST
dc.contributor.department: DEPARTMENT OF COMPUTER SCIENCE
dc.description.doi: 10.5244/C.22.90
dc.description.sourcetitle: BMVC 2008 - Proceedings of the British Machine Vision Conference 2008
dc.published.state: published
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.