Title: Building descriptive and discriminative visual codebook for large-scale image applications
Authors: Tian, Q.
Zhang, S.
Zhou, W.
Ji, R.
Ni, B. 
Sebe, N.
Keywords: Feature space quantization
Image search re-ranking
Large-scale image retrieval
Visual vocabulary
Issue Date: Jan-2011
Citation: Tian, Q., Zhang, S., Zhou, W., Ji, R., Ni, B., Sebe, N. (2011-01). Building descriptive and discriminative visual codebook for large-scale image applications. Multimedia Tools and Applications 51 (2) : 441-477. ScholarBank@NUS Repository.
Abstract: Inspired by the success of textual words in large-scale textual information processing, researchers are trying to extract visual words from images that function similarly to textual words. Visual words are commonly generated by clustering a large number of local image features, with the cluster centers taken as visual words. This approach is simple and scalable, but it results in noisy visual words. Many works have been reported that try to improve the descriptive and discriminative ability of visual words. This paper gives a comprehensive survey of visual vocabulary and details several state-of-the-art algorithms. A comprehensive review and summary of related work on visual vocabulary is presented first. Then, we introduce our recent algorithms for descriptive and discriminative visual word generation, i.e., latent visual context analysis for descriptive visual word identification [74], descriptive visual word and visual phrase generation [68], a contextual visual vocabulary that combines both semantic and spatial contexts [69], and visual vocabulary hierarchy optimization [18]. Additionally, we introduce two post-processing strategies to further improve the performance of visual vocabulary: spatial coding [73], which efficiently removes mismatched visual words between images for more reasonable image similarity computation, and user-preference-based visual word weighting [44], which makes the image similarity computed from visual words more consistent with users' preferences or habits. © 2010 Springer Science+Business Media, LLC.
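Illustration: the baseline pipeline mentioned in the abstract (cluster local features, take the cluster centers as visual words) might be sketched as follows. This is a minimal sketch of plain k-means, not the algorithms proposed in the paper. The function names, the toy 2-D descriptors, and the evenly spaced initialization are assumptions made for this example; real systems typically cluster high-dimensional descriptors with random or k-means++ initialization.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to point (ties go to the lower index)."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def build_codebook(descriptors, k, iterations=20):
    """Plain k-means; the returned centers play the role of visual words.

    Assumption for reproducibility: initial centers are taken at evenly
    spaced positions in the input rather than sampled at random.
    """
    step = len(descriptors) // k
    centers = [list(descriptors[i * step]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in descriptors:
            clusters[nearest_center(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bag_of_visual_words(image_descriptors, centers):
    """Quantize each descriptor to its nearest visual word and count them."""
    counts = Counter(nearest_center(p, centers) for p in image_descriptors)
    return [counts.get(i, 0) for i in range(len(centers))]


# Toy data (hypothetical): two well-separated groups of 2-D "descriptors".
descriptors = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
               (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
codebook = build_codebook(descriptors, k=2)
histogram = bag_of_visual_words([(0.05, 0.05), (4.9, 5.0), (5.2, 5.1)], codebook)
print(histogram)  # -> [1, 2]
```

The histogram is the image's bag-of-visual-words representation. Because every descriptor is forced onto its nearest center regardless of how well it fits, this quantization step is one plausible source of the noisy visual words the abstract refers to.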
Source Title: Multimedia Tools and Applications
ISSN: 13807501
DOI: 10.1007/s11042-010-0636-6
Appears in Collections: Staff Publications



