Efficient maximum appearance search for large-scale object detection | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1109/CVPR.2013.410

Title:	Efficient maximum appearance search for large-scale object detection
Authors:	Chen, Q.; Song, Z.; Feris, R.; Datta, A.; Cao, L.; Huang, Z.; Yan, S.
Issue Date:	2013
Citation:	Chen, Q., Song, Z., Feris, R., Datta, A., Cao, L., Huang, Z., Yan, S. (2013). Efficient maximum appearance search for large-scale object detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition : 3190-3197. ScholarBank@NUS Repository. https://doi.org/10.1109/CVPR.2013.410
Abstract:	In recent years, efficiency of large-scale object detection has arisen as an important topic due to the exponential growth in the size of benchmark object detection datasets. Most current object detection methods focus on improving accuracy of large-scale object detection, with efficiency being an afterthought. In this paper, we present the Efficient Maximum Appearance Search (EMAS) model, which is an order of magnitude faster than the existing state-of-the-art large-scale object detection approaches while maintaining comparable accuracy. Our EMAS model represents an image as an ensemble of densely sampled feature points with the proposed Point-wise Fisher Vector encoding method, so that the learnt discriminative scoring function can be applied locally. Consequently, the object detection problem is transformed into searching an image sub-area for maximum local appearance probability, thereby making EMAS an order of magnitude faster than traditional detection methods. In addition, the proposed model is also suitable for incorporating global context at a negligible extra computational cost. EMAS can also incorporate fusion of multiple features, which greatly improves its performance in detecting multiple object categories. Our experiments show that the proposed algorithm can perform detection of 1000 object classes in less than one minute per image on the ImageNet ILSVRC2012 dataset and of 107 object classes in less than 5 seconds per image on the SUN09 dataset, using a single CPU. © 2013 IEEE.
Source Title:	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
URI:	http://scholarbank.nus.edu.sg/handle/10635/70108
ISSN:	1063-6919
DOI:	10.1109/CVPR.2013.410
Appears in Collections:	Staff Publications
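
The abstract describes detection as a search for the image sub-area with maximum local appearance score. The abstract does not specify the search procedure, so the following is only a minimal sketch of that general idea, not the authors' implementation: it assumes a hypothetical 2-D array `score_map` of per-point scores (positive where a point looks like the object, negative otherwise) and finds the axis-aligned box with the largest summed score using a plain 2-D extension of Kadane's maximum-subarray algorithm. The function name `max_score_box` is an illustrative assumption.

```python
import numpy as np


def max_score_box(score_map):
    """Return (best_sum, (top, left, bottom, right)) for the axis-aligned
    sub-rectangle with the largest summed score. Bounds are inclusive.

    Illustrative only: `score_map` is a hypothetical grid of per-point
    scores, and this exhaustive row-pair search is an assumption, not the
    search used in the paper.
    """
    scores = np.asarray(score_map, dtype=float)
    rows, cols = scores.shape
    best_sum = -np.inf
    best_box = None
    for top in range(rows):
        # Column-wise sums of the rows between `top` and `bottom`.
        col_sums = np.zeros(cols)
        for bottom in range(top, rows):
            col_sums += scores[bottom]
            # 1-D Kadane scan over the collapsed columns.
            run_sum, run_start = 0.0, 0
            for c in range(cols):
                if run_sum <= 0:
                    # A non-positive prefix cannot help; restart here.
                    run_sum, run_start = col_sums[c], c
                else:
                    run_sum += col_sums[c]
                if run_sum > best_sum:
                    best_sum = run_sum
                    best_box = (top, run_start, bottom, int(c))
    return float(best_sum), best_box


if __name__ == "__main__":
    # Toy score map: background points score -1, a 2x3 "object" scores +2.
    demo = np.full((5, 6), -1.0)
    demo[1:3, 2:5] = 2.0
    print(max_score_box(demo))
```

Because the per-point scores are computed once and only summed during the search, evaluating a candidate box needs no re-encoding of its contents, which is the property the abstract credits for the speed-up. This sketch runs in O(rows² · cols) time and makes no claim about matching the reported timings.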

