Please use this identifier to cite or link to this item:
https://doi.org/10.1109/ICME.2017.8019376
Title: | GEOGRAPHIC INFORMATION USE IN WEAKLY-SUPERVISED DEEP LEARNING FOR LANDMARK RECOGNITION | Authors: | Yin, Yifang; Liu, Zhenguang; Zimmermann, Roger |
Keywords: | Science & Technology; Technology; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic; Computer Science; Engineering |
Issue Date: | 1-Jan-2017 |
Publisher: | IEEE |
Citation: | Yin, Yifang, Liu, Zhenguang, Zimmermann, Roger (2017-01-01). GEOGRAPHIC INFORMATION USE IN WEAKLY-SUPERVISED DEEP LEARNING FOR LANDMARK RECOGNITION. IEEE International Conference on Multimedia and Expo (ICME) : 1015-1020. ScholarBank@NUS Repository. https://doi.org/10.1109/ICME.2017.8019376 |
Abstract: | Successful deep convolutional neural networks for visual object recognition typically rely on a massive number of training images that have been annotated, at great human effort, with class labels or object bounding boxes. Here we explore the use of geographic metadata, which are automatically retrieved from sensors such as GPS and compass, in weakly-supervised learning techniques for landmark recognition. The visibility of a landmark in a frame can be calculated from the camera's field-of-view and the landmark's geometric information, such as location and height. Subsequently, a training dataset is generated as the union of the frames in which at least one target landmark is present. To reduce the impact of the intrinsic noise in the geo-metadata, we present a frame selection method that removes mistakenly labeled frames with a two-step approach consisting of (1) Gaussian Mixture Model clustering based on camera location, followed by (2) outlier removal based on visual consistency. We compare the classification results obtained from the ground truth labels with those obtained from the noisy labels derived from the raw geo-metadata. Experiments show that training based on the raw geo-metadata achieves a Mean Average Precision (MAP) of 0.797. Moreover, by applying our proposed representative frame selection method, the MAP can be further improved by 6.4%, which indicates the promising use of geo-metadata in weakly-supervised learning techniques.
| Source Title: | IEEE International Conference on Multimedia and Expo (ICME) | URI: | https://scholarbank.nus.edu.sg/handle/10635/200729 | ISBN: | 9781509060672 | ISSN: | 19457871; 1945788X |
DOI: | 10.1109/ICME.2017.8019376 |
Appears in Collections: | Staff Publications Elements |
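
The visibility calculation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a flat, two-dimensional sector model of the camera's field-of-view (position, compass heading, horizontal viewing angle, maximum visible range) and ignores landmark height and occlusion. The function names, parameters, and the equirectangular distance approximation are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, used by the flat approximation


def distance_and_bearing(cam_lat, cam_lon, lm_lat, lm_lon):
    """Return (distance in metres, bearing in degrees clockwise from north).

    Uses an equirectangular approximation, which is adequate only for the
    short camera-to-landmark distances assumed in this sketch.
    """
    lat1, lat2 = math.radians(cam_lat), math.radians(lm_lat)
    d_lon = math.radians(lm_lon - cam_lon)
    d_north = (lat2 - lat1) * EARTH_RADIUS_M
    d_east = d_lon * math.cos((lat1 + lat2) / 2.0) * EARTH_RADIUS_M
    distance = math.hypot(d_east, d_north)
    bearing = math.degrees(math.atan2(d_east, d_north)) % 360.0
    return distance, bearing


def is_landmark_visible(cam_lat, cam_lon, heading_deg, fov_deg,
                        max_range_m, lm_lat, lm_lon):
    """Hypothetical 2D test: is the landmark inside the camera's view sector?"""
    distance, bearing = distance_and_bearing(cam_lat, cam_lon, lm_lat, lm_lon)
    if distance > max_range_m:
        return False
    # Smallest absolute angle between heading and bearing, handling wraparound.
    offset = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
    return offset <= fov_deg / 2.0


def label_frame(frame, landmarks):
    """Return the names of all landmarks visible in one geo-tagged frame.

    `frame` is a dict with keys lat, lon, heading, fov, max_range;
    `landmarks` maps a name to a (lat, lon) tuple. Both layouts are assumed.
    """
    return [
        name for name, (lat, lon) in landmarks.items()
        if is_landmark_visible(frame["lat"], frame["lon"], frame["heading"],
                               frame["fov"], frame["max_range"], lat, lon)
    ]
```

Under these assumptions, a frame would enter the weakly-labeled training set whenever `label_frame` returns a non-empty list. Labels produced this way inherit any GPS and compass noise, which is the problem the frame selection step in the abstract is meant to address.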
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
paper_306.pdf | | 523.32 kB | Adobe PDF | OPEN | Post-print | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.