Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICME.2017.8019376
Title: GEOGRAPHIC INFORMATION USE IN WEAKLY-SUPERVISED DEEP LEARNING FOR LANDMARK RECOGNITION
Authors: Yin, Yifang 
Liu, Zhenguang
Zimmermann, Roger
Keywords: Science & Technology
Technology
Computer Science, Software Engineering
Computer Science, Theory & Methods
Engineering, Electrical & Electronic
Computer Science
Engineering
Issue Date: 1-Jan-2017
Publisher: IEEE
Citation: Yin, Yifang, Liu, Zhenguang, Zimmermann, Roger (2017-01-01). GEOGRAPHIC INFORMATION USE IN WEAKLY-SUPERVISED DEEP LEARNING FOR LANDMARK RECOGNITION. IEEE International Conference on Multimedia and Expo (ICME) : 1015-1020. ScholarBank@NUS Repository. https://doi.org/10.1109/ICME.2017.8019376
Abstract: Successful deep convolutional neural networks for visual object recognition typically rely on a massive number of training images that have been annotated with class labels or object bounding boxes at great human effort. Here we explore the use of geographic metadata, automatically retrieved from sensors such as GPS and compass, in weakly-supervised learning techniques for landmark recognition. The visibility of a landmark in a frame can be calculated from the camera's field-of-view and the landmark's geometric information, such as location and height. Subsequently, a training dataset is generated as the union of the frames in which at least one target landmark is present. To reduce the impact of the intrinsic noise in the geo-metadata, we present a frame selection method that removes mistakenly labeled frames with a two-step approach consisting of (1) Gaussian Mixture Model clustering based on camera location, followed by (2) outlier removal based on visual consistency. We compare the classification results obtained with the ground-truth labels against those obtained with the noisy labels derived from the raw geo-metadata. Experiments show that training on the raw geo-metadata achieves a Mean Average Precision (MAP) of 0.797. Moreover, applying our proposed representative frame selection method improves the MAP by a further 6.4%, which indicates the promise of geo-metadata in weakly-supervised learning techniques.
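Illustrative note: the visibility calculation mentioned in the abstract can be sketched roughly as follows. This is a minimal 2D approximation, not the authors' implementation. It assumes a camera described by latitude, longitude, and compass heading, and it ignores landmark height and occlusion, both of which the paper takes into account. The function names, the 60-degree field of view, and the 500 m range limit are hypothetical choices made for illustration only.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0


def is_visible(cam_lat, cam_lon, heading_deg, lm_lat, lm_lon,
               fov_deg=60.0, max_range_m=500.0):
    """Return True if the landmark lies inside the camera's horizontal
    field-of-view sector. Height and occlusion are NOT modelled here."""
    if haversine_m(cam_lat, cam_lon, lm_lat, lm_lon) > max_range_m:
        return False
    # Signed angular offset between landmark bearing and heading, in [-180, 180).
    offset = (bearing_deg(cam_lat, cam_lon, lm_lat, lm_lon) - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


# A landmark roughly 111 m due north of the camera.
print(is_visible(1.0, 103.0, 0.0, 1.001, 103.0))    # camera facing north
print(is_visible(1.0, 103.0, 180.0, 1.001, 103.0))  # camera facing south
```

Under these assumptions, a frame would be weakly labeled with every landmark for which is_visible returns True. Noise in the GPS and compass readings is what the frame selection step described in the abstract is meant to filter out.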
Source Title: IEEE International Conference on Multimedia and Expo (ICME)
URI: https://scholarbank.nus.edu.sg/handle/10635/200729
ISBN: 9781509060672
ISSN: 1945-7871
1945-788X
DOI: 10.1109/ICME.2017.8019376
Appears in Collections: Staff Publications
Elements

Files in This Item:
File: paper_306.pdf
Size: 523.32 kB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.