Please use this identifier to cite or link to this item: https://doi.org/10.1145/3123266.3123380
Title: Video Visual Relation Detection
Authors: Xindi Shang, Tongwei Ren, Jingfan Guo, Hanwang Zhang, Tat-Seng Chua
Keywords: Relational association; Video visual relation; Visual relation detection; Visual relation tagging
Issue Date: 23-Oct-2017
Publisher: Association for Computing Machinery, Inc
Citation: Xindi Shang, Tongwei Ren, Jingfan Guo, Hanwang Zhang, Tat-Seng Chua (2017-10-23). Video Visual Relation Detection. ACM Multimedia Conference 2017 : 1300-1308. ScholarBank@NUS Repository. https://doi.org/10.1145/3123266.3123380
Abstract: As a bridge between vision and language, visual relations between objects in the form of relation triplets (subject, predicate, object), such as "person-touch-dog" and "cat-above-sofa", provide a more comprehensive understanding of visual content than objects alone. In this paper, we propose a novel vision task named Video Visual Relation Detection (VidVRD), which performs visual relation detection in videos instead of still images (ImgVRD). Compared to still images, videos provide a more natural set of features for detecting visual relations, including dynamic relations such as "A-follow-B" and "A-towards-B" and temporally changing relations such as "A-chase-B" followed by "A-hold-B". However, VidVRD is technically more challenging than ImgVRD because of the difficulty of accurate object tracking and the diverse appearances of relations in the video domain. To this end, we propose a VidVRD method that consists of object tracklet proposal, short-term relation prediction, and greedy relational association. Moreover, to validate the proposed method, we contribute the first dataset for VidVRD evaluation, which contains 1,000 videos with manually labeled visual relations. On this dataset, our method achieves the best performance in comparison with the state-of-the-art baselines. © 2017 ACM.
Source Title: ACM Multimedia Conference 2017
URI: https://scholarbank.nus.edu.sg/handle/10635/167294
ISBN: 9781450349062
DOI: 10.1145/3123266.3123380
Appears in Collections: Elements; Staff Publications

Files in This Item:
File: Video Visual Relation Detection.pdf
Size: 2.61 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None


