Please use this identifier to cite or link to this item: https://doi.org/10.1109/TCSVT.2011.2130230
Title: Localized multiple kernel learning for realistic human action recognition in videos
Authors: Song, Y.
Zheng, Y.-T.
Tang, S.
Zhou, X.
Zhang, Y.
Lin, S.
Chua, T.-S. 
Keywords: Action recognition
localized classifier
multiple kernel learning
Issue Date: 2011
Source: Song, Y., Zheng, Y.-T., Tang, S., Zhou, X., Zhang, Y., Lin, S., Chua, T.-S. (2011). Localized multiple kernel learning for realistic human action recognition in videos. IEEE Transactions on Circuits and Systems for Video Technology 21 (9) : 1193-1202. ScholarBank@NUS Repository. https://doi.org/10.1109/TCSVT.2011.2130230
Abstract: Realistic human action recognition in videos is a useful yet challenging task. Video shots of the same action may exhibit large intra-class variations in visual appearance, kinetic patterns, video shooting, and editing styles. Heterogeneous feature representations of videos pose a further challenge: how to effectively handle the redundancy, complementarity, and disagreement among these features. This paper proposes a localized multiple kernel learning (L-MKL) algorithm to tackle these issues. L-MKL integrates localized classifier ensemble learning and multiple kernel learning in a unified framework to leverage the strengths of both. The core idea of L-MKL is to build multiple kernel classifiers on diverse features at subspace localities of the heterogeneous representations. L-MKL integrates the discriminability of complementary features locally, enabling each localized MKL classifier to deliver better performance in its own region of expertise. Specifically, L-MKL develops a locality gating model to partition the input space of heterogeneous representations into a set of localities with simpler data structure. Each locality then learns its own optimal combination of Mercer kernels over the heterogeneous features. Finally, the gating model coordinates the localized multiple kernel classifiers globally to perform action recognition. Experiments on two datasets show that the proposed approach delivers promising performance. © 2011 IEEE.
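The abstract outlines a three-stage pipeline: a gating model partitions the input space into localities, each locality learns its own kernel combination, and the gating model routes samples to the appropriate localized classifier. A minimal sketch of that idea is below, under loud assumptions: k-means stands in for the paper's learned gating model, kernel-target alignment stands in for the paper's per-locality kernel-weight optimization, and the toy 2-D data stands in for heterogeneous video features. None of this reproduces the authors' actual method; it only illustrates the gate-then-locally-combine structure.

```python
# Illustrative sketch of the localized-MKL idea from the abstract.
# ASSUMPTIONS: k-means as a surrogate gating model, kernel-target
# alignment as a surrogate for the learned kernel weights, toy data
# in place of video features. Not the authors' algorithm.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def blob(center, n=60):
    # Toy locality: a Gaussian blob whose label depends on the first coordinate.
    pts = rng.normal(center, 0.5, (n, 2))
    return pts, (pts[:, 0] > center[0]).astype(int)

Xa, ya = blob([0.0, 0.0])
Xb, yb = blob([10.0, 10.0])
X, y = np.vstack([Xa, Xb]), np.concatenate([ya, yb])

# Heterogeneous base kernels (stand-ins for kernels on different video features).
kernels = [linear_kernel, lambda A, B: rbf_kernel(A, B, gamma=1.0)]

# 1) Gating model: partition the input space into localities.
gate = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

models = []
for c in range(gate.n_clusters):
    idx = np.where(gate.labels_ == c)[0]
    Xc, yc = X[idx], y[idx]
    # 2) Per-locality kernel weights via kernel-target alignment
    #    (an illustrative heuristic, not the paper's optimization).
    t = np.outer(2.0 * yc - 1.0, 2.0 * yc - 1.0)
    a = np.array([max((k(Xc, Xc) * t).sum()
                      / (np.linalg.norm(k(Xc, Xc)) * np.linalg.norm(t)), 1e-6)
                  for k in kernels])
    w = a / a.sum()
    # 3) Train an SVM on the locally combined, precomputed kernel.
    Kc = sum(wi * k(Xc, Xc) for wi, k in zip(w, kernels))
    models.append((idx, w, SVC(kernel="precomputed").fit(Kc, yc)))

def predict(Xnew):
    # 4) The gating model routes each sample to its locality's classifier.
    out = np.empty(len(Xnew), dtype=int)
    cl = gate.predict(Xnew)
    for c, (idx, w, clf) in enumerate(models):
        sel = np.where(cl == c)[0]
        if sel.size:
            Kt = sum(wi * k(Xnew[sel], X[idx]) for wi, k in zip(w, kernels))
            out[sel] = clf.predict(Kt)
    return out

train_acc = (predict(X) == y).mean()
```

The key design point mirrored here is that kernel weights are learned per locality rather than globally, so each region of the input space can emphasize whichever features are locally most discriminative.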
Source Title: IEEE Transactions on Circuits and Systems for Video Technology
URI: http://scholarbank.nus.edu.sg/handle/10635/39867
ISSN: 1051-8215
DOI: 10.1109/TCSVT.2011.2130230
Appears in Collections:Staff Publications

Files in This Item:
There are no files associated with this item.

Scopus citations: 19 (checked on Dec 11, 2017)
Web of Science citations: 16 (checked on Dec 11, 2017)
Page views: 58 (checked on Dec 9, 2017)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.