Please use this identifier to cite or link to this item: https://doi.org/10.1109/TCSVT.2012.2198129
Title: Multioriented video scene text detection through Bayesian classification and boundary growing
Authors: Shivakumara, P. 
Sreedhar, R.P.
Phan, T.Q. 
Lu, S.
Tan, C.L. 
Keywords: Bayesian classifier
boundary growing
Laplacian-Sobel product (LSP)
maximum gradient difference
multioriented video scene text detection
text candidate detection
Issue Date: 2012
Source: Shivakumara, P., Sreedhar, R.P., Phan, T.Q., Lu, S., Tan, C.L. (2012). Multioriented video scene text detection through Bayesian classification and boundary growing. IEEE Transactions on Circuits and Systems for Video Technology 22 (8): 1227-1235. ScholarBank@NUS Repository. https://doi.org/10.1109/TCSVT.2012.2198129
Abstract: Multioriented text detection in video frames is harder than detecting captions, graphics, or overlaid text, which usually appear horizontally and with high contrast against the background. Multioriented text generally refers to scene text, whose unfavorable characteristics make detection more challenging and interesting. Conventional text detection methods therefore may not give good results for multioriented scene text. Hence, in this paper, we present a new enhancement method based on the product of Laplacian and Sobel operations to enhance text pixels in videos. To classify true text pixels, we propose a Bayesian classifier that does not assume an a priori probability for the input frame but estimates it from three probable matrices, obtained by clustering the output of the enhancement method in three different ways. Text candidates are obtained by intersecting the output of the Bayesian classifier with the Canny edge map of the input frame. A boundary growing method based on the concept of nearest neighbors is introduced to traverse multioriented scene text lines using the text candidates. The robustness of the method has been tested on a variety of datasets, including our own data (nonhorizontal and horizontal text) and two publicly available datasets, namely, the video frames of Hua and the complex scene text (camera images) of the ICDAR 2003 competition. Experimental results show that the performance of the proposed method is encouraging compared with existing methods in terms of recall, precision, F-measure, and computational time. © 2012 IEEE.
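The enhancement step described in the abstract — multiplying the Laplacian response with the Sobel gradient magnitude so that high-contrast stroke transitions are amplified — can be illustrated with a minimal sketch. This is only an interpretation of the Laplacian-Sobel product (LSP) idea from the abstract, not the authors' implementation; the function names and the simple edge-padded convolution are this sketch's own assumptions.

```python
import numpy as np

def conv3x3(img, kernel):
    """3x3 filtering with edge padding, implemented with numpy slicing.
    (This is correlation, but the symmetric/antisymmetric kernels used
    below make no difference to the magnitudes we compute.)"""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * p[i:i + h, j:j + w]
    return out

def laplacian_sobel_product(gray):
    """Sketch of an LSP-style enhancement (assumed form, not the paper's
    exact definition): |Laplacian| * Sobel gradient magnitude, which is
    large at sharp transitions such as text strokes and near zero in
    flat background regions."""
    sx = conv3x3(gray, np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float))
    sy = conv3x3(gray, np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float))
    sobel_mag = np.hypot(sx, sy)
    lap = conv3x3(gray, np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float))
    return np.abs(lap) * sobel_mag
```

On a synthetic frame with a vertical step edge, the response concentrates on the edge and vanishes in flat regions, which is the behavior the abstract attributes to the enhancement before clustering and Bayesian classification.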
Source Title: IEEE Transactions on Circuits and Systems for Video Technology
URI: http://scholarbank.nus.edu.sg/handle/10635/39498
ISSN: 1051-8215
DOI: 10.1109/TCSVT.2012.2198129
Appears in Collections:Staff Publications

Files in This Item:
There are no files associated with this item.

Scopus™ citations: 51 (checked on Dec 11, 2017)
Web of Science™ citations: 37 (checked on Dec 11, 2017)
Page view(s): 71 (checked on Dec 9, 2017)



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.