Please use this identifier to cite or link to this item: https://doi.org/10.1109/TCSVT.2013.2255396
Title: Gradient vector flow and grouping-based method for arbitrarily oriented scene text detection in video images
Authors: Shivakumara, P.
Phan, T.Q. 
Lu, S.
Tan, C.L. 
Keywords: Arbitrarily oriented text detection
candidate text components (CTC)
dominant text pixel
gradient vector flow (GVF)
text candidates (TC)
text components
Issue Date: 2013
Citation: Shivakumara, P., Phan, T.Q., Lu, S., Tan, C.L. (2013). Gradient vector flow and grouping-based method for arbitrarily oriented scene text detection in video images. IEEE Transactions on Circuits and Systems for Video Technology 23 (10) : 1729-1739. ScholarBank@NUS Repository. https://doi.org/10.1109/TCSVT.2013.2255396
Abstract: Text detection in videos is challenging because of the low resolution and complex backgrounds of video frames. Arbitrary orientation of scene text lines makes the problem harder still. This paper presents a new method that extracts text lines of any orientation based on gradient vector flow (GVF) and neighbor component grouping. The GVF of edge pixels in the Sobel edge map of the input frame is explored to identify the dominant edge pixels, which represent text components. The method extracts the edge components corresponding to dominant pixels in the Sobel edge map, which we call text candidates (TC) of the text lines. We propose two grouping schemes. The first finds nearest neighbors based on geometrical properties of the TC to group broken segments and neighboring characters, which results in word patches. The end and junction points of the skeleton of the word patches are then used to eliminate false positives, which yields the candidate text components (CTC). The second uses the direction and size of the CTC to extract neighboring CTC and to restore missing CTC, which enables arbitrarily oriented text line detection in video frames. Experimental results on different datasets, including arbitrarily oriented text data, nonhorizontal and horizontal text data, Hua's data and ICDAR-03 data (camera images), show that the proposed method outperforms existing methods in terms of recall, precision and F-measure. © 1991-2012 IEEE.
Source Title: IEEE Transactions on Circuits and Systems for Video Technology
URI: http://scholarbank.nus.edu.sg/handle/10635/77864
ISSN: 1051-8215
DOI: 10.1109/TCSVT.2013.2255396
Appears in Collections: Staff Publications
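
The first stage described in the abstract (a Sobel edge map of the frame, followed by the GVF of that edge map) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses the standard GVF diffusion update, and the function names, the smoothing weight `mu`, and the iteration count are assumptions chosen for illustration. The abstract does not state how dominant edge pixels are selected from the GVF field, so that step and the two grouping schemes are left out.

```python
import numpy as np


def sobel_edge_map(gray):
    """Sobel gradient magnitude of a 2-D grayscale array, scaled to [0, 1]."""
    g = np.asarray(gray, dtype=float)
    p = np.pad(g, 1, mode="edge")  # replicate borders so the output keeps the input shape
    # Horizontal response: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    # Vertical response: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def laplacian(a):
    """Five-point Laplacian with replicated borders."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * a


def gradient_vector_flow(edge_map, mu=0.1, iterations=80):
    """Iterate the standard GVF update on an edge map; returns the field (u, v).

    mu and iterations are illustrative values (assumptions), not taken from the paper.
    """
    fy, fx = np.gradient(edge_map)  # np.gradient returns d/drow first, then d/dcol
    mag2 = fx ** 2 + fy ** 2
    u, v = fx.copy(), fy.copy()
    for _ in range(iterations):
        # Diffuse the field where the edge map is flat; stay close to the
        # edge-map gradient where that gradient is strong.
        u = u + mu * laplacian(u) - mag2 * (u - fx)
        v = v + mu * laplacian(v) - mag2 * (v - fy)
    return u, v


if __name__ == "__main__":
    frame = np.zeros((32, 32))
    frame[12:20, 12:20] = 1.0  # synthetic bright block standing in for a text stroke
    edges = sobel_edge_map(frame)
    u, v = gradient_vector_flow(edges)
    print(edges.shape, u.shape, v.shape)
```

With an edge map scaled to [0, 1], keeping `mu` at or below roughly 0.1 keeps this explicit update stable at unit step size. The diffusion is what carries gradient information from edge pixels into flat regions of the frame, where the raw edge-map gradient is zero.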
