Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41523
DC Field: Value
dc.title: Script and language identification in degraded and distorted document images
dc.contributor.author: Lu, S.
dc.contributor.author: Tan, C.L.
dc.date.accessioned: 2013-07-04T08:29:32Z
dc.date.available: 2013-07-04T08:29:32Z
dc.date.issued: 2006
dc.identifier.citation: Lu, S., Tan, C.L. (2006). Script and language identification in degraded and distorted document images. Proceedings of the National Conference on Artificial Intelligence 1 : 769-774. ScholarBank@NUS Repository.
dc.identifier.isbn: 1577352815
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/41523
dc.description.abstract: This paper reports a statistical identification technique that differentiates scripts and languages in degraded and distorted document images. We identify scripts and languages through document vectorization, which transforms each document image into an electronic document vector that characterizes the shape and frequency of the contained character and word images. We first identify scripts based on the density and distribution of vertical runs between character strokes and a vertical scan line. Latin-based languages are then differentiated using a set of word shape codes constructed using horizontal word runs and character extremum points. Experimental results show that our method is tolerant to noise, document degradation, and slight document skew and attains an average identification rate over 95%. Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: Proceedings of the National Conference on Artificial Intelligence
dc.description.volume: 1
dc.description.page: 769-774
dc.description.coden: PNAIE
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.