Language identification in degraded and distorted document images

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/41050

DC Field	Value
dc.title	Language identification in degraded and distorted document images
dc.contributor.author	Lu, S.
dc.contributor.author	Tan, C.L.
dc.contributor.author	Huang, W.
dc.date.accessioned	2013-07-04T08:18:27Z
dc.date.available	2013-07-04T08:18:27Z
dc.date.issued	2006
dc.identifier.citation	Lu, S., Tan, C.L., Huang, W. (2006). Language identification in degraded and distorted document images. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3872 LNCS: 232-242. ScholarBank@NUS Repository.
dc.identifier.isbn	3540321403
dc.identifier.issn	03029743
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/41050
dc.description.abstract	This paper presents a language identification technique that differentiates Latin-based languages in degraded and distorted document images. Unlike previously reported methods, which transform word images through a character shape coding process, our method captures word shapes directly using local extremum points and horizontal intersection numbers, both of which are tolerant of noise, character segmentation errors, and slight skew distortions. For each language studied, a word shape template and a word frequency template are first constructed based on the proposed word shape coding scheme. Identification is then performed using the Bray-Curtis or Hamming distance between the word shape codes of query images and the constructed word shape and frequency templates. Experiments show that the average identification rate across eight Latin-based languages exceeds 99%. © Springer-Verlag Berlin Heidelberg 2006.
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.sourcetitle	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.description.volume	3872 LNCS
dc.description.page	232-242
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications
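
The distance-based matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the shape-code strings, the template layout (a mapping from language to code frequencies), and all function names are hypothetical, and the extraction of word shape codes from images is not shown.

```python
from collections import Counter


def bray_curtis(u, v):
    """Bray-Curtis distance between two equal-length non-negative vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def identify(query_codes, templates, metric="bray_curtis"):
    """Return the language whose template is closest to the query.

    query_codes: list of word shape codes (hypothetical strings).
    templates:   dict mapping language -> {code: relative frequency}
                 (an assumed layout, not taken from the paper).
    """
    counts = Counter(query_codes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("query contains no word shape codes")

    best_lang, best_dist = None, float("inf")
    for lang, template in templates.items():
        # Compare over the union of codes seen in the query and the template.
        vocab = sorted(set(template) | set(counts))
        if metric == "hamming":
            # Presence/absence vectors: does each code occur at all?
            q = [int(c in counts) for c in vocab]
            t = [int(c in template) for c in vocab]
            dist = hamming(q, t)
        else:
            # Relative-frequency vectors.
            q = [counts[c] / total for c in vocab]
            t = [template.get(c, 0.0) for c in vocab]
            dist = bray_curtis(q, t)
        if dist < best_dist:
            best_lang, best_dist = lang, dist
    return best_lang


# Toy templates with made-up codes and frequencies, for illustration only.
templates = {
    "english": {"A1": 0.6, "B2": 0.4},
    "german": {"C3": 0.7, "D4": 0.3},
}
result = identify(["A1", "A1", "B2"], templates)
```

In this sketch the Hamming variant compares only which codes occur, while the Bray-Curtis variant compares how often they occur; how the paper pairs each distance with each template is not specified in this record.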

Files in This Item:

There are no files associated with this item.