Please use this identifier to cite or link to this item: https://doi.org/10.1023/A:1023245904128
DC Field: Value
dc.title: Text retrieval from document images based on word shape analysis
dc.contributor.author: Tan, C.L.
dc.contributor.author: Huang, W.
dc.contributor.author: Sung, S.Y.
dc.contributor.author: Yu, Z.
dc.contributor.author: Xu, Y.
dc.date.accessioned: 2013-07-04T07:40:51Z
dc.date.available: 2013-07-04T07:40:51Z
dc.date.issued: 2003
dc.identifier.citation: Tan, C.L., Huang, W., Sung, S.Y., Yu, Z., Xu, Y. (2003). Text retrieval from document images based on word shape analysis. Applied Intelligence 18 (3) : 257-270. ScholarBank@NUS Repository. https://doi.org/10.1023/A:1023245904128
dc.identifier.issn: 0924669X
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/39403
dc.description.abstract: In this paper, we propose a method of text retrieval from document images using a similarity measure based on word shape analysis. We directly extract image features instead of using optical character recognition. Document images are segmented into word units and then features called vertical bar patterns are extracted from these word units through local extrema points detection. All vertical bar patterns are used to build document vectors. Lastly, we obtain the pair-wise similarity of document images by means of the scalar product of the document vectors. Four corpora of news articles were used to test the validity of our method. During the test, the similarity of document images using this method was compared with the result of ASCII version of those documents based on the N-gram algorithm for text documents.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1023/A:1023245904128
dc.source: Scopus
dc.subject: Document image analysis
dc.subject: Document vector
dc.subject: Similarity measure
dc.subject: Text retrieval
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1023/A:1023245904128
dc.description.sourcetitle: Applied Intelligence
dc.description.volume: 18
dc.description.issue: 3
dc.description.page: 257-270
dc.description.coden: APITE
dc.identifier.isiut: 000182096600003
Appears in Collections: Staff Publications
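
The abstract in this record says that document vectors are built from vertical bar patterns and that pair-wise similarity is the scalar product of those vectors. The following is only a minimal sketch of that last step, not the authors' implementation. It assumes the patterns have already been extracted from the word images and are represented by hypothetical string labels (`p1`, `p2`, ...). The function names and the unit-length normalization are likewise assumptions, since the abstract does not specify how the vectors are weighted.

```python
from collections import Counter
from math import sqrt


def document_vector(patterns):
    """Build a sparse document vector from a list of pattern labels.

    Each component is the count of one pattern, scaled so the vector has
    unit length. The scaling is an assumption, not taken from the abstract.
    """
    counts = Counter(patterns)
    norm = sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {p: c / norm for p, c in counts.items()}


def similarity(vec_a, vec_b):
    """Scalar product of two sparse document vectors."""
    # Iterate over the shorter vector; patterns missing from the other count as 0.
    if len(vec_a) > len(vec_b):
        vec_a, vec_b = vec_b, vec_a
    return sum(w * vec_b.get(p, 0.0) for p, w in vec_a.items())


# Hypothetical pattern labels standing in for extracted vertical bar patterns.
doc_a = ["p1", "p2", "p2", "p3"]
doc_b = ["p2", "p3", "p3", "p4"]
doc_c = ["p5", "p6"]

vec_a, vec_b, vec_c = (document_vector(d) for d in (doc_a, doc_b, doc_c))

print(similarity(vec_a, vec_b))  # shared patterns give a score between 0 and 1
print(similarity(vec_a, vec_c))  # no shared patterns gives 0
```

With unit-length vectors the scalar product equals the cosine similarity, so two documents with identical pattern distributions score 1 and documents with no patterns in common score 0.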

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.