Please use this identifier to cite or link to this item: https://doi.org/10.1109/DIAL.2006.22
Title: Improving OCR text categorization accuracy with electronic abstracts
Authors: Linlin, L.
Chew, L.T.
Issue Date: 2006
Source: Linlin, L., Chew, L.T. (2006). Improving OCR text categorization accuracy with electronic abstracts. Proceedings - Second International Conference on Document Image Analysis for Libraries, DIAL 2006: 82-87. ScholarBank@NUS Repository. https://doi.org/10.1109/DIAL.2006.22
Abstract: Categorization of imaged documents is a useful technique for building digital libraries based on document images. This paper investigates techniques to improve categorization accuracy on OCR text, particularly that of imaged biomedical documents. Experiments with different feature selection methods were run to explore their effect on categorization performance. The results show that document frequency is a good feature selection method for eliminating OCR errors. Furthermore, our categorization scheme IMP, which combines OCR text and electronic abstracts, shows consistent improvement in accuracy compared with categorizing on either abstracts or OCR text alone. © 2006 IEEE.
Source Title: Proceedings - Second International Conference on Document Image Analysis for Libraries, DIAL 2006
URI: http://scholarbank.nus.edu.sg/handle/10635/41781
ISBN: 0769525318
DOI: 10.1109/DIAL.2006.22
Appears in Collections: Staff Publications
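
Note on the abstract: it reports that document frequency works well as a feature selection method for eliminating OCR errors. A likely intuition is that misrecognized strings tend to appear in very few documents, so a document-frequency cutoff drops them. The following is a minimal sketch of that general idea only, not the authors' implementation; the function names, the `min_df` threshold, and the toy documents are all hypothetical.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        # set() ensures a term is counted once per document
        df.update(set(tokens))
    return df


def select_by_df(docs, min_df=2):
    """Keep only terms that occur in at least `min_df` documents (assumed cutoff)."""
    df = document_frequency(docs)
    return {term for term, count in df.items() if count >= min_df}


# Toy corpus (hypothetical); "bindmg" simulates an OCR misreading of "binding".
docs = [
    "protein binding in the cell".split(),
    "protein bindmg observed in cell culture".split(),
    "cell protein binding assay".split(),
]

vocab = select_by_df(docs, min_df=2)
print(sorted(vocab))
```

With this toy corpus the simulated OCR error occurs in a single document and falls below the cutoff, as do genuine but rare words such as "assay", which is the usual trade-off of a frequency threshold.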
