|Title:||Improving OCR text categorization accuracy with electronic abstracts|
|Source:||Liniin, L., Chew, L.T. (2006). Improving OCR text categorization accuracy with electronic abstracts. Proceedings - Second International Conference on Document Image Analysis for Libraries, DIAL 2006: 82-87. ScholarBank@NUS Repository. https://doi.org/10.1109/DIAL.2006.22|
|Abstract:||Categorization of imaged documents is a useful technique for building document image based digital libraries. This paper investigates techniques to improve categorization accuracy on OCR text, particularly that of biomedical imaged documents. Experiments with different feature selection methods were run to explore their effect on categorization performance. The results show that document frequency is a good feature selection method for eliminating OCR errors. Furthermore, our categorization scheme IMP, which combines OCR text and electronic abstracts, shows consistent improvement in accuracy compared with categorizing on either abstracts or OCR text alone. © 2006 IEEE.|
|Source Title:||Proceedings - Second International Conference on Document Image Analysis for Libraries, DIAL 2006|
|Appears in Collections:||Staff Publications|