Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/180254
Title: KNOWLEDGE ACQUISITION FOR MEDICAL EXPERT SYSTEM
Authors: LEE HIAN BENG
Issue Date: 1999
Citation: LEE HIAN BENG (1999). KNOWLEDGE ACQUISITION FOR MEDICAL EXPERT SYSTEM. ScholarBank@NUS Repository.
Abstract: Knowledge acquisition is one of the most difficult tasks in the development of expert systems. Both domain experts and knowledge engineers are needed to build rules for expert systems, so it is highly desirable to extract knowledge automatically from books with little human assistance. Such a project is formidable in its complexity and diversity, but achievable in a limited medical domain. The system uses a smart optical character recognition (OCR) module to extract the text, followed by a natural language processing (NLP) module to understand the text, and finally a generator to create the rules for the expert system. Such a system must have an accurate OCR front end. At the low level, where recognition is done character by character, performance can never match that of a human reader; high-level context must be used to enhance the recognition system. A contextual processor provides dynamic feedback to improve recognition accuracy in OCR. It uses natural language (NL) rules to generate possible candidates for the input word(s) and to filter away unlikely ones, so that the most likely word is selected in the end. The words already recognized in a given sentence widen the context and make the processing more effective. The statistical classifier uses this contextual feedback at run time to adjust itself, improving the character recognition rate from 97.2% to 99.3% and drastically cutting the error rate.
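The contextual re-ranking idea described above can be sketched as follows. This is an illustrative reconstruction, not the thesis's actual implementation: the candidate words, confidence scores, and bigram probabilities are all invented for the example.

```python
# Hypothetical sketch of contextual feedback in OCR: the character-level
# classifier proposes candidate words with confidence scores, and a
# contextual processor re-ranks them using the word already recognized
# before them in the sentence. All names and numbers are illustrative.

# Toy bigram model P(word | previous word), hand-written here rather
# than estimated from a real corpus.
BIGRAM = {
    ("the", "patient"): 0.30,
    ("the", "patent"): 0.01,
}

def rerank(prev_word, candidates):
    """Pick the candidate maximizing OCR confidence * contextual probability.

    candidates: list of (word, ocr_confidence) pairs from the classifier.
    """
    def score(item):
        word, conf = item
        context = BIGRAM.get((prev_word, word), 1e-6)  # smoothing floor
        return conf * context
    return max(candidates, key=score)[0]

# The raw classifier slightly prefers a misrecognition ("patent"),
# but the preceding context corrects it.
candidates = [("patent", 0.50), ("patient", 0.45), ("pat1ent", 0.05)]
print(rerank("the", candidates))  # -> patient
```

Scoring by the product of classifier confidence and contextual probability is one simple way to let context override a marginally higher raw score while still rejecting candidates the classifier considers implausible.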
The effort to bring forward a portion of the NL rules into the contextual processing is small, since a full-fledged NL processor has to be implemented anyway. The building of both the contextual processor and the NL processor is supported by a machine-readable dictionary that provides syntactic and semantic information. Besides conventional lexical information, the dictionary provides compact codes representing selectional restrictions and subject domain, which are used by the contextual processor. Together with information derived from interpreting the definition text of words in a bootstrapping process, the program can generate semantic frames to form the knowledge base. For NLP, parsing is based on an augmented phrase structure grammar, which allows the syntactic analysis to support semantic interpretation. The syntactic analyzer uses a fast deterministic LALR(1) parsing scheme generated by YACC. By subdividing token classes and using lexical lookahead, LR parsing conflicts can be tackled. On written medical text the grammar achieves about 81% coverage, because medical text is fairly precise in its exposition and has fewer literary constructions. To allow the parser to proceed and to produce a single parse tree, part-of-speech (POS) ambiguity must be resolved; this is achieved with a statistical POS tagger. Word sense ambiguities are resolved using syntactic and semantic constraints; alternatively, exemplar-based and probabilistic sense disambiguation using tagged corpora can produce good results.
URI: https://scholarbank.nus.edu.sg/handle/10635/180254
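A minimal sketch of the statistical POS-tagging step described above, under invented probabilities: the tagger commits to a single tag per word (so the deterministic LALR(1) parser sees an unambiguous token stream) by greedily combining a tag-transition score with a lexical score. The words, tags, and numbers here are illustrative, not taken from the thesis.

```python
# Illustrative bigram POS tagger (not the thesis's actual tagger):
# it resolves lexical ambiguity such as "treats" (verb vs. noun) so
# that each word enters the parser with exactly one tag.
# All probabilities below are invented for the example.

TAG_BIGRAM = {("DET", "NOUN"): 0.6, ("DET", "VERB"): 0.05,
              ("NOUN", "VERB"): 0.4, ("NOUN", "NOUN"): 0.3}
LEXICON = {"the": {"DET": 1.0},
           "doctor": {"NOUN": 1.0},
           "treats": {"VERB": 0.9, "NOUN": 0.1}}

def tag(words):
    """Greedy bigram tagging: for each word, choose the tag maximizing
    transition probability * lexical probability given the tag already
    chosen for the previous word (a simplification of Viterbi decoding)."""
    prev, out = "<s>", []
    for w in words:
        best = max(LEXICON[w],
                   key=lambda t: TAG_BIGRAM.get((prev, t), 0.1) * LEXICON[w][t])
        out.append(best)
        prev = best
    return out

print(tag(["the", "doctor", "treats"]))  # -> ['DET', 'NOUN', 'VERB']
```

A full tagger would use Viterbi decoding over the whole sentence rather than this greedy left-to-right choice, but the effect is the same: the parser receives one tag per word and can build a single parse tree.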
Appears in Collections: Ph.D Theses (Restricted)
Files in This Item:
| File | Description | Size | Format | Access Settings | Version |
|---|---|---|---|---|---|
| b21602608.pdf | | 5.24 MB | Adobe PDF | RESTRICTED | None |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.