Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/34443
Title: | Combining Speech with textual methods for Arabic diacritization |
Authors: | AISHA SIDDIQA AZIM |
Keywords: | Arabic, NLP, speech, diacritics, interpolation, multi-modal |
Issue Date: | 20-Jan-2012 |
Citation: | AISHA SIDDIQA AZIM (2012-01-20). Combining Speech with textual methods for Arabic diacritization. ScholarBank@NUS Repository. |
Abstract: | The majority of studies on Arabic diacritization have employed textually inferred features alone. This thesis proposes a novel approach in which speech is combined, by weighting, with a text-based model, so that linguistically insensitive acoustic information can correct errors in, and complement, the text model's diacritic predictions. The acoustic model is based on Hidden Markov Models and the textual model on Conditional Random Fields. The combination brings a significant reduction in error rates across all metrics, especially in case endings, which are the most difficult to predict. It gives results superior to those of conventional methods, with diacritic and word error rates of 1.6 and 5.2 inclusive of case endings, and 1.0 and 3.0 exclusive of them. Additionally, an interesting comparison is made between the diacritized solutions provided by two of the most popular morphological tools in the field of Arabic NLP, in the context of our combined system. |
URI: | http://scholarbank.nus.edu.sg/handle/10635/34443 |
Appears in Collections: | Master's Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
AishaSA_MScThesis.pdf | | 2.39 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.