Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/34443
Title: Combining Speech with Textual Methods for Arabic Diacritization
Authors: AISHA SIDDIQA AZIM
Keywords: Arabic, NLP, speech, diacritics, interpolation, multi-modal
Issue Date: 20-Jan-2012
Source: AISHA SIDDIQA AZIM (2012-01-20). Combining Speech with Textual Methods for Arabic Diacritization. ScholarBank@NUS Repository.
Abstract: Most studies on Arabic diacritization have relied on textually inferred features alone. This thesis proposes a novel approach in which a speech-based model is combined, by weighting, with a text-based model, so that linguistically insensitive acoustic information can correct and complement the text model's diacritic predictions. The acoustic model is based on Hidden Markov Models and the textual model on Conditional Random Fields. The combination significantly reduces error rates across all metrics, especially for case endings, which are the most difficult to predict. It outperforms conventional methods, with diacritic and word error rates of 1.6 and 5.2 inclusive of case endings, and 1.0 and 3.0 exclusive of them. In addition, the diacritized solutions provided by two of the most popular morphological tools in Arabic NLP are compared in the context of the combined system.
URI: http://scholarbank.nus.edu.sg/handle/10635/34443
Appears in Collections: Master's Theses (Open)
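
The weighted combination described in the abstract can be illustrated with a minimal sketch. It assumes simple linear interpolation of per-letter diacritic probabilities. The abstract does not say whether the thesis interpolates linearly or in the log domain, so this is an assumption rather than the author's method. The function names, the weight value, and the probabilities are all hypothetical and chosen only for illustration.

```python
def interpolate(text_probs, speech_probs, weight):
    """Linearly interpolate two distributions over diacritic labels.

    weight is the share given to the text model (assumed to lie in [0, 1]);
    the speech model receives the remaining 1 - weight.
    """
    labels = set(text_probs) | set(speech_probs)
    return {
        label: weight * text_probs.get(label, 0.0)
        + (1.0 - weight) * speech_probs.get(label, 0.0)
        for label in labels
    }


def pick_diacritic(text_probs, speech_probs, weight=0.5):
    """Return the label with the highest interpolated probability."""
    combined = interpolate(text_probs, speech_probs, weight)
    return max(combined, key=combined.get)


# Made-up scores for one letter: the text model (e.g. a CRF) prefers
# fatha, while the acoustic model (e.g. an HMM) prefers kasra.
text_probs = {"fatha": 0.5, "kasra": 0.4, "damma": 0.1}
speech_probs = {"fatha": 0.2, "kasra": 0.7, "damma": 0.1}

print(pick_diacritic(text_probs, speech_probs, weight=0.5))
print(pick_diacritic(text_probs, speech_probs, weight=1.0))
```

With equal weights, the acoustic evidence overturns the text model's choice in this toy example. With the weight set to 1.0, the result falls back to the text model alone. In practice the weight would be tuned on held-out data.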
Files in This Item:
File: AishaSA_MScThesis.pdf | Size: 2.39 MB | Format: Adobe PDF | Access Settings: OPEN | Version: None
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.