Please use this identifier to cite or link to this item: https://doi.org/10.1109/access.2019.2924250
Title: Resource Mention Extraction for MOOC Discussion Forums
Authors: An, Ya-Hui
Pan, Liangming
Kan, Min-Yen 
Dong, Qiang
Fu, Yan
Keywords: cs.CL
cs.AI
cs.DL
cs.IR
cs.SI
Issue Date: 2019
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: An, Ya-Hui, Pan, Liangming, Kan, Min-Yen, Dong, Qiang, Fu, Yan (2019). Resource Mention Extraction for MOOC Discussion Forums. IEEE Access 7 : 87887-87900. ScholarBank@NUS Repository. https://doi.org/10.1109/access.2019.2924250
Abstract: In discussions hosted on discussion forums for MOOCs, references to online learning resources are often of central importance. They contextualize the discussion, anchoring the discussion participants' presentation of the issues and their understanding. However, they are usually mentioned in free text, without appropriate hyperlinking to their associated resource. Automated learning resource mention hyperlinking and categorization will facilitate discussion and searching within MOOC forums, and also benefit the contextualization of such resources across disparate views. We propose the novel problem of learning resource mention identification in MOOC forums. As this is a novel task with no publicly available data, we first contribute a large-scale labeled dataset, dubbed the Forum Resource Mention (FoRM) dataset, to facilitate current and future research on this task. We then formulate this task as a sequence tagging problem and investigate solution architectures to address the problem. Importantly, we identify two major challenges that hinder the application of sequence tagging models to the task: (1) the diversity of resource mention expression, and (2) long-range contextual dependencies. We address these challenges by incorporating character-level and thread context information into an LSTM-CRF model. First, we incorporate a character encoder to address the out-of-vocabulary problem caused by the diversity of mention expressions. Second, to address the context dependency challenge, we encode thread contexts using an RNN-based context encoder, and apply the attention mechanism to selectively leverage useful context information during sequence tagging. Experiments on FoRM show that the proposed method notably improves over baseline deep sequence tagging models, with significantly better performance on instances that exemplify the two challenges.
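The attention step the abstract describes, selectively weighting thread-context representations while tagging, can be illustrated with a minimal dot-product attention sketch. This is an assumption-laden illustration, not the paper's implementation: the function names (`softmax`, `attend`) and the plain dot-product scoring are illustrative choices, and the hidden-state and context vectors would in practice come from the LSTM tagger and the RNN-based context encoder.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden, contexts):
    """Dot-product attention sketch: score each thread-context vector by
    its similarity to the tagger's current hidden state, normalize the
    scores with softmax, and return both the weights and the weighted
    sum of context vectors (the context summary fed back to the tagger)."""
    scores = [sum(h * c for h, c in zip(hidden, ctx)) for ctx in contexts]
    weights = softmax(scores)
    summary = [sum(w * ctx[d] for w, ctx in zip(weights, contexts))
               for d in range(len(hidden))]
    return weights, summary
```

For example, with a hidden state aligned to the first of two context vectors, `attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])` assigns the larger weight to the first context, so the summary leans toward it.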
Source Title: IEEE Access
URI: https://scholarbank.nus.edu.sg/handle/10635/157128
ISSN: 2169-3536
DOI: 10.1109/access.2019.2924250
Appears in Collections: Staff Publications
Elements

Files in This Item:
File              Description                    Size       Format      Access
08743365.pdf      Published version              1.15 MB    Adobe PDF   Open
1811.08853v1.pdf  Submitted version (pre-print)  654.91 kB  Adobe PDF   Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.