Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/238651
Title: | SELF-ENHANCED VOCABULARY LEARNING LATENT DIRICHLET ALLOCATION |
Authors: | HUANG YI HSIANG |
Keywords: | tweets, LDA, BTM, NLP, topic, text |
Issue Date: | 31-Aug-2022 |
Citation: | HUANG YI HSIANG (2022-08-31). SELF-ENHANCED VOCABULARY LEARNING LATENT DIRICHLET ALLOCATION. ScholarBank@NUS Repository. |
Abstract: | Many studies have sought to extract valuable information from text through sentiment analysis and classification. For the latter, topic models such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003) and the Biterm Topic Model (BTM) (Yan et al., 2013) are conventional probabilistic models designed to unveil the latent topic structure within texts. In this paper, we propose a novel topic model for short texts called Self-Enhanced Vocabulary Learning Latent Dirichlet Allocation (SVL-LDA), which extends LDA by incorporating rolling-window training of the parameters of the prior distribution to handle the sparsity problem. Specifically, a larger base of information is used to fine-tune the corpus-level parameters. The LDA module then identifies the coherence of the topic-document and word-topic relationships. Empirical results show that our approach appears to outperform the baseline methods under certain conditions and demonstrate its practical application to social media. |
URI: | https://scholarbank.nus.edu.sg/handle/10635/238651 |
Appears in Collections: | Master's Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
HuangYH.pdf | | 1.6 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.