Please use this identifier to cite or link to this item: https://doi.org/10.1145/2019618.2019624
Title: A multilabel text classification algorithm for labeling risk factors in SEC Form 10-K
Authors: Huang, K.-W. 
Li, Z.
Keywords: Annual reports
Multilabel classification
Risk factors
Text classification
Text mining
Issue Date: 2011
Source: Huang, K.-W., Li, Z. (2011). A multilabel text classification algorithm for labeling risk factors in SEC Form 10-K. ACM Transactions on Management Information Systems 2 (3). ScholarBank@NUS Repository. https://doi.org/10.1145/2019618.2019624
Abstract: This study develops, implements, and evaluates a multilabel text classification algorithm called the multilabel categorical K-nearest neighbor (ML-CKNN). The proposed algorithm is designed to automatically identify 25 types of risk factors with specific meanings reported in Section 1A of SEC Form 10-K. The idea of ML-CKNN is to compute a categorical similarity score for each label from the K nearest neighbors in that category. ML-CKNN is tailored to the goal of extracting risk factors from 10-K filings. The proposed algorithm can perfectly classify 74.94% of risk factors and 98.75% of labels. Moreover, ML-CKNN is empirically shown to outperform ML-KNN and other multilabel algorithms. The extracted risk factors could be valuable to empirical studies in accounting or finance. © 2011 ACM.
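The abstract's central idea, a per-label similarity score computed from the K nearest neighbors within each category, can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine similarity, the averaging of the top-K similarities, the decision threshold, the function names, and the toy training sentences are all assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def categorical_scores(query, train, k=2):
    """For each label, average the k highest similarities between the
    query and the training texts that carry that label (assumed scoring rule)."""
    q = Counter(query.lower().split())
    sims_by_label = {}
    for text, labels in train:
        sim = cosine(q, Counter(text.lower().split()))
        for label in labels:
            sims_by_label.setdefault(label, []).append(sim)
    scores = {}
    for label, sims in sims_by_label.items():
        top = sorted(sims, reverse=True)[:k]
        scores[label] = sum(top) / len(top)
    return scores


def predict(query, train, k=2, threshold=0.4):
    """Assign every label whose categorical score reaches the
    (hypothetical) threshold, so zero, one, or many labels may result."""
    scores = categorical_scores(query, train, k)
    return {label for label, score in scores.items() if score >= threshold}


# Hypothetical toy data; the paper's 25 risk-factor types are not reproduced here.
train = [
    ("we face intense competition in our markets", {"competition"}),
    ("competition from larger rivals may reduce our market share", {"competition"}),
    ("changes in regulation could increase our compliance costs", {"regulation"}),
    ("new regulation and intense competition may harm our business",
     {"regulation", "competition"}),
]
query = "intense competition may reduce our revenue"

scores = categorical_scores(query, train)
print(scores)
print(predict(query, train))
```

Scoring each label only against its own neighbors, rather than against one shared neighborhood, means a rare label is not crowded out by documents from more frequent labels. Whether the paper aggregates the neighbors by averaging, as here, is not stated in the abstract.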
Source Title: ACM Transactions on Management Information Systems
URI: http://scholarbank.nus.edu.sg/handle/10635/42467
ISSN: 2158-656X
DOI: 10.1145/2019618.2019624
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.