|Title:||A multilabel text classification algorithm for labeling risk factors in SEC Form 10-K|
|Authors:||Huang, K.-W.; Li, Z.|
|Source:||Huang, K.-W., Li, Z. (2011). A multilabel text classification algorithm for labeling risk factors in SEC Form 10-K. ACM Transactions on Management Information Systems 2 (3). ScholarBank@NUS Repository. https://doi.org/10.1145/2019618.2019624|
|Abstract:||This study develops, implements, and evaluates a multilabel text classification algorithm called the multilabel categorical K-nearest neighbor (ML-CKNN). The proposed algorithm is designed to automatically identify 25 types of risk factors with specific meanings reported in Section 1A of SEC Form 10-K. The idea of ML-CKNN is to compute a categorical similarity score for each label by the K-nearest neighbors in that category. ML-CKNN is tailored to achieve the goal of extracting risk factors from 10-Ks. The proposed algorithm can perfectly classify 74.94% of risk factors and 98.75% of labels. Moreover, ML-CKNN is empirically shown to outperform ML-KNN and other multilabel algorithms. The extracted risk factors could be valuable to empirical studies in accounting or finance. © 2011 ACM.|
|Source Title:||ACM Transactions on Management Information Systems|
|Appears in Collections:||Staff Publications|
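
The abstract describes the core idea of ML-CKNN only in one sentence: a categorical similarity score is computed for each label from the K nearest neighbors in that category. The sketch below is one possible reading of that sentence, not the authors' implementation. Everything in it is an assumption made for illustration: the bag-of-words cosine similarity, averaging the top-K similarities, the fixed decision threshold, the function names (`bow`, `cosine`, `categorical_scores`, `predict`), and the toy data in `toy_train`.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorical_scores(query, train, k=2):
    """For each label, average the similarities of the k most similar
    training texts that carry that label (assumed scoring rule)."""
    q = bow(query)
    sims_by_label = {}
    for text, labels in train:
        sim = cosine(q, bow(text))
        for label in labels:
            sims_by_label.setdefault(label, []).append(sim)
    scores = {}
    for label, sims in sims_by_label.items():
        top = sorted(sims, reverse=True)[:k]
        scores[label] = sum(top) / len(top)
    return scores


def predict(query, train, k=2, threshold=0.3):
    """Assign every label whose categorical score reaches the threshold
    (the threshold value is a hypothetical choice)."""
    scores = categorical_scores(query, train, k)
    return {label for label, score in scores.items() if score >= threshold}


# Hypothetical toy data: (risk-factor text, set of labels).
toy_train = [
    ("competition from rivals may reduce our market share", {"competition"}),
    ("intense competition could lower prices and margins", {"competition"}),
    ("changes in regulation may increase compliance costs", {"regulation"}),
    ("new government regulation and competition may hurt sales",
     {"regulation", "competition"}),
]

if __name__ == "__main__":
    print(predict("competition may reduce our prices", toy_train))
```

Scoring each label only against its own neighbors means a rare label is not drowned out by frequent ones, which is one plausible motivation for a per-category neighborhood; the paper itself should be consulted for the actual scoring and thresholding rules.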
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.