Please use this identifier to cite or link to this item: https://doi.org/10.1145/956750.956832
Title: Carpenter: Finding closed patterns in long biological datasets
Authors: Pan, F.
Cong, G. 
Tung, A.K.H. 
Yang, J.
Zaki, M.J.
Keywords: Closed pattern
Frequent pattern
Row enumeration
Issue Date: 2003
Citation: Pan, F., Cong, G., Tung, A.K.H., Yang, J., Zaki, M.J. (2003). Carpenter: Finding closed patterns in long biological datasets. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: 637-642. ScholarBank@NUS Repository. https://doi.org/10.1145/956750.956832
Abstract: The growth of bioinformatics has resulted in datasets with new characteristics. These datasets typically contain a large number of columns and a small number of rows. For example, many gene expression datasets may contain 10,000-100,000 columns but only 100-1000 rows. Such datasets pose a great challenge for existing (closed) frequent pattern discovery algorithms, since they have an exponential dependence on the average row length. In this paper, we describe a new algorithm called CARPENTER that is specially designed to handle datasets having a large number of attributes and relatively small number of rows. Several experiments on real bioinformatics datasets show that CARPENTER is orders of magnitude better than previous closed pattern mining algorithms like CLOSET and CHARM. Copyright 2003 ACM.
Source Title: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
URI: http://scholarbank.nus.edu.sg/handle/10635/40124
DOI: 10.1145/956750.956832
Appears in Collections: Staff Publications

