Please use this identifier to cite or link to this item: https://doi.org/10.1109/TKDE.2007.190660
Title: Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: A one-to-one correspondence and mining algorithms
Authors: Li, J.
Liu, G. 
Li, H.
Wong, L. 
Keywords: Bioinformatics (genome or protein) database
Closed patterns
Maximal biclique subgraphs
Mining methods and algorithms
Issue Date: 2007
Citation: Li, J., Liu, G., Li, H., Wong, L. (2007). Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: A one-to-one correspondence and mining algorithms. IEEE Transactions on Knowledge and Data Engineering 19 (12) : 1625-1636. ScholarBank@NUS Repository. https://doi.org/10.1109/TKDE.2007.190660
Abstract: Maximal biclique (also known as complete bipartite) subgraphs can model many applications in Web mining, business, and bioinformatics. Enumerating maximal biclique subgraphs from a graph is a computationally challenging problem, as the size of the output can become exponentially large with respect to the vertex number when the graph grows. In this paper, we efficiently enumerate them through the use of closed patterns of the adjacency matrix of the graph. For an undirected graph G without self-loops, we prove that 1) the number of closed patterns in the adjacency matrix of G is even, 2) the number of the closed patterns is precisely double the number of maximal biclique subgraphs of G, and 3) for every maximal biclique subgraph, there always exists a unique pair of closed patterns that matches the two vertex sets of the subgraph. Therefore, the problem of enumerating maximal bicliques can be solved by using efficient algorithms for mining closed patterns, which are algorithms extensively studied in the data mining field. However, this direct use of existing algorithms causes a duplicated enumeration. To achieve high efficiency, we propose an O(mn) time delay algorithm for a nonduplicated enumeration, in particular, for enumerating those maximal bicliques with a large size, where m and n are the number of edges and vertices of the graph, respectively. We evaluate the high efficiency of our algorithm by comparing it to stateof-the-art algorithms on three categories of graphs: randomly generated graphs, benchmarks, and a real-life protein interaction network. In this paper, we also prove that if self-loops are allowed in a graph, then the number of closed patterns in the adjacency matrix is not necessarily even, but the maximal bicliques are exactly the same as those of the graph after removing all the self-loops. © 2007 IEEE.
Source Title: IEEE Transactions on Knowledge and Data Engineering
URI: http://scholarbank.nus.edu.sg/handle/10635/41728
ISSN: 10414347
DOI: 10.1109/TKDE.2007.190660
Appears in Collections:Staff Publications

Show full item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.