Efficiently extracting frequent subgraphs using MapReduce

Please use this identifier to cite or link to this item: https://doi.org/10.1109/BigData.2013.6691633

DC Field	Value
dc.title	Efficiently extracting frequent subgraphs using MapReduce
dc.contributor.author	Lu, W.
dc.contributor.author	Chen, G.
dc.contributor.author	Tung, A.K.H.
dc.contributor.author	Zhao, F.
dc.date.accessioned	2014-07-04T03:12:40Z
dc.date.available	2014-07-04T03:12:40Z
dc.date.issued	2013
dc.identifier.citation	Lu, W., Chen, G., Tung, A.K.H., Zhao, F. (2013). Efficiently extracting frequent subgraphs using MapReduce. Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013: 639-647. ScholarBank@NUS Repository. https://doi.org/10.1109/BigData.2013.6691633
dc.identifier.isbn	9781479912926
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/78122
dc.description.abstract	Frequent subgraph extraction from a large number of small graphs is a primitive operation for many data mining applications. To extract frequent subgraphs, existing techniques need to enumerate a large number of subgraphs, which is superlinear in the cardinality of the dataset. Given the rapidly growing volume of graph data, it is difficult to perform frequent subgraph extraction efficiently on a centralized machine. In this paper, we investigate how to efficiently perform this extraction over very large datasets using MapReduce. Parallelizing existing techniques directly using MapReduce does not yield good performance, as it is difficult to balance the workload among the compute nodes. We therefore propose a framework that adopts a breadth-first search strategy to iteratively extract frequent subgraphs, i.e., all frequent size-(i+1) subgraphs are generated from frequent size-i subgraphs at the ith iteration using a single MapReduce job. To efficiently extract frequent subgraphs, we propose an isomorphism-testing-free approach that properly maintains how frequent subgraphs are mapped within each graph. Extensive experiments conducted on our in-house clusters demonstrate the superiority of our proposed solution in comparison with the baseline approach. © 2013 IEEE.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/BigData.2013.6691633
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1109/BigData.2013.6691633
dc.description.sourcetitle	Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
dc.description.page	639-647
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.